Thursday, November 18, 2010

The Science of Secrecy - Part I

I have decided to repost a few of my favorite posts from my workplace blog. This one was posted on 04-FEB-2008, after I was heavily influenced by Simon Singh's The Code Book.

Warning: It's quite long. You might have to cancel that movie-watching idea and try reading this.

~~~***~~~

The Science of Secrecy - Part I

Dear citizen of the Internet,

I have a few questions for you. A few important questions, the answers to which you have either taken for granted or neglected.

1. How secure is your Gmail or Yahoo Mail (or any other) password on the Internet?
2. How secure are your emails on the Net? Do you believe that your mails are NOT being read by complete strangers?
3. Do you think Gmail or Yahoo Mail admins CANNOT read your personal mails? Or, even worse, could some faraway hacker (who is in no way related to Gmail or Yahoo Mail) read your mail?
4. Do you think you are the only one who knows your e-banking password? Do you think no one else can transfer your money to their account?
5. Or, in the most general sense, do you think the Internet is secure beyond doubt? Is your so-called "Privacy" completely guaranteed?

Well, if these questions have set you thinking, fear not! The Internet is secure. But we netizens have totally failed to appreciate and recognise a 2000-year-old field of study that gives us such freedom and privacy in this worldwide jungle called the Internet.

No other field of study has seen as many controversies, intellectual breakthroughs and episodes of military espionage, or such a fierce internal race for superiority between 2 groups. This field of study has created and conquered kingdoms, has affected the wealth and armies of empires, has provoked and stopped wars, and finally has opened up a global economy through E-Commerce.

This blog is about the mysterious science of secrecy - Cryptography!


Allow me to take you through this amazing science from the times of Caesar to Commerce on the Internet.

First let's brush up on the basics of cryptography with an example and an analogy. It is important that you understand a few terms now to enable further understanding. The letters in bold are very important.

Alice and Bob are 2 common people (just like us) who want to share a secret message between them. Let us say that Alice wants to pass on the message "I love apples" to Bob. This "I love apples" is called the plaintext. Now a stranger called Eve wants to intercept and understand their message.

Alice and Bob suspect that someone might want to intercept their message. So Alice uses an encryption system to encrypt her plaintext. An encryption system is a technique used to scramble the plaintext into something that cannot be understood by a third person.

For example, Alice uses this technique:

She takes her message letter by letter and replaces each letter with some other letter of the alphabet. Her choice is to replace each letter by the next letter in the alphabet, i.e., she shifts each letter by one place.

So "I LOVE APPLES" becomes "J MPWF BQQMFT". Now, "J MPWF BQQMFT" is the ciphertext. Alice sends the ciphertext to Bob, who can reverse the ciphertext back to the plaintext (I LOVE APPLES) provided he knows the encryption technique and the key. Alice's encryption technique was to replace the letters of the plaintext by some other letter of the alphabet. Bob must know this to reverse the ciphertext back to the plaintext.

But is that enough? No! He needs to know which set of letters to replace with. Alice has shifted the alphabet by one letter:

Plain alphabet : A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher alphabet : B C D E F G H I J K L M N O P Q R S T U V W X Y Z A


Alice has replaced each letter in the top row with the corresponding letter in the bottom row. So shifting by one letter is the key. Bob therefore needs to know two things to reverse the ciphertext back to the plaintext:

1. The Encryption System and
2. The key she has used.

Knowing these, the reversing operation is easy for Bob. So Alice encrypts the plaintext to the ciphertext using an encryption system and a key. And Bob decrypts the ciphertext back to the plaintext using the same encryption system and key. So far, we've learned the essence of modern cryptography.
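Here is a minimal Python sketch of Alice's shift system, just to make the idea concrete. The function names (shift_encrypt, shift_decrypt) are my own and only for illustration; the key is simply the number of places to shift.

```python
import string

ALPHABET = string.ascii_uppercase  # the 26-letter plain alphabet


def shift_encrypt(plaintext, key):
    """Replace each letter by the one `key` places further along the alphabet."""
    result = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation untouched
    return "".join(result)


def shift_decrypt(ciphertext, key):
    """Undo the shift by moving each letter `key` places back."""
    return shift_encrypt(ciphertext, -key)


ciphertext = shift_encrypt("I LOVE APPLES", 1)
print(ciphertext)                    # J MPWF BQQMFT
print(shift_decrypt(ciphertext, 1))  # I LOVE APPLES
```

Bob needs exactly the two things listed above: the function (the encryption system) and the number 1 (the key).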

Now let's see Eve's role. Eve wants to know what Alice sent to Bob. Eve can capture the ciphertext - "J MPWF BQQMFT". But she's clueless about what it means. From what we've learned, she needs both the encryption system and the key to know what it means. But if she can, in some way, work out what it means without knowing the encryption system and key, she becomes a codebreaker or cryptanalyst.

So cryptanalysis or codebreaking is the process of finding out the plaintext from the ciphertext without knowing the encryption system and the key. The method could be a brute-force attack, eavesdropping, spying or, at times, sheer brilliance in the form of frequency analysis.
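To give a taste of frequency analysis, here is a heavily simplified sketch. It assumes the cipher is a plain shift and that "E" is the commonest letter of the plaintext, which is typical of English but not guaranteed for a short message. The sample ciphertext and the function names are made up for this illustration; real frequency analysis compares the whole frequency profile, not just one letter.

```python
from collections import Counter


def letter_counts(ciphertext):
    """Count how often each letter occurs, ignoring spaces and punctuation."""
    return Counter(ch for ch in ciphertext.upper() if ch.isalpha())


def guess_shift(ciphertext):
    """Guess the shift by assuming the commonest ciphertext letter stands for E."""
    commonest, _ = letter_counts(ciphertext).most_common(1)[0]
    return (ord(commonest) - ord("E")) % 26


# Hypothetical intercepted message: "MEET ME BESIDE THE GREEN TREE" shifted by one
sample = "NFFU NF CFTJEF UIF HSFFO USFF"
print(letter_counts(sample).most_common(3))
print(guess_shift(sample))  # 1
```

Eve never tried a single key here. She only counted letters, which is why this method counts as brilliance rather than brute force.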

So Alice's responsibility is to keep both the encryption system and the key secret to have secure communication with Bob. But the choice of the encryption system and key is supreme for strong, secure and unbreakable communication. Let us see why:

The strength of secure communication lies in the choice of
1. Encryption system and
2. Key.

Obviously, this is because if they are known or easily found out, then the plaintext can easily be found out too. So let's see if Alice's communication is strong enough to withstand attack by codebreakers.

1. Key : Let us assume that Eve (or any codebreaker) knows Alice's encryption system, i.e., that she always shifts the alphabet by a few characters to the left. If this is known to Eve, then she can work out the key easily. If you noticed, the shifting can only be done in 25 different ways (a shift of 26 brings the alphabet back to where it started). So Eve needs at most 25 tries to get the plaintext. So this encryption system (shifting the alphabet by a few letters) is prone to brute-force attack, as the number of possible keys is very small (just 25).

2. Encryption System : Actually this is what the codebreakers try to find out first. Without knowing the encryption system, the number of possible keys cannot be found. So keeping a secret encryption system is the first step towards a secure communication. Keeping the encryption system as a secret has a few practical difficulties:

First, if the security of communication depends on the encryption system alone, keeping it secret becomes a big overhead in the first place. Imagine that Alice wants to communicate secretly with 100 people. She could use 100 different encryption systems. But if she uses the same one with everyone, a codebreaker can easily deduce what system she is using.

Or simply, the probability of the encryption system becoming known grows as the number of people using it increases. Simply put, a secret encryption system used between 2 people is very secure, but not so when many are involved. (Did you know: the Kamasutra lists 64 essential skills, and 1 of them is secret writing. A woman who learns the skills of the Kamasutra must know how to communicate secretly with her partner.) (Present Day Edit : Hehe! Why the reference, Siva? Why?)

Coming back, the trust should not be in the system but in the key. And in this world of the Internet, billions and billions of emails are sent per day by a billion people. Each one cannot use their own encryption system. So it is best to choose a publicly known encryption system which has a large choice of keys.

This is Kerckhoffs's principle : "The security of a crypto-system must not depend on keeping secret the crypto-algorithm. The security depends only on keeping secret the key."
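To see why a large choice of keys matters once the system itself is public, here is a sketch of the brute-force attack on Alice's shift cipher. The helper names (shift, brute_force) are mine, made up for illustration; the only thing Eve is assumed to know is that the system is "shift the alphabet".

```python
import string

ALPHABET = string.ascii_uppercase


def shift(text, key):
    """Shift every letter of `text` by `key` places (a negative key shifts back)."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )


def brute_force(ciphertext):
    """Return all 25 candidate plaintexts, one per possible non-zero shift."""
    return {key: shift(ciphertext, -key) for key in range(1, 26)}


candidates = brute_force("J MPWF BQQMFT")
for key, guess in candidates.items():
    print(key, guess)  # Eve reads the list and picks the line that makes sense
```

With only 25 candidates, Eve can check them all by eye in a minute. A publicly known system is safe only when the list of candidates is far too long to walk through.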

Having learnt all this, let us see a real life analogy. Alice wants to send Bob a confidential message on a sheet of paper by post. She suspects that the postal dept would do anything to read her message. So she decides to put it in a locked box and send it.

So now, putting the message into a locked box is analogous to the process of encryption. Now the immoral postal system can either break down the box or try a brute force method of using various keys till the box opens. From Alice's point of view, the security of this communication through the locked box depends on 2 things:

1. Strength of the box: The box should be strong enough to resist breaking. Alice can choose just a plastic box or an iron box. This is analogous to the choice of encryption system. The more complex it is, the harder it is to break.

2. The Key : The key should be complex enough so that the box cannot be opened by trying the many random keys that the postal system has. This is analogous to the possible number of keys a crypto-system provides. (Ex: only 25 in Alice's system).

So as long as the box is strong and the key is complex, the message stays secure.

Evolution of Ciphers:

Cryptography is ever-evolving. This is because there has been a fierce battle between cryptographers (codemakers) and cryptanalysts (codebreakers) for supremacy. Once the codemakers come up with a strong crypto-system, it lasts for a few decades or centuries, until the codebreakers come up with an ingenious method to break it. Information security is lost for some time, but yet again the codemakers come up with another strong cryptosystem. Both groups have had their times of glory. And this battle has led cryptography to its present stage, where cryptographers are leading the race, with people enjoying information privacy and codebreakers fighting to regain their place.

Let us see the evolution of ciphers:

1. Name of the cipher : Caesar cipher (1st century BC)

Type : Monosubstitutional (one letter in the plain alphabet is always replaced by the same letter in the cipher alphabet. Ex: in Alice's "I LOVE APPLES", both "P"s are replaced by "Q". In a polysubstitutional cipher, the second "P" can be replaced by some other letter).
Encryption algorithm : Shifting the alphabet.
Possible No of Keys : 25.
Method of breaking : Brute force.
Credits for breaking it : Unknown.

Till the 16th century, the monosubstitutional cipher was used in its various forms and symbols. But Arab cryptanalysts found an ingenious technique called frequency analysis to break it, so development in cryptography came to a standstill until the 16th century. And then came...

2. Name of the cipher : Vigenere cipher (published in 1586; Vigenere himself was born in 1523).

Type : Polysubstitutional.
Encryption algorithm : uses a table of alphabets (see table in this link).
Possible no of keys : Practically unlimited (sender and receiver can agree upon any word).
Method of breaking : The Vigenere cipher is theoretically unbreakable if the key used is long and different each time. But it was flawed in practice, because such keys are hard to use in real life. So as short, repeated keys were used, advanced frequency analysis helped to break it.
Credits for breaking it : surprise.. surprise.. Charles Babbage and later Friedrich Kasiski (1863).
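A minimal sketch of how the Vigenere cipher works, assuming the usual scheme where each keyword letter picks one row of the table, which is the same as shifting by that letter's position in the alphabet. The keyword "KEY" and the function name are my own choices for illustration.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere(text, keyword, decrypt=False):
    """Shift each letter by the amount given by the matching keyword letter.

    The keyword is repeated along the message, so the same plaintext letter
    can map to different ciphertext letters depending on its position.
    """
    sign = -1 if decrypt else 1
    out = []
    i = 0  # counts letters only, so spaces do not consume keyword letters
    for ch in text.upper():
        if ch in ALPHABET:
            k = ALPHABET.index(keyword[i % len(keyword)].upper())
            out.append(ALPHABET[(ALPHABET.index(ch) + sign * k) % 26])
            i += 1
        else:
            out.append(ch)
    return "".join(out)


secret = vigenere("I LOVE APPLES", "KEY")
print(secret)                                 # S PMFI YZTJOW
print(vigenere(secret, "KEY", decrypt=True))  # I LOVE APPLES
```

Notice that the two "P"s of APPLES come out as "Z" and "T". That is exactly what makes the cipher polysubstitutional, and why simple letter counting no longer works on it.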

Till the 19th century, all methods were breakable by codebreakers. But as the 20th century was born, there was a great demand for a secure crypto-system, as the radio had been invented by Marconi and was increasingly used by the military. In World War I, the Germans used the ADFGVX cipher, a complex substitution and transposition cipher, which was broken within months of its introduction. The codebreakers thus held the upper hand till the German Enigma was invented.

3. Name : Enigma & Lorenz (advanced) machine ciphers. Enigma was used for common military communication and Lorenz by the German High Command for very secure communication. The movie U-571 was based on the efforts to capture an Enigma machine.

Type : Polysubstitution using rotating mechanical discs (rotors).
Encryption algorithm : see the link on how it works (it's amazing!)
Possible no of keys : about 10^16 (Enigma) and 1.5 x 10^20 (Lorenz).
Method of cracking : Truly amazing! A work of great genius by Alan Turing, arguably one of the greatest codebreakers ever. His life history and efforts are worth reading (see link below). He constructed a machine called the bombe (Turing's bombe), which was the forerunner to Colossus, the first programmable computer, which never got the acclaim it deserved. ENIAC was acclaimed as the first even though Colossus was built first. (Did you know : Turing proposed a simple test, called the Turing Test, to check for Artificial Intelligence. If a computer or robot passes this test, it is said to be intelligent. A link about the Turing Test is given below. CAPTCHA (those disfigured and twisted letters that you fill in on an account creation form on the Net) is actually an abbreviation of Completely Automated Public Turing test to tell Computers and Humans Apart!)

So after World War II, the Computer Era began and computing power increased exponentially. This enabled the creation of complex cryptosystems with a vast number of keys. The power of computers allowed the plaintext to be passed through complex mathematical functions and loops to create a totally confusing ciphertext. As commercial computers spread, everyone was able to create their own complex cryptosystem. But then a new problem was created. A problem of plenty. There were so many good ciphers that a common method was needed for secure communication among everyone. And so in 1973 the US called for a standard cipher, and in 1976 DES (Data Encryption Standard) was adopted as the standard for secure communication.

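DES was powerful. Its 56-bit key allows about 7.2 x 10^16 possible keys, which the computers of the 1970s could not hope to try by brute force in any practical time. (Faster hardware has since caught up: DES keys were brute-forced in the late 1990s, and AES, with much longer keys, has taken its place as the standard.) It was like having a box made of the strongest element, with a practically endless number of possible keys. For its time, DES was unbreakable in practice!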

So is that all? Is our information secure when we use a cipher like DES? Have the codemakers won?

The answer is both 'yes' and 'no'. 'Yes' for the fact that a cryptosystem that is unbreakable in practice has been created after 2000 years. Both factors (encryption system and possible no of keys) are out of reach of the codebreakers. A definite 'yes' for the greatest lock ever!

And a 'no' because of a problem that was overlooked for 2000 years - the problem of key distribution.

Imagine this - Alice puts a secret message in an unbreakable iron box. She locks it and sends it to Bob. Now how would Bob open it without the key? Alice could send the key to Bob. But what guarantee is there that the key would reach Bob safely? Anyone could take the key if they wished to read the message. So the key is as important as the message. So the only way for Alice is to distribute the key beforehand. That is, Alice should have made 2 keys for the lock and given one to Bob earlier. Now Bob can open the box and read the message.

But here lies the biggest problem - if Alice wants to send 100 messages per day to 100 people, how would she do it? She can't make 100 similar keys and distribute them to everyone, can she? So in the actual cryptographic sense, a chosen key must be communicated between the sender and receiver. But sending this key demands secure communication, which again depends on key distribution. So no matter how secure a cipher (like DES) is in theory, in practice it can be undermined by the problem of key distribution.
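To put rough numbers on the problem, here is a small sketch. It assumes that every pair of people who want to talk privately must share their own secret key, delivered safely in advance. The function names are made up for illustration.

```python
def keys_for_one_sender(recipients):
    """Alice needs one separate shared key per person she writes to."""
    return recipients


def keys_for_group(people):
    """Every pair in the group needs its own key: n * (n - 1) / 2 in total."""
    return people * (people - 1) // 2


print(keys_for_one_sender(100))  # 100
print(keys_for_group(101))       # 5050 (Alice plus her 100 contacts)
```

So Alice alone has 100 keys to hand over safely, and if all 101 people want to talk to each other, the count jumps to 5050. Every one of those keys needs a secure delivery of its own.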

This problem had prevailed unsolved for 2000 years. The Germans distributed keybooks every month to all Enigma operators, even to those in the U-boats at sea. This was a great overhead. And if the keybooks were captured, one month's communication became insecure. Even when DES was established, large business corporations and banks used couriers (people with a padlocked briefcase chained to their hands) to distribute keys to their clients for secure communication between them. But as business and the number of people to communicate with grew, hiring couriers became a great overhead to these companies. Key distribution was keeping the general public from having secure communication among themselves. And it seemed impossible to overcome this problem. The world was eagerly waiting for a breakthrough on the key distribution problem, especially the military and business corporations. Lots of men and money were involved in research to solve it. It was a gloomy time for cryptography at large.

At this time (the '70s), information was not yet secure even though DES had been invented and the Internet had been born (in the form of ARPAnet in the '60s). When all hope was lost, there arose 2 heroes from humble backgrounds (in the form of Diffie and Hellman) who gave the 2000-year-old dying field of cryptography a fresh breath!

## END of Part 1 ##

to be continued..

A Note on "The Code Book": This blog could also be titled - "The Code Book - In a nutshell". I actually started to write about Diffie & Hellman Key Exchange. But then I felt that understanding its importance would require prior knowledge of cryptography and its evolution. So I have given my best try here to explain the gist of a famous and wonderful book - "The Code Book" by Simon Singh. I struggled really hard, and lost interest a lot of times, while writing this blog. But I took it as a personal tribute to one of the most captivating books I've read. I am eager to finish the remaining 2 parts, which were the actual things I wanted to write on. So for those of you who read this, thank you. For those who did not, please read the next 2 at least.

~~~***~~~

Present day footnotes

1. In the introductory paragraphs, I have raised questions about the security of email passwords. I was talking about theoretical security. There are practical problems, like phishing, which neither I nor The Code Book focused on.
