Suddenly decided to make something bad, while waiting for other things to settle.
Years ago I used to believe that I know something about hacking. Not kernel hacking, but the one, related to cracking of various supposed to be secure systems. Starting from ciphers down to code ‘issues’ (described in phrack, yup). It is rather laughable right now :)
But today I know, that reading (and even understanding) smart and hardcore articles is really far from being able to apply given knowledge to some practical problems. So, I decided to start (and complete practical implementation) from really simple things: various mono/poli alphabet ciphers. Ceasar/rot13 and Vigener are good choices afaics.
Bruce Schneier’s Self-Study Course in Block Cipher Cryptanalysis does not allow me to easily fall asleep, although its a little bit fun to compare my alphabet ciphers analysis versus even simplest crackdowns described in the article.
Anyway, it took me a day or less to hack up a semi-automatic Vigenere and monoalphabet cipher cracker in LISP. In Vigenere code there are two steps: find out key length and split and decode monoalphabet enciphered text blocks.
The former task is performed manually – the only application created searches trigrams in text and shows their frequencies. When there are 3 or more trigrams it is possible to find key length with rather high probability. Bigrams will work too, but error rate is higher. One have to find distance between the same trigrams and found greatest common divisor, which will likely be equal to cipher key length.
Vigenere cipher is theoretically uncrackable when unique key long enough to cover input message is used. But in practice shorter keys are used, which are then repeated number of times so that resulted key string length becomes equal to plaintext message. So, when cipher starts to repeat itself, the same letters will be encrypted into the same ciphertext.
This method is named after Friedrich Kasiski, who found it 150 years ago. When key length is found, we split ciphertext into separate strings, where each one encrypted using monoalphabet cipher. It consists of letters separated by key length, i.e. the each string will be formed from $string_number + $i * $key_length letters, where $i runs over ciphertext until it is fully covered.
Monoalphabet ciphers are rather trivial to crack using frequency analysis. Namely we should find the most frequent letters in the encrypted text and they will correspond to the most frequent letters in the language used for plaintext. Difference between those letters is a cipher shift, so it is trivial to recover the text just by replacing each letter with the one shifted by calculated number.
Here is example I used (random New York Times article, cut just to show idea, I also converted it to downcase and dropped all non-letter character, since
gcipher application does not understand that):
After spending months searching for a bipartisan consensus on financial regulatory reform, Senator Christopher Dodd, chairman of the banking committee, is expected to unveil his own bill on Monday, without one Republican supporter.
That’s how it was encrypted and resulted ciphertext:
$ gcipher -c Vigenere -k asknfalwiwf v1.enc v2.enc
Trigram search uncovers that the most frequent ciphertext trigram ‘tzo’ is spread in the text with the distances being multiple of 11, which does correspond to above key length.
Splitting text and calculating frequences is a rather simple technical task. After performing decoding we got following result:
Second line is a plaintext message, which differs in less than half symbols. I was not able to decode some of the text parts because of small enough text length, so that frequency analysis did not always provide correct data. But even as is decoded text allows to read and recover data.
On this positive note I will start preparig for the weekend skiing.