Wednesday, February 18, 2015

It's a secret (6)

(Continuation from Monday)
Neither of our examples of substitution or transposition enciphering would cause a knowledgeable decipherer much trouble. The Achilles heel of such attempts is what is known as frequency distribution. One of those things which "everybody knows" is that the most common letter in the English language is "e", but perhaps what is not so well known is that the frequency with which "e" appears in any reasonably sized piece of English is remarkably constant - just over 12%. And this is quite independent of content - it doesn't matter whether you are dealing with a treatise on the theory of relativity or a pornographic novel - "e" will appear with a frequency of around 12%. And of course there's nothing special about "e" - the frequency with which any other letter appears is just as stable. "p", for example, will crop up around 2% of the time, "i" about 7%, and so on. The ten most common letters are e, t, a, o, i, n, s, h, r, d, in that order.

What does this mean for our would-be decipherer? Well, the first thing he or she will do is count up how many times each letter appears in the cipher text. If the distribution is as it is for standard English, then that points to this being some sort of transposition cipher, since transposition only rearranges the letters and leaves the counts untouched. If, on the other hand, the letter "j" appears most often, then that suggests that this is a substitution cipher, and that "j" represents the letter "e" - as indeed is the case with our 5-place Caesar shift.
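
For anyone who would like to try the counting for themselves, here is a rough sketch in Python. It is only an illustration: the function names (letter_frequencies, caesar_shift, guess_shift) and the sample sentence are my own inventions, and the guess rests on the assumption that the most common letter in the cipher text stands for "e", which is only reliable for a reasonably long piece of ordinary English.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return each letter's share of all the letters in text (case ignored)."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    if not letters:
        return {}
    counts = Counter(letters)
    return {letter: n / len(letters) for letter, n in counts.items()}


def caesar_shift(text, shift):
    """Shift every letter by `shift` places; leave everything else alone."""
    result = []
    for c in text:
        if c in string.ascii_lowercase:
            result.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        elif c in string.ascii_uppercase:
            result.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(c)
    return "".join(result)


def guess_shift(cipher_text):
    """Guess the Caesar shift, assuming the commonest letter stands for 'e'."""
    freqs = letter_frequencies(cipher_text)
    most_common = max(freqs, key=freqs.get)
    return (ord(most_common) - ord("e")) % 26


# Hypothetical sample message, chosen so that "e" is clearly the commonest letter.
sample = "meet me here between seven and eleven"
cipher = caesar_shift(sample, 5)      # our 5-place Caesar shift
shift = guess_shift(cipher)           # the decipherer's guess at the shift
print(cipher)
print(shift)
print(caesar_shift(cipher, -shift))   # undo the shift to recover the message
```

A transposition cipher would defeat guess_shift, of course: the counts would look like ordinary English, and that in itself is the clue that the letters have merely been shuffled.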
