This page describes how I went about cracking Exhibit 1 from Silent Years. If you would like to try your hand at cracking Chaocipher, you should do so before reading the rest of this page. Trying to crack a cipher yourself is a lot more fun and instructive than just reading about it.
Figure 1 Determining a partial ring configuration from repeated symbols. Panels (a) to (i) each show the rings for the plain-text phrase ALLGOODQ.
The key to breaking Chaocipher is to use repetitions of plain-text or cipher-text symbols to learn something about the state of the rings. Figure 1 and the description of each step below demonstrate how this is done. The plain-text phrase is ALLGOODQ and the cipher-text phrase is IROROISO.
1. Start from the LL repetition in the plain text. A red font colour indicates which symbols are about to be added to the rings; black shows which symbols have already been added; and gray which symbols have not yet been added.
2. Add the L/R symbols. To fully understand the transition from (a) to (b), refer to Figure 1 in the Introduction. In (b) the ring permutation has already been applied, meaning that this figure corresponds to the rightmost set of rings in the Introduction.
3. We used the known position of the third plain-text symbol, L, on the inner ring to add the third cipher-text symbol, O, to the outer ring. Due to the permutation the R on the outer ring was moved all the way from the zenith to the nadir.
4. For the fourth symbol pair (G/R) the cipher-text symbol was used since the plain-text symbol had not yet been observed in the plain-text phrase. This means that the G was added to its location on the inner ring, opposite R on the outer ring, before the permutation was applied.
5. The O on the outer ring was used to add an O to the inner ring.
6. The O on the inner ring was used to add an I to the outer ring. This leaves us stuck on the right-hand side of the phrase since neither the D of the plain text nor the S of the cipher text has been encountered before.
7. We can return to the first symbol pair (A/I) since we now know where the I is on the outer ring. Here the process changes slightly because we have to move in reverse. To arrive at this figure, we had to backtrack all the way to step (a). (Note that in this state we would be ready to encipher the second symbol (L to R) as the correct plain-text symbol is in the same location as the correct cipher-text symbol.)
8. The cipher-text symbol I was used to incorporate the plain-text symbol A. To do this in reverse, we followed Figure 1 in the Introduction from right to left. Firstly, the rings were rotated so that the cipher-text symbol is at the zenith. Secondly, the rings were unpermuted. Finally, the plain-text symbol was added at its correct location, adjacent to the cipher-text symbol. Now we have incorporated all of the available symbols and have learned where 6 out of 52 symbols are located on the Chaocipher rings.
This is essentially the method I used to crack Exhibit 1. There are still some additional problems to address, namely how to deal with gaps in phrases of repeated symbols, and how to find a good starting location in the text for cracking the cipher.
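The reverse step used for the A/I pair can be illustrated with a short sketch. It assumes the commonly published two-alphabet description of Chaocipher, with the outer (cipher) and inner (plain) rings written as 26-character strings read from the zenith (index 0), and the nadir at index 13. The names `permute` and `unpermute` are hypothetical, and this is not the code used for the attack described on this page. Undetermined slots can be written with a placeholder such as `.`, since the shuffles only move positions.

```python
from typing import Tuple

NADIR = 13  # zenith is index 0; the nadir is half a turn away on a 26-slot ring


def permute(outer: str, inner: str, i: int) -> Tuple[str, str]:
    """Forward shuffle after enciphering the pair found at index i."""
    # Outer ring: cipher symbol to the zenith, then move zenith+1 to the nadir.
    o = outer[i:] + outer[:i]
    o = o[0] + o[2:NADIR + 1] + o[1] + o[NADIR + 1:]
    # Inner ring: plain symbol one slot past the zenith, then move zenith+2 to the nadir.
    n = inner[i + 1:] + inner[:i + 1]
    n = n[:2] + n[3:NADIR + 1] + n[2] + n[NADIR + 1:]
    return o, n


def unpermute(outer: str, inner: str) -> Tuple[str, str]:
    """Undo the shuffle, leaving the enciphered pair aligned at the zenith."""
    # Put the symbol at the nadir back into slot 1 (outer) or slot 2 (inner).
    o = outer[0] + outer[NADIR] + outer[1:NADIR] + outer[NADIR + 1:]
    n = inner[:2] + inner[NADIR] + inner[2:NADIR] + inner[NADIR + 1:]
    # Rotate the inner ring back by one slot so the plain symbol is at the zenith.
    n = n[-1] + n[:-1]
    return o, n
```

Calling `unpermute` on the output of `permute` gives back the original rings rotated so that the enciphered pair sits at index 0. The original rotation offset is not recovered, which does not matter for circular rings.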
To continue with the example in Figure 1, the D/S symbols have to be incorporated. Since they were not observed in the first part of the phrase, the only option is to try out every possible ring configuration consistent with observing these symbols. Figure 1(i) shows that there are 19 locations where D/S could be placed. The other 7 locations already have at least a plain-text or cipher-text symbol present. For each of the 19 possible ring configurations, the adding of seen-before symbols can now continue. Some configurations will turn out to be inconsistent with new symbols. For example, when incorporating the final symbols in the example, Q/O, the configuration shown in Figure 2 is not allowed. According to this configuration D should encipher to O, but from the text we know that Q should encipher to O. Since there is an inconsistency this configuration can be discarded. Cracking the cipher reduces to repeatedly incorporating known symbols, exploring all possible configurations when unknown symbols are found, and discarding configurations that are inconsistent with the plain and cipher texts of the exhibit.
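The three cases in the paragraph above (incorporate, branch, discard) can be sketched as a single function. This is an illustration under stated assumptions, not the original program. Each ring is a list of 26 slots with `None` for an undetermined slot, slot k of the inner ring faces slot k of the outer ring, and the caller is assumed to have rotated and permuted the rings so that facing slots are the ones that encipher to each other. The name `incorporate` and the slot positions in the usage example are made up.

```python
from typing import List, Optional, Tuple

Ring = List[Optional[str]]  # 26 slots; None means "not yet determined"


def incorporate(inner: Ring, outer: Ring, plain: str, cipher: str) -> List[Tuple[Ring, Ring]]:
    """Return every ring state consistent with `plain` enciphering to `cipher`.

    An empty list means the configuration is inconsistent and can be discarded.
    """
    p = inner.index(plain) if plain in inner else None
    c = outer.index(cipher) if cipher in outer else None

    if p is not None and c is not None:
        # Both symbols are known: they must face each other.
        return [(inner, outer)] if p == c else []
    if p is not None:
        # Plain symbol known: the cipher symbol must fit into the facing slot.
        if outer[p] is not None:
            return []
        return [(inner, outer[:p] + [cipher] + outer[p + 1:])]
    if c is not None:
        # Cipher symbol known: the plain symbol must fit into the facing slot.
        if inner[c] is not None:
            return []
        return [(inner[:c] + [plain] + inner[c + 1:], outer)]
    # Neither symbol is known: branch over every slot that is empty on both rings.
    return [
        (inner[:k] + [plain] + inner[k + 1:], outer[:k] + [cipher] + outer[k + 1:])
        for k in range(len(inner))
        if inner[k] is None and outer[k] is None
    ]


# Usage with made-up positions: 7 slots hold at least one symbol, 19 are free.
inner: Ring = [None] * 26
outer: Ring = [None] * 26
inner[0], outer[0] = "L", "R"
inner[1], outer[2], inner[3], outer[4], inner[5], outer[6] = "G", "O", "O", "I", "A", "X"
print(len(incorporate(inner, outer, "D", "S")))  # 19 candidate configurations
```

A configuration in which D already faces O on the outer ring returns an empty list for the pair Q/O, which is the kind of inconsistency shown in Figure 2.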
Figure 2 An inconsistent configuration.
Since long sequences of repeated symbols are useful, it is a good idea to search through the text for the location with the longest sequence of repeated symbols. Doing such a search shows that the longest sequence
This is very encouraging. By searching through fewer than 26^{4} ≈ 457 000 ring configurations it is possible to find the configuration that matches the plain text to the cipher text—a task easily accomplished on a computer. It turns out that the task is even easier than that. Because inconsistent ring configurations are discarded as seen-before symbols are encountered, the maximum number of different consistent ring configurations to consider is 444. That is,
Now, 444 is not a large number and this code could well have been cracked by a few careful and persistent people. In a later article, on how to crack Exhibit 4, I will show that even fewer possible configurations need to be considered to crack that cipher. I believe that, had the mechanism underlying Byrne's cipher been known 90 years ago, it would eventually have been rejected as too weak for use by the military.
The best starting point for cracking Exhibit 1 is at offset 7187, that is at the location highlighted in the text below.
...RANCEOFTHESECOLONIESUANDSUCHISNOWTHENECESSITYWHICHCONSTRAINSTHEMTOALTERTHEIRFORM...
...SREXYRUWMBTXTHYVNGZLXELVTZDCQMVFLCBBYKBMESGHSOEPSKPKEWMEQWCOQNBURIIQBNQOGAAXPEIT...
From here two symbol pairs from the phrase are added to the rings without traversing any gaps.
First the I/G gap is traversed. This does not allow any further symbols to be added to the rings.
Next the T/H gap is traversed. This allows the addition of one more symbol pair, Y/S, since S is already on the outer ring.
Traversing the third gap (W/O) brings us a lot further. Note that symbols from both before and after the previous phrase can be added to the rings.
Traversing the final gap (M/U) allows us to find the full state of the rings and crack the entire cipher.
Figure 3 The initial configuration for Exhibit 1.
After having cracked Exhibit 1, the initial ring configuration that was used to encipher the plain text can be found (see Figure 3). At first glance it might not look as if there is anything very special about this configuration, but there are in fact many consecutive symbols on both rings. On the outer (cipher) ring there are QRST, XY, LM, ZAB, and JK. On the inner (plain) ring there are YZ, EFGH, OP, and TUV. Since Chaocipher shuffles alphabets quite quickly, it seems likely that the configuration in Figure 3 is not far removed from a fully ordered initial alphabet (with ABCD...Z on both rings). The key would be the phrase that takes us from the fully ordered alphabet to the initial alphabet for enciphering or deciphering. Such a key would need to be known by both the party that enciphered the text and the party that wants to decipher it. Here, as a code breaker, I would like to discover what the key was. To guide this search for the key, I worked from the following assumptions.
1. The starting point was a fully ordered alphabet (ABCD...Z) on both rings, and a key was applied to arrive at the initial state for enciphering.
The brute force approach would be to search through all possible keys of a particular length until one is found that takes us from the ordered alphabet to the alphabet in Figure 3. This requires too much calculation: searching through 26^{L} keys, where L is the (unknown) length of the key. We need some way of decreasing the size of the search space.
In my everyday work in the field of probabilistic inference, entropy is a measure of disorder—the lower the entropy, the more structure there is in whatever is being looked at. Define ring entropy as the total number of symbols on the rings that are not followed by the next symbol in the ordered alphabet. Here the following symbol is located clockwise from the reference symbol. For the configuration in Figure 3, the ring entropy is 37. This definition of entropy has some useful properties.
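The ring entropy defined above is straightforward to compute. The sketch below assumes each ring is given as a 26-letter string read clockwise, and that the alphabet is treated as circular so that Z is followed by A (the ZAB run listed for the outer ring suggests this). The function name `ring_entropy` is a hypothetical choice.

```python
import string

ALPHABET = string.ascii_uppercase


def ring_entropy(outer: str, inner: str) -> int:
    """Count symbols, over both rings, not followed clockwise by their alphabetic successor."""
    total = 0
    for ring in (outer, inner):
        for k, symbol in enumerate(ring):
            follower = ring[(k + 1) % len(ring)]                     # next slot clockwise, wrapping around
            successor = ALPHABET[(ALPHABET.index(symbol) + 1) % 26]  # assumption: Z is followed by A
            if follower != successor:
                total += 1
    return total


print(ring_entropy(ALPHABET, ALPHABET))                # 0 for two fully ordered rings
print(ring_entropy("BA" + ALPHABET[2:], ALPHABET))     # 3 after swapping A and B on one ring
```

With this convention the minimum of 0 is reached only when both rings are fully ordered, which is what makes the measure usable as a target for the backwards search.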
The search for the key is now done backwards from the configuration in Figure 3. This configuration has entropy 37, and we want to find symbols that would decrease the entropy (i.e. make the rings more ordered) with each backwards step. Since the key is expected to be short, the rings should become more ordered quite quickly when we backtrack correctly. Since the entropy is bounded from below, this process has to converge or fail after at most 37 steps.
Applying this search strategy reveals that the plain-text key for Exhibit 1 has length 10 and is TILNOYHIVK. Starting from the fully ordered configuration and enciphering this key leaves the rings in the configuration of Figure 3. This is not a very satisfying result since the key is clearly not an English phrase or name. It is also possible that the key was not meant to be enciphered but rather to be deciphered, in order to reach the initial ring configuration. Since we already know the plain-text key, the cipher-text key is easy to find. It is the output from enciphering the plain-text key, namely THIKKTBDNB. This is also not very satisfying. Fortunately, with the habit of writing plain-text phrases above their cipher-text counterparts,
TILNOYHIVK
THIKKTBDNB
comes a revelation. After staring at this key for a while, the phrase THINK THINK pops out.
TILNOYHIVK
THIKKTBDNB
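The way THINK THINK is spread over the two rows can be checked mechanically. The short sketch below reports, for each position, which row supplies the letter of the phrase. The helper name `ring_choices` is hypothetical.

```python
from typing import List

PLAIN_KEY = "TILNOYHIVK"
CIPHER_KEY = "THIKKTBDNB"
PHRASE = "THINKTHINK"


def ring_choices(plain: str, cipher: str, target: str) -> List[str]:
    """For each position, say which row (plain, cipher, or either) holds the target letter."""
    choices = []
    for p, c, t in zip(plain, cipher, target):
        if t == p and t == c:
            choices.append("either")
        elif t == p:
            choices.append("plain")
        elif t == c:
            choices.append("cipher")
        else:
            raise ValueError(f"{t!r} appears in neither row at this position")
    return choices


print(ring_choices(PLAIN_KEY, CIPHER_KEY, PHRASE))
# ['either', 'cipher', 'cipher', 'plain', 'cipher', 'cipher', 'plain', 'plain', 'cipher', 'plain']
```

Every position of the phrase is covered by one row or the other, and only the first letter, T, appears in both.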
It seems that Byrne used this 10-letter key and a pattern alternating between the inner and outer rings (that is, enciphering some symbols in the key and deciphering others) to arrive at the initial ring configuration for enciphering Exhibit 1.
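The relationship between the two forms of the key can be reproduced with a minimal enciphering sketch. It assumes the commonly published two-alphabet description of Chaocipher, with the outer (cipher) and inner (plain) rings as 26-character strings, the zenith at index 0 and the nadir at index 13. The function name `encipher` and its return shape are hypothetical, and this is not the code used for the attack.

```python
import string
from typing import Tuple

NADIR = 13  # zenith is index 0


def encipher(plain_text: str,
             outer: str = string.ascii_uppercase,
             inner: str = string.ascii_uppercase) -> Tuple[str, str, str]:
    """Encipher plain_text and return (cipher_text, final_outer, final_inner)."""
    out = []
    for symbol in plain_text:
        i = inner.index(symbol)   # locate the plain symbol on the inner ring
        out.append(outer[i])      # the facing outer symbol is the cipher symbol
        # Outer ring: cipher symbol to the zenith, then move zenith+1 to the nadir.
        outer = outer[i:] + outer[:i]
        outer = outer[0] + outer[2:NADIR + 1] + outer[1] + outer[NADIR + 1:]
        # Inner ring: plain symbol one slot past the zenith, then move zenith+2 to the nadir.
        inner = inner[i + 1:] + inner[:i + 1]
        inner = inner[:2] + inner[3:NADIR + 1] + inner[2] + inner[NADIR + 1:]
    return "".join(out), outer, inner


cipher_text, final_outer, final_inner = encipher("TILNOYHIVK")
print(cipher_text)  # THIKKTBDNB
```

Under these assumptions the final rings also contain the runs listed earlier, for example QRST on the outer ring and EFGH on the inner ring.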
Using a known plain-text plus cipher-text attack it was possible to crack the Chaocipher system. Cracking the code
Cracking Exhibit 1 was fun and challenging to an amateur cryptographer armed with a laptop computer and about 48 hours. However, I do not believe Chaocipher would have held up to professional code breakers even in the 1920s and even without the aid of a computer. In Chaocipher's defense, this attack did require both the plain and cipher texts to be known. A cipher-text only attack would be much more difficult and is, to my knowledge, still an unsolved problem. Furthermore, even slight modifications to the Chaocipher system result in texts with different statistical distributions and make them, without knowing what the modifications were, difficult to analyse or break—Byrne's Exhibits 2 and 3 are cases in point. Future articles will address cracking Exhibit 4, what I've learned from trying to crack Exhibits 2 and 3, and some thoughts on cipher-text only attacks.
Many thanks to Moshe Rubin for helpful comments and corrections.