# An Unbreakable Code?

I heard this method of creating a code described in a movie a long time ago.
Two people want to send encoded messages to each other by mail (or now email). To encode their messages they both buy the same novel, let's say Moby Dick.
To write the message they use a system where a triple of numbers designates a letter. For example, (240,15,41) means that the letter the person wants is on page 240 of Moby Dick, on line 15 of that page, at position 41 of that line.
This system is good because different triples can stand for the same letter, e.g. the letter k can be (17,6,43) or (88,51,7), so you can't use frequency to determine the letter.
The space between words can also be represented by a triple, so the encoded message has no separate words and you don't know the length of any word. The whole encoded message would be a chain of triples like (56,45,3)(316,44,53)... for the whole page.
As long as the key, the novel Moby Dick, is kept a secret, is this code unbreakable?
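
For anyone who wants to experiment with the scheme described above, here is a minimal Python sketch. The tiny `BOOK` and the helper names `build_index`, `encode`, and `decode` are made up for illustration; a real key would be the full text of the shared edition, split into pages and lines exactly as printed.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for the shared novel: a list of pages,
# each page a list of lines. Not the real text of any edition.
BOOK = [
    ["call me ishmael", "some years ago"],
    ["never mind how long", "precisely having little"],
]

def build_index(book):
    """Map each character to every 1-based (page, line, position) where it occurs."""
    index = defaultdict(list)
    for p, page in enumerate(book, start=1):
        for l, line in enumerate(page, start=1):
            for c, ch in enumerate(line, start=1):
                index[ch].append((p, l, c))
    return index

def encode(message, index, rng=random):
    """Replace each character (spaces included) with a randomly chosen triple."""
    # Raises IndexError if a character never appears in the book.
    return [rng.choice(index[ch]) for ch in message]

def decode(triples, book):
    """Look each triple up in the book and join the characters."""
    return "".join(book[p - 1][l - 1][c - 1] for p, l, c in triples)

index = build_index(BOOK)
cipher = encode("meet me", index)
print(cipher)                # a different list of triples on most runs
print(decode(cipher, BOOK))  # meet me
```

Because `encode` picks among all occurrences of a letter at random, the same message usually produces different triples each time, which is the frequency-hiding property the post describes.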

The biggest weakness is that there aren't that many books that have been published in human history. If I know the language your plaintext is written in, a brute force search will find the book fairly quickly. Remember, your key isn't the order in which you use the letters from the book, but the book itself. There is no combinatorial explosion to give you a large keyspace.

Even factoring in different editions of the same book, this is on the order of 10^9 ~ 2^30 books. Once you have a searchable catalogue, it's absolutely trivial to break.
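
To put numbers on that estimate, a quick back-of-the-envelope check in Python. The figure of 10^9 books is the estimate from this post; the trial rate of 1,000 books per second is purely an assumption for illustration, not a measured figure.

```python
import math

BOOKS = 10**9            # rough estimate from the post, editions included
RATE_PER_SECOND = 1_000  # assumed trial rate, purely illustrative

bits = math.log2(BOOKS)                  # equivalent key length in bits
days = BOOKS / RATE_PER_SECOND / 86_400  # time to try every book

print(f"key size: about {bits:.1f} bits")
print(f"exhaustive search: about {days:.1f} days at {RATE_PER_SECOND} books/s")
```

Under those assumptions the key is worth about 29.9 bits and the full search takes about 11.6 days on a single machine, which is tiny next to the 128-bit keys used by modern ciphers.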

PeroK
What if you used your own "book"? Something you wrote specifically.

Or, what if you coded the page references as well? 149 could be simply coded as 249 or whatever.

If you write your own "book", then you might as well write down a codebook consisting of nothing but uniformly random letters. If you discard a book after each use, then it's really unbreakable. This is called a "one-time pad".
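
A minimal sketch of the one-time pad idea, assuming messages are restricted to the letters A-Z and the pad is combined with the message by addition mod 26 (one common textbook variant). The function names are made up for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase

def make_pad(length):
    """Generate a pad of uniformly random letters from a cryptographic RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def otp_encrypt(plaintext, pad):
    """Add each pad letter to the matching plaintext letter, mod 26."""
    if len(pad) < len(plaintext):
        raise ValueError("pad must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, pad)
    )

def otp_decrypt(ciphertext, pad):
    """Subtract each pad letter from the matching ciphertext letter, mod 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, pad)
    )

message = "MEETATNOON"
pad = make_pad(len(message))      # must be used once, then destroyed
cipher = otp_encrypt(message, pad)
print(cipher)
print(otp_decrypt(cipher, pad))   # MEETATNOON
```

The security rests entirely on the pad being truly random, at least as long as the message, and never reused.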

If you re-use the book, what you have is a kind of polyalphabetic cipher. You can break them with many methods, like http://en.wikipedia.org/wiki/Kasiski_examination

The other main problem is that your cipher is still deterministic when it comes to individual letters, which means 147:64:1 will always be "A", for example.
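
As a rough illustration of what that determinism leaks, the sketch below counts repeated triples in a pile of intercepted traffic. The `intercepted` list is invented sample data, not real ciphertext.

```python
from collections import Counter

# Hypothetical triples collected from several messages that reuse one book.
intercepted = [
    (147, 64, 1), (12, 3, 9), (147, 64, 1),
    (88, 51, 7), (12, 3, 9), (147, 64, 1),
]

counts = Counter(intercepted)

# Any triple seen more than once marks positions that share the same letter.
repeated = {triple: n for triple, n in counts.items() if n > 1}
print(repeated)  # {(147, 64, 1): 3, (12, 3, 9): 2}
```

Every repeat tells the attacker that two ciphertext positions hide the same plaintext letter, so the more the book is reused, the more structure leaks.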

Simon Bridge
Dragonfall said:
The biggest weakness is that there aren't that many books that have been published in human history.
No that's not really it...
https://what-if.xkcd.com/76/
... the literary output of humanity throughout the history of publishing has been pretty vast ... the main thing is that the book effectively correlates the number groups. You'd certainly want to avoid using a book that has an electronic version, e.g. anything out of copyright. The main problem is the same as with anything where a human is expected to be "random". There are a bunch of studies on this. It's pretty well discussed in the link I gave, but nobody follows links much, do they?

Note: technically we are talking about a "cypher" rather than a code.
http://en.wikipedia.org/wiki/Code_(cryptography)

Dragonfall said:
If you write your own "book", then you might as well write down a codebook consisting of nothing but uniformly random letters. If you discard a book after each use, then it's really unbreakable. This is called a "one-time pad".
I agree with this.
A one time pad has an advantage over a book, if the main aim is to be difficult to break, in that you can arrange for the letters to be uncorrelated.
The advantage of writing a book over writing a one-time pad is that it could maybe blend in ... so not obviously a cypher-book.

The main thing to think of is "what is the problem that this approach is supposed to solve?"
I mean as opposed to other common approaches.

One reason for using a book may be that you have to somehow communicate the key to the other person while you are under surveillance. Maybe you are a prisoner and your letters are being read? It may be possible, and worthwhile, to indicate a favorite book to someone who is more familiar with you than possible attackers are. Of course you'd still have to hide the number sequences in a regular letter. My point here is that codes often have to do more than just be "unbreakable"; in fact, being unbreakable is not often worth the effort.

I also tend to the principle that if you don't know how to break a cypher that you use, then it is insecure.
Remember: anyone can make a cypher that they, themselves, cannot break.
https://www.schneier.com/blog/archives/2011/04/schneiers_law.html

Simon Bridge said:
No that's not really it...
https://what-if.xkcd.com/76/
... the literary output of humanity throughout the history of publishing has been pretty vast ... the main thing is that the book effectively correlates the number groups. You'd certainly want to avoid using a book that has an electronic version, e.g. anything out of copyright. The main problem is the same as with anything where a human is expected to be "random". There are a bunch of studies on this. It's pretty well discussed in the link I gave, but nobody follows links much, do they?

Like I said before,

Let's say we have been this productive every year for the past 10000 years, it's still around 2^35. Trivially breakable.

The point of the "book code" is that it was supposed to solve key distribution. Why distribute keys yourselves when publishers already do that for you for free? So all you need to do is exchange a few book titles as opposed to actual codebooks.

This is moot now that we have public key crypto.

If you are worried about communicating with someone under surveillance, what you want isn't cryptography, but steganography. It's a different problem, one which book codes don't solve anyway.

I still do not see how it is possible. If I were to send "I will meet the spy at the train station on Tuesday morning", it would be impossible. It is such a short message that I don't see how to decipher it. By the way, I am not using Moby Dick as my key; I am using "Antwort uber Balthasar Hubmaiers Taufbuchlein" by Hulderich Zwingli, the 16th century Protestant reformer (Leipzig, 1927). (This is as obscure as I can get!)
Wouldn't you also get false positives, like "The spy is in my living room", that would make the whole effort useless?

phinds
I agree w/ you that this would be completely unbreakable.

It's trivially breakable because for every book except that one your code will decrypt to nonsense.
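
A toy sketch of that search, assuming the attacker has a catalogue of candidate books and a crude way to score a decryption as "looks like language". The three `CANDIDATES`, the `COMMON` word list, and the scoring rule are all invented for illustration; a real attack would use a large corpus and a proper language model.

```python
# Hypothetical catalogue: each book is a list of pages, each page a list of lines.
CANDIDATES = {
    "book_a": [["call me ishmael", "some years ago"],
               ["never mind how long", "precisely having little"]],
    "book_b": [["it was the best of times", "it was the worst"],
               ["of times it was the age", "of wisdom it was"]],
    "book_c": [["the quick brown fox jumps over", "the lazy dog and runs away now"],
               ["a second page of filler text here", "with enough letters to index into"]],
}

COMMON = {"the", "spy", "at", "meet", "me", "noon"}  # stand-in dictionary

def try_decode(triples, book):
    """Decode 1-based triples; return None if any triple falls outside the book."""
    try:
        return "".join(book[p - 1][l - 1][c - 1] for p, l, c in triples)
    except IndexError:
        return None

def score(text):
    """Fraction of space-separated tokens that are known words."""
    words = text.split() if text else []
    return sum(w in COMMON for w in words) / len(words) if words else 0.0

# Intercepted message, encoded with one of the candidate books.
cipher = [(1, 1, 6), (1, 1, 7), (1, 1, 14), (2, 2, 20),
          (1, 1, 5), (1, 1, 12), (2, 2, 3)]

results = {name: try_decode(cipher, book) for name, book in CANDIDATES.items()}
best = max(results, key=lambda name: score(results[name]))

for name, text in results.items():
    print(name, repr(text), score(text))
print("best guess:", best)
```

Here `book_b` fails outright because one triple points past the end of a line, `book_c` yields gibberish, and only `book_a` produces readable text, which is why a wrong book is so easy to reject.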

Simon Bridge
"Unbreakable" means that it cannot be broken. What you have described is a breakable cypher which is sufficiently difficult to break that it may be useful under some constraints. You can call that "unbreakable" (functionally unbreakable?) if you want, but, by that way of thinking, any cypher is "unbreakable" given sufficient constraints; you can always find a sufficiently restricted situation where a particular cryptosystem is secure (e.g. the stories about Leonardo and mirror writing). The trouble is that that's not how people use codes. Think how easy it is to break password protection.

You need to think in terms of which problems you want the cypher to solve, before thinking up the cypher.
Usually the problem the book cypher is supposed to solve is to do with getting the key to the recipient... do you understand why that's an issue?
These days we do that with public-key cryptography. That pretty much needs a computer, but everyone has a phone these days, or can get one more easily than they can get a copy of Antwort.