Cracking a monoalphabetic substitution cipher

In summary, the thread is about breaking a ciphertext that was encrypted with a monoalphabetic substitution cipher. The ciphertext is 244 characters long and consists only of uppercase letters. The poster has tallied the letter frequencies and guessed a few letters of the key. A reply explains how to set up a chi-square test, and notes that the result has to be read in the context of the problem and that language is Markovian, so conditional probabilities matter as well.
  • #1
Bipolarity
I am trying to break a harmless ciphertext that uses a monoalphabetic substitution cipher.
The ciphertext is exactly 244 characters long, without any spaces between words. It consists only of uppercase letters.

ciphertext = "JGRMQOYGHMVBJWRWQFPWHGFFDQGFPFZRKBEEBJIZQQOCIBZKLFAFGQVFZFWWEOGWOPFGFHWOLPHLRLOLFDMFGQWBLWBWQOLKFWBYLBLYLFSFLJGRMQBOLWJVFPFWQVHQWFFPQOQVFPQOCFPOGFWFJIGFQVHLHLROQVFGWJVFPFOLFHGQVQVFILEOGQILHQFQGIQVVOSFAFGBWQVHQWIJVWJVFPFWHGFIWIHZZRQGBABHZQOCGFHX"

I have tallied the frequency of each letter. The Percent value is that letter's share of the 244 characters.

Letter: F Tally: 37 Percent: 15.16
Letter: Q Tally: 26 Percent: 10.66
Letter: W Tally: 21 Percent: 8.61
Letter: G Tally: 19 Percent: 7.79
Letter: L Tally: 17 Percent: 6.97
Letter: O Tally: 16 Percent: 6.56
Letter: V Tally: 15 Percent: 6.15
Letter: H Tally: 14 Percent: 5.74
Letter: B Tally: 12 Percent: 4.92
Letter: P Tally: 10 Percent: 4.10
Letter: I Tally: 9 Percent: 3.69
Letter: J Tally: 9 Percent: 3.69
Letter: R Tally: 7 Percent: 2.87
Letter: Z Tally: 7 Percent: 2.87
Letter: E Tally: 4 Percent: 1.64
Letter: M Tally: 4 Percent: 1.64
Letter: A Tally: 3 Percent: 1.23
Letter: C Tally: 3 Percent: 1.23
Letter: K Tally: 3 Percent: 1.23
Letter: Y Tally: 3 Percent: 1.23
Letter: D Tally: 2 Percent: 0.82
Letter: S Tally: 2 Percent: 0.82
Letter: X Tally: 1 Percent: 0.41
Letter: N Tally: 0 Percent: 0.00
Letter: U Tally: 0 Percent: 0.00
Letter: T Tally: 0 Percent: 0.00
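
A tally like the one above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `letter_frequencies` is made up for illustration, and it assumes that only the uppercase letters A to Z should be counted.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return (letter, tally, percent) rows for A-Z, most frequent first."""
    letters = [ch for ch in text if ch in string.ascii_uppercase]
    counts = Counter(letters)
    total = len(letters)
    rows = []
    for letter in string.ascii_uppercase:
        tally = counts[letter]
        percent = 100.0 * tally / total if total else 0.0
        rows.append((letter, tally, percent))
    # Sort by descending tally, breaking ties alphabetically.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


if __name__ == "__main__":
    # Short stand-in string; pass the full ciphertext here instead.
    for letter, tally, percent in letter_frequencies("JGRMQOYGHMVB"):
        if tally:
            print(f"Letter: {letter} Tally: {tally} Percent: {percent:.2f}")
```

Letters that never occur are still returned with a tally of zero, which makes it easy to see which ciphertext letters are unused.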

Since I haven't taken much statistics, I'm not sure how to set up a chi-square test for this problem, but my cryptanalysis text says that a program using a chi-square test is essential for solving it as efficiently as possible.

Perhaps someone could help me with the chi-square? Or perhaps someone could help me with a few letters of the key using their knowledge of English?

I don't necessarily care about solving the problem efficiently; I would just like to know what the ciphertext decrypts to.

Progress so far (plaintext --> ciphertext --> plaintext):
E --> F --> E
T --> Q --> T
H --> G --> H
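
One way to check guesses like these is to apply the partial key to the ciphertext and see whether the result starts to look like English. The following is a minimal sketch, assuming the three guesses above are read as ciphertext F, Q, G standing for plaintext E, T, H. The helper `apply_partial_key` is hypothetical, and undeciphered letters are shown as dots.

```python
def apply_partial_key(ciphertext, key, unknown="."):
    """Replace each ciphertext letter using `key` (cipher -> plain).

    Letters with no entry in `key` are replaced by `unknown`, so the
    known plaintext letters stand out.
    """
    return "".join(key.get(ch, unknown) for ch in ciphertext)


if __name__ == "__main__":
    # Tentative guesses taken from the post; they may need revising.
    guesses = {"F": "E", "Q": "T", "G": "H"}
    # First few characters of the ciphertext as a stand-in.
    print(apply_partial_key("JGRMQOYGHMVBJWRWQFPWHGFF", guesses))
```

If a guess leads to letter patterns that cannot occur in English, that is a hint the guess should be revised.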

Thanks!

BiP
 
  • #2
Hey BiPolarity.

For a chi-square test, you have an expected distribution (here, the expected frequencies of letters in English, the kind of statistics Claude Shannon studied) and an observed distribution (the letter counts in your text).

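You calculate the test statistic by summing (O_i - E_i)^2 / E_i over all letters i, where O_i is the observed count and E_i is the expected count of letter i. You then find the probability value corresponding to where this statistic lies on a chi-squared distribution with n - 1 degrees of freedom, where n is the number of categories in the distribution (in this case n = 26, so 25 degrees of freedom).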
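
The statistic described above can be sketched in Python as follows. The names `chi_square_statistic`, `english_chi_square`, and `ENGLISH_PERCENT` are made up for illustration, and the English letter percentages are approximate, commonly quoted values rather than figures from this thread. The sketch assumes every expected probability is greater than zero.

```python
from collections import Counter

# Approximate English letter frequencies in percent (an assumption;
# published tables differ slightly from one corpus to another).
ENGLISH_PERCENT = {
    "E": 12.70, "T": 9.06, "A": 8.17, "O": 7.51, "I": 6.97, "N": 6.75,
    "S": 6.33, "H": 6.09, "R": 5.99, "D": 4.25, "L": 4.03, "C": 2.78,
    "U": 2.76, "M": 2.41, "W": 2.36, "F": 2.23, "G": 2.02, "Y": 1.97,
    "P": 1.93, "B": 1.49, "V": 0.98, "K": 0.77, "J": 0.15, "X": 0.15,
    "Q": 0.10, "Z": 0.07,
}


def chi_square_statistic(observed_counts, expected_probs):
    """Sum of (O_i - E_i)^2 / E_i over every category in expected_probs."""
    total = sum(observed_counts.get(k, 0) for k in expected_probs)
    statistic = 0.0
    for category, prob in expected_probs.items():
        expected = prob * total                      # E_i
        observed = observed_counts.get(category, 0)  # O_i
        statistic += (observed - expected) ** 2 / expected
    return statistic


def english_chi_square(text):
    """Chi-square statistic of the A-Z counts in `text` against English."""
    counts = Counter(ch for ch in text if ch in ENGLISH_PERCENT)
    scale = sum(ENGLISH_PERCENT.values())
    probs = {k: v / scale for k, v in ENGLISH_PERCENT.items()}
    return chi_square_statistic(counts, probs)
```

A candidate decryption whose statistic is small is closer to English letter frequencies than one whose statistic is large. For 25 degrees of freedom, standard tables give a critical value of about 37.65 at the 0.05 level, though with only 244 letters the comparison between candidates is probably more useful than the formal test.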

If the probability is too small (we usually test at the 0.05 level), we reject the hypothesis that the observed counts follow the expected distribution. However, you need to take the context of your problem into account.

The other thing is that language is by nature Markovian, which means that conditional probabilities (how likely a letter is given the letter before it) matter just as much as, if not more than, the unconditional frequencies.
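
The simplest conditional statistic to collect is the count of adjacent letter pairs (bigrams). The sketch below is one possible way to do that; the function name `bigram_counts` is an assumption, not something from the thread.

```python
from collections import Counter


def bigram_counts(text):
    """Count overlapping adjacent letter pairs in `text`."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


if __name__ == "__main__":
    # Short stand-in string; pass the full ciphertext here instead.
    for pair, count in bigram_counts("QVFPQOCFPOGFWFQVF").most_common(5):
        print(pair, count)
```

In English, pairs such as TH and HE are among the most common, so frequent ciphertext pairs are natural candidates for them.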
 

1. How does a monoalphabetic substitution cipher work?

A monoalphabetic substitution cipher is a simple type of encryption where each letter in the original message is replaced with another letter according to a fixed key. For example, over the four-letter alphabet ABCD, the key "BADC" means that every "A" in the message is replaced with "B", every "B" with "A", every "C" with "D", and every "D" with "C". The same replacement is applied to every letter in the message, resulting in a jumbled and unreadable ciphertext.
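
The "BADC" example can be written out as a short Python sketch. The helper names `make_tables` and `substitute` are hypothetical, and the four-letter alphabet is only there to match the example; a real key would be a rearrangement of all 26 letters.

```python
def make_tables(key, alphabet):
    """Build encryption and decryption tables from a substitution key.

    `key` must be a rearrangement of `alphabet`: the i-th letter of the
    alphabet is replaced by the i-th letter of the key.
    """
    if sorted(key) != sorted(alphabet):
        raise ValueError("key must be a permutation of the alphabet")
    encrypt_table = dict(zip(alphabet, key))
    decrypt_table = dict(zip(key, alphabet))
    return encrypt_table, decrypt_table


def substitute(text, table):
    """Apply a substitution table, leaving unknown characters unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    enc, dec = make_tables("BADC", "ABCD")
    secret = substitute("CAB", enc)
    print(secret, substitute(secret, dec))
```

Decryption is the same operation with the table reversed, which is why recovering the key is enough to read the message.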

2. How can I crack a monoalphabetic substitution cipher?

The most common method for cracking a monoalphabetic substitution cipher is through frequency analysis. This involves looking at the frequency of letters in the ciphertext and comparing it to the frequency of letters in the English language. Letters that appear most frequently in the ciphertext are likely to correspond to letters that appear most frequently in the English language. By making educated guesses and applying patterns, it is possible to gradually decipher the message.
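
A first guess of this kind can be generated automatically by pairing ciphertext letters, ranked by frequency, with English letters ranked the same way. This is a rough sketch under assumptions: `initial_key_guess` is a made-up name, and `ENGLISH_ORDER` is one commonly quoted approximate ranking. On a short ciphertext many of the lower-ranked pairings will be wrong, so the result is only a starting point for manual refinement.

```python
from collections import Counter

# Approximate ranking of English letters from most to least frequent
# (an assumption; the exact order varies between sources).
ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"


def initial_key_guess(ciphertext):
    """Map ciphertext letters to English letters by frequency rank."""
    counts = Counter(ch for ch in ciphertext if ch.isalpha())
    # Most frequent first, ties broken alphabetically for repeatability.
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return dict(zip(ranked, ENGLISH_ORDER))


if __name__ == "__main__":
    print(initial_key_guess("FFFQQW"))
```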

3. Are there any tools or software available for cracking monoalphabetic substitution ciphers?

Yes, there are several tools and software available that can assist in cracking monoalphabetic substitution ciphers. These include online decryption tools, specialized software, and even mobile apps. However, it is important to note that these tools are not foolproof and may not always provide accurate results.

4. How long does it take to crack a monoalphabetic substitution cipher?

The time it takes to crack a monoalphabetic substitution cipher depends on factors such as the length of the message (longer ciphertexts give more reliable letter statistics), whether word boundaries are preserved, and the skill and experience of the person attempting to crack it. In some cases it can take just a few minutes, while in others it may take days or even weeks.

5. Can a monoalphabetic substitution cipher be considered a secure method of encryption?

No, a monoalphabetic substitution cipher is not considered a secure method of encryption. It is a very basic and easily breakable form of encryption, as the key can be easily guessed or discovered through frequency analysis. It is important to use more complex and secure methods of encryption when dealing with sensitive or important information.
