
Why does frequency analysis work?

  1. Jul 9, 2013 #1
    In English, 'e' is the most frequent letter. Other languages also have some letters that appear more often in text than others. Why is this? Why don't all letters appear with equal frequency?
    Do humans pronounce vowels more comfortably? What exactly is the reason?
  2. jcsd
  3. Jul 11, 2013 #2


    Gold Member

    Tone languages are languages (like Chinese, Thai, Yoruba, and Zulu) in which the pitch or “tone” of words and syllables makes a difference to word meaning. For example, in Chinese huār (with a high level pitch) means ‘flower’ and huàr (with a falling pitch) means ‘picture’. In non-tonal languages (like English or Spanish), pitch is only used at the sentence level, for emphasis and overall meanings like questioning. Roughly half the languages in the world are tonal and half are non-tonal, but they’re fairly unevenly distributed: tone languages are the norm in sub-Saharan Africa and are common in Southeast Asia and among Native American languages especially in parts of Central and South America. Non-tone languages are the norm in Europe and Central, South and West Asia, and among the aboriginal languages of Australia.

    The World Atlas of Language Structures (WALS) is a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars) by a team of 55 authors (many of them the leading authorities on the subject).

    Specifically, see this chapter:
    Chapter 2: Vowel Quality Inventories
    by Ian Maddieson
    Some excerpts:

    1. Introduction
    This chapter discusses the number of vowel contrasts in the inventory of sounds in languages.

    2. Establishing the values.
    When vowel qualities are counted in this way in the sample of languages surveyed for this chapter, the average number of vowels in a language is just fractionally below 6. The smallest vowel quality inventory recorded is 2 and the largest 14.

    3. Geographical distribution
    There are strong areal patterns in the distribution of vowel quality inventories. Not surprisingly, languages with average inventory sizes are the most widely scattered. In just a few areas, southern Africa being one, they occur almost to the exclusion of the other two types.

  4. Jul 12, 2013 #3


    Staff Emeritus
    Science Advisor
    Homework Helper

    Why should letters appear with equal frequency? Do the sounds of a language occur with equal frequency?

    Vowels are used to sound out the consonants, at least in Indo-European languages. The English alphabet has 5 vowel letters and 21 consonant letters (spoken English has more vowel sounds than vowel letters). Other languages will have a slightly different mix.

    Frequency analysis is one tool which can be used to attack ciphered messages. Other tools are needed along with FA to produce a complete decipherment.
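
    As a rough illustration of the point above, here is a minimal sketch of frequency analysis against a Caesar (shift) cipher. The cipher, the sample message, and the helper names `caesar_shift` and `guess_shift` are assumptions chosen for the example, not anything from this thread. The sketch guesses that the most common ciphertext letter stands for 'e', which only works for fairly typical English text and a simple shift cipher.

```python
from collections import Counter
import string


def caesar_shift(text, shift):
    """Shift each ASCII letter by `shift` positions; leave other characters alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def guess_shift(ciphertext):
    """Guess the shift by assuming the most common ciphertext letter is 'e'."""
    letters = [ch for ch in ciphertext.lower() if ch in string.ascii_lowercase]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26


# Hypothetical message, deliberately rich in the letter 'e'.
plaintext = "meet me here when the seven street lamps are lit"
ciphertext = caesar_shift(plaintext, 3)

shift = guess_shift(ciphertext)
recovered = caesar_shift(ciphertext, -shift)
print(shift, recovered)
```

    For short or unusual messages the "most common letter is 'e'" guess can easily fail, which is one reason other tools are needed alongside frequency counts.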
  5. Jul 13, 2013 #4
    Yes, I know it's used to attack ciphered messages. Actually, this question arose from that very context. I was curious to know why some letters occur more frequently than others.
    The question perhaps requires knowledge of how human speech works. Can you please explain why this works?
  6. Jul 13, 2013 #5


    Staff Emeritus
    Science Advisor
    Homework Helper

    All I can say about FA development comes from this article:


    See the section on History and Usage.

    If you are really interested in cryptography and ciphers, I recommend the book by Kahn (in the References portion of the same article.)

    However, it does stand to reason that the occurrence of letters in written text, like a lot of things, would have some statistical distribution, given enough samples of text written in the same language. Some clever person recognized this in the mists of time, before statistical analysis was ever thought of.
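
    To see such a distribution for yourself, a minimal sketch like the following counts the relative frequency of each letter in a piece of text. The function name `letter_frequencies` and the sample sentence are assumptions for illustration; a single sentence is far too small a sample to reproduce published frequency tables, so treat the output only as a demonstration that the counts are not uniform.

```python
from collections import Counter


def letter_frequencies(text):
    """Return {letter: relative frequency}, ordered from most to least common."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    counts = Counter(letters)
    total = len(letters)
    return {letter: count / total for letter, count in counts.most_common()}


# Small sample; a real study would use a much larger body of text.
sample = (
    "It does stand to reason that the occurrence of letters in written text "
    "would have some statistical distribution, given enough samples of text "
    "written in the same language."
)

freqs = letter_frequencies(sample)
for letter, freq in list(freqs.items())[:5]:
    print(f"{letter}: {freq:.3f}")
```

    If every letter were equally likely, each would sit near 1/26 (about 0.038); the top few letters in almost any natural-language sample come out well above that.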
  7. Jul 16, 2013 #6
    Yes, there has to be some statistical distribution, but I find that the sounds of 'e' and 'a' are the most common across many languages. I am trying to find the reason behind this.