30 March 2012


ESIARN TOLCDU PMGHBY FVKWZX QJ should be used in place of the more familiar ETAOIN SHRDLU CMFWYP VBGKQJ XZ when guessing letters in the word game "Hangman."

The new sequence is better because the old one represents the frequency of letters in the written/spoken language, which is dominated by common words, while the new sequence reflects letter frequency in the list of dictionary words.
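To make the distinction concrete, here is a minimal sketch that ranks letters two ways: once over running text, where common words repeat, and once over a deduplicated word list. The function name `letter_order` and the tiny sample sentence are made up for illustration; this is not the method or the data used in the linked analysis.

```python
from collections import Counter

def letter_order(words):
    """Return the letters in `words`, most frequent first (ties broken alphabetically)."""
    counts = Counter()
    for word in words:
        # Count every alphabetic character, ignoring case
        counts.update(ch for ch in word.lower() if ch.isalpha())
    return "".join(sorted(counts, key=lambda ch: (-counts[ch], ch)))

# Running text: "the" and "and" repeat, so their letters are weighted more heavily
text_tokens = "the cat and the dog and the bird".split()

# Dictionary-style list: every distinct word is counted exactly once
dictionary = sorted(set(text_tokens))

print("text order:      ", letter_order(text_tokens))
print("dictionary order:", letter_order(dictionary))
```

Even on this toy sample the two orderings differ, because "h" and "e" get their rank in the text almost entirely from the repeated "the".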

Then you need to modify your choices according to the length of the word, and refine them again depending on whether the first letter you tried was present in the word or not.
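One way to sketch that refinement is to keep only the words that still fit the board (same length, revealed letters in place, no letter already tried hiding in a blank) and then pick the letter found in the most surviving words. The function `next_guess`, the `_` placeholder for blanks, and the small word list are assumptions for illustration, and "appears in the most remaining words" is just one plausible scoring rule, not necessarily the one in the linked article.

```python
from collections import Counter

def next_guess(words, pattern, guessed):
    """Suggest the untried letter contained in the most words that still fit the board.

    pattern: the board, with '_' for hidden positions, e.g. "_a_"
    guessed: set of lowercase letters tried so far (hits and misses)
    """
    candidates = []
    for word in words:
        if len(word) != len(pattern):
            continue  # wrong length
        # Revealed positions must match; hidden positions cannot hold a tried letter
        if all((p == c) if p != "_" else (c not in guessed)
               for p, c in zip(pattern, word)):
            candidates.append(word)

    counts = Counter()
    for word in candidates:
        # Count each letter once per word: a hit is a hit however often it occurs
        counts.update(set(word) - set(guessed))
    if not counts:
        return None  # no candidate words left, or nothing left to guess
    return min(counts, key=lambda ch: (-counts[ch], ch))

words = ["cat", "dog", "bat", "rat", "bird", "the", "and"]
print(next_guess(words, "___", set()))   # first guess for a three-letter word
print(next_guess(words, "_a_", {"a"}))   # after "a" was a hit in the middle
print(next_guess(words, "___", {"a"}))   # after "a" was a miss
```

The three calls show the same word length giving different suggestions depending on whether the first guess hit or missed.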

Full details are at Data Genetics, via Neatorama.


  1. I watched way too much Wheel of Fortune as a kid so I always guess RSTLNE as my first letters.

  2. Very nice to read and think about.
    I am pondering one possible issue though: does pluralizing a word make it into another word? I mean, does the dictionary the author uses include "dog" and "dogs" as two separate entries?
    If not (and if "dogs" is an acceptable word in hangman), then "s" might be a more frequently occurring letter than this analysis implies.

  3. I don't think spoken English actually has any letter frequency.

  4. Yes, the author's dictionary had plurals in it. So yes, the analysis takes this into account.

    (I'm the author)

    1. Thanks for your reply!
      And again, nicely written and illustrated ;)

  5. This is not actually the optimal series. What you really want to do is guess letters whose revelation will most quickly reveal the solution. That is, the letters that contain the most information. E, for example, is likely to be uninformative precisely because it occurs more frequently. Thus there is a trade-off between guessing letters likely to be there to get some information (which is better than none) and getting the most useful information the soonest (so as to minimize the number of subsequent guesses required).

    1. I believe the way the game is played you are limited only by the number of wrong guesses, not by the number of guesses per se. Thus, even if adding an E to give _ _ _ _ _ _ _ E _ doesn't add much "information," it costs you nothing, and thus would be a logical early step.
