In the 1960s, Mark Mayzner studied the frequency of letter combinations in English words using a 20,000-word corpus he put together from a variety of sources, including newspapers and books.
In 2012, the now-retired researcher wrote to Peter Norvig asking if he and his Google team would be interested in using the Google Corpus Data to expand on his previous research. The answer was a resounding ‘yes!’
Using the books scanned by Google, Norvig’s team identified 97,565 distinct words, which were mentioned a total of 743,842,922,321 times (a vast increase on Mayzner’s 20,000). You can read their very interesting results here.
If you’re a more visual person, Abacaba have made a wonderfully colourful video showing the most common words in English using Norvig’s research. What do you think the most commonly used word in English is? What about the most commonly used 4-letter word in English? Watch and find out if you’re right!