Some readers might be curious about how we came to have word counts in the first place. Contrary to popular belief, word counts existed long before computers did, and students and professionals once had to count words by hand! Because publishers were constrained by the amount of paper they could use per book, they used the word count to predict how much paper they would need to order to print a single copy.
But where did it all begin? If you are familiar with history, you will not be surprised that, like many technologies, the word count began with a military application.
In ancient times, nobody had any use for word counts, though writers tried to be as concise as possible when inscribing text on stone tablets and, later, when writing in codices made of parchment. This was mostly because writing materials were quite expensive at the time, though poets and writers also naturally gravitated to specific sentence structures that sounded good to the human ear when read aloud. However, as literacy improved, classical states such as Rome began adopting simple forms of encryption to prevent hostile entities from reading their military communications.
These codes are known as Caesar ciphers. The simplest version shifts the alphabet in one direction, so that A becomes B, B becomes C, and so on. More advanced versions, today called substitution ciphers, take the standard list of A to Z and assign a different, randomly chosen letter to each one. The people who were supposed to be able to read these messages were given the key, which was just a list showing which letters matched each other.
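The two schemes described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions (an A to Z alphabet, with text converted to upper case); the function names `caesar_shift`, `substitute` and `invert_key` are made up for this example.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABC...Z"

def caesar_shift(text, shift):
    """Shift every letter A-Z forward by `shift` places, wrapping around at Z."""
    k = shift % 26
    table = str.maketrans(ALPHABET, ALPHABET[k:] + ALPHABET[:k])
    return text.upper().translate(table)

def substitute(text, key):
    """Apply a general substitution; `key` is a 26-letter reordering of A-Z."""
    table = str.maketrans(ALPHABET, key)
    return text.upper().translate(table)

def invert_key(key):
    """Build the decoding key that reverses `key`."""
    return "".join(ALPHABET[key.index(letter)] for letter in ALPHABET)

# Example key, chosen arbitrarily for illustration (keyboard order).
key = "QWERTYUIOPASDFGHJKLZXCVBNM"
secret = substitute("ATTACK AT DAWN", key)
print(caesar_shift("ABC", 1))               # shift by one: A->B, B->C, C->D
print(substitute(secret, invert_key(key)))  # recovers the original message
```

Anyone holding the key can rebuild the decoding table with `invert_key`, which is why the whole scheme was only as safe as the key itself.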
At the time, no one was able to break coded messages mathematically because literacy rates were still relatively low, and most people simply assumed that text they could not read was written in a foreign language. As a result, military communications were secure unless the key was leaked. This state of affairs continued more or less unchanged until the final great war between the Eastern Roman Empire (also known as the Byzantine Empire) and the Sasanian Persian Empire in the 600s, which left both sides utterly exhausted. While the Romans had technically won, they did so at massive cost in both money and manpower.
It was at this moment that the mercenaries from Arabia, who had fought for both sides throughout the war and knew the battle tactics and communication techniques of all parties, became independent and conquered much of what we today call the Middle East. The Romans lost all of North Africa and most of the Levant, while the Sasanians were conquered completely. With the changing status quo in the region came rapid technological developments, as the Romans now competed against the various Arab Caliphates.
It was in this environment that, in the 9th century, an Arab polymath named Al-Kindi wrote a text called “A Manuscript on Deciphering Cryptographic Messages”. His method was to analyse letter frequency in a language. By analysing a large amount of text, he could find the percentage of the time that each letter appears in ordinary writing. Once that is done, you can go on to analyse sequences of letters called bigrams, trigrams, and so on (a bigram is two letters long, a trigram is three letters long, etc.).
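As a rough modern sketch of that counting step, the following Python function tallies how often each letter, bigram or trigram appears in a sample text and reports the result as a percentage. The name `ngram_frequencies` and the choice to count only within words (ignoring spaces and punctuation) are assumptions made for this example, not a description of Al-Kindi's own procedure.

```python
from collections import Counter

def ngram_frequencies(text, n=1):
    """Return {ngram: percentage} for runs of n consecutive letters inside words."""
    counts = Counter()
    for word in text.upper().split():
        # Keep letters only, so punctuation does not end up inside an n-gram.
        letters = "".join(ch for ch in word if ch.isalpha())
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: 100 * count / total for gram, count in counts.items()}

sample = "no, no, on and on"
print(ngram_frequencies(sample, n=1))  # single letters
print(ngram_frequencies(sample, n=2))  # bigrams
```

The percentages only become meaningful on a large body of text; a short sample like the one above is just there to show the mechanics.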
Once you have the letter frequencies of a language, you can look for sequences that occur just as often in the encrypted text. For example, suppose “NO” is encrypted as “AB”. If, in a very long text, the bigram “NO” should appear 0.5% of the time (note that this is an example percentage, not a true value) and you see “AB” 0.5% of the time in the encrypted text, then you would try substituting A with N and B with O. You would repeat this with as many other encrypted sequences as possible, until you get to the point where the entire message is decoded.
This technique slowly spread throughout the world and led to advances in both encryption and decryption as a communications arms race erupted.
If you are interested in the exact frequencies that you can use to solve simple substitution ciphers, you might want to take a look at: http://www.practicalcryptography.com/cryptanalysis/letter-frequencies-various-languages/english-letter-frequencies/
The process of solving Caesar ciphers using frequency analysis can be found here: https://sandilands.info/sgordon/classical-ciphers-frequency-analysis-examples
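For the simplest case, a plain alphabet shift, the frequency idea can be sketched in Python as follows. This is a minimal illustration, not the method from either link: it assumes the message is ordinary English, long enough that E is its most common letter, and the names `shift_text`, `guess_shift` and `crack` are invented for the example.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

def shift_text(text, shift):
    """Shift each letter A-Z by `shift` places (negative values shift back)."""
    k = shift % 26
    table = str.maketrans(ALPHABET, ALPHABET[k:] + ALPHABET[:k])
    return text.upper().translate(table)

def guess_shift(ciphertext, expected_top="E"):
    """Guess the shift by matching the most frequent ciphertext letter to `expected_top`."""
    counts = Counter(ch for ch in ciphertext.upper() if ch in ALPHABET)
    if not counts:
        raise ValueError("ciphertext contains no letters A-Z")
    top_letter = counts.most_common(1)[0][0]
    return (ALPHABET.index(top_letter) - ALPHABET.index(expected_top)) % 26

def crack(ciphertext):
    """Undo the guessed shift; only reliable on long, ordinary English text."""
    return shift_text(ciphertext, -guess_shift(ciphertext))

secret = shift_text("MEET ME HERE BEFORE THE SEVENTH EVENING", 3)
print(secret)
print(crack(secret))
```

On short or unusual messages the most common letter may not stand for E, in which case the guess fails and you would fall back on comparing more letters, bigrams and trigrams, as described earlier.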
If we take a step back from the military applications of word counts, we can also see that counting has a rather long tradition in poetry. In English, the most famous poetic rhythm is iambic pentameter. The term itself describes the rhythm, also known as the meter, established by the words on a particular line of poetry. Lines in iambic pentameter start with an unstressed syllable followed by a stressed one. There are five unstressed syllables and five stressed ones in a line of poetry that follows iambic pentameter, hence the “penta” (from Greek pente, meaning five).
Many ancient writers used poetry to communicate their stories to the public. Whether the writers worked in ancient Greek, Latin, Sanskrit or Chinese, we find that ancient poets generally followed some set of rules that involved counting syllables. In fact, we can clearly see a line of descent in which ancient poets such as Ovid inspired later writers like Shakespeare. While the rules in some meters were less absolute than others, there was always a pattern in recited poetry that could be detected by the human mind. Although this form of word count was based more on aesthetics than on mathematical applications, it is an alternative use of counting in our past, and today we still have individuals who study and perform in this manner. Of course, some of these artists would probably balk if you tried to compare their poetry with integer-based word counts!
As ciphers got more advanced, the methods for solving them also became more complicated, until it was no longer possible to break communications by simple frequency analysis alone. By the Second World War, Nazi Germany used the Lorenz cipher (alongside the more famous Enigma machine) to encrypt its messages. However, early computer scientists exploited flaws in the German system to crack the code, and British scientists spearheaded the creation of the Colossus computer for this purpose. Today Colossus is widely regarded as the first programmable electronic digital computer.
Without frequency analysis, you could argue that ciphers would never have advanced so far, which in turn would have removed the need to invent modern computers to break them. Of course we will never know for sure, but we do know one thing: computers are very good at gruelling tasks that would bore the human mind to death, and that happens to include word counting. Today we take it all for granted, and our word counts are certainly not a matter of life and death. But for a moment in history, the ability to analyse frequencies in text may have changed the course of all human civilization.
By the time personal digital computers came along, counting words was already a common task in people’s day-to-day lives, from essays and novels to legal briefs and all manner of other applications where we still use word counts today. Unfortunately for these individuals, they had to count their words by hand or, if they were professionals, with a mechanical device called a tally counter.
This set the stage for digital computers to revolutionize the field and automate a very tedious task. The first office applications that were able to count words, such as WordStar, were a revolutionary invention that saved a great deal of time, and therefore greatly improved productivity. The incredible utility of these digital tools began to change our world completely. Today, the most valuable companies in the world all deal with digital technology, from Microsoft to Amazon. To the disbelief of many, the byte had conquered the atom, a trend that continues to gain momentum to the present day.
Word counter applications on computers are now ubiquitous, having largely replaced page counts, line counts, and various other methods. Our modern society seeks to quantify all data, including written text, via metrics that can be checked mathematically. This mind-set has led to massively improved logistics across the world, allowing us to process immense amounts of information and goods at a scale that was unimaginable in the past.
Still, there are many who argue against word counts, particularly when it comes to creative endeavours. In education, many argue that minimum word counts change a student’s writing style. A common pattern is that a student starts an essay well and then, as soon as they hit the word count, the entire argument collapses and ends on an unsatisfying note. On the other hand, if the minimum word count is set too high, writers will often pad their work and meander along until they hit the required number.
In the end, while word limits are a useful tool to constrain and guide writers in producing more concise and relevant work, we should also remember that as writers, we are writing for people, not an integer counter stored in the memory of a machine.