The Vagaries of the English Language

The English language, like the English nation, has come into contact with an extraordinary range of other cultures and languages. The Telegraph recently reported that Britain has invaded or fought with nine out of ten countries in the world, leaving only twenty-two nations historically unmolested. Similarly, England itself has experienced waves of immigration, invasion, settlement and occupation. All of this is reflected in the diversity and complexity of the English language, with little pieces of history echoing through the ages to leave their mark in the form of various linguistic peculiarities.

Despite all this, some of the grandest claims as to English’s sheer size and complexity are in fact plausible-sounding misconceptions. Stephen Fry, the popular actor and public speaker, has been quoted in support of the oft-repeated but erroneous claim that the English language has the largest vocabulary of all languages, ever, throughout history. People who adhere to this belief presume that, because of its complexity and its long history of extra-linguistic contact, English has taken on the characteristics of, and many of the words from, these various other languages. Not only is this supposed to have left English with a “larger vocabulary” than any other language, but it is sometimes claimed that this sheer volume of words is larger than that of all the other languages combined, or some similar piece of hyperbole.

A relatively simple thought experiment can discount the possibility of such suppositions. Languages can be classified not only by linguistic family, in terms of their relatedness to other tongues, but also by their functional nature. English is what we call an analytic language, as opposed to languages like German which we call synthetic. This means that in English we modify meaning by changing word order and appending adjectives, adverbs and conditional clauses to the statements we make, whereas in synthetic languages a lot of this work can be done by combining words into new neologistic constructions. We’re all familiar with German’s comically over-complex words, formed from the Frankensteinian combination of what might account for an entire sentence in English. These new words aren’t necessarily new in the sense that they need to be added to the lexicon, because German speakers have a clear and established set of rules for how these compound neologisms are formed, such that the meaning of each is often apparent so long as the speaker is familiar with its component words.

The comparison of English to German may seem extreme, with the monstrosities of German linguistic synthesis such as Donaudampfschifffahrtsgesellschaftskapitänsmütze seeming almost absurd*. But there are all kinds of similarly confounding complexities to the question of exactly what counts as a word. Think of the relatively simple case of different verb forms in English: run, running, ran, did run and so on. Or think instead of the differing degrees of modifiable adjectives, such as big, bigger, the biggest. Are these multiple forms of the same word, or is each a word in its own right? What about compound words? I used German because it’s probably familiar to most people, but there are also less obvious but more instructive examples of synthetic languages, such as Czech.

In Czech, for one root you get something like twelve words for free. If I’m addressing you in the second person, I add the suffix “-íš,” pronounced “-eesh,” whereas if I’m speaking in the first person I use “-ím.” There are further forms of the word for the general sense and for different tenses, and the endings also vary with the conjugation. The language also sometimes dispenses with vowels, resulting in apparently unwieldy clusters of consonants like the children’s tongue twister “strč prst skrz krk.” As difficult as that kind of thing is for English speakers to get their tongues around, however, the apparently complex rules of the Czech language are in fact almost entirely consistent, and compared to English the spelling is almost entirely logical and phonetic, albeit in a weird, uncannily altered version of the Latin alphabet. It’s often rated one of the hardest languages for other people to learn to speak, but English is right up there with it.

One factor that often makes English difficult for learners of it as a foreign language is its countless inconsistencies and irregularities. One historical peculiarity is that in the past, many of its rules didn’t exist at all. It was only with the introduction of the printing press that familiar standardized spellings began to emerge. Printers used mounted blocks to transfer letters and words to the page, and it was more efficient to have a block for an entire word than to arrange it every time with individual letter blocks. This combined with the existence of multiple rival publishing houses to generate a profusion of standardized and yet differing sets of spellings for commonly used words. The need to create these blocks also led to certain rules for the creation of sensible spellings, so that one house would create word blocks according to one principle and another according to a quite different rule. It is from this chaotic evolution that we get the written word of today, with the added confounding factor of differing dialects and their canonical texts, such as American English and Webster’s dictionary.

Webster is the reason many American spellings differ from their British counterparts, in that Webster decided it would make more sense to spell phonetically, according to some kind of logical plan. But this didn’t always stick, and a lot of his simplifying influence has been lost or rejected as people have clung to the old ways. He still managed to push a few of his innovations through, though, such as the change from “colour” to “color”, “favour” to “favor” or “realise” to “realize.”

Beyond the rivalry between printing houses and dictionary publishers for the right to wrest control of the English language once and for all from the chaotic inconsistencies that had determined its evolution to that point, there is one fundamental schism at the heart of English’s history which has greatly affected it, as it has many other European languages. This is the fact that English is a Germanic language, but for more than a thousand years it has been written in a Latin script inherited from Rome. There are some linguists today who argue that English has effectively become a Latin language, shifting from one historical branch of the Indo-European language family to another by virtue of accretion. The sheer weight of Latinate terminology which has injected itself into English, as well as the fact that its script is based on the Latin alphabet, means that English as we know it today has both historical and incidental connections to Latin. It could be argued that in some sense English is more “purely” Germanic, that its origins lie in the tongues the Anglo-Saxon settlers brought to the British Isles after the Roman occupation ended, but questions of what came first are difficult to settle given that English’s evolution into its present form has involved multiple branches of linguistic development merging over time. How can we determine that one is the original branch, simply by virtue of its continuous occupation of the same physical location, when the language itself has expanded out to become what is effectively the universal auxiliary language for almost all international affairs? Is the most defining aspect of English its history in the land of England, or its history as a promiscuously mongrelized meta-tongue?


*This example is actually a joke, but in some scientific disciplines such as chemistry, where names are required for enormously complex chemical compounds, the length of these constructions can be virtually unlimited.