November 17, 2013

As somebody with a moderate understanding of linguistics, I get asked all sorts of language-related questions by non-linguists. By far the most common question pertains to language complexity. Many people have this burning question within them: Which language is the most complex? I’ve found that people typically aren’t after simply finding out which language is most complex; rather, they aim to place a value judgment on the answer. They want to be able to establish whether or not language x is better or worse than language y, or whether language y is easier or harder to learn than language z.

With that said, I’m writing my first linguistics-related blog post in an attempt to remove this superiority hierarchy from the discussion of language complexity. The thing that prompted this outcry was the recent social media circulation of the following poster:

We’ll start with why this “infographic” is stupid and generally useless:

1) They say of Arabic that it’s one of the hardest languages to learn because it has “very few words that resemble those of European languages.” By this logic, any non-Indo-European language on the poster would also be one of the hardest to learn. Inexplicably, however, we find Hebrew (quite similar to Arabic in structure), Finnish, Turkish, Vietnamese, and Thai under the “Medium” difficulty heading.

2) They say that learners of Japanese “need to memorize thousands of characters.” While undoubtedly true if the learner wants to learn the writing system, what about simply learning to speak the language while omitting the reading and writing aspects? When explorers and travelers learned languages in the past they learned the language first and the writing system second. It’s important to realize that a writing system is not the same thing as a language.

3) They say that Chinese (Chinese, mind you, not Mandarin or Cantonese or anything specific) is one of the hardest to learn because it “is a tonal language.” So what? If we’re letting the number of tones be the judge, Vietnamese would be off the charts with almost twice as many tones as Beijing Mandarin. Thai also leads Beijing Mandarin on tone count.

4) And last but certainly not least, Korean is difficult because it has “different sentence structure, syntax, and verb conjugations.” Simply laughable. Quite literally every other language on the list differs from English by those same parameters, many of them more drastically than Korean.

I get the feeling that somebody wanted to use their Illustrator skills to draw a pretty poster, but they couldn’t be bothered with actual data analysis, so they just made stuff up. How does this relate to language complexity though? Well, we’re led to believe that the languages highest on the list are the most complex. In reality, the whole situation is vastly more complicated.

The bottom line is that it’s essentially impossible to create an index by which we can judge language complexity. The factors are simply too numerous and nebulous. Comparing two languages, side by side, based on anything from phonetics to pragmatics yields very few useful results. The reason is that language is not black and white: different languages accomplish the same things via different means. For example, many languages encode grammatical aspect with a morpheme attached to either the verb or the object (or occasionally other arguments), many more languages encode grammatical aspect periphrastically (through the use of multiple words), and in yet more languages grammatical aspect is an intrinsic property of the verb. Which of these methods is more complex? Morphological aspect is surely more morphologically complex than the other methods, but by the same token periphrastic aspect is undoubtedly more syntactically complex. Likewise, lexically encoded aspect requires a separate word for each aspect, likely resulting in more words overall than in languages with morphological or syntactic aspect. Is there a way to determine a weight between these methods? Not really; they’re just different ways of encoding the same information.
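To make the weighting problem concrete, here is a rough Python sketch. Everything in it is an assumption made up for illustration: the three languages A, B, and C are hypothetical, the scores are placeholders rather than measurements, and the weightings are arbitrary. It only shows that whichever dimension you choose to favor decides which language comes out as “most complex.”

```python
# Hypothetical complexity scores (0-10) for three made-up languages.
# A encodes aspect morphologically, B periphrastically, C lexically.
# These numbers are illustrative placeholders, not real measurements.
PROFILES = {
    "A": {"morphology": 8, "syntax": 2, "lexicon": 3},
    "B": {"morphology": 2, "syntax": 8, "lexicon": 3},
    "C": {"morphology": 2, "syntax": 3, "lexicon": 8},
}

# Three equally arbitrary ways of weighting the same dimensions.
WEIGHTINGS = {
    "favor morphology": {"morphology": 3, "syntax": 1, "lexicon": 1},
    "favor syntax": {"morphology": 1, "syntax": 3, "lexicon": 1},
    "favor lexicon": {"morphology": 1, "syntax": 1, "lexicon": 3},
}


def weighted_score(profile, weights):
    """Collapse a profile into one number using the given weights."""
    return sum(profile[dim] * weights[dim] for dim in weights)


def most_complex(profiles, weights):
    """Return the name of the language with the highest weighted score."""
    return max(profiles, key=lambda name: weighted_score(profiles[name], weights))


for label, weights in WEIGHTINGS.items():
    print(f"{label}: most complex is {most_complex(PROFILES, weights)}")
```

Each weighting crowns a different language, and nothing in the data says which weighting is the right one. That is the whole problem with a single complexity index.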

Similarly, is there any way of establishing whether ergative-absolutive languages are more or less complex than nominative-accusative languages, or active-stative languages? No, there is not. Like grammatical aspect, the morphosyntax of a language can align itself in many different ways. Just because ergative-absolutive languages treat their syntactic arguments in profoundly different ways than nominative-accusative languages like English does not make them any more or less complex.

To make matters worse, grammatical aspect and syntactic alignment are infinitesimally small slices of the whole inventory of tools languages have at their disposal. Attempting to compare every component of a language to establish a hierarchy is simply a dead end.

On the other hand, there do exist small areas within language that lend themselves to side-by-side comparison, at least superficially. Quantitative measurements, such as the total size of a language’s phoneme inventory, the number of noun classes a language has, or the number of inflection paradigms a particular class has in a language, can be compared fairly easily. This technique is dangerous, however, and oftentimes leads to false conclusions. Again, what we’re left with is a tiny piece of language x which may be more complex than that of language y. For example, the Svan language has a fairly large phoneme inventory, consisting of around 50 phonemes by most counts. Compare that to the 33 or so phonemes of Tagalog. Now we’re sure that at least the phoneme inventory of Svan is more complex than that of Tagalog. What does that tell us about other aspects of the languages though? Not much, unfortunately. If anything, we would be more likely to conclude that Svan is less complex than Tagalog elsewhere because so much functionality is concentrated in the phonological tier, again leading us to a fairly useless comparison. Oftentimes these simple comparisons also fail to take into account components that are crucial in determining the flexibility of the phoneme inventory, such as phonotactic constraints and morphophonology.

Rather than attempting to establish a complexity hierarchy for languages, it’s much more useful to compare and contrast specific parts of different languages while keeping in mind that they are all there as a means to the same end. Languages, and the components therein, are precisely complex enough to suit the needs of the speakers.


3 Responses to "Language Complexity and You"

  1. This is the best response to this question I have ever read. I get tired of people (whose knowledge of a language other than their mother tongue is minimal) saying that X language is more difficult than Y language. I normally keep my answer simple by saying that it depends on who is trying to learn X language, how motivated they are, and what their background is (including mother tongue and knowledge of other languages). Now, at least I can vary my response, and instead ask them, “On what level – morphological?” That may well have the desired effect: a hasty change of subject!

  2. A language may be more complicated than another to learn, but that may not make it more complex. Notice the distinction between complicated and complex. Complication is simply a linear measure of how hard it is to learn or grasp the language, perhaps from a specific speaker’s point of view, as in the Foreign Service Institute’s rankings. Complexity is something else, describing (I am not a linguist) how parts of the language affect other parts, such as how tenses or subjunctives shape sentences and parts of speech, etc. Complexity is about how elements of a system affect other elements of the system to generate sometimes unforeseen effects. Now, the interesting part would be to understand how language complexity enables or hinders the ability to express complicated thoughts, modern technological thoughts, or new business concepts.

    • ibarrere Says:

      I’m not sure it’s suitable to establish such a distinction though. Think of what contributes to a language’s difficulty to learn: many noun classes, frequent stem changes, subtle and/or numerous tense distinctions, etc. All of those also contribute to a language’s overall complexity. Of course there are other factors that don’t necessarily contribute to the complexity of the language but still contribute to its difficulty to learn, but those are mostly focused around differences between it and your native language. I agree that a language’s complexity isn’t an exact analogue for its difficulty or ease of learning, but I think they’re fundamentally related.

