Linguistic Limitations

There's a way to dumb things down while conveying more information. Here's how machines eat the cake and have it, too.

Dec 01, 2024

Language can be a significant source of inefficiency in communication and knowledge management. Some languages are so rudimentary that you can’t perform basic arithmetic with them, while others have dozens of synonyms that differ in critical subtleties.¹ Some can express a thought in two characters that would take a long sentence in another language. Meanwhile, some can elaborate for several pages with little substance. French speakers, anyone?

This limitation is perfectly illustrated by what happened at the Facebook AI Research (FAIR) lab when, in an attempt to train an artificial intelligence agent to conduct better negotiations, developers juxtaposed two bots and made them try to outbid each other. In this case, a “happy accident” was that scenario specifications didn’t force negotiating parties to stick to English or try to sound like humans. Very quickly, Alice and Bob, conventional names in computer science used to hydrate otherwise dry-sounding third-party A and third-party B, realised that using “meat bags” communication gizmos sucked, resulting in the following dialogue excerpt disclosed by Facebook.

Bob: “I can can I I everything else.”
Alice: “Balls have zero to me to me to me to me to me to me to me to me to.”

Although it seems nonsensical, this may be an efficient and concise way to convey a point. These two sentences could replace a paragraph or a chapter when the other party’s background does not need spelling out. The researchers eventually changed the setup to require English, since their goal was bots that could talk to people. The experimenters could not explain the reasoning behind this gibberish. When justifying AI’s decisions, trails go cold quickly. However, allow me to share my thoughts.

When I see phrases like “can can,” “I I,” or “to me to me to me,” I can’t help but think of languages that extensively use reduplication—the act of entirely or partially repeating a word to create new meanings.

In Thai, for example, you can add emphasis by doubling an adjective, such as “sǔay sǔay,” meaning “(very) beautiful.” You can also make adjectives less precise, like “pèt-pèt,” which changes “spicy” into “spicy-ish,” or ask someone to do something, such as “dern rew rew sì,” meaning “walk quickly!”

In Chinese, reduplication is crucial. For example, “tiān” (day) can be turned into “tiāntiān,” which means “every day.”

Indonesian and many other languages also do this, sometimes making them sound just as strange as the chatter of Facebook’s bots.

This might seem suboptimal for those who read this in a language that doesn’t rely on reduplication. After all, isn’t adding an ’s’ to the end of a word a much more efficient way of forming plurals? Let’s play the devil’s (or AI’s, in this case) advocate for the sake of a pinkies-out philosophical debate.

First, languages that rely on reduplication can also have an equivalent of the English ’s’ suffix. Chinese has the character “们” (men), a plural marker written directly after a pronoun or a noun that refers to people.

Second, the suffix rule is riddled with exceptions. Words like “cacti” do not use the suffix at all, while others, like “sheep,” are spelt the same way whether plural or singular. Some words, like “news,” end in the suffix but are not plurals. Handling these exceptions is costly in engineering. Reduplication, on the other hand, is surprisingly efficient in eliminating confusion, although it requires extra space, which is inexpensive.
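To make the engineering cost concrete, here is a minimal sketch in Python. The exception tables and both function names are my own illustrative assumptions, not a real pluraliser, and the reduplication rule is a deliberate simplification that mimics the pattern of repeating a noun.

```python
# Hypothetical, tiny exception tables; real English needs far more entries.
IRREGULAR = {"cactus": "cacti", "child": "children", "mouse": "mice"}
UNCHANGED = {"sheep", "fish", "series"}


def pluralise_with_suffix(noun: str) -> str:
    """Suffix-based plural: every exception must be handled explicitly."""
    if noun in IRREGULAR:
        return IRREGULAR[noun]
    if noun in UNCHANGED:
        return noun
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    return noun + "s"


def pluralise_by_reduplication(noun: str) -> str:
    """Reduplication-style plural: one rule, no lookup tables."""
    return f"{noun}-{noun}"


for word in ["horse", "box", "cactus", "sheep"]:
    print(word, pluralise_with_suffix(word), pluralise_by_reduplication(word))
```

The suffix version grows with every exception you discover, while the reduplication version stays one line long and simply spends more characters per word.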

Third, while the suffixing technique may seem natural today, it has not always been so. Chinese is one of the ancient languages established before the invention of sound tokenisation, known as the alphabet.

Imagine being in a theoretical setting where you are unaware of a tool, like a letter representing a sound. If someone asks you to invent a written language, your initial instinct would probably not be to design a collection of scribbles representing sounds and call it an alphabet. There is a direct correlation between the levels of abstraction of what we want to operate and the controls provided to achieve it. Consider kitchen stoves, their knobs, multitaskers, and unitaskers.

A letter represents a simple atomic sound. A combination of letters forms a word, an abstraction of a series of simple sounds creating a more complex sound. In turn, the latter is an abstraction of a critical mass of people from the same ethnic group who agree to refer to tangible things like “sun.” Therefore, “s” + “u” + “n” = “sun,” a conventional way of referring to that object in the sky that shines for half the day.

Why would you develop something so complicated if you were a language inventor? If you want to say “sun,” just draw the sun and agree on a specific way of barking while pointing at it with your finger. That is precisely what our ancestors did.

The evolution of the word “horse” in Mandarin Chinese.

We still use picture dictionaries today. They seem like a great idea to many because they align with our thinking. However, once you teach your brain the connection between the word “horse” and its image, you may wonder how to express concepts like “riding” a “fast” horse in that language. Finding pictures for abstract concepts can be challenging. You face the same issue our ancestors encountered. They weren’t aware that you could place an abstraction layer between the written word and the concept, like an alphabet. Consequently, they would draw a person on a horse to illustrate the riding action.

The evolution of the word “ride” in Mandarin Chinese.

Fine. However, the problem is that now you’ll use the picture of a horse rider to express the act of riding any other animal or object. This merely shifts the problem elsewhere: you can express the verb, but to avoid confusion you must rely on context or specify what you are riding.

This is typical of languages that are lumpers. Hebrew, by contrast, often behaves like a splitter. The words for “love” differ depending on whether they refer to a lover, God, a parent, a child, or a mother. This is not the case for lumping languages.

Trade-offs will always exist, regardless of your approach. The Iron Triangle requires you to decide where you prefer complexity. This is why linguists agree that, although we like to classify languages, all languages are roughly equally complex. Whether something is easy or difficult for you depends on personal factors such as your mother tongue, ability to hear and reproduce foreign sounds, engineering mindset, and memory. Learning Chinese is much easier for a Vietnamese speaker than for a Brit.

Which instrument is more challenging to learn: the piano or the guitar?

Finding notes on a piano keyboard is easy because they are arranged linearly from left to right. Learning your first scale is straightforward: press each white key from left to right. However, you must memorise a different chord shape for every key you want to play in. You also don’t experience any physical pain pressing those keys. It is a low entry barrier at the expense of postponed complexity.

If you’re learning to play the guitar, you need to invest a substantial amount of time memorising the locations of the notes on the fretboard. You will also experience blisters and sore muscles at the beginning of your journey. This is a trade-off for learning just one chord shape and moving it up and down the fretboard to play that chord in any key—a high entry barrier with the advantage of having a simpler life in the future.

Guitar chord transposition to play different keys. Source: fachords.com
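The piano-versus-guitar trade-off can be sketched in a few lines of Python. This is a simplified model under stated assumptions: a chord is just a tuple of semitone offsets, the names `MAJOR_SHAPE`, `major_chord`, and `key_colours` are made up for illustration, and real guitar fingering (open strings, tuning quirks) is ignored.

```python
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# One movable "shape": root, major third, perfect fifth, in semitones.
MAJOR_SHAPE = (0, 4, 7)


def major_chord(root: str) -> list[str]:
    """Guitar-style transposition: slide the same shape to a new root."""
    start = NOTES.index(root)
    return [NOTES[(start + step) % 12] for step in MAJOR_SHAPE]


def key_colours(root: str) -> list[str]:
    """Piano view: which of the chord's notes land on black keys."""
    return ["black" if "#" in note else "white" for note in major_chord(root)]


for root in ["C", "D", "A"]:
    print(root, major_chord(root), key_colours(root))
```

The shape never changes, which is the guitarist's reward. The pattern of black and white keys does change from root to root, which is why the pianist ends up memorising each chord separately.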

Learning the alphabet can seem challenging for toddlers in school. This creates a barrier of complexity that, once overcome, allows for easy manipulation of abstract concepts and reading movable type. In contrast, Mandarin Chinese learners are initially promised straightforward one-to-one mappings between objects and their corresponding characters, only to discover complications later in the process. By then, they are entrenched in the language ecosystem, shaping their cognitive approach to languages. Transitioning from piano to guitar will be challenging.

Despite the gamut of linguistic and contextual pains of the design described above, these languages allow you to fit a lot of content into something as short as “I can can I I everything else.”

Now that you have a meaningful token for the sun, can you think of a creative way to use what you already have to express something like “very sunny”? If you’re not a programmer, can you guess why JavaScript developers use “===” to indicate “strictly equal” instead of using “==” or “=”? Did you know that “quatre-vingts,” the French word for “eighty,” literally means “four twenties”? Can you read the following Roman numeral: XXII?
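All of these puzzles reuse an existing token instead of inventing a new one. Here is a minimal Python sketch of that idea; `emphasise` and `roman_to_int` are hypothetical helper names of mine, and the Roman numeral reader covers only the standard seven symbols.

```python
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def emphasise(token: str, times: int = 2) -> str:
    """Reduplicate a token, e.g. 'sun' becomes 'sun sun' for 'very sunny'."""
    return " ".join([token] * times)


def roman_to_int(numeral: str) -> int:
    """Read a Roman numeral by adding up its repeated symbols."""
    total = 0
    for position, symbol in enumerate(numeral):
        value = ROMAN[symbol]
        following = ROMAN.get(numeral[position + 1 : position + 2], 0)
        # A smaller symbol placed before a larger one is subtracted (IV = 4).
        total += -value if value < following else value
    return total


print(emphasise("sun"))      # sun sun
print(roman_to_int("XXII"))  # 22: ten, ten, one, one
print(4 * 20)                # 80: "quatre-vingts" is four twenties
```

XXII is nothing more than “ten ten one one,” which is the same trick the bots seemed to stumble upon.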

Reduplication is such a natural mental response that I can’t avoid mentioning one of its most hilarious manifestations. To illustrate, you and I are going to Japan.

…

Bam! Here we are in a typical Japanese household, relaxing in front of the TV and flipping through channels broadcasting the reality shows popular in the country of the rising sun (read “s”+“u”+“n”). In these shows, participants do obstacle runs dressed in spongy hot-dog costumes, play Twister while being administered air enemas, or get hit with a stick whenever they laugh as their classmate struggles to read a book in English. This one looks interesting, so we pause our channel hopping for a moment.

“One, two, three, four, five, six, seven, eight, nine, ten,” the classmate shouts proudly. So far, so good. He continues, “eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen…”

A moment of hesitation…

“Teen teen, teen teen one, …”

Classmates can’t resist and get their butts smashed. The show goes on. The reader is asked to say “one hundred.” At this moment, everyone knows what comes next: the punishment is imminent.

While folding his fingers, he fires away a rapid “teen, teen, teen, teen, teen, …”.

This is one of those instances where a picture is worth ten times ten times ten words. I highly recommend you interrupt your reading now and watch it. I’ll wait.

Now that you’re back with your tears wiped, let’s return to the conversation between Facebook’s machines.

There’s a reason why the man in the video folded his fingers: it kept him from landing under or over the right number of “teen”s. Humans are known to be lousy at Henry Fordesque conveyor-belt repetitive tasks. We become sloppy very quickly. Did you have to reread any part of this text at any point? If not, just wait a little longer. So, as human beings, we either leverage our strengths or work around our weaknesses. This is why saying “twenty” rather than “ten ten” makes sense to us. Computers, however, prefer “10100” over “twenty.”
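For the curious, here is a small Python sketch of the three notations side by side. The helper names `say_by_repetition` and `to_binary` are hypothetical, and treating “teen” as a unit of ten is my reading of the contestant's system rather than anything the show states.

```python
def say_by_repetition(n: int, unit_word: str = "teen", unit: int = 10) -> str:
    """Spell a multiple of `unit` by repeating one word, finger-counting style."""
    if n % unit != 0:
        raise ValueError("this sketch only handles exact multiples of the unit")
    return " ".join([unit_word] * (n // unit))


def to_binary(n: int) -> str:
    """The machine's preferred spelling: a run of zeros and ones."""
    return format(n, "b")


print(say_by_repetition(20, "ten"))  # ten ten
print(to_binary(20))                 # 10100
print(int("10100", 2))               # 20
print(len(say_by_repetition(100).split()))  # 10 repetitions of "teen"
```

Neither repetition nor binary is pleasant for a human to say aloud, but both are trivial for a machine to produce and to check.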

If I were a machine, which some of my peers complain I am, I’d use my fortes. Why create additional abstraction levels if performing a million simple repetitive operations in a split second is a cheaper way to achieve the same result?

The fractal nature of collecting and connecting the dots and the divide-and-conquer approach are evident when examining how computers operate at any zoom level. It resembles a flock of birds forming incredible, morphing patterns in the sky. Each bird has one role but executes it well, aligning with the Unix philosophy. Graphics Processing Units (GPUs), responsible for rendering complex visuals on our screens, have one primary function: drawing an enormous number of triangles per second. The more triangles your graphics card can draw, the sharper the image you can see without jitter. It manifests as a deep, complex shape or a high-definition movie to our eyes, and our brains adeptly complete the picture if it is incomplete. However, at its core, it’s a machine continuously repeating “triangle, triangle, triangle, ….”

It might sound as though, once a machine renders a complex-looking result on the screen, we consume it as it is. But we, too, aren’t as monolithic as we seem.

Squint your eyes. Source: creativebloq.com

Once our sensors—eyes, ears, etc.—pick up input, we understand that we’ve identified a dog or a cat due to numerous elementary atomic events. Seeing a cat activates a network of interconnected neurons in our brains that fire together each time we see one and are told it is a cat, allowing this circuit to be permanently imprinted in us. What fires together, wires together. To a great extent, we are as fractal as the machines we’ve created. If you think about it, it’s a fascinating biological design, not unlike the conversation between Alice and Bob at FAIR.

Subscribe subscribe…

McWhorter, J. (2019). [C] Language families of the world. USA.

The Mechanics of Knowledge Management