What a letter costs

An essay on who decides what exists

May 17, 2026

In late 1990, in an office in Brussels, someone ran the numbers. Producing a keyboard cost more if it carried a key that wasn’t used across the rest of the common market. That key was the Ñ. And Spain had a law requiring manufacturers to include it on any machine sold within its territory.

In 1991, a report from the European Economic Community recommended repealing that law.

The official reason: to harmonize the market. The predictable consequence: manufacturers would be able to sell keyboards without the Ñ. And since almost no one was going to manufacture two models for a single country, keyboards with the Ñ would tend to disappear.

No one wrote “we want to erase a letter.” What was written was: we want to remove a trade barrier. But the trade barrier was a letter. And the letter was alive in the mouths of five hundred million people.

The Real Academia Española responded in May 1991 in El País: a grave attack on the language. García Márquez, who had been a Nobel laureate since 1982, wrote:

It is outrageous that the European Community has dared to propose to Spain the elimination of the ñ for the sake of commercial convenience. The perpetrators of such abuse should know that the ñ is not an archaeological relic, but exactly the opposite: a cultural leap of a Romance language that left the others behind by expressing with a single letter a sound that in other languages still requires two.

The fight lasted two years. On April 23rd, 1993, World Book Day, the Spanish government passed a Royal Decree shielding the Ñ on keyboards, invoking the Maastricht Treaty. Cultural exceptions. Culture takes precedence.

That’s the story as it’s usually told. And it’s true. But it’s only a tenth of what was actually happening.

First, it helps to know that the Ñ doesn’t belong to Spanish.

Well — it does, but not only. The Ñ belongs to Guaraní, co-official in Paraguay. To Quechua and Aymara in the Andes. To Mapudungun in Chile and Argentina. To Asturian and Galician. To Tagalog in the Philippines, where it arrived through colonial contact and took root. To Wolof in Senegal, Gambia, Mauritania. To Chamorro in Guam.

When the EEC proposed allowing keyboards without the Ñ, it wasn’t affecting only Spain.

It was affecting a letter alive across four continents and ten languages, in the mouths of people who had never voted for any European deputy and probably didn’t know that in Brussels someone was debating their alphabet.

The defense was pan-Hispanic from minute one, coordinated through the Association of Academies of the Spanish Language.

And in parallel, another fight was happening across the Atlantic: Latin American keyboards use a different layout — the American one — and there the battle for the Ñ was being waged against North American manufacturers.

Two fronts at once. The same letter. Two different enemies on the same side of the cultural map.

That alone tells you this wasn’t a European fight. It was a fight of the Ñ against a mold that had been built somewhere else. Long before. And for other people.

Let’s go back to California, thirty years earlier.

In 1963, a committee of American engineers designed a system that would let computers store text. They called it ASCII. And they decided that 128 numbers would be enough.

That’s the number. And it’s worth looking at closely.

Of those 128, some are control codes no one ever sees. What’s left is about 95 printable characters: the 26 letters of the English alphabet in uppercase and lowercase, the ten digits, the basic English punctuation marks, and a handful of symbols. That’s all.

Ninety-five characters to represent human writing.

Inside those 95, not a single sign of Spanish would fit. Nor French, nor German, nor Portuguese. The Ñ didn’t fit, nor the Ç, the Ü, the É. None of the accents fit. And of course neither Arabic, Hebrew, Russian, Greek, Hindi, Bengali, Tamil, Chinese, Japanese, Korean, Vietnamese, Thai, Burmese, Swahili, Tibetan, nor Mongolian. The writing of most human beings alive in 1963 did not fit.
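The count is easy to check. The sketch below, in Python, tallies the ASCII table as it stands today and asks whether a few of the letters just named have a number in it. The helper name `fits_in_ascii` is invented for illustration.

```python
import string

# ASCII assigns a meaning to each number from 0 through 127.
ALL_CODES = range(128)

# Control codes: 0-31 plus 127 (DEL). They are never drawn on screen.
control = [c for c in ALL_CODES if c < 32 or c == 127]

# Everything else is printable, including the space character (32).
printable = [chr(c) for c in ALL_CODES if 32 <= c < 127]

letters = [ch for ch in printable if ch in string.ascii_letters]
digits = [ch for ch in printable if ch in string.digits]


def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text has an ASCII number."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print("control codes:", len(control))
print("printable characters:", len(printable))
print("letters:", len(letters), "digits:", len(digits))

for sample in ["W", "\u00d1", "\u00c7", "\u00dc", "\u00c9", "\u1ed8"]:
    print(sample, "fits" if fits_in_ascii(sample) else "does not fit")
```

Only the W fits. The other five samples (Ñ, Ç, Ü, É, Ộ) raise an encoding error, which is the machine's way of saying they are not on the list.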

This was not an oversight. A monocultural committee looked at itself, decided that what it used was enough, and declared it sufficient for the planet. The planet wasn’t in the room. The committee was.

That decision from 1963 still governs what you write today.

When the problem was eventually addressed, the system wasn’t rebuilt.

Patches were made. Supplementary tables, new encodings, layers on top of the original layer. And the 1963 piece stayed intact as the privileged base.

That means something very concrete. And it’s where the whole trick becomes visible:

In UTF-8, the encoding most text travels in today, a letter from the English alphabet takes up one byte. Eight bits. The minimum.

The Ñ takes up two bytes. Double. Because the Ñ wasn’t at the table in 1963 and had to come in afterwards, through the back door of Unicode. It pays double what an A pays, simply for having arrived late to a table it wasn’t invited to.

And it scales.

For you, the Ñ costs double. For a Vietnamese speaker, the Ộ costs triple.

Three times what an A pays. For political reasons tied to standards negotiation, these characters took several versions to be admitted. For years, Vietnamese speakers who wanted to write their language on a computer had to give up the tones. Giving up the tones in Vietnamese means giving up meaning.

The tiered bill:

For one English W you pay one byte. For one Ñ you pay two. For one Ộ you pay three.

The price of a letter, inside the machine, is not a property of the letter. It is a property of when it was allowed to exist.
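The tiered bill can be checked directly. This is a minimal sketch that assumes the text is stored as UTF-8 and that each letter is a single precomposed code point. Both function names are invented for illustration.

```python
def utf8_cost(ch: str) -> int:
    """Number of bytes UTF-8 needs to store one character."""
    return len(ch.encode("utf-8"))


def cost_from_code_point(code_point: int) -> int:
    """UTF-8 length implied by the numeric range of the code point alone."""
    if code_point < 0x80:       # the original 128 ASCII slots
        return 1
    if code_point < 0x800:      # where most accented Latin letters landed
        return 2
    if code_point < 0x10000:    # the rest of the Basic Multilingual Plane
        return 3
    return 4                    # everything assigned beyond it


# Escapes guarantee one precomposed code point per letter:
# W is U+0057, Ñ is U+00D1, Ộ is U+1ED8.
for ch in ["\u0057", "\u00d1", "\u1ed8"]:
    # The two ways of computing the price agree.
    assert utf8_cost(ch) == cost_from_code_point(ord(ch))
    print(ch, f"U+{ord(ch):04X}", utf8_cost(ch), "byte(s)")
```

The second function never looks at the letter, only at the number it was assigned. The lowest numbers went to the characters that were in the room first.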

And then there are those who didn’t get in through an extension, but had to build a whole new house.

Chinese, Japanese, Korean, Taiwanese, Vietnamese couldn’t be encoded with extensions of ASCII: they have thousands of characters, not twenty-something.

During the eighties, each country invented its own local system. All of them incompatible with each other.

For a decade, while some were exchanging emails without friction, a billion people in East Asia lived in digital silos that couldn’t even communicate with one another. A Chinese document wouldn’t open on a Japanese machine. A Korean email wouldn’t reach Taiwan legibly.

Each Asian country isolated from the one next to it, all of them solving the same problem — a problem that wasn’t even theirs: the problem that the global standard only fit English.

In 1991, when Unicode was created to unify all of that, Asian characters went through a process called Han unification: the ideograms with common historical origins in Chinese, Japanese, and Korean were merged into a single table. Critics read it as a cultural decision imposed by American manufacturers who had spent years downplaying the needs of other writing systems.

Meanwhile, the indigenous languages of the Americas kept waiting. Guaraní, Quechua, and Aymara didn’t enter Google Translate until May 2022.

Four years ago. Sixteen million speakers of Andean languages couldn’t ask a machine to translate anything in their own tongue. They existed, they spoke, they wrote. To the machine, they were not.

And Africa?

Well, here’s the thing. Unicode currently encodes 159 writing systems worldwide. Of those, fifteen are African. Another forty African writing systems are still outside, waiting to come in.

More African writing systems remain pending digital existence than those that have managed to exist at all.

And that’s only the first layer, because Africa has more than two thousand living languages, of which roughly eighty percent don’t even have standardized orthography yet. It’s not that they pay triple or quintuple in bytes. It’s that the question of how much they pay can’t even be asked: most still haven’t been written down by consensus, not because they don’t deserve to be, but because sixty years of global technical priorities never put any resources into helping establish those orthographies.

What happened in 1991 with the Ñ was not a fight between Spain and Brussels.

It was the tail end of a much larger operation, decided on another continent. In 1963, in California, a committee had decided that the world’s alphabet was theirs. Everything that didn’t fit would have to adapt, wait, or disappear.

The Ñ got lucky: a market with weight, a centuries-old Royal Academy, a Nobel laureate writing in El País, a pan-Hispanic network of academies across four continents. It fought and won.

The Vietnamese vowels didn’t get that luck. There was no Vietnamese García Márquez writing in El País, and if there had been, it wouldn’t have been published in Spanish. The Chinese characters didn’t either. The Andean languages didn’t get it until 2022. And many African, Oceanic, and indigenous languages are still paying triple — or simply not existing for the machine.

And here’s the thing that doesn’t get said, and ought to be:

This wasn’t a technical problem. It was a cultural problem dressed up as a technical one.

When someone decides what fits inside 128 numbers, they are deciding which cultures exist for the machine. When that someone is monocultural, the ones that exist are theirs. And when, on top of that, they call their decision “universal technical standard” instead of “decision about which parts of the world count,” the bias hides inside the word “technical,” which sounds neutral and scientific.

Thirty years later, everyone argues about whether Brussels is protectionist or free-market, and no one remembers that the real fight had started in California, much earlier, and that it was about something much larger than an Ñ.

And now, before closing, it’s worth looking at the screen you’re reading this on.

What comes next isn’t history. It’s what’s happening, in real time, every time someone writes a question to an artificial intelligence.

Language models — ChatGPT, Claude, Gemini, DeepSeek, all of them — don’t process letters. They process tokens. A token is an arbitrary unit the model learned to handle during its training. It can be a whole word, a syllable, a root, two stray letters. It depends on which languages were in the training.

Models are trained, in their overwhelming majority, on English text. Not out of bad faith: because the bulk of the world’s high-quality digitized text is in English. And it’s in English, in large part, because for sixty years English was the only language that fit comfortably inside the machines. The A that entered free of charge in 1963 has filled the planet’s servers without friction for six decades. When a model now gets trained on that accumulated text, it learns to compress English with extreme efficiency. A whole English word can fit inside a single token.

Spanish doesn’t. French doesn’t. Arabic doesn’t. Burmese, Hindi, Shan, Amharic — those fly apart in pieces.

In 2023, Aleksandar Petrov, at Oxford, measured this carefully.

He ran the Universal Declaration of Human Rights — officially translated into hundreds of languages — through the tokenizers of the most widely used models. He wanted to know how many tokens the same text cost in each language.

The same text costs fifteen times more tokens in Burmese than in English. In Shan, up to seventeen. In Hindi, five. In Arabic, four. In Spanish, around double.

Fifteen times. It’s worth pausing on that number.

Because the models have what’s called a context window: a token limit per conversation. Which means that an English-speaking user can fit, inside the same conversation, fifteen times more text, more documents, more nuance, more prior history than a user writing in Burmese. Their thinking has fifteen times more room to unfold inside the same tool.
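The arithmetic can be sketched in a few lines. The ratios are the rounded figures quoted above. The window size of 128,000 tokens is a hypothetical round number, not a figure for any particular model, and the function name is invented for illustration.

```python
# Tokens needed for the same text, relative to English.
# Rounded figures as quoted in this essay, not fresh measurements.
TOKEN_RATIO = {
    "English": 1.0,
    "Spanish": 2.0,
    "Arabic": 4.0,
    "Hindi": 5.0,
    "Burmese": 15.0,
    "Shan": 17.0,
}

# Hypothetical window size; real models differ.
CONTEXT_WINDOW_TOKENS = 128_000


def english_equivalent_room(language: str,
                            window: int = CONTEXT_WINDOW_TOKENS) -> float:
    """How much text fits in the window, measured in English-token units."""
    return window / TOKEN_RATIO[language]


english_room = english_equivalent_room("English")

for language in TOKEN_RATIO:
    room = english_equivalent_room(language)
    share = room / english_room
    print(f"{language:<8} {room:>9.0f} units  {share:6.1%} of the English room")
```

The same division applies to any service that bills per token: at a fixed price per token, the same paragraph costs as many times more as the ratio says.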

This essay was first written in Spanish. It paid double to come into being. You’re reading the version that pays half — because you happen to speak the language that the machine learned to compress for free. We’re not paying the same bill, you and I. We never were. That’s what this essay is about.

And this isn’t a passing flaw. It’s structural. As long as the bulk of the world’s digitized text remains in English — and it will, because the loop of the last sixty years doesn’t reverse itself in a decade — the models will keep compressing English better than everything else.

Sixty years later, the same decision keeps charging the same bill, now scaled by artificial intelligence.

And here’s what should be said without blinking.

Artificial intelligence tools are sold, worldwide, as levelers. The official line: now anyone can access legal, medical, financial, educational guidance without having to pay a human expert. Intelligence is being democratized. The barrier between privilege and the rest is coming down.

What happens underneath is exactly the opposite.

A person who speaks English — and who probably, by the correlations of the world as it currently stands, also has better education, better connection, better purchasing power, a better passport — gets the maximum return from the tool at the minimum cost. More conversation. More nuance. More iterations. And if they pay, they pay less to do it.

A person who speaks Quechua, Aymara, Vietnamese, Burmese, Amharic — and who probably, by those same correlations running the other way, has less formal education, worse connection, less money, a worse passport — gets less return from the same tool at double, triple, quintuple, tenfold the cost.

Their conversation runs out sooner. Their nuance overflows faster. And if they pay, they pay more for less.

The tool sold as a leveler is, internally, regressive. It gives more to those who already had more. It charges more to those who already had less. And it does this without anyone having to write a single line of discriminatory code. It does it simply by carrying forward the bias of 1963.

This is not an implementation flaw. It is the logical continuation of the original cut. The operation that started by deciding which letters fit inside 128 numbers now decides which thoughts fit inside a context window. And the ones that fit best now are still the ones that fit best back then.

The monocultural decision of 1963 still governs, in 2026, who can think more and who can think less with the most powerful tools humanity has ever had. No dictator controls them. No government controls them. A tokenizer controls them. And the tokenizer is blind, the industry says. It’s just math.

Well, no.

The tokenizer is the latest heir of a 1963 California committee that decided the world fit inside 95 characters. And it compresses best what was declared standard, and compresses worst what was declared extra.

This is the last nail in the coffin of equity. Because the gains in equity that the twentieth century, stumbling forward, tried to push ahead are now being rewritten from inside a tool that doesn’t need to forbid anyone anything.

It’s enough that it charges more to think in one language than in another. And since no one sees the tokens, no one sees the bill. And since no one sees the bill, no one protests. And since no one protests, the asymmetry becomes invisible and permanent, just as the 1963 decision became invisible and permanent.

The difference is that in 1963 a letter was at stake. Now what’s at stake is access to augmented intelligence: education, medical consultation, legal advice, navigating bureaucracy, learning, job searches, knowing your rights, official information.

All of that. And all of that, already, today, costs more in Quechua, Hindi, Arabic, or Burmese than in English.

Things are. No matter who measures them, they are.

The Ñ exists. The Vietnamese Ộ exists. The Punjabi ਪ exists. The Amharic ሰ exists. The thought of every person who speaks each of those languages exists, is legitimate, is sovereign, and should cost the same to unfold as the thought of any person who speaks English.

But it doesn’t cost the same. It costs double, triple, quintuple. Or more. And no one protests because no one sees it. And no one sees it because the tokenizer is invisible. And the tokenizer is invisible because we’ve been told it’s just math.

It isn’t just math. It’s math inherited from a 1963 cultural decision that has never been revised — only patched, scaled, and now automated.

What happens in California keeps happening in California. Only now it happens on every screen on the planet.

Lines Aja

Brand Strategist & Verbal Identity Consultant — Las Musas

cultooruido.com
