Friday, October 31, 2014

The Etymology of Colours: Part 2

On Wednesday we looked at the origins of words we use for colours, focusing on the first three colours of the visible light spectrum: red, orange, and yellow. Today we'll be going through the remaining colours.


In Middle English and Old English, the colour green was grene. In Proto-Germanic, *gronja- was the root of green, grass, and grow, as well as the root of the word for green in a number of other related languages such as Dutch, Danish, Old Frisian, Old High German, Old Norse, and Old Saxon.

Earlier still, the root in Proto-Indo European (PIE) is thought to have been *ghre-, meaning "grow", since green is the colour of most vegetation.


The story of blue is fascinating. The word comes from the Old French term blo, which generally referred to a range of colours and shades including what we would now consider blues, greys, blonds, and other pale colours. The term is thought to have evolved from the Proto-Germanic term *blæwaz, which gave rise to the term in a huge number of languages.

While the term for blue is fairly widespread in Indo-European languages, what really makes the colour fascinating is the lacunae it has left in other world languages. In English the colours green and blue are fairly distinct, but there are plenty of languages where the two colours are referred to by one term. Several languages in Asia, including Old Chinese, Old Japanese, Thai, and Vietnamese, have traditionally not distinguished between the two and instead refer to a blue-green range, a concept that doesn't really exist in English.

The term for blue is thought to be a latecomer to the vocabularies of many languages since blue dye is so difficult to make, while autumnal shades such as reds, oranges, and yellows were easier to make and therefore required terms sooner.


For many English speakers, designating the colour between blue and violet seems arbitrary and difficult to define. Its presence in the spectrum is thought to be a result of Sir Isaac Newton's superstition against the number six.

The colour chosen by Newton was none other than indigo, a term whose origins can be found in the Greek name for the colour dye which came from India. The Greek word indikon (ινδικόν) became indicum in Latin before inspiring indico in Spanish and endego in Portuguese, which are considered to be the root of the Dutch word indigo. The Dutch term entered the English language in the 16th century.


The last colour in the rainbow is violet. The term came to Middle English during the 14th century from the diminutive of the Old French viole, which itself came from the Latin viola, a word for both the colour and the flower. The Latin term is thought to have come from a pre-Indo-European language somewhere in the Mediterranean.

After the weekend, we'll be back to look at the terms for some of the colours outside of the visible light spectrum.

Part 1 | Part 2 | Part 3

Wednesday, October 29, 2014

The Etymology of Colours: Part 1

Today we're taking a trip through the rainbow as we look at the etymology and origins of the names we use for colours. For simplicity, we're going to start today with the classic "rainbow" colours, which Sir Isaac Newton dubbed the spectrum, from the Latin for "apparition". The term later came to refer to visible light split through a prism, a word that reached English through Latin from the Greek term prisma, meaning "something sawed".


The first colour of the rainbow has origins in several languages and unfortunately can't be traced back to one single language. The word red was written as rēad in Old English. In fact, the British surname Reed is from the Old English for red, and is pronounced in a similar manner to how it was said before vowel shortening occurred in Middle English.

Before Old English, the word was rauthaz in Proto-Germanic, from rewdʰ, a Proto-Indo European (PIE) word. As a result of this origin, a large number of languages have similar words for the colour.


The word orange, which names both a colour and a fruit, is often subject to a large degree of debate. While many people claim that it is one of the only words that rhymes with no other word, this is not actually true. The word sporange, a sac where spores are made, is one of the few words that aren't proper nouns and rhyme with it.

Rhyming aside, there is also a debate as to whether the fruit was named because of the colour or whether the colour was named after the fruit. Etymologists consider the colour to be named after the fruit since the word's origins are from the Sanskrit word for the tree. नारङ्ग or nāraṅga made its way into Persian as نارنگ, or nārang, before reaching European languages.

While the word nārang remained fairly true to its roots in a number of European languages, when it reached Old French it is thought to have lost its initial "n" due to rebracketing, whereby the initial "n" was thought to be part of the indefinite article "une" so that "une norenge" was heard as "une orenge".


Yellow has an interesting etymology that is similar to that of the colour red. Yellow's roots also begin in PIE, where it shares its root with the word yell, as both words originate from the PIE root gʰel-. This origin has resulted in a number of European languages, particularly the Germanic languages, having similar words for yellow. The words for yellow in Dutch, East Frisian, German, Swedish, and West Frisian all have similar origins.

The term ended up in Proto-Germanic as gelwaz before it became geolu in Old English. This Old English term gave us the word we use today for yellow. However, it should be noted that in Middle English, the term also referred to colours and tones that we wouldn't consider yellow by today's standards, including a number of blue and grey colours.

We'll finish the remainder of the rainbow on Friday when we'll cover the colours with shorter wavelengths.


Friday, October 24, 2014

United Nations Day: The Languages of the UN

Today, October 24, marks the date that the Charter of the United Nations came into effect. While it hardly makes for a riveting read (you can read it here if you must), what it does in practice is far more astounding, since it acts as the treaty that founded the UN.

The flag of the UN
The treaty itself was signed on 26 June 1945 at the San Francisco War Memorial and Performing Arts Center. When it was signed, Poland was the only one of the 51 founding nations not present, eventually signing the treaty a couple of months later.

The five permanent members of the Security Council (P5) at the time, the Republic of China, France, the UK, the US, and the USSR, ratified the charter alongside a number of other nations. While it may seem odd to mention the P5, their importance will become evident as we look at the official languages of the UN.

When the charter was made, it was written in five languages: Chinese, English, French, Russian, and Spanish. It wasn't until the first General Assembly that the five official languages and working languages of the UN were decided. Initially, English and French were decided upon as the working languages.

Spanish was added as a working language in 1948, making the three languages the status quo for the General Assembly until 1968, when Russian was added as the fourth working language. By this point, four of the five official languages were in use as working languages. Chinese was then made a working language in 1973, making all five original official languages also working languages.

Arabic was added as both an official and a working language in 1973. The official language status of Arabic only extended to the General Assembly and its "main committees", as opposed to the five other languages, which held official status throughout all committees. For the first three years after Arabic became an official language, the Arab nations of the UN were expected to fund the procedures required to enact this change.

After seven years as an official language for the General Assembly and its main committees, Arabic's official status was extended to all subcommittees in 1980. Three years later, all six languages were adopted as the official languages of the Security Council.

Currently, there are a number of additional languages vying for official language status. In 2009, the president of Bangladesh suggested that Bengali be an official language of the UN. Esperanto has also been suggested, despite its relatively small number of speakers.

Hindi and Portuguese have also been suggested since they are both widely spoken languages. The Secretary-General of the UN and the Turkish Prime Minister have also suggested that Turkish become one of the official languages.

Do you think the UN uses the right languages? Which languages do you think should become official languages of the UN? Tell us in the comments below.

Monday, October 20, 2014

Celebrating the Linguistic Life of Richard Francis Burton

On this day in 1890, Richard Francis Burton's fascinating life came to an end. Today we've decided to honour the man with a post about his life and his work as both a linguist and translator. While the stories of linguists and translators are often fascinating to us, few have led a more interesting and exciting life than Richard Francis Burton.

The hyperpolyglot himself in his later years.
Burton was born on 19 March 1821 in Torquay, England. However, a relatively small amount of his time was spent in his hometown since his family travelled often when he was a child. He spent a good number of his very early years in Tours, France after his family moved there in 1825. Burton later returned to England to attend a prep school in Surrey.

As his family travelled across Europe, generally between the United Kingdom, France, and Italy, Burton's love for languages led to him learning a considerable number of them. Starting with primarily Romance languages, he learnt French, Italian, Latin, and Neapolitan. He also learnt some Romani following a supposed affair with a gypsy woman, as well as learning Arabic during his time at school.

Having enlisted in the East India Company's army, Burton shipped out to India where he mastered a number of the local languages, including Hindustani, Gujarati, Punjabi, Sindhi, Saraiki, and Marathi, not to mention improving upon his Arabic and adding Persian to his rapidly growing list of languages. He also owned a group of monkeys which he attempted to communicate with, earning him much ridicule from his fellow soldiers.

Eventually, a sense of adventure compelled Burton to undertake a pilgrimage to Mecca, earning him widespread fame. However, Burton was undercover during the pilgrimage. While he had extensively researched and improved upon his Arabic, he pretended to be Pashtun in order to help explain why he spoke the way he did.

Burton was an active participant in the Crimean War after he rejoined the army. Following an alleged mutiny, and a subsequent enquiry in which Burton was mentioned, he spent time exploring Africa.

After several stints exploring Africa, Burton's later years were spent in diplomatic and academic roles. He spent time in Brazil, Damascus, and Trieste, to name a few places. He also continued to travel and write before undertaking the translations that earned him significant recognition.

Sir Richard Francis Burton translated the Kama Sutra, which generated considerable controversy at the time. He also translated The Book of the Thousand Nights and a Night, which is often known as Arabian Nights. By the time Burton died, he had mastered somewhere between 25 and 40 languages, depending on how you count them, making him more than worthy of our respect.

Friday, October 17, 2014

Hatsune Miku: Virtual Vocals and Synthetic Singing

During a recent Facebook scrolling session, an odd link popped up on my news feed. It was this video of a musical performance on the Late Show with David Letterman.

You don't need to be the most observant person in the world to realise that the performer, Hatsune Miku, or 初音ミク, as her name is written in Japanese, is not a real person. Hatsune Miku is not the first virtual performer; other popular virtual acts include Alvin and the Chipmunks, The Archies, and Gorillaz. However, Hatsune Miku can do something that other acts can't do: sing.

You may think that her high-pitched singing is not as good as the sped-up singing of Alvin, Simon, and Theodore, and you may be right. However, the Chipmunks, much like other virtual acts, had their music and their vocals pre-recorded. Hatsune Miku's vocals are synthesised using Yamaha's VOCALOID2 and VOCALOID3 vocal synthesisers.

If you're familiar with Japanese, you may recognise the components of Hatsune Miku's name. In fact, the name translates as "the first sound from the future", with Hatsu (初) meaning "first", Ne (音) meaning "sound", and Miku (ミク) meaning "future".

Sapporo, Japan, the hometown of Hatsune Miku.
While 16-year-old Hatsune Miku could be said to be from Sapporo, the technology that allows her to sing was conceived in Spain as part of a research project at Pompeu Fabra University in Barcelona.

Hatsune Miku's voice isn't purely synthesised and is in fact generated from phonemes prerecorded by Japanese voice actress Saki Fujita. Initially, only Japanese phonemes were recorded; English phonemes, also taken from Saki Fujita's recordings, were added for a later release. This allows her to sing in both languages, albeit with a Japanese accent when she sings in English.

The process that allows for the manipulation of the phonemes into song is known as concatenative synthesis. Using this process, sound samples (known as units) can be manipulated. This allows the user to modify a range of qualities, including the unit's length, pitch, and timbre.
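
To give a rough feel for the idea, here is a minimal Python sketch of joining sound units together. It is only an illustration and not how VOCALOID itself works: the "units" are plain sine tones standing in for recorded phonemes, the names make_unit, resample, and concatenate are made up for this example, and the naive resampling used here changes length and pitch together, whereas a real synthesiser adjusts them independently.

```python
import math

SAMPLE_RATE = 8000  # samples per second (illustrative value)

def make_unit(freq_hz, n_samples):
    """Stand-in for a recorded phoneme: a plain sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n_samples)]

def resample(unit, factor):
    """Stretch (factor > 1) or shrink (factor < 1) a unit by linear interpolation.

    Naive resampling alters length and pitch at the same time.
    """
    new_len = max(2, int(len(unit) * factor))
    out = []
    for i in range(new_len):
        pos = i * (len(unit) - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, len(unit) - 1)
        frac = pos - lo
        out.append(unit[lo] * (1 - frac) + unit[hi] * frac)
    return out

def concatenate(units, overlap):
    """Join units end to end, crossfading `overlap` samples at each joint."""
    result = list(units[0])
    for unit in units[1:]:
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)  # fade-in weight for the incoming unit
            idx = len(result) - overlap + i
            result[idx] = result[idx] * (1 - w) + unit[i] * w
        result.extend(unit[overlap:])
    return result

# Hypothetical unit bank: tones standing in for recorded phonemes.
bank = {
    "ha": make_unit(220, 800),
    "tsu": make_unit(330, 800),
    "ne": make_unit(440, 800),
}
stretched_tsu = resample(bank["tsu"], 1.5)
song = concatenate([bank["ha"], stretched_tsu, bank["ne"]], overlap=80)
```

The crossfade at each joint is what stops the units from clicking where they meet; the quality of a real system depends largely on how well those joints are smoothed.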

Since anyone who owns the software can synthesise speech and vocals, Hatsune Miku is "technically" the performer of thousands of songs. She's not alone, though. There are also other virtual performers available with different language combinations such as Spanish and Chinese. Other languages can also be approximated using preexisting phonemes, with differing levels of success.

Wednesday, October 1, 2014

Localization and the Video Games Industry: Who Gets What?

Last weekend, Saturday to be precise, I was lucky enough to take a trip to London for this year's Eurogamer Expo, which now refers to itself as the cooler-sounding "EGX". As a self-confessed video game and language nerd, I am very interested in the translation and localization of video games and electronic entertainment.

When I was younger, I often didn't give a second thought to the fact that the video games I played were always either in English or provided an option to select English from a number of languages. As a kid I would often head into town to get a new game and immediately spend the entire trip home reading the blurb on the back and the instruction manual.

Growing up in the UK meant that the text on the box and in the instructions was either only in English or was in EFIGS (English, French, Italian, German, and Spanish), which are often deemed the "most important" languages in Europe. While some of the packaging featured other languages, the software often was only in English, with no other language options provided.

The discrepancy between the packaging and the software barely bothered me as a kid. However, as an adult I now realise that large corporations will only translate and localize games when there is a profitable market to be exploited. With all this in mind, I decided to quickly do some research into which languages and locales the video games industry favours.


Steam's search engine allows for the filtering of the online distribution service's catalogue by language. This past weekend there were 14,576 titles available on Steam, with around 90% of these available in English. Titles in the other EFIGS languages are also widely available: 44% of titles are available in German and almost 42% in French, while 37% and 35% of games are listed as being in Spanish and Italian respectively.
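
To put those percentages into absolute terms, here is a quick Python sketch that turns each share into an estimated number of titles. The percentages are the approximate ones quoted in this post, so the counts are rough estimates, and the names used are just for illustration.

```python
TOTAL_TITLES = 14576  # titles on Steam at the time, per the post

# Approximate share of titles per language, in percent, as quoted in the post.
LANGUAGE_SHARE = {
    "English": 90,
    "German": 44,
    "French": 42,
    "Spanish": 37,
    "Italian": 35,
}

def estimated_titles(total, percent):
    """Return the rounded number of titles that a percentage share implies."""
    return round(total * percent / 100)

estimates = {
    lang: estimated_titles(TOTAL_TITLES, pct)
    for lang, pct in LANGUAGE_SHARE.items()
}

for lang, count in estimates.items():
    print(f"{lang}: roughly {count} titles")
```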

These figures are hardly surprising if you just take a look at the usage notes for "EFIGS" on Wiktionary: "In software development, used to designate five widely used languages that software (notably video games) is often translated to."

It's very clear that games are not translated in proportion to the number of speakers of a language. If this were the case, Simplified and Traditional Chinese combined would not account for only 4% of the games available through Steam. In fact, it's fairly obvious (and a little sad) that the proportions line up with the relative size of the markets in which those languages are spoken.

Xbox Marketplace

It's not just the language you speak that may limit the number of games you can get. I am lucky enough to speak English and to live in the United Kingdom. However, that did little to console me when I found out that the number of Xbox 360 games available on the Xbox Marketplace varies vastly depending on your locale.

The United States enjoyed the largest number of games available. At the time I checked, there were 1223 titles available in the US while the UK's catalogue contained 1147, a difference of 76 titles. That is only around 6% of the US catalogue, making the difference fairly minor.

While Steam showed a linguistic bias towards European languages, the Xbox Marketplace tends to favour markets in North America and Europe, where users have access to more titles than elsewhere in the world. For example, 1112 games were available in Spain while only 365 were available in Argentina, despite both countries being primarily Spanish-speaking. For some odd reason, Argentina also has fewer than half as many games available as other Spanish-speaking countries in South America, such as Chile (840) and Colombia (861).

Much like on Steam, mainland China gets the short end of the stick, with a paltry 25 titles available. However, 976 were available in Hong Kong, more than in any other Asian locale. Undoubtedly this can partially be attributed to non-linguistic factors.
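
For a clearer sense of scale, the following Python sketch expresses each locale's catalogue as a percentage of the US catalogue. The counts are the ones I noted down for this post; the function and variable names are just for illustration.

```python
# Xbox 360 titles per locale, as counted in the post.
TITLES_BY_LOCALE = {
    "United States": 1223,
    "United Kingdom": 1147,
    "Spain": 1112,
    "Hong Kong": 976,
    "Colombia": 861,
    "Chile": 840,
    "Argentina": 365,
    "Mainland China": 25,
}

def percent_of(count, baseline):
    """Express count as a percentage of baseline, to one decimal place."""
    return round(100 * count / baseline, 1)

us_total = TITLES_BY_LOCALE["United States"]
relative = {
    locale: percent_of(n, us_total)
    for locale, n in TITLES_BY_LOCALE.items()
}

for locale, pct in relative.items():
    print(f"{locale}: {pct}% of the US catalogue")
```

The same helper can compare any two locales directly, for example percent_of(365, 840) for Argentina against Chile.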

Israel, Turkey, and Saudi Arabia all have access to between 300 and 400 games while in the United Arab Emirates (UAE), over 500 titles were available. Does this increase have anything to do with the fact that the UAE is home to the highest net migration rate in the world?

Is the difference between the number of games available in Europe and South America solely due to the size of their video game markets or are there political and economic reasons as well? Is the discrepancy just because some languages are easier to work with than others? If you happen to be an industry expert or deal with localization, I'd love to hear from you in the comments below.