Thursday, February 28, 2013

Rise of the Machines: Computational Linguistics

We've said before that Google Translate isn't very good, and it isn't. Compared with human translators, its output has proven to be horrendous. However, the science behind it, known as computational linguistics, is very impressive.

If you've ever tried programming you know it's not the easiest subject. Programming languages follow a strict syntax that can rarely be broken. With natural languages you can make a mistake and be understood, whereas computers refuse to allow the user such liberties.
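
To make the contrast concrete, here is a minimal sketch in Python (our choice of language for illustration; any programming language would make the same point). It uses the standard library's ast.parse to check whether a line of code is syntactically valid. The helper name is_valid_python is made up for this example.

```python
import ast

def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python, False on a syntax error."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# A correct statement parses without complaint.
print(is_valid_python("print('hello')"))
# Drop one closing bracket and the parser rejects the whole line.
print(is_valid_python("print('hello'"))
```

A human reader would shrug off the missing bracket and carry on. The parser will not.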

When you combine programming with the field of linguistics you end up with what we call computational linguistics. Its function is to model human, or natural, languages, often with the use of crazy mathematics and programming.

Computational linguistics, like a lot of technology, came about due to political paranoia during the 1950s. When the United States became aware that it had made some foreign enemies, it decided to have their messages translated into English rather than learn another language. How times have changed...

[Image caption: Even this is more advanced than the machine they tried to translate with.]

The initial research was done using text, as speech recognition poses its own problems. Unsurprisingly, what really interested the Americans was the Russians. They needed scientific journals translated from Russian to English, and en masse.

As you can imagine given the state of current machine translation, sixty years ago the technology was even less advanced, so much so that the scientists were forced to reassess the whole field. Word-for-word translation barely made any sense, and once they realised that grammar, syntax, lexicon, and morphology were essential to producing accurate, high-quality translations, they had given themselves significantly more work.
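
To see why word-for-word translation falls apart, here is a rough sketch of the naive approach in Python. The four-word lexicon and the function name word_for_word are invented purely for this illustration and are not taken from any real system, historical or modern.

```python
# Toy Russian-to-English lexicon, invented for this illustration only.
TOY_LEXICON = {
    "у": "at",
    "меня": "me",
    "есть": "is",
    "книга": "book",
}

def word_for_word(sentence: str, lexicon: dict) -> str:
    """Swap each word for its dictionary entry, ignoring grammar entirely."""
    words = sentence.lower().split()
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(lexicon.get(word, word) for word in words)

print(word_for_word("У меня есть книга", TOY_LEXICON))
```

The Russian sentence means "I have a book", but the sketch prints "at me is book". Every word has been looked up correctly, and the result is still nonsense, because nothing in the lookup knows about grammar, syntax, or morphology.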

What started as research into using machines to monitor the Russians has since developed into the study of what makes languages work and how to synthesize them. Though machine translation is easy to criticise, you cannot take away from the phenomenal work that goes on in computational linguistics departments across the world as they endeavour to map, model and analyse our favourite thing in the world: languages.