Monday, December 30, 2013

Language Profile: Czech

Today we'll be taking a brief look at Czech, the official language of the Czech Republic. Czech is a member of the Slavic language family, and was known by the name "Bohemian" in English until the late 19th century. It shares much of its lexicon with other Slavic languages such as Ukrainian and Russian.

The Czech language is also used as a minority language in Slovakia, whose official language is Slovak. These two languages are closely related, to the extent that most varieties of both are mutually intelligible. Together, they can be considered to be a pluricentric language, or language with multiple standard versions, such as Catalan/Valencian/Balearic and Serbo-Croatian.

The Charles Bridge over the Vltava River in Prague.
There are several dialects of Czech, including one specific to Czech speakers residing in Texas. The dialect developed in the late 1800s when many Czech immigrants moved to this area of the US. Czech journalism was prominent in the area for some time, but in recent years use of the dialect has been declining, sadly. 

The Institute of the Czech Language regulates standard Czech from its headquarters in Prague, the capital of the Czech Republic. The language is written using a Latin-based script. It is interesting to note that the language contains many words that contain no vowels, such as vlk ("wolf") and smrt ("death"). It also uses three different genders: masculine, feminine, and neuter. The masculine is then further divided into animate and inanimate, though the same is not true of feminine and neuter terms.

Friday, December 27, 2013

5 Tips for Learning a Dialect by Amy Rinkle

Just as there are dialects and regional variations to spoken English, students of languages quickly become aware of the many dialects that exist within other languages. As an Arabic student myself, I soon learned there are many different shapes and forms of Arabic: dialects that were far more commonly spoken in daily life than the Modern Standard Arabic I was being taught at university. One of the many hurdles I’ve faced when studying Arabic is how to acquire the vocabulary, phraseology and inflection specific to the dialects I was interested in speaking, not just the formal language I was being taught. I’m not alone in this problem, and Arabic is not the only language where students face this issue.

So what should you do if you want to learn a specific dialect, but the resources and classes available to you are oriented to a different one? Here are five tips to help:

1. Learn the professional spoken and written language

Yes, that’s right. Even if the classes and resources that are available to you are not available in the dialect you most want, it is not a waste of time to learn the common or professionally spoken and written dialect. This knowledge will aid you when you are learning your dialect, and it will enable you to at least communicate with native speakers, and give you a base knowledge of vocabulary and grammar from which to compare where one dialect differs from another.

There is a debate on whether or not it is better to learn a regional dialect first, and then the formal or professional dialect, and I will not take sides on that debate. I will only say that in my experience, many of the resources and books that are available for teaching students a regional dialect are supplemental, and would be vastly confusing to someone who did not already have familiarity with the language. Therefore, I do not consider it a waste of time (just the opposite!) to put significant effort into learning the most common or professional dialect.

2. Look in obscure places to find new resources

Though they might not be easy to find, and though it may be harder to find classes that are taught in your target dialect, there are resources out there for those who are looking to learn a regional dialect. It’s finding them that can be the problem!

There are several different ways to find books and recordings meant to teach you your chosen dialect. I would suggest first finding a university that teaches your language and has a strong study abroad program to the region, then emailing or talking to the language department or area studies center to ask for resources. Since most professors who teach that language are native speakers, they will often know about obscure curricula in regional dialects, and they will have access to the niche publishers and organizations that produce them. Some resources are not even available to the general public! It certainly can’t hurt to ask, and many professors will be delighted to help advise you on learning a dialect, or even recommend a tutor, if you are living in the country and region that speaks the dialect you are trying to learn.

It may sound bizarre, but missionary organizations or aid groups are other places that might have excellent recommendations on resources for learning a dialect. Since many of their members are interacting with the public and therefore need to speak a dialect in a specific region, they will often have either created their own curriculum for learning that dialect, or they will know where to find classes, tutors, books, or other resources. All it takes is an email to find out.

3. Watch and listen to local media

Sometimes, the best way to begin learning words in your chosen dialect is by listening to and watching media that features that dialect. Movies that are set in the region where the dialect is spoken, even when they’re in your target language, are often not the best bet, since accents are often toned down in movies, or actors are hired who are not from that specific region. But the news can sometimes be a good source for hearing the dialect, and I have found that talk shows and interviews in particular are excellent for hearing the dialect spoken. Talk radio that is specific to a major city in the region is also a way to begin listening.

Music can also be a great way to pick up a dialect, and especially vocabulary that is specific to a region. Artists from a region will often sing in that particular dialect. Rap in particular is often very specific to the region, and you will hear a lot of new words and vocabulary. Personally, it’s not my favorite style of music, but for learning an accent, I would recommend listening to up-and-coming rap artists, regardless of the language. Rap as a genre is available in a wide variety of languages, and even if it isn’t, it’s still worth it to seek out the music that is produced and sung by artists who speak your target dialect.

4. Find a native speaker to talk to

The three tips above will help you in beginning to learn a dialect, but you will never be able to master it unless you find native speakers to talk to and ask questions. If you live in a region that speaks the dialect you want to learn, this is fairly easy to do — you will be running into native speakers that you can practice with, and many expat groups and language centers are able to recommend native speakers to partner up with.

If you do not live in the area and you want to learn to connect with a native speaker, I recommend emailing a university, again. Oftentimes a language department will host conversation clubs and will have contacts in the community who will know native speakers that might be able to meet with you. If that is not an option, then I suggest looking online. There are several sites that exist to help facilitate language partnerships and meetings via Skype. Look for someone who is a native speaker of your target dialect, and jump onto Skype.

5. Use technology to help fill the gaps

Beyond the sites mentioned above, the internet is a great resource. Use Twitter to find people from the region where your dialect is spoken; many of them will be writing and interacting in that dialect. Jump onto forums or sites specific to that region and see how the members write back and forth to each other. Use YouTube to look up video clips of interviews, shows, and even regular people talking to each other in your target language and dialect. This material is not designed to help you learn a dialect, but it will still assist you, especially if you combine it with the tips above.

There is also a site called forvo.com that is crowdsourcing spoken language with clips of native speakers pronouncing different words. The clips specify which country the speaker is from. It is not precise, but it can help with hearing the pronunciation of the dialect, and if there is more than one recording, then hearing the differences between speakers from different countries and regions.

Finally, my last suggestion is to be a part of the online language learning community, and to follow blogs such as the Lingua File. I myself work with Lango, an iOS app that will help record and crowdsource languages and their dialects, and I have only learned about sites like forvo.com and other resources by being part of the language learning community. Though reading a blog may not directly contribute to learning a dialect, it might point you to new tools that can.

Amy Rinkle is a 25-year-old perpetual Arabic student, French speaker, and freelance writer. She is currently affiliated with Lango, an app to learn any language, anywhere, which is fundraising on Kickstarter until January 13th, 2014.


Wednesday, December 25, 2013

Merry Christmas!

Wherever you are, if you celebrate it, we'd like to wish you a Merry Christmas! We'd like to thank everyone who reads, contributes, and gets involved. Thank you for your support. Normal service will be resumed on Friday. Until then, don't overindulge and enjoy this time of year.

Merry Christmas!

Monday, December 23, 2013

Language Profile: Quechua

This week we're taking a look at Quechua, a macrolanguage primarily spoken in the Andes mountain range of South America. Known as runa simi in its native tongue, Quechua is the most widely spoken indigenous language in the Americas, with over 8 million native speakers. It is an official language in Peru and Bolivia, and is also spoken in Colombia, Ecuador, Chile and Argentina.

As a macrolanguage, Quechua is actually a language family of several closely related varieties. Some of these varieties aren't mutually intelligible, and are in fact completely distinct languages. 

A couple of llamas enjoying a swim.
The word llama comes from Quechua.
Quechua has a long linguistic history. It was the official language of the Inca Empire. The language was also used as a lingua franca between the Spaniards and the indigenous populations after the Spanish conquest of the area in the 16th century.

It has been written using a Latin-based alphabet since the Spanish conquest, though it is primarily an oral language. Due to its main use as a spoken language, there are not many books or newspapers published in Quechua.

In recent times, there has been quite a bit of lexical transfer between Quechua and Spanish. In fact, approximately 30% of the terms in modern Quechua come from the Spanish language. A few Quechua terms have also made their way into the English language, which we covered several months ago.

Friday, December 20, 2013

The Dos and Don'ts of Siamese Twins

Today we're looking at Siamese twins, not the politically incorrect term for conjoined twins, but rather a linguistic concept that almost dictates your word usage. Have you ever wondered why you can say "dos and don'ts" yet not "don'ts and dos", and why it would be so incredibly wrong to do so?

The term is named after Chang and Eng Bunker, the original "Siamese twins": conjoined twins from Siam whose fame brought the condition to public attention and inspired the name once commonly used for it.

Mmm... peanut butter and jelly!
If you feel uncomfortable using the phrase "Siamese twins" in this day and age, then feel free to use any of the other, perhaps more acceptable, linguistic terms that include the word binomial, meaning "having two names". These irreversible binomials are expressions with two principal elements that can be nouns, adjectives, verbs, or adverbs, usually joined by a conjunction ("and", "or", "nor", and "but" in most cases). Sometimes they are instead composed of the two words in isolation. Examples include "above and beyond", "peanut butter and jelly", and "rain or shine". There are even trinomials, such as "tall, dark and handsome" or "signed, sealed, delivered".

Perhaps you've been thinking to yourself that a pre-existing relationship between words that can dictate their usage sounds familiar. If you were just about to say that this sounds a lot like a collocation, you would be right. Some Siamese twins are collocations so strong that they are effectively frozen, hence the term "freezes", another more accurate and politically correct name for the phenomenon.

Though certain Siamese twins are indeed a collocation that is so unbreakable that reversing the order sounds almost disgusting, some are not collocations at all and instead are idioms that through regular use are ingrained into the lexicon of the language.

Given that Siamese twins are fixed, unchangeable, and therefore, always the same, they inevitably become clichés and catchphrases through overuse.

There are really no such things as the dos and don'ts of Siamese twins, as the only thing you can do is use them as they are and make sure you don't change them. That said, they are a fascinating phenomenon that we probably never give much thought and often overlook despite being almost innately aware of their existence and constraints.

Wednesday, December 18, 2013

The Key to Learning Pronunciation by Gabe Wyner


As rumor has it, you can’t learn to have a good accent if you’re above the age of 7, or 12, or some other age that you’ve most definitely already exceeded. But that can’t possibly be true. Singers and actors learn new accents all the time, and they’re not, on average, smarter than everyone else (and they certainly don't all start before the age of 7).

So what’s going on here? Why does everybody tell you that you can’t learn good pronunciation as an adult? And if that’s not true, what is?

In this article, we’ll take a tour through the research on speech perception and pronunciation, and we’ll talk about learning pronunciation efficiently as an adult. But first, allow me a moment on my soapbox:

Pronunciation is important

This is a big topic, and as an opera singer, it’s a topic close to my heart. I find accents extraordinarily important.

This is a fényképezőgép
For one, if you don’t learn to hear the sounds in a new language, you’re doomed to have a hard time remembering it. We rely upon sound to form our memories for words, and if you can’t even comprehend the sounds you’re hearing, you’re at a disadvantage from the start. (Try memorizing Hungarian's word for camera, fényképezőgép [recording] or train station, vasútállomás [recording]. These words are brutal until you really get a feel for Hungarian sounds.)

But in addition to the memory issue, a good accent connects you to people. It shows people from another culture that you’ve not only taken the time and effort to learn their vocabulary and their grammar; you’ve taken the time to learn how their mouths, lips and tongues move. You’ve changed something in your body for them – you’ve shown them that you care – and as a result, they will open up to you.

I’ve seen this repeatedly when I sing or watch concerts in Europe. As a rule, audiences are kind, but when you sing in their native language, they brace themselves. They get ready to smile politely and say, “What a lovely voice!” or “Such beautiful music!” But beneath the surface, they are preparing for you to butcher their language and their heritage before their eyes. No pressure.

At that moment, if you surprise them with a good accent, they open themselves up. Their smiles are no longer polite; they are genuine. You’ve shown them that you care, not just with your intellect, but with your body, and this sort of care is irresistible.

But enough romanticizing; how do you actually do something about pronunciation?

Research on Ear Training and Pronunciation

Good pronunciation is a combination of two main skills: Ear training and mouth training. You learn how to hear a new sound, and you learn how to make it in your mouth. It’s the first of these two skills that’s the trickiest one; if you can hear a sound, you can eventually learn to produce it accurately, but before then, you’re kind of screwed. So for the moment, we’ll focus on ear training.

While doing research for my book, I came upon a wonderful set of studies by James McClelland, Lori Holt, Julie Fiez and Bruce McCandliss, where they tried to teach Japanese adults to hear the difference between “Rock” and “Lock.” After reading their papers, I called up and interviewed Dr. McClelland and Dr. Holt about their research.

The first thing they discovered is that ear training is tricky, especially when a foreign language contains two sounds that are extremely similar to one sound in your native language. This is the case in Japanese, where their “R” [ɺ] is acoustically right in between the American R [ɹ] and L [ɫ]. When you test Japanese adults on the difference between Rock and Lock (by playing a recording of one of these words and asking them which one they think you played), their results are not significantly better than chance (50%). So far, so bad.

The researchers tried two kinds of practice. First, they just tested these Japanese adults on Rock and Lock for a while, and checked to see whether they improved with practice.

They didn’t.

This is very bad news. It suggests that practice doesn’t actually do anything. You can listen to Rock and Lock all day (or for English speakers, 불/풀/뿔 [bul/pul/ppul] in Korean), and you’re not going to learn to hear the differences between those sounds. This only confirms the rumors that it’s too late to do anything about pronunciation. Crap.

Their second form of practice involved artificially exaggerating the difference between L and R. They began with extremely clear examples (RRrrrrrrrrock), and if participants improved, stepped up the difficulty until they reached relatively subtle distinctions between the two recordings (rock). This worked a little better. The participants began to hear the difference between Rock and Lock, but it didn’t help them hear the difference between a different pair of words, like Road and Load. In terms of a pronunciation training tool, this was another dead end.

Then they tried feedback, and everything changed.

Testing pairs of words with feedback

They repeated the exact same routine, only this time, when a participant gave their answer ("it was 'Rock'"), a computer screen would tell them whether or not they were right ("*ding* Correct!"). In three 20-minute sessions of this type of practice, participants permanently acquired the ability to hear Rs and Ls, and they could do it in any context.
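To make the routine concrete, here is a minimal sketch in Python of a two-word test with feedback. It is not the researchers' software: the names `Trial`, `make_trials` and `run_session` are hypothetical, and playing the audio and collecting the listener's answer are left to whatever functions you pass in.

```python
import random
from dataclasses import dataclass


@dataclass
class Trial:
    played: str     # the word whose recording is played, e.g. "rock"
    options: tuple  # the two candidate words, e.g. ("rock", "lock")


def make_trials(pair, n, seed=0):
    """Build n trials, each playing a randomly chosen word from the pair."""
    rng = random.Random(seed)
    return [Trial(played=rng.choice(pair), options=tuple(pair)) for _ in range(n)]


def run_session(trials, answer_fn, feedback_fn=None):
    """Collect one guess per trial and report right/wrong after each guess.

    answer_fn(trial) returns the listener's guess.
    feedback_fn(trial, guess, is_right) delivers the feedback; pass None
    to reproduce plain practice without feedback.
    Returns the fraction of correct guesses.
    """
    correct = 0
    for trial in trials:
        guess = answer_fn(trial)
        is_right = guess == trial.played
        correct += is_right
        if feedback_fn is not None:
            feedback_fn(trial, guess, is_right)
    return correct / len(trials) if trials else 0.0


if __name__ == "__main__":
    def show_feedback(trial, guess, is_right):
        print("Correct!" if is_right else f"No, that was '{trial.played}'.")

    # A stand-in listener who always answers "rock".
    accuracy = run_session(make_trials(("rock", "lock"), 20),
                           answer_fn=lambda trial: "rock",
                           feedback_fn=show_feedback)
    print(f"Accuracy: {accuracy:.0%}")
```

The only difference between the practice that failed and the practice that worked is the `feedback_fn` step: the listener hears the same recordings either way, but is told after every guess whether it was right.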

Not coincidentally, this is how actors and singers learn. We use coaches instead of computerized tests, but the basic principle is the same. We sit with an accent coach and have them read our texts. Then we say our texts out loud, and the coach tells us when we’re right and when we’re wrong. They’re giving us feedback. They’ll say things like “No, you’re saying siehe, and I need sehe. Siehe…Sehe. Hear that?” And as we get closer, they’ll continue to supply feedback (“You’re saying [something that’s almost ‘sehe’] and I need sehe.”) After the coaching, we’ll go home, listen to recordings of these coaching sessions, and use those recordings to provide us with even more feedback.

Now, some caveats. Participants didn’t reach a full native ability to hear the difference between Rock and Lock. Their accuracy seemed to peak around 80%, compared to the ~100% of a native speaker. Further investigation revealed what was going on.



Consonant sounds have lots of different components (known as 'formants'). Basically, a consonant is a lot like a chord on a piano: on a piano, you play a certain combination of notes together, and you hear a chord. For a consonant, you make a certain (more complex) combination of notes, and you hear a consonant. This isn’t just a metaphor; if you have a computerized piano, you can even use it to replicate human speech.




English speakers tell the difference between their R’s and L’s by listening for a cue known as the 3rd formant – basically, the third note up in any R or L chord. Japanese native speakers have a hard time hearing this cue, and when they went through this study, they didn’t really get any better at hearing it. Instead, they learned how to use an easier cue, the 2nd formant – the second note in R/L chords. This works, but it’s not 100% reliable, thus explaining their less-than-native results.

When I talked to these researchers on the phone, they had basically given up on this research, concluding that they were somewhat stumped as to how to improve accuracy past 80%. They seemed kind of bummed out about it.

Possibilities for the future

But step back a moment and look at what they’ve accomplished here.

In three 20-minute sessions, they managed to take one of the hardest language challenges out there – learning how to hear new sounds – and bring people from 50% accuracy (just guessing) to 80% accuracy (not bad at all).

What if we had this tool in every language? What if we could start out by taking a few audio tests with feedback and leave with pre-trained, 80% accuracy ears, even before we began to learn the rest of our language?

We have the tools to build trainers like this on our own. All you need is a spaced repetition system that supports audio files, like Anki, and a good set of recorded example words (A bunch of rock/lock’s, thigh/thy’s, and niece/knee’s for English, or a bunch of sous/su’s, bon/ban’s and huis/oui’s for French). They take work to make, but that work only needs to be done once, and then the entire community can benefit.
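As a rough sketch of what building such a deck could involve, the Python below turns a list of word pairs and their recordings into tab-separated rows that a flashcard program can import. The front of each card plays a recording using Anki's `[sound:...]` field syntax, and the back gives the answer. The function names and file names are hypothetical, and the recordings themselves would still need to be copied into the program's media folder.

```python
import csv
import io


def build_notes(pairs, recordings):
    """Return (front, back) rows for every recording of every word.

    pairs      -- list of two-word tuples, e.g. [("rock", "lock")]
    recordings -- dict mapping each word to a list of audio file names
    """
    rows = []
    for first, second in pairs:
        for word, other in ((first, second), (second, first)):
            for filename in recordings.get(word, []):
                # Front: play the clip. Back: the word, plus the one it is not.
                rows.append((f"[sound:{filename}]", f"{word} (not {other})"))
    return rows


def to_tsv(rows):
    """Serialise rows as tab-separated text, one card per line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example = build_notes(
        [("rock", "lock")],
        {"rock": ["rock_1.mp3"], "lock": ["lock_1.mp3", "lock_2.mp3"]},
    )
    print(to_tsv(example), end="")
```

Several recordings per word, ideally from different speakers, keep the listener from simply memorising one particular clip.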


Pronunciation is too important, and this solution is too valuable to wait for some big company to take over. Over the next 9 months, I’m going to start developing good example word lists, commissioning recordings and building these decks. I’m going to recruit bilinguals, because with bilinguals, we can get recordings to learn not only the difference between two target-language sounds, like sous and su, but also the difference between target language sounds and our own native language sounds (sous vs Sue). I ran this idea by Dr. McClelland, and he thought that may work even better (hell, we might be able to break the 80% barrier). And I’m going to do a few open-ish beta tests to fine tune them until they’re both effective and fun to use.

Hopefully, with the right tools, we can set the “It’s too late to learn pronunciation” rumors to rest. We’ll have a much easier time learning our languages, and we’ll have an easier time convincing others to forget about our native languages and to speak in theirs.

Gabriel Wyner is the author of Fluent Forever (Harmony/Random House, August 2014) and the Fluent Forever blog. His Kickstarter project, a series of pronunciation trainers in 11 languages, will run until January 2, 2014.

Monday, December 16, 2013

Language Profile: Zulu

Today we'll be looking at Zulu, the second most spoken Bantu language in the world after Shona, which we covered in last week's profile. It is also known as isiZulu, with isi- being the prefix Zulu uses for the names of languages.

Drakensberg, the highest mountain range in South Africa.
Zulu is one of South Africa's 11 official languages, and is understood by over half of the country's population. It is the most common native language in South Africa, where it is spoken by approximately 23% of the population, followed by Xhosa, Afrikaans, and English. Zulu is also spoken in the neighboring countries of Lesotho, Malawi, Zimbabwe, Mozambique and Swaziland.

Despite being the most common native language in South Africa, Zulu has only had official status in the country since 1994. The language has also been increasingly used in television, radio, film, newspapers and education in recent decades. Standard Zulu tends to avoid the use of loanwords, while the variety spoken in cities includes them in its lexicon, including many loanwords from English.

As with many African languages, Zulu was not written until the arrival of European missionaries, hence the use of a Latin-based alphabet. If you're looking for a challenging language to learn, you might be interested in Zulu. It's a tonal language whose syllables can have one of three tones: high, low or falling. It also contains some click consonants, a feature typical of languages in the region. These consonants have three different types of articulation: dental, alveolar and lateral.

Friday, December 13, 2013

How Crowdsourcing Is Like The X Factor


This crowd at a cricket match are probably as good at
choosing a national pop icon as they are at translating.


Over the last few weeks we have ashamedly become somewhat addicted to the X Factor in the UK. With the final this weekend, we will certainly be watching and hating ourselves for it. It did get us thinking though, that the X Factor is exactly like crowdsourced translation.


First of all, if you didn't know, the X Factor goes through auditions followed by a series of rounds in which one contestant is eliminated each week. The two contestants with the lowest number of votes, as voted by the public, compete in a sing-off whereby the panel of judges select the better of the two to save, at least until the following week.

Although the finalists are selected by the judges from nationwide auditions, there is no guarantee that you'll like all of them. You can only hope that the public will vote to eliminate the worst contestants first.

With crowdsourced translation, the translation solutions are provided by a large community of people, just like the public voting for contestants on the X Factor. Crowdsourcing often comes under criticism for providing a lower quality in translation. One explanation for this is that often there is no prior screening of the translators' skills, just as those who vote on the X Factor are not necessarily music experts, talent scouts, or agents. They may know what they like, but that doesn't mean they know what qualities constitute a pop star that will be famous for years to come.

However, some crowdsourced translation efforts are edited or harmonised by a professional, much like the judges that are entrusted to "save" the best act of the two with the lowest number of votes on the X Factor. This cannot always work though, if for instance the public vote has put two acts who deserve to stay in the competition into the bottom two, or if perhaps there are mistranslations that have permeated through the crowdsourcing systems and made their way to the editor.

This comparison provides a good argument against crowdsourcing. If you consider the previous X Factor winners, Steve Brookstein, Shayne Ward, Leona Lewis, Leon Jackson, Alexandra Burke, Joe McElderry, Matt Cardle, Little Mix, and James Arthur, how many have had sustained musical careers, or even success amongst their peers through awards?

The UK isn't the only country to have the X Factor.
In the same way that not everyone is a translator, not everyone is good at choosing the best musicians or singers. Of course, the business model of X Factor is not entirely geared towards producing stars, but rather generating earnings through sponsorship, marketing, TV ratings, and the huge buzz that surrounds the show in the build up to the final. Of course, if crowdsourced translations provided the same buzz and entertainment as a national spectacle culminating in one solitary winner, then perhaps it would be worthwhile having it.

We're definitely not saying that crowdsourced translation is a lost cause that produces terrible results. Much like the X Factor, the proof's in the pudding. Whereas the X Factor has only had 9 "winners" and the quality of those 9 acts is debatable, crowdsourced translation has produced a huge amount of work and from the criticisms levelled at it, the quality is certainly significantly behind that of work produced by an individual professional translator or a harmonised team of translators.

Wednesday, December 11, 2013

Language Profile: Shona

In this week's language profile, we're taking a brief look at Shona, the first Bantu language we've covered. Also known as Chishona, it is the native language of the Shona people of Zimbabwe and Zambia. It is also the most widely spoken Bantu language in the world in terms of native speakers, closely followed by Zulu. It is spoken in Zimbabwe, Zambia, Botswana and Mozambique.

The Zambezi River in Zimbabwe.
In Zimbabwe, Shona is one of 16 official languages. While English is used in the education and judicial systems, it is not the native language of most Zimbabweans. Shona is spoken by approximately 70% of the population of Zimbabwe, while Ndebele (aka Sindebele), another indigenous Bantu language, is spoken by about 20% of the population. Media such as radio and television are available in all three of these languages.

The name Shona is used to refer to a standardized language based on the language's several dialects: Karanga, Korekore, Zezuru, Ndau, and Manyika. Recently, at least one linguist has suggested that each dialect is in fact a distinct language from Shona. In any case, the many varieties of Shona constitute a dialect continuum that spans across a large geographical area.

The Shona alphabet is written in a Latin-based script which includes additional letters such as dzv, svw, tsv, and zh. If you're looking for a language to learn, Shona may be for you, since it contains features such as all syllables ending in a vowel, all verbs ending in -a, and all five vowels (a, e, i, o, u) being pronounced the same as in Spanish. The main phonological difficulty would be the language's whistled sibilants, which are common in languages of Southern and Eastern Africa, but foreign to many other areas of the world.

Monday, December 9, 2013

Intro to Translation Studies: André Lefevere and Cultural Theories

As part of our series on Translation Studies (TS), we've looked at the foundations of the field, how it was subject to three important turns (linguistic, cultural, and sociological), and in our last post in the series we looked at the concept of dynamic equivalence, as popularised by Eugene Nida.

Today, we'll be delving into the second of the turns in TS, the cultural turn. As you will have seen in our previous post, dynamic equivalence was one of the first widely-accepted theories in TS to put the cultural aspect of translation to the forefront.

The city of Ghent, where Lefevere first studied his
undergraduate degree.
Following dynamic equivalence, the focal point of TS began to change, whereby culture took centre stage. Translation was no longer a transfer between texts, but a transfer between cultures. Perhaps the first noteworthy theorist to stake their claim as a cultural translation theorist was Itamar Even-Zohar with his polysystem theory. However, polysystem theory was focused solely within literature as a system of systems.

Belgian theorist André Lefevere built upon Even-Zohar's work viewing translation as far beyond linguistic transfer and seeing translation as not being trapped in texts, but as a means of adapting and retelling the source text (ST) or source medium.

Lefevere was also one of the first to take the focus of translation away from the source and put greater importance on the target as a product of the ideology, economics and status within the target culture. He was not the first theorist to involve culture in the paradigm of TS, but he was one of the first to view culture seriously as an aspect of the translation act.

Though some modern-day scholars have argued for the validity of both linguistic and cultural theories, scholars of the cultural turn were very quick to oppose the linguistic turn and dismiss its theories. The quality of a translation could not simply be judged on how well equivalence is met between the source text (ST) and the target text (TT). During the turn, translation came to be considered an act that takes place at a particular time in history, within a particular culture.

Friday, December 6, 2013

Language Profile: Balearic

After taking an in-depth look at Catalan and Valencian earlier in the week, we're going to conclude our week of language profiles with a brief look at Balearic.

The coast of Mallorca.
Balearic is a dialect of Catalan spoken in the Balearic Islands, which are located in the Mediterranean Sea off the coast of Spain. The name was created by linguists to describe the group of Catalan varieties spoken in the Balearic Islands. Speakers generally refer to these varieties by their local names: mallorquí on the island of Mallorca, eivissenc in Ibiza, and menorquí in Menorca.

In the Balearic Islands, the standard form of Catalan regulated by the Institut d'Estudis Catalans is used. However, it is adapted to the Balearic dialect by linguists at the Universitat de les Illes Balears to create a standard Balearic dialect. For example, the Balearic standard doesn't use endings in the first-person singular of the present tense, so "I sleep" becomes jo dorm and "I fear" becomes jo tem, as opposed to standard Catalan which uses an -o ending.

The classification of Balearic as a dialect of Catalan is somewhat controversial, but seems to cause less conflict than the disputes regarding the classification of Valencian in relation to Catalan. As usual, the occasional politician in the Balearic Islands claims that Balearic is a separate language from Catalan, but generally without any actual linguistic basis.

The Balearic dialect contains several words that differ from those used in standard Catalan or Valencian. For example, the word moix is used for "cat" instead of gat, while ca is used for "dog" instead of gos. Balearic also contains some English loanwords that remain from the period of British occupation, which include grevi ("gravy"), xoc ("chalk"), and xumaquer ("shoemaker").

To conclude our look at these three language varieties spoken in Spain, we'd like to point out that some linguists suggest using the name català-valencià-balear (Catalan-Valencian-Balearic) to refer to them. It is a bit long, but it certainly would settle some long-standing linguistic disputes!

Catalan | Valencian | Balearic

Wednesday, December 4, 2013

Language Profile: Valencian

Today we're continuing our look at the closely related language varieties known as Catalan, Valencian and Balearic. On Monday we looked at the Catalan language, and today we'll be focusing on Valencian.

Valencian is an official regional language of Spain in the area known as the Valencian Community. While it has high levels of prestige among its speakers, its use is not as widespread as that of Catalan within Catalonia. In large cities such as Valencia and Alicante that have experienced a recent influx of immigrants, Spanish tends to be the dominant language. However, in certain areas of the Valencian Community, Valencian is the preferred language and is used in all social situations.

Since Spain recognizes both Valencian and Catalan as official regional languages, government texts are translated into both languages, though the resulting documents are often identical. When it comes to defining Valencian, the linguistic consensus seems to show that it is the name for Catalan in the Valencian Community. It is also important to note that while it is another name for the same language, there is no consensus that Valencian should be seen as a dialect of Catalan, as there isn't sufficient historical evidence to show that Catalan preceded Valencian.

A prawn sculpture in Vinaròs, a primarily
Valencian-speaking town.
The status of Valencian in relation to Catalan has been controversial for several decades, for both cultural and political reasons. Valencians have a strong sense of cultural identity with their own literary history, cultural traditions, and linguistic characteristics, which they consider different from the Catalan identity. Some politicians claim that Valencian is a completely distinct language, generally because they believe that some Catalan nationalists want to eliminate the Valencian identity by creating a political union between Catalonia and the other Catalan-speaking areas of Spain.

There are several regional dialects of Valencian, which are mainly distinguished by phonological differences. The Southern Valencian dialect spoken in the area between the cities of Valencia and Alicante is generally considered to be the standard variety. Standard Valencian is regulated by the Acadèmia Valenciana de la Llengua (AVL), which was created in 1998. Its official position is that the language spoken in the Valencian Community is the same language spoken in Catalonia and the Balearic Islands. Some linguists from the AVL believe that Valencian should be used as the name for the language as a whole instead of Catalan, which seems to us like a great way to stir up even more linguistic controversy.

There are many differences between Valencian and Catalan despite their high levels of mutual intelligibility. Besides differences in the pronunciation of vowels, there are also differences in verb conjugations. Spelling differences include the use of huit instead of Catalan's vuit for the word "eight", and meua instead of the Catalan meva for the word "my". Valencian also contains several lexical differences, including the use of roig for "red" instead of vermell and xiquet for "boy" instead of the Catalan term nen. Valencian lexical differences are often due to influence from the Spanish language, such as the use of per favor for "please" instead of the preferred Catalan term si us plau or sisplau, which more closely resembles French.

We'll be back on Friday with a brief look at Balearic, yet another language variety related to Catalan and Valencian.

Catalan | Valencian | Balearic

Monday, December 2, 2013

Language Profile: Catalan

This week we're going to be looking at three closely related language varieties spoken in Spain and what makes each of them unique. Catalan, Valencian, and Balearic all belong to the Romance language group that also includes the neighboring languages of Spanish and French.

According to most linguists, Catalan is a Romance language. Valencian and Balearic, on the other hand, are generally considered to be the names for the regional varieties of Catalan spoken in their respective regions, the Valencian Community and the Balearic Islands. However, there are many people in Spain who disagree with these linguistic assessments, which we will see later this week when we look at Valencian and Balearic.

Catalan is an official regional language of Spain in the autonomous community of Catalonia as well as the Balearic Islands, an archipelago in the Mediterranean Sea. It is also the sole official language of Andorra, a tiny principality situated in the midst of the Pyrenees mountain range between Spain and France. The language is also spoken in an area of southern France known as Northern Catalonia.

A chart showing how
gender and number
inflection work in Catalan.
Catalan evolved from Latin around the 9th century. Since then it has suffered through some difficult times, the most recent being when its use was banned by Franco's dictatorship between 1939 and 1975. In recent years, however, the language has become quite prestigious in the areas where it is spoken. It is used as a language of education, and it is mandatory that it be taught in all schools in Catalonia. Most forms of mass media are also available in Catalan, including television, radio, and literature.

Many foreign visitors to Catalan-speaking areas such as Barcelona believe that Catalan is just a "funny dialect of Spanish", but this is not true at all. In fact, Catalan has more in common linguistically with French, Italian, and Occitan than with Spanish or Portuguese.

Standard Catalan is regulated by the Institut d'Estudis Catalans, or IEC. It is written using the Latin alphabet, and uses some interesting digraphs such as ig which is pronounced /tʃ/ at the end of a word, and ix, pronounced /ʃ/.

On Wednesday we'll be looking at the variety of the language spoken in the region south of Catalonia, the Valencian Community. Check back in a couple days to learn more about Valencian.

Catalan | Valencian | Balearic

Friday, November 29, 2013

SYSTRAN: A Brief History of Machine Translation

When we last looked at the history of machine translation (MT), we covered the ALPAC report and prior to that, the Georgetown-IBM experiment. Today we're looking at SYSTRAN, one of the oldest technologies in MT.

SYSTRAN traces its origins back to the Georgetown-IBM experiment, and in 1968, the company was founded by Dr. Peter Toma. Despite the lack of funding available to MT research following the ALPAC report, SYSTRAN survived and would work closely with the US Department of Defense.

In 1969, SYSTRAN was contracted by the US Air Force (USAF) in order to provide MT for them. During the Cold War, as per usual, US military branches were very interested in what the Russians were up to. Translations were from Russian to English and covered various domains, while the USAF was particularly interested in scientific and technical documents.

If you have used MT before, you will know that the quality tends to lag far behind that of human translators. The same could be said for the translations provided by SYSTRAN during the Cold War. Despite their poor quality, the translations were generally understood by those using them.

A barbel fish, not to be confused with BabelFish.
SYSTRAN was contracted to work for the Commission of the European Communities (CEC) in 1975. Work began on a new system in 1976 operating from English to French. The system for French to English arrived the following year, and a third language combination was provided in 1979.

By 1981, the CEC was using SYSTRAN on an experimental basis for English-French, French-English, and English-Italian. At the time, French translators did not show the same zeal towards the systems as those translating between English and Italian. In 1982, 293 pages were translated from English to Italian with the assistance of SYSTRAN and 330 pages were translated from French to English. That said, these numbers equated to 50% of the Italian workload and only 25% of the French workload.
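As a rough back-of-the-envelope check, the page counts and percentages above imply the size of each workload. The short Python sketch below simply divides the pages by the stated share. It assumes the percentages refer to the total pages handled for each language pair that year, which the figures themselves do not spell out, and the function name is our own.

```python
# Figures quoted above for 1982: pages translated with SYSTRAN's help,
# and the share of the workload those pages are said to represent.
figures = {
    "English to Italian": (293, 0.50),
    "French to English": (330, 0.25),
}


def implied_total(pages: int, share: float) -> float:
    """Total workload implied if `pages` make up `share` of it (our assumption)."""
    return pages / share


for pair, (pages, share) in figures.items():
    print(f"{pair}: about {implied_total(pages, share):.0f} pages in total")
```

Under that assumption, the Italian workload would have been roughly 586 pages and the French workload roughly 1,320 pages, which is why a larger page count still amounts to a smaller share.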

SYSTRAN had also provided services for Xerox as of 1978 and had been shown to increase productivity, though in-house translators still expected a higher degree of quality than that of the MT provided. English was translated into six target languages for Xerox, and SYSTRAN reported that they were satisfied with the results.

Xerox staff were encouraged by SYSTRAN to change the way they worked in order to maximise the efficiency of their products, whereas the CEC did not report as much productivity as Xerox. The USAF was also still using SYSTRAN and incorporating the newer language pairings as they became available.

In 1995, SYSTRAN released SYSTRAN PRO on Windows, and by 1997, search engine AltaVista's BabelFish, powered by SYSTRAN, was providing real-time translations on the internet. For many years SYSTRAN provided rule-based MT and helped power Google's language tools until 2007 and the translation widget in Mac OS X, among other things.

SYSTRAN also provided MT combining rule-based translation and statistical machine translation in 2010, one of the first products on the marketplace to do so. Though SYSTRAN is still a distance from the levels attained by human translators, the research conducted throughout the decades could be argued to have helped MT to survive until now.

Wednesday, November 27, 2013

Get It Right: Peek, Peak, And Pique

It has been a few months since we last corrected a common grammatical mistake in the English language, so it's about time we continue our quest to rid the world of these unfortunate errors. Today we'll be explaining the difference between peek, peak, and pique.

Peek

The word peek refers to looking at something quickly or in a secretive manner. It's generally used as a verb, as in "the naughty girl peeked at her Christmas presents".

Valley of the Ten Peaks, Banff National Park, Canada
Peak

Peak has a completely different meaning, which refers to a high point. As a noun, a peak can mean "the top of a mountain", or indeed the high point of any number of other things. In its verb form, it means you've reached the highest state of something, such as "the gymnast peaked at age 16". 

Pique

Finally, we have pique, a term of French origin. It's a verb that refers to eliciting a specific response, generally used alongside the terms "interest" or "curiosity", as in "the strange music piqued my curiosity".

Piqué

Gerard Piqué is a Spanish footballer who plays for FC Barcelona and the Spanish national team. He's best known for his good looks, having a baby with Shakira, and his football talent, in that order.

Are there other common grammar or spelling mistakes that drive you crazy? Let us know in the comments and we may correct them in the future.

Monday, November 25, 2013

Language Profile: Hungarian

This week we're taking a look at Hungarian, the most widely spoken non-Indo-European language in Europe. It is the sole official language of Hungary, where it is the native language of over 99% of the population. The language also has many speakers in Romania, Slovakia, Serbia, and Ukraine.

Hungarian is a member of the Uralic language family, which includes Estonian and Finnish. The Hungarian name for the language is Magyar, which is also used in English in reference to the Hungarian ethnic group.

The Hungarian Parliament Building in Budapest.
In terms of its lexicon, Hungarian shares many of its word roots with other Uralic languages. It also contains some Latin, Greek, and English loanwords. Another interesting feature of Hungarian is that it is an agglutinative language, which means that words are formed by joining morphemes together, including many affixes.

The first traces of written Hungarian appeared around the 10th century. For a while, Hungarian used its own writing system, known as Old Hungarian script. The script was a runic alphabet that was written on wooden sticks.

In modern times, the language is written using an extended Latin alphabet. The Hungarian alphabet contains 44 letters, including additional letters such as cs, dz, ly, ny, and sz.

The Hungarian language also employs four different levels of politeness. They vary in levels of formality, with one often used for official texts and business situations, and another used in informal situations as well as religious contexts. However, in recent years this last form has been used in more and more situations, with other levels of politeness losing favor. 

Friday, November 22, 2013

The ALPAC Report: The Failings of Machine Translation

One of the organisations interested in
the potential of machine translation.
Not long ago, we had a look at the birth of machine translation (MT) with the Georgetown-IBM experiment. Following the experiment, optimism was at an all-time high for MT, and the problem was expected to be solved promptly. Today we're looking at the next important milestone in early MT, the ALPAC Report. Unfortunately, our tale includes a lot of government bodies and research groups, so expect a lot of acronyms.

In the US, the Department of Defense, the National Science Foundation, and the Central Intelligence Agency (CIA) were very interested in the prospect of automatically processing languages and MT. In the case of the Department of Defense and the CIA, this was mainly because the US was extremely curious about and sceptical of the Russians and wanted to know what they were up to. By 1964 they had promoted and funded work in the field for almost a decade, and together the three organisations founded the Joint Automatic Language Processing Group (JALPG).

In 1964, JALPG set up the Automatic Language Processing Advisory Committee (ALPAC) in order to assess the progress of research. ALPAC was, in essence, founded by the US Government to ensure that funds were being spent wisely.

John R. Pierce, head of ALPAC.
The group was headed by chairman John R. Pierce, an employee of Bell Labs, who was assisted by various researchers into MT, linguists, a psychologist and an artificial intelligence researcher. They worked together in order to produce the 1966 ALPAC report, which was published in November of that year.

Titled "Languages and machines: computers in translation and linguistics", the report would appear to have a focus not only on MT, but also on computational linguistics as a whole. However, the report viewed MT very narrowly, from the perspective of its applications in terms of the US government and military, and how they could use the technology exclusively with the Russian language.

The report showed that since most scientific publications were in English, it would actually be quicker and therefore more cost-effective to learn and read Russian than to pay for translations into English. It also noted that there was an abundance of translators and that the supply of translators outweighed the demand for them, meaning that there was even less demand for research into MT to replace human translators.

While the report evaluated the translation industry in general, it also covered research into MT. It condemned the work done in Georgetown, as there was little evidence to support quality translations from the same place that had spawned the idea that the MT issue was close to being solved.

In fact, Georgetown's MT project had produced no translations of scientific texts, nor had it any immediate plans to do so. The report had defined MT as a process that required no human interaction, and the fact that Georgetown's work still required human post-editing led ALPAC to deem it a failure.

One of the criticisms of the unedited output of the MT was that though it could be deciphered by a human reader, it was sometimes inaccurate or completely wrong. The report also criticised the work of Georgetown when compared with the 1954 experiment, stating that the output from 10 years earlier was not only better, but that the programme had shown little progress since that time.

Though the input for the original experiment was extremely limited and the systems tested by ALPAC were experimental, this did not lead to ALPAC cutting Georgetown any slack. ALPAC did, however, state that MT was not an issue with a foreseeable resolution as the Georgetown-IBM experiment had certainly suggested.

Though ALPAC hardly praised MT, it did appear to approve of the ideas of "machine-aided translation", which effectively refers to translation tools, which are fairly commonplace in today's translation industry. The report assessed that MT had advanced the field of linguistics more than it had the field of computing, and that MT was not deserving of more funding. Before it could receive more funding, certain criteria would have to be met.

In conclusion, ALPAC suggested the following:
  1. practical methods for evaluation of translations; 
  2. means for speeding up the human translation process;
  3. evaluation of quality and cost of various sources of translations;
  4. investigation of the utilization of translations, to guard against production of translations that are never read;
  5. study of delays in the over-all translation process, and means for eliminating them, both in journals and in individual items;
  6. evaluation of the relative speed and cost of various sorts of machine-aided translation;
  7. adaptation of existing mechanized editing and production processes in translation;
  8. the over-all translation process; and
  9. production of adequate reference works for the translator, including the adaptation of glossaries that now exist primarily for automatic dictionary look-up in machine translation
It would be fair to say that given the aim of the report, ALPAC achieved its objective of assessing MT. The downside to the report is that research into MT was effectively suspended for two decades, since all significant government funding was cut.

Perhaps we are a little bitter that the ALPAC report was so damning of the work of MT merely because we can still see failings in modern-day MT, such as our "favourite" Google Translate. However, it would be fascinating to see what MT could have achieved had it been funded with as much fervour during the 60s, 70s, and 80s as it had been in the mid-to-late 50s.

Do you feel we would be better off had MT research continued? Or do you think "Machine-Aided Translation" was the correct avenue to pursue? Tell us your thoughts in the comments below. If you wish to read the 1966 ALPAC report, a full copy can be found here.

Wednesday, November 20, 2013

Intro to Translation Studies: Equivalence

When we first introduced Translation Studies (TS), we discussed the emergence of the linguistic turn, whereby TS drew most of its fledgling knowledge from its sister discipline, linguistics. When first considering a translation, there is a question that every translator asks themselves. Does the target text (TT) accurately reflect what is written in the source text (ST)?

Obviously, for every good translator the answer to this question should be "yes". However, it surely can't be that simple, can it? Unfortunately, the answer seems to be "no".

As we covered in our introduction to the series, prior to the 1950s there was not a significant call for TS as a discipline. However, by the 1950s there were studies being conducted and even classes being taught in comparative linguistics, whereby established academics and students alike were formalising the field of TS.

The concept that we will be discussing today, equivalence, put simply is finding an equal value (hence equi and valence) between the ST and the TT. However, you will soon see, as with many things in TS, that it's not that simple.

The idea of natural equivalence proposed that languages have pre-existing equivalents before translation takes place. To oversimplify, if, prior to contact with each other, the French and the English had made bread, surely there would be a word for "bread" in both English and French that shares a natural equivalence. Any time the word bread is used in English, it could surely be translated as pain in French, and vice-versa.

A Korean road sign. Would anyone care to
tell us how it translates literally?
Of course, as anyone with any practical experience in translation knows, this is rather idealistic. Unsurprisingly, natural equivalence is fairly prescriptive and of little use to practising translators.

The other, and perhaps more useful, side of the coin is directional equivalence, whereby the translation does not pre-exist the act of translation. The French TS scholars Vinay and Darbelnet humorously discussed this in an anecdote concerning road signs on Canada's highways, particularly in Quebec, where the signs are in both English and French.

On one particular road sign, the word slow in English is represented as lentement in French, a translation of the English. This is seen to be peculiar as in France the word ralentir is used, as chosen by the French to convey the message, rather than being a translation from English. The relationship between slow and ralentir in road signs is seen as natural equivalence, whereas the relationship between slow and lentement is seen as directional equivalence.

In the 1950s, American Eugene Nida looked at TS through a linguistic lens. His earliest works were based on structural linguistics, a field of linguistics stemming from Ferdinand de Saussure's seminal work. While Nida's work in linguistics may be of interest to the linguistic field, it was his work in TS in the 1960s that we will be paying closer attention to today.

Nida's most important work on equivalence could be argued to have helped bridge the linguistic turn to the cultural turn that we will soon be covering. Nida's work concerned directional equivalence, which was subdivided into formal and dynamic equivalence.

In formal equivalence, the structure (such as syntax and grammar) is strictly adhered to, which in some cases can create unnatural-sounding and unwieldy expressions in the target language. Dynamic equivalence, however, renders the TT in more idiomatic and natural ways in the target language, while maintaining the meaning of the ST. However, these are not a "one-size-fits-all" solution and have been criticised as oversimplifying issues in translation.

Nida's theories on equivalence, though sometimes criticised, are important to TS as they were some of the first to consider culture as the main focus in the translation action, and they were pioneering for their time.

Monday, November 18, 2013

Language Profile: Greek

After over a year's worth of language profiles, we've finally reached the world's oldest recorded living language, Greek. It is the official language of Greece, as well as an official language of Cyprus alongside Turkish. It also comprises its own independent branch of the Indo-European language family.

Records of the Greek language stretch back for millennia, with the earliest written record of the language found so far being a clay tablet that dates back to between 1450 and 1350 BC. It has been an essential part of the development of European as well as Western history with its use in everything from the philosophical texts of Plato and Aristotle to epic poems such as the Odyssey. Greek was also a lingua franca of the Mediterranean for a time, as well as the official language of the Byzantine Empire.

The Parthenon in Athens, Greece
The Greek language is an important part of the study of Classics, a branch of humanities that focuses on the civilizations of Ancient Greece and Ancient Rome. Its roots are often used as a starting point for the creation of new words in other languages such as English, especially for terminology related to the sciences, philosophy, athletics, theatre, and religion. Greek is also one of the main sources of international scientific vocabulary.

Greek also has a large lexicon, primarily derived from Ancient Greek, but with some loanwords from Latin, Venetian, and Turkish. Modern loanwords, however, are more likely to have come from French or English. 

Unsurprisingly, Greek is written using the Greek alphabet, a modified version of the Phoenician alphabet. A few diacritics used to be included in the past, but for the most part they have been removed from the language since the Greek writing reform of 1982.

Friday, November 15, 2013

Four Ways You Can Become a Better Language Tutor by Ron G

It’s tough becoming a good language tutor.

I've had experience tutoring all kinds of people, ranging from students in intensive language programs, to translators needing help passing certification exams, to college students needing to learn how to write better term papers and essays.

Early in my tutoring career, I was frustrated because I could tell I wasn't helping people as well as I wanted to. With practice and effort, however, I improved.

Based on what I learned as a language tutor, here are four ways to become a better tutor and get the most out of your students.

1. Be Kind

Most people get into tutoring because they like to help people. Yet some people think that to really help their students, they have to be hard on them.

That might not be the best idea.

Tutoring is so effective because you’re dealing with a person in a one-to-one setting and can therefore focus more of your attention. That same kind of intimacy, though, amplifies any criticism or negative comments. Being too harsh, even if your intentions are good, can easily cause a student to become discouraged and clam up.

While you’re correcting students, use quite a bit of tact. Err on the side of kindness. It’s definitely better to be too nice than to be too mean.

2. Tailor your instruction to the student

While studying Spanish, I decided to hire a tutor to help me with my conversation skills. My tutor was a native Spanish speaker who was very intelligent and understood Spanish really well. Unfortunately, during our first session together, he didn’t individualize his instruction for me at all. He simply read from Chapter One of a Spanish textbook.

I got very little out of that session. Had he spent even a couple minutes assessing my needs, he would’ve understood that I needed conversation practice and not a grammar lecture. He would’ve also understood that I was at an intermediate level at the time, well past Chapter One of a beginner’s textbook.

Treat each of your students as an individual. Figure out what they’re strong at and where they need specific improvement, and then come up with an appropriate plan of instruction.

3. Keep your student’s goals in mind

What is your student really trying to accomplish? Be specific when answering this question. Is she trying to:
  • Become conversationally fluent?
  • Pass a test, class, or certification exam?
  • Improve a specific skill, like vocabulary?
  • Prepare for international travel?
Each goal requires its own plan of attack. You probably wouldn't teach a student intricate grammar points if her goal is simply to speak better in social settings. You probably would do that, however, if she were preparing for a CEFR exam.

If your student doesn't have a specific goal in mind, help her identify one. Figure out exactly what she wants to achieve, when she wants to achieve it, and how you’ll measure whether she’s successful.


4. Pick your battles

You can’t nitpick a student to death, especially during conversation.

If you use written assignments, administer quizzes, or perform targeted drills, it’s fine to mark a student’s answers as correct or incorrect. Most people are comfortable with being “graded” if the rules and criteria are well-established and the correction is limited to the context of the activity.

But if you’re helping a student practice conversation, you have to let the student talk.

Language learners speaking their new language have to pronounce words correctly, use proper grammar, and remember vocabulary words accurately. With so much to keep track of and so many ways to slip up, they’re going to make a lot of mistakes.

Try not to correct every single error during speaking. Nitpicking backfires for several reasons. It will slow down the session, and it will frustrate, discourage, and overwhelm the student.

Instead, pick your battles. During conversation practice, try not to interrupt or correct the student unless you:
  • Cannot understand what the student is trying to say.
  • Notice recurring errors.
  • Spot an error that would cause the student social embarrassment.
Wrapping Up

Tutoring well requires quite a bit of finesse, which comes with practice and time. In the meantime, if you try to keep in mind the four tips above, you should see immediate improvements in your ability to connect with your students and help them achieve their language goals.

Ron G. is a technical writer and translator from Orlando, FL. He writes about language learning at www.languagesurfer.com.