Statistical machine translation supported by artificial intelligence has been an ongoing topic for global enterprises for many years.

The ultimate dream for anybody targeting a multilingual audience is to communicate with people in their own language, regardless of cultural or linguistic background, in real time and, if possible, without having to pay for it.

The foundation of any automatic translation providing comprehensible content to a target audience is to have a strong database of pre-translated source content, ideally from reliable resources which contain professional translations.

The more content there is to compare, the more likely you are to get a translation which can be understood. Nobody expects it to be perfect. Human translations are often not ideal either, and much of translation is subjective, but it has to convey the right meaning in an easily comprehensible manner.

Big companies such as Microsoft, Google and Facebook have the financial means to develop ever more sophisticated translation tools and they have been continuously and heavily investing in linguistic analysis.

Facebook tried Bing’s translation tool first but quickly realized that it could not handle colloquial content very well. It therefore started to develop its own AI-driven translation software, which has been available for a while now. Facebook’s translation function performs well as long as the sentences are not too complex, have few redundancies and avoid difficult colloquialisms or abbreviations as much as possible.

For some of the bigger languages such as German, French, Spanish and Italian, automatic translations from Google, Facebook or Microsoft have seen visible improvements. However, for languages which differ greatly from English, such as Japanese, Chinese, Korean or Arabic, automatic translation still has a long way to go because of their complex linguistic structure and the lack of sufficient data.

Facebook claims that it tackles slang, idioms and metaphors better than other tools since it looks more closely at the context in which the text to be translated is presented. However, spoken language is often tailored to a small group of people and their understanding of a subject area, resulting in modifications to the language which cannot be picked up even by a more thorough context analysis.

Spoken language changes very quickly. What was frequently used and popular a year ago may be expressed with a number of new terms or phrases a year later, and this is especially true of informal conversations.

Arabic is one example of a language where there have not been any impressive results in translating spoken language, especially on social media. Arabic speakers use different dialects which do not have their own dictionaries and often do not abide by any standard alphabet. A Lebanese person may write one word in the Arabic alphabet and the next with a combination of Latin letters and numbers, the numbers representing sounds which do not exist in the Latin alphabet.

It will be interesting to see what Microsoft, Google and Facebook come up with to tackle this problem, given that Arabic has been one of the fastest-growing internet languages of the past few years.

Microsoft also claims to be able to tackle spoken language and successfully transcribe it. We have yet to see a tool which can deal with background noise, unclear accents, unfinished sentences, redundancies and abbreviations, all of which require a person to have knowledge of the environment, subject area and context in order to fully understand them.

I do believe that translation technology will reach a level where it can produce reliable and reasonably accurate translations of most written text, formal more so than slang. These translations may only need a human translator for post-editing to ensure the tone is right and the text flows nicely.

However, when dealing with spoken languages, there are just too many elements which have to be considered in order to produce accurate translations. There is a reason why the linguists focusing on translating spoken language are called “interpreters”, not “translators”. When you deal with spoken language, you have to understand the context, you need to get a feel for the tone of the speaker, their fluency, their type of speech and then “interpret” the content – process source language, analyse, summarize, produce target speech.

Like colloquial language, marketing content has to be hyper-localized and super-tailored to the target audience, since otherwise it may not be understood correctly and may in fact get an unwanted response or none at all. There are risks to using any source text for translation in marketing, since the linguist will focus on the text to be translated instead of the objectives and requirements of the target audience when producing the target-language content. Ideally, brands should define their core values, style guides, tone of voice and general campaign objectives on a global level and then leave it to each market to originate content while abiding by the global brand guidelines.

As such, “marketing translation” isn’t yet a clearly defined subject area. So avid discussion about automatic translation of marketing content shouldn’t be considered a necessary conversation for today’s marketers. For now.

Hannes Ben

Hannes Ben, EVP International at Forward3d.