Machine Translation
Machine translation is becoming increasingly popular and widespread as the technology and data underpinning machine-translation tools continue to advance. However, machine translation is rarely used on its own and may not be suitable for every project. On this page, you can read more about what machine translation is, how it works and what kind of machine translation services we offer here at COMUNICA.
What is machine translation?
Machine translation is an automated translation process whereby a software programme translates text from one language into another without any human involvement at the time of translation. It is distinct from computer-assisted translation (or CAT), which is the use of special software to aid a human linguist in the translation process.
Machine translation research took off in the 1950s, most notably with the Georgetown-IBM experiment of 1954, and it has developed considerably since then. Initially, it was mostly used as part of a screening process to identify relevant texts. For example, it was used in the Soviet Union to get a rough sense of scientific papers written in English. Texts of interest were then passed on to human linguists for proper translation.
Nowadays, machine translation tools are advanced enough to translate websites, menus and even more technical texts, allowing the user to understand broadly what the text is about. However, the semantic ambiguity and fluidity of human language mean that, even if these translations prove useful, they often sound unnatural and, without further editing, can require a certain amount of guesswork and intelligent deciphering to be understood.
How does machine translation work?
The answer to that question has changed over time and depends on the type of machine translation system being used. Initially, most machine translation systems were so-called rule-based machine translation (RBMT) tools. These tools required a set of dictionaries for each language and a complex set of rules describing how each language works.
Although RBMT allowed for a high level of control, given that all the rules were written and programmed by linguists and could easily be adapted and debugged, its word-based approach nonetheless meant that it was highly rigid and inflexible. Because human language is a fluid and evolving phenomenon, a great deal of expert manpower remained necessary in order to constantly monitor, update, fix and expand the rules. Customisation based on topic or industry was also sometimes required in order to ensure the right word in one language is paired with the right equivalent in another.
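To make the idea of dictionaries plus hand-written rules more concrete, here is a minimal sketch of a rule-based translator for English into Spanish. The lexicon, the part-of-speech tags and the single reordering rule are illustrative assumptions, not the rule set of any real RBMT system.

```python
# Toy rule-based translation (English -> Spanish).
# LEXICON and the adjective-noun rule are hypothetical examples.
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "is": ("es", "VERB"),
    "fast": ("rápido", "ADJ"),
}


def translate_rbmt(sentence: str) -> str:
    # Step 1: dictionary lookup, word by word. Unknown words are kept as-is.
    tagged = [LEXICON.get(word, (word, "UNK")) for word in sentence.lower().split()]

    # Step 2: reordering rule. English ADJ + NOUN becomes Spanish NOUN + ADJ.
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2
        else:
            i += 1

    return " ".join(word for word, _ in tagged)


print(translate_rbmt("the red car is fast"))  # el coche rojo es rápido
print(translate_rbmt("the blue car"))         # el blue coche
```

The second call hints at the rigidity described above: "blue" is not in the dictionary, so it is neither translated nor reordered until a linguist adds a new entry.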
Gradually, RBMT was replaced by statistical machine translation (SMT). This phrase-based approach is more flexible and context-driven than the word-based RBMT. It uses both bilingual and monolingual corpora to generate statistical models and identify the most probable way to translate certain phrases within the current context. This means that it relies on large quantities of existing human translations in order to produce automated translations.
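As a rough illustration of how the two kinds of corpora work together, the sketch below scores candidate translations with a phrase table (standing in for a model learned from bilingual text) and a bigram language model (standing in for a model learned from monolingual text). All probabilities, names and the scoring formula are simplified assumptions made up for this example; real systems are considerably more elaborate.

```python
import itertools
import math

# Hypothetical phrase table: P(target phrase | source phrase).
PHRASE_TABLE = {
    "the bank": {"el banco": 0.7, "la orilla": 0.3},
    "of the river": {"del río": 1.0},
}

# Hypothetical bigram probabilities from a monolingual Spanish corpus.
BIGRAM_LM = {
    ("el", "banco"): 0.1,
    ("la", "orilla"): 0.1,
    ("banco", "del"): 0.01,
    ("orilla", "del"): 0.2,
    ("del", "río"): 0.3,
}
UNSEEN = 0.001  # crude smoothing value for bigrams not in the table


def translate_smt(source_phrases):
    """Return the candidate with the highest combined log-probability."""
    options = [PHRASE_TABLE[phrase].items() for phrase in source_phrases]
    best, best_score = None, -math.inf
    for combo in itertools.product(*options):
        words = " ".join(target for target, _ in combo).split()
        # Translation model: how likely is each phrase pair?
        score = sum(math.log(prob) for _, prob in combo)
        # Language model: how natural is the resulting word sequence?
        score += sum(
            math.log(BIGRAM_LM.get(bigram, UNSEEN))
            for bigram in zip(words, words[1:])
        )
        if score > best_score:
            best, best_score = " ".join(words), score
    return best


print(translate_smt(["the bank"]))                  # el banco
print(translate_smt(["the bank", "of the river"]))  # la orilla del río
```

On its own, "the bank" is rendered with the more frequent financial sense, but once "of the river" follows, the language model tips the balance towards "la orilla".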
Although this system is generally considered better than RBMT, and is what allowed services such as Google Translate to flourish, it is not without its flaws. It can only translate what has been translated before, and so it is reliant on continuous access to new human translations in order to keep abreast of changes in language and culture. It is also susceptible to the principle of garbage in, garbage out (GIGO), meaning it may replicate certain mistakes if they are made repeatedly by human translators.
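Both weaknesses can be seen in a small sketch that estimates translation probabilities by simple counting. The toy corpus is an assumption invented for this example: it deliberately contains a repeated human error, the false friend "embarazada" (which means "pregnant") for "embarrassed".

```python
from collections import Counter, defaultdict

# Hypothetical corpus of human translations (English -> Spanish).
# Two of the three translators made the same false-friend mistake.
CORPUS = [
    ("embarrassed", "embarazada"),
    ("embarrassed", "embarazada"),
    ("embarrassed", "avergonzada"),
    ("library", "biblioteca"),
]


def build_phrase_table(pairs):
    """Estimate P(target | source) by relative frequency."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return {
        source: {t: n / sum(c.values()) for t, n in c.items()}
        for source, c in counts.items()
    }


def translate(phrase, table):
    if phrase not in table:
        return None  # never translated before, so the system has nothing to offer
    return max(table[phrase], key=table[phrase].get)


table = build_phrase_table(CORPUS)
print(translate("embarrassed", table))  # embarazada (the repeated mistake wins)
print(translate("selfie", table))       # None (not in the corpus)
```

The most frequent option wins regardless of whether it is correct, and a word that no human has translated yet cannot be translated at all.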
Neural machine translation
Neural machine translation is the most recent innovation within machine translation. This state-of-the-art technology uses a large neural network to teach itself how to translate. In contrast to RBMT, where machines follow rules written by human linguists, neural systems use end-to-end learning to produce better-quality results.
Neural machine translation therefore does not require anything close to the same level of supervision and human intervention as RBMT, and it can adapt to new contexts and be trained quickly through automated processes. Neural machine translation tools can consider the broader context of a text, rather than just a few words to either side of the word in focus.
Like statistical machine translation, however, neural machine translation remains reliant on data sets and previously completed translations. According to Microsoft, it can require millions of translated sentences to work. Its advantage is its ability to train itself: the more translations within a specific domain or language it is trained on, the better it becomes.
Our machine translation services
Different machine translation services can be used depending on the type of text being translated and the needs of the client.
Here at COMUNICA, we offer three different levels of machine translation:
- Raw MT
- MT and post-editing
- MT and post-editing and revision
See the table below for a summary of these three services. In the following sections, you can read about the differences between them and to which contexts each one is most suited.
| | Raw MT | MT + post-editing | MT + post-editing + revision |
| --- | --- | --- | --- |
| Content types | | | |
| Quality | ⭐ | ⭐⭐ | ⭐⭐⭐ |
| Cost | 💲 | 💲💲 | 💲💲💲 |
| Turnaround time | 🏁 | 🏁🏁 | 🏁🏁🏁 |
Raw machine translation
Raw machine translation is the most basic machine translation service available. It involves simply feeding a text into a machine-translation tool and performing no further review or editing work. The advantage of this is that it is both fast and inexpensive. However, it also means a greater risk of mistranslations or text that sounds unnatural.
Raw machine translation is therefore generally considered unsuitable for advertisements or outward communications, but may be suitable for internal emails, user-generated texts such as product reviews and survey responses, or for translating large volumes of text where quality is not a priority.
Machine translation + post-editing
Machine translation post-editing, also known as MTPE, is the most common service within machine translation. This involves first feeding a text through an MT tool and then having the output reviewed by a professional linguist. The post-editor can amend sentences that sound unnatural and correct any mistranslations. The end result is therefore a hybrid text created by both human and machine, combining the advantages of both.
Because of the human element involved, quality is typically higher compared with raw machine translation. However, the machine translation is still more closely bound by the syntax and word order of the original and this limits its ability to create truly native-sounding texts. This type of machine translation may therefore still be unsuitable for more natural-sounding or creative texts that aim to really engage the reader or to paint a bright and vivid picture.
Machine translation + post-editing + revision
MTPE with revision is the same as the above but with the addition of a final revision stage. This serves to further boost quality and address any potential errors. The reviewer could also possess a particular area of expertise, such as law or medicine, in order to ensure the terms used are correct within the context of this field. Choosing this option over human translation is the best solution when you want to reduce costs and translate higher volumes without having to compromise too much on quality.
Problems in machine translation
Machine translation has come a long way, but it still has its issues. For example:
- Difficulties understanding context: Words can have different meanings depending on the sector they are used in, and although machines are good at guessing which words to use in a particular context, they can still get this wrong.
- Mechanical view of language: Language is a very human phenomenon and can be layered, clever and witty. Machines can struggle with subtext, irony and figurative speech. Sometimes they translate idioms literally, resulting in rather amusing but often unusable results.
- Wordy and rigid: In order to arrive at an accurate translation, machines tend to use more words than are strictly necessary, in contrast to humans, who naturally cut out unnecessary words. Because machines do not truly comprehend the text, they also stick close to the source syntax, whereas a human translator could take a much more creative approach.
Disadvantages and advantages of machine translation
Machine translation has both advantages and disadvantages. The main pros are lower costs and the ability to translate higher volumes in less time, while the main cons are the risk of error and misunderstanding, as well as the need to involve humans to some extent in order to prevent costly or embarrassing mistakes from slipping through.
Ultimately, whether or not machine translation is right for your project will depend on multiple factors such as the purpose of your text, the languages you are working with and your intended audience.
If you would like to discuss the options in more detail, please get in touch with us here at COMUNICA for a no-obligation chat or quote.