Last week we announced the launch of KantanSnippet™, a part of the KantanWidgets™ Suite of Productivity Apps. KantanWidgets allows KantanMT clients to integrate KantanMT technology within their own environments, including websites, Microsoft Office programs and supported browsers (Google Chrome, Safari and Firefox).
Like the rest of the world, we have joined the Pokémon Go craze, with many of us here at KantanMT searching for Pokémon characters during lunch or after a day in the office. Of course, it goes without saying that we have our own WhatsApp group, aptly named ‘Poke’, to share our progress throughout the game, with each of us playing in a different language.
The globalised make-up of the car industry means automated translation is an important tool for those working in it. KantanMT has helped clients use Machine Translation to efficiently translate technical documentation, motor part catalogues and how-to manuals, whilst automotive websites such as ChromeData use KantanMT to translate content so that they can provide detailed vehicle information and specifications to thousands of websites and dealerships around the globe.
The automotive industry has always been one of change. That change is leading to fundamental shifts in car technology and in how users interact with their cars. In 2016, a typical car coming off the production line will contain 100 million lines of code. 20 million of those lines of code are required just to run a standard navigation and infotainment system. This increasing complexity inevitably leads to an increasing level of customisation.
Changing Automotive Industry
As technology continues to advance, car manufacturers are increasingly looking at it as an area of differentiation. As manufacturers explore ways of delivering superior performance, software that can be updated regularly, similar to that of a mobile phone, will enter mainstream usage in our cars. Technology-centric car companies such as Tesla are already utilising such conveniences and it is inevitable that more will follow.
Master’s student Rafaella Athanasiadi of University College London submitted her thesis as part of the MSc degree in Scientific, Technical and Medical Translation with Translation Technology. Rafaella was supervised by Teaching Fellow and Lecturer Dr. Emmanouela Patiniotaki, and she used KantanMT.com for her research. This guest blog post looks at some of her conclusions on Machine Translation and the Localization Industry.
As Hutchins & Somers (c1992:1) argue, “the mechanization of translation has been one of humanity’s oldest dreams.” During the 20th century, the translation process changed radically. Instead of spending endless hours in libraries to find the translation of a word, the translator now sits at the centre of dozens of assistive tools. To name just a few, translators today can choose from many translation software packages, terminology extraction tools, project management components and machine translation systems while translating.
However, shifting the focus to audiovisual translation, it can be observed that few radical changes took place in that area, at least not until machine translation systems were introduced in various projects (such as the MUSA and SUMAT projects) that developed machine translation engines to optimise the subtitling process. Still, the results of such projects do not seem satisfactory enough to give subtitling software developers and subtitlers the confidence to implement these engines in the subtitling process.
Based on my personal research, which focused primarily on the European setting, it seems that in the subtitling industry only the freeware SRT Translator currently incorporates machine translation while also offering the features that subtitling software usually includes (i.e. uploading multimedia files and timecoding subtitles). Nonetheless, SRT Translator, which is not well known among subtitlers, uses solely Google Translate by default, which is a general-domain machine translation engine and, one could argue, not suitable for the purposes of audiovisual translation. The quality of Google Translate’s output was tested by translating 35 subtitles of a comedy series. The output was incomprehensible and misleading in many cases.
Even though no further records of traditional subtitling software that incorporates machine translation could be found, there are many online translation platforms that allow users to upload and translate subtitles. Taking into consideration the European market, these can be translation software like MemoQ, SDL Trados Studio and Wordfast that offer the ability to load subtitle files and in some cases link them to the audiovisual content they are connected to, open source tools for translators like Google Translator Toolkit (GTT), or professional and private platforms like Transifex and XTM International that are used by companies and offered to their dedicated network of translators. Nonetheless, in order to enable machine translation in all the above applications, API keys must be purchased. GTT is an exception since it can be used for free anytime and only requires a Gmail account.
The fact that subscription fees have to be paid along with the costs of API keys for each machine translation engine provider puts the usability of these platforms in question, since costs may outweigh subtitlers’ profits. Furthermore, these platforms cannot accommodate subtitlers’ needs; for instance, it is not always possible to upload and play multimedia files while translating the subtitles, nor are any synchronization features offered for timecoding the subtitles to the audio track. Transifex, however, is an exception since this localization platform offers users the option to upload multimedia files in the translation editor while translating the subtitles.
According to Macklovitch (2000:1), a translation memory is considered to be “a particular type of translation support tool that maintains a database of source and target language sentence pairs, and automatically retrieves the translation of those sentences in a new text which occur in the database.” Even though machine translation engines were developed through different projects to reduce subtitling time as much as possible, no attempts to integrate a translation memory tool into subtitling software in order to optimise subtitling could be traced during this research, at least in a European, Asian and Australian setting. As Smith (2013) argues, “traditionally subtitling has fallen outside the scope of translation memory packages, perhaps as it was thought to be too creative a process to benefit from the features such software offers.” However, as Diaz-Cintas (2015:638) discusses, “DVD bonus material, scientific and technical documentaries, edutainment programmes, and corporate videos tend to contain the high level of lexical repetition that makes it worthwhile for translation companies to employ assisted translation and memory tools in the subtitling process.”
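To make Macklovitch’s definition concrete, here is a minimal sketch in Python of a translation memory as a store of source and target sentence pairs with automatic retrieval. It is an illustration only, not how any of the tools named in this post are implemented: the class name `TranslationMemory`, its methods, the sample sentences and the 0.8 similarity threshold are all hypothetical, and the fuzzy matching step (using the standard library’s `difflib`) is an assumption added to show how a near-repetition might still be retrieved.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: a store of source/target sentence pairs."""

    def __init__(self):
        # Maps a normalised source sentence to its stored translation.
        self._pairs = {}

    @staticmethod
    def _normalise(sentence):
        # Lower-case and collapse whitespace so trivial differences still match.
        return " ".join(sentence.lower().split())

    def add(self, source, target):
        """Store one source/target sentence pair."""
        self._pairs[self._normalise(source)] = target

    def lookup(self, sentence, threshold=0.8):
        """Return (target, score) for the best stored match, or None.

        The threshold is a hypothetical cut-off chosen for this sketch.
        """
        query = self._normalise(sentence)
        if query in self._pairs:
            return self._pairs[query], 1.0  # exact match
        best_target, best_score = None, 0.0
        for source, target in self._pairs.items():
            score = SequenceMatcher(None, query, source).ratio()
            if score > best_score:
                best_target, best_score = target, score
        if best_score >= threshold:
            return best_target, best_score  # fuzzy match
        return None


tm = TranslationMemory()
tm.add("Press the red button.", "Appuyez sur le bouton rouge.")

exact = tm.lookup("press the  red button.")   # differs only in case and spacing
fuzzy = tm.lookup("Press the red buttons.")   # near-repetition of a stored sentence
miss = tm.lookup("Where is the station?")     # nothing similar is stored
```

Repetitive material of the kind Diaz-Cintas describes, such as corporate videos or technical documentaries, is where the exact and fuzzy branches would fire most often, while creative dialogue would mostly fall through to the `None` case.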
Even if such tools have not been integrated in subtitling software, translation memory components are used for subtitling purposes in cloud-based platforms such as GTT, Transifex and XTM International, as well as in translation software such as MemoQ, SDL Trados Studio, Wordfast Pro and Transit NXT, by simply creating a translation memory before or while translating. It should be noted that, among the tools discussed in this research, Transit NXT is the only translation software that can accommodate the needs of subtitlers to a high level. Apart from the addition of specialized filters to load subtitles (which also exist in MemoQ, SDL Trados Studio and Wordfast Pro), subtitlers can upload multimedia files, translate subtitles while a translation memory component is active and also synchronise their subtitles with the Transit translation editor (Smith, 2013).
Figure 1: The translation editor of Transit NXT by Smith (2013)
OOONA, a company founded in 2012, has taken a very interesting approach to subtitling by developing a unique cloud-based toolkit that is built exclusively to accommodate the needs of subtitlers. When asked the following question within the context of the MSc thesis,
Considering that other cloud-based translation platforms like GTT, Transifex and XTM International offer the option of uploading a TM or a terminology management component, do you think that it is important to offer it on a subtitling platform as well?
the representative of OOONA (Alex Yoffe) replied that not only will the company implement translation memory and terminology management components in the next phase of enhancing their platform but that they also consider these components to be very important for the subtitling process. In addition, Yoffe (2015) argued that OOONA intends to “add the option of using MT engines. Translators will be able to choose between Microsoft’s, Google’s, or customisable MT engines.” Therefore, it seems that OOONA will become a very powerful tool in the near future, with features that will optimise the subtitling process to the maximum and reshape the way that subtitling has been carried out until now. The fact that Screen Systems, Cavena and EZTitles have partnered with OOONA is an indicator of how much potential there is in this toolkit.
As can be argued based on the above, there is a lack of subtitling software with incorporated translation memory tools. Therefore, this issue was further researched by means of an online questionnaire that was disseminated to subtitling companies and freelance subtitlers. In addition, two companies that develop subtitling software, Screen Subtitling Systems and EZTitles, were asked to present their views on this topic. In both cases, their willingness to optimise the subtitling process in a semi-automated or a fully-automated way was apparent from their answers. The former company was in favour of a combination of machine translation tools with translation memory tools, whereas the latter leaned towards a subtitling system with integrated translation memory and terminology management tools.
Nonetheless, the optimisation of the subtitling process has to coincide with the needs and preferences of subtitlers. Based on the respondents’ answers, it is clear that translation memory tools in subtitling software are desired by subtitlers. In response to the question,
Which tool would you prefer to have in a subtitling software? An integrated translation memory (TM) or machine translation (MT)?
more than half of the respondents (56.8%) chose TM. Interestingly, the answer “Both” received the second highest percentage (20.5%), which indicates that subtitlers want as many assistive tools as possible.
One of the main conclusions drawn from this research was that machine translation engines need to be customised to produce good quality output, and this can be achieved through customisable engines like KantanMT and Milengo. Moreover, translation memory tools are sought by subtitlers in subtitling software, while cloud-based platforms seem to dominate the translation industry today. Following this trend, subtitling software providers are partnering with online services and tools like the OOONA toolkit.
Based on the outcomes of this research, it could be said that we are certainly experiencing a new era in subtitling, since traditional PC-based subtitling software is now transforming into flexible and accessible platforms to enhance the subtitling experience as much as possible. It is only a matter of time before it becomes clear which tool and platform will rule the subtitling industry, but one thing is for sure: the technologies of the future will bring a lot of changes to the traditional way of subtitling.
Diaz-Cintas, J. (2015). Technological Strides in Subtitling. In S. Chan (Ed.), Routledge Encyclopedia of Translation Technology (pp. 632-643). London: Routledge.
Hutchins, J. W. & Somers, H. L. (c1992). An introduction to machine translation. London: Academic Press.
Macklovitch, E. (2000). Two Types of Translation Memory. In Proceedings of the ASLIB Conference on Translating and the Computer (Vol. 22).
In science fiction, translation of the potentially infinite number of languages spoken by alien species presents a dilemma. How to deal with communication between interplanetary species without resorting to contrivance, or spending the first twenty minutes of each episode’s dialogue clumsily showing characters learning one another’s diphthongs?
The notion of a ‘universal translator’ originated in Murray Leinster’s novella First Contact, published in 1945 (and clearly that isn’t the only debt Gene Roddenberry owes to Leinster). It’s a hugely helpful – borderline miraculous, in fact – convention of sci-fi: a technological solution to the language barrier, leaving more time for the actual narrative to unfold in one language, typically English.
With the incredible advancements in technology we’re witnessing at the moment, such as Microsoft’s pilots of a Skype Translator and the industry-leading work KantanMT is achieving in this area, are we seeing the beginnings of live translation – well ahead of Star Trek’s 22nd century deadline? In the meantime, let’s take a look at five of sci-fi’s finest translation machines, which beat anything real-life technology can offer – for now.
1. Star Trek: Universal Translator
An important part of Star Trek’s near-utopian vision of the future is the Universal Translator. Translating any language into another even while a person is speaking, this exceptionally handy tool means Starfleet craft in any quadrant of the galaxy can speak to new life and new civilizations without confusion.
The Universal Translator was voiced by Star Trek creator Roddenberry’s widow Majel Barrett until her death in 2008. In the Trek universe, its development was a portent of Earth’s cultures achieving universal peace. It’s difficult to imagine Google Translate having the same impact.
This convenient concept has often been copied, and occasionally parodied: in Futurama, everyone in the universe speaks English, rendering Professor Farnsworth’s one successful invention – a translation device – useless, as it merely translates English into the dead language, French!
2. The Hitchhiker’s Guide to the Galaxy: the Babel Fish
Some sci-fi plays with the concept in less serious ways. In Douglas Adams’ H2G2, to help Arthur Dent deal in some small way with everything going on around him, a Babel Fish is inserted into his ear. It is memorably described by the Guide as “small, yellow, leechlike and probably the oddest thing in the universe.”
The science (such as it is) behind the Babel Fish is that it can absorb the frequencies of outside speakers, and a translation is secreted by the fish into the hearer’s brain via his or her ear canal. In a witty reversal of Star Trek’s idealistic Federation, Adams reveals that, by allowing everyone to understand one another, the Babel Fish has actually caused more war than anything else in the universe.
3. Farscape: Translator microbes
In science fiction, as in reality, it is the individual idiosyncrasies of languages which are trickiest to master. When people in the UK from a hundred miles apart may speak different languages, not to mention a range of different dialects and accents, can auditory translation really be so smooth?
One series to acknowledge this is Farscape, where astronaut John Crichton is injected with bacteria-sized ‘translator microbes’, which colonise his brain. The microbes work to make their host understand any spoken information in any language – except that idioms are translated literally. This leads to a great deal of confusion for John, and opportunities for humour for the audience (all jokes are language, after all) – and also perhaps renders these microbes a more realistically-limited translator technology.
4. Doctor Who: The TARDIS’ Translation Circuit
As well as being telepathically linked with the Doctor, and granting the ability to travel to any time or place in history and the future, the TARDIS’ telepathic field is used to automatically translate what the Doctor and any companions hear or read into a language which they can understand.
While wonderfully convenient, the mind-meld involved does mean that the translation circuits won’t actually work when the Doctor is unconscious – not an outright impossibility. Also, because translations are time specific, ancient civilizations won’t understand neologisms – and, neatly, the Romans have never heard the word ‘volcano’ – because they’ve not lived to see an eruption.
5. Star Wars: C-3PO
Luke Skywalker is the ultimate sci-fi everyman: he is every bit as much in need of a guide to the universe he finds himself in as the viewing audience are. Reinforcing this are his guides, C-3PO and R2-D2, whom Luke needs with him – despite their obvious drawbacks as travelling companions – because C-3PO is programmed with millions of languages, everything from Ewok to R2’s bleeps and whistles.
When the franchise returns with The Force Awakens later this year (which most fans will rightly consider the fourth, rather than seventh, Star Wars movie), C-3PO’s translation abilities are sure to make him at least partially useful to have around.
The KantanMT team say a big Thank You to Richard for a very savvy post on translation machines in science fiction.