Do you have what it takes to provide outstanding customer service? Are you excited about Localization and Machine Translation? We have just the job for you. KantanMT is expanding and is looking for a Customer Service Engineer.
The ideal candidate should be enthusiastic about helping our customers who are using the Machine Translation platform, have patience, and should be a self-starter.
Tony O’Dowd, Chief Architect of KantanMT.com, and Louise Faherty, Technical Project Manager, presented a webinar in which they showed how LSPs (as well as enterprises) can improve the translation productivity of the language team, manage post-editing effort estimations and easily schedule projects with powerful MT engines. For this section, we are accompanied by Brian Coyle, Chief Commercial Officer at KantanMT, who joined the team in October 2015 to strengthen KantanMT’s strategic vision.
We have provided a link to the slides used during the webinar below, along with a transcript of the Q&A session.
Please note that the answers below are not recorded verbatim and minor edits have been made to make the text more accessible.
Question: We are a mid-sized LSP and we would like to know what benefits we would enjoy if we chose to work with KantanMT rather than building our own systems from scratch. The latter would be cheaper, wouldn’t it?
Answer (Brian): Tony and Louise have mentioned a lot of features available in KantanMT. Indeed, the platform is very feature-rich and provides a great user experience. But on top of that, what’s really underneath KantanMT is the fact that it has access to massive computing power, which is what Statistical Machine Translation requires in order to perform efficiently and quickly. KantanMT has the unique architecture to help provide instant on-demand access at scale.
As Louise Faherty mentioned, we are currently translating half a billion words per month and we have 760 servers deployed currently. So if you were trying to develop something yourself, it would be hard to reach this level of proficiency in your MT. Whilst no single LSP would probably need this total number of servers, to give you an idea of the cost involved, that kind of server deployment in a self-build environment would cost in the region of €25m.
We also offer 99.99% up time with triple data-centre disaster recovery. It would be very difficult and costly to build this kind of performance yourself. Also, with this kind of performance at your client’s disposal, you can offer Customised MT for mission critical web-based applications such as eCommerce sites.
Finally, a lot of planning, thought, development hours and research have gone into creating what we believe is the best user interface and platform for MT, which also has the best functionality set and extreme ease of integration in the marketplace. So, it would be difficult for you to start on your own and build a system that would be as robust and high quality as KantanMT.com.
Question: Could you also establish KantanNER rules to convert prices on eCommerce websites?
Answer (Louise Faherty): Yes, absolutely! With KantanNER, you can also establish rules, convert prices and so on. The only limitation is that the exchange rate will of course fluctuate. There could be options for calculating that information dynamically; otherwise you would be looking at a fixed equation to convert those prices.
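To illustrate the "fixed equation" case mentioned in the answer, here is a minimal Python sketch. It does not use KantanNER rule syntax, which is not shown in this transcript; the regular expression, the function name `convert_prices` and the rate `EUR_TO_USD` are all assumptions made for illustration only.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical fixed rate. As noted above, real exchange rates fluctuate,
# so a fixed value like this would need to be refreshed regularly.
EUR_TO_USD = Decimal("1.10")

# Matches prices such as "€20", "€20.00" or "€ 5.50" (illustrative pattern only).
PRICE_PATTERN = re.compile(r"€\s?(\d+(?:\.\d{2})?)")


def convert_prices(text: str, rate: Decimal = EUR_TO_USD) -> str:
    """Replace every euro price in `text` with a dollar price at a fixed rate."""

    def _swap(match: re.Match) -> str:
        euros = Decimal(match.group(1))
        dollars = (euros * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        return f"${dollars}"

    return PRICE_PATTERN.sub(_swap, text)


print(convert_prices("Now only €20.00 each"))
```

Passing a freshly fetched rate as the `rate` argument would be one way to approximate the dynamic option, at the cost of depending on an external rate source.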
Question: My client does not want us to use MT because they have had bad experience in the past with Bing Translate – what would convince them to use KantanMT? How will the output be different?
Answer (Tony O’Dowd): One of the things that you have to recognise in terms of using the KantanMT platform is that you are using MT to build customised machine translation engines. So you are not going to create generic engines (Bing Translate and Google Translate are generic engines). You would be building customised engines that are trained on the previous translations and glossaries that your clients have provided. You will also be using some of our stock engines that are relevant to your client’s domain.
So when you combine that, you get an engine that will mimic the translation style of your client. Indeed, instead of generic translation engines, you are using an engine that is designed to mirror the terminology and stylistic requirements of your client. If you can achieve this through Machine Translation, you will see that there is a lot less need for Post-Editing. Heavy post-editing is one of the most important things that drives translators away from generic or broad-based systems, and that’s why they choose customised systems. Clients and LSPs have tested the generic systems as well as customisable engines and found that cloud-based customisable MT adds value that is not available on free, non-customisable MT platforms.
End of Q/A session
The KantanMT Professional Services Team would once again like to thank you for all your questions during the webinar and for sending in your questions by email.
Have more burning questions? Or maybe you would like to see the brilliant platform translate in a live environment? No problem! Just send an email to email@example.com and we will take care of the rest.
Want to stay informed about our new webinars? You can bookmark this page, or even better – sign up for our newsletter and ensure that you never miss a post!
Master’s student Rafaella Athanasiadi of University College London submitted her thesis as part of the MSc degree in Scientific, Technical and Medical Translation with Translation Technology. Rafaella was supervised by Teaching Fellow and Lecturer Dr. Emmanouela Patiniotaki and she used KantanMT.com for her research. This guest blog post looks at some of her conclusions on Machine Translation and the Localization Industry.
As Hutchins & Somers (c1992:1) argue, “the mechanization of translation has been one of humanity’s oldest dreams.” During the 20th century, the translation process changed radically. From spending endless hours in libraries to find the translation of a word, the translator has been placed in the centre of dozens of assistive tools. To name just a few, today, there are many translation software, terminology extraction tools, project management components, and machine translation systems, which translators have the opportunity to choose from while translating.
However, shifting the focus to audiovisual translation, it can be observed that not so many radical changes took place in that area, at least not until the introduction of machine translation systems in various projects (such as, the MUSA and the SUMAT project) that developed machine translation engines to optimise the subtitling process. Still, the results of such projects do not seem to be satisfactory enough to inspire confidence for the implementation of these engines in the subtitling process both by subtitling software developers and subtitlers.
Based on my personal research that focused primarily on the European setting, in the subtitling industry it seems that only freeware SRT Translator incorporates machine translation while also offering the features that subtitling software usually incorporate (i.e. uploading multimedia files and timecoding subtitles) at the moment. Nonetheless, SRT Translator, which is not very famous among subtitlers, uses solely Google Translator by default, which is a general-domain machine translation engine and not suitable for the purposes of audiovisual translation, one could argue. The quality of the output of Google Translator was tested by translating 35 subtitles of a comedy series. The output was incomprehensible and misleading in many cases.
Even though no further records of traditional subtitling software that incorporates machine translation could be found, there are many online translation platforms that allow users to upload and translate subtitles. Taking into consideration the European market, these can be translation software like MemoQ, SDL Trados Studio and Wordfast that offer the ability to load subtitle files and in some cases link them to the audiovisual content they are connected to, open source tools for translators like Google Translator Toolkit (GTT), or professional and private platforms like Transifex and XTM International that are used by companies and offered to their dedicated network of translators. Nonetheless, in order to enable machine translation in all the above applications, API keys must be purchased. GTT is an exception since it can be used for free anytime and only requires a Gmail account.
The fact that subscription fees have to be paid along with the costs of API keys for each machine translation engine provider puts their usability in question, since costs may outweigh subtitlers’ profits. Furthermore, these platforms cannot accommodate subtitlers’ needs; for instance, the option to upload and play multimedia files while translating the subtitles is not always available, nor are any synchronization features offered for timecoding the subtitles to the audio track. Transifex, however, is an exception since this localization platform offers users the option to upload multimedia files in the translation editor while translating the subtitles.
According to Macklovitch (2000:1), a translation memory is considered to be “a particular type of translation support tool that maintains a database of source and target language sentence pairs, and automatically retrieves the translation of those sentences in a new text which occur in the database.” Even though machine translation engines were developed through different projects to reduce subtitling time to the least possible degree, no attempts were traced during this research to integrate a translation memory tool into subtitling software for optimizing subtitling, at least in a European, Asian and Australian setting. As Smith (2013) argues, “traditionally subtitling has fallen outside the scope of translation memory packages, perhaps as it was thought to be too creative a process to benefit from the features such software offers.” However, as Diaz-Cintas (2015:638) discusses, “DVD bonus material, scientific and technical documentaries, edutainment programmes, and corporate videos tend to contain the high level of lexical repetition that makes it worthwhile for translation companies to employ assisted translation and memory tools in the subtitling process.”
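The definition quoted from Macklovitch can be made concrete with a small Python sketch of a sentence-pair store. This is only an illustration under stated assumptions: the class name `TranslationMemory`, the fuzzy-match threshold and the use of `difflib` similarity are not taken from any of the tools discussed in this post, and fuzzy matching goes slightly beyond the exact retrieval the definition describes.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """A toy database of source/target sentence pairs."""

    def __init__(self):
        self._pairs = {}

    def add(self, source, target):
        """Store one translated sentence pair."""
        self._pairs[source] = target

    def lookup(self, sentence, threshold=0.75):
        """Return (target, score) for the best stored match, or None.

        An exact match scores 1.0. Otherwise the most similar stored source
        sentence is used, provided its similarity reaches `threshold`
        (an assumed cut-off, chosen only for this example).
        """
        if sentence in self._pairs:
            return self._pairs[sentence], 1.0
        best_target, best_score = None, 0.0
        for source, target in self._pairs.items():
            score = SequenceMatcher(None, sentence, source).ratio()
            if score > best_score:
                best_target, best_score = target, score
        if best_target is not None and best_score >= threshold:
            return best_target, best_score
        return None


tm = TranslationMemory()
tm.add("Press play to start.", "Appuyez sur lecture pour commencer.")
print(tm.lookup("Press play to start."))
```

Repetitive material of the kind Diaz-Cintas lists (corporate videos, technical documentaries) is where such lookups would hit most often, since the same source sentences recur.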
Even if such tools have not been integrated in subtitling software, translation memory components are used for subtitling purposes in cloud-based platforms such as GTT, Transifex and XTM International as well as in translation software, MemoQ, SDL Trados Studio, Wordfast Pro and Transit NXT by simply creating a translation memory before or while translating. It should be noted that Transit NXT is the only translation software that can accommodate the needs of subtitlers to a high level among the tools discussed in this research. Apart from the addition of specialized filters to load subtitles (that also exist in MemoQ, SDL Trados Studio and Wordfast Pro), subtitlers can upload multimedia files, translate subtitles while a translation memory component is active and also synchronise their subtitles with the Transit translation editor (Smith, 2013).
Figure 1: The translation editor of Transit NXT by Smith (2013)
The newly-founded company (2012) OOONA has taken a very interesting approach to subtitling by developing a unique cloud-based toolkit that is built exclusively for accommodating the needs of subtitlers. When asked the following question within the context of the MSc thesis,
Considering that other cloud-based translation platforms like GTT, Transifex and XTM International offer the option of uploading a TM or a terminology management component, do you think that it is important to offer it on a subtitling platform as well?
the representative of OOONA (Alex Yoffe) replied that not only will the company implement translation memory and terminology management components in the next phase of enhancing their platform but that they also consider these components to be very important for the subtitling process. In addition, Yoffe (2015) argued that OOONA intends to “add the option of using MT engines. Translators will be able to choose between Microsoft’s, Google’s, or customisable MT engines.” Therefore, it seems that OOONA will become a very powerful tool in the near future, with features that will optimise the subtitling process to the maximum and reshape the way that subtitling has been carried out until now. The fact that Screen Systems, Cavena and EZTitles have partnered with OOONA is an indicator of how much potential there is in this toolkit.
As can be argued based on the above, there is a lack of subtitling software with incorporated translation memory tools. Therefore, this issue was further researched through an online questionnaire that was disseminated to subtitling companies and freelance subtitlers. In addition, two companies that develop subtitling software, Screen Subtitling Systems and EZTitles, were asked to present their views on this topic. In both cases, their willingness to optimise the subtitling process in a semi-automated or a fully-automated way was apparent through their answers. The former company was in favour of a combination of machine translation tools with translation memory tools whereas the latter leaned towards a subtitling system with integrated translation memory and terminology management tools.
Nonetheless, the optimisation of the subtitling process has to coincide with the needs and preferences of subtitlers. Based on the respondents’ answers, it is clear that translation memory tools in subtitling software are desired by subtitlers. In response to the question,
Which tool would you prefer to have in a subtitling software? An integrated translation memory (TM) or machine translation (MT)?
more than half of the respondents (56.8%) chose TM. Interestingly, the answer Both received the second highest percentage (20.5%) which indicated that subtitlers demand as many assistive tools as possible.
One of the main conclusions that were drawn from this research was that machine translation engines need to be customised to produce good quality output and this can be achieved through customisable engines like KantanMT and Milengo. Moreover, translation memory tools are sought by subtitlers in subtitling software, while cloud-based platforms seem to occupy the translation industry today. Following this trend, subtitling software providers partner with online services/tools like the OOONA toolkit.
Based on the outcomes of this research, it could be said that we are certainly experiencing a new era in subtitling, since traditional PC-based subtitling software is now transforming into flexible and accessible platforms to enhance the subtitling experience as much as possible. It is a matter of time before we see which tool and platform will rule the subtitling industry, but one thing is for sure: the technologies of the future will bring a lot of changes to the traditional way of subtitling.
Diaz-Cintas, J. (2015). Technological Strides in Subtitling. In S. Chan (Ed.), Routledge Encyclopedia of Translation Technology (pp. 632-643). London: Routledge.
Hutchins, J. W. & Somers, H. L. (c1992). An introduction to machine translation. London: Academic Press.
Macklovitch, E. (2000). Two Types of Translation Memory. In Proceedings of the ASLIB Conference on Translating and the Computer (Vol. 22).
If you are in the language service industry, you are undoubtedly on the lookout for ways in which you can improve the productivity of your team – more translated words in less time – that’s what drives your clients as well as you. Automated Machine Translation (MT) seems to be the logical step forward in today’s world of content explosion and tightening deadlines. However, for most Language Service Providers (LSPs), the challenge lies in the actual implementation of this sophisticated technology.
For this reason, it is important that no matter what translation management tools you use, they should be integrated with a powerful MT engine that is reliable, scalable, flexible, and can be trained and re-trained constantly for maximum efficiency and quick turnaround times.
In today’s fast-paced world of content explosion on the Internet, the need for translating this organically growing content with the help of machines has become inevitable. While post-editing the machine translated content will always be required, choosing the right MT features will ensure that translators do not spend countless frustrating hours on those edits.
In this Kantanwebinar, The KantanMT Professional Services Team, Tony O’Dowd and Louise Faherty (Quinn) will show how you can improve the translation productivity of your team, and manage effort estimations and project deadlines better with a powerful MT engine.
During this webinar you will learn:
About translation challenges (co-ordinating and managing translation projects)
About the necessity of Machine Translation to be competitive
How KantanMT.com can be integrated with other Translation Management Systems
The KantanPEX Rule Editor enables members of KantanMT to reduce the amount of manual post-editing required for a particular translation by creating, testing and deploying post-editing automation rules on their Machine Translation engines (client profiles).
The editor allows users to evaluate the output of a PEX (Post-Editing Automation) rule on a sample of translated content without needing to upload it to a client profile and run translation jobs. Users can enter up to three pairs of search and replace rules, which will be run in descending order on the content.
How to use the KantanMT PEX Rule Editor
Log in to your KantanMT account using your email and your password.
You will be directed to the ‘Client Profiles’ tab in the ‘My Client Profiles’ page. The last profile you were working on will be ‘Active’ and marked in bold.
To use the ‘PEX-Rule Editor’ with a profile other than the ‘Active’ profile, click on the new profile name to select that profile for use with the ‘Kantan PEX-Rule editor’.
Then click the ‘KantanMT’ tab and select ‘PEX Editor’ from the drop-down menu.
You will be directed to the ‘PEX Editor’ page.
Type the content you wish to test on, in the ‘Test Content’ box.
Type the content you wish to search for in the ‘PEX Search Rules’ box.
Type what you want the replacement to be in the ‘PEX Replacement Rules’ box and click on the ‘Test PEX Rules’ button to test the PEX-Rules.
The results of your PEX-Rules will now appear in the ‘Output’ box.
Give the rules you have created a name by typing in the ‘Rule Name’ box.
Select the profile you wish to apply this rule(s) to and then click on the ‘Upload Rule’ button.
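The search-and-replace behaviour tested in the steps above can be approximated offline with a short Python sketch. It assumes the rules behave like regular expressions applied top to bottom, which this guide does not confirm; the function name `apply_rules` and the sample rules are illustrative and are not KantanMT PEX syntax.

```python
import re

# The editor accepts up to three search/replace pairs at a time.
MAX_RULES = 3


def apply_rules(text, rules):
    """Apply (search, replacement) pairs to `text` in the order given.

    Each rule sees the output of the previous one, so order matters.
    """
    if len(rules) > MAX_RULES:
        raise ValueError(f"at most {MAX_RULES} rule pairs are allowed")
    for search, replacement in rules:
        text = re.sub(search, replacement, text)
    return text


# Illustrative rules: normalise a spelling, remove a space before a comma,
# and close the gap between a number and a percent sign.
sample_rules = [
    (r"\bcolor\b", "colour"),
    (r"\s+,", ","),
    (r"(\d+) %", r"\1%"),
]

print(apply_rules("The color , at 50 % opacity", sample_rules))
```

Trying rules on a small sample like this mirrors what the ‘Test Content’ and ‘Output’ boxes do: the effect of each rule is visible before anything is uploaded to a profile.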
KantanMT PEX editor helps reduce the amount of manual post-editing required for a particular translation, hence, reducing project turn-around times and costs. For additional information on PEX-RULES and the Kantan PEX-Rule editor please click on the links below. For more details about KantanMT localization products and ways of improving work productivity and efficiency please contact us at firstname.lastname@example.org.
In science fiction, translation of the potentially infinite number of languages spoken by alien species presents a dilemma. How to deal with communication between interplanetary species without resorting to contrivance, or spending the first twenty minutes of each episode’s dialogue clumsily showing characters learning one another’s diphthongs?
The notion of a ‘universal translator’ emanated from Murray Leinster’s novella First Contact, published in 1945 (and clearly that isn’t the only debt Gene Roddenberry owes to Leinster). It’s a greatly helpful – borderline miraculous, in fact – convention of sci-fi: a technological solution to the language barrier, leaving more time for the actual narrative to unfold in one language, typically English.
With the incredible advancements in technology we’re witnessing at the moment such as Microsoft’s pilots of a Skype Translator and the industry leading work KantanMT is achieving in this area, are we seeing the beginnings of live translation – well ahead of Star Trek’s 22nd century deadline? In the meantime, let’s take a look at five of sci-fi’s finest translation machines, which beat anything real-life technology can offer – for now.
1. Star Trek: Universal Translator
An important part of Star Trek’s near-utopian vision of the future is the Universal Translator. Translating any language into another even while a person is speaking, this exceptionally handy tool means Starfleet craft in any quadrant of the galaxy can speak to new life and new civilizations without confusion.
Voiced by Star Trek creator Roddenberry’s widow Majel Barrett until her death in 2008, the development of a universal translator was, in the Trek universe, a portent of Earth’s cultures achieving universal peace. It’s difficult to imagine Google Translate having the same impact.
This convenient concept has been often copied, and occasionally parodied: in Futurama, everyone in the universe speaks English, rendering Professor Farnsworth’s one successful invention – a translation device – useless, as it merely translates English into the dead language, French!
2. The Hitchhiker’s Guide to the Galaxy: the Babel Fish
Some sci-fi plays with the concept in less serious ways. In Douglas Adams’ H2G2, to help Arthur Dent deal in some small way with anything that goes on around him, inserted into his ear is a Babel Fish, memorably described by the Guide as “small, yellow, leechlike and probably the oddest thing in the universe.”
The science (such as it is) behind the Babel Fish is that it can absorb the frequencies of outside speakers, and a translation is secreted by the fish into the hearer’s brain via his or her ear canal. In a witty reversal of Star Trek’s idealistic Federation, Adams reveals that, by allowing everyone to understand one another, the Babel Fish has actually caused more war than anything else in the universe.
3. Farscape: Translator microbes
In science fiction, as in reality, it is the individual idiosyncrasies of languages which are trickiest to master. When people in the UK from a hundred miles apart may speak different languages, not to mention a range of different dialects and accents, can auditory translation really be so smooth?
One series to acknowledge this is Farscape, where astronaut John Crichton is injected with bacteria-sized ‘translator microbes’, which colonise his brain. The microbes work to make their host understand any spoken information in any language, except that idioms are translated literally. This leads to a great deal of confusion for John, and opportunities for humour for the audience (all jokes are language, after all), and also perhaps renders these microbes a more realistically-limited translator technology.
4. Doctor Who: The TARDIS’ Translation Circuit
As well as being telepathically linked with the Doctor, and granting the ability to travel to any time or place in history and the future, the TARDIS’ telepathic field is used to automatically translate what the Doctor and any companions hear or read into a language which they can understand.
While wonderfully convenient, the mind-meld involved does mean that the translation circuits won’t actually work when the Doctor is unconscious, which is not an outright impossibility. Also, because translations are time specific, ancient civilizations won’t understand neologisms. Neatly, the Romans have never heard the word ‘volcano’, because they’ve not lived to see an eruption.
5. Star Wars: C-3PO
Luke Skywalker is the ultimate sci-fi everyman: he is every bit as much in need of a guide to the universe he finds himself in as the viewing audience is. Reinforcing this are his guides, C-3PO and R2-D2, whom Luke needs with him – despite their obvious drawbacks as travelling companions – because C-3PO is programmed with millions of languages, everything from Ewok to R2’s bleeps and whistles.
When the franchise returns with The Force Awakens later this year (which most fans will rightly consider the fourth, rather than seventh, Star Wars movie), C-3PO’s translation abilities are sure to make him at least partially useful to have around.
The KantanMT team say a big Thank You to Richard for a very savvy post on translation machines in science fiction.
KantanAPI enables KantanMT clients to interact with KantanMT as an on-demand web service. It also provides a number of different services including translation, file upload and retrieval and job launches.
With the KantanAPI you not only have the opportunity to integrate KantanMT into your workflow systems but also the ability to receive on-demand translations from your KantanMT engines. All these services make the experience with Machine Translation as seamless as possible.
Please Note: The API is only available to KantanMT members in the Enterprise Plan.
To access the KantanMT API you will first need your ‘API token’. This token can be found in the ‘API’ tab on the ‘My Client Profiles’ page of your KantanMT account.
Once you have your token you can use the API in a number of ways
Using the API tab on the ‘My Client Profiles’ page in the KantanMT Web interface
Using the REST interface via HTTP GET or POST requests
Using one of our various connectors, which are built using our KantanAPI
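As a rough picture of the second option (the REST interface via HTTP GET or POST), the Python sketch below only builds the two kinds of request and does not send anything. The endpoint URL and the parameter names `token`, `profile` and `text` are hypothetical placeholders, not the real KantanAPI interface; the actual names must be taken from the API technical documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# HYPOTHETICAL endpoint, for illustration only. It is not a real KantanMT URL.
BASE_URL = "https://api.example.invalid/translate"


def build_get_request(token, profile, text):
    """Build a GET request carrying the parameters in the query string."""
    # Parameter names are assumptions made for this sketch.
    query = urlencode({"token": token, "profile": profile, "text": text})
    return Request(f"{BASE_URL}?{query}", method="GET")


def build_post_request(token, profile, text):
    """Build a POST request carrying the same parameters as a form body."""
    body = urlencode({"token": token, "profile": profile, "text": text})
    return Request(
        BASE_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


get_req = build_get_request("TOKEN", "my-profile", "Hello world")
post_req = build_post_request("TOKEN", "my-profile", "Hello world")
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

In general, GET suits short strings, while POST avoids URL length limits and keeps the text out of the address, which may matter for longer segments.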
For more details on implementing your API solution via the REST interface, please see the full API technical documentation at the following link:
Log in to your KantanMT account using your email and your password.
You will be directed to the ‘Client Profiles’ section of the ‘My Client Profiles’ page. The last profile you were working on will be ‘Active’.
If you wish to use the ‘KantanAPI’ with a profile other than the ‘Active’ profile, click on the profile you wish to use, then click on the ‘API’ tab.
You will be directed to the ‘API Settings’ page. Now click on the ‘Launch API’ button.
A ‘Launch API’ pop-up will now appear on your screen asking you ‘Are you sure you want to launch the API?’ Click ‘OK’.
The ‘API Status’ will now change from ‘offline’ to ‘initialising’, and the ‘Launch API’ button will change to ‘Launching API’.
When your KantanAPI launches the ‘API Status’ will now change from ‘initialising’ to ‘running’, the ‘Launching API’ button changes to ‘Shutdown API’ and you should now be able to click on the ‘Translate’ button.
Type the text you wish to translate in the text box and click on the ‘Translate’ button.
The translated text will now appear in the ‘Translated Text’ box. If you wish to make any changes to the translated text simply place the cursor inside the ‘Translated Text’ box and make the changes. Save these changes by clicking the ‘Retrain Engine’ button.
Test if your engine was successfully retrained by clicking the ‘Translate’ button. The retrained text will now appear in the ‘Translated Text’ box.
If you don’t wish to retrain your engine and you are happy with the translated text in the ‘Translated Text’ box, you may continue translating other text or shut down your KantanAPI by clicking the ‘Shutdown API’ button.
When you click the ‘Shutdown API’ button a pop-up will appear asking you ‘Are you sure you want to shut down the API?’ Click ‘OK’.
The ‘Shutdown API’ button will now change to ‘Terminating API’, the ‘API status’ will now change from ‘running’ to ‘terminating’ and you shouldn’t be able to click on the ‘Translate’ or ‘Retrain Engine’ button.
You will now be directed back to the initial screen on the API Settings page.
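The status changes described in the steps above can be summarised as a small state table. The Python sketch below is only a reading aid: the status names come from this guide, while the final step from ‘terminating’ back to ‘offline’ is an assumption inferred from returning to the initial screen, and the function names are illustrative.

```python
# Status names as they appear in the 'API Status' field in this guide.
# The last transition (terminating -> offline) is assumed, not stated.
TRANSITIONS = {
    "offline": "initialising",      # after confirming 'Launch API'
    "initialising": "running",      # once the API has launched
    "running": "terminating",       # after confirming 'Shutdown API'
    "terminating": "offline",       # assumed: back to the initial screen
}


def next_status(status):
    """Return the status that follows `status` in the launch/shutdown cycle."""
    return TRANSITIONS[status]


def can_translate(status):
    """The 'Translate' button is only usable while the API is running."""
    return status == "running"


status = "offline"
for _ in range(4):
    print(status, "| translate enabled:", can_translate(status))
    status = next_status(status)
```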
KantanAPI™ is one of the various machine translation services offered by KantanMT to improve productivity for our clients and also enable them to be more efficient. For more information on KantanAPI or any KantanMT products please contact us at email@example.com.
For more details on the KantanMT API please see the following links and the video below:
Ease of use and simplicity are always on the minds of our Developers, hence the making of KantanTimeLine™. KantanTimeLine enables KantanMT clients to view the life cycle of their KantanMT engine. This empowers our clients as they are able to find exactly what is negatively or positively affecting the quality of their engines. Clients are able to keep track of things such as Training Data uploads, Translation jobs, Engine Tuning, templates, Build jobs and so on through the KantanTimeLine.
How to use KantanTimeLine™
Log in to your KantanMT account using your email and your password.
You will be directed to the ‘Client Profiles’ section of the ‘My Client Profiles’ page. The last profile you were working on will be ‘Active’.
If you wish to use ‘KantanTimeLine’ with a profile other than the ‘Active’ profile, click on the profile whose ‘KantanTimeLine’ you wish to view.
Click on the ‘TimeLine’ tab.
You will now be directed to the ‘TimeLine’ page for your chosen profile.
To restore an Archived Build select the Build you wish to restore from the ‘Archives’ drop-down menu and click on the ‘Restore’ button.
To delete an archived Build click on the ‘Delete’ button.
To archive a Build click on the ‘Archive’ button of the build you wish to archive.
To view or edit the description of a build click on the ‘Yellow Notepad’ icon.
To filter the timeline click on the ‘Filter’ drop down-menu and select the filter you wish to use.
Additional Information and Support
KantanTimeLine™ is one of the many products offered by KantanMT to make the integration of Machine Translation into the workflow of our clients seamless. For more information on TimeLine or any KantanMT products please contact us at firstname.lastname@example.org.
TimeLine can also be used in KantanBuildAnalytics. To learn how TimeLine is incorporated into KantanBuildAnalytics please click on the link below or contact us at email@example.com.
Gap Analysis identifies and reports any untranslated words in the training data set and allows you to take preventive measures quickly by fine tuning training data and filling data gaps. The KantanTimeLine™ provides a chronological history of activities for each engine and uses version control for precise management of released and production-ready engines.
Using Kantan TimeLine and Gap Analysis:
In KantanBuildAnalytics, click the Gap Analysis tab to see the number of untranslated words that remain in the generated translations. You will be directed to the Gap Analysis page, where you will see a breakdown of any gaps in your training data.
A table appears with four headings: ‘#’, ‘Unknown Word’, ‘Reference/Source’ and ‘KantanMT Output’. Under those headings you will find details of any untranslated words, their source and the KantanMT output.
Click Download to download your Gap Analysis report.
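To give a feel for what such a report contains, here is a minimal Python sketch that lists source words missing from the training data, using the same column layout as the table described above. It is a simplified stand-in and not how KantanBuildAnalytics computes its report; the function names and the toy data are invented for illustration.

```python
import re


def tokenize(sentence):
    """Split a sentence into lowercase word tokens."""
    return re.findall(r"\w+", sentence.lower())


def gap_report(training_sources, segments):
    """List source words that never occur in the training data.

    `segments` is a list of (source, mt_output) pairs. Each returned row
    follows the layout (#, unknown word, reference/source, MT output).
    """
    vocabulary = {word for s in training_sources for word in tokenize(s)}
    rows = []
    for source, output in segments:
        for word in tokenize(source):
            if word not in vocabulary:
                rows.append((len(rows) + 1, word, source, output))
    return rows


# Toy data: "cardigan" was never seen in training, so it shows up as a gap.
training = ["the red dress", "a blue shirt"]
translated = [("the red cardigan", "le cardigan rouge")]

for row in gap_report(training, translated):
    print(row)
```

Each row points to a word that could be covered by adding training data or a glossary entry containing it, which is the "filling data gaps" step mentioned earlier.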
Note: You can also click the Timeline tab to view your profile’s Timeline, which is essentially a record of the changes you have made on your engine.
This is one of the many features provided in KantanBuildAnalytics, which aids Localization Project Managers in improving an engine’s quality after its initial training. To see other features used in KantanBuildAnalytics suite please see the links below.
It’s a fact: entering new markets is the key to increasing profits, and the first item on any company’s internationalization checklist should be to make sure it communicates product information in a way its target customers can understand.
Leading on from the 2006 research, CSA’s updated survey in 2014 was based on a sample of three thousand global respondents, and it reinforced earlier results by showing that 55% only buy from websites in their native language. This jumped dramatically to 80% in cases where the buyer’s English language ability is limited.
When it comes to selling internationally, tapping into new revenue streams demands translated content. But, what happens when you have thousands of product descriptions that need to be localized into a plethora of languages?
This is where the fun begins for localization teams with well-established traditional translation workflows in place. Their existing method seems fine, but when it is time to scale up, cracks in the process begin to appear.
The translation workflow works best when it matches the scale and velocity of the content being created, whether that is product descriptions, manuals or online help documentation.
The challenging part –
How to translate product descriptions with velocity and to scale?
We have heard a great many arguments for and against machine translation, and one of the best-known arguments against it is that “the quality is rubbish; sentences translated by machine translation are garbled and incomprehensible”. We in the language technology field hear this frequently and often shudder in disbelief at how these conclusions have been reached.
Generic or free machine translation systems in most cases do not produce great results. Expecting such a system to produce publishable-quality MT output, or using it as a benchmark for all MT systems, is akin to extracting blood from a stone. Achieving good MT output takes time, care and the ability to customise the MT system properly.
Any company that is serious about breaking into international markets should also be serious about its MT strategy. It should be considering a customised MT solution that is tailored to its needs, not just going for a cheap and/or supposedly free option.
Why is MT customisation so important?
Statistical machine translation is based on machine learning and pattern recognition. Segments are broken down into multi-word phrases, known as n-grams, and probability algorithms select the most probable translation match for each one. Generic or free MT systems have typically been built on a broad mix of content styles and types, which makes it much harder for the system to identify the most likely, or even a relevant, match.
When the MT system is customised for content from a single domain, such as product descriptions for specific categories (e.g. home and garden, fashion or electronic devices), the training data shares the syntax, style and phraseology of the content to be translated. As a result, there is a higher probability that a generated match will be close to the desired output, resulting in a much more accurate translation.
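The effect of domain-specific training data can be illustrated with a toy phrase table. This is a simplified sketch, not how KantanMT or any production engine is built: the function names `build_phrase_table` and `best_translation` are hypothetical, the English-German phrase pairs and their counts are invented for illustration, and probabilities are estimated by plain relative frequency.

```python
from collections import Counter, defaultdict


def build_phrase_table(phrase_pairs):
    """Estimate P(target | source) by relative frequency of aligned pairs."""
    counts = defaultdict(Counter)
    for source, target in phrase_pairs:
        counts[source][target] += 1
    return {
        source: {t: n / sum(targets.values()) for t, n in targets.items()}
        for source, targets in counts.items()
    }


def best_translation(table, source):
    """Return the most probable (target, probability) for a source phrase."""
    return max(table[source].items(), key=lambda item: item[1])


# Invented counts: a generic corpus sees "light" in many senses...
generic_pairs = (
    [("light", "Licht")] * 4      # light as illumination
    + [("light", "leicht")] * 3   # light as in not heavy
    + [("light", "hell")] * 2     # light as in a pale colour
    + [("light", "Leuchte")] * 1  # light as a lamp or fixture
)
# ...while a home-and-garden corpus mostly means the product.
domain_pairs = (
    [("light", "Leuchte")] * 8
    + [("light", "Licht")] * 1
    + [("light", "hell")] * 1
)

generic_table = build_phrase_table(generic_pairs)
domain_table = build_phrase_table(domain_pairs)

print(best_translation(generic_table, "light"))  # ('Licht', 0.4)
print(best_translation(domain_table, "light"))   # ('Leuchte', 0.8)
```

In the generic table the probability is spread across several senses, so the top match is both less certain and wrong for a product listing. In the domain table the relevant sense dominates.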
How important is saving costs?
Of course Machine Translation can save costs; if done properly, significant savings can be made. But saving costs is often not the end goal of implementing a serious MT strategy. The real gains come from increasing productivity without compromising quality. Why translate 2000 words a day when you can machine translate and post-edit 8000 words with no loss of quality? Really, it can be done! See an example first hand (Netthandelen’s case study PDF download).
When it comes to eCommerce and selling hundreds of products online, the words to be translated are counted in billions, not thousands. Without MT, traditional localization budgets would become more and more expensive, so MT is really the only practical solution. But if MT is treated as a way to save money by cutting corners, it is doomed to fail from the outset.
It will fail because it is not sustainable: the effort and costs required to fix bad-quality MT output are too great. If fixing is neglected and the content is published as is, the result will be angry customers who shop elsewhere. And they will, as the choice available now is greater than ever before!
Generic free MT will not generate the same quality as customised MT
Investing in a robust MT strategy will save time, costs and headaches in the long run
Keep the focus on communicating with customers in their language, and your eCommerce business will thrive
Email firstname.lastname@example.org if you have questions or want to learn more about how Machine Translation works for product descriptions.