In the U.S., the number of people who speak a language other than English is believed to be at an all-time high. As of 2015, CareerBuilder predicted that translation and interpretation services would be the fastest-growing industry in the country, while the U.S. Bureau of Labor Statistics predicted a 46% increase in translation job opportunities in the U.S. between 2012 and 2022. However, all those positive predictions were made before the U.S. Presidential Election of 2016. After a whirlwind campaign, President-elect Donald Trump is set to take the Presidential Oath on 20 January 2017. But what could this mean for those in the localization industry?
If the post-Black Friday sales numbers are anything to go by, there is no longer any question that the face of eCommerce is changing, and with it, brick-and-mortar retailers have started rethinking their business strategy. As this news piece about Scotland experiencing a major dip in shoppers suggests, demand for online shopping will increase substantially in 2016. This in turn means that the need for content localization and translation for eTailers (online retailers) will be even more pressing in the coming year. As the often-quoted Common Sense Advisory report points out, 72.4% of consumers are more likely to buy from a site that is in their native language. Indeed, localization is no longer a good-to-have feature; it is now a must-have for all eCommerce businesses that aim to sell their products globally.
Chris Bishop, Managing Director of Microsoft Research, Cambridge, UK points out that “by 2026 we will have ubiquitous, human-quality translation among all European languages, thereby eliminating the language barrier throughout Europe.” Bishop’s prediction does not sound far off the mark at all when we take into account the fact that in the past ten years, Machine Translation (MT) has improved by leaps and bounds. Early MT was rules-based (RBMT) and required sets of linguistic rules, and it worked moderately well within a prescribed domain. However, this was resource intensive and cost prohibitive for many.
The turning point for using MT in business came with the advent of the Internet, the SaaS model and the open source development model for software. These changes in technology helped build the foundation for Statistical Machine Translation (SMT) research, and subsequently the open source development of the Moses decoder. Moses enabled researchers and private companies to commercialise Statistical MT and develop it into the custom solutions it is today. The year 2016 and beyond will see further research in the fields of Natural Language Processing (NLP), deep learning and machine learning, contributing directly to immense improvements in Custom MT.
The KantanMT Business Team has published a new white paper that provides an in-depth understanding of how eTailers will be affected by Machine Translation in 2016. It also discusses how Custom Machine Translation (CMT), when compared to generic MT systems, will emerge as the clear winner in solving eTailing localization issues in the coming year.
Here are some of the highlights of how MT will evolve in 2016 for eTailers:
- eTailers will use either CMT alone or CMT combined with human post-editing to reach new markets ahead of their competitors
- With increased multilingual customer demand for products, content translation will find support in auto scaling
- Custom Machine Translation will be used more widely as eCommerce customers expand globally
Machine Translation is no longer a luxury. It is an essential component as a Tier 1 application to support global business. The purpose of this paper is to highlight how Machine Translation and more importantly Custom Machine Translation technology has come of age, in terms of quality, speed and scalability. During 2016 and beyond eTailers need to ensure that they review their globalization strategies to reflect these advances in technology, so they can maximise their global growth potential.
We have entered a new age, and a new technology has come into play: Machine Translation (MT). It’s globally accepted that MT systems dramatically increase productivity but it’s a hard struggle to integrate this technology into your production process. Apart from handling the engine building and optimizing procedures, you have to transform your traditional workflow:
The traditional roles of the linguists (translators, editors, reviewers etc.) are reconstructed and converged to find a suitable place in this new, innovative workflow. The emerging role is called ‘post-edit’ and the linguists assigned to this role are called ‘post-editors’. You may want to recruit some willing linguists for this role, or persuade your staff to adopt a different point of view. But whatever the case may be, some training sessions are a must.
What is covered in training sessions?
1. Basic concepts of MT systems
Post-editors should have a notion of the dynamics of MT systems. It is important to focus on the system that is utilized (RBMT/SMT/Hybrid). For widely used SMT systems, it’s necessary for them to know:
- how the systems behave
- the functions of the Translation Model and Language Model*
- input (given set of data) and output (raw MT output) relationship
- what changes in different domains
* It is not essential to give detailed information about these topics, but touching on them will help you gauge the technical background of the candidates. Some of the candidates may be included in the testing team.
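For trainers who want a compact way to explain how the Translation Model and Language Model work together, the textbook noisy-channel formulation of SMT can help. This is the standard formulation from the SMT literature, offered here as background rather than as a description of any particular engine; the symbols f (the source sentence) and e (a candidate translation) are notation introduced for this sketch.

```latex
\hat{e} = \arg\max_{e} P(e \mid f)
        = \arg\max_{e} \underbrace{P(f \mid e)}_{\text{Translation Model}} \; \underbrace{P(e)}_{\text{Language Model}}
```

Read informally: the Translation Model, learned from bilingual data, scores how well a candidate accounts for the source sentence, while the Language Model, learned from target-language text, scores how fluent the candidate is. The engine outputs the candidate with the best combined score.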
2. The characteristics of raw MT output
Post-editors should know the factors affecting MT output. The difference between working on fuzzy TM systems and working with SMT systems also has to be covered in a proper training session. Here is what should be conveyed:
- The MT process is not the ‘T’ of the TEP (translate, edit, proofread) workflow, and raw MT output is not the target text expected as the output of the ‘T’ step.
- In the early stages of an SMT engine, output quality varies depending on the project’s dynamics, and errors are not identical. As the system improves, the quality level becomes more even and consistent within the same domain.
- There may be some word or phrase gaps in the system’s pattern mappings. (Detecting these gaps is one of the main responsibilities of the testing team, but a successful post-editor must be informed about the possible gaps.)
3. Quality issues
This topic has two aspects: defining the required target (end product) quality, and evaluating and estimating output quality. The first gives you the final destination and the second tells you where you are.
The required quality level is defined according to the project requirements, but it mostly depends on the target audience and the intended usage of the target text. This seems similar to the procedure in a TEP workflow. However, it is slightly different: the engine improvement plan should also be considered when defining the target quality level. Basically, this parameter is classified into two groups: publishable and understandable quality.
The evaluation and estimation aspect is a little more complicated. The most challenging factor is standardizing measurement metrics. Besides, the tools and systems used to evaluate and estimate the quality level have some more complex features. If you successfully establish your quality system, then adversities become easier to cope with.
It is the post-editors’ duty to understand the dynamics of MT quality evaluation, and the distinction between MT and human translation (HT) quality evaluation procedures. They are therefore expected to be aware of the expected error patterns. It will be more convenient to apply error categorization with well-trained staff (QE staff and post-editors).
4. Post-editing Technique
The fourth and last topic is the key to success. It covers the appropriate methods and principles, as well as the perspective post-editors usually acquire. The post-edit technique is formed using the materials prepared for the previous topics and the data obtained from the above-mentioned procedures, and it is defined separately for almost every individual customized engine.
The core rule for this topic is that the post-edit technique, as a concept, must be clearly differentiated from traditional edit and/or review technique(s). Post-editors should be capable of:
- reading and analyzing the source text, raw MT output and categorized and/or annotated errors as a whole.
- making changes where necessary.
- treating the post-edited data as part of the data set to be used in engine improvement, and performing their work accordingly.
- applying the rules defined for the quality expectation levels.
As briefly described in topic #3, the distance between the measured output quality and the required target quality may be seen as the post-edit distance. It roughly defines the post-editor’s tolerance and the extent to which they will perform their work. Another criterion that allows us to define the technique and the performance is the target quality group. If the target text is expected to be of publishable quality, it is called full post-edit; otherwise, it is called light post-edit. Light and full post-edit techniques can be briefly defined as above, but the distinction is not always so clear. Besides, the concepts of under-editing and over-editing should be covered alongside the issues above. You may want to include some more details about these concepts in the post-editor training sessions; enriching the training materials with some examples would be a great idea!
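The article does not prescribe a formula for the post-edit distance. As one possible training example, the sketch below uses a common proxy: the word-level edit distance between the raw MT output and its post-edited version, divided by the length of the post-edited text. The function names and the choice of metric are assumptions made for illustration, not the method used by any particular quality system.

```python
from typing import List


def word_edit_distance(source_words: List[str], target_words: List[str]) -> int:
    """Levenshtein distance over words: insertions, deletions, substitutions."""
    previous = list(range(len(target_words) + 1))
    for i, src in enumerate(source_words, start=1):
        current = [i]
        for j, tgt in enumerate(target_words, start=1):
            cost = 0 if src == tgt else 1
            current.append(min(
                previous[j] + 1,         # delete a word from the raw output
                current[j - 1] + 1,      # insert a word the post-editor added
                previous[j - 1] + cost,  # keep the word or substitute it
            ))
        previous = current
    return previous[-1]


def post_edit_ratio(raw_mt: str, post_edited: str) -> float:
    """Edits per word of the post-edited text (0.0 means nothing was changed)."""
    raw_words, edited_words = raw_mt.split(), post_edited.split()
    # Guard against division by zero when the post-edited text is empty.
    return word_edit_distance(raw_words, edited_words) / max(len(edited_words), 1)


if __name__ == "__main__":
    raw = "the cat sit on mat"
    edited = "the cat sat on the mat"
    print(f"post-edit ratio: {post_edit_ratio(raw, edited):.2f}")
```

A low ratio on a segment that still fails the quality rules may point to under-editing, while a high ratio on output that was already acceptable may point to over-editing, so a figure like this works best alongside examples reviewed by a person.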
About Selçuk Özcan
Selçuk Özcan has more than 5 years’ experience in the language industry and is a co-founder of Transistent Language Automation Services. He holds degrees in Mechanical Engineering and Translation Studies and has a keen interest in linguistics, NLP, language automation procedures, agile management and technology integration. Selçuk is mainly responsible for building high quality production models including Quality Estimation and deploying the ‘train the trainers’ model. He also teaches Computer-aided Translation and Total Quality Management at the Istanbul Yeni Yuzyil University, Translation & Interpreting Department.
Read More about KantanMT’s Partnership with Transistent in the official News Release, or if you are interested in joining the KantanMT Partner Program, contact Louise (email@example.com) for more details on how to get involved.
KantanMT caught up with Milengo’s Machine Translation Solutions Architect, Deepan Patel, earlier this week for a quick chat about his experience using machine translation. Next month, Deepan will be joining Tony O’Dowd in a free live webinar to talk about how Milengo maximized its ROI for machine translation.
KantanMT: Can you tell me a little about yourself and how you got involved in the industry?
Deepan Patel: To be honest, I sort of fell into the localization industry but I am certainly very glad that I did! I am a Modern Languages graduate from the University of Oxford which provided a very traditional approach to translation, certainly a million miles away from the realities of life in the localization industry.
I moved to Berlin after graduating in late 2008 and within a year I was fortunate enough to be accepted on a trainee program by my current employer Milengo Ltd, a language services provider which was founded in 2005. The first project I ever worked on was one that involved the customization of statistical machine translation (SMT) engines for a customer wishing to test the long-term viability of incorporating machine translation and post-editing into their localization operations.
It was a tremendous experience for both myself and Milengo; it was really that initial project that has laid the foundations for the MT-related services that we now offer. The main focus of my work at Milengo relates to testing and deploying customized machine translation and post-editing workflows for clients requiring a completely outsourced MT solution.
KMT: How has MT affected or changed your business models at Milengo?
DP: I believe that having machine translation and post-editing as part of our service spectrum has lent us a significant competitive advantage. This was very apparent in September last year when we were approached by an eCommerce company with quite a formidable challenge: namely, they had 19 days in which to launch a new web shop for Sweden and around 780,000 words that needed to be localized from Danish into Swedish. And of course they had a very tight budget!
Through the experience that we have gained running large-scale machine translation and post-editing projects over the years, we were able to confidently provide a compelling MT-based workflow solution which fell within our client’s budget and would deliver high-quality translated content before their launch date. When the client gave their reasons for choosing us for that project, our confidence in stating that we could deliver in time was the main factor. Without our experience with machine translation, we would not have been able to win that project; it is as simple as that. We were able to deliver high-quality localized content within budget and before the initial deadline request. And now we enjoy regular work from this client, localizing all the updates to their product descriptions across three language pairs.
So in essence, MT has enabled us to win large-scale projects, where customer budgets are limited, turnaround time is crucial and quality expectations are high, that we might not previously have stood a favourable chance of winning.
KMT: How do you use machine translation for your clients?
DP: When answering this question I must take pains to emphasize that our MT service offerings always involve post-editing. For one of our clients within the IT domain, we localize the online help to their software products across five language pairs using customized engines that have been built using their own language assets. The requirement there is to deliver high-quality localized content at a significant cost reduction to a human-only translation model. For this particular customer we have achieved cost savings of between 27 – 40 % depending on the language pair.
For another of our clients within the automotive sector, we have built custom MT systems across 3 language pairs to provide a cost-effective but high-quality localization solution for their huge volume of parts data. The initial challenge presented to us was to localize around 300,000 words of this data within a fairly tight timeframe – though not as challenging as our eCommerce client! We were first able to demonstrate the viability of customized machine translation and post-editing for this type of content via our free Machine Translation and Post-editing (MT-PE) feasibility study, after which point we deployed our workflow solution for their three requested target languages. Again for this customer, we have implemented cost savings of between 25 – 40% when compared to the traditional translation model and are enjoying continued business from them.
The third main scenario where we apply MT-PE is for our eCommerce client that I mentioned in my response to your previous question. They add new products to their web shop on a weekly basis and their very repetitive product descriptions need to be localized as soon as possible, so the content can go “live” on the different language sites. Together with this customer we are now focusing on automating as much of the project process as possible with regard to transfer of content via API connectors and using our customized MT systems as a fully-integrated part of their localization project workflow.
For all of these clients, we have been able to offer tiered-pricing packages based on the premise that the more content that we post-edit and feed back into their MT systems during re-training cycles, the better the system will perform on future projects. Consequently we can offer lower rates for localization at defined intervals. Really it is all about being able to demonstrate the long-term cost-savings possible with a customized MT-PE solution.
KMT: What advice can you give to translation buyers, interested in implementing a machine translation workflow strategy?
DP: Well, firstly I would encourage translation buyers to evaluate whether they have the time, budget and most importantly the relevant personnel within their organization to develop a custom MT solution, or whether it would make sense to turn to external help in the form of MT tech providers like KantanMT, or LSPs such as Milengo who would additionally be able to provide post-editing solutions as well.
I would also encourage translation buyers to evaluate how MT can be applied in different usage scenarios. For example, it would certainly be worth investigating MT-PE for large-volume, highly repetitive content (user manuals, support documentation, catalogue data) where you can achieve significant cost savings and quicker turnaround without compromising on language quality (with excellent post-editors of course). Another worthwhile scenario for MT would be if your company produces a lot of short life-cycle or customer support content which needs to be available in the languages of your customers as quickly as possible, and where transfer of meaning takes precedence over linguistic quality.
Thirdly I would ask the respective translation buyer to examine the state and volume of any language assets that they can use for customizing MT systems. Do you have enough of a training corpus to build MT systems which produce good quality MT output? Have your language assets been maintained well enough to ensure as much consistency in translation as possible? Remember that an MT system will only ever be as good as the material you use to train it. Again here external help may be useful in terms of applying data cleaning and normalization to the training corpus before you get round to building your MT systems.
Finally, I would always advise prospective translation buyers to consider the wider impact benefits of incorporating MT into their localization practices. The more you make use of your custom MT systems and more post-edited content you incorporate into system re-training cycles, the better your systems will perform. This of course leads to greater productivity benefits and reduced costs for localization. Which in turn means that you should free up more of your budget to turn your attentions towards localizing content that was previously considered too cost-prohibitive.
Thank you Deepan, for taking time out of your busy schedule to take part in this interview, and we look forward to hearing more from you in KantanMT’s upcoming partner webinar. The webinar, Maximizing ROI for Machine Translation will be held on Wed, Mar 11, 2015 3:00 PM – 4:00 PM GMT.
SDL Trados Studio is one of the most popular Computer Aided Translation (CAT) tools available on the market today, and is used by thousands of Language Service Providers (LSPs) and Translators worldwide.
To accommodate the high number of SDL Trados Studio users, the KantanMT development team has released a new and improved KantanAPI Connector™, which is compatible with the latest versions of SDL Trados Studio (2011, 2014). The beauty of this connector is that you can quickly and easily configure both your SDL Trados Studio account and your KantanMT account, giving you a straightforward and seamless integration between the two platforms.
As a member of the KantanMT Community using SDL Trados Studio 2011 or 2014, you can launch and shut down your KantanMT engines and retrieve translations on demand via the API from your KantanMT account. All you need to provide is your KantanMT account name, token and profile.
Once you have your KantanAPI Connector™ token, it’s a simple three-step process to set up the integration.
Integrating SDL Trados Studio with KantanMT
Step 1: Log in to the SDL Open Exchange App Store and Download the Installer
To download the app you will need a valid SDL Trados Studio license. Log in to the SDL Translationzone App store using the same email address and password you use for SDL Trados Studio.
Step 2: Launch SDL Trados Studio
As soon as you have downloaded and installed the SDL Trados Studio installer from SDL’s Translationzone, you will need to launch Trados Studio.
Step 3: Select and Run the KantanAPI Connector
The KantanAPI Connector™ will appear in the list of plugins available for download, making it very straightforward to input your API token and select the profile that you want to use. The connector is completely free to download and requires the .NET Framework 3.5 to run correctly.
By using SDL Trados Studio you can easily access the KantanMT features available within the SDL Trados Studio interface based on your KantanMT subscription plan.
What do I do if I don’t have a KantanAPI Connector™ token?
Simply contact the KantanMT Sales Team (firstname.lastname@example.org) to get your unique KantanAPI Connector token.
About KantanAPI Connector™ v2.0
The KantanAPI Connector™ allows you and other members of the KantanMT Community to interact with the cloud-based MT platform, KantanMT.com. You can submit individual segments or groups of segments for translation, and receive those translations immediately. The API operates as a REST web service, which means that a client program only needs to be able to perform HTTP GET requests to interact with the API. The API is therefore not limited to clients developed using a particular programming language or operating system.
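As a rough illustration of what “only HTTP GET requests” means for a client, the sketch below builds a GET URL for a single segment. The endpoint address and the parameter names (`token`, `profile`, `segment`) are hypothetical placeholders chosen for this example, not the documented KantanAPI interface; check the KantanMT API documentation for the real values.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not the real KantanAPI address.
HYPOTHETICAL_ENDPOINT = "https://mt.example.invalid/api/translate"


def build_translate_url(token: str, profile: str, segment: str) -> str:
    """Return a GET URL carrying the credentials and the text to translate.

    The parameter names are assumptions made for this sketch.
    """
    # urlencode percent-escapes the values so any segment text is URL-safe.
    query = urlencode({"token": token, "profile": profile, "segment": segment})
    return f"{HYPOTHETICAL_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_translate_url("MY-TOKEN", "my-profile", "Hello, world!"))
```

Because the whole request is just a URL, any tool that can fetch a web address (a browser, a command-line client, or an HTTP library in any language) could act as a client, which is the point the paragraph above makes about language and operating system independence.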
Read the press release: KantanMT Announces Faster SDL Trados Studio 2011 and 2014 Integration
Product Sheet: KantanAPI Connector™
Download the Trados 2015 Plugin
The KantanMT Team would love to hear about your experience using the KantanMT/SDL Trados Studio connector. Please send your feedback or questions to Louise (email@example.com).
The ‘quality debate’ is old news, and the conversation, which is now heavily influenced by ‘big data’ and ‘cloud computing’, has moved on. Instead, it is focusing on the ability to scale translation jobs quickly and efficiently to meet real-time demands.
Translation buyers expect a system or workflow that provides high-quality, fit-for-purpose translations. Because of this, Language Service Providers (LSPs) have worked tirelessly, perfecting their systems and orchestrating the use of Translation Memories (TM) within well-managed workflows, alongside the professionalization of the translation industry. Quality is now a given in the buyer’s eyes.
What is the translation buyers’ biggest challenge?
The translation buyers’ biggest challenge now is scale: scaling their processes, their workflows and their supply chains. Of course, the caveat is that they want scale without jeopardizing quality! They need systems that are responsive, transparent and able to scale gracefully in step with their corporate growth and language expansion strategy.
Scale with quality! One without the other is as useless as a wind-farm without wind!
What makes machine translation better than other processes? Looking past the obvious automation of the localization workflow, the one thing that MT can do better than all other translation methods is combine automation with scalability.
KantanMT recognizes this and has developed a number of key technologies to accelerate the speed of on-demand MT engines without compromising quality.
- KantanAutoScale™ is an additional divide-and-conquer feature that lets KantanMT users distribute their translation jobs across multiple servers running in the cloud.
- Engine Optimization technology means KantanMT engines now operate 5-10 times faster, reducing the amount of memory and CPU power needed, so MT jobs can be processed faster and more efficiently when using features like KantanAutoScale.
- API optimization: KantanMT engineers went back to basics, reviewing and refining the system, which enabled users to achieve performance improvements of 50-100% in translation speed. This means translation jobs that took five hours can now be completed in less than one hour.
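The divide-and-conquer idea behind distributing a job across several servers can be sketched in a few lines. This is a generic illustration, not KantanAutoScale’s implementation: `fake_translate` is a hypothetical stand-in for a call to one MT server, and local threads stand in for cloud servers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_chunks(segments: List[str], n_chunks: int) -> List[List[str]]:
    """Divide the job into at most n_chunks contiguous, near-equal parts."""
    n_chunks = max(1, min(n_chunks, len(segments)))
    size, extra = divmod(len(segments), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional segment each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(segments[start:end])
        start = end
    return chunks


def fake_translate(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for one MT server; it only tags each segment."""
    return [f"[MT] {segment}" for segment in chunk]


def translate_job(
    segments: List[str],
    workers: int,
    translate: Callable[[List[str]], List[str]] = fake_translate,
) -> List[str]:
    """Split the job, translate the chunks in parallel, and merge in order."""
    workers = max(1, workers)
    chunks = split_into_chunks(segments, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in chunk order, so segment order is preserved.
        results = list(pool.map(translate, chunks))
    return [translated for chunk in results for translated in chunk]


if __name__ == "__main__":
    job = [f"segment {n}" for n in range(1, 8)]
    for line in translate_job(job, workers=3):
        print(line)
```

The design point worth noting is the merge step: however the work is divided, the translated segments have to come back in their original order, which is why the chunks here are contiguous and collected in sequence.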
Scalability is the key to advancement in machine translation, and considering the speed at which people are creating and digesting content we need to be able to provide true MT scalability to all language pairs for all content.
KantanMT’s Tony O’Dowd and bmmt’s Maxim Khalilov will discuss the scalability challenge and more in a free webinar for translation buyers, ‘5 Challenges of Scaling Localization Workflows in the 21st Century’, on Thursday November 20th at 4pm GMT, 5pm CET, 8am PST.
To hear more about optimizing or improving the scalability of your engine please contact Louise Irwin (firstname.lastname@example.org).
The KantanMT team is excited to be exhibiting for the first time at the tekom Trade Fair and tcworld conference. This year, the event has found a new home at the International Congress Centre (ICS) Messe Stuttgart. The event is the largest marketplace for technical communication in the world.
Not only will the KantanMT flag be flying high at the largest global TC event, but KantanMT will also be taking part in sessions and tool presentations and offering personalized demos throughout the conference week. KantanMT is also offering its members a complimentary ticket to the tekom Fair with their registration.
Session: How does your machine Translation system measure up?
Tony O’Dowd, Founder and Chief Architect, will be giving a presentation on evaluating machine translation. The presentation, ‘How does your machine Translation system measure up?’, is for localization professionals and will cover some of the most common yet critical issues for users of machine translation:
- Measuring performance of Statistical MT
- Recent advances in MT and data visualization techniques
- Tracking MT efficiency in the translation process
Where: Room C7.1OG
When: Wednesday 12th November @16:00 – 16:45
Joint Tool Presentation – Machine Translation for Translation Buyers: What is available and what is expected!
On the following day, KantanMT will be taking part in a joint tool presentation with German Language Service Provider (LSP) bmmt. Tony O’Dowd and Maxim Khalilov from bmmt will discuss ‘machine translation for translation buyers: what is available and what is expected’. In this presentation, Tony and Maxim will give an overview of the current post-edited MT landscape and discuss with examples the formula for successful MT adoption, as well as what tools are available for global translation buyers. The full tool presentation program is available online on the tekom website.
What: Tool Presentation
Where: Room C10.1
When: Thursday 13th November @ 11:15 – 12:00
Personalized Platform Demonstrations
At the KantanMT exhibition booth, the KantanMT team will be giving personalized platform demonstrations that provide an ‘under the bonnet’ look at the cloud-based platform. The booth will be located in Hall C2 at booth A10, right next to bmmt, a German LSP and KantanMT preferred partner.
What: Personalized Platform Demonstrations
Where: KantanMT exhibition booth Hall C2, booth A10
When: Tuesday 11th – Thursday 13th November
Get the most out of the tekom/tcworld conference – meet the teams
Large conferences and events can often be overwhelming and it’s easy to lose track of time and get wrapped up in the buzz and excitement of the event. To make sure you get the most from the conference, keep organized and make an appointment to speak with a member of the KantanMT or bmmt team.
KantanMT team – contact Louise Irwin (email@example.com)
bmmt team – contact Peggy Lindner (firstname.lastname@example.org)
See you in Stuttgart!