KantanMT.com was used in the course ‘Machine Translation and Post-editing,’ which was taught for the first time in the ‘Degree in Modern Languages Applied to Translation’ in UAH. English and Spanish were the main languages used during this course.
If the post-Black Friday sales numbers are anything to go by, there’s no question any more that the face of eCommerce is changing, and with it, the brick-and-mortar retailers have started rethinking their business strategy. As this news piece about Scotland experiencing a major dip in shoppers suggests, demand for online shopping will increase substantially in 2016. This in turn means that the need for content localization and translation for eTailers (online retailers) will be even more pressing during the coming year. As the often quoted Common Sense Advisory report points out, 72.4% of consumers are more likely to buy from a site that is in their native language. Indeed, localization is no longer a nice-to-have feature – it is now a must-have for all eCommerce businesses that aim to sell their products globally.
Chris Bishop, Managing Director of Microsoft Research, Cambridge, UK points out that “by 2026 we will have ubiquitous, human-quality translation among all European languages, thereby eliminating the language barrier throughout Europe.” Bishop’s prediction does not sound far off the mark at all when we take into account the fact that in the past ten years, Machine Translation (MT) has improved by leaps and bounds. Early MT was rules-based (RBMT) and required sets of linguistic rules, and it worked moderately well within a prescribed domain. However, this was resource intensive and cost prohibitive for many.
The turning point for using MT in business came with the advent of the Internet, the SaaS model and the open source development model for software. These changes in technology helped build the foundation for Statistical Machine Translation (SMT) research, and subsequently the open source development of the Moses Decoder. Moses enabled researchers and private companies to commercialise Statistical MT and develop it into the custom solutions it is today. The year 2016 and beyond will see further research in the fields of Natural Language Processing (NLP), deep learning and machine learning, contributing directly to immense improvements in the field of Custom MT.
The KantanMT Business Team published a new white paper, which provides an in-depth understanding of how eTailers in 2016 will be affected by Machine Translation. It also discusses how Custom Machine Translation (CMT), when compared to generic MT systems, will emerge as the clear winner in solving eTailing localization issues in the coming year.
Here are some of the highlights of how MT will evolve in 2016 for eTailers:
- eTailers will use either CMT alone or CMT combined with Human Post-Editing to reach new markets ahead of their competitors
- With increased multilingual customer demand for products, content translation will find support in auto scaling
- Custom Machine Translation will be used more widely as eCommerce customers expand globally
Machine Translation is no longer a luxury. It is an essential Tier 1 application for supporting global business. The purpose of this paper is to highlight how Machine Translation, and more importantly Custom Machine Translation technology, has come of age in terms of quality, speed and scalability. During 2016 and beyond, eTailers need to review their globalization strategies to reflect these advances in technology, so they can maximise their global growth potential.
It’s tough to make it in business today. Price pressures, quality demands, speed, and the need to be ‘different’ are things that every entrepreneur, and indeed every CEO and manager, struggles with on a daily basis. But not every entrepreneur, CEO or manager has the energy, time or means to address them all.
People in these positions need business solutions that enable them to create value for their customers and internally at the same time. They need a differentiation strategy that’s easy to implement – because it makes sense!
Wouldn’t efficiency make sense?
Efficiency is defined as “a measure of whether the right amount of resources have been used to deliver a process, service or activity. An efficient process achieves its objectives with the minimum amount of time, money, people or other resources.”
This would mean that efficient companies:
- Maximise the use of their resources, thus reducing costs.
And inefficient companies:
- Do not maximise their use of resources, and so take on more costs.
So who does efficiency benefit? Generally, both the company and the customer. Companies that carry out production processes in more efficient ways tend to pass on some of these savings to their customers, and ultimately create more value for both existing and potential clients.
Efficiency Approaches in Translation and Localization:
Productive and efficient processes seek to use a business’s or organisation’s resources in the most reliable and repeatable way in order to achieve the desired outcome. The language industry has long been a champion of using resources in this manner – to reduce costs and increase efficiencies. Some of the most pioneering technologies developed to do this include Translation Memory (TM), glossaries, and more recently Machine Translation (MT).
Translation Memory is a database that stores previously translated segments, in source and target format. Language Service Providers (LSPs) and translators use this tool to recycle previously translated segments (sentences) and reduce the number of new segments that translators need to translate. Research suggests that nearly all translators have experience using Translation Memories, but only roughly 30% use this technology on every project.
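To make the recycling idea concrete, here is a minimal sketch of a Translation Memory lookup. It is an illustration only, not how any particular TM tool works: the in-memory dictionary, the `lookup` function, the sample segments and the 0.75 similarity threshold are all assumptions, and the fuzzy score comes from Python’s standard `difflib` rather than from an industry match-rate formula.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory translation memory: source segment -> target segment.
TM = {
    "Add to basket": "Añadir a la cesta",
    "Free delivery on all orders": "Envío gratuito en todos los pedidos",
}

def lookup(segment, memory, threshold=0.75):
    """Return (target, score) for the closest stored segment, or None.

    The score is difflib's similarity ratio between 0.0 and 1.0; the
    threshold is an assumed cut-off for accepting a fuzzy match.
    """
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_source, best_score = source, score
    if best_source is not None and best_score >= threshold:
        return memory[best_source], best_score
    return None

if __name__ == "__main__":
    print(lookup("Add to basket", TM))      # exact match
    print(lookup("Add to the basket", TM))  # fuzzy match, reused for editing
    print(lookup("Hello", TM))              # no usable match, translate from scratch
```

An exact match can be reused as it stands, a fuzzy match gives the translator a starting point to edit, and a miss is a genuinely new segment.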
It is generally argued that using TMs greatly increases consistency throughout a project; however, some translators still insist that this is not the case. Either way, it is clear that LSPs are in favour of TMs, as are their clients. They see the use of TMs as an economic efficiency measure – recycling previous translations means that they can achieve their objectives with less use of expensive resources.
While Translation Memory was having its day (in the nineties), Statistical Machine Translation (SMT) very much sat in the background. When SMT was first exposed to the public domain, it was received with much applause. However, limited computational resources, high prices, and low quality meant that the industry could not feasibly integrate MT systems into its workflows.
Now, after many years of continuous development, we are witnessing a huge re-emergence of MT. LSPs today have multiple options for implementing MT within their translation workflows – consultancy companies, self-serve platforms (like KantanMT), and in-house systems.
Increasingly, LSPs are turning to MT not only because of quality and accessibility improvements, but also because they view it as an incredibly efficient use of resources. Machine Translation engines are developed through the statistical analysis of bilingual data (Translation Memories), meaning that LSPs can use TMs both for their original purpose and for generating Machine Translation engines – another tool for translation. Using TMs for MT development is an example of how smart LSPs are getting maximum value from their resources.
Glossaries are another helpful tool used by translation companies and translators to increase efficiencies during the translation process. This tool provides context for where translators should use terms or phrases, such as where a company name should remain untranslated – this reduces uncertainty in the minds of translators.
Maintaining consistent and up-to-date glossaries for translation jobs/companies can have a huge impact on both the length of time it takes to complete a project and the quality of the translations. KantanMT users can incorporate glossaries or terminology when deploying their MT engines by uploading glossaries with translation files, and also by using them as additional MT engine training material.
There are multiple online terminology management systems to choose from, the most popular being SDL’s MultiTerm. For companies who haven’t yet developed glossaries, there are different options out there to help get you started. Many LSPs, like Sajan, offer translation memory software which can crawl your TMs and find the most used terms – a quick approach that is perhaps best suited to companies who need glossaries ASAP but can afford a few imperfections. An alternative is to have an LSP work closely with in-country linguists to develop glossaries for your company. Whichever company or method you choose, you should see results in terms of productivity and quality in a very short space of time.
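As a rough illustration of the “crawl your TMs and find the most used terms” approach, the sketch below counts the most frequent words across a list of source segments. It is not how Sajan’s or any other vendor’s software works: the `candidate_terms` function, the tiny stop-word list and the sample segments are hypothetical, and the tokeniser only handles unaccented lowercase letters.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "in", "on", "your", "is", "for"}

def candidate_terms(segments, top_n=5):
    """Count non-stop-word tokens across segments; return the most frequent."""
    counts = Counter()
    for segment in segments:
        # Simplification: split on runs of a-z only, so accented words break up.
        for token in re.findall(r"[a-z]+", segment.lower()):
            if token not in STOP_WORDS:
                counts[token] += 1
    return counts.most_common(top_n)

if __name__ == "__main__":
    segments = [
        "Add the printer to your network",
        "Restart the printer",
        "The network is offline",
    ]
    for term, freq in candidate_terms(segments, top_n=2):
        print(term, freq)
```

A list like this is only a starting point: a linguist still needs to review the candidates, which is where the “few imperfections” of the quick approach come from.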
Efficiencies during the post-editing process:
Earlier we discussed how Machine Translation was helping to increase efficiencies during the translation process. On average, KantanMT clients have reported increases in translation productivity rates of 60% – most reaching translation rates of between 525 and 625 words per hour (on technical translation).
However, while MT has been proven to be a great efficiency tool, there are still times when it lets us down. One of the ways in which MT projects are slowed down is when repetitive errors occur throughout a document, e.g. misspellings or capitalization errors. The process of manually correcting these repetitive mistakes is the cause of much heartache for translators carrying out post-editing tasks. Manually correcting repetitive errors is a time-consuming and laborious activity, which requires a lot of self-discipline. Most people carrying out these activities will become very bored and uninterested in their task (naturally!).
But, as Machine Translation technology continues to develop, so too does complementary post-editing technology. Post-editing tools can help to dramatically reduce the level of human post-editing required for an MT project because they allow MT users to automatically fix repetitive errors throughout a document with just one action. KantanMT’s PEX technology enables users to do just that. PEX is a post-editing automation tool developed by KantanMT which reduces the manual effort required of post-editors. Using simple find and replace rules, users can easily fix repetitive errors within an entire document. By applying PEX rules to a machine translated document, MT users can significantly reduce the amount of human post-editing required.
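The general idea of rule-based post-editing can be sketched in a few lines. This is a generic find-and-replace illustration using Python regular expressions, not the actual PEX rule syntax or implementation; the `RULES` list, the `apply_rules` function and the sample text are all hypothetical.

```python
import re

# Hypothetical rules as (pattern, replacement) pairs. Not actual PEX syntax.
RULES = [
    (r"\bkantanmt\b", "KantanMT"),  # fix capitalisation of a product name
    (r"\brecieve\b", "receive"),    # fix a recurring misspelling
    (r" {2,}", " "),                # collapse runs of spaces
]

def apply_rules(text, rules):
    """Apply each rule to the whole text in order.

    Returns the corrected text and the total number of replacements made.
    """
    total = 0
    for pattern, replacement in rules:
        text, count = re.subn(pattern, replacement, text)
        total += count
    return text, total

if __name__ == "__main__":
    mt_output = "kantanmt users recieve  output. kantanmt is fast."
    fixed, fixes = apply_rules(mt_output, RULES)
    print(fixed)
    print(fixes, "corrections applied")
```

Each rule is written once and then applied to every occurrence in the document, which is what removes the repetitive manual corrections described above.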
These are some of the steps that users of KantanMT technologies implement in order to increase efficiencies in their companies. What steps are you and your company taking to increase translation efficiencies? Are you using translation technologies or do you have a different approach? Please share, we’d love to hear your comments.
Of course, not all of the tools above suit every project. However, as industries become increasingly global and the demand for a wider variety of translation solutions continues to grow, greater pressure is being put on LSPs to increase efficiencies and reduce costs. By maximising the use of TMs, glossaries and post-editing software, companies can work towards establishing efficiency measures which will result in increased productivity, increased profits, and higher staff morale.
Thanks for reading 🙂