Cutting PEMT Times

In our last post, The Rise of PEMT, we discussed what automated post-editing means and why it is becoming increasingly popular among Language Service Providers (LSPs). One of the most important things to remember about the post-editing process is that the less of it you need, the better.

In this post, we are going to look at some of the ways that you can keep your post-editing times to a minimum. This post is based on post-editing guidelines that have been developed by TAUS with, among others, KantanMT’s partners DCU and CNGL. A link to these guidelines is available at the end of this post.

7 steps to reducing your post-editing times

1. Train your KantanMT engine to improve translations
The quality of a KantanMT engine’s output increases as it is re-trained. This means running high-quality training data through it and re-training using post-edited translations. The more you train your KantanMT engine with good training data, the more accurate your engine’s output will be. All of this means less post-editing time.

2. Make sure your training data is high quality
This rule stems directly from the previous point: a KantanMT engine’s accuracy will not improve if it is trained with poor-quality data. Poor-quality training data can be recognised by signs such as poor writing style, unaligned segments, and data that is not specific to the client’s domain. Keep your training data clean and well-written.
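As a rough illustration of what "clean" can mean in practice, here is a minimal Python sketch that filters a list of (source, target) segment pairs before training. It is not part of KantanMT; the function name `clean_pairs` and the `max_ratio` threshold are our own assumptions, and a large difference in word count is only a crude hint that two segments may be misaligned.

```python
def clean_pairs(pairs, max_ratio=3.0):
    """Keep only (source, target) pairs that pass basic sanity checks.

    Hypothetical helper for illustration; max_ratio is an assumed threshold.
    """
    seen = set()
    kept = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # one side is empty, so the pair is unusable
        src_len, tgt_len = len(source.split()), len(target.split())
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_ratio:
            continue  # very different lengths: possibly misaligned
        if (source, target) in seen:
            continue  # exact duplicate of an earlier pair
        seen.add((source, target))
        kept.append((source, target))
    return kept


pairs = [
    ("Press the button.", "Appuyez sur le bouton."),
    ("", "Bonjour"),
    ("Yes", "Ceci est une phrase beaucoup trop longue pour ce segment"),
    ("Press the button.", "Appuyez sur le bouton."),
]
print(clean_pairs(pairs))
```

Checks like these do not replace a human review of writing style or domain fit, but they can remove the most obvious problems cheaply.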

3. Writing style/Pre-editing
It is very important to make sure that source documents are well written and grammatically correct before they are translated. That means you should avoid misspellings and ambiguities, and make sure that sentences are grammatically complete. A Machine Translation engine does not correct writing errors, so make sure that these mistakes are corrected before the source text is translated. See our previous blogs, Style Guides in MT and How to Write for MT, for more information on this topic.

4. Terminology management
Ensure that terminology management is integrated “across source text authoring, Machine Translation and TM systems” (TAUS). Terminology management means defining terms and their rules of usage, and implementing these definitions and rules throughout a document. This safeguards a consistent level of accuracy and legibility across translation outputs.
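To make the idea concrete, here is a minimal Python sketch of a terminology consistency check. It assumes a simple glossary that maps each source term to one approved target term; the function name `find_term_violations` and the sample glossary are hypothetical, and the plain substring matching used here ignores inflection and word boundaries.

```python
def find_term_violations(segments, glossary):
    """Return (segment index, source term, expected target term) for each
    segment whose source contains a glossary term but whose target lacks
    the approved translation. Hypothetical helper for illustration."""
    violations = []
    for index, (source, target) in enumerate(segments):
        for src_term, tgt_term in glossary.items():
            in_source = src_term.lower() in source.lower()
            in_target = tgt_term.lower() in target.lower()
            if in_source and not in_target:
                violations.append((index, src_term, tgt_term))
    return violations


glossary = {"printer": "imprimante"}  # assumed sample glossary
segments = [
    ("Turn on the printer.", "Allumez l'imprimante."),
    ("The printer is offline.", "L'imprimeur est hors ligne."),
]
print(find_term_violations(segments, glossary))
```

A report like this gives post-editors a short list of segments to check instead of a whole document to read.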

5. Set realistic timelines
Make sure that you assess the quality of raw Machine Translation output before agreeing upon a price and the size of the translation order. Naturally, the poorer the output, the more post-editing time will be required.
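One simple way to put a number on raw output quality is to post-edit a small sample and measure how many word-level edits were needed. The Python sketch below computes a word-level edit distance and divides it by the length of the post-edited text. This is our own simplified illustration, not a KantanMT feature or a TAUS-defined metric, and the function names are hypothetical.

```python
def word_edit_distance(hypothesis, reference):
    """Word-level Levenshtein distance between two strings."""
    hyp, ref = hypothesis.split(), reference.split()
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        current = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            current.append(min(
                previous[j] + 1,         # delete a word from the raw output
                current[j - 1] + 1,      # insert a missing word
                previous[j - 1] + cost,  # keep or substitute a word
            ))
        previous = current
    return previous[-1]


def edit_rate(raw_mt, post_edited):
    """Edits per word of the post-edited text (0.0 means no changes)."""
    ref_len = len(post_edited.split())
    if ref_len == 0:
        return 0.0
    return word_edit_distance(raw_mt, post_edited) / ref_len


print(edit_rate("the cat sat on mat", "the cat sat on the mat"))
```

A higher rate on the sample suggests that more post-editing time should be budgeted for the full order.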

6. Decide upon a quality standard of post-editing
For some clients, an understandable document is all that is required. This means that stylistic issues are generally ignored but the meaning of the document is still accurately conveyed. For many clients, however, the content must be perfect, and this requires a degree of post-editing that also incorporates corrections to stylistic issues. O’Brien et al., quoting Allen, say that the standard of post-editing output is determined by:

•    “User Requirements
•    Volume
•    Quality Expectations
•    Turn-Around Time
•    Perishability
•    Text Function”

Remember to agree upon a post-editing standard with your client. The lower the expected standard of output, the less time-consuming the post-editing process should be.

7. Use KantanMT’s Post-Editing Automation technology (PEX)
In our last post, The Rise of PEMT, we discussed the benefits for post-editors in using automated post-editing within their workflow. Here is a quick reminder:

A document has been translated by a KantanMT engine but there is a word that begins with a lower case letter which should begin with a capital letter. This mistake has been repeated throughout the document several hundred times. Rather than a post-editor having to manually find and correct each occurrence of this error, KantanMT’s PEX technology can find and correct the mistake with its rule system. You can find out more about PEX by clicking here.

This means that post-editors can save time and turn their attention to fixing more complex stylistic errors. All of this results in faster project completion times and lower costs.
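The general idea behind rule-based correction can be sketched in a few lines of Python. This is only an illustration of "find and replace" rules applied across a document; it does not show PEX's actual rule syntax, and the rules and the function name `apply_rules` are our own assumptions.

```python
import re

# Assumed example rules: (pattern to find, replacement text).
RULES = [
    (re.compile(r"\bkantanmt\b"), "KantanMT"),  # fix capitalisation of a name
    (re.compile(r" {2,}"), " "),                # collapse repeated spaces
]


def apply_rules(text, rules=RULES):
    """Apply each rule in order; return the corrected text and the
    total number of replacements made."""
    total = 0
    for pattern, replacement in rules:
        text, count = pattern.subn(replacement, text)
        total += count
    return text, total


raw = "Log in to kantanmt.  Then train your kantanmt engine."
corrected, changes = apply_rules(raw)
print(corrected)
print(changes)
```

Because each rule is applied to the whole text in one pass, an error repeated hundreds of times costs no more effort to fix than one that occurs once.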

In our next post, we will look at guidelines for achieving both understandable post-editing output and high-quality post-editing output.

TAUS Machine Translation Post-Editing Guidelines

You can find out more about KantanMT by visiting KantanMT.com and signing up for our free 14-day trial.

The Rise of PEMT

More companies want multilingual content produced cheaply and quickly by Language Service Providers (LSPs), and Machine Translation is becoming a more popular choice as a result.

TechNavio predicted that the market for Machine Translation will grow at a compound annual growth rate (CAGR) of 18.05% until 2016, and the report attributes a large part of this rise to “the rapidly increasing content volume”. Of course, while Machine Translation may help to cut costs and turnaround times, its success is ultimately judged on whether it can produce not only correct translations but also content that meets the quality standards of each individual client.

This places the spotlight firmly on the post-editing stage of the Machine Translation process. In this post, we are going to examine the Machine Translation post-editing stage and discuss how automatic post-editing can be incorporated into it.

What is Machine Translation post-editing?
Jeff Allen says the purpose of the post-editing stage, or more specifically the post-editor, is to “edit, modify, and/or correct pre-translated text that has been processed by an MT system from a source language into (a) target language(s)”. The most important thing to take from this is that post-editing is not the same as translation.

The fundamental aim of the post-editing process is to make Machine Translation output understandable or stylistically appropriate (depending on client requirements). Automatic post-editing is when computer technology is used to complete parts of the post-editing process.

Does this mean some stages of the post-editing process can be completely automated?
Not exactly. Automated post-editing is not an entirely mechanised process whereby a machine parses and corrects a document without human intervention. Humans must still proofread translation output and make sure that each client’s standards are met. However, post-editing technologies can automate a number of steps that would previously have required manual intervention and multiple edits by the post-editor.

As Bartolomé Mesa-Lao of Copenhagen Business School in Denmark says, the fewer edits required, the better a post-editor’s productivity. This is one of the main reasons why, in an age where companies want multilingual user content on demand, post-editing technologies are becoming increasingly important to LSPs. An example using KantanMT’s post-editing technologies as part of the post-editing process shows how this works:

A document has been translated by a KantanMT engine but there is a word that begins with a lower case letter which should begin with a capital letter. This mistake has been repeated throughout the document several hundred times. Rather than a post-editor having to manually find and correct each occurrence of this error, KantanMT’s PEX technology can find and correct the mistake using its “find and replace” rules. This means that post-editors can save time and turn their attention to fixing more complex stylistic errors. All of this results in faster project completion times and lower costs.

In our next post, we will look at some of the best practices you can use to make sure that you keep your post-editing times to a minimum.

You can find out more about Machine Translation and KantanMT by going to KantanMT.com and signing up for our free 14-day trial.