In our last post, The Rise of PEMT, we discussed what automated post-editing means and why it is becoming more and more popular among Language Service Providers (LSPs). One of the most important things to remember about the post-editing process is that the less of it, the better.
In this post, we are going to look at some of the ways that you can keep your post-editing times to a minimum. This post is based on post-editing guidelines that have been developed by TAUS with, among others, KantanMT’s partners DCU and CNGL. A link to these guidelines is available at the end of this post.
7 steps to reducing your post-editing times
1. Train your KantanMT engine to improve translations
The quality of a KantanMT engine’s output increases as it is re-trained. This means running high-quality training data through it and re-training it using post-edited translations. The more you train your KantanMT engine with good training data, the more accurate its output will be. All of this means less post-editing time.
2. Make sure your training data is high quality
This rule stems directly from the previous point: a KantanMT engine’s accuracy will not improve if it is trained with poor-quality data. Poor-quality training data can be recognised by a number of signs, such as poor writing style, unaligned segments, and data that is not specific to the client’s domain. Keep your training data clean and well written.
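As a rough illustration of what "keeping training data clean" can look like in practice, here is a minimal Python sketch that drops obviously suspicious segment pairs before training. The function names, the checks, and the length-ratio threshold are our own assumptions for illustration; they are not part of KantanMT or the TAUS guidelines, and real cleaning pipelines usually do far more.

```python
def is_clean_pair(source: str, target: str, max_length_ratio: float = 3.0) -> bool:
    """Return True if a source/target segment pair passes basic sanity checks."""
    source, target = source.strip(), target.strip()
    # Empty segments usually indicate a misaligned or truncated pair.
    if not source or not target:
        return False
    # A target identical to its source is treated here as untranslated.
    if source == target:
        return False
    # A very lopsided word count often signals that the segments are not aligned.
    src_words, tgt_words = len(source.split()), len(target.split())
    ratio = max(src_words, tgt_words) / min(src_words, tgt_words)
    return ratio <= max_length_ratio


def filter_training_pairs(pairs):
    """Keep only the (source, target) pairs that pass is_clean_pair."""
    return [(s, t) for s, t in pairs if is_clean_pair(s, t)]
```

Heuristics like these only catch the crudest problems. Writing style and domain relevance still need a human reviewer.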
3. Writing style/Pre-editing
It is very important to make sure that source documents are well written and grammatically correct before they are translated. That means you should avoid misspellings and ambiguities, and make sure that sentences are grammatically complete. A Machine Translation engine does not correct writing errors, so make sure that these mistakes are corrected before the source text is translated. See our previous blogs, Style Guides in MT and How to Write for MT, for more information on this topic.
4. Terminology management
Ensure that terminology management is integrated “across source text authoring, Machine Translation and TM systems” (TAUS). Terminology management means defining terms and their rules of usage, and applying these definitions and rules consistently throughout a document. This helps to keep accuracy and readability consistent across translation outputs.
5. Set realistic timelines
Make sure that you assess the quality of raw Machine Translation output before agreeing upon a price and the size of the translation order. Naturally, the poorer the output, the more post-editing time will be required.
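One simple way to put a number on "how poor is the raw output" is to post-edit a small sample and measure how much of each segment had to change. The Python sketch below does this with the standard library's difflib. The scoring approach and the function names are our own assumptions for illustration, not a metric defined by KantanMT or TAUS.

```python
from difflib import SequenceMatcher


def edit_effort(raw_mt: str, post_edited: str) -> float:
    """Return a rough effort score: 0.0 means unchanged, 1.0 means fully rewritten."""
    raw_tokens = raw_mt.split()
    edited_tokens = post_edited.split()
    # ratio() is 1.0 for identical token sequences and 0.0 when nothing matches.
    similarity = SequenceMatcher(None, raw_tokens, edited_tokens, autojunk=False).ratio()
    return 1.0 - similarity


def average_effort(sample):
    """Average the effort score over (raw_mt, post_edited) pairs from a sample."""
    scores = [edit_effort(raw, edited) for raw, edited in sample]
    return sum(scores) / len(scores) if scores else 0.0
```

A higher average on the sample suggests that the full job will need more post-editing time, which can feed into the timeline and price you agree with the client.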
6. Decide upon a quality standard of post-editing
For some clients, an understandable document is all that is required. This means that stylistic issues are generally ignored but the meaning of the document is still accurately conveyed. For many clients, however, the content must be perfect, and this requires a degree of post-editing that also corrects stylistic issues. O’Brien et al., quoting Allen, say that the standard of post-editing output is determined by
• “User Requirements
• Quality Expectations
• Turn-Around Time
• Text Function”
Remember to agree upon a post-editing standard with your client. The lower the expected standard of output, the less time-consuming the post-editing process should be.
7. Use KantanMT’s Post-Editing Automation technology (PEX)
In our last post, The Rise of PEMT, we discussed the benefits for post-editors in using automated post-editing within their workflow. Here is a quick reminder:
A document has been translated by a KantanMT engine but there is a word that begins with a lower case letter which should begin with a capital letter. This mistake has been repeated throughout the document several hundred times. Rather than a post-editor having to manually find and correct each occurrence of this error, KantanMT’s PEX technology can find and correct the mistake with its rule system. You can find out more about PEX by clicking here.
This means that post-editors can save time and turn their attention to fixing more complex stylistic errors. All of this results in faster project completion times and lower costs.
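To make the idea of rule-based correction concrete, here is a minimal Python sketch that fixes a repeated capitalisation error with a single find-and-replace rule. This is not PEX's actual rule syntax, which we do not reproduce here; the `apply_rules` helper, the `RULES` list, and the brand name "acme" are all hypothetical and serve only to illustrate the general technique.

```python
import re


def apply_rules(text: str, rules) -> str:
    """Apply each (pattern, replacement) rule to the text, in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


# Hypothetical rule: the brand name "acme" should always be capitalised.
# \b restricts the match to whole words, so "acmeware" is left alone.
RULES = [
    (r"\bacme\b", "Acme"),
]

raw_output = "Open acme and log in. Then restart acme."
print(apply_rules(raw_output, RULES))
```

One rule corrects every occurrence in a single pass, whether the error appears twice or several hundred times.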
In our next post, we will look at guidelines for achieving both understandable post-editing output and high-quality post-editing output.