Motivate Post-Editors

Post-editing is a necessary step in the Machine Translation workflow, but the role is still largely misunderstood. Language Service Providers (LSPs) are now experimenting more with best practices for post-editing in the workflow. The lack of consistent training, and a reluctance within the industry to accept the importance of the role, are linked to post-editors' motivation. KantanMT looks at some of the more conventional attitudes towards motivation and how they apply to post-editing.

What is motivation and what studies have been done so far?

Understanding the concept of motivation has been a hot topic in many areas of organisational theory. Studies in the area really began to take off when applied to the workplace, opening doors for pioneers to understand how employees could be motivated to do more work, and better work.

Motivation Pioneers

  • Abraham Maslow's well-known ‘Hierarchy of Needs’ suggests that a person’s motivations depend on their position in the hierarchy pyramid.
  • Frederick Herzberg’s ‘Two-Factor Theory’, or motivation-hygiene theory, suggests that job satisfiers such as professional acknowledgement, achievement and work responsibility have a positive effect on motivation.
  • Douglas McGregor used a black and white approach to motivation in his ‘Theory X and Theory Y’. He grouped employees into two categories: those who will only do the minimum and those who will push themselves.

As development of theories continued…

  • John Adair came up with the ‘fifty-fifty theory’. According to this theory, motivation is fifty percent the responsibility of the employee and fifty percent outside the employee’s control.

Even more recently, in 2010

  • Teresa Amabile and Steven Kramer carried out a study on the motivation levels of employees in a variety of settings. Their findings suggest ‘progress’ is the top performance motivator, identified from an analysis of approximately 12,000 diary entries and daily ratings of motivation and emotion from hundreds of study participants.

To understand post-editor motivation, we can combine the top performance motivator, progress, with the fifty-fifty theory.

Progress is a healthy motivator in the post-editing profession; it can help Localization Project Managers understand and encourage post-editor satisfaction and motivation. But while progress can be deemed an external factor, if we apply Adair’s ‘fifty-fifty’ rule, post-editors are also at least fifty percent responsible for their own motivation.

Post-editing as a profession is still finding its feet. TAUS carried out a study in 2010 on the post-editing practices of global LSPs. The study showed that, while post-editing is becoming a standard activity in the translation workflow, it only accounts for a minor share of LSP business volume. This suggests that post-editors see their role as one of lesser importance because the industry views it as a role of lesser importance.

This attitude is highlighted by the lack of industry standards for post-editing best practices. Without evaluation practices to train post-editors and improve the post-editing process, post-editors cannot make progress, which is naturally demotivating.

How to motivate post-editors

The first step in motivating post-editors is to recognise their role as distinct from that of a translator. The best post-editors are those who are at least bilingual and have some form of linguistic training, like a translator. Linguistic training is a major asset for editing Machine Translated output.

TAUS offer a comparison of the translation process versus the post-editing process, highlighting the differences between the two.

Translation process of a Translator (TAUS 2010)
Translation process of a Post-editor (TAUS 2010)

One process is not more complicated than the other, only different. Translators translate internally, while post-editors make “snap editing decisions” based on client requirements. As LSPs recognise these differences, they can successfully motivate their post-editors by providing them with the most suitable support and work environment.

Progress as a Motivator

Translators make good post-editors: they have the linguistic ability to understand both the source and target texts, and if they enjoy editing or proofreading, the post-editing role will suit them. The right training is also important; if post-editors are trained properly, they will become more aware of potential improvements to the workflow.

These improvements or ideas can be a great boost to post-editor motivation. If implemented, they allow the post-editor to take on more responsibility, which helps improve the translation workflow. For example, if the post-editor is made responsible for updating the language assets used to retrain a Machine Translation system, they can take ownership of the output quality rather than just post-editing Machine Translation output in isolation.

Fixing repetitive errors can be frustrating for anyone, not just post-editors. But if they are responsible for the output quality, understand the system and can control the rules used to reduce these repetitive errors, they will experience motivation through progress.

This is only the tip of the iceberg when it comes to what motivates post-editors. Each post-editor is different, and how they feel about the role, whether it is just ‘another job’ or a major step in their career, all plays a part. The key is to provide proper training and foster an environment where post-editors can make progress by positively contributing to the role.

Translators often take pride in and ownership of their translations; post-editors should also have the opportunity to take pride in their work, as it is their skills and experience that make it ‘publishable’ or even ‘fit for purpose’ quality.

Repetitive errors involving, for example, diacritic marks or capitalisation can be easily fixed using KantanMT’s Post-Editing Automation (PEX) rules. PEX rules allow repetitive errors in Machine Translation output to be fixed using a ‘find and replace’ approach, and the rules can be checked on a sample of the text using the PEX Rule Editor.

The post-editor can correct repetitive errors during the post-editing process so the same errors don’t appear in future MT output, giving them responsibility for the Machine Translation engine’s quality.

Automatic Post-Editing

Post-Editing Machine Translation (PEMT) is an important and necessary step in the Machine Translation process. KantanMT is releasing a new, simple and easy-to-use PEX Rule Editor, which will make the post-editing process more efficient, saving time, costs and the post-editor’s sanity.

As we have discussed in earlier posts, PEMT is the process of reviewing and editing raw MT output to improve quality. The PEX rule editor is a tool that can help to save time and cut costs. It helps post-editors, since they no longer have to manually correct the same repetitive mistakes in a translated text.

Post-editing can be divided into roughly two categories: light and full post-editing. ‘Light’ post-editing, also called ‘gist’, ‘rapid’ or ‘fast’ post-editing, focuses on transferring the correct meaning without spending time correcting grammatical and stylistic errors. Correcting textual standards such as word order and coherence is less important in a light post-edit than in a more thorough ‘full’ or ‘conventional’ post-edit. A full post-edit requires the correct meaning to be conveyed, with correct grammar, accurate punctuation, and the correct transfer of any formatting such as tags or placeholders.

The client often dictates the type of post-editing required, whether it’s a full post-edit to reach ‘publishable quality’, similar to a human translation standard, or a light post-edit, which usually means ‘fit for purpose’. The engine’s quality also plays a part in the post-editing effort: using a high volume of in-domain training data during the build produces higher quality engines, which helps to cut post-editing effort. Other factors such as language combination, domain and text type all contribute to post-editing effort.

Examples of repetitive errors

Some users may experience the following errors in their MT output.

  • Capitalization
  • Punctuation mistakes, hyphenation, diacritic marks etc.
  • Words added/omitted
  • Formatting – trailing spaces

Repetitive errors like these can be targeted with pattern matching. Regular expressions, or ‘regex’, are special text strings that describe patterns; because these patterns need no linguistic analysis, they can be implemented easily across different language pairs. Regular expressions are also important components in developing PEX rules, and KantanMT have a list of regular expressions used for both GENTRY Rule files (*.rul) and PEX post-edit files (*.pex).

Post-Editing Automation (PEX)

Repetitive errors can be fixed automatically by uploading PEX rule files. These rule files allow post-editors to spend less time correcting the same repetitive errors by automatically applying PEX constructs to translations generated from a KantanMT engine.

PEX works by incorporating “find and replace” rules. The rules are uploaded as a PEX file and applied while a translation job is being run.
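
To make this concrete, here is a minimal sketch of how regex-based find and replace rules could be applied to raw MT output. The rule list, patterns and function name are illustrative only; they are not KantanMT’s actual PEX file syntax.

import re

# Illustrative find/replace rules targeting the repetitive error types listed
# above (formatting, punctuation spacing, capitalisation). Not real PEX syntax.
RULES = [
    (r"\s+$", ""),                  # strip trailing spaces
    (r"\s+([,.;:!?])", r"\1"),      # remove stray space before punctuation
    (r"\bkantanmt\b", "KantanMT"),  # fix a recurring capitalisation error
]

def apply_rules(segment, rules=RULES):
    """Run every find/replace rule over one translated segment."""
    for pattern, replacement in rules:
        segment = re.sub(pattern, replacement, segment)
    return segment

raw_mt = "kantanmt fixes a space before the comma , and trailing spaces   "
print(apply_rules(raw_mt))
# -> "KantanMT fixes a space before the comma, and trailing spaces"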

PEX Rule Editor

KantanMT have designed a simple way to create, test and upload post-editing rules to a client profile.

KantanMT PEX Rule Editor

The PEX Rule Editor, located in the ‘MykantanMT’ menu, has an easy-to-use interface. Users can copy a sample of the translated text into the upper text box, ‘Test Content’, then enter the rules to be applied in the ‘PEX Search Rules’ box and their corrections in the ‘PEX Replacement Rules’ box. The user can test the new rules by clicking ‘test rules’ and instantly identify any incorrect rules before they are uploaded to the profile.

The introduction of tools to assist in the post-editing process helps remove some of the more repetitive corrections for post-editors. The new PEX Rule Editor feature helps improve the PEMT workflow by ensuring all uploaded rule files are correct, leading to a more effective method for fixing repetitive errors.

Conference and Event Guide – December 2013

Things are winding down as we get closer to the end of the year, but there are still some great events and webinars coming up during the month of December to look forward to.

Here are some recommendations from KantanMT to keep you busy in the lead up to the festive season.

Listings

Dec 02 – Dec 05, 2013
Event: IEEE CloudCom 2013, Bristol, United Kingdom

Held in association with Hewlett-Packard Laboratories (HP Labs), the conference is open to researchers, developers, users, students and practitioners from the fields of big data, systems architecture, services research, virtualization, security and high performance computing.


Dec 04, 2013
Event: LANGUAGES & BUSINESS Forum – Hotel InterContinental Berlin

The forum highlights key issues in language education, particularly in the workplace, and the new technologies that are becoming a key part of the process. The event will promote international networking and has four main themes: Corporate Training, Pre-Experience Learners, Intercultural Communication and Online Learning.


Dec 05, 2013
Webinar: Effective Post-Editing in Human and Machine Translation Workflows

Stephen Doherty and Federico Gaspari of CNGL (Centre for Next Generation Localisation) will give an overview of post-editing and different post-editing scenarios, from ‘gist’ to ‘full’ post-edits. They will also give advice on different post-editing strategies and how they differ for Machine Translation systems.


Dec 07 – Dec 09, 2013
Event: 6th Language and Technology Conference, Poznan, Poland

The conference will address the challenges of Human Language Technologies (HLT) in computer science and linguistics. The event covers a wide range of topics, including electronic language resources and tools, formalisation of natural languages, parsing and other forms of NL processing.


Dec 09 – Dec 13, 2013
Event: IEEE GLOBECOM 2013 – Power of Global Communications, Atlanta, Georgia USA

The conference, run by the second largest of the 38 IEEE technical societies, will focus on the latest advancements in broadband, wireless, multimedia, internet, image and voice communications. Some of the localization-related topics are presented on 10 December and include Localization Schemes, Localization and Link Layer Issues, and Detection, Estimation and Localization.


Dec 10 – Dec 11, 2013
Event: Game QA & Localization 2013, San Francisco, California USA

This event brings together QA and Localisation Managers, Directors and VPs from game developers around the world to discuss key game localization industry challenges. The London event in June 2013 was a huge success, with more than 120 senior QA and localization professionals from developers, publishers and third-party suppliers of all sizes and platforms coming to learn, benchmark and network.


Dec 11 – Dec 15, 2013
Event: International Conference on Language and Translation, Thailand, Vietnam and Cambodia

The Association of Asian Translation Industry (AATI) is holding an International Conference on Language and Translation, or “Translator Day”, in three countries: Thailand on December 11, 2013, Vietnam on December 13, 2013, and Cambodia on December 15, 2013. The events provide translators, interpreters, translation agencies, foreign language centres, NGOs, FDI-financed enterprises and other translation purchasers with opportunities to meet.


Dec 12, 2013
Webinar: LSP Partnerships & Reseller Programs 16:00 GMT (11:00 EST/17:00 CET)

This webinar, hosted by GALA and presented by Terena Bell, covers how to open up new revenue streams by introducing reseller programs to current business models. The webinar is aimed at world trade associations, language schools, and other non-translation companies wishing to offer their clients translation, interpreting, or localization services.


Dec 13 – Dec 14, 2013
Event: The Twelfth Workshop on Treebanks and Linguistic Theories (TLT12), Sofia (Bulgaria)

The workshops, hosted by the BulTreeBank Group, serve to promote new and ongoing high-quality work related to syntactically annotated corpora such as treebanks. Treebanks are important resources for Natural Language Processing applications, including Machine Translation and information extraction. The workshops will focus on different aspects of treebanking: descriptive, theoretical, formal and computational.


Are you planning to go to any events during December? KantanMT would like to hear your thoughts on what makes a good event in the localization industry.

Crowdsourcing vs. Machine Translation

Crowdsourcing has been growing in popularity with both organizations and companies since the concept’s introduction in 2006, and has been adopted by companies using this new production model to improve their production capacity while keeping costs low. The web-based business model uses an open call format to reach a wide network of people willing to volunteer their services for free or for a limited reward, for any activity including translation. The application of crowdsourcing models to translation has opened the door to increased demand for multilingual content.

Jeff Howe of Wired magazine defined crowdsourcing as:

“…the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call”.

Crowdsourcing costs equate to approximately 20% of the cost of a professional translation. Language Service Providers (LSPs) like Gengo and Moravia have realised the potential of crowdsourcing as part of a viable production model, which they are combining with professional translators and Machine Translation.

The crowdsourcing model is an effective method for translating the surge in User Generated Content (UGC). Erratic fluctuations in demand need a dynamic, flexible and scalable model. Crowdsourcing is definitely a feasible production model for translation services, but it still faces some considerable challenges.

Crowdsourcing Challenges

  • No specialist knowledge – crowdsourcing is difficult for technical texts that require specialised knowledge. It often involves breaking a text down into smaller sections to be sent to each volunteer. A volunteer may not be qualified in the domain area, so they end up translating small sections of text, out of context and with limited subject knowledge, which leads to lower quality or mistranslations.
  • Quality – translation quality is difficult to manage, and is dependent on the type of translation. There have been some innovative suggestions for measuring quality, including automatic evaluation metrics such as BLEU and METEOR, but these are costly and time-consuming to implement and need a reference translation or ‘gold standard’ to benchmark against (see the simplified sketch after this list).
  • Security – crowd management can be a difficult task and the moderator must be able to vet participants and make sure that they follow the privacy rules associated with the platform. Sensitive information that requires translation should not be released to volunteers.
  • Emotional attachment – humans can become emotionally attached to their translations.
  • Terminology and writing style inconsistency – when the project is divided amongst a number of volunteers, the final version’s style needs to be edited and checked for inconsistencies.
  • Motivation – decisions on how to motivate volunteers and keep them motivated can be an ongoing challenge for moderators.
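
The sketch below illustrates, under simplifying assumptions, why reference-based metrics need a ‘gold standard’: it computes a stripped-down sentence-level BLEU-style score (unigram and bigram precision with a brevity penalty) by comparing a hypothesis against a single reference. Production metrics such as BLEU and METEOR are considerably more elaborate.

import math
from collections import Counter

def simple_bleu(hypothesis, reference, max_n=2):
    """Stripped-down BLEU: geometric mean of modified n-gram precisions
    (n = 1..max_n) multiplied by a brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum(min(c, ref_ngrams[ng]) for ng, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)   # smoothed to avoid log(0)
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    brevity_penalty = min(1.0, math.exp(1 - len(ref) / max(len(hyp), 1)))
    return brevity_penalty * geo_mean

print(simple_bleu("the post editor fixed the output",
                  "the post-editor fixed the output"))   # roughly 0.52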

Improvements in the quality of Machine Translation have had an influence on crowdsourcing’s popularity, and the majority of MT post-editing and proofreading tasks fit nicely into crowdsourcing models. Content can be classified into ‘find-fix-verify’ phases and distributed easily among volunteers.

There are some advantages to be gained when pairing MT technology and collaborative crowdsourcing.

Combined MT/Crowdsourcing

Machine Translation will have a pivotal role to play within new translation models, which focus on translating large volumes of data in cost-effective and powerful production models. Merging Machine Translation and crowdsourcing tasks will create not only fit-for-purpose but also high-quality translations.

  • Quality – as the overall quality of Machine Translation output improves, it is easier for crowdsourcing volunteers with less experience to generate better quality translations. This will in turn increase the demand for crowdsourcing models to be used within LSPs and organizations. MT quality metrics will also make post-editing tasks more straightforward and easier to delegate among volunteers based on their experience.
  • Training data – word alignment and engine evaluations can be done through crowd computing, and parallel corpora created by volunteers can be used to train and/or retrain existing SMT engines.
  • Security – customized Machine Translation engines are more secure when dealing with sensitive product or client information. General or publicly available information is more suited to crowdsourcing.
  • Terminology and writing style consistency – writing style and terminology can be controlled and updated through a straightforward process when using MT. This avoids the idiosyncrasies of volunteer writing styles. There is no risk of translator bias when using Machine Translation.
  • Speed – Statistical Machine Translation (SMT) engines can process translations quickly and efficiently. When a high volume of content needs to be translated within a short period of time, it is better to use Machine Translation. Output is guaranteed within a designated time, and crowdsourcing post-editing tasks speeds up the production process before final checks are carried out by experienced translators or post-editors.
Use of crowdsourcing for software localization. Source: V. Muntes-Mulero and P. Paladini, CA Technologies and M. Solé and J. Manzoor, Universitat Politècnica de Catalunya.

Last chance for a FREE TRIAL for KantanAnalytics™ for all members until November 30th 2013. KantanAnalytics will be available on the Enterprise Plan.

Pricing PEMT 2

Segment-by-segment Machine Translation Quality Estimation (QE) scores are reforming current Language Service Provider (LSP) business models.

Pricing Machine Translation is one of the most widely debated topics within the translation and localization industries. Many agree that there is no ‘black and white’ approach, because a number of variables must always be taken into consideration when costing a project. Industry experts are in agreement that levels of post-editing effort and payment should be calculated through a fair and easily replicated formula. This transparency is the goal KantanMT had in mind during the development of KantanAnalytics™, a “game-changing” technology in the localization industry.

New Business Model

The two greatest challenges facing Localization Project Managers are how to cost and how to schedule Machine Translation projects. Experienced PMs can quickly gauge how long a project will take to complete, but there is still an element of guesswork and contingency planning involved. This is intensified when Machine Translation is added. Although Machine Translation is not a new technology, its practical application in a business environment is still in its infancy.

Powerful Machine Translation engines can be easily integrated into an LSP workflow. Measuring Machine Translation quality on a segment-by-segment basis and calculating post-editing effort on those segments allows LSPs to create more streamlined business models.

Studies have shown post-editing Machine Translation can be more productive than translating a document from scratch. This is especially true when translators or post-editors have a broad technical or subject knowledge of the text’s domain. In these cases they can capitalise on their knowledge with higher post-editing productivity.

So, how should a Machine Translation pricing model look?

The development of a technology that can evaluate a translation on a segment-by-segment basis and assign an accurate QE score to a Machine Translated text is critical for the successful integration of this technology into a project’s workflow.

The segment-by-segment breakdown and ‘fuzzy match’ percentage scoring system enabled the commercialisation of Translation Memories in LSP workflows. This system has been adopted as an industry standard for pricing translation jobs where translation memories or Computer Aided Translation (CAT) tools can be implemented. The next natural evolution is to create a similar tiered ‘fuzzy’ matching system for Machine Translation.

Segment-level QE technology is now available: Machine Translated segments are assigned percentage match values, similar to translation memory match values, and post-editing costs can be assigned in the same way translation memory matches are costed. The match value also gives a clear indication of how long a project should take to post-edit, based on the quality of the match and the post-editor’s skills and experience.
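
As a purely hypothetical illustration of such a tiered model, the sketch below prices each segment from its word count and QE match score. The bands and per-word rates are invented for the example; real rates would be negotiated per project and language pair.

# Hypothetical QE-score bands mapped to per-word post-editing rates.
RATE_BANDS = [
    (95, 0.03),   # 95-100% match: light review
    (85, 0.06),   # 85-94% match
    (75, 0.09),   # 75-84% match
    (0,  0.12),   # below 75%: close to a full translation rate
]

def segment_cost(word_count, qe_score):
    """Price one MT segment from its word count and QE match score (0-100)."""
    for threshold, rate in RATE_BANDS:
        if qe_score >= threshold:
            return word_count * rate

segments = [(12, 97.0), (20, 82.5), (8, 61.0)]   # (words, QE score)
total = sum(segment_cost(words, score) for words, score in segments)
print(f"Estimated post-editing cost: {total:.2f}")   # 0.36 + 1.80 + 0.96 = 3.12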

How can we trust the quality score?

A Machine Translation engine’s quality is based on the quality of the training data used to build it, and it can be monitored with BLEU, F-Measure and TER scores. These automatic evaluation metrics indicate the engine’s quality and, combined with the ‘fuzzy’ match score, give a more accurate picture of how post-editing effort should be calculated and how projects should be priced. A number of variables dictate how to create and implement a pricing model.

Variables to be considered when creating a pricing model

The challenge in measuring PEMT stems from a number of variables, which need to be considered by PMs when creating a pricing model:

  • Intended purpose – does the text require a light, fast or full post-edit?
  • Language pair and direction – Romance languages tend to produce better MT output
  • Quality of the MT system – better quality, domain specific engines produce better results
  • Post-editing effort – the degree of editing required, from minor edits to a full retranslation
  • Post-editor skill and experience – post-editors with extensive domain expertise post-edit faster

Traditional Models

To overcome these challenges, PMs have traditionally opted for hourly or daily rates. However, hourly rates do not provide enough transparency or cost breakdown and can make a project difficult to schedule. These rates must also be calculated to take into consideration the translator or post-editor’s productivity and language pair.

Rates are usually calculated based on the translator or post-editor’s average post-editing speed within the specified domain. Day rates can be a good cost indicator for PMs, based on the post-editor’s capabilities and experience, but again the cost breakdown is not completely transparent. Difficulties usually occur when a post-editor comes across a part of the text that requires more time or effort to post-edit; productivity then automatically drops.

Opinions in the translation community differ, and pricing PEMT depends on the post-editing circumstances. Some posters on the Proz.com forum suggest pricing PEMT at 30-50% of a full translation, or similarly to editing a human translation. Others suggest pricing the output of a Machine Translation system around the same as a ‘fuzzy’ match of 50-74% from a translation memory. These are broad, subjective figures which do not take the variables above into consideration.

Scoring the Machine Translated text on a segment-by-segment basis allows PMs to calculate post-editing effort based on the quality of customised Machine Translation engines. PMs can then use these calculations to build an accurate pricing model for the project, one which incorporates all relevant variables. It also makes it possible to distribute post-editing work evenly across translators and post-editors, making the most efficient use of their skills. The benefits of calculating post-editing effort are also seen in scheduling and project turnaround times.


KantanAnalytics™ is a segment-by-segment quality estimation scoring technology, which when applied to a Machine Translated text will generate a quality score for each segment, similar to the fuzzy match scoring system used in translation memories.

Sign up for a free trial to experience KantanAnalytics until November 30th, 2013. KantanAnalytics will be available on the Enterprise Plan; to sign up or upgrade to this plan, please email KantanMT’s Success Coach, Kevin McCoy (kevinmcc@kantanmt.com).

Training Data

Building a KantanMT Engine: Training Data

When the decision is made to incorporate a KantanMT engine into a translation model, the next obvious and most difficult question to answer is: what should be used to train the engine? This is often followed by: what are the optimum training data requirements to yield a highly productive engine, and how will the training data be curated?

The engine’s target domain and objectives should be clearly mapped out ahead of the build. If the documents are for a specific client or domain then the relevant in-domain training data should be used to build the engine. This also ensures the best possible translation results.

KantanMT recommends a minimum of 2 million training words for each domain specific engine. Higher quantities of in-domain “unique words” will also improve the potential for building an “intelligent” engine.
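
Before a build, it can be worth running a quick check of the corpus against that recommendation. The sketch below counts total and unique source-language words in a plain-text export; the file name is a placeholder for whatever export of your translation assets you have to hand.

def corpus_stats(path):
    """Count total and unique whitespace-separated words in a text file."""
    total_words = 0
    unique_words = set()
    with open(path, encoding="utf-8") as corpus:
        for line in corpus:
            tokens = line.lower().split()
            total_words += len(tokens)
            unique_words.update(tokens)
    return total_words, len(unique_words)

total, unique = corpus_stats("legal_domain_source.txt")   # placeholder file name
print(f"{total:,} total words, {unique:,} unique words")
print("Meets the 2M-word recommendation" if total >= 2_000_000 else "More data needed")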

The quality of the engine is based on the language or translation assets used to build it. Studies by TAUS have shown that quality is more important than quantity: “intelligently selected training data” generated higher BLEU scores than an engine built with more generic data. The studies also indicated that a proactive approach to customising or adapting the engine with translation assets led to better quality results.

Translation assets are the best source of suitable training data for building KantanMT engines; they include:

Stock Training Data: KantanMT stock engines are collections of highly cleansed bi-lingual training data sets. Quality is ensured as each data set shows the source corpora and approximate number of words used to create each stock engine. These can be added to client data to produce much larger and more powerful engines. There are over a hundred different stock engines to choose from, including industry specific sets such as IT, Legal, Medical and Finance. Find a list of KantanMT Stock engines here >>

Stock engines are a good starting point if you have limited TMX (Translation Memory Exchange) files in the required domain, or if you would simply like to build bigger KantanMT engines.

Translation Memory Files: This is the best source of high quality training data, since both source and target texts are aligned. Translation Memories used for previous translations in a similar domain will also have been verified for quality, which means the engine’s quality will be representative of the Translation Memory quality. As the old expression in the translation industry goes, “garbage in, garbage out”: good quality Translation Memory files will yield a good quality Machine Translation engine. The TMX file format is the optimal format for use with KantanMT; however, text files can also be used.
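
For readers curious what ‘aligned source and target texts’ look like in practice, here is a minimal sketch of reading segment pairs out of a TMX file with Python’s standard library. The file name and language codes are placeholders, and real-world TMX files can carry inline tags and metadata that would need extra handling.

import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"   # the xml:lang attribute

def read_tmx(path, src_lang="en", tgt_lang="fr"):
    """Return (source, target) segment pairs from a TMX file."""
    pairs = []
    for tu in ET.parse(path).iter("tu"):          # one translation unit per pair
        segments = {}
        for tuv in tu.iter("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None and seg.text:
                segments[lang.split("-")[0]] = seg.text.strip()
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs

for source, target in read_tmx("client_memory.tmx")[:3]:   # placeholder file name
    print(source, "->", target)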

Monolingual Translated Text Files: Monolingual text files are used to create language models for a KantanMT engine. Language models are used for word and phrase selection and have a direct impact on the fluency and recall of KantanMT engines. Translated monolingual training data should be uploaded alongside bi-lingual training data when building KantanMT engines.

Glossary Files: Terminology or glossary files can also be used as training material. Including a glossary improves terminology consistency and translation quality. Terminology files are uploaded with your ‘files to be translated’ and should also be in a TBX file format.

KantanISR™: Instant Segment Retraining technology allows users to input edited segments via the KantanISR editor. The segments then become training data and are stored in the KantanISR cache. The new segments are incorporated into the engine without the need for a full rebuild. As corrected data is included, the engine will improve in quality, becoming an even more powerful and productive KantanMT engine.

KantanISR editor
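
As a purely conceptual illustration of the general idea, the sketch below caches corrected segments and consults the cache before calling the engine, so a fix takes effect immediately without a rebuild. It is not KantanISR’s actual implementation; the class and function names are invented for the example.

class SegmentCache:
    """Stores post-edited segments keyed by their source text (conceptual only)."""
    def __init__(self):
        self._corrections = {}

    def add(self, source, corrected_target):
        self._corrections[source.strip()] = corrected_target.strip()

    def lookup(self, source):
        return self._corrections.get(source.strip())

def translate(source, cache, mt_engine):
    """Prefer a cached human correction; otherwise fall back to the engine."""
    cached = cache.lookup(source)
    return cached if cached is not None else mt_engine(source)

cache = SegmentCache()
cache.add("Press the power button.", "Appuyez sur le bouton d'alimentation.")
print(translate("Press the power button.", cache, mt_engine=lambda s: "<raw MT>"))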

Building your KantanMT engine can be a very rewarding process. While some time is needed to gather the best data for a domain specific engine, there are many ways to enhance your engine that require little effort.

For more information about preparing training data or engine re-training, please contact Kevin McCoy, KantanMT Success Coach.

PEMT Standards

In this blog series, we are discussing the area of post-editing. In our earlier posts, ‘The Rise of PEMT‘ and ‘Cutting PEMT Times‘, we discussed the meaning of automated post-editing, why its popularity is growing among Language Service Providers (LSPs), and how you can cut your post-editing times.

Machine Translated text can be post-edited to different quality levels. This post is based on post-editing guidelines that have been developed by TAUS with, among others, KantanMT’s partners DCU and CNGL. A link to these guidelines is available at the end of this post.

Post-editing to an understandable level
An understandable level of post-editing is a standard by which the main content of the message is correct and understandable to the user. However, the document’s readability may not be perfect and there may be a number of styling errors. Correct styling, however, is not essential as long as the main message content is understandable.

Follow these rules to post-edit a translated text to an understandable level

  • Ensure that the meaning of the translated text is the same as the source text and that it is understandable to the user
  • Read through the document to make sure that there is no missing or excess information
  • Because the translation is part of the localization process, make sure that the content is not offensive or culturally insensitive
  • Correct basic spelling errors
  • Errors that only affect the styling of the document do not need to be changed, so there is no need to correct the following sentence: “Kantanmt is cloud based statistical machine translator platform”. Note: the stylistically correct version is “KantanMT is a cloud-based Statistical Machine Translation platform”
  • Remember that the fewer post-edits there are the better – use as much of the original Machine Translation output as possible
  • Don’t restructure sentences to improve the flow if the meaning is comprehensible


Post-editing to a quality standard similar to human translation
TAUS defines this level as being “comprehensible (i.e. an end-user perfectly understands the content of the message), correct (i.e. it communicates the same meaning as the source text), stylistically fine, though the style may not be as good as that achieved by a native-speaking human translator. Syntax is normal, grammar and punctuation are correct”.

Follow these rules to post-edit a translated text to this standard

  • Ensure that content is grammatically complete and structured logically, and that the meaning of the message is clear to the user
  • Check the translation of terms that are essential to the document and make sure that any untranslated terms were left untranslated at the client’s request
  • Read through the document to make sure that there is no missing or excess information
  • Because the translation is part of the localization process, make sure that the content is not offensive or culturally insensitive
  • Remember that the fewer post-edits there are the better – use as much of the original MT output as possible
  • Correct spelling errors and make sure that the document is correctly punctuated and well formatted

And that’s it! For errors such as misspellings or formatting mistakes, you can use KantanMT’s PEX technology to find and correct any repetitive errors throughout a document. This will help to speed up post-editing times while reducing post-editing costs.

TAUS Machine Translation Post-Editing Guidelines

You can find out more about KantanMT by visiting KantanMT.com and signing up to our free 14-day trial.