MT Quality Estimation – KantanAnalytics™


The newest addition to the KantanMT technology portfolio is KantanAnalytics™. KantanAnalytics, which has been co-developed with the CNGL Centre for Global Intelligent Content (Dublin City University, Ireland), assigns a quality estimation score to each automated translation generated by a KantanMT engine. Expressed as a percentage, the score predicts the rating a human translator would likely give the translation for its utility. KantanAnalytics helps Project Managers predict the cost and schedule of Machine Translation projects and creates new business model opportunities for the localization industry.

The commercialisation of Translation Memory technology in the early 1990s revolutionised the localization industry and led to increased productivity and translation performance. It also provided a new pricing model for the industry, one based on the type of translation memory match (referred to as a ‘fuzzy-match’). This pricing structure, tied to the fuzzy-match score, became an industry standard and an invaluable tool that Project Managers could use to provide an accurate cost analysis of translation projects and to predict the time needed to complete them.

With KantanAnalytics technology, Project Managers can apply a similar pricing structure when calculating the cost of Machine Translation or Post-Edited Machine Translation (PEMT) projects. Currently, Project Managers and translators use fixed charges for Machine Translation and PEMT, such as an hourly rate or a charge based on a fixed number of words. This method lacks precision and transparency, and it is not a sufficient cost calculation method to drive the wide-scale adoption of Machine Translation.

What this means for KantanMT Members

KantanMT Enterprise Members can use a two-pronged approach to measure Machine Translation quality. With KantanWatch, BLEU, TER and F-measure scores show the engine’s overall quality level during the training or development stage. KantanAnalytics is then used to analyse the quality of each segment generated by a KantanMT engine.

Using the KantanAnalytics report, which is akin to a ‘fuzzy-match’ report, Project Managers can determine the number of segments and the quality of each segment, and then estimate how long a project will take to complete and what it should cost.

This quality estimation score is expressed as a percentage: the higher the score, the better the quality of the segment and, consequently, the less effort required to post-edit it.

KantanAnalytics can be quickly deployed by Project Managers, and Enterprise Members can implement a tiered pricing model on Machine Translation jobs, similar to the one used for Translation Memory jobs. This is an excellent fit within existing business models, fusing two important industry technologies: Machine Translation and Translation Memory.
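To make the idea of tiered pricing concrete, here is a minimal Python sketch of how a per-segment quality estimation score could drive a post-editing quote, much as fuzzy-match bands do for Translation Memory. The tier boundaries, the rate fractions, the per-word rate and the names `TIERS`, `Segment`, `rate_fraction` and `project_cost` are all assumptions made up for illustration. They are not KantanMT's pricing or API.

```python
from dataclasses import dataclass

# Hypothetical tiers: (minimum score in %, fraction of the full per-word rate).
# Illustrative assumptions only, not KantanMT pricing.
TIERS = [(85, 0.30), (70, 0.60), (0, 1.00)]


@dataclass
class Segment:
    words: int        # word count of the source segment
    qe_score: float   # quality estimation score, 0-100


def rate_fraction(qe_score: float) -> float:
    """Return the fraction of the full rate charged for a given score."""
    for min_score, fraction in TIERS:
        if qe_score >= min_score:
            return fraction
    raise ValueError(f"score out of range: {qe_score}")


def project_cost(segments, full_rate_per_word: float) -> float:
    """Sum the post-editing cost over all segments."""
    return sum(
        s.words * full_rate_per_word * rate_fraction(s.qe_score)
        for s in segments
    )


# Made-up sample data: three segments with different scores.
segments = [Segment(10, 92.0), Segment(20, 74.5), Segment(5, 40.0)]
cost = project_cost(segments, full_rate_per_word=0.10)
print(f"Estimated post-editing cost: {cost:.2f}")
```

In this sketch a higher score falls into a cheaper tier, reflecting the lower post-editing effort expected for better segments. A real price list would set its own bands and rates.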

KantanAnalytics creates the framework for more accurate and more efficient cost management and deployment of Machine Translation throughout the localization industry.

KantanAnalytics User Interface (UI)

KantanAnalytics Report

Here is a quick look at the new KantanAnalytics interface. The KantanAnalytics report can be viewed in the Project Dashboard on KantanMT or downloaded as a Microsoft Excel file. The report is generated by clicking the graph icon located in the job status column.

The report results are shown when the report is expanded. To expand the report, click on ‘summary’ or ‘file name’. The results are represented in three graphs at the top of the report. In the screenshot below, Total Recall technology shows that 76% of the file for translation generated matches of 85% or higher. The second graph shows that 24% of the document had matches below 85%. The third graph shows the quality estimation scores in 10% increments. This data is also listed below the graphs in numerical form.
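For readers who download the scores and want to reproduce this kind of summary themselves, the following Python sketch computes the same three views from a list of per-segment scores: the share at or above 85%, the share below it, and a count per 10% band. It is only an illustration. The function name `summarise`, the sample scores and the choice to count segments (rather than words) are assumptions, and it does not read the actual Excel export.

```python
from collections import Counter


def summarise(scores, threshold=85):
    """Return (share at/above threshold, share below, counts per 10% band)."""
    if not scores:
        raise ValueError("no scores to summarise")
    total = len(scores)
    high = sum(1 for s in scores if s >= threshold)
    # Band label is the lower bound: 80 means 80-89; a score of 100 joins the 90 band.
    bands = Counter(min(int(s // 10) * 10, 90) for s in scores)
    return high / total, (total - high) / total, dict(sorted(bands.items()))


# Made-up sample scores, one per segment.
scores = [97, 91, 88, 85, 72, 64, 100, 86, 45, 89]
high_share, low_share, bands = summarise(scores)

print(f"Matches of 85% or higher: {high_share:.0%}")
print(f"Matches below 85%: {low_share:.0%}")
for lower, count in bands.items():
    print(f"{lower}% band: {count} segment(s)")
```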

KantanAnalytics Report

KantanAnalytics will be available to Enterprise Members of the KantanMT.com platform from 30th October. To sign up for the Enterprise Plan or to upgrade to this plan, please email KantanMT’s Success Coach, Kevin McCoy (kevinmcc@kantanmt.com).
