Last Thursday KantanMT hosted a webinar which introduced some of the latest breakthroughs in machine translation technology. Joined by Maxim Khalilov, Sr. Machine Translation Lead at bmmt, Tony O’Dowd (Founder, KantanMT.com) explained how these technologies are helping Language Service Providers and Enterprises to develop and manage MT systems in a completely transparent environment.
BuildAnalytics offers MT developers deep insight into their MT engines using distributed scoring, Gap Analysis, Rejects Report and Timeline features. Incorporating these insight tools into the MT development process means a shift away from the “Black Box” that many MT users have experienced, and a move to MT developer empowerment.
After the presentation, both presenters took part in a Question and Answer session. The questions and answers, including questions that were not addressed during the session, are listed below:
1Q. Can KantanMT only be used on your remote server? Is it possible to run it on our own server (because we have a confidentiality agreement with our customers)?
A. Yes – You can install a Run-time license of KantanMT.com onto your own server.
2Q. Is KantanMT an “out-of-the box” solution or do you have to program it to fit your particular environment?
A. KantanMT.com is a platform that you can use to customise, improve and deploy MT systems for your company. While KantanMT.com comes with over 5 billion words of stock training data sets, our clients typically customise their own engines for their content and translation style. This ensures that the engine is finely tuned to their content, reducing post-editing and improving quality and consistency.
3Q. Is there support from KantanMT in building your MT engine or is it mainly a (self-explanatory) DIY process?
A. KantanMT.com provides all the tools and utilities you need to customise your own engines. However, some clients want us to build these engines for them. This is a service engagement and can be organised by contacting email@example.com. There is also a range of free quick-start videos that walk you through the steps of building your own engine.
4Q. Do you provide consultant services concerning guidelines/training for post editing?
A. Yes, KantanMT provides both consultancy and training for companies that wish to deploy MT within their organisations. You can contact firstname.lastname@example.org for pricing and timing information regarding these services.
5Q. If I train my KantanMT engine, do you have access to this material? Are you using it?
A. No – everything you upload to the KantanMT.com servers is fully encrypted and stored under your account name and secure password. No one else has access to your files or training material. KantanMT will never re-task, re-purpose or re-publish your data in any form, and should you cancel your account, your data will be removed from our servers automatically.
6Q. What differentiates you from your major competitors?
A. KantanMT.com is a complete MT development and management platform that enables our clients to customise, improve and deploy high quality, production-ready MT systems. The platform is 100% cloud-based, easy to access and extremely fast to operate! No other platform provides the detailed data analytics and visualisation we do. KantanAnalytics provides quality estimation scores for every segment, similar to Translation Memory fuzzy match scores, and it is revolutionising the MT market. Kantan BuildAnalytics helps our clients customise, improve and deploy high quality engines in a fraction of the time traditional MT providers take. Coupled with GENTRY, PEX and Kantan Pre-processor technology, KantanMT is simply the most technologically advanced MT system available in the market today.
7Q. Does KantanMT (Machine Translation) work well with Trados Studio Professional 2014?
A. Yes – You can translate TRADOS Studio files directly or use our Studio plug-in (developed in conjunction with SDL) to provide in-place, real-time MT services.
8Q. Are all language combinations equally suitable for MT?
A. No – not all languages are created equal when it comes to MT. Romance languages tend to do better than Germanic languages, and Japanese has a range of language-specific nuances. That is why KantanMT provides special handling for a range of languages to make them easier for our clients' MT systems to translate. For example, we use our own re-ordering models for German and Japanese to ensure higher consistency and transformation into these morphologically challenging languages.
9Q. As I understand it, my data is added to data provided by KantanMT. How do you ensure that terminology in your data does not create inconsistent translations?
A. We add more weight in the data model to your training data at all times. This ensures that it is always picked up first during translation and styling. Additionally, you can upload your terminology/glossary files directly into the training data, which guarantees the selection of your terminology.
10Q. What is the largest number of words that you have processed for a single customer in a day?
A. The largest number of words processed by a client in a single session is just under 2m, and this took approximately 3 hours. The throughput of a single instance of an engine is around 400K words per hour, so it is possible to process 9.6m words per 24 hours. However, this can be improved substantially by adding more servers and a load balancer. For each additional server, you can process just under 10m more words a day, and this improves linearly as you add more servers.
11Q. What kind of methods are you using to get a quality estimation score? Is it comparable to what the QuEst project generates?
A. The KantanAnalytics QE scores were developed in conjunction with the CNGL (Centre for Next Generation Localisation) and the Kantan Development team at DCU, Dublin. It uses a series of factors taken directly from the MOSES engine (n-best list, word alignment, complexity scores etc.) to determine the estimated quality for each segment. In this regard it is similar in approach to QuEst.
12Q. What is the feedback from KantanMT system users who handle translation in double-byte languages, such as Chinese, Japanese and Korean?
A. Our first client, one of the largest companies in the world, selected us for our English to Simplified Chinese quality. We support all DBCS languages at present.
13Q. I would like to know if your MT systems are only cloud-based?
A. Yes – the KantanMT.com platform is 100% cloud-based. You need no special hardware and no special software, just an account name and secure password to start using our platform.
14Q. Do you mean 30% cost savings (cost reduction) or 30% increase in productivity? If cost savings, what was the increase in productivity?
A. 30% is cost savings – post-edit rate improved from 350 words per hour to 550 words per hour.
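To relate the two figures in this answer, the short Python sketch below converts the post-edit rates into a productivity gain and a reduction in post-editing time per word. The function names are ours, and the answer does not say how the 30% cost saving was calculated, so the time reduction computed here should not be expected to match it exactly.

```python
def productivity_gain(before_wph: float, after_wph: float) -> float:
    """Relative increase in words post-edited per hour."""
    return (after_wph - before_wph) / before_wph


def time_saving_per_word(before_wph: float, after_wph: float) -> float:
    """Relative reduction in post-editing time needed per word."""
    return 1 - before_wph / after_wph


if __name__ == "__main__":
    # Rates quoted in the answer above: 350 and 550 words per hour.
    print(f"{productivity_gain(350, 550):.1%}")     # 57.1%
    print(f"{time_saving_per_word(350, 550):.1%}")  # 36.4%
```

So a move from 350 to 550 words per hour is roughly a 57% productivity increase, while the time spent per word falls by about 36%.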
You can watch the presentation section of the webinar here:
For more information about KantanMT and MT technologies, please visit http://www.kantanMT.com