Peter Coyle, a Transition Year student from Blackrock College, spent a week at KantanMT, learning about the language and localization industry and the KantanMT technology. We were delighted to have him in our office and really appreciated the hard work he put in during his time here. While Peter had a packed schedule over the week, he found some time to blog about his TY work experience with us. Read his opinion about KantanMT, the ever-changing world of technology and his key takeaways from the work experience.
The Rosetta Stone, a paradigm of translation. Photo: Hans Hillewaert, Wikimedia (CC)
This article is written by José Pichel Andrés and was originally published in Spanish in the online journal, El Español. It has been translated into English by Carlos Collantes from the Professional Services team at KantanMT. The article has been edited slightly for readability, but we have made all attempts possible to retain the original flavour of José’s article.
Researchers today are redefining Machine Translation. Though it is still a far cry from being completely satisfactory, it is developing rapidly, thanks to new systems like Neural Networks.
Welcome to our second post in the ‘5 Questions’ series, which will give you a deeper insight into the people at KantanMT.
Kirti Vashee, a well-known Machine Translation veteran and independent MT consultant, is currently writing a series on expert MT systems in his blog eMpTy pages. The in-depth posts and interviews by Kirti not only highlight the MT buyer’s expectations, but also stress what the Expert MT Developers are doing differently.
In his blog Kirti informs and introduces the reader to “competent MT technology alternatives available in the market today.” To date he has spoken about tauyou, Iconic and KantanMT. As Kirti points out, our client base consists of Language Service Providers as well as multinational enterprises. What makes KantanMT attractive to both of these diverse client bases is its extremely customisable, bespoke solution, which can be tailored according to the requirements of each client. Our clients can easily build their own Custom Machine Translation (CMT) engines, or they can opt for our Professional Services team to do it for them.
This week, Dr Dimitar Shterionov, Machine Translation Researcher at KantanMT, presented at the Cloud Security workshop conducted by the Irish Centre for Cloud Computing and Commerce (IC4). The information-packed workshop, which was a huge success, aimed to draw back the curtain on cloud security and help companies make more informed choices regarding cloud security within their organisation.
In this post we will highlight some of the issues discussed during the workshop, as well as the best practices, tools and guidelines that will support decision making for businesses moving to the cloud.
The future of content production, distribution and consumption is here. With the number of websites standing at 949,891,800 at the time of publication of this blog, and increasing every second, developing structured content and managing its distribution has become more important than ever. It is no coincidence that at the 2015 tcworld Conference in Stuttgart, Germany, one of the main themes of discussion is Intelligent Information Management.
The speakers will present on the “megatrend of our time” where content needs to cater to “smart” users through “smart” services and not “just” products. As a result of this new trend, companies need to step up to the challenge and provide users with individualised information at the right time, in the right place and in the medium of their choice. Tekom calls this the “Intelligent Information Initiative – in3.” Before talking a bit more about what Intelligent Content is all about, it is important to remember that while the structure of the content is incredibly important, it is also equally important to have content that’s relevant, reusable and above all, targeted to the “smart” users that companies aim to attract.
To know more about the speakers for this theme and the topics being presented, have a look at the tcworld conference schedule.
What’s Intelligent Content?
So what is intelligent content, and how can businesses effectively manage their content to get their products and services to market faster, without having to create new content for each new platform or medium?
Intelligent content combines digital text and multimedia with structural coding, which allows the content to be processed automatically and accessed across various devices and interfaces.
The “intelligent” part comes from removing the formatting and adding metadata, which summarises the information related to the data and makes the content much easier to find and reuse. The metadata adds information to segments of the content, which in turn makes the content easy for businesses to disseminate, discover and reuse.
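To make the idea concrete, here is a minimal sketch of metadata-tagged content segments. The field names (topic, audience, reusable) are purely illustrative assumptions, not any particular metadata standard; the point is simply that once formatting is stripped and metadata is attached, segments can be discovered and reused programmatically.

```python
# Content segments stripped of formatting, each carrying metadata
# that describes it. Field names here are hypothetical examples.
segments = [
    {"text": "Press the power button for 3 seconds.",
     "metadata": {"topic": "startup", "audience": "end-user", "reusable": True}},
    {"text": "Flash the firmware via the JTAG header.",
     "metadata": {"topic": "startup", "audience": "engineer", "reusable": False}},
]

def find_segments(segments, **criteria):
    """Return segments whose metadata matches every given criterion."""
    return [s for s in segments
            if all(s["metadata"].get(k) == v for k, v in criteria.items())]

# Reuse only the end-user-facing, reusable startup content
for s in find_segments(segments, topic="startup", reusable=True):
    print(s["text"])
# → Press the power button for 3 seconds.
```

The same lookup could feed a website, a mobile help screen or a localization pipeline, which is exactly the multi-channel reuse that intelligent content promises.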
Ann Rockley, of Rockley Group fame, has spoken in depth about intelligent content development in her work Managing Enterprise Content, where she describes the importance of adopting the structure of intelligent content as follows:
Intelligent content enables automated multi-channel delivery, adaptive content, improved content discovery, and personalized content delivery in an agile world. But the power of your content to respond to tight timelines, new customer requirements, and increasing costs is based on the quality of your intelligent content strategy.
It is this “quality” of content strategy that will either make you a leader in your industry or slow you down from being able to be the first to bring your product to market. If you are taking your “smart” services and products to users across the globe, or even starting your content strategy from scratch, it is important that you structure your content. This will not only make it easy to reuse and tag, but will also be extremely helpful for you when you have to get your content localized and translated into the languages of the markets you want to penetrate.
Structured content is ideally suited to Machine Translation, and a quality (intelligent) content strategy should always plan ahead for future localization needs. Even if your business is just starting out, you should begin planning your content strategy intelligently, because an unstructured website can escalate into an unmanageable mess very quickly.
The real success of a company’s content strategy is knowing how and where it is consumed. Luckily the ‘always on’ availability of content means anyone can access information regardless of nationality and geographical location. If a company is serious about its content strategy, then it will put a process in place to create multilingual content that can be distributed to a multilingual audience.
Integrating Machine Translation into your translation and localization workflow will make it possible to translate more content, faster than traditional human-only workflows. When your MT engine is integrated into your content management workflow, translations for your structured content can be sent to your global websites seamlessly, with minimal manual intervention.
Thanks to the high success rates KantanMT has had working with structured content, we believe that integrating Machine Translation into the workflow and planning a “quality” Intelligent Content Strategy go hand in hand.
Meet us at booth 2/A09 during the tekom trade fair and tcworld conference between 10th and 12th November to learn more about Custom Machine Translation and how it can fit into your intelligent content production workflow.
We have a few FREE Tekom Fair tickets to give away, so to be in with a chance to win, send an email to email@example.com with ‘FREE Ticket’ in the subject line and we will add you to the draw. Winners will be notified by email.
Master’s student Rafaella Athanasiadi of University College London submitted her thesis as part of an MSc degree in Scientific, Technical and Medical Translation with Translation Technology. Rafaella was supervised by Teaching Fellow and Lecturer Dr. Emmanouela Patiniotaki, and she used KantanMT.com for her research. This guest blog post looks at some of her conclusions on Machine Translation and the localization industry.
As Hutchins & Somers (c1992:1) argue, “the mechanization of translation has been one of humanity’s oldest dreams.” During the 20th century, the translation process changed radically. From spending endless hours in libraries to find the translation of a word, the translator has been placed at the centre of dozens of assistive tools. To name just a few, today there are translation software packages, terminology extraction tools, project management components and machine translation systems that translators can choose from while translating.
However, shifting the focus to audiovisual translation, it can be observed that not so many radical changes took place in that area, at least not until the introduction of machine translation systems in various projects (such as, the MUSA and the SUMAT project) that developed machine translation engines to optimise the subtitling process. Still, the results of such projects do not seem to be satisfactory enough to inspire confidence for the implementation of these engines in the subtitling process both by subtitling software developers and subtitlers.
Based on my personal research, which focused primarily on the European setting, it seems that at the moment only the freeware SRT Translator incorporates machine translation in the subtitling industry while also offering the features that subtitling software usually incorporates (i.e. uploading multimedia files and timecoding subtitles). Nonetheless, SRT Translator, which is not widely known among subtitlers, relies solely on Google Translator, a general-domain machine translation engine that, one could argue, is not suitable for the purposes of audiovisual translation. The quality of Google Translator’s output was tested by translating 35 subtitles of a comedy series; the output was incomprehensible and misleading in many cases.
Even though no further records of traditional subtitling software that incorporates machine translation could be found, there are many online translation platforms that allow users to upload and translate subtitles. Taking into consideration the European market, these can be translation software like MemoQ, SDL Trados Studio and Wordfast that offer the ability to load subtitle files and, in some cases, link them to the audiovisual content they are connected to, open source tools for translators like Google Translator Toolkit (GTT), or professional and private platforms like Transifex and XTM International that are used by companies and offered to their dedicated network of translators. Nonetheless, in order to enable machine translation in all the above applications, API keys must be purchased. GTT is an exception, since it can be used for free anytime and only requires a Gmail account.
The fact that subscription fees have to be paid along with the costs of API keys for each machine translation engine provider puts their usability in question, since the costs may outweigh subtitlers’ profits. Furthermore, these platforms cannot fully accommodate subtitlers’ needs; for instance, the option to upload and play multimedia files while translating the subtitles is not always available, nor are any synchronization features offered for timecoding the subtitles to the audio track. Transifex, however, is an exception, since this localization platform offers users the option to upload multimedia files in the translation editor while translating the subtitles.
According to Macklovitch (2000:1) a translation memory is considered to be “a particular type of translation support tool that maintains a database of source and target language sentence pairs, and automatically retrieves the translation of those sentences in a new text which occur in the database.” Even though machine translation engines were developed through different projects to reduce subtitling time to the least possible degree, no attempts had been traced during this research to integrate a translation memory tool in a subtitling software for optimizing subtitling; at least in a European, Asian and Australian setting. As Smith (2013) argues, “traditionally subtitling has fallen outside the scope of translation memory packages, perhaps as it was thought to be too creative a process to benefit from the features such software offers.” However, as Diaz-Cintas (2015:638) discusses “DVD bonus material, scientific and technical documentaries, edutainment programmes, and corporate videos tend to contain the high level of lexical repetition that makes it worthwhile for translation companies to employ assisted translation and memory tools in the subtitling process.”
Even if such tools have not been integrated in subtitling software, translation memory components are used for subtitling purposes in cloud-based platforms such as GTT, Transifex and XTM International as well as in translation software, MemoQ, SDL Trados Studio, Wordfast Pro and Transit NXT by simply creating a translation memory before or while translating. It should be noted that Transit NXT is the only translation software that can accommodate the needs of subtitlers to a high level among the tools discussed in this research. Apart from the addition of specialized filters to load subtitles (that also exist in MemoQ, SDL Trados Studio and Wordfast Pro), subtitlers can upload multimedia files, translate subtitles while a translation memory component is active and also synchronise their subtitles with the Transit translation editor (Smith, 2013).
Figure 1: The translation editor of Transit NXT by Smith (2013)
OOONA, a newly founded company (2012), has taken a very interesting approach to subtitling by developing a unique cloud-based toolkit built exclusively to accommodate the needs of subtitlers. When asked the following question within the context of the MSc thesis,
Considering that other cloud-based translation platforms like GTT, Transifex and XTM International offer the option of uploading a TM or a terminology management component, do you think that it is important to offer it on a subtitling platform as well?
the representative of OOONA (Alex Yoffe) replied that the company will not only implement translation memory and terminology management components in the next phase of enhancing its platform, but also considers these components to be very important for the subtitling process. In addition, Yoffe (2015) argued that OOONA intends to “add the option of using MT engines. Translators will be able to choose between Microsoft’s, Google’s, or customisable MT engines.” Therefore, it seems that OOONA will become a very powerful tool in the near future, with features that will optimise the subtitling process to the maximum and reshape the way subtitling has been carried out until now. The fact that Screen Systems, Cavena and EZTitles have partnered with OOONA is an indicator of how much potential there is in this toolkit.
As can be argued based on the above, there is a lack of subtitling software with incorporated translation memory tools. Therefore, this issue was researched further by means of an online questionnaire disseminated to subtitling companies and freelance subtitlers. In addition, two companies that develop subtitling software, Screen Subtitling Systems and EZTitles, were asked to present their views on this topic. In both cases, their willingness to optimise the subtitling process in a semi-automated or a fully-automated way was apparent from their answers. The former company was in favour of a combination of machine translation tools with translation memory tools, whereas the latter leaned towards a subtitling system with integrated translation memory and terminology management tools.
Nonetheless, the optimisation of the subtitling process has to coincide with the needs and preferences of subtitlers. Based on the respondents’ answers, it is clear that translation memory tools in subtitling software are desired by subtitlers. In answer to the question,
Which tool would you prefer to have in a subtitling software? An integrated translation memory (TM) or machine translation (MT)?
more than half of the respondents (56.8%) chose TM. Interestingly, the answer “Both” received the second highest percentage (20.5%), which indicates that subtitlers want as many assistive tools as possible.
One of the main conclusions drawn from this research was that machine translation engines need to be customised to produce good quality output, and this can be achieved through customisable engines like KantanMT and Milengo. Moreover, translation memory tools are sought by subtitlers in subtitling software, while cloud-based platforms seem to dominate the translation industry today. Following this trend, subtitling software providers are partnering with online services/tools like the OOONA toolkit.
Based on the outcomes of this research, it could be said that we are entering a new era in subtitling, since traditional PC-based subtitling software is now transforming into flexible and accessible platforms that enhance the subtitling experience as much as possible. Only time will tell which tool and platform will dominate the subtitling industry, but one thing is for sure: the technologies of the future will bring many changes to the traditional way of subtitling.
Diaz-Cintas, J., 2015. Technological Strides in Subtitling. In: S. Chan, ed. Routledge Encyclopedia of Translation Technology. London: Routledge, pp. 632-643.
Hutchins, J. W. & Somers, H. L. (c1992). An introduction to machine translation. London: Academic Press.
Macklovitch, E. (2000). Two Types of Translation Memory. In Proceedings of the ASLIB Conference on Translating and the Computer (Vol. 22).
Smith, Steve (2013). New Subtitling Feature in Transit NXT. November 11 2013. [Online]. Available from: http://www.star-uk.co.uk/blog/subtitling/working-with-subtitles-in-transit-nxt/. [Accessed 01 Sept. 2015].
Yoffe, A (2015). MT and TM tools in subtitling. [Interview]. 13 August 2015.
 Relevant data are available in Appendix 1 of the MSc thesis.
The KantanPEX Rule Editor enables members of KantanMT to reduce the amount of manual post-editing required for a particular translation by creating, testing and deploying post-editing automation rules on their Machine Translation engines (client profiles).
The editor allows users to evaluate the output of a PEX (Post-Editing Automation) rule on a sample of translated content without needing to upload it to a client profile and run translation jobs. Users can enter up to three pairs of search and replace rules, which are run in descending order on the content.
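The behaviour of ordered search-and-replace pairs can be sketched in a few lines. This is a minimal illustration of the general technique, not KantanMT's actual implementation: it assumes regex-based rules, and the three example rules below are hypothetical. The key property shown is that rules run top to bottom, so each rule sees the output of the one before it.

```python
import re

def apply_pex_rules(text, rules):
    """Apply each (search, replace) pair in order; later rules
    operate on the output of earlier ones."""
    for search, replace in rules:
        text = re.sub(search, replace, text)
    return text

# Three hypothetical post-editing rules, applied in descending order
rules = [
    (r"\bcolor\b", "colour"),     # localise US spelling
    (r"\s+([,.;:])", r"\1"),      # remove stray space before punctuation
    (r"  +", " "),                # collapse runs of spaces
]

print(apply_pex_rules("The color , of the  sky", rules))
# → The colour, of the sky
```

Because ordering matters, a rule that normalises whitespace should generally come after rules that may introduce or expose extra spaces, which is why testing rules on sample content before deploying them, as the editor allows, is worthwhile.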
How to use the KantanMT PEX Rule Editor
Log in to your KantanMT account using your email and password.
You will be directed to the ‘Client Profiles’ tab in the ‘My Client Profiles’ page. The last profile you were working on will be ‘Active’ and marked in bold.
To use the ‘PEX-Rule Editor’ with a profile other than the ‘Active’ profile, click on the new profile name to select that profile for use with the ‘Kantan PEX-Rule editor’.
Then click the ‘KantanMT’ tab and select ‘PEX Editor’ from the drop-down menu.
You will be directed to the ‘PEX Editor’ page.
Type the content you wish to test on, in the ‘Test Content’ box.
Type the content you wish to search for in the ‘PEX Search Rules’ box.
Type what you want the replacement to be in the ‘PEX Replacement Rules’ box and click on the ‘Test PEX Rules’ button to test the PEX-Rules.
The results of your PEX-Rules will now appear in the ‘Output’ box.
Give the rules you have created a name by typing in the ‘Rule Name’ box.
Select the profile you wish to apply the rule(s) to and then click on the ‘Upload Rule’ button.
The KantanMT PEX editor helps reduce the amount of manual post-editing required for a particular translation, thereby reducing project turnaround times and costs. For additional information on PEX rules and the Kantan PEX-Rule editor, please click on the links below. For more details about KantanMT localization products and ways of improving productivity and efficiency, please contact us at firstname.lastname@example.org.
Ease of use and simplicity are always on the minds of our Developers; hence the making of KantanTimeLine™. KantanTimeLine enables KantanMT clients to view the life cycle of their KantanMT engine. This empowers our clients, as they can find exactly what is negatively or positively affecting the quality of their engines. Clients can keep track of things such as Training Data uploads, translation jobs, engine tuning, templates, build jobs and so on through the KantanTimeLine.
How to use KantanTimeLine™
Log in to your KantanMT account using your email and password.
You will be directed to the ‘My Client Profiles’ page. You will be in the ‘Client Profiles’ section of the ‘My Client Profiles’ page. The last profile you were working on will be ‘Active’.
To use ‘KantanTimeLine’ with a profile other than the ‘Active’ profile, click on the profile for which you wish to view the ‘KantanTimeLine’.
Click on the ‘TimeLine’ tab.
You will now be directed to the ‘TimeLine’ page for your chosen profile.
To restore an Archived Build select the Build you wish to restore from the ‘Archives’ drop-down menu and click on the ‘Restore’ button.
To delete an archived Build click on the ‘Delete’ button.
To archive a Build click on the ‘Archive’ button of the build you wish to archive.
To view or edit the description of a build click on the ‘Yellow Notepad’ icon.
To filter the timeline, click on the ‘Filter’ drop-down menu and select the filter you wish to use.
Additional Information and Support
KantanTimeLine™ is one of the many products offered by KantanMT to make the integration of Machine Translation into the workflow of our clients seamless. For more information on TimeLine or any KantanMT products please contact us at email@example.com.
TimeLine can also be used in KantanBuildAnalytics. To learn how TimeLine is incorporated into KantanBuildAnalytics please click on the link below or contact us at firstname.lastname@example.org.