KantanMT has an ongoing Academic Partnership with the Centre for Multidisciplinary and Intercultural Inquiry (CMII) at University College London to accelerate research and learning in the field of Machine Translation (MT). The postgraduate students of the department were able to use the KantanMT platform to update or gain new skills in Translation Technology. With the help of the KantanMT platform, the students learnt how to build and customise their own Statistical Machine Translation (SMT) systems in a real-world scenario.
KantanMT recently published a brand-new white paper on what global companies can expect to see in 2016 for Machine Translation (MT). The MT industry is rapidly changing and moulding itself to the technical needs and globalization requirements of the present day. Our white paper puts forward six major MT trends that all businesses need to KNOW in order to stay relevant and ahead of their competitors.
LocWorld is coming to Dublin in June and the KantanMT team are once again planning to bring together the toughest and smartest localization professionals to challenge themselves in a Flag Challenge and Coastal Treasure Hunt hike, raising funds for the well-known translation and localization NGO, Translators Without Borders.
For those of you who might remember, when LocWorld came to Dublin in 2014, many companies from the industry braved the (wet and windy) Irish weather to be the first to plant their company flag at the top of Djouce, Wicklow’s highest mountain. The event, the Mountain Flag Challenge, raised more than €10,000 for Pádraig Schäler.
Sixteen years ago, when the Web was strictly 1.0, Google was still in its nascent state, and there were a mere 361 million Internet users, David Bowie made one of the most visionary statements about the future of the Internet:
What the Internet is going to do to society is unimaginable
Indeed, with more than 3 billion Internet users today, one can safely say that Bowie’s prediction has come to fruition.
While chatting over a mouthful of mince pies, some tourtière and a few classy glasses of mulled wine this week, we at KantanMT were suddenly struck by the realisation that 2015 was perhaps one of the most sensational, successful and eventful years for us in the company! And the fact is, we can’t wait to start working on everything that we have planned for 2016 – we are certain that the new year is going to be even more exciting for us.
If the post-Black Friday sales numbers are anything to go by, there’s no question any more that the face of eCommerce is changing, and with it, brick-and-mortar retailers have started rethinking their business strategy. As this news piece about Scotland experiencing a major dip in shoppers suggests, demand for online shopping will increase substantially in 2016. This in turn means that the need for content localization and translation for eTailers (online retailers) will be even more pressing during the coming year. As the often-quoted Common Sense Advisory report points out, 72.4% of consumers are more likely to buy from a site that is in their native language. Indeed, localization is no longer a good-to-have feature – it is now a must-have for all eCommerce businesses that aim to sell their products globally.
Chris Bishop, Managing Director of Microsoft Research, Cambridge, UK points out that “by 2026 we will have ubiquitous, human-quality translation among all European languages, thereby eliminating the language barrier throughout Europe.” Bishop’s prediction does not sound far off the mark at all when we take into account the fact that in the past ten years, Machine Translation (MT) has improved by leaps and bounds. Early MT was rules-based (RBMT) and required sets of linguistic rules, and it worked moderately well within a prescribed domain. However, this was resource intensive and cost prohibitive for many.
The turning point for using MT in business came with the advent of the Internet, the SaaS model and the open source development model for software. These changes in technology helped build the foundation for Statistical Machine Translation (SMT) research, and subsequently the open source development of the Moses decoder. Moses enabled researchers and private companies to commercialise Statistical MT and develop it into the custom solutions it is today. The year 2016 and beyond will see further research in the fields of Natural Language Processing (NLP), deep learning and machine learning, contributing directly to immense improvements in the field of Custom MT.
The KantanMT Business Team published a new white paper, which provides an in-depth understanding of how eTailers will be affected by Machine Translation in 2016. It also discusses how Custom Machine Translation (CMT), when compared to generic MT systems, will emerge as the clear winner in solving eTailing localization issues in the coming year.
Here are some of the highlights of how MT will evolve in 2016 for eTailers:
- eTailers will use either CMT alone or a combination of CMT and human Post-Editing to reach new markets ahead of their competitors
- With increased multilingual customer demand for products, content translation will find support in auto scaling
- Custom Machine Translation will be used more widely as eCommerce customers expand globally
Machine Translation is no longer a luxury. It is an essential component as a Tier 1 application to support global business. The purpose of this paper is to highlight how Machine Translation and more importantly Custom Machine Translation technology has come of age, in terms of quality, speed and scalability. During 2016 and beyond eTailers need to ensure that they review their globalization strategies to reflect these advances in technology, so they can maximise their global growth potential.
KantanMT is the number one provider of custom machine translation services in the world. The Software-as-a-Service platform, KantanMT.com, is used by some of the world’s largest enterprises to increase translation and localization productivity.
KantanMT needs somebody who already is making a difference to the sales performance of their current organisation, who can hit the ground running and who can take on direct responsibility for seeking out and delivering new business.
The ideal candidate needs to manage the daily activities of the operational sales support functions.
- Sales Pipeline – take responsibility for top-of-funnel cold-contact lead generation and qualification in order to achieve new lead generation goals.
- Conferences/Events/Contacts Lists – find and make contact with ToF prospects in advance of all conferences and events.
- Business Development/Prospecting – Find lead sources via LinkedIn and other tools
- Maintaining the CRM database system and Mail – Ensure the CRM system is updated with all interactions.
- Provide Customer Service – Be able to engage with customers at middle and senior management level.
- Use Social Media to KantanMT’s advantage – Need to be an expert in LinkedIn Sales Navigator specifically.
- Self-manage – Be familiar with the sales process from ‘cold calling’ to ‘closing’
- Team Member – As part of the Commercial team you need to be able to assist with key marketing tasks such as list cleaning, event marketing, post-event follow-up etc.
- Document Management – Set up proposals, NDAs and pricing packages
- Follow through – with strategic action plans.
- Use Microsoft Office Suite
Background of Ideal Candidate:
- Ideally the candidate will have a strong solution selling sales background.
- Excellent oral communication skills – be able to communicate with and sell to middle and senior management decision makers in global multi-national organisations.
- Be familiar with the sales process from ‘cold calling’ to ‘closing’
- Min 3 years’ experience working in Sales
- Software sales an advantage
- Knowledge of languages and/or previous experience in Translation (MT) would also be a distinct advantage!
Personality of candidate:
- The person who would suit this role should have excellent people skills and an outgoing personality
- They MUST possess the ability to persevere when dealing with new leads and follow ups
- They also have to have excellent listening skills and huge customer empathy
- Reliability, Honesty and a ‘can do’ attitude when dealing with customers (do what you say you will do!)
- Know when a potential opportunity is ripe in order to offer certain technologies and cross-sell
- The ideal person must ‘know’ the right way and professional way to deal with clients
- They must be confident in their ability to ask for the ‘Business’!!
The right candidate must also stand firm on company policy and be able to assert themselves in a professional manner.
If this sounds like the role for you, send an email with your CV to Brian Coyle (firstname.lastname@example.org) with ‘Internal Sales Person’ in the subject line.
You can also apply directly from LinkedIn, by clicking the Apply Now button on the job post.
Welcome to Part II of the Q&A blog on How Machine Translation Helps Improve Translation Productivity. In case you missed the first part of our post, here’s a link to quickly have a look at what was covered.
Tony O’Dowd, Chief Architect of KantanMT.com, and Louise Faherty, Technical Project Manager, presented a webinar where they showed how LSPs (as well as enterprises) can improve the translation productivity of the language team, manage post-editing effort estimations and easily schedule projects with powerful MT engines. For this section, we are accompanied by Brian Coyle, Chief Commercial Officer at KantanMT, who joined the team in October 2015 to strengthen KantanMT’s strategic vision.
We have provided a link to the slides used during the webinar below, along with a transcript of the Q&A session.
Please note that the answers below are not recorded verbatim and minor edits have been made to make the text more accessible.
Question: We are a mid-sized LSP and we would like to know what benefits we would enjoy if we chose to work with KantanMT rather than building our own systems from scratch. The latter would be cheaper, wouldn’t it?
Answer (Brian): Tony and Louise have mentioned a lot of features available in KantanMT – indeed, the platform is very feature-rich and provides a great user experience. But on top of that, what’s really underneath KantanMT is access to massive computing power, which is what Statistical Machine Translation requires in order to perform efficiently and quickly. KantanMT has the unique architecture to help provide instant on-demand access at scale.
As Louise Faherty mentioned, we are currently translating half a billion words per month and we have 760 servers deployed currently. So if you were trying to develop something yourself, it would be hard to reach this level of proficiency in your MT. Whilst no single LSP would probably need this total number of servers, to give you an idea of the cost involved, that kind of server deployment in a self-build environment would cost in the region of €25m.
We also offer 99.99% uptime with triple data-centre disaster recovery. It would be very difficult and costly to build this kind of performance yourself. Also, with this kind of performance at your client’s disposal, you can offer Customised MT for mission-critical web-based applications such as eCommerce sites.
Finally, a lot of planning, thought, development hours and research have gone into creating what we believe is the best user interface and platform for MT, with the best functionality set and the easiest integration in the marketplace. So, it would be difficult for you to start on your own and build a system that would be as robust and high quality as KantanMT.com.
Question: Could you also establish KantanNER rules to convert prices on an eCommerce website?
Answer (Louise Faherty): Yes, absolutely! With KantanNER, you can also establish rules, convert prices and so on. The only limitation is that the exchange rate will of course fluctuate. There could be options for calculating that information dynamically – otherwise you would be looking at a fixed equation to convert those prices.
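To illustrate the "fixed equation" approach Louise describes, here is a minimal Python sketch. It is not KantanNER rule syntax; the function name, the price pattern and the exchange rate value are all assumptions made for illustration. Passing a freshly fetched rate as the `rate` argument would correspond to the dynamic option.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Assumed fixed exchange rate (hypothetical value, for illustration only).
EUR_TO_USD = Decimal("1.10")

# Matches euro prices such as "€25", "€ 40.50" or "€1,299.99".
PRICE_PATTERN = re.compile(r"€\s?(\d+(?:,\d{3})*(?:\.\d{1,2})?)")


def convert_prices(text: str, rate: Decimal = EUR_TO_USD) -> str:
    """Replace every euro price in `text` with a dollar price at `rate`."""

    def _convert(match: re.Match) -> str:
        # Drop thousands separators before doing the arithmetic.
        amount = Decimal(match.group(1).replace(",", ""))
        converted = (amount * rate).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
        return f"${converted:,.2f}"

    return PRICE_PATTERN.sub(_convert, text)


print(convert_prices("Was €1,299.99, now €25"))
```

Using `Decimal` rather than floating point keeps the rounding of currency amounts predictable, which matters when the converted prices are shown to customers.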
Question: My client does not want us to use MT because they have had bad experience in the past with Bing Translate – what would convince them to use KantanMT? How will the output be different?
Answer (Tony O’Dowd): One of the things that you have to recognise in terms of using the KantanMT platform is that you are using MT to build customised machine translation engines. So you are not going to create generic engines (Bing Translate and Google Translate are generic engines). You would be building customised engines that are trained on the previous translations and glossaries that your clients have provided. You will also be using some of our stock engines that are relevant to your client’s domain.
So when you combine that, you get an engine that will mimic the translation style of your client. Indeed, instead of generic translation engines, you are using an engine that is designed to mirror the terminology and stylistic requirements of your client. If you can achieve this through Machine Translation, you will see that there is a lot less requirement for Post-Editing. Heavy Post-Editing is one of the most important things that drives translators away from generic or broad-based systems, and that’s why they choose customised systems. Clients and LSPs have tested the generic systems as well as customisable engines and found that cloud-based customisable MT adds value that is not available on free, non-customisable MT platforms.
End of Q/A session
The KantanMT Professional Services Team would once again like to thank you for all your questions during the webinar and for sending in your questions by email.
Have more burning questions? Or maybe you would like to see the brilliant platform translate in a live environment? No problem! Just send an email to email@example.com and we will take care of the rest.
Want to stay informed about our new webinars? You can bookmark this page, or even better – sign up for our newsletter and ensure that you never miss a post!
We had so many questions during the Q&A in our last webinar session ‘How to Improve Translation Productivity‘ by the KantanMT Professional services team, that we decided to split the answers into two blog posts. So, if you don’t find your questions answered here, check out our blog next week for the remaining answers.
The Internet today is experiencing what is generally referred to as a ‘content explosion!’ In this fast-paced world, businesses have to strive harder and do more to stay ahead of the game – especially if they are a global business or if they have globalization aspirations. One fool-proof way in which a business can successfully go global is through effective localization. Yet, the huge amount of content available online makes human translation for everything almost impossible. The only viable option then in today’s competitive online environment is the use of Machine Translation (MT).
On Wednesday 21st October, Tony O’Dowd, Chief Architect of KantanMT.com and Louise Faherty, Technical Project Manager at KantanMT presented a webinar where they showed how Language Service Providers (LSPs) (as well as enterprises) can improve the translation productivity of the team, manage post-editing effort and easily schedule projects with powerful MT engines. Here is a link to the recording of the webinar on YouTube along with a transcript of the Q&A session.
The answers below are not recorded verbatim and minor edits have been made to make the text more readable.
Question: Do you have clients doing Japanese to English MT? What are the results, and how did you get them? (i.e., do you pre-process the Japanese?)
Answer (Tony O’Dowd): English to Japanese Machine Translation (MT) has indeed always posed a challenge in the MT industry. So is it possible to build a high quality, high fidelity MT system for this language combination? Well, there have been quite a few developments recently to improve the prospect of building effective engines in this language combination. For example, one of the latest changes we made on the KantanMT platform for improving the quality of MT is the use of new and improved reordering models to make the translation from English to Japanese and Japanese to English much smoother, so we deliver a higher quality output. In addition to that, higher quality training data sets are now available for this language pair, compared to a couple of years ago, when I started building English to Japanese engines. Back then it was really challenging. It still requires some effort to build English to Japanese MT engines, but the fact that there’s more content available in these languages makes it slightly easier for us to build high-quality engines.
We are also developing example-based MT for these engines and so far this is showing encouraging signs of improving quality for this language pair. However, we have not started deploying this development on the platform yet.
KantanMT note: For more insights into how you can prepare high-quality training data, read these tips shared by Tony O’Dowd, and Selçuk Özcan, co-founder of Transistent Language Automation Services during the webinar ‘Tips for Preparing Training Data for High Quality MT.’
Question: Have you got a webinar recorded or scheduled, where we could see how the system works hands-on?
Answer (Tony O’Dowd): If you go on to the KantanMT website, we have video links on the product features pages. So you can actually watch an explanation video while you are looking at the component.
We work in a very visual environment, and we think videos are a great way of explaining how the platform works. And, if you go on to the website, on the bottom left corner of the page, you will find our YouTube channel, which contains videos on all sorts of topics, including how to build your first engine, how to translate your first document and how to improve the output of your engines.
If you click on the Resources menu on our site, you can access a number of tutorials that will talk you through the basics of Statistical Machine Translation Systems. In other words, explore the website and you should find what you need.
KantanMT note: Some other useful links for resources are listed below:
- The KantanMT blog is full of helpful tips, tricks, information and guides on using MT effectively
- You can access KantanMT company slides on our SlideShare page
- Read our client success stories, KantanMT Case Studies
- Find answers in our FAQs
- See specs of our products on our product sheets section
- Read our white papers and view past KantanMT webinars
- Check out our help section for help on Getting Started, File Parsing, Post-Editing and Preprocessors
Question: Do you provide any Post-Editing recommendations or standards for standardising the PE process? You said translation productivity rose to 8k words per day – this is only PE, correct?
Answer (Tony O’Dowd): I will take the second question first! The 8,000 words per day is the Post-Editing (PE) rate, yes. It is not the raw translation rate. In Machine Translation, everything comes out pretranslated. So this number refers to the Post-Editing effort – like insertions, deletions, substitution of words, and so on that you need to do to get the content to publishable quality.
Louise Faherty: What we recommend to our clients is that when it comes to PE, they should try to use MT. A lot of translators who are new to using MT will try and translate manually, which is a natural tendency, of course. But what we advise our clients is to copy and paste the translation (MT) in the engine and use the MT. The more you use MT and the more you Post-Edit, the better your engine will become.
Tony O’Dowd: I will add something to Louise Faherty’s comments there. The best example of PE recommendations that I have come across is provided by a group called TAUS. They are at the forefront of educating the industry on how to develop a proficiency in PE.
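The Post-Editing effort mentioned in this answer (insertions, deletions and substitutions of words) can be estimated with a word-level edit distance between the raw MT output and the post-edited text. The sketch below is a generic Levenshtein implementation in Python, not the metric KantanMT itself computes; the function name and the whitespace tokenisation are assumptions.

```python
def post_edit_operations(mt_output: str, post_edited: str) -> dict:
    """Count the word-level insertions, deletions and substitutions
    needed to turn raw MT output into the post-edited text."""
    src, tgt = mt_output.split(), post_edited.split()
    rows, cols = len(src) + 1, len(tgt) + 1

    # dist[i][j] = minimum number of edits to turn src[:i] into tgt[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete every remaining source word
    for j in range(cols):
        dist[0][j] = j  # insert every remaining target word
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if src[i - 1] == tgt[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )

    # Walk back through the table to recover one optimal set of edits.
    ops = {"insertions": 0, "deletions": 0, "substitutions": 0}
    i, j = len(src), len(tgt)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and src[i - 1] == tgt[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1  # words match, no edit needed
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops["substitutions"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops["deletions"] += 1
            i -= 1
        else:
            ops["insertions"] += 1
            j -= 1
    return ops


print(post_edit_operations("the cat sat on mat", "the cat sat on the mat"))
```

Dividing the total number of edits by the length of the post-edited text gives a rough per-segment effort rate; fewer edits per word means less work to reach publishable quality.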
Question: What do ‘PPX’ and ‘PEX’ stand for (as abbreviations)?
Answer (Louise Faherty and Tony O’Dowd): PEX stands for Post-Editing Automation. PEX allows you to take the output of an MT engine and dynamically alter that. When would you need to use PEX? Suppose there is a situation where your engine is repeating the same error over and over again. What you can do in such cases is write a PEX file (developed in the GENTRY programming language). This allows the engine to look for patterns in the output of the engine and to dynamically change that in the output.
For example, one of our French clients did not want to have a space preceding a colon mark in the output of their MT (because this was one of their typographical standards and repeated throughout the content). So we wrote a PEX rule that forced a stylistic change in the output of the engine. This enabled the client to reduce the number of Post-Edits substantially.
PPX stands for Preprocessor automation. You can use PPX files to normalise or improve the training data. It is based on our GENTRY programming language, which is available to all our clients for free.
In short then, PPX is for your training data, while PEX is for the actual raw output of your engine.
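The split between the two rule types can be sketched with plain regular expressions. The Python below is only an analogy and does not reproduce GENTRY, PPX or PEX syntax; both function names are hypothetical. The first function cleans a training segment before an engine is built, and the second applies the French colon rule described above to raw engine output.

```python
import re

# One or more ordinary, non-breaking (U+00A0) or narrow non-breaking
# (U+202F) spaces directly before a colon.
SPACE_BEFORE_COLON = re.compile(r"[ \u00a0\u202f]+:")


def normalise_training_segment(segment: str) -> str:
    """PPX-style step (assumed): collapse runs of whitespace and trim
    the ends of a training segment."""
    return " ".join(segment.split())


def remove_space_before_colon(mt_output: str) -> str:
    """PEX-style step (assumed): enforce a 'no space before a colon'
    house style on raw MT output."""
    return SPACE_BEFORE_COLON.sub(":", mt_output)


print(normalise_training_segment("  Remarque  :   voir la page 3 "))
print(remove_space_before_colon("Remarque : voir la page 3"))
```

Because the post-editing rule runs on every segment the engine produces, a single pattern like this removes a correction that a translator would otherwise repeat by hand throughout the content.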
For more questions and answers, stay tuned for the next part of this post!
The future of content production, distribution and consumption is here. With the number of websites at 949,891,800 as of the time of publication of this blog, and increasing every second, developing and distributing structured content has become more important than ever. It is no coincidence that at the 2015 tcworld conference held in Stuttgart, Germany, one of the main themes of discussion is Intelligent Information management.
The speakers will present on the “megatrend of our time” where content needs to cater to “smart” users through “smart” services and not “just” products. As a result of this new trend, companies need to step up to the challenge and provide users with individualised information at the right time, in the right place and in the medium of their choice. Tekom calls this the “Intelligent Information Initiative – in3.” Before talking a bit more about what Intelligent Content is all about, it is important to remember that while the structure of the content is incredibly important, it is also equally important to have content that’s relevant, reusable and above all, targeted to the “smart” users that companies aim to attract.
To know more about the speakers for this theme and the topics being presented, have a look at the tcworld conference schedule.
What’s Intelligent Content?
So what is intelligent content, and how can businesses effectively manage their content to get their products and services to market faster, without having to create new content for each new platform or medium?
Intelligent content combines digital text and multimedia with coding. This allows the coded content to be processed automatically and accessed across various devices and interfaces.
The Intelligent bit is created by removing the formatting and adding metadata, which summarises the information related to the data. This makes finding and reusing the data or content much easier. The metadata adds information to segments of the content, which in turn makes the content easy for businesses to disseminate, discover and reuse.
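To make the idea concrete, here is a minimal Python sketch of format-free content segments that carry metadata and can be looked up for reuse. The class, the function and the metadata keys (`product`, `type`, `lang`) are hypothetical and are not taken from any particular content management system.

```python
from dataclasses import dataclass, field


@dataclass
class ContentSegment:
    """A piece of format-free text plus the metadata that describes it."""
    text: str
    metadata: dict = field(default_factory=dict)


def find_segments(segments, **criteria):
    """Return the segments whose metadata matches every given criterion."""
    return [
        seg for seg in segments
        if all(seg.metadata.get(key) == value for key, value in criteria.items())
    ]


library = [
    ContentSegment("Press the power button for two seconds.",
                   {"product": "X100", "type": "instruction", "lang": "en"}),
    ContentSegment("Do not immerse the device in water.",
                   {"product": "X100", "type": "warning", "lang": "en"}),
    ContentSegment("Do not immerse the device in water.",
                   {"product": "X200", "type": "warning", "lang": "en"}),
]

# Pull every English warning, for example to send for translation once
# and reuse across products.
for seg in find_segments(library, type="warning", lang="en"):
    print(seg.metadata["product"], "-", seg.text)
```

Because the segments hold no formatting, the same text can be rendered on any device, and the metadata is what lets a workflow pick out exactly the segments that need translating.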
Ann Rockley, of The Rockley Group, has spoken in depth about Intelligent Content development in her work Managing Enterprise Content, and she describes the importance of adopting the structure of intelligent content as follows:
Intelligent content enables automated multi-channel delivery, adaptive content, improved content discovery, and personalized content delivery in an agile world. But the power of your content to respond to tight timelines, new customer requirements, and increasing costs is based on the quality of your intelligent content strategy
It is this “quality” of content strategy that will either make you a leader in your industry or slow you down from being able to be the first to bring your product to market. If you are taking your “smart” services and products to users across the globe, or even starting your content strategy from scratch, it is important that you structure your content. This will not only make it easy to reuse and tag, but will also be extremely helpful for you when you have to get your content localized and translated into the languages of the markets you want to penetrate.
Structured content is incredibly suited to Machine Translation. And a quality (intelligent) content strategy should always plan ahead for the need of localization in the future. Even if your business is just starting off, you should begin planning your content strategy intelligently because an unstructured website can escalate into an unmanageable mess very quickly.
The real success of a company’s content strategy is knowing how and where it is consumed. Luckily the ‘always on’ availability of content means anyone can access information regardless of nationality and geographical location. If a company is serious about its content strategy, then it will put a process in place to create multilingual content that can be distributed to a multilingual audience.
Incorporating Machine Translation into your translation and localization workflow will make it easier to translate more content, faster than traditional human-only workflows. When your MT engine is integrated into your content management workflow, translations for your structured content can be sent to your global websites seamlessly, with minimal manual intervention.
Thanks to the high success rates KantanMT has had working with structured content, we believe that integrating Machine Translation into the workflow and planning a “quality” Intelligent Content Strategy go hand in hand.
Meet us at booth 2/A09 during the tekom trade fair and tcworld conference between 10th and 12th November to learn more about Custom Machine Translation and how it can fit in your intelligent content production workflow.
We have a few FREE Tekom Fair tickets to give away, so to be in with a chance to win, send an email to firstname.lastname@example.org with ‘FREE Ticket’ in the subject line and we will add you to a draw. Winners will be notified by email.