RBMT vs SMT

A commonly asked question within the localization industry is which is better: Rule-Based or Statistical Machine Translation systems. While both approaches have merits, the question in my mind is which offers the best future potential and the best value for LSPs considering an offering that includes an element of Machine Translation.

According to Don DePalma and his team at Common Sense Advisory, if you’re an LSP and haven’t been asked to provide an RFQ (Request for Quotation) that includes an element of Machine Translation, then you’re rapidly becoming the exception!

So as a successful LSP entrepreneur, which is the best wagon to hitch your horses to: Rule Based or Statistical Machine Translation?

First of all, what is Machine Translation?

Machine translation (MT) is automated translation or “translation carried out by a computer” – as defined in the Oxford English dictionary. It is the process by which computer software is used to translate a text from one natural language to another.

Machine Translation systems have been in development since the 1950s. However, the technology required to build successful MT systems was not up to par at that time, so research was largely put to one side. But in the last 15 years, as computational resources have become more mainstream and the internet has opened up a wider multilingual and global community, interest in Machine Translation has been renewed.

There are three different types of Machine Translation systems available today. These are Rule-Based Machine Translation (RBMT), Statistical Machine Translation (SMT) and hybrid systems – a combination of RBMT and SMT.

Rule-Based Machine Translation Technology

Rule-based machine translation relies on countless built-in linguistic rules and gigantic bilingual dictionaries for each language pair. An RBMT system works by parsing text and creating a transitional representation from which the text in the target language is generated. This process requires extensive lexicons with morphological, syntactic, and semantic information, and large sets of rules. The system applies this rule set to transfer the grammatical structure of the source language into the target language.
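To make the parse, transfer and generate steps a little more concrete, here is a deliberately tiny sketch in Python. It is not how any commercial RBMT engine is built: the five-word lexicon, the part-of-speech tags and the single reordering rule (adjective before noun becomes noun before adjective, English to Spanish) are all invented for illustration.

```python
# Toy rule-based transfer, English -> Spanish (illustrative only).
# LEXICON, the tags and the single reordering rule are invented for this sketch.

LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "big": ("grande", "ADJ"),
    "dog": ("perro", "NOUN"),
}


def analyse(sentence):
    """Parse step: look up each source word and attach its tag and translation."""
    tokens = []
    for word in sentence.lower().split():
        if word not in LEXICON:
            raise KeyError(f"no lexicon entry for {word!r}")
        target, tag = LEXICON[word]
        tokens.append({"source": word, "target": target, "tag": tag})
    return tokens


def transfer(tokens):
    """Transfer step: apply one structural rule, ADJ NOUN -> NOUN ADJ."""
    result = list(tokens)
    i = 0
    while i < len(result) - 1:
        if result[i]["tag"] == "ADJ" and result[i + 1]["tag"] == "NOUN":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return result


def generate(tokens):
    """Generation step: emit the target-language words in their new order."""
    return " ".join(token["target"] for token in tokens)


def translate(sentence):
    return generate(transfer(analyse(sentence)))


print(translate("the red car"))
```

Even this toy shows where the cost comes from: every new word needs a lexicon entry and every new construction needs another hand-written rule.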

In most cases, there are two steps: an initial investment that significantly increases the quality at a limited cost, and an ongoing investment to increase quality incrementally. While rule-based MT brings companies to a reasonable quality threshold, the quality improvement process is generally long and expensive. This has been a contributing factor to the slow adoption and usage of MT in the localization industry.

Surely, there must be a better approach!

Statistical Machine Translation Technology

Statistical Machine Translation (SMT) utilizes statistical translation models generated from the analysis of monolingual and bilingual content. Essentially this approach uses computing power to build sophisticated data models to translate one source language into another. This makes the use of SMT a far simpler option, and a significant factor in the broader adoption of statistical machine translation technology in the localization industry.

Building SMT models is a relatively quick and simple process. Using current systems, users can upload training material and have an MT engine generated in a matter of hours. While it is generally thought that a minimum of two million words is required to train an engine for a specific domain, it is possible to reach an acceptable quality threshold with much less. The technology relies on bilingual corpora such as translation memories and glossaries to learn the language patterns, while monolingual data is used to improve the fluency of the output, as the engine has more text examples to choose from. SMT engines produce higher-quality output when trained on domain-specific data, for example from the medical, financial or technical domains.
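As a rough illustration of what "learning language patterns" from bilingual data means, the sketch below estimates translation probabilities by simple relative frequency and then picks the most frequent option for each word. It is a minimal sketch under heavy assumptions: the aligned pairs in `PHRASE_PAIRS` are made up, and a real SMT engine also handles word alignment, multi-word phrases, reordering and a language model built from monolingual data, none of which appear here.

```python
from collections import Counter, defaultdict

# Hypothetical aligned word pairs, standing in for what might be
# extracted from a translation memory. Invented for illustration.
PHRASE_PAIRS = [
    ("bank", "banco"),
    ("bank", "banco"),
    ("bank", "orilla"),
    ("account", "cuenta"),
    ("account", "cuenta"),
    ("river", "río"),
]


def train(pairs):
    """Estimate P(target | source) by relative frequency of the observed pairs."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: c / total for t, c in targets.items()}
    return table


def translate(sentence, table):
    """Pick the most probable translation for each word independently."""
    output = []
    for word in sentence.lower().split():
        options = table.get(word)
        if options is None:
            output.append(word)  # unknown words pass through untranslated
        else:
            output.append(max(options, key=options.get))
    return " ".join(output)


table = train(PHRASE_PAIRS)
print(table["bank"])
print(translate("bank account", table))
```

Because each word is chosen on its own, the output follows the training data but is not necessarily fluent, which is the gap the monolingual data mentioned above is meant to close.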

SMT technology is CPU intensive and requires an extensive hardware configuration to run translation models for acceptable performance levels. However, the introduction of cloud services, and the increasing availability of bilingual corpora are having a dramatic effect on the popularity of SMT systems, which is leading to a higher adoption rate in the language services industry.

RBMT vs. SMT

  • RBMT can achieve good results but the training and development costs are very high for a good quality system. In terms of investment, the customization cycle needed to reach the quality threshold can be long and costly.
  • RBMT systems can be built with much less data than SMT systems, instead using dictionaries and language rules to translate. This sometimes results in a lack of fluency.
  • Language is constantly changing, which means rules must be managed and updated where necessary in RBMT systems.
  • SMT systems can be built in much less time and do not require linguistic experts to apply language rules to the system.
  • SMT models require state-of-the-art computer processing power and storage capacity to build and manage large translation models.
  • SMT systems can mimic the style of the training data to generate output based on the frequency of patterns, allowing them to produce more fluent output.

The Verdict

Statistical Machine Translation technology is growing in acceptance and is, by far, the clear leader of the two technologies. The increasing availability of cloud-based computing is providing a solution to the high computer processing power and storage capacity required to run SMT technology effectively, making SMT a game changer for the localization industry.

Training data for SMT engines is becoming more widely available, thanks to the internet and the increasing volumes of multilingual content being created by both companies and private internet users. High-quality aligned bilingual corpora are still expensive and time-consuming to create but, once created, become a valuable asset to any organization implementing SMT technology, with translations benefiting from economies of scale over time.

Tony O’Dowd, Founder and Chief Architect, KantanMT.com

Innovation Strategy

Welcome back to the second part of this blog series, which examines ‘innovation as strategy’. Please feel free to comment and share.

The primary goal of an “Innovation Strategy”, as defined by Porter, is to leapfrog competitors via the introduction of a completely new, or notably better product or service. The best example I can think of is Apple and its introduction of the Apple iPod.

I was part of the Sony Walkman generation (I even had a Sony Discman!). But when Apple released the iPod, well, it was a no-brainer: Sony was ditched and I happily joined the hip new iPod generation!

In the 90s, LSPs were viewed as innovative if they were using Translation Memory (TM) technologies such as TRADOS and Alchemy CATALYST. Today this is no longer the case. TM technologies are now considered standard and are an expected part of the process. Translation Memories are no longer differentiators!

As Machine Translation becomes more accessible, both in terms of cost and ease of use, progressive mid-sized LSPs are increasingly eager to integrate this technology into their workflows.

Easy access to affordable MT has given many Language Service Providers (LSPs) the opportunity to become innovative, inching ahead of competitors. It has also given them the opportunity to offer the same Machine Translation services that in the past were only provided by large LSPs.

The technological playing field is now being levelled. Ignoring an Innovation Strategy that includes the introduction of Machine Translation may well leave some LSPs on the sidelines in future project negotiations, as they compete with more progressive LSPs who have adopted the latest technologies.

Have you tried Machine Translation on KantanMT.com? It’s easy, and free to get started. Sign up for your 14-day free trial today and start translating within hours.

Watch out for KantanMT’s post on differentiation strategies.

Tony O’Dowd, Founder and Chief Architect, KantanMT.com

Business Strategies

Welcome. This is a four-part blog series which will examine Porter’s core strategies for competitive advantage. During the series we will look at how these strategies can be applied to companies working in the translation industry.

Introduction

Michael Porter, Harvard Business School, explains that competitive advantage occurs when an organisation “acquires or develops an attribute or combination of attributes that allows it to outperform its competitors.”

Expanding on this concept in his book “Competitive Strategy” (1980, voted the ninth most influential management book of the 20th century), and again in “Competitive Advantage” (1985, a book I read during my years in college), he set out four core strategies companies should embrace in order to create a clear and superior competitive advantage in their markets.

I thought it would be interesting to see how Machine Translation – as a growing service differentiator in the LSP world – would fit into Porter’s four strategies, and to examine if it ticks all of the Competitive Advantages check boxes!

Cost Leadership Strategy

Porter defines “Cost Leadership” as offering products or services at the lowest possible cost in the industry. The emphasis here is on cost rather than price; cost is what you purchase your products/services at and well, price is what you sell these on at – hopefully obtaining a nice profit in the process, helping your company grow and thrive. I guess in a nutshell, it’s all about avoiding operating at a loss by optimising this cost/price ratio.

But the devil of achieving that cost/price optimisation is in the detail of efficiently running a day-to-day innovative business, and of running a business that develops an attribute, or attributes, that differentiate it from its competitors. Successful companies that embrace Porter’s Cost concept must by necessity strategically vary their Cost attributes through the product/service they offer. A good example is Walmart, which offers key items at deep discounts while selling other products at less aggressive discounts. These are different sides of the same cost/price coin and, taken holistically, can make a very successful strategy. Walmart has successfully beaten off all of its major competitors in the US domestic market for decades by pursuing this particular Cost Leadership strategy.

So what’s the take-out here for Localization Service Providers (LSPs) on cost/price? Well, for the majority of translation quotes, the per-word translation costs represent the lion’s share of the total project costs: in many cases this is as much as 85%. So while some LSPs may focus on containing the costs of their support services (such as engineering, project management, review and edit), the really successful ones realise that it is by focusing on the translation costs, that 85% of the cost area, that they can gain the most competitive advantage.

This reality has been manifesting itself as a significant and wholehearted move by many LSPs. Many are now moving towards Translation Automation as a cost saver. Clearly, for an LSP to embrace a “Cost Leadership Strategy”, it must be relentless in pursuing a translation automation strategy. Only by developing such a strategy will an LSP give itself the strong differentiating cost attribute that allows it to outperform its competitors.

Machine Translation is a key component of any translation automation strategy, and its use can positively impact on the translation-cost component of any localization project. For instance, one of our KantanMT members reported a 37% reduction in translation costs as a result of integrating MT into their automated translation workflow.
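To see what a saving on the translation component could mean for a whole project, here is a small back-of-the-envelope calculation. It rests on two assumptions that are mine rather than the member's: that translation really does make up 85% of the project cost (the "as much as" figure quoted earlier), and that the reported 37% reduction applies to that component only. The function name `project_saving` is just a label for this sketch.

```python
def project_saving(translation_share, translation_cost_reduction):
    """Fraction of total project cost saved when only the
    translation component of the project becomes cheaper."""
    return translation_share * translation_cost_reduction


# Assumed inputs: 85% translation share, 37% reduction on that share.
saving = project_saving(0.85, 0.37)
print(f"Saving on total project cost: {saving:.2%}")
```

Under those assumptions the overall saving comes out at a little under a third of the total project cost. With a smaller translation share the figure shrinks in proportion, so it is worth plugging in your own numbers.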

…Read more about Porter’s strategies in Friday’s blog.
Tony O’Dowd, Founder and Chief Architect

GALA Webinar: KantanMT Quality Analysis Technology

Tony O’Dowd, founder and Chief Architect of KantanMT, will be presenting a GALA webinar on the 12th of September 2013. The webinar, entitled ‘KantanMT Analytics – The Missing Link in Machine Translation’, will present the evolution of Machine Translation technology with a special focus on quality measurement.

Having recently released KantanMT Analytics with the CNGL Centre for Global Intelligent Content, Tony demonstrates how this technology can be applied to measure MT output. He also shows how it will assist LSPs and other users of MT to build strategic business models on Machine Translation.

Quality Analysis

Join this webinar to see how segment based quality analysis technology will transform your Machine Translation functions. Register here>>

*Attendees can avail of a 30 day trial on KantanMT and Free usage of the KantanMT Analytics technology.

Conference and Events – How to Decide what to Budget for?

Attending any Localization Conferences this year?

There are hundreds of translation and localization conferences to choose from each year, but how do you know which conferences and events will help you to achieve your business goals?

Some conference websites are great, others not so much. The great ones, however, will give you a detailed description of exactly what will happen during the event. These make the decision of whether to attend or exhibit an easy one. You should be able to quickly determine whether the programme, the topics, the speakers and the attendees fit in with your needs.

Unfortunately, as you have all probably experienced, conference websites that do not offer a detailed overview of the conference plan make it much more difficult to decide whether or not to go.

How do you choose?

Are you seeking funding? Looking to secure new customers? Want to increase brand awareness? Hoping for some good press? It may be difficult to achieve all of these at one event, so the first thing you need to do is establish your goals.

Establish Goals…

Once you establish your main goals, you then need to ask yourself what types of interactions will help you to accomplish these goals. If you are a start-up offering a software solution, then you are likely to need a lot of interaction and hands-on time with potential customers. You should also investigate whether or not you can give live demos or presentations of the product.

If you are an established company, the number of leads most likely outweighs the depth of interaction. For Language Service Providers (LSPs) looking to gain new business customers, networking will be a big part of your conference activities. Make sure to set up plenty of meetings before the event so you can:

1. Build stronger relationships with your current customers

2. Engage with prospects and sell your offering.

Should you sponsor?
Sponsoring a conference can sometimes be a great way to gain exposure, connect with potential clients, and grow brand awareness. There are times, however, when sponsorship can eat up a whole lot of money and return very little reward.

Things to remember about sponsorship…

Each event is an experience, and the key to a successful partnership is to find an event or conference that will provide an experience people can associate positively with your brand and that reinforces your brand’s core messages.

Once you decide on an event to sponsor, you then need to leverage both the event and the experience. You can do this by integrating all your marketing activities to align with the event. This includes:

  • Organising competitions
  • Setting up conference landing pages
  • Setting up meetings with key customers
  • Aligning all your social media
  • Carrying out event marketing activities before, during and after the event

Later this year KantanMT will be attending TEKOM, exhibiting at Web Summit and Localization World, and sponsoring ASLIB (Translation and the Computer Conference) as a Silver sponsor.

Attending any of these conferences? Get in touch with us to set up a meeting or pop by our booth to say hello. (Niamhl@kantanmt.com)

See a full listing of this year’s translation and localization events and conferences here>>

KantanMT: KantanThe Cat

A picture says a thousand words, and we’ve added in a few more 🙂

For more information on KantanMT please visit our website, www.KantanMT.com. 

Still haven’t signed up for a Free trial? You can do so here >>

Meet the Team

Meet the KantanMT team.

As a start-up, KantanMT is constantly growing its staff base. But for the minute, this is the team that we have working in Galway and Dublin.

Tony O’Dowd, Founder and Chief Architect

Noreen O’Dowd, HR and Accounting

Aidan Collins, User Engagement Manager

Kevin McCoy, Customer Relationship Manager

Niamh Lacy, Digital Marketing (well mainly digital)

Luke Curran, Site Reliability Engineer

Marek Mazur, Software Development Engineer

You’ll be hearing from us over the coming months and we’d also love to hear from you. Contact Niamh at niamhl@kantanmt.com and submit your comments.