What Makes a Start-up Stand Out?

Irish Software Association

KantanMT was recently announced as a finalist in three out of eight categories of the Irish Software Awards 2016 (ISA 2016): ‘Emerging Company of the Year’, ‘Technology Innovation of the Year’ and ‘Outstanding Achievement in International Growth’. This is very exciting news for us, and our success has only been made possible thanks to our brilliantly supportive clients and partners. The announcement led us to take a walk down memory lane and think about the things that we did right over the past couple of years.

We would like to share some basic principles that we followed as a company, which helped us succeed and made us one of the most recognisable brands, not only within the translation and localization industry, but also within the wider Software Service scene.

If you are in start-up mode, these pointers will help you achieve full commercial exploitation within the span of a year.

Improving workflow integration and efficiency with KantanAPI

What is the KantanAPI?

KantanAPI enables KantanMT clients to interact with KantanMT as an on-demand web service. It provides a number of different services, including translation, file upload and retrieval, and job launches.

With the KantanAPI you can not only integrate KantanMT into your workflow systems but also receive on-demand translations from your KantanMT engines. All these services make the experience with Machine Translation as seamless as possible.

Accessing KantanAPI

Please Note: The API is only available to KantanMT members in the Enterprise Plan.

To access the KantanMT API you will first need your ‘API token’. This token can be found in the ‘API’ tab on the ‘My Client Profiles’ page of your KantanMT account.

Once you have your token, you can use the API in a number of ways:

  1. Using the API tab on the ‘My Client Profiles’ page in the KantanMT Web interface
  2. Using the REST interface via HTTP GET or POST requests
  3. Using one of our various connectors, which are built using our KantanAPI
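
As a rough illustration of the second option, the sketch below builds the URL for an HTTP GET request that carries the API token and the text to translate. The endpoint address and the parameter names (`token`, `text`) are assumptions made for this example only; the real values are defined in the KantanAPI technical documentation and may well differ.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint: the real URL and parameter names come from the
# KantanAPI technical documentation and may differ from these.
BASE_URL = "https://api.example.com/translate"


def build_translate_url(api_token: str, text: str, base_url: str = BASE_URL) -> str:
    """Return a GET URL carrying the API token and the text to translate."""
    query = urlencode({"token": api_token, "text": text})
    return f"{base_url}?{query}"


# The token is the one shown in the 'API' tab of your client profile.
url = build_translate_url("YOUR_API_TOKEN", "Hello world")
print(url)
```

The same parameters could be sent in the body of a POST request instead, which is usually the better choice for longer text.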

For more details on implementing your API solution via the REST interface, please see the full API technical documentation.

How to use KantanAPI?

Log in to your KantanMT account using your email address and password.

You will be directed to the ‘Client Profiles’ section of the ‘My Client Profiles’ page. The last profile you were working on will be ‘Active’.

To use the ‘KantanAPI’ with a profile other than the ‘Active’ profile, first click on the profile you wish to use. Then click on the ‘API’ tab.

You will be directed to the ‘API Settings’ page. Now click on the ‘Launch API’ button.

A ‘Launch API’ pop-up will now appear on your screen asking you ‘Are you sure you want to launch the API?’ Click ‘OK’.

The ‘API Status’ will now change from ‘offline’ to ‘initialising’, and the ‘Launch API’ button will change to ‘Launching API’.

When your KantanAPI launches, the ‘API Status’ will change from ‘initialising’ to ‘running’, the ‘Launching API’ button will change to ‘Shutdown API’, and you should now be able to click on the ‘Translate’ button.

Type the text you wish to translate in the text box and click on the ‘Translate’ button.

The translated text will now appear in the ‘Translated Text’ box. If you wish to make any changes to the translated text, simply place the cursor inside the ‘Translated Text’ box and make the changes. Save these changes by clicking the ‘Retrain Engine’ button.

Test if your engine was successfully retrained by clicking the ‘Translate’ button. The retrained text will now appear in the ‘Translated Text’ box.

If you don’t wish to retrain your engine and you are happy with the translated text in the ‘Translated Text’ box, you may continue translating other text or shut down your KantanAPI by clicking the ‘Shutdown API’ button.

When you click the ‘Shutdown API’ button, a pop-up will appear asking you ‘Are you sure you want to shut down the API?’ Click ‘OK’.

The ‘Shutdown API’ button will now change to ‘Terminating API’, the ‘API Status’ will change from ‘running’ to ‘terminating’, and you will no longer be able to click on the ‘Translate’ or ‘Retrain Engine’ buttons.

You will now be directed back to the initial screen on the API Settings page.
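
The status changes in this walkthrough can be summarised as a small state machine. The sketch below is only a model of what the interface shows, not KantanMT code: the status names come from the steps above, while the event names and the final return to ‘offline’ are assumptions made for illustration.

```python
# Status names are taken from the walkthrough; the event names
# ("launch", "ready", "shutdown", "stopped") are invented for this sketch,
# and the return to 'offline' after termination is an assumption.
TRANSITIONS = {
    ("offline", "launch"): "initialising",
    ("initialising", "ready"): "running",
    ("running", "shutdown"): "terminating",
    ("terminating", "stopped"): "offline",
}


def next_status(status: str, event: str) -> str:
    """Return the status that follows `status` when `event` happens."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"'{event}' is not possible while the API is {status}")


def can_translate(status: str) -> bool:
    """The 'Translate' button is only usable while the API is running."""
    return status == "running"


status = "offline"
for event in ("launch", "ready", "shutdown", "stopped"):
    status = next_status(status, event)
    print(event, "->", status, "| translate enabled:", can_translate(status))
```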

Additional Support

KantanAPI™ is one of the various machine translation services offered by KantanMT to improve productivity for our clients and enable them to be more efficient. For more information on KantanAPI or any KantanMT products, please contact us at info@kantanmt.com.

Understanding and Improving your KantanMT Engine with KantanTimeLine™

Ease of use and simplicity are always on the minds of our developers, which is why we built KantanTimeLine™. KantanTimeLine enables KantanMT clients to view the life cycle of their KantanMT engine. This empowers our clients, as they are able to find exactly what is negatively or positively affecting the quality of their engines. Clients are able to keep track of things such as Training Data uploads, Translation jobs, Engine Tuning, templates, Build jobs and so on through the KantanTimeLine.

How to use KantanTimeLine™

Log in to your KantanMT account using your email address and password.

You will be directed to the ‘Client Profiles’ section of the ‘My Client Profiles’ page. The last profile you were working on will be ‘Active’.

To use ‘KantanTimeLine’ with a profile other than the ‘Active’ profile, click on the profile whose ‘KantanTimeLine’ you wish to view.

Click on the ‘TimeLine’ tab.

You will now be directed to the ‘TimeLine’ page for your chosen profile.

To restore an archived Build, select the Build you wish to restore from the ‘Archives’ drop-down menu and click on the ‘Restore’ button.

To delete an archived Build, click on the ‘Delete’ button.

To archive a Build, click on the ‘Archive’ button of the Build you wish to archive.

To view or edit the description of a Build, click on the ‘Yellow Notepad’ icon.

To filter the timeline, click on the ‘Filter’ drop-down menu and select the filter you wish to use.

Additional Information and Support

KantanTimeLine™ is one of the many products offered by KantanMT to make the integration of Machine Translation into our clients’ workflows seamless. For more information on TimeLine or any KantanMT products, please contact us at info@kantanmt.com.

TimeLine can also be used in KantanBuildAnalytics. To learn how TimeLine is incorporated into KantanBuildAnalytics, please contact us at info@kantanmt.com.

Cloud Technology Translates into Success for KantanMT

KantanMT Founder and Chief Architect Tony O’Dowd was recently featured in one of Ireland’s major national newspapers, The Irish Times.

The author of the news article, Olive Keogh, is a business journalist who specialises in writing about innovative Irish enterprises and startups. With Olive’s kind permission, we are republishing the Irish Times article.

“It’s not widely known at home but Ireland has developed an international reputation for research in statistical machine translation. Trinity, DCU and UL are all recognised worldwide and 120 PhD students have graduated here with skills in the field in the last five years. That’s more than in any other country in Europe,” says Tony O’Dowd, the man behind KantanMT, a new scalable, high-speed machine translation system based on the Moses decoder and the Amazon Web Services and Cloud Computing infrastructure.

O’Dowd has spent almost 30 years in the software localization sector with companies such as Lotus Development Corporation and Symantec. Xcelerator, the company behind KantanMT, is O’Dowd’s second start-up, but he was also involved in the formation of FIT, a training organisation set up in 1998 to provide IT skills and training for the long-term unemployed.

Economics of the Cloud

“We are leveraging the Moses MT decoder and multiple streams of research from the Centre for Global Intelligent Content to make statistical machine translation (SMT) technology available to the masses,” he says.

“Traditional SMT systems are slow, expensive to deploy, time-consuming to customise and complex to manage. In short, not for the faint-hearted. I wanted to harness the economics of the cloud to solve these problems. Using hundreds of high-powered cloud-based servers to convert training data into data models also accelerated the process of customisation and the development of SMT engines.”

O’Dowd points out that in addition to the cost factor, traditional SMT solutions can produce translations of dubious quality. By focusing on advanced natural language processes and data processing algorithms, KantanMT also addresses these quality issues.

“Because of the costs involved, SMT tends to be used by large organisations with big budgets and plenty of people available to work on the system. The KantanMT platform removes this expense and complexity and makes it a far more practical and usable tool for businesses both big and small. Our clients can customise, improve and deploy their own engines in a matter of days,” O’Dowd says.

Software Localization

O’Dowd took his first steps as an entrepreneur in 2000 when he set up Alchemy Software Development. It quickly became a leading player in the software localization sector with over 27,000 licences in use worldwide. This success didn’t go unnoticed. The company was sold to the largest privately owned localization service provider, Translations.com, in March 2007.

Prior to setting up Alchemy O’Dowd was technology manager for Symantec Corporation Ireland and responsible for establishing the organisation’s Asian localization hub in Japan. He was also executive vice-president of Corel Corporation and spent three years as a lecturer in Trinity College Dublin teaching microprocessor design and assembly language programming.

O’Dowd began working on the idea for KantanMT in 2011 while on a year “off” to retrain himself on cloud-based technologies. He employed an MBA student to do detailed research into the barriers preventing companies using SMT and says the major leap forward in computing and storage capacity provided by the cloud enabled him to build a platform for SMT systems that would have been inconceivable without it.

Xcelerator recently raised €1.1 million in seed funding from venture capital company Delta Partners and the Enterprise Ireland High Potential Start Up fund. Early versions of KantanMT were given away free to kill competition and grab market share but first revenues (based on a usage pricing model) began flowing this time last year and O’Dowd says it is now profitable. A second round of funding is planned for later this year.

The company currently employs 11 people in its offices in Dublin and Galway, but this is expected to rise to 20-25 by the end of 2015. Its focus is the export market and its biggest customers are independent software vendors from industries such as ecommerce, finance and electronics. The company also provides MT services to the language industry.

School of Hard Knocks

“Starting your first business is definitely daunting as everything is new and you’re travelling down every road for the first time,” O’Dowd says.

“Next time around there is a lot of commonality and because you’ve learned by engaging with the school of hard knocks, you’re better at anticipating the problems and meeting the challenges. You also have a better network of contacts, you’re less frazzled when things don’t go right and you can actually grow the business faster and at a higher level. You also get a better hearing from the funding community as they view you as a safe pair of hands.”

KantanMT is based in the Invent Building at DCU and O’Dowd says the resources and expertise provided by the Invent team were instrumental in getting KantanMT.com off the ground.

“KantanMT.com is the fastest growing SMT platform in the localization industry today. So far over 80.5 billion words have been uploaded to the platform as training data and more than 750 million words have been translated by our clients. When you consider this has all happened in the last nine months, the company is rapidly becoming one of the biggest translation hubs in the market,” O’Dowd says.

Irish Times Business

The original article was published on Mon, Apr 27, 2015.

Email info@kantanmt.com to learn more about how the KantanMT platform operates, or if you would like to set up a personalised demo with Tony.

Machine Translation Technology and Internet Security

KantanMT is delighted to republish, with permission, a post on machine translation technology and internet security that was recently written by Joseph Wojowski. Joseph Wojowski is the Director of Operations at Foreign Credits and Chief Technology Officer at Morningstar Global Translations LLC.

Machine Translation Technology and Internet Security

An issue that seems to have been brought up once in the industry and never addressed again is the data collection methods used by Microsoft, Google, Yahoo!, Skype, and Apple, as well as the revelations of PRISM data collection from those same companies, thanks to Edward Snowden. More and more, it appears that the industry is moving closer to full Machine Translation integration and usage, with interesting, if alarming, findings being reported on Machine Translation’s usage when integrated into Translation Environments. The fact remains that Google Translate, Microsoft Bing Translator, and other publicly available machine translation interfaces and APIs store every single word, phrase, segment, and sentence that is sent to them.

Terms and Conditions

What exactly are you agreeing to when you send translation segments through the Google Translate or Bing Translator website or API?

1 – Google Terms and Conditions

Essentially, in using Google’s services, you are agreeing to permit them to store the segment and use it to create more accurate translations in the future; they can also publish, display, and distribute the content.

“When you upload, submit, store, send or receive content to or through our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content.” (Google Terms of Service – 14 April 2014, accessed on 8 December 2014)

Oh, and did I mention that in using the service, the user is bearing all liability for “LOST PROFITS, REVENUES, OR DATA, FINANCIAL LOSSES OR INDIRECT, SPECIAL, CONSEQUENTIAL, EXEMPLARY, OR PUNITIVE DAMAGES.” (Google Terms of Service – 14 April 2014, accessed on 8 December 2014)

So if it is discovered that a client’s confidential content is also located on Google’s servers because of a negligent translator, that translator is liable for losses and Google relinquishes liability for distributing what should have been kept confidential.

Alright, that’s a lot of legal wording, not the best news, and a lot to take in if this is the first time you’re hearing about this. What about Microsoft Bing Translator?

2 – Microsoft Services Agreement (correction made to content – see below)

In writing their services agreement, Microsoft got very tricky. They start out positively by stating that you own your own content.

“Except for material that we license to you that may be incorporated into your own content (such as clip art), we do not claim ownership of the content you provide on the services. Your content remains your content, and you are responsible for it. We do not control, verify, pay for, or endorse the content that you and others make available on the services.” (Microsoft Services Agreement – effective 19 October 2012, accessed on 8 December 2014)

Bing! Bing! Bing! Bing! Bing! We have a winner! Right? Hold your horses, don’t install the Bing API yet. It continues on in stating,

“When you transmit or upload Content to the Services, you’re giving Microsoft the worldwide right, without charge, to use Content as necessary: to provide the Services to you, to protect you, and to improve Microsoft products and services.” (Microsoft Services Agreement – effective 19 October 2012, accessed on 8 December 2014)

So again with Bing, while they originally state that you own the content you submit to their services, they also state that in doing so, you are giving them the right to use the information as they see fit and (more specifically) to improve the translation engine.

How do these terms affect the translation industry, then?

The problem arises whenever translators are working with documents that contain confidential or restricted-access information. Aside from his/her use of webmail hosted by Microsoft, Google, Apple, etc. (which also poses a problem with confidentiality), a translator who sends the contents of documents through free, public machine translation engines, whether through the website or API, is leaking the information he or she agreed to keep confidential in the Non-Disclosure Agreement (if established) with the LSP: a clear and blatant breach of confidentiality.

But I’m a professional translator and have been for years, I don’t use MT and no self-respecting professional translator would.

Well, yes and no; a conflict arises from that mode of thinking. In theory, yes, a professional translator should know better than to blindly use Machine Translation because of its inaccurate and often unusable output. A professional translator, however, should also recognize that with advancements in MT Technology, Machine Translation can be a very powerful tool in the translator’s toolbox and can, at times, greatly aid in the translation of certain documents.

The current state of the use of MT more echoes the latter than the former. In 2013 research conducted by Common Sense Advisory, 64% of the 239 people who responded to the survey reported that colleagues frequently use free Machine Translation Engines; 62% of those sampled were concerned about free MT usage.

In the November/December 2014 Issue of the ATA Chronicle, Jost Zetzsche relayed information on how users were using the cloud-based translation tool MemSource. Of particular interest are the Machine Translation numbers relayed to him by David Canek, Founder of MemSource. 46.2% of its around 30,000 users (about 13,860 translators) were using Machine Translation; of those, 98% were using the Google Translate or a variant of the Bing Translator API. And of still greater alarm, a large percentage of users using Bing Translator chose to employ the “Microsoft with Feedback” option which sends the finalized target segment back to Microsoft (a financially appealing option since when selected, use of the API costs nothing).

As you can imagine, while I was reading that article, I was yelling at all 13.9 thousand of them through the magazine. How many of them were using Google or Bing MT with documents that should not have been sent to either Google or Microsoft? How many of these users knew to shut off the API for such documents – how many did?

There’s no way to be certain how much confidential information may have been leaked due to translator negligence (in the best scenario, perhaps none), but it’s clear that the potential is very great.

On the other hand, in creating a tool as dynamic and ever-changing as a machine translation engine, the only way to train it and make it better is to use it, a sentiment that is echoed throughout the industry by developers of MT tools and something that can be seen in the output of Google Translate over the past several years.

So what options are there for me to have an MT solution for my customers without risking a breach in confidentiality?

There are numerous non-public MT engines available – including Apertium, a developing open-source MT platform – however, none of them are as widely used (and therefore, as well-trained) as Google Translate or Bing Translator (yes, I realize that I just spent over 1,000 words talking about the risk involved in using Google Translate or Bing Translator).

So, is there another way? How can you gain the leverage of arguably the best-trained MT Engines available while keeping confidential information confidential?

There are companies who have foreseen this problem and addressed it. Without pitching their product, here’s how it works. It acts as an MT API, but before any segments are sent across your firewall to Google, it replaces all names, proper nouns, locations, positions, and numbers with an independent, anonymous token or placeholder. After the translated segment has returned from Google and is safely within the confines of your firewall, the potentially confidential material replaces the tokens, leaving you with the MT-translated segment. On top of that, it also allows for customized tokenization rules to further anonymize sensitive data such as formulae, terminology, processes, etc.
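
To make the idea concrete, here is a minimal sketch of the placeholder approach. It is not the product described in the post: the function names, the token format and the rule set (a supplied list of sensitive terms plus a simple pattern for numbers) are all assumptions chosen for illustration, and a real tool would need much richer detection rules.

```python
import re


def mask(segment, sensitive_terms):
    """Replace sensitive terms and numbers with anonymous tokens.

    Returns the masked segment and a token-to-original mapping that
    never leaves your side of the firewall.
    """
    # Longest terms first, so "John Smith" wins over a shorter overlap.
    alternatives = [re.escape(t) for t in sorted(sensitive_terms, key=len, reverse=True)]
    alternatives.append(r"\d+(?:[.,]\d+)*")  # numbers such as 1,250,000
    pattern = re.compile("|".join(alternatives))
    mapping = {}

    def replace(match):
        token = f"__TOK{len(mapping)}__"
        mapping[token] = match.group(0)
        return token

    # A single pass, so tokens that were just inserted are never rescanned.
    return pattern.sub(replace, segment), mapping


def unmask(segment, mapping):
    """Put the original values back once the translation has returned."""
    for token, original in mapping.items():
        segment = segment.replace(token, original)
    return segment


source = "Acme Corp will pay 1,250,000 to John Smith."
masked, mapping = mask(source, ["Acme Corp", "John Smith"])
print(masked)  # this is all the public engine would see
print(unmask(masked, mapping))
```

In practice the masked segment would be sent to the public engine and `unmask` applied to what comes back, which relies on the engine leaving the tokens untouched.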

While the purpose of this article was not to prevent translators from using MT, it is intended to get translators thinking about its use and increase awareness of the inherent risks and solution options available.

— Correction —

As I have been informed, the information in the original post is not as exact as it could be: there is a Microsoft Translator Privacy Agreement that more specifically addresses use of the Microsoft Translator. Apparently, with Translator, they take a sample of no more than 10% of “randomly selected, non-consecutive sentences from the text” submitted. Unused text is deleted within 48 hours after translation is provided.

If the user subscribes to their data subscriptions with a maximum of 250 million characters per month (also available at levels of 500 million, 635 million, and one billion), he or she is then able to opt out of logging.

There is also Microsoft Translator Hub, which allows the user to personalize the translation engine, where “The Hub retains and uses submitted documents in full in order to provide your personalized translation system and to improve the Translator service.” And it should be noted that, “After you remove a document from your Hub account we may continue to use it for improving the Translator service.”

***

So let’s analyze this development. 10% of the full text submitted is sampled and unused text is deleted within 48 hours of its service to the user. The text is still potentially from a sensitive document and still warrants awareness of the issue.

If you use the Translator Hub, it uses the full document to train the engine, and even after you remove the document from your Hub, they may continue to use it to improve the Translator service.

Now break out the calculators and slide rules, kids, it’s time to do some math.

In order to opt out of logging, you need to purchase a data subscription of 250 million characters per month or more (the 250 million character level costs $2,055.00/month). If every word were 50 characters long, that would be 5 million words per month (where a month is 31 days), and a post-editor would have to process 161,290 words per day (working every single day of this 31-day month). It’s physically impossible for a post-editor to process 161,290 words in a day, let alone a month (working 8 hours a day for 20 days a month, 161,290 words per month would be 8,064.5 words per day). So we can safely assume that no freelance translator can afford to buy in at the 250 million character/month level, especially when, even in the busiest month, a single translator comes nowhere near being able to edit the number of words necessary to make it a financially sound expense.
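
For anyone who prefers to check the figures in code rather than on a slide rule, the short sketch below reproduces the arithmetic above. The variable names are made up for this example; the numbers are the ones used in the post.

```python
CHARS_PER_MONTH = 250_000_000  # smallest subscription level that allows opting out
CHARS_PER_WORD = 50            # the post's deliberately generous assumption
DAYS_IN_MONTH = 31
WORKING_DAYS = 20

# Words covered by the subscription each month.
words_per_month = CHARS_PER_MONTH // CHARS_PER_WORD

# Words a post-editor would have to process daily, working all 31 days.
words_per_day = words_per_month / DAYS_IN_MONTH

# Even 161,290 words spread over a 20-day working month is a heavy load.
daily_share_of_161290 = 161_290 / WORKING_DAYS

print(f"{words_per_month:,} words per month")
print(f"{words_per_day:,.0f} words per day over {DAYS_IN_MONTH} days")
print(f"{daily_share_of_161290:,.1f} words per working day")
```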

In the end, I still come to the same conclusion, we need to be more cognizant of what we send through free, public, and semi-public Machine Translation engines and educate ourselves on the risks associated with their use and the safer, more secure solutions available when working with confidential or restricted-access information.

The KantanMT team would like to thank Joseph Wojowski for allowing us to republish his very interesting and topical post on machine translation security. You can view the original post here.

KantanMT Security Key to translation success

At KantanMT, the security, integrity and privacy of our customers’ data are a top priority. We believe this is vital to their business operations and to our own success. Therefore, we use a multilayered approach to protect and encrypt this information. The KantanMT Data Privacy statement ensures that no client data is re-published, re-tasked or re-purposed, and that it is fully encrypted during storage and transmission.

Read more about the KantanMT Data Privacy Infrastructure (PDF Download)

For more information about our security infrastructure please contact the KantanMT Sales Team (sales@kantanmt.com).

Scalability or Quality – Can we have both?

The ‘quality debate’ is old news and the conversation, which is now heavily influenced by ‘big data’ and ‘cloud computing’, has moved on. Instead, it is focusing on the ability to scale translation jobs quickly and efficiently to meet real-time demands.

Translation buyers expect a system or workflow that provides high quality, fit-for-purpose translations. Because of this, Language Service Providers (LSPs) have worked tirelessly, perfecting their systems and orchestrating the use of Translation Memories (TM) within well-managed workflows, alongside the professionalization of the translation industry. Quality is now a given in the buyer’s eyes.

What is the translation buyers’ biggest challenge?

The Translation buyers’ biggest challenge now is scale – scaling their processes, their workflows and supply chains. Of course, the caveat is that they want scale without jeopardizing quality! They need systems that are responsive, are transparent and scale gracefully in step with their corporate growth and language expansion strategy.

Scale with quality! One without the other is as useless as a wind-farm without wind!

What makes machine translation better than other processes? Looking past the obvious automation of the localization workflow, the one thing that MT can do above all other translation methods is combine automation and scalability.

KantanMT recognizes this and has developed a number of key technologies to accelerate the speed of on-demand MT engines without compromising quality.

  • KantanAutoScale™ is an additional divide and conquer feature that lets KantanMT users distribute their translation jobs across multiple servers running in the cloud.
  • Engine Optimization technology means KantanMT engines now operate 5-10 times faster, reducing the amount of memory and CPU power needed so MT jobs can be processed faster and more efficiently when using features like KantanAutoScale.
  • API optimization: KantanMT engineers went back to basics, reviewing and refining the system, which enabled users to achieve performance improvements of 50-100% in translation speed. This means translation jobs that took five hours can now be completed in less than one hour.
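
The divide-and-conquer idea behind a feature like KantanAutoScale can be pictured with a small sketch: split a job into chunks, hand each chunk to a separate worker, then stitch the results back together in order. This is only an illustration of the general technique, not KantanMT’s implementation; threads stand in for cloud servers, and `translate_chunk` is a stand-in that simply upper-cases the text.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(segments, n_workers):
    """Divide the job into at most n_workers contiguous chunks."""
    size = max(1, -(-len(segments) // n_workers))  # ceiling division
    return [segments[i:i + size] for i in range(0, len(segments), size)]


def translate_chunk(chunk):
    # Stand-in for one server translating its share of the job.
    return [segment.upper() for segment in chunk]


def translate_job(segments, n_workers=4):
    """Translate all segments in parallel and return them in source order."""
    chunks = split_into_chunks(segments, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in the order the chunks were submitted.
        results = list(pool.map(translate_chunk, chunks))
    return [segment for chunk in results for segment in chunk]


print(translate_job(["one", "two", "three", "four", "five"], n_workers=2))
```

Keeping the chunks contiguous and collecting them in submission order means the translated file comes back in the same order as the source, however many workers are used.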

Scalability is the key to advancement in machine translation, and considering the speed at which people are creating and digesting content we need to be able to provide true MT scalability to all language pairs for all content.

KantanMT’s Tony O’Dowd and bmmt’s Maxim Khalilov will discuss the scalability challenge and more in a free webinar for translation buyers, ‘5 Challenges of Scaling Localization Workflows in the 21st Century’, on Thursday November 20th at 4pm GMT, 5pm CET, 8am PST.

To hear more about optimizing or improving the scalability of your engine please contact Louise Irwin (louisei@kantanmt.com).

Leveraging MT to Improve Productivity

Communication is one of the most important elements of business, and Machine Translation is a flexible tool that can be used to facilitate communication in a wide variety of scenarios and situations. Multinationals and other companies operating globally can take advantage of Machine Translation to achieve productivity gains.

This two-part blog series examines two very different examples of implementing Machine Translation. This first post will look at what multinational organizations should consider before introducing Machine Translation to their business, and the second post will discuss the productivity gains and competitive advantages that can be achieved by Language Service Providers (LSPs) who adopt MT.

What is a multinational and why should it use Machine Translation?

Multinational corporations or global businesses are organizations operating in more than one country or region. The concept of an ‘international company’ has been around for hundreds of years, going back to the trading companies, which were established in the 1700s. Outside political agendas, their main purpose was to trade in spices and other commodities throughout Asia and Europe exposing traders to different languages and cultures.

Hundreds of years later, global communication is commonplace as more businesses operate internationally. There are no boundaries, and companies with worldwide operations require a constant flow of multilingual communication in order to maintain relationships between global employees, customers and stakeholders.

Multinational organizations typically have two types of content: external and internal. External content is created and released to the public: corporate documents, investor information, Corporate Social Responsibility (CSR) and marketing communications. Internal content, on the other hand, is created for use within the company; this is usually in the form of email and chat communications, memos and other internal documents.

To Translate or not to translate

Organizations without an in-house translation team often outsource the translation of external content to a reputable LSP. This ensures a guaranteed level of quality for the translation, and it also means that the process of localization is more efficient and cost-effective. This is because, over time, language assets in the form of translation memories can be built up and leveraged to offset the cost of future translations.

Internal content, however, mostly comprises communications between departments: emails, chats and information on sales and marketing activities. These are usually not translated professionally for a number of reasons:

  • Cost – the volume to be translated can make costs unmanageable
  • Confidentiality – managing sensitive information is more difficult
  • Real-time translation – emails and chat conversations generally require real-time speed

As an example, if a company is headquartered in the United States, but operates in both Asia and Europe there is a very high possibility that more than one language is used in the company’s internal communication.

Multinational companies often select working languages that must be used for internal communications and department managers are sometimes required to have a certain level of proficiency in the company’s designated working languages, which usually includes English.

Large organizations like the United Nations also have official languages. In this case, documents are not published until a translation has been prepared in each official language.

So, what happens when an email with a client’s product specifications and sales information is sent to a group of employees who speak different languages? Some of those readers may have limited knowledge of the language being used: they can understand the communication, but are not familiar enough with the language to write a coherent response. This can result in them responding in their native language. Suddenly, a single conversation thread contains more than one language, with a greater potential for miscommunication.

Why use Machine Translation?

Multinationals with global operations often have issues with the quantity and flow of internal information between departments operating in different languages. If the corporate headquarters uses a different language than its global subsidiaries, corporate documents need to be translated into each language as the internal information moves down the organizational hierarchy.

Machine Translation is a solution that can provide an instant, understandable ‘gist’ of internal information across a company operating in different languages. Used this way, MT can serve two purposes:

  • Documents that require a professional human translation are easily identified
  • Internal documents can be translated instantly so employees can get an understanding of the content

In order to understand internal content, employees often use a free online MT service such as Google Translate. While this is useful, it does not take into consideration any proprietary jargon or writing styles specific to the organization, and it also raises the question of confidentiality.

Challenges of MT

Many organizations may be interested in taking steps to deploy their own MT systems rather than outsourcing translation jobs or asking bilinguals in the company to do ad hoc translations. Those considering MT have two options: develop their own in-house system or use a cloud-based subscription model.

Implementing any new process has challenges and MT is no exception. Some challenges traditionally associated with implementing MT systems are:

  • High costs
  • Complex technology
  • Long deployment times

How should an MT system be integrated?

Before going ahead with an MT solution, an organization needs to carefully consider what it hopes to achieve from implementing Machine Translation. The company should evaluate all the perceived benefits thoroughly, and manage expectations about what Machine Translation can deliver.

Organizations thinking of implementing MT should ask:

  • What is its purpose? – Will MT be used as a management tool to improve internal communication and productivity, or to make decisions on what documents require professional outside translation? The purpose should be clearly defined at the outset.
  • Do we have enough language assets to build high quality engines? Bilingual language assets are a key ingredient for building MT engines. The quality of the training data will have a direct impact on the MT engine’s output: “garbage in, garbage out”.
  • Should we invest in building our own system or buy a cloud-based subscription service? MT systems can be rule-based (RBMT), statistical (SMT) or hybrid. In-house development of a proprietary MT system requires a heavy technology, HR and training investment, unless those assets are readily available. Cloud-based subscription models do not require such a heavy initial investment and are often more cost-effective than developing and managing an in-house MT system.
  • Is the Machine Translation option scalable? How many language combinations will be needed? If each language pair requires its own unique engine, how simple is it to build additional engines with new language combinations? Scalability will be determined by translating capacity and the ability to add new language combinations. This is especially important when entering different language markets or expanding the business to new regions. The MT solution should align itself with the company’s long-term goals.
  • How will MT be integrated into everyday workflows? Users need to be able to easily access translation functions through their existing applications, such as email or the company intranet system, for MT to be accessible and viable.
  • What indirect costs and planning will be involved? RBMT and hybrid systems require qualified linguists or language experts to develop and manage the engines. SMT systems use algorithms to identify probable translations based on how frequently they occur in the training data, so storage capacity is essential for the large volumes of training data required. Cloud options eliminate the need for in-house technology investment, but extra costs might be incurred for exceeding the subscription plan’s allowance, similar to the minutes allowance with mobile phone usage.
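
To make the subscription point in the last question concrete, here is a minimal Python sketch of how a monthly bill might be estimated when usage exceeds a plan’s allowance. The function name, the parameters and all of the figures are hypothetical and chosen purely for illustration; they do not reflect the pricing of any real MT provider.

```python
def monthly_cost(words_translated, base_fee, included_words, overage_rate_per_word):
    """Estimate the monthly bill for a hypothetical cloud MT subscription.

    The plan charges a flat base fee that covers `included_words`; any words
    beyond that allowance are billed at `overage_rate_per_word`.
    """
    extra_words = max(0, words_translated - included_words)
    return base_fee + extra_words * overage_rate_per_word


# All figures are made-up examples, not real prices.
plan = {
    "base_fee": 500.0,
    "included_words": 1_000_000,
    "overage_rate_per_word": 0.001,
}

# Compare a month under the allowance, at the allowance and over it.
for volume in (800_000, 1_000_000, 1_500_000):
    print(f"{volume:>9} words -> {monthly_cost(volume, **plan):.2f}")
```

Running the same calculation with an organization’s own expected volumes is one simple way to check whether a given plan still compares well against the cost of an in-house system.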

By carefully answering these questions, any organization planning to implement MT can stay focused on using the most cost-effective solution and achieve productivity gains with less miscommunication and more time savings.

The next part of this blog will look at how LSPs can leverage Machine Translation technology for productivity gains and competitive advantage.