Localizing in Indian – Oh Wait! That’s Not a Language


Localizing for the Indian buyer can be an extremely daunting task, especially if it is a company’s first foray into the Asian market. The tastes, expectations and buying habits of Indian shoppers are as varied and diverse as the population of the country, so localizing for India requires additional planning, strategizing and cultural understanding. The most important thing to keep in mind is the sheer variety of languages spoken in India. It’s not Indian: it’s Hindi, Punjabi, Bengali, Tamil, Telugu, Assamese and numerous other languages!

KantanMT – 2013 Year in Review

KantanMT had an exciting year as it transitioned from a publicly funded business idea into a commercial enterprise that was officially launched in June 2013. The KantanMT team are delighted to have surpassed expectations by developing and refining cutting-edge technologies that make Machine Translation easier to understand and use.

Here are some of the highlights for 2013, as KantanMT looks back on an exceptional year.

Strong Customer Focus…

The year started on a high note with the opening of a second office in Galway, Ireland, and KantanMT kept the momentum going as the year progressed. The Galway office is focused on customer service, product education and Customer Relationship Management (CRM), and is home to Aidan Collins, User Engagement Manager; Kevin McCoy, Customer Relationship Manager and MT Success Coach; and Gina Lawlor, Customer Relationship Co-ordinator.

KantanMT officially launched the KantanMT Statistical Machine Translation (SMT) platform as a commercial entity in June 2013. The platform was tested pre-launch by both industry and academic professionals, and was presented at the European OPTIMALE (Optimizing Professional Translator Training in a Multilingual Europe) workshop in Brussels. OPTIMALE is an academic network of 70 partners from 32 European countries, and the organization aims to promote professional translator training as the translation industry merges with the internet and translation automation.

The KantanMT Community…

The KantanMT members’ community now includes top-tier Language Service Providers (LSPs), multinationals and smaller organizations. In 2013, the community grew from 400 members in January to 3,400 registered members in December. In response to this growth, KantanMT introduced two partner programs, with the objective of improving the Machine Translation ecosystem.

These are the Developer Partner Program, which supports organizations interested in developing integrated technology solutions, and the Preferred Supplier of MT Program, which is dedicated to strengthening the use of MT technology in the global translation supply chain. KantanMT’s Preferred Suppliers of MT are:

KantanMT’s Progress…

To date, the most popular target languages on the KantanMT platform are French, Spanish and Brazilian Portuguese. Members have uploaded more than 67 billion training words and built approx. 7,000 customized KantanMT engines that have translated more than 500 million words.

As usage of the platform increased, KantanMT focused on developing new technologies to improve the translation process, including a mobile application for iOS and Android that allows users to get access to their KantanMT engines on the go.

KantanMT’s Core Technologies from 2013…

KantanMT have been kept busy continuously developing and releasing new technologies to help clients build robust business models to integrate Machine Translation into existing workflows.

  • KantanAnalytics™ – segment-level Quality Estimation (QE) analysis, expressed as a percentage ‘fuzzy match’ score on KantanMT translations, which provides a straightforward method for costing and scheduling translation projects.
  • BuildAnalytics™ – a QE feature designed to measure the suitability of the uploaded training data. The technology generates a segment-level percentage score on a sample of the uploaded training data.
  • KantanWatch™ – makes monitoring the performance of KantanMT engines more transparent.
  • TotalRecall™ – combines TM and MT technology. TM matches with a ‘fuzzy match’ score of less than 85% are automatically put through the customized MT engine, giving users the benefits of both technologies.
  • KantanISR™ – Instant Segment Retraining technology that allows members to correct and retrain their KantanMT engines almost instantaneously.
  • PEX Rule Editor – an advanced pattern-matching technology that allows members to correct repetitive errors, making post-editing smoother by reducing post-editing effort, cost and time.
  • Kantan API – critical for the development of software connectors and smooth integration of KantanMT into existing translation workflows. The success of the MemoQ connector led to the development of subsequent connectors for MemSource and XTM.
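To make the TotalRecall™ idea more concrete, here is a minimal Python sketch of threshold-based routing between a translation memory (TM) and an MT engine. Only the 85% threshold comes from the description above. The function names, the use of `difflib` to approximate a ‘fuzzy match’ score, and the `mt_translate` callable are all assumptions made for illustration, and this is not KantanMT’s actual implementation or API.

```python
from difflib import SequenceMatcher

# Threshold from the TotalRecall description: TM matches scoring below 85%
# are sent to the MT engine instead.
FUZZY_THRESHOLD = 85.0


def fuzzy_score(a, b):
    """Approximate a percentage 'fuzzy match' score (assumption: difflib ratio)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def best_tm_match(source, tm):
    """Return (target, score) for the closest TM entry, or (None, 0.0)."""
    best_target, best_score = None, 0.0
    for tm_source, tm_target in tm.items():
        score = fuzzy_score(source, tm_source)
        if score > best_score:
            best_target, best_score = tm_target, score
    return best_target, best_score


def translate_segment(source, tm, mt_translate):
    """Use the TM when the match is good enough, otherwise fall back to MT.

    mt_translate is a hypothetical callable standing in for a customized
    MT engine.
    """
    target, score = best_tm_match(source, tm)
    if target is not None and score >= FUZZY_THRESHOLD:
        return target, "TM"
    return mt_translate(source), "MT"


# Usage with a toy TM and a stand-in MT function.
toy_tm = {"Click the Save button.": "Cliquez sur le bouton Enregistrer."}
toy_mt = lambda text: "[MT] " + text

print(translate_segment("Click the Save button.", toy_tm, toy_mt))
print(translate_segment("Restart the server before installing.", toy_tm, toy_mt))
```

The design point is simply that every segment gets the TM translation when the match is strong and the MT output when it is not, so no segment is left untranslated.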

KantanMT sourced and cleaned a range of bi-directional, domain-specific stock engines that consist of approx. six million words across the legal, medical and financial domains, and made them available to its members. KantanMT also developed support for Traditional and Simplified Chinese, Japanese, Thai and Croatian during 2013.

Recognition as Business Innovators…

KantanMT received awards for business innovation and entrepreneurship throughout the year. Founder and Chief Architect, Tony O’Dowd, was presented with the ICT Commercialization award in September.

In October, KantanMT was shortlisted for the PITCH start-up competition and participated in the ALPHA Program for start-ups at Dublin’s Web Summit, the largest tech conference in Europe. Earlier in the year KantanMT was also shortlisted for the Vodafone Start-up of the Year awards.

KantanMT were silver sponsors at the 2013 ASLIB Conference, which adopted the theme ‘Translating and the Computer’ and took place in London in November. In October, Tony O’Dowd presented at the TAUS Machine Translation Showcase at Localization World in Silicon Valley.

KantanMT has recently published a white paper introducing its cornerstone Quality Estimation technology, KantanAnalytics, and explaining how this technology addresses the biggest industry challenges facing widespread adoption of Machine Translation.

KantanAnalytics WhitePaper December 2013

For more information on how to introduce Machine Translation into your translation workflow contact Niamh Lacy (niamhl@kantanmt.com).

機械翻訳 KantanMT Supports Japanese

This week, KantanMT announced the introduction of a Japanese tokenizer and detokenizer to its KantanMT platform. This means that members can now build Machine Translation engines with Japanese as either the source or target language. To celebrate the release of KantanMT Japanese, we are going to give you a few facts and figures about Japan, the language, and Japan’s Machine Translation industry.

Oh and by the way, the title of this post means “Machine Translation”!!

The Japanese Language
Japanese is known as one of the world’s most difficult languages. Not too difficult to speak, but tough to read and write.

Japanese syntax is very different to English

  • Japanese sentence structure follows a subject-object-verb (SOV) or object-subject-verb (OSV) order, unlike the English subject-verb-object (SVO) structure. The verb always comes at the end of a sentence.
  • The indefinite and definite articles (‘a’ and ‘the’) are not commonly used
  • Japanese is written in three scripts: Hiragana, Katakana, and Kanji
  • The singular and plural of a word are the same
  • 5 vowels and 11 consonants produce the 48 sounds of the language
  • Japanese does not distinguish between “L” and “R” sounds

There is some good news, however, because nouns do not have genders in Japanese, just like English!

Some other facts about Japanese…

  • There are approx. 130 million people speaking Japanese in the world today. Most of these are in Japan of course, but there are also people speaking Japanese as their first language in the USA and South America. Japanese is the second most common language spoken in Brazil.
  • The literacy rate in Japan is almost 100%.
  • There are thousands of foreign loan words in the Japanese language. These are called gairaigo (外来語) and come mostly from English and other European languages. These words are typically written in the Katakana script.
  • English is the only foreign language taught in public Japanese schools.


Japan and Machine Translation
Now that we know some more about the Japanese language, we’re going to turn our attention to the history of Japan’s Machine Translation Industry.

In 1955, the first Japanese research programme began at Kyushu University, and the other major Machine Translation research bodies in Japan up until the mid-60s were The Electrotechnical Laboratory in Tokyo and Kyoto University. It was at the Electrotechnical Laboratory in Tokyo that research on the first English to Japanese Machine Translation system began in 1957.

John Hutchins (n.d.) says that English to Japanese was the primary research focus of the period. However, it was very difficult to analyse written Japanese because of the “lack of any indication of word boundaries” (Hutchins, n.d., p. 1). Hutchins goes on to say that there were also very few general purpose computers in Japan with “sufficient storage capacity for Machine Translation needs” (Hutchins, n.d., p. 1). He adds that this directed early Japanese Machine Translation research towards “the investigation of special purpose machines and perhaps the emphasis on theoretical studies” (Hutchins, n.d., p. 2).
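The word-boundary problem is easier to appreciate with a small example. The Python sketch below is a deliberately naive illustration that splits Japanese text wherever the script changes (Hiragana, Katakana, Kanji, or anything else) and then joins the pieces back together. The function names and the script-change heuristic are assumptions made for this illustration. It is not how the KantanMT tokenizer works, and real Japanese tokenizers typically rely on dictionaries or statistical models.

```python
def script_of(ch):
    """Classify a character by Unicode block (a simplified view)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"


def tokenize(text):
    """Split text at every change of script (naive heuristic)."""
    tokens = []
    current = ""
    current_script = None
    for ch in text:
        script = script_of(ch)
        # Close the running token when the script changes.
        if current and script != current_script:
            tokens.append(current)
            current = ""
        current += ch
        current_script = script
    if current:
        tokens.append(current)
    return tokens


def detokenize(tokens):
    """Japanese is written without spaces, so tokens are simply concatenated."""
    return "".join(tokens)


sentence = "KantanMTは機械翻訳"
pieces = tokenize(sentence)
print(pieces)
print(detokenize(pieces) == sentence)
```

A heuristic like this breaks down quickly, for example when several Hiragana words follow each other, which is one reason word segmentation was such an obstacle for early researchers.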

Japan a Leader in MT…

Japan became a leading player in the Machine Translation field during the 1980s. In 1982, the state launched a four-year Machine Translation programme that resulted in a huge increase in the number of English to Japanese Machine Translation projects within the Japanese manufacturing industry. The decade also saw Fujitsu launch its Atlas Japanese to English Machine Translation engine, and the first ever Machine Translation summit was held in Tokyo in 1987.

You can find out more about early Japanese Machine Translation projects by reading the TAUS timeline and John Hutchins’s Projects and groups in Japan, China, and Mexico (1956-1966).

The Japanese language itself has also been involved in some of the major Machine Translation projects of the past decades. For example, in 1991 NEC showcased INTERTALKER, which was an “automatic speech to speech system combining speech recognition, PiVOT MT, and speech synthesis for English, Japanese, French, and Spanish” (TAUS, 2013). In 1992, C-Star demonstrated the first phone translation between Japanese, English, and German. Then in 1993, the eight-year German state-supported project Verbmobil began. Verbmobil aimed to produce “portable systems for face-to-face English-language business negotiations in German and Japanese” (Wired, 2000).

By introducing a Japanese tokenizer and detokenizer, KantanMT is adding a new page to the history of Machine Translation and the Japanese language. We also want to play a part in the continued expansion of your company, and with KantanMT, the door to Japanese markets is now open!

If you want to find out more about KantanMT, visit KantanMT.com and sign up for our free 14-day trial.

Featured Image Source: http://www.csuci.edu/cia/countries/japan.htm

KantanMT now Supports Chinese

As the American market for translation slows down, the market in Asia continues to grow. According to the Common Sense Advisory Board, Asia makes up 12.88% of the global market share for translation services and provides a multitude of opportunities for growth.
As Asia’s biggest economy, China has an important role to play in this. Although there is widespread talk of decline within the Chinese economy, it is still by far the fastest-growing major economy of the last decade.

Despite its ageing population of 1.3 billion, China is set to create an increasing amount of translation business in the coming years. The Chinese government is now focusing on creating more export opportunities for indigenous businesses that are responding to growing demand from the US and Europe. Chinese exports grew by 14.1% in December 2012 compared to one year earlier. In February 2013, ‘Bloomberg’ discussed how China has now surpassed the US as the world’s biggest trading nation.

In response to the demand for Chinese Machine Translation, KantanMT has recently introduced Chinese language capabilities on its cloud-based platform. Members can manage Chinese Machine Translation engines using the same process as for all other languages on KantanMT.com: simply upload Chinese training data, build a KantanMT engine, and then translate client files. KantanMT encourages members to use KantanWatch™ to track the quality improvements of their engines over time, helping them to significantly improve engine performance.
