Following the announcement of a direct collaboration between KantanLabs and the ADAPT Centre for Digital Content Technology, we got in touch with Professor Andy Way of the School of Computing at Dublin City University and the ADAPT Centre to ask him about innovations in the field of automated translation, as well as his thoughts on the engagement between KantanLabs and ADAPT.
KantanMT: Firstly, thank you for doing this interview with us. We are delighted to have you and the ADAPT research as a mentor for KantanLabs. To begin, would you mind telling us what you think about the present state of affairs with Machine Translation (MT), and where you think MT will be in five or ten years?
Andy: Clearly neural net-based approaches are beginning to dominate the research landscape, with most submissions to leading academic conferences featuring some connectionist aspect. However, it is yet to be proven definitively that these approaches have supplanted the more traditional phrase-based statistical MT (PB-SMT) systems as the new state-of-the-art.
We need to be careful not to throw out the baby with the bath water here; I’ve been around long enough to see – for example when PB-SMT came in – that purely “letting the data decide” soon gets you to a glass ceiling. After Moses was released in 2007, we spent the next decade ‘smuggling in’ syntactic and semantic (and recently, discourse) features in order to break through this ceiling. The same is happening now with the advent of this new paradigm, so advocates of the connectionist approach could learn a lot from this, and not make the same mistakes.
Ultimately, of course, whatever approach you take to MT, the problem is the same, and it’s a very hard problem at that. So in 5-10 years, I think that there will be some combination of PB-SMT and connectionist approaches to better solve the vast array of problems and use-cases that MT can help to solve. I think that is more likely where MT will be most impactful.
KantanMT: Do you think setting up an advanced research lab for automated translation research, like KantanLabs, is a necessary step forward for MT?
Andy: Back in 2007 when we set up CNGL, the forerunner of ADAPT, it is fair to say that most of the academics in the Centre were not commercially focused. Over the past 9 years, that has changed enormously. As for myself, from 2011 to 2013, I went to the UK to build industry-leading PB-SMT systems for a range of international clients; I couldn’t have hit the ground running in the way that my team did if we hadn’t changed our focus more to the demands of industry.
By the same token, you can see throughout the industry that leading MT academics are being hoovered up by industry, so it’s clear that industry can see that hooking up with leading MT experts is the way to go. This bidirectional push and pull model is undoubtedly the way to go, and I believe KantanLabs will lead to improvements more rapidly in the technology that we are both (industry and academics) interested in.
KantanMT: Could you tell our readers a bit more about the research association between KantanLabs and the ADAPT Centre? What do you hope will come from the partnership?
Andy: As you’ve rightly pointed out, ADAPT has pulled in a huge amount of funding from SFI and the Irish Government. In return, we are charged with broadening our funding model, and the KPIs agreed with SFI are challenging. Nonetheless, the unique cohesion between the commercially-facing team, the world-leading academics and the D-Lab – a set of commercially-trained programmers – is a very enticing model for our industry partners, which is already starting to pay off with the successful delivery of academic-industry funded projects.
Accordingly, I expect KantanMT and KantanLabs to benefit from the internationally-renowned expertise in my MT lab, to help improve the quality of the translations output by the industry-leading KantanMT infrastructure. By the same token, this close cooperation with KantanMT will help improve our skills, especially in cloud-based MT offerings.
KantanMT: Are there any specific areas of MT research that you would like to see KantanLabs working on?
Andy: With KantanMT and KantanLabs being only a few hundred yards away from my research lab, Tony O’Dowd and I meet regularly for a coffee and chat about some of the industry challenges of MT. I think some of the topics that we will work on together in the very near future are how to integrate aspects of the new connectionist models into the KantanMT pipeline to demonstrate clear benefits to KantanMT’s wide range of clients. Clearly too the problems of dealing with CJK languages – especially the J and the K – haven’t gone away, so some of the novel work we’ve been doing in the lab on reordering models is likely to prove of benefit to KantanMT.
KantanMT: In conjunction with the EAMT (European Association for Machine Translation), KantanLabs will hire two interns to research LT. As a member of the organisation, what do you hope the researchers will achieve during their internships?
Andy: I’ve been on the EAMT committee for longer than I care to recall! While I was President from 2009 to 2015, we came up with a wide range of schemes whereby profits from conferences could be returned to our members with clear benefits to the wider community. Funded internships are one of those programmes which deliver a very clear benefit to both company and intern in a short amount of time. Getting to apply their scientific knowledge in a cutting-edge environment like KantanLabs provides invaluable experience to a budding researcher, and adds considerably to their CV. For a company like KantanMT, having new, enthusiastic MT researchers at KantanLabs occupied on a single well-defined problem gives the best chance of successful integration of this work into the MT pipeline, with clear benefits to the organisation and your clients. As this is half-funded by the EAMT, the business model is hard to beat!
Note from KantanMT: We recently announced our Summer Internship Programme in conjunction with the EAMT (European Association for Machine Translation). KantanLabs will launch its internship programme in June 2016. Selected candidates will then work within the Research and Development teams at KantanLabs to explore the impact of word re-ordering on the creation of higher quality translation outputs.
About Prof. Andy Way
Prof. Way obtained a B.Sc. (Hons) in 1986, an M.Sc. in 1989, and his PhD in 2001 from the University of Essex, Colchester, U.K. He is presently a Full Professor at DCU and a recipient of the DCU President’s Research Award for Science and Engineering.
Prof. Way has secured grants totalling over €64.7 million, with over €10.1 million directly for his own research. He was PI for Integrated Language Technologies in CNGL until June 2011, when he took a career break until December 2013 to work in the translation industry in the UK: for Applied Language Solutions as Director of Language Technology (2011-12), and for Lingo24 as Director of Machine Translation (2012-13).
Since his return to DCU in Jan. 2014, Prof. Way has been Deputy Director of CNGL, and subsequently Deputy Director of ADAPT, the new €23.9 million SFI-funded Centre for Digital Content Technology. In ADAPT Prof. Way leads the Transforming Digital Content Theme and the Localisation Spoke.
Prof. Way also served as President of the European Association for Machine Translation from 2009 to 2015 and has been Editor of the Machine Translation journal since 2007. He has published over 300 peer-reviewed papers and has successfully graduated 20 PhD students.