Today’s blog is aimed at helping the novice understand the technology that is Deep Learning (DL). To do this, I will need to discuss in depth Linear Algebra, Statistics, Probability Theory and Multivariate Calculus. Only joking! Nothing would turn novice readers off more than trying to hack our way through the above complex disciplines. We’ll leave that for the nerds. Today’s blog – like my last on Machine Learning – will try to use an analogy to help explain what is without doubt a very multifaceted, intricate subject to fully master.

For myself, the more I read about Deep Learning, and the more I spoke to the engineering masterminds at KantanAI, the more I realised that the discipline of using a Deep Learning model bears a similarity to sculpting. Let me expand: I don’t know to whom this quote is attributed, but for me it certainly describes the methods of Deep Learning:

“The sculptor produces the beautiful statue by chipping away such parts of the marble block as are not needed – it is a process of elimination.”

Indeed, I think it was no less than Michelangelo who, when asked about sculpting, said that the angel lay within the marble block; it was simply his job to release it. Michelangelo’s minimalist explanation, and the above quotation, encapsulate in the simplest form what the Deep Learning progression involves. The engineer is the sculptor. The marble block represents the huge block of dense data to be processed. The act of processing the data is the chipping away of unwanted information by neural networks. The act of fine-tuning the deep learning neural engine represents the technique of the sculptor carefully finessing the shape of the emerging form into a recognisable figure.

In both roles, sculptor and engineer, there is a vision of what the ‘fine-tuning’ activity should produce. I am confident that if you as a novice accept this simple analogy, you are going some way towards grasping the very fundamentals of the Deep Learning process.

As a concept, Deep Learning is less than two decades old. The origin of the expression is attributed to Igor Aizenberg, Professor and Chair of the Department of Computer Science at Manhattan College, New York. Aizenberg studies, amongst other things, complex-valued neural networks. He applied the term to an Artificial Neural Network system modelled on the human neural network – the network of the human brain.

The ‘Deep’ element of the concept refers to a multi-layered processing network of neuron filters. The equivalent process in the human brain is information flowing through neurons connected by synapses. In the machine equivalent, artificial neurons are used to fine-tune and refine data as it is passed through the ‘engine’. A Deep Learning system also learns from experience and can adjust its processes accordingly. In sculpting, it is the equivalent of the experienced sculptor chipping and refining the marble to release Michelangelo’s hidden angel.
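
For the more curious reader, here is a tiny sketch in Python of what ‘data passing through layers of artificial neurons’ might look like. It is purely illustrative, not any real engine: the layer sizes, random weights and the ReLU ‘filter’ are assumptions made just for the example.

```python
# A minimal, illustrative sketch of data flowing through stacked layers of
# artificial neurons: each layer applies weights, then a simple non-linear
# filter, progressively refining the input representation.
import numpy as np

def layer(x, weights, bias):
    """One layer of artificial neurons: weighted sum followed by a ReLU filter."""
    return np.maximum(0.0, x @ weights + bias)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8))            # a tiny block of raw input data

# Three stacked layers -- the 'deep' part -- with made-up sizes.
w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 16)), np.zeros(16)
w3, b3 = rng.normal(size=(16, 4)), np.zeros(4)

h = layer(x, w1, b1)                   # first pass of 'chipping away'
h = layer(h, w2, b2)                   # further refinement
output = layer(h, w3, b3)              # final, more abstract representation
print(output.shape)                    # (1, 4)
```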

Jeff Dean, a Senior Fellow in Google’s Systems and Infrastructure Group – the group behind many of Google’s highly sophisticated machine learning technologies – said:

“When you hear the term ‘Deep Learning’ just think of a large neural net. Deep refers to the number of layers typically, and so this is kind of the popular term that’s been adopted by the press.”

For many novices there is confusion around the terms Machine Learning (ML), Artificial Intelligence (AI) and Deep Learning (DL). There need not be, as the division is quite simple: Artificial Intelligence is the catch-all term that covers Machine Learning and Deep Learning. Machine Learning is an over-arching term for the training of computers, using algorithms, to parse data, learn from it and make informed decisions based on the accrued learning. Examples of machine learning in action are Netflix showing you what you might want to watch next, or Amazon suggesting books you might want to buy. These suggestions are the outcome of those companies using ML technology to monitor and build preference profiles based on your buying patterns.
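
For the curious, here is a toy sketch in Python of how a preference profile might be built from buying patterns. It is emphatically not how Netflix or Amazon actually do it – the baskets and item names are invented for illustration – it simply counts which items are bought together and suggests the most frequent companion.

```python
# A toy illustration of building a preference profile from buying patterns:
# count which items are bought together, then suggest the most common
# companion of a past purchase. Baskets and item names are hypothetical.
from collections import Counter
from itertools import permutations

purchase_history = [
    ["crime novel", "biography"],
    ["crime novel", "thriller"],
    ["thriller", "crime novel"],
]

co_bought = Counter()
for basket in purchase_history:
    for a, b in permutations(basket, 2):
        co_bought[(a, b)] += 1

def suggest(item):
    """Suggest the item most often bought alongside `item`."""
    candidates = {b: n for (a, b), n in co_bought.items() if a == item}
    return max(candidates, key=candidates.get) if candidates else None

print(suggest("crime novel"))          # prints: thriller
```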

Deep Learning is a subset of ML. It uses a highly sophisticated, multi-layered pattern of ‘neurons’ to process huge chunks of data, looking to refine the information contained within that data. It takes an abstract jungle of information, as is contained within the data, and refines it into clearly understood concepts. The data used can be clean or not clean. Cleaning data is the process of removing clearly irrelevant information before the data is fed to the engine. Clean data can be processed more quickly than data that has not been cleaned. Think of it as the human brain blocking out extraneous information as it processes what is relevant and discards what is irrelevant – something the human brain does every minute of every day.
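
To make the idea of cleaning concrete, here is a small, purely illustrative example using the open-source pandas library: it drops records with missing values and a column that is clearly irrelevant to the task. The column names and values are hypothetical.

```python
# An illustrative example of 'cleaning' data before it is processed:
# drop incomplete records and a column that is irrelevant to the task.
import pandas as pd

raw = pd.DataFrame({
    "source_text": ["Hello world", "Bonjour le monde", None, "Hola mundo"],
    "target_text": ["Bonjour le monde", "Hello world", "Hi", None],
    "internal_id": [101, 102, 103, 104],   # irrelevant to training
})

clean = (
    raw.drop(columns=["internal_id"])      # discard what is irrelevant
       .dropna()                           # remove incomplete records
       .reset_index(drop=True)
)
print(clean)
```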

But why has Deep Learning suddenly taken off so spectacularly? It is because Artificial Neural Networks (ANNs) can now be trained to a high level of accuracy when fed huge amounts of data. ANNs can approximate complex non-linear processes with a high degree of accuracy (a small illustrative sketch follows the list below). DL is also becoming predominant because of the following boosters:

  • The emergence of Big Data
  • The increase in computational power
  • The emergence of The Cloud
  • The affordable availability of GPUs and TPUs
  • The development of DL models using open source code
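
To tie the ‘non-linear’ point and the open-source point together, here is a minimal sketch using the open-source PyTorch library: a small neural network learning the non-linear relationship y = sin(x). The layer sizes and training settings are illustrative assumptions, not a recipe.

```python
# A minimal sketch of an artificial neural network approximating a
# non-linear process (y = sin(x)) using the open-source PyTorch library.
import torch
from torch import nn

x = torch.linspace(-3.0, 3.0, 200).unsqueeze(1)
y = torch.sin(x)                              # the non-linear target

model = nn.Sequential(                        # a few layers of 'neurons'
    nn.Linear(1, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
optimiser = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for step in range(2000):                      # simple training loop
    optimiser.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimiser.step()

print(f"final mean-squared error: {loss.item():.4f}")
```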

Today it is estimated that Big Data produces 2.5 quintillion bytes of information per day. Now, if you are like me, you will never have heard of the measure ‘quintillion’. Well, apparently, it is a billion billion (a 1 followed by 18 zeros), so 2.5 quintillion bytes is 2.5 exabytes. Not that that helps give it finer focus!

According to IBM:

“90% of the data in the world today has been created in the last two years. This data comes from everywhere: sensors used to gather shopper information, posts to social media sites, digital pictures and videos, purchase transactions, and cell phone GPS signals to name a few. This data is big data.”

It is safe to say that the amount of data available will only increase over the coming years. Institutions such as the European Union, the United Nations, the World Bank, the World Health Organisation and social media companies make huge volumes of data available daily, and in multilingual form. The importance of this massive data resource is underlined by Andrew Ng, Chief Scientist at Baidu, China’s major search engine, who said:

“The analogy to deep learning is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms.”

The advent of Cloud Computing has allowed even small companies to have virtually unlimited storage space and access to enormous computational power. Processors as powerful as tensor processing units (TPUs) are available via Cloud computing. Some examples of Cloud computing sources would be Amazon Web Services, IBM SmartCloud or Google Cloud.

TPUs were developed by Google specifically to deal with the demands of ANNs. Previously, graphics processing units (GPUs) had reduced the machine learning training process from weeks to hours. TPUs have speeded up that process even further. Without this level of computing power, it is unlikely Deep Learning would be a viable technology.

Finally, Intel is reportedly developing a device called the Neural Compute Stick, which they claim will allow companies to bypass the Cloud and do their processing at a local level (i.e. a non-Cloud level). This will be a boost to those companies who baulk at the security implications of processing data in a remote location. It will also increase the speed of processing, as all the crunching will be done at the local level. Intel say it is their intent to make DL work “everywhere and on every device”. If they succeed, Deep Learning will expand to a huge degree. Interesting times lie ahead for Artificial Intelligence.

Aidan Collins is the Marketing Manager at KantanAI