In 1997, the reigning world chess champion, Garry Kasparov, was up against an unusual opponent. The opponent was formidable. Garry was not playing a human. He was playing IBM’s behemoth supercomputer, Deep Blue.
Garry had beaten Deep Blue in their previous match in 1996. However, the game played on 11th May 1997 was different. Garry lost the game, and with it the match. Deep Blue made history:
The first computer program to defeat a reigning world champion in a match under tournament regulations.
This game was significant for many reasons. It caught the world’s imagination. It laid the foundation for many possibilities that would shape the world of AI. Like explorers, data scientists and software engineers embarked on the relatively uncharted territory of Deep Learning.
Fast forward to 2018, and Deep Learning is a buzzword. According to Gartner, Deep Learning has already crossed the Innovation Trigger stage of the hype cycle. It has reached the Peak of Inflated Expectations.
It will be another few years before this technology goes mainstream. However, the applications of deep learning have already permeated our lives.
This article is a primer on deep learning. It attempts to provide a simple explanation of the fundamental concepts. It discusses the reasons for its rise and touches upon a few applications of Deep Learning.
Let us first classify deep learning in the world of Artificial Intelligence.
As depicted in the figure above, deep learning is a subset of Machine Learning. Machine Learning is itself a subset of Artificial Intelligence (AI). AI is a field that enables machines to become progressively more intelligent.
That intelligence can manifest in many ways. Let us understand how it manifests in deep learning systems.
Imagine a system that identifies customer churn for a telco organization. One way to design this system is to craft rules that determine who will churn. A series of business rules is hand-crafted for a specific purpose, i.e., identifying customers who will churn. Creating a lot of rules is an arduous task. There are many factors and permutations of those factors. Rules are also prone to frequent changes. As the customer profile or the business model changes, these rules need to be altered.
This is a rudimentary form of AI: a rule-based system.
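To make the idea concrete, here is a minimal sketch of what such a rule-based churn check could look like in Python. The customer attributes and the thresholds are hypothetical, chosen only for illustration; a real telco would maintain many more rules than this.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # All attributes below are hypothetical, illustrative fields.
    months_since_last_recharge: int
    complaints_last_quarter: int
    monthly_spend_drop_pct: float


def will_churn(customer: Customer) -> bool:
    """Apply hand-crafted business rules; the thresholds are made up."""
    # Rule 1: a long period without a recharge signals churn.
    if customer.months_since_last_recharge >= 3:
        return True
    # Rule 2: many complaints combined with a sharp drop in spend.
    if (customer.complaints_last_quarter >= 4
            and customer.monthly_spend_drop_pct > 30.0):
        return True
    return False


loyal = Customer(months_since_last_recharge=0,
                 complaints_last_quarter=1,
                 monthly_spend_drop_pct=5.0)
at_risk = Customer(months_since_last_recharge=1,
                   complaints_last_quarter=5,
                   monthly_spend_drop_pct=45.0)

print(will_churn(loyal), will_churn(at_risk))
```

Every new business scenario means another hand-written rule or a changed threshold, which is why this approach becomes hard to maintain.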
Another way to identify customer churn would be to create statistical learning models that learn from past churn information. These models take some inputs, also known as features, that impact customer churn. They predict whether a customer will churn or not.
These models are Machine Learning models. They learn from past data and input features. They adapt as the characteristics of the input data change.
Note that these machine learning models rely on humans to provide the input features. For the model to be effective, the input features need to be useful. The model relies on the intuition and domain knowledge of the modeler, who has to feed it the correct features. In other words, it needs the right representation of the data.
This is a Machine Learning based AI system.
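As a rough illustration of the difference from hand-written rules, the sketch below fits a small logistic regression model to past churn records using plain gradient descent. Logistic regression is just one possible choice of model, and the two features (number of complaints, months of inactivity) and the data are invented for this example. Note that a human still picks the features; the model only learns how to weigh them.

```python
import math

# Hypothetical past records: (complaints, months_inactive); label 1 = churned.
X = [(0, 0), (1, 0), (0, 1), (4, 2), (5, 3), (3, 4)]
y = [0, 0, 0, 1, 1, 1]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression with per-sample gradient descent updates."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x) -> int:
    """Return 1 (churn) if the predicted probability is at least 0.5."""
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)


weights, bias = train(X, y)
print([predict(weights, bias, x) for x in X])
```

If the customer base changes, the same code can simply be retrained on newer records instead of rewriting rules by hand.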
A traditional machine learning model works fine as long as the representation of the data is well suited to predicting the expected output. However, when the number of potential features grows, identifying the right input features becomes a challenge. Machine Learning practitioners also call this challenge the curse of dimensionality. In traditional machine learning model development, a lot of time is spent on feature engineering.
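A small back-of-the-envelope calculation hints at why more features make the problem harder. The sketch below assumes, purely for illustration, that every feature is binary, and computes the largest share of possible feature combinations that a fixed number of customer records could cover. The function name and the sample size are hypothetical.

```python
def coverage(num_features: int, num_samples: int) -> float:
    """Upper bound on the share of feature combinations the data can cover,
    assuming each feature is binary (an illustrative simplification)."""
    combinations = 2 ** num_features
    return min(1.0, num_samples / combinations)


for n in (5, 10, 20, 30):
    print(n, coverage(n, num_samples=10_000))
```

With 5 or 10 binary features, 10,000 records could in principle cover every combination; with 20 or more, they can cover only a tiny fraction.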
In the example of customer churn, a lot more features impact customer churn. Some of these features are unknown. Some of them are derived.
What if these features can be learned automatically?
Scenarios like this, where there are a lot of unknown features, are where a deep learning based system shines. A deep learning based system automatically learns the relevant characteristics that cause the churn. It acquires the right representation of the data.
The process of “learning the features automatically” is called representation learning.
A deep learning based system automatically learns the relevant features to solve a machine learning task. That’s great! But how does it do it?
The building block of a deep learning system is a machine learning algorithm called a neural network.
Deep neural networks are the cornerstone algorithms that make deep learning happen. A neural net consists of many simple, interconnected processing nodes.
A deep neural network has three types of layers: an input layer, one or more hidden layers, and an output layer.
Neural networks recognize patterns by working from the simple to the complex. They learn simple features in the first layers of the net. Some nodes are activated based on defined thresholds. These activated nodes feed into the subsequent layers of the network. In the following layers, the network combines those features to derive more sophisticated features. The process goes on until the network computes the final output in the output layer.
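The simple-to-complex flow described above can be sketched with a toy network in Python. This is only an illustration: the weights and thresholds are set by hand rather than learned from data, and the task (detecting that exactly one of two inputs is on) is chosen because it is small enough to follow by eye.

```python
def step(value: float, threshold: float) -> int:
    """A node fires (outputs 1) only when its input reaches the threshold."""
    return 1 if value >= threshold else 0


def forward(x1: int, x2: int) -> int:
    # Hidden layer: two simple features detected from the raw inputs.
    either_on = step(x1 + x2, threshold=0.5)  # at least one input is 1
    both_on = step(x1 + x2, threshold=1.5)    # both inputs are 1
    # Output layer: combines the simple features into a more complex one,
    # "exactly one input is on".
    return step(either_on - both_on, threshold=0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", forward(a, b))
# prints:
# 0 0 -> 0
# 0 1 -> 1
# 1 0 -> 1
# 1 1 -> 0
```

Neither hidden node can solve the task alone, but the output node gets the answer by combining them. In a real deep network the same idea is repeated across many layers, and the weights are learned during training.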
Deep Learning has been around for quite some time.
Why is deep learning becoming popular now?
In 1943, Warren McCulloch and Walter Pitts wrote a paper on how neurons might work. However, in the earlier years, the progress and adoption of neural networks were impeded by two significant limitations: too little data to learn from and too little computing power.
With the advent of Big Data and cloud computing, these limitations were no longer an impediment.
As shown in the figure above, computing power has increased around 10,000-fold since the year 2000. The cost of storing data has gone down around 3,000-fold over the same period. There has been an exponential growth in the data created, due to the rise of the internet, the smartphone revolution and social media. Data is ubiquitously available now. These three ingredients created a perfect storm for deep learning. Deep Learning saw rekindled interest in research and a resurgence in adoption.
It turns out that deep learning frameworks are effective at carrying out tasks that humans excel at. Humans excel at tasks like image recognition, speech recognition, and translation. Humans are good at recognizing patterns in images and identifying specific objects. Humans are good at processing languages, understanding them and classifying them into intents and entities. Deep Learning networks excel at these kinds of tasks too. Major domains in which deep learning is used extensively are:
Computer vision is an interdisciplinary field that deals with how computers can be used for gaining understanding of images.
A few applications that use computer vision are:
– Object recognition: identifying or classifying objects in images or video streams.
– Face recognition: recognizing faces in images or video streams.
Natural Language Processing is the application of computational techniques to the analysis and synthesis of natural language and speech.
Deep Learning models have managed to beat humans at reading comprehension. In January 2018, models from Microsoft and Alibaba scored higher than humans on a challenge known as SQuAD, the Stanford Question Answering Dataset.
A few applications that use speech recognition are:
In this article, we touched upon the core components of deep learning. We discussed why it is on the rise and what its important applications are.
Deep Learning is at the center of the rise of Artificial Intelligence. It is an evolving field. It will continue to see growing adoption in the coming years. Deep Learning applications will continue to transform the world we live in.
This article was first published at www.datascientia.blog.
Content retrieved from: https://www.datasciencecentral.com/profiles/blogs/an-executive-primer-to-deep-learning.