Deep Learning is gaining more and more traction. It focuses on one area of Machine Learning: Artificial Neural Networks. This article explains why Deep Learning is a game changer in analytics, when to use it, and how Visual Analytics allows business analysts to leverage the analytic models built by a (citizen) data scientist.
Deep Learning is the modern buzzword for artificial neural networks, one of many concepts and algorithms in machine learning for building analytic models. A neural network works similarly to what we know from the human brain: it takes inputs with non-linear interactions and transforms them into outputs. Neural networks learn continuously, accumulating knowledge in the computational nodes between input and output. In most cases, a neural network is a supervised algorithm that uses historical data sets to learn correlations and predict the outputs of future events, e.g. for cross selling or fraud detection. Unsupervised neural networks can be used to find new patterns and anomalies. In some cases, it makes sense to combine supervised and unsupervised algorithms.
Neural networks have been used in research for many decades and include various sophisticated concepts like the Recurrent Neural Network (RNN), the Convolutional Neural Network (CNN) or the Autoencoder. However, today’s powerful and elastic computing infrastructure, in combination with technologies like graphics processing units (GPUs) with thousands of cores, allows much more powerful computations with a much larger number of layers. Hence the term “Deep Learning”.
The following picture from TensorFlow Playground shows an easy-to-use environment which includes various test data sets, configuration options and visualizations to learn and understand deep learning and neural networks:

If you want to learn more about the details of Deep Learning and Neural Networks, I recommend the following sources:
While Deep Learning is getting more and more traction, it is not the silver bullet for every scenario.
Deep Learning enables many new possibilities which were not feasible in “mass production” a few years ago, e.g. image classification, object recognition, speech translation or natural language processing (NLP) in much more sophisticated ways than without Deep Learning. A key benefit is automated feature engineering, which costs a lot of time and effort with most other machine learning alternatives.
You can also leverage Deep Learning to make better decisions, increase revenue or reduce risk for existing (“already solved”) problems instead of using other machine learning algorithms. Examples include risk calculation, fraud detection, cross selling and predictive maintenance.
However, note that Deep Learning has a few important drawbacks:
Deep Learning is ideal for complex problems. It can also outperform other algorithms on moderately complex problems. Deep Learning should not be used for simple problems. Other algorithms like logistic regression or decision trees can solve these problems more easily and faster.
Neural networks are mostly adopted using one of various open source implementations. Various mature deep learning frameworks are available for different programming languages.
The following picture shows an overview of open source deep learning frameworks and evaluates several characteristics:

These frameworks have in common that they are built for data scientists, i.e. personas with experience in programming, statistics, mathematics and machine learning. Note that writing the source code is not a big task. Typically, only a few lines of code are needed to build an analytic model. This is completely different from other development tasks like building a web application, where you write hundreds or thousands of lines of code. In Deep Learning, and in Data Science in general, it is most important to understand the concepts behind the code to build a good analytic model.
Some nice open source tools like KNIME or RapidMiner allow visual coding to speed up development and also encourage citizen data scientists (i.e. people with less experience) to learn the concepts and build deep networks. These tools use their own deep learning implementations or other open source libraries like H2O.ai or DeepLearning4j as an embedded framework under the hood.
If you do not want to build your own model, or if you want to leverage existing pre-trained models for common deep learning tasks, you might also take a look at the offerings from the big cloud providers, e.g. AWS Polly for Text-to-Speech translation, Google Vision API for Image Content Analysis, or Microsoft’s Bot Framework to build chat bots. The tech giants have years of experience with analysing text, speech, pictures and videos and offer this experience as sophisticated analytic models in a pay-as-you-go cloud service. You can also improve these existing models with your own data, e.g. train and improve a generic picture recognition model with pictures of your specific industry or scenario.
No matter if you want to use “just” a framework in your favourite programming language or a visual coding tool: You need to be able to make decisions based on the built neural network. This is where visual analytics comes into play. In short, visual analytics allows any persona to make data-driven decisions instead of relying on gut feeling when analysing complex data sets. See “Using Visual Analytics for Better Decisions – An Online Guide” to understand the key benefits in more detail.
A business analyst does not need to understand anything about deep learning, but just leverages the integrated analytic model to answer business questions. The analytic model is applied under the hood when the business analyst changes some parameters, features or data sets. However, visual analytics should also be used by the (citizen) data scientist to build the neural network. See “How to Avoid the Anti-Pattern in Analytics: Three Keys for Machine …” to understand in more detail how technical and non-technical people should work together using visual analytics to build neural networks which help solve business problems. Even some parts of data preparation are best done within visual analytics tooling.
From a technical perspective, Deep Learning frameworks (and in a similar way any other Machine Learning frameworks, of course) can be integrated into visual analytics tooling in different ways. The following list includes a TIBCO Spotfire example for each alternative:
All options have in common that you need to add configuration of some hyper-parameters, i.e. “high level” parameters like problem type, feature selection or regularization level. Depending on the integration option, this can be very technical and low level, or simplified and less flexible using terms which the business analyst understands.
Let’s take one specific category of neural networks as an example: Autoencoders to find anomalies. An Autoencoder is an unsupervised neural network that learns to replicate its input data set while the size of its hidden layers is restricted, so it has to learn a compressed representation of the data. A reconstruction error is generated upon prediction. The higher the reconstruction error, the more likely it is that the data point is an anomaly.
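The idea of scoring anomalies by reconstruction error can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions, not the H2O implementation used by the Spotfire template: the network is a linear autoencoder with two inputs, one hidden unit and tied weights; the "sensor" data, the learning rate, the number of epochs and the threshold factor of 10 are all made-up values.

```python
# Minimal linear autoencoder (2 inputs -> 1 hidden unit -> 2 outputs, tied weights).
# Illustration only: toy data and hand-written gradient descent, no H2O involved.

def reconstruct(w, x):
    """Encode x to one hidden value h = w.x, then decode it back to h * w."""
    h = w[0] * x[0] + w[1] * x[1]
    return [h * w[0], h * w[1]]


def reconstruction_error(w, x):
    """Squared distance between the input and its reconstruction."""
    x_hat = reconstruct(w, x)
    return (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2


def train(data, epochs=500, lr=0.05):
    """Fit the tied weights with plain stochastic gradient descent."""
    w = [0.5, 0.5]
    for _ in range(epochs):
        for x in data:
            h = w[0] * x[0] + w[1] * x[1]
            e = [x[0] - h * w[0], x[1] - h * w[1]]  # reconstruction residual
            ew = e[0] * w[0] + e[1] * w[1]
            # Gradient of ||x - (w.x) w||^2 with respect to w
            grad = [-2.0 * (ew * x[j] + h * e[j]) for j in range(2)]
            w = [w[j] - lr * grad[j] for j in range(2)]
    return w


# "Normal" readings lie close to the line y = 2x (hypothetical sensor data).
normal = [[t / 20.0, 2.0 * t / 20.0 + (0.01 if t % 2 else -0.01)]
          for t in range(-10, 11)]
anomaly = [0.3, -0.4]  # a reading far away from the usual pattern

weights = train(normal)
normal_errors = [reconstruction_error(weights, x) for x in normal]
anomaly_error = reconstruction_error(weights, anomaly)

threshold = 10 * max(normal_errors)  # assumed cut-off, chosen for this toy data
is_anomaly = anomaly_error > threshold
```

Because the single hidden unit can only represent the direction in which the normal data varies, a point that deviates from that pattern cannot be reconstructed well and ends up with a large error.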
Use cases for Autoencoders include fighting financial crime, monitoring equipment sensors, detecting healthcare claims fraud, or detecting manufacturing defects. A generic TIBCO Spotfire template is available in the TIBCO Community for free. You can simply add your data set and leverage the template to find anomalies using Autoencoders, without any complex configuration or even coding. Under the hood, the template uses H2O.ai’s deep learning implementation and its R API. It runs in a local instance on the machine where Spotfire runs. You can also take a look at the R code, but this is optional and not needed to use the template.
Let’s use the Autoencoder for a real-world example. In telco, you have to analyse the infrastructure continuously to find problems and issues within the network, ideally before a failure happens, so that you can fix it before the customer even notices the problem. Take a look at the following picture, which shows historical data of a telco network:

The orange dots are spikes which occur as first indication of a technical problem in the infrastructure. The red dots show a constant failure where mechanics have to replace parts of the network because it does not work anymore.
Autoencoders can be used to detect network issues before they actually happen. TIBCO Spotfire uses H2O’s autoencoder in the background to find the anomalies. As discussed before, the source code is relatively short. Here is the snippet for building the analytic model with H2O’s Deep Learning R API and detecting the anomalies (by computing the reconstruction error of the Autoencoder):

This analytic model, built by the data scientist, is integrated into TIBCO Spotfire. The business analyst is able to visually analyse the historical data and the insights of the Autoencoder. This combination allows data scientists and business analysts to work together fluently. It has never been easier to implement predictive maintenance and create huge business value by reducing risk and costs.
This article focuses on building deep learning models with Data Science Frameworks and Visual Analytics. The key to success in projects is to apply the built analytic model to new events in real time to add business value like increasing revenue, reducing cost or reducing risk.
“How to Apply Machine Learning to Event Processing” describes in more detail how to apply analytic models to real time processing. Or watch the corresponding video recording, which leverages TIBCO StreamBase to apply some H2O models in real time. Finally, I recommend learning about the various streaming analytics frameworks that can apply analytic models.
Let’s come back to the Autoencoder use case to realize predictive maintenance in telcos. In TIBCO StreamBase, you can easily apply the built H2O Autoencoder model without any redevelopment via StreamBase’s H2O connector. You just attach the Java code generated by the H2O framework, which contains the analytic model and compiles to very performant JVM bytecode:

The most important lesson learned: Think about the execution requirements before building the analytic model. What performance do you need regarding latency? How many events do you need to process per minute, second or millisecond? Do you need to distribute the analytic model to a cluster with many nodes? How often do you have to improve and redeploy the analytic model? You need to answer these questions at the beginning of your project to avoid duplicated effort and redevelopment of analytic models!
Another important fact is that analytic models do not always need “real time processing” in terms of very fast and / or frequent model execution. In the above telco example, these spikes and failures might happen over subsequent days or even weeks. Thus, in many use cases, it is fine to apply an analytic model once a day or once a week instead of applying it every second to every new event.
Deep Learning makes it possible to solve many well understood problems like cross selling, fraud detection or predictive maintenance in a more efficient way. In addition, you can address scenarios which were not possible to solve before, like accurate and efficient object detection or speech-to-text translation.
Visual Analytics is a key component in Deep Learning projects to be successful. It eases the development of deep neural networks by (citizen) data scientists and allows business analysts to leverage these analytic models to find new insights and patterns.
Today, (citizen) data scientists use programming languages like R or Python, deep learning frameworks like Theano, TensorFlow, MXNet or H2O’s Deep Water and a visual analytics tool like TIBCO Spotfire to build deep neural networks. The analytic model is embedded into a view for the business analyst to leverage it without knowing the technology details.
In the future, visual analytics tools might embed neural network features just as they already embed other machine learning features like clustering or logistic regression today. This will allow business analysts to leverage Deep Learning without the help of a data scientist, which is appropriate for simpler use cases.
However, do not forget that building an analytic model to find insights is just the first part of a project. Deploying it to real time afterwards is just as important as a second step. Good integration between the tooling for finding insights and the tooling for applying insights to new events can significantly improve time-to-market and model quality in data science projects. The development lifecycle is a continuous closed loop. The analytic model needs to be validated and rebuilt at regular intervals.
Content retrieved from: https://www.datavizualization.datasciencecentral.com/blog/open-source-deep-learning-frameworks-and-visual-analytics.
Circa 1997, the reigning world chess champion Garry Kasparov was up against an unusual opponent. The opponent was formidable. Garry was not playing a human. He was playing against IBM’s behemoth supercomputer, Deep Blue.
Garry had beaten the opponent in the last few games. However, the game played on 11th May 1997 was different. Garry lost the game. Deep Blue made history:
The first computer program to defeat a world champion in a match under tournament regulations.
This game was significant for many reasons. It caught the world’s imagination. It laid the foundation for many possibilities that would shape the world of AI. Like explorers, data scientists and software engineers embarked on the relatively uncharted territory of Deep Learning.
Fast forward to 2018: Deep Learning is a buzzword. According to Gartner, Deep Learning has already passed the innovation trigger stage of the hype cycle. It has reached the peak of inflated expectations.

It will be another few years before this technology goes mainstream. However, the applications of deep learning have already permeated our lives.
This article is a primer for deep learning. It attempts to provide a simple explanation of the fundamental concepts. It discusses the reasons for its rise and touches upon a few applications of Deep Learning.
Let us first classify deep learning in the world of Artificial Intelligence.

As depicted in the figure above, deep learning is a subset of Machine Learning. Machine Learning is itself a subset of Artificial Intelligence (AI). AI is a field that enables machines to become progressively more intelligent.
That intelligence can manifest in many ways. Let us understand how it manifests in deep learning systems.

Imagine a system that identifies customer churn for a telco organization. One way to design this system is to craft rules about how to determine who will churn. A series of business rules is hand-crafted for a specific purpose, i.e., to identify customers who will churn. Creating a lot of rules is an arduous task. There are a lot of factors and their permutations. Rules are also prone to frequent changes. As the customer profile changes or the business model changes, these rules need to be altered.
This is a rudimentary form of AI. A rule-based system.

Another way to identify customer churn would be to create statistical learning models. They learn from past churn information. These models take some inputs, a.k.a. features, that impact customer churn. They predict whether a customer will churn or not.
These models are Machine Learning models. They learn from the past data and input features. They adapt to the characteristics of input data changes.
Note that these machine learning models rely on humans to provide the input features. For the model to be effective, the input feature needs to be useful. They rely on the intuition and domain knowledge of the modeler. The modeler will have to feed the machine learning model with the correct features. It asks for the right representation of the data.
This is a Machine Learning based AI system.

A traditional machine learning model works fine as long as the representation of the data is congruous with the expected output. However, when the number of potential features grows, identifying the right input features becomes a challenge. Machine Learning practitioners also call this challenge the curse of dimensionality. In traditional machine learning model development, a lot of time is spent on feature engineering.
In the example of customer churn, a lot more features impact customer churn. Some of these features are unknown. Some of them are derived.
What if these features can be learned automatically?
Scenarios like this, where there are a lot of unknown features, are where a deep learning based system shines. A deep learning based system automatically learns the relevant characteristics that cause the churn. It acquires the right representation of the data.
The process of “learning the features automatically” is called representation learning.
A deep learning based system automatically learns the relevant features to solve a machine learning task. That’s great! But how does it do it?
The building block of a deep learning system is a machine learning algorithm called a neural network.
Deep neural networks are the cornerstone algorithms that make deep learning happen. A neural net consists of a lot of simple, interconnected processing nodes.

A deep neural network has three types of layers: an input layer, one or more hidden layers, and an output layer.
Neural networks recognize patterns by going from simple to complex. They learn simple features in the first layers of the net. Some nodes are activated based on defined thresholds. These activated nodes feed into the subsequent layers of the network. In the following layers, the network combines those features to derive other, more sophisticated features. The process goes on until the final output is computed in the output layer.
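The simple-to-complex flow described above can be sketched in plain Python. The following is a minimal illustration, assuming hand-picked weights and thresholds rather than learned ones: two hidden nodes detect simple features ("at least one input is on" and "both inputs are on"), and the output node combines them into a more sophisticated feature ("exactly one input is on").

```python
# Toy feed-forward pass: input layer -> one hidden layer -> output layer.
# The weights and thresholds are made-up values for illustration, not trained ones.

def step(value, threshold):
    """A node fires (outputs 1) only when its weighted input reaches the threshold."""
    return 1 if value >= threshold else 0


def layer(inputs, weights, thresholds):
    """Each node sums its weighted inputs and applies its threshold."""
    outputs = []
    for node_weights, threshold in zip(weights, thresholds):
        total = sum(w * x for w, x in zip(node_weights, inputs))
        outputs.append(step(total, threshold))
    return outputs


def forward(inputs):
    # Hidden layer: two simple feature detectors
    # (node 0: at least one input on, node 1: both inputs on).
    hidden = layer(inputs, weights=[[1, 1], [1, 1]], thresholds=[1, 2])
    # Output layer: combines the simple features into "exactly one input on".
    return layer(hidden, weights=[[1, -1]], thresholds=[1])[0]


results = {(a, b): forward([a, b]) for a in (0, 1) for b in (0, 1)}
```

In a real deep neural network the weights are learned from data and smooth activation functions usually replace the hard threshold, but the layer-by-layer flow is the same.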
Deep Learning has been around for quite some time.
Why is deep learning becoming popular now?
In 1943, Warren McCulloch and Walter Pitts wrote a paper on how neurons might work. However, in the earlier years, the progress and adoption of neural networks were impeded by two significant limitations:
With the advent of Big Data and cloud computing, these limitations were no longer an impediment.

As shown in the figure above, computing power has increased 10,000-fold since the year 2000. The cost of storing data has also gone down by a factor of around 3,000 since the year 2000. There has been an exponential growth of data created due to the rise of the internet, the smartphone revolution and social media. Data is ubiquitously available now. These three ingredients created a perfect storm for deep learning. Deep Learning saw a rekindled interest in research and a resurgence in adoption.
It turns out that deep learning frameworks are efficient at carrying out tasks that humans excel at. Humans excel at tasks like image recognition, speech recognition and translation. Humans are good at recognizing patterns in images and identifying specific objects. Humans are good at processing languages, understanding them and classifying them into intents and entities. Deep Learning networks excel at these kinds of tasks too. Major domains in which deep learning is used extensively are:
Computer vision is an interdisciplinary field that deals with how computers can be used for gaining understanding of images.
A few applications that use computer vision are:
– Object recognition: identifying or classifying objects in images or video streams.
– Face recognition: recognizing faces in images or video streams.
Natural Language Processing is the application of computational techniques to the analysis and synthesis of natural language and speech.
Deep Learning models have managed to beat humans in reading comprehension. In January 2018, models from Microsoft and Alibaba scored higher than humans on a challenge known as SQuAD, the Stanford Question Answering Dataset.
A few applications that use speech recognition are:
In this article, we touched upon the core components of deep learning. We discussed why it is on the rise and what its important applications are.
Deep Learning is at the center of the rise of Artificial Intelligence. It is an evolving field. It will continue to see growing adoption in the coming years. Deep Learning applications will continue to transform the world we live in.
This article was first published at www.datascientia.blog.
Content retrieved from: https://www.datasciencecentral.com/profiles/blogs/an-executive-primer-to-deep-learning.
Posted by Packt Publishing on June 13, 2018 at 12:30am
The performance of a neural network improves with an increasing volume of training data. With more and more devices generating data that can potentially be used for training and model generation, the models are getting better at generalizing the stochastic environment and handling complex tasks. However, with more data and more complex structures for the deep neural networks, the computational requirements increase.
Even though we have started leveraging GPUs for deep neural network training, the vertical scaling of the compute infrastructure has its own limitations and cost implications. Leaving the cost implications aside, the time it takes to train a significantly large deep neural network on a large set of training data is not reasonable. However, due to the nature and network topology of the neural networks, it is possible to distribute the computation on multiple machines at the same time and merge the results back with a centralized process. This is very similar to Hadoop, as a distributed computing batch processing engine, and Spark, as an in-memory distributed computing framework.
With deep neural networks, there are two approaches for leveraging distributed computing:


The data distribution approach is very similar to Hadoop’s MapReduce framework. The MapReduce job creates the input splits based on predefined and run-time configuration parameters. These chunks are sent to the independent nodes for processing by the map tasks in a parallel manner.
The output from the map tasks is shuffled for relevance (simple sort) and is given as input to the reduce tasks for generating intermediate results. The individual MapReduce chunks are combined to produce the final result. The data distribution approach is more naturally suitable for Hadoop and Spark frameworks and it is a more widely researched approach at this time. The deep neural networks that leverage data distribution primarily deploy a parameter-averaging strategy for training the model.
This is a simple but efficient approach for training a deep neural network with data distribution:

Based on these fundamental concepts of distributed processing, let’s review some of the popular libraries and frameworks that enable parallelized deep neural networks.
With an ever-increasing number of data sources and data volumes, it is imperative that the deep learning application and research leverage the power of distributed computing frameworks. In this section, we will review some of the libraries and frameworks that effectively leverage distributed computing. These are popular frameworks based on their capabilities, adoption level, and active community support.
The core framework of DL4J is designed to work seamlessly with Hadoop (HDFS and MapReduce) as well as Spark-based processing. It is easy to integrate DL4J with Spark. DL4J with Spark leverages data parallelism by sharding large datasets into manageable chunks and training the deep neural networks on each individual node in parallel. Once the models produce parameter values (weights and biases), those are iteratively averaged for producing the final outcome.
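The parameter-averaging idea can be sketched without any cluster at all. The following plain-Python example is only an illustration under stated assumptions, not DL4J's implementation: the "workers" are simulated one after another in a single process, the model is a toy linear model y = w * x + b instead of a deep network, and the data, learning rate and number of rounds are made-up values.

```python
# Parameter averaging on a toy linear model y = w * x + b, in plain Python.
# Workers are simulated sequentially; a real setup would run them on separate nodes.

def local_training(params, shard, lr=0.05, steps=5):
    """One worker: start from the shared parameters and run a few SGD passes on its shard."""
    w, b = params
    for _ in range(steps):
        for x, y in shard:
            error = (w * x + b) - y
            w -= lr * error * x  # gradient of 0.5 * error^2 with respect to w
            b -= lr * error      # gradient of 0.5 * error^2 with respect to b
    return w, b


def average(param_list):
    """Driver: average the parameters returned by all workers."""
    n = len(param_list)
    return (sum(p[0] for p in param_list) / n,
            sum(p[1] for p in param_list) / n)


# Hypothetical data generated from y = 3x + 1, split into three shards.
data = [(x / 10.0, 3.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
shards = [data[i::3] for i in range(3)]

params = (0.0, 0.0)
for _ in range(50):  # averaging rounds
    worker_results = [local_training(params, shard) for shard in shards]
    params = average(worker_results)

w, b = params  # ends up close to the generating values w = 3, b = 1
```

Each round follows the pattern described above: the current parameters are handed to every worker, each worker trains on its own shard, and the resulting weights and biases are averaged before the next round starts.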
In order to train the deep neural networks on Spark using DL4J, two primary wrapper classes need to be used:
The network configuration process remains the same for the standard and the distributed mode. That means we configure the network properties by creating a MultiLayerConfiguration instance. The workflow for deep learning on Spark with DL4J can be depicted as follows:

Here are the sample code snippets for the workflow steps:
// Configure the network (identical for standard and distributed training)
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT).iterations(1)
    .learningRate(0.1)
    .updater(Updater.RMSPROP) // To configure: .updater(new RmsProp(0.95))
    .seed(12345)
    .regularization(true).l2(0.001)
    .weightInit(WeightInit.XAVIER)
    .list()
    .layer(0, new GravesLSTM.Builder().nIn(nIn).nOut(lstmLayerSize).activation(Activation.TANH).build())
    .layer(1, new GravesLSTM.Builder().nIn(lstmLayerSize).nOut(lstmLayerSize).activation(Activation.TANH).build())
    .layer(2, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT).activation(Activation.SOFTMAX) // MCXENT + softmax for classification
        .nIn(lstmLayerSize).nOut(nOut).build())
    .backpropType(BackpropType.TruncatedBPTT).tBPTTForwardLength(tbpttLength).tBPTTBackwardLength(tbpttLength)
    .pretrain(false).backprop(true)
    .build();

// Set up the TrainingMaster, which controls how parameters are averaged across workers
ParameterAveragingTrainingMaster tm = new ParameterAveragingTrainingMaster.Builder(examplesPerDataSetObject)
    .workerPrefetchNumBatches(2) // Async prefetch 2 batches for each worker
    .averagingFrequency(averagingFrequency)
    .batchSizePerWorker(examplesPerWorker)
    .build();

// Create the Spark network from the Spark context, the configuration and the TrainingMaster
SparkDl4jMultiLayer sparkNetwork = new SparkDl4jMultiLayer(sc, conf, tm);

// Load the training data as an RDD
public static JavaRDD<DataSet> getTrainingData(JavaSparkContext sc) throws IOException {
    List<String> list = getTrainingDataAsList(); // arbitrary sample method
    JavaRDD<String> rawStrings = sc.parallelize(list);
    Broadcast<Map<Character, Integer>> bcCharToInt = sc.broadcast(CHAR_TO_INT);
    return rawStrings.map(new StringToDataSetFn(bcCharToInt));
}

// Train the network on the distributed data
JavaRDD<DataSet> trainingData = getTrainingData(sc);
sparkNetwork.fit(trainingData);

# Package the application and submit it to the Spark cluster
mvn package
spark-submit --class <<fully qualified class name>> --num-executors 3 ./<<jar_name>>-1.0-SNAPSHOT.jar
The DeepLearning4j official website provides extensive documentation for running the deep neural networks on Spark: https://deeplearning4j.org/spark
TensorFlow is the most popular library created and open sourced by Google. It uses data-flow graphs for numerical computations and deals with tensors as the basic building block. A tensor can simply be considered as an n-dimensional array. TensorFlow applications can be seamlessly deployed across platforms, and they can run on GPUs and CPUs, along with mobile and embedded devices. TensorFlow is designed for large-scale distributed training and supports new machine learning models, research, and granular-level optimizations.
TensorFlow is quick to install and start experimenting with. The latest version of TensorFlow can be downloaded from https://www.tensorflow.org/. The site also contains extensive documentation and tutorials.
Further reading:
Distributed TensorFlow: Working with multiple GPUs and servers
Keras is a high-level neural network API, written in Python and capable of running on top of TensorFlow. For more information, refer to https://keras.io/.
TensorFlow and Keras hold the top two spots in terms of adoption and mention by researchers in scientific papers. The stack ranking of the frameworks and libraries as per arxiv.org is as follows:

You enjoyed an excerpt from Packt Publishing’s latest book, Artificial Intelligence for Big Data written by Anand Deshpande and Manish Kumar. If you are a Java developer, this is the book you will need to build next-generation Artificial Intelligence systems.
Content retrieved from: https://www.datasciencecentral.com/profiles/blogs/top-libraries-for-distributed-deep-learning.
Posted by Capri Granville on January 27, 2018 at 7:00pm
Guest blog post by Kevin Jacobs.
MLPs (Multi-Layer Perceptrons) are great for many classification and regression tasks. However, it is hard for MLPs to do classification and regression on sequences. In this Python deep learning tutorial, a GRU is implemented in TensorFlow. TensorFlow is one of the many Python Deep Learning libraries.
By the way, another great article on Machine Learning is this article on Machine Learning fraud detection. If you are interested in another article on RNNs, you should definitely read this article on the Elman RNN.
A sequence is an ordered set of items and sequences appear everywhere. In the stock market, the closing price is a sequence. Here, time is the ordering. In sentences, words follow a certain ordering. Therefore, sentences can be viewed as sequences. A gigantic MLP could learn parameters based on sequences, but this would be infeasible in terms of computation time. The family of Recurrent Neural Networks (RNNs) solve this by specifying hidden states which do not only depend on the input, but also on the previous hidden state. GRUs are one of the simplest RNNs. Vanilla RNNs are even simpler, but these models suffer from the Vanishing Gradient problem.
Video: https://www.youtube.com/watch?v=dFARw8Pm0Gk
The key idea of GRUs is that the gradient chains do not vanish due to the length of sequences. This is done by allowing the model to pass values completely through the cells. The model is defined as the following [1]:
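For reference, one common way to write a GRU cell is given below. It is a reconstruction under assumptions, not a verbatim copy of the definition in [1]: the parameter names $W$, $U$ and $b$ are assumed notation, and the form of the last equation is chosen so that a low update gate $z_t$ keeps the previous state, which matches the description of the gates in this section. Some references swap the roles of $z_t$ and $1 - z_t$.

```latex
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) \\
\tilde{h}_t &= \tanh(W_h x_t + U_h (r_t \circ h_{t-1}) + b_h) \\
h_t &= (1 - z_t) \circ h_{t-1} + z_t \circ \tilde{h}_t
\end{aligned}
```

Here $x_t$ is the input at step $t$, $h_{t-1}$ is the previous hidden state, $\tilde{h}_t$ is the candidate state and $h_t$ is the new hidden state.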
I had a hard time understanding this model, but it turns out that it is not too hard to understand. In the definitions, $\circ$ is used as the Hadamard product, which is just a fancier name for element-wise multiplication. $\sigma$ is the Sigmoid function, which is defined as $\sigma(x) = \frac{1}{1 + e^{-x}}$. Both the Sigmoid function ($\sigma$) and the Hyperbolic Tangent function ($\tanh$) are used to squish the values into a bounded range: $\sigma$ maps to values between $0$ and $1$, and $\tanh$ maps to values between $-1$ and $1$.
$z_t$ functions as a filter for the previous state. If $z_t$ is low (near $0$), then a lot of the previous state is reused! The input at the current state ($x_t$) does not influence the output a lot. If $z_t$ is high, then the output at the current step is influenced a lot by the current input ($x_t$), but it is not influenced a lot by the previous state ($h_{t-1}$).
$r_t$ functions as forget gate (or reset gate). It allows the cell to forget certain parts of the state.
In the code example, a simple task is used for testing the GRU. Given two numbers $a$ and $b$, their sum is computed: $c = a + b$. The numbers are first converted to reversed bitstrings. The reversal mirrors what most people do when adding up two numbers: you start at the rightmost digit and, if the sum of two digits does not fit into a single digit, you carry (memorize) a certain number. The model is capable of learning what to carry. As an example, take two small numbers and write each of them as a bitstring of length 3. Reversing both bitstrings puts the least significant bit first, so the bits can be added from left to right while carrying. The sum comes out in reversed bitstring representation; reversing it again gives the normal bitstring representation, which is equivalent to the decimal sum. These are all the steps which are also done by the code automatically.
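The conversion steps can be spelled out in a few lines of plain Python. This is a small sketch, not the data preparation code of the original post: the helper names are hypothetical, and the numbers 3 and 1 are arbitrary illustration values.

```python
# Reversed-bitstring addition, mirroring the data preparation described in the text.
# The numbers 3 and 1 are arbitrary illustration values.

def to_reversed_bits(number, length):
    """Binary digits of `number`, least significant bit first, padded to `length`."""
    return [(number >> i) & 1 for i in range(length)]


def from_reversed_bits(bits):
    """Inverse of to_reversed_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def add_reversed(bits_a, bits_b):
    """Add two equally long reversed bitstrings left to right, carrying as needed."""
    result, carry = [], 0
    for bit_a, bit_b in zip(bits_a, bits_b):
        total = bit_a + bit_b + carry
        result.append(total % 2)  # bit written at this position
        carry = total // 2        # value remembered for the next position
    return result                 # a final carry is dropped (fixed length)


a_bits = to_reversed_bits(3, 3)          # 3 is 011, reversed: [1, 1, 0]
b_bits = to_reversed_bits(1, 3)          # 1 is 001, reversed: [1, 0, 0]
sum_bits = add_reversed(a_bits, b_bits)  # [0, 0, 1], i.e. 100 in normal order
total = from_reversed_bits(sum_bits)     # 4
```

The carry variable is exactly the piece of information a recurrent model has to keep in its hidden state while it walks over the two input sequences.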
The code is self-explaining. If you have any questions, feel free to ask! The code can also be found on GitHub. Sharing (or Starring) is Caring :-)!

After ~2000 iterations, the model has fully learned how to add 2 integer numbers!
This Python deep learning tutorial showed how to implement a GRU in TensorFlow. The implementation of the GRU in TensorFlow takes only ~30 lines of code! There are some issues with respect to parallelization, but these issues can be resolved by using the TensorFlow API efficiently. In this tutorial, the model is capable of learning how to add two integer numbers (of any length).
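To make the gate mechanics concrete, here is a single GRU forward step in plain Python. It is a minimal sketch and not the TensorFlow implementation from the tutorial: there is no training, the parameter dictionary layout and all weight values are assumptions made for this example, and the state update uses the common convention in which a low update gate keeps the previous state.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) with a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gru_step(x_t, h_prev, p):
    """One GRU step. `p` holds weight matrices W*, U* and bias vectors b*."""
    def gate(name, activation, state):
        wx = matvec(p["W" + name], x_t)
        uh = matvec(p["U" + name], state)
        return [activation(a + b + c) for a, b, c in zip(wx, uh, p["b" + name])]

    z = gate("z", sigmoid, h_prev)                   # update gate
    r = gate("r", sigmoid, h_prev)                   # reset gate
    reset_state = [ri * hi for ri, hi in zip(r, h_prev)]
    h_candidate = gate("h", math.tanh, reset_state)  # proposed new state
    # Low z keeps the previous state, high z takes the candidate.
    return [(1.0 - zi) * hi + zi * ci
            for zi, hi, ci in zip(z, h_prev, h_candidate)]


def make_params(update_bias):
    """Hypothetical 1-input, 1-hidden-unit parameters; only the update bias varies."""
    return {"Wz": [[0.0]], "Uz": [[0.0]], "bz": [update_bias],
            "Wr": [[0.0]], "Ur": [[0.0]], "br": [0.0],
            "Wh": [[1.0]], "Uh": [[0.0]], "bh": [0.0]}


kept = gru_step([1.0], [0.5], make_params(-50.0))     # z near 0: previous state is reused
replaced = gru_step([1.0], [0.5], make_params(50.0))  # z near 1: candidate tanh(1.0) wins
```

With a strongly negative update bias the cell simply passes its previous state through, which is the mechanism that keeps gradients from vanishing over long sequences.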
To access the source code and view the original article, click here.
Content retrieved from: https://www.datasciencecentral.com/profiles/blogs/gru-implementation-in-tensorflow.
Posted by William Vorhies on April 10, 2018 at 8:18am
Summary: There are several things holding back our use of deep learning methods and chief among them is that they are complicated and hard. Now there are three platforms that offer Automated Deep Learning (ADL) so simple that almost anyone can do it.

There are several things holding back our use of deep learning methods and chief among them is that they are complicated and hard.
A small percentage of our data science community has chosen the path of learning these new techniques, but it’s a major departure both in problem type and technique from the predictive and prescriptive modeling that makes up 90% of what we get paid to do.
Artificial intelligence, at least in the true sense of image, video, text, and speech recognition and processing is on everyone’s lips but it’s still hard to find a data scientist qualified to execute your project.
Actually when I list image, video, text, and speech applications I’m selling deep learning a little short. While these are the best known and perhaps most obvious applications, deep neural nets (DNNs) are also proving excellent at forecasting time series data, and also in complex traditional consumer propensity problems.
Last December as I was listing my predictions for 2018, I noted that Gartner had said that during 2018, DNNs would become a standard component in the toolbox of 80% of data scientists. My prediction was that while the first provider to accomplish this level of simplicity would certainly be richly rewarded, no way was it going to be 2018. It seems I was wrong.
Here we are and it’s only April and I’ve recently been introduced to three different platforms that have the goal of making deep learning so easy, anyone (well at least any data scientist) can do it.
Minimum Requirements
All of the majors and several smaller companies offer greatly simplified tools for executing CNNs or RNN/LSTMs, but these still require experimental hand tuning of the layer types and number, connectivity, nodes, and all the other hyperparameters that so often defeat initial success.
To be part of this group you need a truly one-click application that allows the average data scientist or even developer to build a successful image or text classifier.
The quickest route to this goal is by transfer learning. In DL, transfer learning means taking a previously built successful, large, complex CNN or RNN/LSTM model and using a new more limited data set to train against it.
Basically transfer learning, most used in image classification, summarizes the more complex model into fewer or previously trained categories. Transfer learning can’t create classifications that weren’t in the original model, but it can learn to create subsets or summary categories of what’s there.
The advantage is that the hyperparameter tuning has already been done so you know the model will train. More importantly, you can build a successful transfer model with just a few hundred labeled images in less than an hour.
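The idea of reusing an already trained model and fitting only a small new part on limited data can be sketched as follows. This is a toy illustration in plain Python, not any vendor's implementation: `pretrained_features` is a hypothetical stand-in for a frozen, previously trained network, and the data set is synthetic.

```python
import math
import random


def pretrained_features(x):
    """Stand-in for a frozen, previously trained network (hypothetical).

    Its parameters are never updated below; only the new head is trained.
    """
    return [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1])]


def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit a small logistic-regression head on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = pretrained_features(x)
            z = sum(w * v for w, v in zip(weights, f)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            error = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * error * v for w, v in zip(weights, f)]
            bias -= lr * error
    return weights, bias


def predict(head, x):
    """Classify x with the frozen features and the trained head."""
    weights, bias = head
    f = pretrained_features(x)
    return 1 if sum(w * v for w, v in zip(weights, f)) + bias > 0 else 0


# Small synthetic labeled set, standing in for "a few hundred labeled images".
rng = random.Random(0)
samples = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
labels = [1 if x[0] + x[1] > 0 else 0 for x in samples]

head = train_head(samples, labels)
accuracy = sum(predict(head, x) == y for x, y in zip(samples, labels)) / len(samples)
print(accuracy)
```

Only the three parameters of the head are learned here, which is why so little labeled data and training time are needed compared with training the whole model from scratch.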
The real holy grail of AutoDL however, is fully automated hyperparameter tuning, not transfer learning. As you’ll read below, some are on track, and others claim to already have succeeded.
Microsoft CustomVision.AI
Late in 2017 MS introduced a series of greatly simplified DL capabilities covering the full range of image, video, text, and speech under the banner of the Microsoft Cognitive Services. In January they introduced their fully automated platform, Microsoft Custom Vision Services (https://www.customvision.ai/).
The platform is limited to image classifiers and promises to allow users to create robust CNN transfer models based on only a few images capitalizing on MS’s huge existing library of large, complex, multi-image classifiers.
Using the platform is extremely simple. You drag and drop your images onto the platform and press go. You’ll need at least a pay-as-you-go Azure account and basic tech support runs $29/mo. It’s not clear how long the models take to train but since it’s transfer learning it should be quite fast and therefore, we’re guessing, inexpensive (but not free).
During project setup you’ll be asked to identify a general domain from which your image set will transfer learn and these currently are:
While all these models will run from a RESTful API once trained, the last three categories (marked ‘compact’) can be exported to run offline on any iOS or Android edge device. Export is to the CoreML format for iOS 11 and to the TensorFlow format for Android. This should entice a variety of app developers who may not be data scientists to add instant image classification to their apps.
You can bet MS will be rolling out more complex features as fast as possible.
Google Cloud AutoML
Also in January, Google announced its similar entry Cloud AutoML. The platform is in alpha and requires an invitation to participate.
Like Microsoft, the service utilizes transfer learning from Google’s own prebuilt complex CNN classifiers. They recommend at least 100 images per label for transfer learning.

It’s not clear at this point what categories of images will be allowed at launch, but user screens show guidance for general, face, logo, landmarks, and perhaps others. From screen shots shared by Google it appears these models train in the range of about 20 minutes to a few hours.
In the data we were able to find, use appears to be via API. There’s no mention of export code for offline use. Early alpha users include Disney and Urban Outfitters.
Anticipating that many new users won’t have labeled data, Google offers access to its own human-labeling services for an additional fee.
Beyond transfer learning, all the majors including Google are pursuing ways of automating the optimal tuning of CNNs and RNNs. Handcrafted models are the norm today and are the reason so many often unsuccessful iterations are required.

Google calls this next technology Learn2Learn. Currently they are experimenting with RNNs to optimize layers, layer types, nodes, connections, and the other hyperparameters. Since this is basically very high speed random search the compute resources can be extreme.
Next on the horizon is the use of evolutionary algorithms to do the same which are much more efficient in terms of time and compute. In a recent presentation, Google researchers showed good results from this approach but they were still taking 3 to 10 days to train just for the optimization.
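To make the two search styles mentioned above concrete, here is a minimal sketch of random search and a simple evolutionary search over hyperparameters. Everything in it is an assumption for illustration: the search space, the `toy_score` function (a stand-in for "train the network and measure validation accuracy") and the mutation scheme are hypothetical, and this is not Google's method.

```python
import random

# Hypothetical search space for illustration only.
SEARCH_SPACE = {
    "layers": [1, 2, 3, 4, 5, 6],
    "nodes": [16, 32, 64, 128, 256],
    "learning_rate": [0.1, 0.03, 0.01, 0.003, 0.001],
}


def toy_score(config):
    """Stand-in for training a network and returning its validation accuracy.

    Peaks at layers=4, nodes=128, learning_rate=0.01; a real run would train here.
    """
    penalty = abs(config["layers"] - 4) * 0.05
    penalty += abs(SEARCH_SPACE["nodes"].index(config["nodes"]) - 3) * 0.05
    penalty += abs(SEARCH_SPACE["learning_rate"].index(config["learning_rate"]) - 2) * 0.05
    return 1.0 - penalty


def random_config(rng):
    """Draw every hyperparameter independently at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def random_search(rng, trials):
    """Evaluate independent random configurations and keep the best one."""
    return max((random_config(rng) for _ in range(trials)), key=toy_score)


def mutate(config, rng):
    """Copy a configuration and re-draw one randomly chosen hyperparameter."""
    child = dict(config)
    name = rng.choice(sorted(SEARCH_SPACE))
    child[name] = rng.choice(SEARCH_SPACE[name])
    return child


def evolutionary_search(rng, population_size, generations):
    """Keep the better half of the population and refill it with mutated copies."""
    population = [random_config(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=toy_score, reverse=True)
        parents = population[: population_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return max(population, key=toy_score)


rng = random.Random(0)
best_random = random_search(rng, trials=20)
best_evolved = evolutionary_search(rng, population_size=8, generations=5)
print(best_random, toy_score(best_random))
print(best_evolved, toy_score(best_evolved))
```

In a real setting every call to the scoring function means training a full network, which is where the days of compute come from. The evolutionary variant spends its evaluations near configurations that already scored well instead of sampling blindly.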
OneClick.AI
OneClick.AI is an automated machine learning (AML) platform new in the market late in 2017 which includes both traditional algorithms and also deep learning algorithms.
OneClick.AI would be worth a look based on just its AML credentials which include blending, prep, feature engineering, and feature selection, followed by the traditional multi-models in parallel to identify a champion model.
However, what sets OneClick apart is that it includes both image and text DL algos with both transfer learning as well as fully automated hyperparameter tuning for de novo image or text deep learning models.
Unlike Google and Microsoft they are ready to deliver on both image and text. Beyond that, they blend DNNs with traditional algos in ensembles, and use DNNs for forecasting.
Forecasting is a little-explored area of use for DNNs, but it’s been shown to easily outperform other time series forecasters like ARIMA and ARIMAX.
For a platform with this complex offering of tools and techniques, it maintains its claim to one-click, data-in-model-out ease, which I identify as the minimum requirement for Automated Machine Learning, and which here also covers Automated Deep Learning.
The methods used for optimizing its deep learning models are proprietary, but Yuan Shen, Founder and CEO describes it as using AI to train AI, presumably a deep learning approach to optimization.
Which is Better?
It’s much too early to expect much in the way of benchmarking but there is one example to offer, which comes from OneClick.AI.
In a hackathon earlier this year the group tested OneClick against Microsoft’s CustomVision (Google AutoML wasn’t available). Two image classification problems were tested:
Tagging photos with horses running or horses drinking water.

Detecting photos with nudity.

The horse tagging task was multi-label classification, and the nudity detection task was binary classification. For each task they used 20 images for training, and another 20 for testing.
This lacks statistical significance and uses only a very small sample in transfer learning. However the results look promising.
This is transfer learning. We’re very interested to see comparisons of the automated model optimization. OneClick’s is ready now. Google should follow shortly.
You may also be asking, where is Amazon in all of this? In our search we couldn’t find any reference to a planned AutoDL offering, but it can’t be far behind.
Content retrieved from: https://www.datasciencecentral.com/profiles/blogs/automated-deep-learning-so-simple-anyone-can-do-it.
Our team of global experts has compiled this list of the 8 best Deep Learning certifications, courses, trainings and tutorials available online in 2018 to help you learn Deep Learning. These are suitable for beginners, intermediate learners as well as experts.
This is undoubtedly one of the most sought-after deep learning certifications, with Andrew Ng himself teaching the subject. The co-founder of the global learning platform Coursera, Andrew has been the head of Google Brain and Baidu AI group in the past. Joining him are Teaching Assistants Younes Bensouda Mourri from Mathematical & Computational Sciences at Stanford University and Kian Katanforoosh, Adjunct Lecturer at Stanford University. All in all, we have no doubt in proclaiming this as the Best Deep Learning Certification out there. In this certification course, you will learn about the foundations of Deep Learning, know how to build neural networks and understand all about machine learning projects. There will be real-world case studies including sign language reading, music generation and natural language processing among others. Along with all the theory, you will be taught to implement these concepts in Python and TensorFlow.
Rating : 4.7 out of 5
Review : Course content is very good. Andrew Ng’s style of teaching is phenomenal. He has a knack for uncomplicating an otherwise complex subject matter. Highly recommended for anyone who is trying to understand the fundamentals of neural networks and deep learning.
Jose Marcial Portilla has an MS from Santa Clara University and has been teaching Data Science and programming for multiple years now. His training will help you learn how to use Google’s Deep Learning Framework – TensorFlow with Python. He will also teach you how you can use TensorFlow for Image Classification with Convolutional Neural Networks, how to do time series analysis with Recurrent Neural Networks and teach you to solve unsupervised learning problems with AutoEncoders. This training has been attended by close to 20,000 students and has got remarkable reviews and ratings.
Rating : 4.6 out of 5
Review – Excellent course. Portilla sets a pedagogical curve. Responsive Q&A, and reliable and regularly updated course materials are made available. Good foundation to a broad array of well-established and cutting-edge topics, and many useful external resources provided. – Jack Rasmus-Vorrath
A whopping 72,000 students have attended this training course on Deep Learning. Kirill Eremenko, Hadelin de Ponteves and the SuperDataScience Team are pros when it comes to matters of deep learning, data science and machine learning. Even basic high school level mathematics is enough for you to get started with this course, and in the 23 hours of on-demand video the trainers will take you through all the knowledge and information you need to become proficient at deep learning. Specifically, you will learn about the intuition behind Artificial Neural Networks and Convolutional Networks, applying Artificial Neural Networks and Convolutional Networks in practice, and much more around Recurrent Neural Networks, Self Organizing Maps and Boltzmann Machines. This is one of the best deep learning courses you will find out there.
Rating : 4.5 out of 5
Review – Very nice course! These two instructors know how to explain difficult concepts in simple terms. Kirill is an intuition god, and Hadelin explains every single line of code as you go through the examples. I feel comfortable enough to apply what I learned in this course in my own project. I definitely recommend this course to anyone who wants to understand the basic deep learning concepts and how they are implemented in the real world. – Raoul Noumbissi
The trainer is a data scientist, big data engineer as well as a full stack software engineer. He has a master's degree in computer engineering with a specialization in machine learning and pattern recognition. With a CV like that, you should already feel assured of the quality of teaching for this deep learning program. This course is a complete guide to deriving and implementing GloVe, word2vec and word embeddings. You will also be taught how to understand and implement recursive neural tensor networks for sentiment analysis. At 6 hours, this is a good crash course for those with not enough time on their hands.
Rating : 4.6 out of 5
Review – Instructor explains things vividly and with detail unlike some other instructors of machine learning. I recommend this course for serious data scientists. – Xiao Qiao
In this deep learning training spanning 7.5 hours, with full lifetime access, you will learn to apply momentum to back propagation to train neural networks, apply adaptive learning rate procedures like AdaGrad, RMSprop, and Adam, understand the basic building blocks of Theano and then build a neural network in Theano. In addition to understanding TensorFlow, you will also write a neural network using Keras, PyTorch, CNTK and MXNet. In order to attend this program, you will have to be comfortable with Python, Numpy, and Matplotlib. You will have to install Theano and TensorFlow before or during the training.
Rating : 4.6 out of 5
Review – Clear and consistent. Goes over enough of the pre-requisites each course to refresh and remind the student of the foundations and then breaks new ground. Courses are updated often and stay current with the latest versions of the imports API’s. – Bill Hicks
In this deep learning certification by Microsoft, you will learn an intuitive approach to building complex models that help machines solve real problems. You will need basic programming skills and a working knowledge of data science before signing up in order to make the most of this program. This course aims to enable engineers, data scientists and technology managers to develop a sound understanding of this technology. You will be taught how to use the Microsoft Cognitive Toolkit (CNTK) to tap into data sets through deep learning. The course is taught by Jonathan Sanito, Senior Content Developer at Microsoft; Sayan Pathak, Principal ML Scientist and AI School Instructor, CNTK team; and Roland Fernandez, Senior Researcher and AI School Instructor, Deep Learning Technology Center, Microsoft Research AI.
Rating : 4.4 out of 5
This program will serve as a guide for writing a neural network in Python and Numpy using Google’s TensorFlow. The trainer will teach you about how deep learning really works and how a neural network is built from basic building blocks. He will help you demystify various terms related to neural networks like “activation”, “backpropagation” and “feedforward”. There is a live project which is a part of the course to help you implement what you learn in real time.
Rating : 4.6 out of 5
Review – This is a very honest course taught by somebody who clearly understands the subject in great depth. As somebody who has been playing around with Keras, Scikit learn and Tensorflow for over a year, I have learned a huge amount through implementing models taught in this course just using Python. – Malcolm Mason
Know all there is to know about the simple recurrent unit (Elman unit), GRU (gated recurrent unit), LSTM (long short-term memory unit) and also figure out how to write various recurrent networks in Theano in this course around recurrent neural networks in Python. To take up this program, you should know about backpropagation, understand Calculus and Linear Algebra. 10,000+ students have attended this course with great reviews and high ratings.
Rating : 4.6 out of 5
Review – This is the best intro to RNN that I have seen so far, much better than Udacity version in the Deep Learning Nanodegree. I really like the emphasis on the math: although it is not deep but it is clear enough so one get some mathematical intuitions on the working of the Recurrent unit. – Jean-Marc Beaujour
So that was our take on the Best Deep Learning courses, tutorials, certifications and training, especially for 2018. Do check out Best Machine Learning Online Course to dive deep into the domain and also Blockchain Training along with Best Python Certification. Since all these courses can be attended online, you have the benefit of carrying on learning from just about anywhere on the planet. We wish you all the best in your career! Team Digital Defynd.
Content retrieved from: https://digitaldefynd.com/deep-learning-courses-training-tutorial-certifications/.
Posted by Ronald van Loon on May 7, 2018 at 11:00pm
Deep learning is a sub-category within machine learning and artificial intelligence. It is inspired by and based on the model of the human brain to create artificial neural networks for machines. Deep learning will allow machines and devices to function in some ways as humans do.
Dr. Rodrigo Agundez of GoDataDriven is co-author of this article and very enthusiastic about the improvements that deep learning can offer. He’s been involved in the data science and analysis field for some time, and is already working on implementing models for practical applications.
Rodrigo notes that the new generation of users wants to interact with devices and appliances in a human-like manner. Take the example of Apple’s Siri, which allows for voice command and voice recognition. Communicating with Siri is similar to interacting with a human.
The user interface for Siri seems simple enough. However, the A.I. algorithms that are designed on the back-end are quite complex.
Designing this kind of interaction with a machine was not possible a few years ago. System designers now have access to complex deep learning algorithms that make it possible to integrate such behavior into machines.
Artificial Intelligence will never truly come of age without giving machines the powerful capabilities of deep learning.
The idea of designing deep learning models can be difficult to grasp for many people. This is because understanding human concepts comes naturally to us. But giving the same ability to machines is a very complex process of design.
One way to do it is by structuring data in a way that makes it easier to process for machines. Take the word “fat” for instance. If we say to a friend, “This burger has too much fat,” they would understand what we mean and the word would have a negative connotation here. But if we told a friend that “I would love to get fat from this meal any day,” the word would mean something entirely different.
Creating machines that are capable of understanding minute differences in words embedded in a context may seem like a very small thing, but requires a very large set of data and complex algorithms to execute.
One way to differentiate between traditional machine learning and deep learning is through the use of features. These are the characteristics of the data that help us differentiate and identify one entity from another.
To understand features better, take the example of a normal bank transaction. Features of the transaction help us identify the timing of the transaction, the value transferred, names of the parties to the transaction, and other important information.
In a traditional machine learning model, features have to be designed by humans. In a deep learning model, features are identified by the A.I. itself.
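As a small illustration of what "designed by humans" means for the bank transaction example, the sketch below derives a few features by hand. The field names, the chosen features and the threshold of 1000 are all hypothetical and not taken from any real system.

```python
from datetime import datetime


def handcrafted_features(transaction):
    """Turn a raw transaction record into features chosen by a human analyst."""
    timestamp = datetime.fromisoformat(transaction["timestamp"])
    return {
        "hour_of_day": timestamp.hour,                     # timing of the transaction
        "is_weekend": timestamp.weekday() >= 5,            # Saturday or Sunday
        "amount": transaction["amount"],                   # value transferred
        "is_large_amount": transaction["amount"] > 1000,   # threshold is an assumption
        "same_party": transaction["sender"] == transaction["receiver"],
    }


# Hypothetical record for illustration.
example = {
    "timestamp": "2018-05-05T23:15:00",
    "amount": 2500.0,
    "sender": "Alice",
    "receiver": "Bob",
}
print(handcrafted_features(example))
```

Every line in the returned dictionary encodes a human decision about what might matter. A deep learning model would instead be given the raw records and left to find useful representations on its own.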
We can take another example of differences between a cat and a dog. If we showed a person a cat and a dog and asked them to point to the cat, they would immediately identify it. However, if the same person was asked to identify the exact features that differentiate the two, they would have a problem. Both creatures have four legs, a body, a tail, and a head. They appear very similar in terms of features. Humans can distinguish one from the other in an instant. Yet, they would have trouble identifying the features that differentiate any pair of a cat and a dog.
This is a problem that data scientists and A.I. developers hope to solve with deep learning. Features can be found even in unstructured data with the help of deep learning algorithms.
Rodrigo states that deep learning models are superior to traditional machine learning models at certain A.I. tasks, as the models have shown their effectiveness. This can be traced back to 2012, when, in a well-known online image recognition challenge, a deep learning algorithm proved to be twice as effective as any other algorithm before.
If an A.I. model reaches an accuracy of 50%, the device would not be very practical for use. Take the example of automobiles. A person would not trust getting in a car where brakes work 50% of the time.
However, if the accuracy of an A.I. system reaches values around 95%, it would be much more reliable and robust for practical use. Rodrigo believes that this level of accuracy for human-like tasks can only be achieved with deep learning algorithms.
Deep learning can be applied to speech recognition to improve customer experience. Speech recognition technology has been around for quite some time, but it didn’t cross the accuracy boundary to become a marketable product until the introduction of deep learning models.
Home automation systems and devices work through voice command. This is an area where deep learning can significantly improve customer experience.
Royal FloraHolland is the biggest horticulture marketplace and knowledge center in the world. An essential part of their process is having the correct photographs of the flowers or plants uploaded by suppliers. These photos need to show a plant, and some images also require a ruler to be visible or a tray to be present.
The task of sorting through all these photographs manually and quickly is basically impossible, therefore it was decided to implement A.I. for the process.
GoDataDriven designed a system with deep learning algorithms to automate the checking of the images. The system can accurately identify and sort pictures taken from different angles and devices.
The system removed the need for manual human review and completely automated the process for the company.
Deep learning algorithms were developed for UMCG in collaboration with GoDataDriven, Google and Siemens. This involved the use of MRI data in a 4D format (volume + time). Using deep learning models, the team calculated how the volumes of the heart ventricles evolve over time.

One of the project goals is to assist in the decision making regarding pace makers and treatments. For example, it could take the heart cycle and volumes into consideration for prognosis and heart failure.
More than 400 images were taken per patient for different heart depths across time. The team at GoDataDriven and Siemens developed multiple models, including binary and multi-class segmentation.

The model based on the U-Net deep learning architecture takes the MRI scan as input and outputs the corresponding volumes.
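The article does not describe how volumes are derived from the model output. One common way to go from a segmentation to a volume, shown here purely as an assumption, is to count the voxels labelled as ventricle in each time frame and multiply by the physical volume of one voxel. The function name, the mask layout and the voxel size below are hypothetical.

```python
def ventricle_volumes(masks, voxel_volume_ml):
    """Return one volume per time frame from binary segmentation masks.

    masks[t][z][y][x] is 1 where the voxel is labelled as ventricle, else 0
    (time, depth, height, width: the "volume + time" 4D layout).
    """
    volumes = []
    for frame in masks:
        voxels = sum(value for slice_ in frame for row in slice_ for value in row)
        volumes.append(voxels * voxel_volume_ml)
    return volumes


# Tiny made-up example: 2 time frames, 2 slices of 2x2 pixels each.
masks = [
    [[[1, 1], [0, 0]], [[1, 0], [0, 0]]],   # frame 0: 3 labelled voxels
    [[[1, 0], [0, 0]], [[0, 0], [0, 0]]],   # frame 1: 1 labelled voxel
]
volumes = ventricle_volumes(masks, voxel_volume_ml=2.0)
print(volumes)
```

Plotting the resulting list against time gives the volume curve over the heart cycle that the project is interested in.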
Traditionally, the process is done manually by looking at the scans and interpreting the results through hand-drawn diagrams.
Deep learning provides a way for companies to develop life-long learning modules. When more complex and richer algorithms are developed on top of pre-existing ones, companies will be able to achieve incremental growth.
Rodrigo believes that deep learning has a bright future because of its open source community and accessible platforms. Major corporations such as Apple which had built their systems on secrecy are finally coming around to the open-source model.
The main reason they are switching now is that they find deep learning talent acquisition more difficult in comparison with open source companies, such as Google’s DeepMind for example. A company could have developed the most amazing and efficient deep learning system, but if they don’t publish their research and share the knowledge online, talented data scientists and deep learning practitioners will not apply to these companies.
Currently deep learning teams like Google Brain, Google DeepMind and companies like Facebook and Baidu find it much easier to hire talented deep learning practitioners. They continuously publish research and open source the related implementations, so that the deep learning community is reminded that these companies are at the cutting edge of these technologies.
Since the shift is towards open source and global adoption of this technology, deep learning is likely to do well in the future and impact vast sectors of our society.
Dr. Rodrigo Agundez

Rodrigo Agundez is a Data Scientist and Deep Learning specialist for GoDataDriven. Rodrigo has worked as a consultant in numerous artificial intelligence projects and has given multiple deep learning trainings and workshops inside and outside The Netherlands. If you would like to know more about the exciting world of deep learning don’t hesitate to contact him via LinkedIn or Twitter.
Ronald van Loon
Ronald van Loon is Director at Adversitement, and an Advisory Board Member and Big Data & Analytics course advisor for Simplilearn. He contributes his expertise towards the rapid growth of Simplilearn’s popular Big Data & Analytics category.
If you would like to read more from Ronald van Loon on the possibilities of Big Data and the Internet of Things (IoT), please click “Follow” and connect on LinkedIn, Twitter, and YouTube.
Content retrieved from: https://www.datasciencecentral.com/profiles/blogs/how-deep-learning-will-change-customer-experience.

In this article, I clarify the various roles of the data scientist, and how data science compares and overlaps with related fields such as machine learning, deep learning, AI, statistics, IoT, operations research, and applied mathematics. As data science is a broad discipline, I start by describing the different types of data scientists that one may encounter in any business setting: you might even discover that you are a data scientist yourself, without knowing it. As in any scientific discipline, data scientists may borrow techniques from related disciplines, though we have developed our own arsenal, especially techniques and algorithms to handle very large unstructured data sets in automated ways, even without human interactions, to perform transactions in real-time or to make predictions.

1. Different Types of Data Scientists
Recently (August 2016) Ajit Jaokar discussed Type A (Analytics) versus Type B (Builder) data scientist:
I also wrote about the ABCD’s of business processes optimization where D stands for data science, C for computer science, B for business science, and A for analytics science. Data science may or may not involve coding or mathematical practice, as you can read in my article on low-level versus high-level data science. In a startup, data scientists generally wear several hats, such as executive, data miner, data engineer or architect, researcher, statistician, modeler (as in predictive modeling) or developer.
While the data scientist is generally portrayed as a coder experienced in R, Python, SQL, Hadoop and statistics, this is just the tip of the iceberg, made popular by data camps focusing on teaching some elements of data science. But just like a lab technician can call herself a physicist, the real physicist is much more than that, and her domains of expertise are varied: astronomy, mathematical physics, nuclear physics (which is borderline chemistry), mechanics, electrical engineering, signal processing (also a sub-field of data science) and many more. The same can be said about data scientists: fields are as varied as bioinformatics, information technology, simulations and quality control, computational finance, epidemiology, industrial engineering, and even number theory.
In my case, over the last 10 years, I specialized in machine-to-machine and device-to-device communications, developing systems to automatically process large data sets, to perform automated transactions: for instance, purchasing Internet traffic or automatically generating content. It implies developing algorithms that work with unstructured data, and it is at the intersection of AI (artificial intelligence), IoT (Internet of things), and data science. This is referred to as deep data science. It is relatively math-free, and it involves relatively little coding (mostly APIs), but it is quite data-intensive (including building data systems) and based on brand new statistical technology designed specifically for this context.
Prior to that, I worked on credit card fraud detection in real time. Earlier in my career (circa 1990) I worked on image remote sensing technology, among other things to identify patterns (or shapes or features, for instance lakes) in satellite images and to perform image segmentation: at that time my research was labeled as computational statistics, but the people doing the exact same thing in the computer science department next door in my home university, called their research artificial intelligence. Today, it would be called data science or artificial intelligence, the sub-domains being signal processing, computer vision or IoT.
Also, data scientists can be found anywhere in the lifecycle of data science projects, at the data gathering stage, or the data exploratory stage, all the way up to statistical modeling and maintaining existing systems.
2. Machine Learning versus Deep Learning
Before digging deeper into the link between data science and machine learning, let’s briefly discuss machine learning and deep learning. Machine learning is a set of algorithms that train on a data set to make predictions or take actions in order to optimize some systems. For instance, supervised classification algorithms are used to classify potential clients into good or bad prospects, for loan purposes, based on historical data. The techniques involved, for a given task (e.g. supervised clustering), are varied: naive Bayes, SVM, neural nets, ensembles, association rules, decision trees, logistic regression, or a combination of many. For a detailed list of algorithms, click here. For a list of machine learning problems, click here.
All of this is a subset of data science. When these algorithms are automated, as in automated piloting or driver-less cars, it is called AI, and more specifically, deep learning. Click here for another article comparing machine learning with deep learning. If the data collected comes from sensors and if it is transmitted via the Internet, then it is machine learning or data science or deep learning applied to IoT.
Some people have a different definition of deep learning. They consider deep learning to be neural networks (a machine learning technique) with more, deeper layers. The question was asked on Quora recently, and below is a more detailed explanation (source: Quora).
What is the difference between machine learning and statistics?
This article tries to answer the question. The author writes that statistics is machine learning with confidence intervals for the quantities being predicted or estimated. I tend to disagree, as I have built engineer-friendly confidence intervals that don’t require any mathematical or statistical knowledge.
3. Data Science versus Machine Learning
Machine learning and statistics are part of data science. The word learning in machine learning means that the algorithms depend on some data, used as a training set, to fine-tune some model or algorithm parameters. This encompasses many techniques such as regression, naive Bayes or supervised clustering. But not all techniques fit in this category. For instance, unsupervised clustering – a statistical and data science technique – aims at detecting clusters and cluster structures without any a priori knowledge or training set to help the classification algorithm. A human being is needed to label the clusters found. Some techniques are hybrid, such as semi-supervised classification. Some pattern detection or density estimation techniques fit in this category.
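As a contrast with supervised techniques, here is a minimal sketch of unsupervised clustering using plain k-means (Lloyd's algorithm), one common choice among many. The points, the starting centroids and the function name `kmeans` are assumptions made for illustration. Note that no labels are passed in: the algorithm only returns groups, and a human still has to decide what each group means.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points.

    The initial centroids are passed in explicitly (instead of being drawn
    at random) so that this illustrative run is deterministic.
    """
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled observations (values invented for illustration).
points = [
    (1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
    (8.0, 8.0), (9.0, 8.5), (8.5, 9.0),
]

centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

The output is two groups of points and their centers, nothing more. Naming them (say, "low-value" and "high-value" customers) is the labeling step that the text assigns to a human being.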
Data science is much more than machine learning though. Data, in data science, may or may not come from a machine or mechanical process (survey data could be manually collected, clinical trials involve a specific type of small data) and it might have nothing to do with learning as I have just discussed. But the main difference is the fact that data science covers the whole spectrum of data processing, not just the algorithmic or statistical aspects. In particular, data science also covers
Of course, in many organisations, data scientists focus on only one part of this process.
Content retrieved from: https://www.datasciencecentral.com/profiles/blogs/difference-between-machine-learning-data-science-ai-deep-learning.