
What is Machine Learning

Overview

Teaching: 45 min
Exercises: 0 min
Questions
  • What is machine learning, and how does it relate to artificial intelligence and deep learning?

Objectives
  • Understand the differences between artificial intelligence, machine learning and deep learning.

  • Become familiar with some of the most common ways to classify machine learning algorithms.

Differences between Artificial Intelligence, Machine Learning and Deep Learning

The best way to understand the differences between them is to visualize them as concentric circles, with Artificial Intelligence (the originator field) as the largest, Machine Learning (which flourished later) as a subfield of AI, and Deep Learning (the driving force of the modern AI explosion) fitting inside both.

AI, ML and DL relationship

Artificial Intelligence

Ada Lovelace is considered the first computer programmer in history. While she was the first to recognise the enormous potential of the Analytical Engine and universal computing, she also noted what she considered an intrinsic limitation:

“The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform…. Its province is to assist us in making available what we are already acquainted with.”

For several decades this remained the default view on the topic. However, the computing developments reached by the 1950s were significant enough for Alan Turing to pose the question “Can machines think?”. This sparked the development of the new scientific field of AI, which can be defined as:

The effort to automate intellectual tasks normally performed by humans.

In this way, AI is a general field that encompasses Machine Learning and Deep Learning, but that also includes many more approaches that don’t involve any learning. For a fairly long time, many experts believed that human-level AI could be achieved by having programmers handcraft a sufficiently large set of explicit rules for manipulating knowledge. This approach is known as symbolic AI (it represents a problem using symbols and then uses logic to search for solutions), and it was the dominant paradigm in AI from the 1950s to the late 1980s. It reached its peak popularity during the expert systems boom of the 1980s.

Expert systems

The INCO (Integrated Communications Officer) Expert System Project (IESP) was undertaken in 1987 by the Mission Operations Directorate (MOD) at NASA’s Johnson Space Center (JSC) to explore the use of advanced automation in the mission operations arena.

Space Shuttle - INCO

MYCIN was an early backward-chaining expert system that used artificial intelligence to identify bacteria causing severe infections.
  • MYCIN operated using a fairly simple inference engine and a knowledge base of ~600 rules.
  • MYCIN was developed over five or six years in the early 1970s at Stanford University and was written in Lisp.

Dendral was an artificial intelligence project of the 1960s for the specific task of helping organic chemists in identifying unknown organic molecules by analyzing their mass spectra and using knowledge of chemistry. This software is considered the first expert system because it automated the decision-making process and problem-solving behavior of organic chemists.

Caffeine Structure

Although symbolic AI proved suitable for solving well-defined, logical problems, such as playing chess, it turned out to be intractable to figure out explicit rules for solving more complex, fuzzy problems, such as image classification, speech recognition, and language translation. A new approach arose to take symbolic AI’s place: Machine Learning.

Machine Learning

Machine learning arises from the question: could a computer go beyond “what we know how to order it to perform” and learn on its own how to perform a specified task? Could a computer surprise us? Rather than programmers crafting data-processing rules by hand, could a computer automatically learn these rules by looking at data?

This question opened the door to a new programming paradigm. In classical programming, the paradigm of symbolic AI, humans input rules (a program) and data to be processed according to these rules, and out come answers. With Machine Learning, humans input data as well as the answers expected from the data, and out come the rules. These rules can then be applied to new data to produce original answers. The short sketch below illustrates the contrast.

Classical programming vs Machine Learning paradigm
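To make the contrast concrete, here is a minimal sketch in Python. The Fahrenheit-to-Celsius conversion is a toy example of our own choosing (it is not part of the lesson): in the classical style we write the rule ourselves, while in the Machine Learning style we hand over data and expected answers and recover an approximate rule.

```python
# Classical programming: the programmer hand-writes the rule and applies it to data.
def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9          # the rule is given explicitly

print(fahrenheit_to_celsius(212.0))  # -> 100.0

# Machine Learning: we provide data and the expected answers,
# and a learning algorithm recovers (an approximation of) the rule.
import numpy as np

fahrenheit = np.array([32.0, 50.0, 68.0, 86.0, 212.0])   # data
celsius = np.array([0.0, 10.0, 20.0, 30.0, 100.0])       # expected answers
slope, intercept = np.polyfit(fahrenheit, celsius, 1)     # the "learned" rule
print(slope, intercept)                                    # close to 5/9 and -160/9
```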

Using data to predict something can be categorized as Machine Learning. A very simple example is to build a linear regression model based, for example, on weight and size measurements of several mice, and then use it to predict the size of a new mouse from its weight (a minimal sketch follows the list below). In general, to do Machine Learning, we need three things:

  • Input data points (for example, the weights of our mice)
  • Examples of the expected output (the corresponding sizes)
  • A way to measure whether the algorithm is doing a good job
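Here is a minimal sketch of the mouse example, assuming scikit-learn and NumPy are available; the weight and size measurements below are invented purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Invented measurements: weights in grams, sizes in centimetres.
weights = np.array([[18.0], [21.5], [23.0], [25.5], [28.0], [30.5]])
sizes = np.array([7.2, 7.9, 8.1, 8.8, 9.3, 9.9])

model = LinearRegression()
model.fit(weights, sizes)              # learn the rule from the data

new_mouse = np.array([[26.0]])
print(model.predict(new_mouse))        # predicted size for a 26 g mouse
```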

The central problem in Machine Learning is to meaningfully transform data, that is, to learn useful representations of the input data at hand, representations that get us closer to the expected output. A representation is, at its core, a different way to look at data, a different way to represent or encode it. Machine Learning models are all about finding appropriate representations of their input data: transformations of the data that make it more amenable to the task at hand, such as a classification task.

Learning by changing representations

Consider a number of points distributed in an xy-coordinate system. Some of them are white and some are black.

Example 1 - Raw Data

We are given the task of developing an algorithm that calculates the probability of a point being black or white given its x-y coordinates. In this case,

  • The inputs are the coordinates of our points
  • The expected outputs are the colours of our points
  • The measure of success would be the percentage of points correctly classified

One way to solve the problem is by applying a coordinate change (a new representation of our data). This new representation allows us to classify our points with a much simpler rule: “black points are those such that x > 0”.

In this case, we defined the coordinate change by hand. But if instead we tried systematically searching for different possible coordinate changes, and used as feedback the percentage of points being correctly classified, then we would be doing Machine Learning. Learning, in this context, describes an automatic search process for better representations.

Example 1 - Coordinate change and better representation

All Machine Learning algorithms consist of automatically finding such transformations that turn data into more useful representations for a given task. These operations can be coordinate changes, as you just saw, or linear projections, translations, nonlinear operations, and so on. Machine Learning algorithms aren’t usually creative in finding these transformations; they’re merely searching through a predefined set of operations, also called a hypothesis space.
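To make the idea of searching a hypothesis space concrete, the sketch below (with invented points and a colouring rule of our own choosing) tries a predefined set of rotations and keeps the one that classifies the most points correctly, using the percentage of correctly classified points as feedback.

```python
import numpy as np

# Invented data: 200 random points, coloured black when x + y > 0.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
labels = (points[:, 0] + points[:, 1] > 0).astype(int)   # 1 = black, 0 = white

def accuracy(angle, points, labels):
    """Rotate the points by `angle` and classify as black when the new x > 0."""
    c, s = np.cos(angle), np.sin(angle)
    rotated_x = c * points[:, 0] - s * points[:, 1]
    predictions = (rotated_x > 0).astype(int)
    return (predictions == labels).mean()

# The hypothesis space: a predefined set of candidate coordinate changes (rotations).
angles = np.linspace(0.0, 2.0 * np.pi, 360)
scores = [accuracy(a, points, labels) for a in angles]
best = angles[int(np.argmax(scores))]
print(f"best rotation: {np.degrees(best):.1f} degrees, accuracy: {max(scores):.1%}")
```

For this toy colouring, a rotation of about 315 degrees maps the separating line onto the y-axis, so the simple rule “black when x > 0” classifies essentially every point correctly.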

With the previous description in mind, we can summarize Machine Learning as:

the field of study that gives computers the ability to learn without being explicitly programmed. —Arthur Samuel, 1959

Or more formally:

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. —Tom Mitchell, 1997

A machine-learning system is trained rather than explicitly programmed. It is presented with many examples relevant to a task, and it finds statistical structure in these examples that eventually allows the system to come up with rules for automating the task.

When to use Machine Learning

Machine Learning methods are great when:

  • the solution of a problem requires lots of parameter tweaking and/or writing long lists of rules.
  • there is no known optimal solution for the problem at hand.
  • the problem requires adapting to constantly changing data.
  • we wish to obtain more insights about complex problems and large amounts of data.

Common Machine Learning problems

There are many classes and subclasses of Machine Learning problems based on what the prediction task looks like. Some of the most common ones include:

Types of Machine Learning

Although there is no formal Machine Learning classification system yet, machine learning algorithms are commonly described and classified into broad categories based on:

  • the kind of problem they are trying to solve
  • how the data is processed
  • how much human supervision is required

Main challenges

Computational frameworks

Deep learning frameworks allow a user to define networks either via a configuration file (as in Caffe) or programmatically (as in Theano, TensorFlow, or Torch). Furthermore, the programming language exposed to define networks might vary: Python in the case of Theano and TensorFlow, or Lua in the case of Torch. An additional variation is whether the framework provides define-compile-execute semantics or dynamic semantics (as in the case of PyTorch).
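As a small illustration of the dynamic (define-by-run) style, the following sketch assumes PyTorch is installed; the layer sizes and the input batch are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

# The network is an ordinary Python object and runs eagerly, with no separate compile step.
model = nn.Sequential(
    nn.Linear(4, 8),   # 4 input features -> 8 hidden units
    nn.ReLU(),
    nn.Linear(8, 1),   # 8 hidden units -> 1 output
)

x = torch.randn(2, 4)   # a batch of 2 made-up examples
y = model(x)            # executed immediately, line by line
print(y.shape)          # torch.Size([2, 1])
```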

References

Some further reading to learn more about machine learning and deep learning.

Further training

Some other sources with training material for DL and ML applications:

Key Points

  • Machine Learning is the science of getting computers to learn without being explicitly programmed.

  • Machine Learning’s main objective is to find new, useful representations that help uncover hidden patterns in data.

  • There are several ways to classify machine learning algorithms, for example based on the problem they are trying to solve, how the data is processed, and how much human supervision is required.