What is Machine Learning?

The term ‘machine learning’ was coined by Arthur Samuel in 1959. He defined it as:

Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed.

Arthur Samuel

However, a more recent definition, now widely used by academics, has gained popularity:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Tom Mitchell

Simply put, machine learning is the idea that a generic set of algorithms can tell you something specific about the data you feed them, without you having to write a custom program for each problem.

Supervised and Unsupervised Learning

ML algorithms fall into one of two broad categories: supervised learning or unsupervised learning.

Supervised Learning

In supervised learning, we are given a data set and already know what the correct output looks like, i.e. there is a known relationship between the input and the output.

If we have an input parameter (x) and an output parameter (y), we write an algorithm that learns to figure out the mapping function from input to output.

y = f(x)

The algorithm iteratively tries to predict the right answers on the training data set, and since we already know the right answers, we can correct it whenever it gets one wrong. This process continues until the predictions are accurate enough, which suggests the algorithm has learned a good approximation of the mapping function.
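To make the "predict, compare, correct" loop concrete, here is a minimal sketch in plain Python. It assumes the hidden mapping is y = 2x + 1 and uses gradient descent on the squared error as the correction step; the data, the `train` function and its parameters are all made up for illustration, and real algorithms differ in how they correct themselves.

```python
# Toy training set generated from a hidden rule y = 2x + 1 (an assumption for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def train(xs, ys, learning_rate=0.05, epochs=2000):
    """Learn w and b so that f(x) = w * x + b approximates the labelled data."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors against the known right answers.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Nudge the parameters in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(xs, ys)
print(f"learned f(x) = {w:.3f} * x + {b:.3f}")
```

After enough passes over the data, the learned `w` and `b` should end up very close to the 2 and 1 that generated the answers, even though the code was never told the rule.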

A good example of a supervised algorithm is a spam filter. You train the algorithm by marking emails as ‘spam’ or ‘not spam’. After enough examples, the algorithm can decide whether a new email is spam or not on the basis of the mails you labelled earlier.

Supervised learning is further divided into two categories:

Regression

In regression, we try to predict a continuous output value.

For example, if you’re asked to predict the price of a house based on a training set of house details, it’s a regression task, since price is a continuous value. The training set gives the algorithm a fair idea of what houses with given details sell for, and after enough training it can predict the price of a house it has not seen before.
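As a rough illustration of the house price example, the sketch below fits a straight line through a handful of made-up (area, price) pairs using ordinary least squares. The numbers, the single ‘area’ feature and the helper names `fit_line` and `predict_price` are assumptions for the sake of the example; a real model would use many more features and far more data.

```python
# Hypothetical training set: (area in square metres, price in thousands).
houses = [(50, 150), (70, 200), (90, 260), (110, 310), (130, 360)]


def fit_line(points):
    """Ordinary least squares for one feature: price = slope * area + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(houses)


def predict_price(area):
    """Predict a continuous price for an area that was not in the training set."""
    return slope * area + intercept


print(f"predicted price for 100 square metres: {predict_price(100):.1f}")
```

The key point is that the output is a number on a continuous scale, not a label: the model can return a price it has never seen in the training data.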

Classification

In classification, we try to predict a discrete output: the algorithm assigns each input to one of a fixed set of categories.

The spam filter we discussed above is an example of a classification task. The computer classifies mails as either ‘spam’ or ‘not spam’.
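Here is a small sketch of what such a classifier could look like. It uses a naive Bayes word-count approach, which is one common technique for spam filtering but by no means the only one; the six training emails, the label strings and the `train` and `classify` functions are invented for this example.

```python
import math
from collections import Counter

# Hypothetical labelled emails; real filters train on far more data.
training_emails = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
    ("please review the project agenda", "not spam"),
]


def train(emails):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    label_counts = Counter()
    for text, label in emails:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the higher naive Bayes log-probability."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["not spam"])
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        # Start from the prior: how common the label is in the training set.
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.split():
            # Add-one smoothing so an unseen word does not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(training_emails)
print(classify("claim your free cash prize", word_counts, label_counts))
print(classify("agenda for the team meeting", word_counts, label_counts))
```

Notice that the output is always one of the two labels. That is what makes this a classification task rather than a regression task.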

Unsupervised Learning

In unsupervised learning, we do not know what the output looks like. Instead, we try to derive structure from the data in order to learn more about it.

Unlike supervised algorithms, here we cannot correct the computer if it gives a wrong answer, because there are no labelled answers to compare against. Instead, we train the computer to find associations and relationships within the data.

Unsupervised learning is further divided into two categories:

Clustering

Clustering is the task of grouping data so that the data points in a group are similar to each other and dissimilar to the data points in other groups.
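A minimal sketch of clustering, assuming k-means as the algorithm (one popular choice among many) and a made-up list of one-dimensional points. The `k_means` function and the hand-picked starting centroids are assumptions for illustration only.

```python
# Hypothetical one-dimensional data with two visible groups.
points = [1.0, 1.5, 2.0, 2.5, 10.0, 10.5, 11.0, 11.5]


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = sum(members) / len(members)
    return centroids, clusters


# Starting centroids are picked by hand here; real implementations usually choose them randomly.
centroids, clusters = k_means(points, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

No labels are given anywhere in this code. The two groups emerge purely from how close the points are to one another.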

Association

In association, we try to discover relationships among items in large data sets.

An e-commerce site might use association to group consumers on the basis of their purchasing habits and recommend items they are likely to buy, based on similarities with other consumers.
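One simple way to picture association is to count which items show up together in the same order. The sketch below is a simplified, item-to-item take on the idea: it computes the ‘support’ and ‘confidence’ of item pairs, two measures commonly used in association rule mining. The baskets and the `support` and `confidence` helpers are invented for this example, and a production recommender would be considerably more involved.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per order.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so that each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))


def support(item_a, item_b):
    """Fraction of all baskets that contain both items."""
    pair = tuple(sorted((item_a, item_b)))
    return pair_counts[pair] / len(baskets)


def confidence(antecedent, consequent):
    """Share of baskets containing the antecedent that also contain the consequent."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(f"support(bread, butter) = {support('bread', 'butter'):.2f}")
print(f"confidence(butter -> bread) = {confidence('butter', 'bread'):.2f}")
print(f"confidence(bread -> butter) = {confidence('bread', 'butter'):.2f}")
```

A high confidence for "butter -> bread" would suggest recommending bread to someone who has just added butter to their cart.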

