Using Data to Answer Questions: An Introduction to Machine Learning

Today, Machine Learning is one of the most talked-about topics among new developers and companies, and many companies are shifting towards it. But what exactly is Machine Learning? And why is it becoming such a popular choice? We will figure that out in this article.

Before coming to this new trend, let’s look at our traditional way of programming. How does it work? Let’s understand this with a very simple example. Suppose we have the scores of 100 players as numbers and we need to generate a rank list of the players, i.e., we need to sort the scores. How do we do that in general? We simply write a sorting algorithm, which is a set of rules, pass our data to it, and get the expected sorted rank list. To understand it better, have a look at this diagram:

Traditional Programming
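The traditional approach described above can be sketched in a few lines of Python. This is only an illustration: the function name `rank_players` and the sample scores are made up for this example, and Python's built-in `sorted` stands in for the hand-written sorting rules.

```python
def rank_players(scores):
    """Apply a fixed, hand-written rule: order scores from highest to lowest."""
    # The "rules" live entirely in this code; no data was needed to create them.
    return sorted(scores, reverse=True)


# Hypothetical sample data (four players instead of 100, to keep it short).
scores = [42, 87, 15, 63]
rank_list = rank_players(scores)
print(rank_list)
```

Here the programmer supplies both the rules and the data, and the program produces the answer.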

This approach is quite successful, and we have been doing it this way for decades. But now we are in the era of data. Every single day we generate a huge amount of data in the form of text, images, videos, etc. Processing that much data with any algorithm takes a lot of computational power, and sometimes it is not feasible with our traditional way of programming. For example, if we have the scores of 1 billion players instead of just 100, we will need more computational power and more time. If you are thinking that sorting 1 billion numbers is a feasible task for the traditional approach, you’re right, but sorting is just an example to explain the concept. In the real world, we have some really complex and large data, and some of it is not feasible to process this way.

So, let’s see how we approach this using Machine Learning. It’s just the opposite of Traditional Programming. Instead of defining a set of rules and passing data into it to get the answer, we pass some known data along with the respective answers to a model, which generates its own set of rules, i.e., it learns from the known data and results. With this approach, we let the model figure out by itself how to reach the answer from the given data. This way of programming is called Machine Learning, and the above process is called Training a Model.

Once the model has been trained on known data, we can pass it some unknown data and it will predict the answer, ideally the one we expect. This time we don’t have to provide any set of rules, as we did in the traditional way of programming. This phase is called making a Prediction. To understand it better, have a look at this diagram:

Machine Learning
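The two phases, training and prediction, can be sketched with a very small model. The example below is a minimal sketch and not a production approach: it assumes the hidden rule is a straight line and fits it with ordinary least squares. The function names `train` and `predict` and the sample numbers are made up for illustration.

```python
def train(inputs, answers):
    """Training: learn a slope and intercept from known inputs and answers."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(answers) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, answers))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Prediction: apply the learned rule to data the model has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Known data with its answers. We never tell the model that the rule is y = 2x + 1.
known_inputs = [1, 2, 3, 4]
known_answers = [3, 5, 7, 9]

model = train(known_inputs, known_answers)
prediction = predict(model, 10)
print(model, prediction)
```

Notice that nowhere in the code do we write the rule itself. The model recovers it from the examples and then applies it to a new input.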

So, how are we going to approach sorting with Machine Learning? Let’s see.

We first train our model based on some known data like this:

After the model has been trained, if we pass it some unknown data without any rules to process the data with, we will still get our expected result, i.e., a sorted list of numbers, like this:

Sorting numbers is a really basic example, and Machine Learning doesn’t seem to be very useful in this case. True. But suppose we are making a Hand Cricket game. Think of how we would code this using our traditional way of programming. We need to take image data from the camera, define rules to detect different gestures of the hand, and then write the logic for the game. Detecting hand gestures is hugely complex, as it depends on lots of factors like shape, size, skin tone, etc. Writing code for that is a huge pain for any developer. We may end up writing thousands and thousands of lines of code, which doesn’t seem feasible.

This is where Machine Learning comes into the picture. How do we do this using Machine Learning? We do it the same way as before: we train a model and then use it to get an answer. We first train our model with some images of hand gestures, each labeled with the number it represents, and later we use this model on our unknown data. See this illustration:

Hand Cricket Using Machine Learning
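The same train-then-predict flow can be sketched for gestures. This is a heavily simplified sketch under stated assumptions: each gesture is reduced to two made-up feature values instead of a real camera image, and the model is a nearest-centroid classifier chosen for brevity. A real hand gesture recognizer would work on actual images and would typically use a far more capable model.

```python
import math


def train(examples):
    """Training: compute the average feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(model, features):
    """Prediction: return the label whose centroid is closest to the features."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical labeled data: (features, number shown by the hand).
# The feature values are invented stand-ins for what an image would provide.
labeled_gestures = [
    ([0.9, 0.1], 1),
    ([0.8, 0.2], 1),
    ([0.2, 0.9], 2),
    ([0.1, 0.8], 2),
]

model = train(labeled_gestures)
print(predict(model, [0.75, 0.25]))  # an unseen gesture close to the "1" examples
print(predict(model, [0.2, 0.7]))    # an unseen gesture close to the "2" examples
```

We never wrote a rule describing what a "1" or a "2" looks like. The labeled examples did that job, which is the point of the approach.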

Now, to conclude this article, we can define Machine Learning in just five words: “Using Data to Answer Questions”. It has two parts, ‘Using Data’ and ‘Answer Questions’. ‘Using Data’ refers to the Training phase, and ‘Answer Questions’ refers to the Prediction, or Inference, phase of Machine Learning. So, we can say that Machine Learning has two parts, Training and Prediction/Inference. It is nothing but deriving meaning from data, and data is the most important thing for Machine Learning to work according to our expectations. It helps us solve some tasks that are impractical with traditional programming in a better, faster, and easier way.

Please let me know in the comments how it helped you to understand the basic concept of Machine Learning.

N.B.: The title of this article is inspired by this video on YouTube by Google Cloud Platform.

A designer and developer. Check out nitishsharma.design