
What is Machine Learning?

Machine learning is everything on this list! It:

is considered a subfield of artificial intelligence
involves learning models which allow the program to make predictions on data
is more than just a list of instructions which clearly defines what the algorithm should do
is closely linked to computational statistics, which uses computers to make predictions

The third option here probably needs some explaining. A key difference between a regular algorithm
and a machine learning algorithm is the “learning” model, which allows the algorithm to learn from
the data and make its own decisions. This allows machines to perform tasks which would otherwise be
impossible for them. Such tasks can be as simple as recognizing human handwriting or as complex as
driving a car! For example, say an algorithm is supposed to correctly distinguish between a male and
a female face from a group of ID-card photos. A machine learning (ML) algorithm would be trained on
training data to ‘learn’ how to recognize faces. Where a simple rule-based algorithm would struggle
with this task, an ML algorithm would not only be able to categorize the photos as trained, it could
also keep learning from new data and add to its “learning” to become more accurate in its
predictions! Recall how often Facebook prompts you to tag the person in a picture! Among billions of
users, Facebook's ML algorithms are able to match different pictures of the same person and identify
her! There are many key industries where ML is making a huge impact: financial services, delivery,
marketing and sales, and health care, to name a few.
It is expected that within a couple of decades many mechanical, repetitive tasks will be automated.
Machine learning and improvements in artificial intelligence techniques have made the impossible
possible. We are here to discuss the implementation and usage of machine learning in trading.
Why have traders started to learn and use machine learning?
The answer, in simple words, is to take advantage of complex mathematical and statistical
computations which are difficult, if not impossible, to carry out in any other way. For instance,
assume that you have an understanding of the market trend. You have a simple trading strategy using
a few technical indicators which help you to predict the market trend and trade accordingly. Another
trader, equipped with machine learning, will follow the same approach but, armed with ML algorithms,
will let the machine go through hundreds of technical indicators instead of a few old preferred
ones, and let the machine decide which indicators perform best in predicting the market trend. While
the regular trader might have stuck to, say, RSI (relative strength index) or MA (moving average),
the ML algorithm will be able to refer to many more technical indicators. This technique is known as
feature selection, where machine learning compares different features derived from indicators and
chooses the most effective ones for prediction. These features can be price changes, buy/sell
signals, volatility signals, and volume weights, to name a few.

The trader who has done more quantitative research, more backtesting on historical data, and better
optimization generally has a greater chance of performing better in live markets. However, the true
effectiveness of a strategy depends on what happens in live markets! To understand further how some
traders and researchers have used ML techniques, we will share two research studies with you during
this course. But first, let us learn about a few basic concepts in machine learning so that we are
not lost in the jargon when we read the literature available in this field.
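
The feature selection idea described above can be sketched in a few lines. This is only a minimal
illustration, assuming that each candidate feature is scored by the absolute value of its Pearson
correlation with the next period's return and that the top k are kept. The function names, the
feature names, and all the numbers are made up for the example; real feature selection methods are
usually more elaborate.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no information
    return cov / math.sqrt(var_x * var_y)

def select_features(features, target, k):
    """Return the names of the k features with the largest |correlation| to the target."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Made-up numbers for illustration only; not real market data.
next_day_return = [0.5, -0.2, 0.1, 0.4, -0.3, 0.2]
candidate_features = {
    "price_change": [0.4, -0.1, 0.2, 0.3, -0.4, 0.1],
    "volume_weight": [1.0, 1.1, 0.9, 1.0, 1.1, 0.9],
    "constant_signal": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # never changes, so it scores 0
}

best = select_features(candidate_features, next_day_return, k=2)
print(best)
```

Ranking by correlation is one of the simplest possible scoring rules. It is used here only because
it is easy to read, not because it is the rule any particular trader or library applies.
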
We will learn a few important terms about machine learning before we proceed further! These are a
few things that we should know about machine learning to be able to understand its implementation in
trading:
Training & Testing Data Sets
Types of Machine Learning Tasks:
Supervised learning
Unsupervised learning
Reinforcement learning
Let us start with “Training & Testing Data Sets”. Machine learning is heavily dependent on data: the
algorithms learn from data. It is crucial that the algorithms are fed the right data for the problem
they are trying to solve. Typically, the algorithms work with two data sets: one on which they are
“trained” and another on which they are “tested”. Training data is used by the algorithms to learn
to do a specific task, such as classification or clustering. The model gets “trained” using the
training data. Test data is used to “test”, or determine the accuracy of, the machine learning
algorithm. It is used to check whether the algorithm yields the expected result on previously unseen
data. Separating data into training and testing sets is an important task in machine learning.
Training and testing data should be mutually independent and created by random sampling. Typically,
when you separate a data set into a training set and a testing set, most of the data is used for
training and a smaller portion is used for testing. After a model has been refined using the
training set, you test the model against the test set and determine whether the model’s predictions
are accurate enough for it to be deployed.
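
The split described above can be sketched as follows. This is a minimal illustration, assuming a
random shuffle followed by an 80/20 cut; the function name `train_test_split`, the 20% test fraction,
and the seed are choices made for this example, not values given in the course.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # random sampling, reproducible via the seed
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

samples = list(range(10))  # stand-in for ten observations
train, test = train_test_split(samples)
print(len(train), len(test))
```

Because every observation ends up in exactly one of the two lists, the model is never tested on
data it was trained on.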

Now that we have a basic understanding of training and testing data sets, let us understand the
different types of machine learning tasks. Machine learning tasks roughly fall into two main
categories:

Tasks where we clearly define the expected outcome
Tasks where the right answer is not defined

The first one, where the expected outcome is clearly defined, is called “supervised learning”. In
supervised learning, each training example consists of a pair: an input object and the desired
output value, or target. The main task is to produce a function that maps input values to the output
value in such a way that, when you have new input data, you are able to make a reasonable prediction
about the target value. This type of learning is called supervised learning because the desired
outputs in the training data are predefined and can be thought of as a teacher supervising the
learning process.
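
To make "produce a function that maps input values to the output value" concrete, here is a minimal
sketch. It assumes the relationship between input and target is roughly a straight line and fits
that line by ordinary least squares; the function name `fit_line` and the four training pairs are
made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns a prediction function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Training pairs (input, desired output); these happen to lie on y = 2x + 1.
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [3.0, 5.0, 7.0, 9.0]

predict = fit_line(inputs, targets)
print(predict(5.0))  # prediction for an input that was not in the training data
```

The targets play the role of the teacher here: the fit is judged only by how closely the function
reproduces them.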

Supervised learning problems can be divided into classification and regression problems. If a
supervised learning algorithm analyzes the training data and produces an output in classes or in a
discrete form such as 0/1, then it is called a classification problem. If the output is continuous,
then it is called a regression problem. For example, suppose you want to classify pictures into
labels such as animals and birds. In order to make the classification, your algorithm is provided
with labeled pictures. After training on these labeled pictures, your algorithm should be able to
categorize unlabeled images with some level of accuracy. This would be considered a classification
problem.
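
A classification problem of this kind can be sketched with a very simple classifier. The example
below assumes each picture has already been reduced to two numeric features and uses a
nearest-centroid rule (assign the label whose average training example is closest). The rule, the
function names, and the data are illustrative assumptions, not the method used in the course.

```python
def train_centroids(examples):
    """examples: list of (features, label) pairs. Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Made-up labeled training data: two features per example, labels 0 and 1.
labeled = [([1.0, 1.2], 0), ([0.8, 1.0], 0), ([3.0, 3.2], 1), ([3.2, 2.9], 1)]
centroids = train_centroids(labeled)

# Unlabeled examples are assigned one of the discrete labels.
print(classify(centroids, [0.9, 1.1]), classify(centroids, [3.1, 3.0]))
```
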
Similarly, if you are given a data set containing the parameters which affect the price of a house
and you are expected to predict the price of the house, then it is a regression problem.

The second one, where the expected outcome is not defined, is called unsupervised learning.
Unsupervised learning refers to a broad array of machine learning algorithms which are used to draw
inferences from data sets consisting of input data without labeled responses. This implies that the
algorithm is not presented with the right output for a sample input, but instead is left to learn
how to produce an output without supervision. Unsupervised learning forms a significant part of
learning for the human brain and hence is an important segment of machine learning. There are two
approaches to unsupervised learning. The first involves using some sort of reward system to indicate
success. This type of system falls under the decision problem framework because the goal is to make
decisions that maximize rewards. The second type of unsupervised learning is called clustering.
Here, the goal is to find similarities in the training data. The assumption in such a system is that
the clusters discovered will match reasonably well with an intuitive classification. For example,
clustering individuals based on demographics might separate the high and low income groups into two
clusters.

It might seem that all machine learning algorithms can be categorized into supervised and
unsupervised learning, but that is not the case.
Reinforcement learning involves techniques that feed the results of the model's actions back into
the model to improve its performance. In order to accomplish this, the model needs to be able to
interpret signals, decide on an action, and then compare the outcome against a predefined reward
system. Reinforcement learning tries to understand what needs to be done in order to maximize the
rewards. This is not a supervised type of learning, because it does not strictly depend on
supervised or labeled data. It relies on the ability to interpret the response to the actions being
taken and measure it against the definition of the reward. And it is not unsupervised learning
either, because the model receives feedback and is modified according to that feedback against the
predefined reward system.

In the upcoming units, your understanding of these concepts will be tested through several multiple
choice questions, after which we will discuss an application of machine learning in trading using a
reinforcement learning model.
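
The reinforcement learning loop described above (decide on an action, observe the outcome, compare
it against the reward system, update the model) can be sketched as follows. This is a minimal
illustration using an epsilon-greedy rule over three hypothetical actions with a fixed reward each.
The action names, reward values, and parameters are made up, and the fixed rewards are chosen only
to keep the sketch deterministic; real feedback would be noisy.

```python
import random

def run_agent(reward_for, actions, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy loop: act, observe the reward, update the value estimate."""
    rng = random.Random(seed)
    values = {a: 0.0 for a in actions}   # estimated reward per action
    counts = {a: 0 for a in actions}     # how often each action was taken
    for step in range(steps):
        if step < len(actions):
            action = actions[step]                 # try every action once
        elif rng.random() < epsilon:
            action = rng.choice(actions)           # explore occasionally
        else:
            action = max(actions, key=values.get)  # exploit the best estimate
        reward = reward_for(action)                # feedback from the environment
        counts[action] += 1
        # Running mean of the rewards seen so far for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values

# Hypothetical reward table standing in for the "predefined reward system".
rewards = {"left": 0.2, "middle": 1.0, "right": -0.5}
learned = run_agent(rewards.get, list(rewards))
best_action = max(learned, key=learned.get)
print(best_action)
```

No labeled "right answer" is ever shown to the agent; it only sees the reward that follows each
action, which is what separates this setup from supervised learning.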
