UNIT 1 INTRODUCTION AND RECAP

Machine learning (ML) is a branch of artificial intelligence (AI) that enables computers to “self-learn” from training data and improve over time, without being explicitly programmed. Machine learning algorithms are able to detect patterns in data and learn from them, in order to make their own predictions. In short, machine learning algorithms and models learn through experience.

In traditional programming, a computer engineer writes a series of directions that instruct a computer how to transform input data into a desired output. Instructions are mostly based on an IF-THEN structure: when certain conditions are met, the program executes a specific action.

Machine learning, on the other hand, is an automated process that enables machines to solve problems with little or no human input, and take actions based on past observations.

While artificial intelligence and machine learning are often used interchangeably,
they are two different concepts. AI is the broader concept – machines making
decisions, learning new skills, and solving problems in a similar way to humans –
whereas machine learning is a subset of AI that enables intelligent systems to
autonomously learn new things from data.

Instead of programming machine learning algorithms to perform tasks, you can feed them examples of labeled data (known as training data), which helps them make calculations, process data, and identify patterns automatically.

Put simply, Google’s Chief Decision Scientist describes machine learning as a fancy labeling machine. After teaching machines to label things like apples and pears, by showing them examples of fruit, eventually they will start labeling apples and pears without any help – provided they have learned from appropriate and accurate training examples.

Machine learning can be put to work on massive amounts of data and, for certain narrow tasks, can perform more accurately than humans. It can help you save time and money on tasks and analyses like solving customer pain points to improve customer satisfaction, automating support tickets, and mining data from internal sources and all over the internet.

Supervised Learning
Supervised learning algorithms and supervised learning models make predictions
based on labeled training data. Each training sample includes an input and a
desired output. A supervised learning algorithm analyzes this sample data and
makes an inference – basically, an educated guess when determining the labels
for unseen data.

This is the most common and popular approach to machine learning. It’s “supervised” because these models need to be fed manually tagged sample data to learn from. Data is labeled to tell the machine what patterns (similar words and images, data categories, etc.) it should look for and which connections it should recognize.

For example, if you want to automatically detect spam, you would need to feed a machine learning algorithm examples of emails that you want classified as spam and others that are important and should not be considered spam.

Which brings us to our next point – the two types of supervised learning tasks:
classification and regression.

Classification in supervised machine learning


There are a number of classification algorithms used in supervised learning, with
Support Vector Machines (SVM) and Naive Bayes among the most common.

In classification tasks, the output value is a category with a finite number of options. For example, with a pre-trained sentiment analysis model, you can automatically classify data as positive, negative, or neutral.
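
As a minimal sketch of such a classification task (assuming scikit-learn is available; the training sentences and labels below are made-up examples, not real data), a Naive Bayes text classifier might look like this:

```python
# Minimal sketch of supervised classification with scikit-learn (illustrative data only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "I love this product, it works great",
    "Absolutely fantastic service",
    "This is terrible, it broke after one day",
    "Worst purchase I have ever made",
    "It arrived on time and does the job",
    "The package was delivered as described",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

# Bag-of-words features plus a Naive Bayes classifier in one pipeline.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Predict the label of unseen text.
print(model.predict(["the service was fantastic"]))  # likely 'positive'
```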

Unsupervised Learning
Unsupervised learning algorithms uncover insights and relationships in unlabeled data. In this case, models are fed input data but the desired outcomes are unknown, so they have to make inferences from the structure of the data itself, without labeled guidance. The models are not trained with the “right answer,” so they must find patterns on their own.

One of the most common types of unsupervised learning is clustering, which consists of grouping similar data. This method is mostly used for exploratory analysis and can help you detect hidden patterns or trends.

For example, the marketing team of an e-commerce company could use clustering to improve customer segmentation. Given a set of income and spending data, a machine learning model can identify groups of customers with similar behaviors.

Segmentation allows marketers to tailor strategies for each key market. They
might offer promotions and discounts for low-income customers that are high
spenders on the site, as a way to reward loyalty and improve retention.
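
A minimal sketch of this segmentation idea (assuming scikit-learn; the income/spending numbers and the choice of three clusters are invented for illustration) might look like this:

```python
# Clustering sketch: group customers by income and spending (made-up data).
import numpy as np
from sklearn.cluster import KMeans

# Each row is (annual income in $1000s, spending score 0-100).
customers = np.array([
    [15, 80], [18, 75], [20, 90],   # low income, high spenders
    [70, 20], [75, 15], [80, 10],   # high income, low spenders
    [40, 50], [45, 55], [50, 45],   # mid income, mid spenders
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
segments = kmeans.fit_predict(customers)

print(segments)                 # cluster label assigned to each customer
print(kmeans.cluster_centers_)  # the "typical" customer in each segment
```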

Semi-Supervised Learning
In semi-supervised learning, training data is split into two parts: a small amount of labeled data and a larger set of unlabeled data.

In this case, the model uses the labeled data as input to make inferences about the unlabeled data, often providing more accurate results than a model trained on the small labeled set alone.

This approach is gaining popularity, especially for tasks involving large datasets such as image classification. Semi-supervised learning doesn’t require a large amount of labeled data, so it’s faster to set up, more cost-effective than supervised learning methods, and ideal for businesses that receive huge amounts of data.
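
As a rough sketch of the idea (assuming scikit-learn; the points are synthetic, and unlabeled samples are marked with -1, which is the library’s convention), label propagation can spread the few known labels to the unlabeled points:

```python
# Semi-supervised sketch: a few labeled points, many unlabeled ones (label -1).
import numpy as np
from sklearn.semi_supervised import LabelPropagation

X = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],   # cluster A
              [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]])  # cluster B
y = np.array([0, -1, -1,                             # only one labeled point per cluster
              1, -1, -1])                            # -1 means "unlabeled"

model = LabelPropagation()
model.fit(X, y)

print(model.transduction_)          # labels inferred for all points
print(model.predict([[1.1, 1.0]]))  # prediction for a new point
```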

Reinforcement Learning
Reinforcement learning (RL) is concerned with how a software agent (or computer program) ought to act in a situation to maximize the reward. In short, reinforcement learning models attempt to determine the best possible path to take in a given situation. They do this through trial and error: since there is no labeled training data, the agent learns from its own mistakes and chooses the actions that lead to the best solution or maximum reward.

This machine learning method is mostly used in robotics and gaming. Video games demonstrate a clear relationship between actions and results, and can measure success by keeping score. Therefore, they’re a great way to improve reinforcement learning algorithms.
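
A toy sketch of this trial-and-error process, using tabular Q-learning on a made-up five-cell corridor where the agent must walk right to reach a reward (the states, rewards, and hyperparameters are all illustrative assumptions):

```python
# Tiny tabular Q-learning sketch: a 5-cell corridor with the reward at the right end.
import random

N_STATES, ACTIONS = 5, [0, 1]           # actions: 0 = step left, 1 = step right
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount factor, exploration rate
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for episode in range(500):
    state = 0
    while state != N_STATES - 1:                      # an episode ends at the goal cell
        # Explore sometimes, otherwise act greedily on the current Q estimates.
        action = random.choice(ACTIONS) if random.random() < epsilon \
                 else max(ACTIONS, key=lambda a: Q[state][a])
        next_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print(Q)  # after training, "right" should have the higher value in every cell
```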

Deep Learning (DL)

Deep learning models can be supervised, semi-supervised, or unsupervised (or a combination of any or all of the three). They’re advanced machine learning algorithms used by tech giants like Google, Microsoft, and Amazon to run entire systems and power things like self-driving cars and smart assistants.

Deep learning is based on Artificial Neural Networks (ANN), a type of computer system that emulates the way the human brain works. Deep learning algorithms, or neural networks, are built with multiple layers of interconnected neurons, allowing many simple units to work together simultaneously and step by step.

When a model receives input data ‒ which could be an image, text, video, or audio ‒ and is asked to perform a task (for example, text classification with machine learning), the data passes through every layer, enabling the model to learn progressively. It’s kind of like a human brain that evolves with age and experience!

Deep learning is common in image recognition, speech recognition, and Natural Language Processing (NLP). Deep learning models usually perform better than other machine learning algorithms for complex problems and massive sets of data. However, they generally require millions upon millions of pieces of training data, so it takes quite a lot of time to train them.
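
As a minimal sketch of what a small deep learning model looks like in code (assuming TensorFlow/Keras is installed; the layer sizes and the random placeholder data are illustrative assumptions, not a real task):

```python
# Minimal deep learning sketch: a small fully connected network in Keras.
import numpy as np
import tensorflow as tf

# Placeholder data: 200 samples with 10 features, 3 possible classes.
X = np.random.rand(200, 10).astype("float32")
y = np.random.randint(0, 3, size=200)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(10,)),  # hidden layer 1
    tf.keras.layers.Dense(16, activation="relu"),                     # hidden layer 2
    tf.keras.layers.Dense(3, activation="softmax"),                   # output layer
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

print(model.predict(X[:2]))  # class probabilities for the first two samples
```
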
How machine learning works

In order to understand how machine learning works, first you need to know what
a “tag” is. To train image recognition, for example, you would “tag” photos of
dogs, cats, horses, etc., with the appropriate animal name. This is also called
data labeling.

When working with machine learning text analysis, you would feed a text analysis model with text training data, then tag it, depending on what kind of analysis you’re doing. If you’re working with sentiment analysis, you would feed the model with customer feedback, for example, and train the model by tagging each comment as Positive, Neutral, or Negative.

At its most simplistic, the machine learning process involves three steps:

1. Feed a machine learning model training input data. In our case, this could be customer comments from social media or customer service data.
2. Tag the training data with the desired output. In this case, tell your sentiment analysis model whether each comment or piece of data is Positive, Neutral, or Negative. The model transforms the training data into text vectors – numbers that represent data features.
3. Test your model by feeding it testing (or unseen) data. Algorithms are trained to associate feature vectors with tags based on manually tagged samples, then learn to make predictions when processing unseen data.
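
A compact sketch of these three steps for sentiment analysis (assuming scikit-learn; the comments and tags are invented examples) might be:

```python
# Sketch of the feed -> tag -> test workflow for sentiment analysis.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Steps 1 and 2: training comments, each tagged Positive / Neutral / Negative.
comments = [
    "great support, thanks a lot", "really happy with the response",
    "my ticket is still open", "no reply yet, just waiting",
    "this is unacceptable, very disappointed", "awful experience, never again",
]
tags = ["Positive", "Positive", "Neutral", "Neutral", "Negative", "Negative"]

# The vectorizer turns each comment into a numeric feature vector (the "text vectors").
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(comments, tags)

# Step 3: test the model on unseen data.
print(model.predict(["thanks, that solved my problem"]))  # expected to lean Positive
```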

Machine learning applications and use cases are nearly endless, especially as
we begin to work from home more (or have hybrid offices), become more tied to
our smartphones, and use machine learning-guided technology to get around.

Machine learning in finance, healthcare, hospitality, government, and beyond is already in regular use. Businesses are beginning to see the benefits of using machine learning tools to improve their processes, gain valuable insights from unstructured data, and automate tasks that would otherwise require hours of tedious, manual work (which usually produces much less accurate results).

For example, UberEats uses machine learning to estimate optimum times for
drivers to pick up food orders, while Spotify leverages machine learning to offer
personalized content and personalized marketing. And Dell uses machine
learning text analysis to save hundreds of hours analyzing thousands of
employee surveys to listen to the voice of employee (VoE) and improve
employee satisfaction.

How do you think Google Maps predicts peaks in traffic and Netflix creates personalized movie recommendations, and even informs the creation of new content? By using machine learning, of course.

There are many different applications of machine learning, which can benefit your
business in countless ways. You’ll just need to define a strategy to help you
decide the best way to implement machine learning into your existing processes.
In the meantime, here are some common machine learning use cases and
applications that might spark some ideas:

Social Media Monitoring
Customer Service & Customer Satisfaction
Image Recognition
Virtual Assistants
Product Recommendations
Stock Market Trading
Medical Diagnosis

Social Media Monitoring
Using machine learning, you can monitor mentions of your brand on social media and immediately identify if customers require urgent attention. By detecting mentions from angry customers in real time, you can automatically tag customer feedback and respond right away. You might also want to analyze customer support interactions on social media and gauge customer satisfaction (CSAT) to see how well your team is performing.

Natural Language Processing (NLP) gives machines the ability to break down spoken or written language much like a human would, in order to process “natural” language, so machine learning can handle text from practically any source.

Customer Service & Customer Satisfaction
Machine learning allows you to integrate powerful text analysis tools with customer support tools, so you can analyze your emails, live chats, and all manner of internal data on the go. You can use machine learning to tag support tickets and route them to the correct teams or auto-respond to common queries so you never leave a customer in the cold.

Furthermore, using machine learning to set up a voice of customer (VoC) program and a customer feedback loop will ensure that you follow the customer journey from start to finish to improve the customer experience (CX), decrease customer churn, and, ultimately, increase your profits.

Image Recognition
Image recognition is helping companies identify and classify images. For
example, facial recognition technology is being used as a form of identification,
from unlocking phones to making payments.

Self-driving cars also use image recognition to perceive space and obstacles. For
example, they can learn to recognize stop signs, identify intersections, and make
decisions based on what they see.

Virtual Assistants
Virtual assistants like Siri, Alexa, and Google Now all make use of machine learning to automatically process and answer voice requests. They quickly scan information, remember related queries, learn from previous interactions, and send commands to other apps, so they can collect information and deliver the most effective answer.

Customer support teams are already using virtual assistants to handle phone calls, automatically route support tickets to the correct teams, and speed up interactions with customers via computer-generated responses.

Product Recommendations
Association rule-learning is a machine learning technique that can be used to
analyze purchasing habits at the supermarket or on e-commerce sites. It works
by searching for relationships between variables and finding common
associations in transactions (products that consumers usually buy together). This
data is then used for product placement strategies and similar product
recommendations.
Association rules can also be useful for planning a marketing campaign or analyzing web usage.
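
A bare-bones sketch of the counting behind association rules (plain Python; the shopping baskets are made up, and this is a simplified pair-counting illustration rather than the full Apriori algorithm):

```python
# Tiny association-rule sketch: support and confidence for product pairs.
from itertools import combinations
from collections import Counter

# Made-up shopping baskets.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in transactions:
    item_counts.update(basket)
    pair_counts.update(frozenset(p) for p in combinations(sorted(basket), 2))

n = len(transactions)
for pair, count in pair_counts.items():
    a, b = sorted(pair)
    support = count / n                  # how often the pair occurs across all baskets
    confidence = count / item_counts[a]  # of the baskets containing a, how many also contain b
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```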

Stock Market Trading
Machine learning algorithms can be trained to identify trading opportunities by recognizing patterns and behaviors in historical data. Humans are often driven by emotions when it comes to making investments, so sentiment analysis with machine learning can play a big role in identifying good and bad investing opportunities with less emotional bias. These systems can even save time and allow traders more time away from their screens by automating tasks.

Medical Diagnosis
The ability of machines to find patterns in complex data is shaping the present
and future. Take machine learning initiatives during the COVID-19 outbreak, for
instance. AI tools have helped predict how the virus will spread over time and have shaped how we control it. They have also helped diagnose patients by analyzing lung CTs, detect fevers using facial recognition, and identify patients at a higher risk of developing serious respiratory disease.

Machine learning is driving innovation in many fields, and every day we’re seeing
new interesting use cases emerge. In business, the overall benefits of machine
learning include:

It’s cost-effective and scalable. You only need to train a machine learning model once, and you can scale up or down depending on how much data you receive.
It can perform certain tasks more accurately than humans. Machine learning models are trained with a certain amount of labeled data and use it to make predictions on unseen data. Based on this data, machines define a set of rules that they apply to every dataset, helping them provide consistent results and reducing the impact of human error. You can also train the tools to the needs and criteria of your business.
It works in real time, 24/7. Machine learning models can automatically analyze data in real time, allowing you to immediately detect negative opinions or urgent tickets and take action.

The term "Artificial Neural Network" is derived from Biological neural networks
that develop the structure of a human brain. Similar to the human brain that has
neurons interconnected to one another, artificial neural networks also have
neurons that are interconnected to one another in various layers of the networks.
These neurons are known as nodes.

What is Artificial Neural Network

[Figure: a typical biological neural network compared with a typical artificial neural network.]

Dendrites from the biological neural network represent inputs in artificial neural networks, the cell nucleus represents nodes, synapses represent weights, and the axon represents the output.

Relationship between Biological neural network and artificial neural network:

Biological Neural Network      Artificial Neural Network
Dendrites                      Inputs
Cell nucleus                   Nodes
Synapse                        Weights
Axon                           Output

An artificial neural network is an attempt, in the field of artificial intelligence, to mimic the network of neurons that makes up a human brain, so that computers have the option to understand things and make decisions in a human-like manner. The artificial neural network is designed by programming computers to behave simply like interconnected brain cells.

There are around 86 billion neurons in the human brain. Each neuron has connection points somewhere in the range of 1,000 to 100,000. In the human brain, data is stored in a distributed manner, and we can extract more than one piece of this data from our memory in parallel when necessary. We can say that the human brain is made up of incredibly amazing parallel processors.

We can understand the artificial neural network with the example of a digital logic gate that takes an input and gives an output. Consider an "OR" gate, which takes two inputs: if one or both of the inputs are "On," then the output is "On"; if both of the inputs are "Off," then the output is "Off." Here the output depends entirely on the input. Our brain does not perform the same task: the relationship between outputs and inputs keeps changing, because the neurons in our brain are "learning."

The architecture of an artificial neural network:

To understand the architecture of an artificial neural network, we have to understand what a neural network consists of. A neural network consists of a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various types of layers available in an artificial neural network.

An artificial neural network primarily consists of three layers:

Input Layer:

As the name suggests, it accepts inputs in several different formats provided by the
programmer.

Hidden Layer:
The hidden layer sits in between the input and output layers. It performs all the calculations needed to find hidden features and patterns.

Output Layer:

The input goes through a series of transformations in the hidden layer, which finally results in output that is conveyed through this layer.

The artificial neural network takes input and computes the weighted sum of the inputs
and includes a bias. This computation is represented in the form of a transfer function.

This weighted total is then passed as an input to an activation function to produce the output. Activation functions decide whether a node should fire or not. Only the nodes that fire pass a signal on to the output layer. There are distinctive activation functions available that can be applied depending on the sort of task we are performing.
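
As a small illustrative sketch of this computation (plain NumPy; the inputs, weights, and bias below are arbitrary numbers chosen only to show the mechanics), a single node computes the weighted sum of its inputs plus the bias and passes it through an activation function:

```python
# Sketch of a single artificial neuron: weighted sum + bias, then activation.
import numpy as np

x = np.array([0.5, 0.3, 0.8])   # inputs (arbitrary example values)
w = np.array([0.4, -0.6, 0.9])  # weights, one per input
b = 0.1                         # bias term

z = np.dot(w, x) + b               # weighted sum of the inputs plus the bias
output = 1.0 / (1.0 + np.exp(-z))  # sigmoid activation decides how strongly the node "fires"

print(z, output)
```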

Advantages of Artificial Neural Network (ANN)

Parallel processing capability:

Because computation is distributed across many nodes, artificial neural networks can perform more than one task simultaneously.

Storing data on the entire network:

Unlike traditional programming, where data is stored in a database, an ANN stores its information across the whole network. The disappearance of a couple of pieces of data in one place doesn't prevent the network from working.

Capability to work with incomplete knowledge:

After training, an ANN may produce output even with inadequate input data. The loss of performance here depends upon the significance of the missing data.

Having a memory distribution:

For an ANN to be able to adapt, it is important to determine the right examples and to train the network according to the desired output by demonstrating these examples to the network. The success of the network is directly proportional to the chosen instances, and if the phenomenon cannot be shown to the network in all its aspects, the network can produce false output.

Having fault tolerance:

Corruption of one or more cells of an ANN does not prevent it from generating output, and this feature makes the network fault-tolerant.

Disadvantages of Artificial Neural Network:

Assurance of proper network structure:

There is no particular guideline for determining the structure of artificial neural networks. The appropriate network structure is accomplished through experience, trial, and error.

Unrecognized behavior of the network:

This is the most significant issue with ANNs. When an ANN produces a solution, it does not provide insight into why and how it was reached, which decreases trust in the network.

Hardware dependence:

Artificial neural networks need processors with parallel processing power, in line with their structure. Therefore, the realization of the network depends on suitable hardware.

Difficulty of showing the issue to the network:

ANNs can work only with numerical data, so problems must be converted into numerical values before being introduced to the ANN. The representation chosen here will directly impact the performance of the network, and it relies on the user's abilities.

The duration of the network is unknown:

The network is trained until the error is reduced to a specific value, but reaching this value does not guarantee optimum results, and there is no clear indication of when training is complete.

Artificial neural networks, a science that stepped into the world in the mid-20th century, are developing exponentially. Here we have examined the pros of artificial neural networks and the issues encountered in the course of their utilization. It should not be overlooked that the cons of ANNs, which are a flourishing branch of science, are being eliminated one by one, while their pros are increasing day by day. This means that artificial neural networks will progressively turn into an irreplaceable, and increasingly important, part of our lives.

How do artificial neural networks work?

An artificial neural network can best be represented as a weighted directed graph, where the artificial neurons form the nodes. The associations between neuron outputs and neuron inputs can be viewed as directed edges with weights. The artificial neural network receives the input signal from an external source in the form of a pattern or image, represented as a vector. These inputs are then denoted mathematically by the notation x(n) for each of the n inputs.
Afterward, each input is multiplied by its corresponding weight (these weights are the details utilized by the artificial neural network to solve a specific problem). In general terms, these weights represent the strength of the interconnection between neurons inside the artificial neural network. All the weighted inputs are then summed inside the computing unit.

If the weighted sum is equal to zero, a bias is added to make the output non-zero, or to otherwise scale up the system's response; the bias can be thought of as an extra input fixed at 1 with its own weight. The total of the weighted inputs can lie in the range of 0 to positive infinity, so to keep the response within the limits of the desired value, a certain maximum value is benchmarked, and the total of the weighted inputs is passed through the activation function.

The activation function refers to the set of transfer functions used to achieve the desired output. There are different kinds of activation functions, but they are primarily either linear or non-linear sets of functions. Some of the commonly used activation functions are the binary, linear, and tan hyperbolic sigmoidal activation functions. Let us take a look at each of them in detail:

Binary:

In a binary activation function, the output is either a one or a 0. To accomplish this, a threshold value is set up. If the net weighted input of the neuron is more than the threshold, then the final output of the activation function is returned as one; otherwise the output is returned as 0.

Sigmoidal Hyperbolic:

The Sigmoidal Hyperbola function is generally seen as an "S" shaped curve. Here the tan
hyperbolic function is used to approximate output from the actual net input. The
function is defined as:

F(x) = 1 / (1 + exp(-λx))

where λ is considered the steepness parameter.
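
A short sketch of both activation functions described above (plain NumPy; the threshold and steepness values are arbitrary choices for illustration):

```python
# Binary (step) and sigmoidal activation functions.
import numpy as np

def binary_step(net_input, threshold=0.0):
    # Returns 1 where the net weighted input exceeds the threshold, else 0.
    return np.where(net_input > threshold, 1, 0)

def sigmoid(net_input, steepness=1.0):
    # F(x) = 1 / (1 + exp(-lambda * x)); "steepness" plays the role of lambda.
    return 1.0 / (1.0 + np.exp(-steepness * net_input))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(binary_step(x))            # -> [0 0 0 1 1]
print(sigmoid(x, steepness=2.0)) # smooth "S"-shaped values between 0 and 1
```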

Types of Artificial Neural Network:

There are various types of Artificial Neural Networks (ANN) which, like the neurons and networks of the human brain, perform different tasks. The majority of artificial neural networks have some similarities with their more complex biological counterpart and are very effective at their expected tasks, for example segmentation or classification.

Feedback ANN:

In this type of ANN, the output is returned into the network to accomplish the best-evolved results internally. As per the University of Massachusetts Lowell Centre for Atmospheric Research, feedback networks feed information back into themselves and are well suited to solving optimization issues. Internal system error corrections utilize feedback ANNs.

Feed-Forward ANN:
A feed-forward network is a basic neural network comprising an input layer, an output layer, and at least one hidden layer of neurons. By assessing its output in relation to its input, the strength of the network can be observed from the group behavior of the associated neurons, and the output is decided. The primary advantage of this network is that it figures out how to evaluate and recognize input patterns.
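
A minimal sketch of a forward pass through such a feed-forward network (NumPy; the layer sizes and random weights are illustrative assumptions, and training of the weights is omitted):

```python
# Forward pass of a small feed-forward network: input -> hidden -> output.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)  # input layer (4 features) -> hidden layer (3 units)
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)  # hidden layer -> output layer (2 units)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    hidden = sigmoid(x @ W1 + b1)   # each hidden unit: weighted sum + bias, then activation
    return sigmoid(hidden @ W2 + b2)

x = np.array([0.2, 0.7, 0.1, 0.9])  # an example input pattern
print(forward(x))                   # the network's output for this pattern
```
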
Perspectives and Issues in Machine Learning
The following is a list of issues in machine learning:

1. What algorithms exist for learning general target functions from specific training
examples? In what settings will particular algorithms converge to the desired function,
given sufficient training data? Which algorithms perform best for which types of
problems and representations?

2. How much training data is sufficient? What general bounds can be found to relate the
confidence in learned hypotheses to the amount of training experience and the
character of the learner’s hypothesis space?

3. When and how can prior knowledge held by the learner guide the process of
generalizing from examples? Can prior knowledge be helpful even when it is only
approximately correct?

4. What is the best strategy for choosing a useful next training experience, and how
does the choice of this strategy alter the complexity of the learning problem?

5. What is the best way to reduce the learning task to one or more function
approximation problems? Put another way, what specific functions should the system
attempt to learn? Can this process itself be automated?

6. How can the learner automatically alter its representation to improve its ability to represent and learn the target function?

CONCEPT LEARNING

Concept learning in machine learning can be thought of as learning a boolean-valued function defined over a large set of training data. Having covered the basics of designing a learning system in ML, in order to complete the design of a learning algorithm we need a learning mechanism, or a good representation of the target concept.
Taking a very simple example, one possible target concept may be to find the days when my friend Ramesh enjoys his favorite sport. We have some attributes/features of the day, like Sky, Air Temperature, Humidity, Wind, Water, and Forecast, and based on these we have a target concept named EnjoySport.

We have the following training example available:

Example   Sky     AirTemp   Humidity   Wind     Water   Forecast   EnjoySport
1         Sunny   Warm      Normal     Strong   Warm    Same       Yes
2         Sunny   Warm      High       Strong   Warm    Same       Yes
3         Rainy   Cold      High       Strong   Warm    Change     No
4         Sunny   Warm      High       Strong   Cool    Change     Yes


Let’s design the problem formally with TPE (Task, Performance, Experience):

Problem: Learning the days on which Ramesh enjoys his sport.

Task T: Learn to predict the value of EnjoySport for an arbitrary day, based on the values
of the attributes of the day.

Performance measure P: Total percent of days (EnjoySport) correctly predicted.

Training experience E: A set of days with given labels (EnjoySport: Yes/No)

Let us take a very simple hypothesis representation which consists of a conjunction of constraints on the instance attributes. We get a hypothesis h_i with the help of example i of our training set as below:

h_i(x) := <x1, x2, x3, x4, x5, x6>

where x1, x2, x3, x4, x5 and x6 are the values of Sky, AirTemp, Humidity, Wind, Water and
Forecast.

Hence h_1 will look like this (the first row of the table above):

h_1(x=1): <Sunny, Warm, Normal, Strong, Warm, Same>   Note: x=1 represents a positive example.

We want to find the most suitable hypothesis which can represent the concept. For
example, Ramesh enjoys his favorite sport only on cold days with high humidity (This
seems independent of the values of the other attributes present in the training
examples).

h(x=1) = <?, Cold, High, ?, ?, ?>

Here ? indicates that any value of the attribute is acceptable. Note: the most general hypothesis will be <?, ?, ?, ?, ?, ?>, where every day is a positive example, and the most specific hypothesis will be <∅, ∅, ∅, ∅, ∅, ∅>, where no day is a positive example.

We will discuss the two most popular approaches to finding a suitable hypothesis. They are:
Find-S Algorithm
List-Then-Eliminate Algorithm
Find-S Algorithm:
Following are the steps for the Find-S algorithm:

1. Initialize h to the most specific hypothesis in H.
2. For each positive training example x:
       For each attribute constraint a_i in h:
           If the constraint a_i is satisfied by x, then do nothing;
           Else replace a_i in h by the next more general constraint that is satisfied by x.
3. Output hypothesis h.
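
A direct sketch of Find-S in Python, run on the EnjoySport examples from the table above (here the string "0" stands in for the most specific "no value allowed" constraint ∅, and "?" means any value is acceptable):

```python
# Find-S on the EnjoySport training data from the table above.
examples = [
    (["Sunny", "Warm", "Normal", "Strong", "Warm", "Same"],   "Yes"),
    (["Sunny", "Warm", "High",   "Strong", "Warm", "Same"],   "Yes"),
    (["Rainy", "Cold", "High",   "Strong", "Warm", "Change"], "No"),
    (["Sunny", "Warm", "High",   "Strong", "Cool", "Change"], "Yes"),
]

# Start with the most specific hypothesis: no value is acceptable for any attribute.
h = ["0"] * 6

for x, label in examples:
    if label != "Yes":          # Find-S ignores negative examples
        continue
    for i, value in enumerate(x):
        if h[i] == "0":         # first positive example: copy its attribute values
            h[i] = value
        elif h[i] != value:     # conflicting value: generalize the constraint to '?'
            h[i] = "?"

print(h)  # expected: ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```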

The LIST-THEN-ELIMINATE Algorithm:

Following are the steps for the LIST-THEN-ELIMINATE algorithm:

VersionSpace <- a list containing every hypothesis in H

For each training example, <x, c(x)>

Remove from VersionSpace any hypothesis h for which h(x) != c(x)

Output the list of hypotheses in VersionSpace.
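
A sketch of LIST-THEN-ELIMINATE in Python for the same EnjoySport data (as a simplification, the candidate attribute values are taken from the training examples themselves, and the all-∅ hypothesis is left out of the enumeration):

```python
# LIST-THEN-ELIMINATE on the EnjoySport data: enumerate hypotheses, drop inconsistent ones.
from itertools import product

examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]

# Possible values per attribute (from the training data) plus the '?' wildcard.
values = [sorted({x[i] for x, _ in examples}) + ["?"] for i in range(6)]

def classify(h, x):
    # A hypothesis predicts "Yes" only if every constraint is '?' or matches the value.
    return "Yes" if all(c in ("?", v) for c, v in zip(h, x)) else "No"

# Step 1: the version space initially contains every hypothesis in H.
version_space = list(product(*values))

# Step 2: remove every hypothesis that misclassifies any training example.
version_space = [h for h in version_space
                 if all(classify(h, x) == label for x, label in examples)]

# Step 3: output the surviving hypotheses (the version space).
for h in version_space:
    print(h)
```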


VERSION SPACE DETAIL Notes

https://kogs-www.informatik.uni-hamburg.de/~neumann/WBS-WS-2006/VersionSpaceLearning.pdf
