Introduction to Artificial Intelligence

Machine Learning
• Machine Learning is defined as a technology used to train machines to perform actions such as predictions, recommendations, estimations, etc.
• All these ML operations, such as predictions, are based on historical data or past experience.
• The past experience and data that machine learning uses for training enable computers to behave in human-like ways.
• There are three key aspects of Machine Learning, which are as follows: Task, Experience, and Performance.
• Task: A task is defined as the main problem in which we are interested. This task/problem can be related to predictions, recommendations, estimations, etc.
• Experience: It is defined as learning from historical or past data, which is used to estimate and resolve future tasks.
• Performance: It is defined as the capacity of a machine to resolve a machine learning task or problem and provide the best possible outcome. Performance, however, depends on the type of machine learning problem.
• Q. Define Machine Learning and the key aspects of ML?
Handwriting recognition learning problem
• Task T: Recognizing and classifying handwritten words within images.
• Training experience E: A data-set of handwritten words with given classifications.
• Performance measure P: Percent of words correctly classified, accuracy.
• In order to perform the task T, the system learns from the data-set provided. A data-set is a collection of many examples, and an example is a collection of features (a minimal sketch follows).
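Below is a minimal Python sketch of this T/E/P framing; the feature values and word labels are made up purely for illustration.

```python
# A data-set is a collection of examples; each example is a collection
# of features plus a label. Feature values here are hypothetical.
dataset = [
    {"features": [0.1, 0.8, 0.3], "label": "the"},  # e.g., pixel statistics of one word image
    {"features": [0.7, 0.2, 0.9], "label": "and"},
]

def accuracy(predictions, labels):
    """Performance measure P: percent of words correctly classified."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

print(accuracy(["the", "the"], [ex["label"] for ex in dataset]))  # 0.5
```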
Categories of Machine Learning
• Machine Learning is generally categorized into three types: Supervised Learning, Unsupervised
Learning, Reinforcement learning.
• Supervised Learning: Supervised learning is applicable when a machine has sample data, i.e., input as well as output data with correct labels.
• The correct labels are used to check the correctness of the model's predictions.
• The supervised learning technique helps us to predict future events with the help of past experience and labeled examples.
• Initially, it analyses the known training dataset, and later it introduces an inferred function that makes predictions about output values.
• Further, it also detects errors during this learning process and corrects them through the learning algorithm (a minimal sketch follows).
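As a hedged illustration of this flow (assuming scikit-learn is installed; the synthetic data stands in for real labeled examples):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Labeled sample data: inputs X with correct output labels y.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The model analyses the known training dataset and infers a function...
model = DecisionTreeClassifier().fit(X_train, y_train)

# ...which makes predictions about output values for unseen inputs.
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```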
• Unsupervised Learning: In unsupervised learning, a machine is trained with input samples only, while the corresponding outputs (labels) are not known.
• The training information is neither classified nor labeled; hence, a machine may not always provide correct output compared to supervised learning.
• Example: Let's assume a machine is given a set of documents belonging to different categories (Type A, B, and C) and has to organize them into appropriate groups. Because the machine is provided only with input samples and no outputs, it can organize these documents into type A, type B, and type C groups, but there is no guarantee that the grouping is correct (see the clustering sketch below).
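A minimal clustering sketch of the idea (KMeans from scikit-learn on synthetic points; this is an analogy for the document-sorting example, not a document pipeline):

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Unlabeled inputs only -- no correct outputs are given to the model.
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# KMeans proposes three groups (think: type A, B, C). There is no
# guarantee the proposed groups match the true categories.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(clusters[:10])
```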
• Reinforcement Learning: Reinforcement Learning is a feedback-based machine learning technique. In this type of learning, agents (computer programs) need to explore the environment, perform actions, and on the basis of their actions, receive rewards as feedback.
• For each good action, they get a positive reward, and for each bad action, they get a negative reward. The goal of a reinforcement learning agent is to maximize the positive rewards. Since there is no labeled data, the agent is bound to learn from its experience alone (a toy sketch follows).
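The toy sketch below shows this reward loop with tabular Q-learning on a hypothetical 5-state corridor; the environment, rewards, and hyperparameters are all invented for illustration.

```python
import random

n_states, actions = 5, [-1, +1]   # states 0..4; actions: move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for _ in range(500):              # episodes of exploring the environment
    s = 0
    while s != n_states - 1:
        # Explore occasionally; otherwise act greedily on current knowledge.
        a = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), n_states - 1)
        # Feedback: positive reward at the goal, small negative reward per step.
        r = 1.0 if s_next == n_states - 1 else -0.1
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s_next, b)] for b in actions) - Q[(s, a)])
        s = s_next

# The learned policy should move right, toward the rewarding state.
print({s: max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states)})
```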
Cont.
• Semi-supervised Learning is an intermediate technique between supervised and unsupervised learning.
• It operates on datasets that contain a few labeled examples alongside mostly unlabeled data. Because labels are costly to obtain, relying mostly on unlabeled data reduces the cost of the machine learning model, while the few labels that are available (e.g., collected for corporate purposes) still guide the learning.
• Further, it also increases the accuracy and performance of the machine learning model.
• Semi-supervised learning helps data scientists to overcome the drawbacks of supervised and unsupervised learning.
• Speech analysis, web content classification, protein sequence classification, and text document classifiers are some important applications of semi-supervised learning (a minimal sketch follows).
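A minimal semi-supervised sketch using scikit-learn's SelfTrainingClassifier, where unlabeled examples are marked with -1 (synthetic data for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

# Keep labels for only a few examples; mark the rest as unlabeled (-1).
y_partial = y.copy()
y_partial[30:] = -1

# The base classifier is iteratively retrained on its own confident
# predictions for the unlabeled portion.
model = SelfTrainingClassifier(SVC(probability=True)).fit(X, y_partial)
print("labels used:", int((y_partial != -1).sum()), "of", len(y))
```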
• Q. Explain different types of ML algorithm?
Data Labeling
• In machine learning, data labeling is the process of identifying raw data (images, text files, videos,
etc.) and adding one or more meaningful and informative labels to provide context so that a
machine learning model can learn from it.
• For example, labels might indicate whether a photo contains a bird or car, which words were
uttered in an audio recording, or if an x-ray contains a tumor.
• Data labeling is required for a variety of use cases including computer vision, natural language
processing, and speech recognition.
How does data labeling work?
• Today, most practical machine learning models utilize supervised learning, which applies an
algorithm to map one input to one output. For supervised learning to work, you need a labeled set
of data that the model can learn from to make correct decisions. Data labeling typically starts by
asking humans to make judgments about a given piece of unlabeled data. For example, labelers
may be asked to tag all the images in a dataset where “does the photo contain a bird” is true. The
tagging can be as rough as a simple yes/no or as granular as identifying the specific pixels in the
image associated with the bird. The machine learning model uses human-provided labels to learn
the underlying patterns in a process called "model training." The result is a trained model that can
be used to make predictions on new data.
• In machine learning, a properly labeled dataset that you use as the objective standard to train and
assess a given model is often called “ground truth.” The accuracy of your trained model will
depend on the accuracy of your ground truth, so spending the time and resources to ensure highly
accurate data labeling is essential.
What are some common types of data labeling?
• Computer Vision: When building a computer vision system, you first need to label images, pixels,
or key points, or create a border that fully encloses a digital image, known as a bounding box, to
generate your training dataset.
• For example, you can classify images by quality type (like product vs. lifestyle images) or content
(what’s actually in the image itself), or you can segment an image at the pixel level. You can then
use this training data to build a computer vision model that can be used to automatically categorize
images, detect the location of objects, identify key points in an image, or segment an image.  
• Natural Language Processing: Natural language processing requires you to first manually identify
important sections of text or tag the text with specific labels to generate your training dataset. For
example, you may want to identify the sentiment or intent of a text blurb, identify parts of speech,
classify proper nouns like places and people, and identify text in images, PDFs, or other files.
• To do this, you can draw bounding boxes around text and then manually transcribe the text in your
training dataset. Natural language processing models are used for sentiment analysis, entity name
recognition, and optical character recognition.
• Audio Processing: Audio processing converts all kinds of sounds such as speech, wildlife noises
(barks, whistles, or chirps), and building sounds (breaking glass, scans, or alarms) into a structured
format so it can be used in machine learning.
• Audio processing often requires you to first manually transcribe the audio into written text. From there, you can uncover deeper information about the audio by adding tags and categorizing it. This categorized audio becomes your training dataset.
Application of Machine Learning
• Automatic Language Translation
• Speech Recognition
• Medical Diagnosis
• Stock Market Trading
• Online Fraud Detection
• Self-driving cars
• Traffic Prediction
• Product Recommendation
• Email Spam and Malware Filtering
• Q. What are the applications of ML?
Regression
• Regression is defined as a statistical method that helps us to analyze and understand the
relationship between two or more variables of interest. The process that is adapted to perform
regression analysis helps to understand which factors are important, which factors can be ignored,
and how they are influencing each other.
• In regression, we normally have one dependent variable and one or more independent variables. Here we try to “regress” the value of the dependent variable “Y” with the help of the independent variables. In other words, we are trying to understand how the value of ‘Y’ changes with respect to changes in ‘X’.
• For regression analysis to be a successful method, we must understand the following terms:
• Dependent Variable: This is the variable that we are trying to understand or forecast.
• Independent Variable: These are factors that influence the analysis or target variable and provide
us with information regarding the relationship of the variables with the target variable.
Regression Analysis
• Regression analysis is used for prediction and forecasting, and it has substantial overlap with the field of machine learning. This statistical method is used across different industries, such as:
• Financial Industry- Understand the trend in the stock prices, forecast the prices, and evaluate risks
in the insurance domain
• Marketing- Understand the effectiveness of market campaigns, and forecast pricing and sales of
the product. 
• Manufacturing- Evaluate the relationships among the variables that define a better engine and determine its performance
• Medicine- Forecast the different combinations of medicines to prepare generic medicines for
diseases.
• Q. Define Regression and its use for industry?
Terminologies used in Regression Analysis
• Outliers : Suppose there is an observation in the dataset that has a very high or very low value as
compared to the other observations in the data, i.e. it does not belong to the population, such an
observation is called an outlier. In simple words, it is an extreme value. An outlier is a problem
because many times it hampers the results we get.
• Multicollinearity: When the independent variables are highly correlated with each other, the variables are said to be multicollinear. Many regression techniques assume that multicollinearity is not present in the dataset, because it causes problems in ranking variables by their importance and makes it difficult to select the most important independent variable.
• Overfitting: Overfitting means that our algorithm works well on the training set but performs poorly on the test set. It is also known as a problem of high variance.
• Underfit: When our algorithm works so poorly that it is unable to fit even a training set well, then
it is said to underfit the data. It is also known as a problem of high bias.
Cont.
• Bias: Bias is the difference between the average prediction of our model and the correct value which we are trying to predict. A model with high bias pays very little attention to the training data and oversimplifies the model. It always leads to high error on training and test data.
• Variance: Variance is the variability of model prediction for a given data point, or a value that tells us the spread of our data. A model with high variance pays a lot of attention to the training data and does not generalize to data it hasn't seen before. As a result, such models perform very well on training data but have high error rates on test data.
Types of Regression
• Linear Regression
• Polynomial Regression
• Logistic Regression
Linear Regression
• The simplest of all regression types is Linear Regression which tries to establish relationships
between Independent and Dependent variables. The Dependent variable considered here is always
a continuous variable.
• Linear Regression is a predictive model used for finding the linear relationship between a
dependent variable and one or more independent variables.
• Here, ‘Y’ is our dependent variable, a continuous numerical variable, and we are trying to understand how ‘Y’ changes with ‘X’.
• If the dependent variable is explained by a single independent variable, the model is known as Simple Linear Regression.
• Examples of independent (x) and dependent (y) variables: x is Rainfall and y is Crop Yield; x is Advertising Expense and y is Sales; x is Sales of Goods and y is GDP.
• X --> Y
• If multiple independent variables are used to explain the dependent variable, the model is called Multiple Linear Regression.
• Assumptions: Since Linear Regression assesses whether one or more predictor variables explain the dependent variable, it rests on 5 assumptions: Linear Relationship, Normality, No or Little Multicollinearity, No Autocorrelation in errors, and Homoscedasticity.
• Homoscedasticity, or homogeneity of variances, is an assumption of equal or similar variances in
different groups being compared. This is an important assumption of parametric statistical tests
because they are sensitive to any dissimilarities. 
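Below is a minimal simple-linear-regression sketch using scikit-learn, with made-up rainfall and crop-yield numbers echoing the example above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: x = rainfall (mm), y = crop yield (tonnes/hectare).
X = np.array([[100], [150], [200], [250], [300]])
y = np.array([2.0, 2.8, 3.5, 4.1, 4.9])

model = LinearRegression().fit(X, y)
print("intercept:", model.intercept_)  # value of y when x is 0
print("slope:", model.coef_[0])        # change in y per unit change in x
print("yield at 275 mm:", model.predict([[275]])[0])
```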
Logistic Regression
• Logistic Regression, also known as the Logit or Maximum-Entropy classifier, is a supervised learning method for classification. It establishes a relation between a dependent class variable and independent variables using regression.
• The dependent variable is categorical, i.e., it can take only integral values representing different classes.
• This model belongs to a family of discriminative classifiers. They rely on attributes which
discriminate the classes well.
• This model is used when we have 2 classes of the dependent variable. When there are more than 2 classes, another regression method helps us to predict the target variable better.
• It helps us to predict the output of a categorical dependent variable using a given set of independent variables. The output can be binary (0 or 1) or Boolean (true/false); however, instead of giving an exact value, the model gives a probabilistic value between 0 and 1.
Cont.
• As Linear regression is used for solving regression problems, Logistic regression is similarly used for solving classification problems.
• Logistic Regression can be expressed as an S-shaped curve called the sigmoid function. It can predict two maximum values (0 or 1).
• Types of Logistic Regression: Binomial, Multinomial, and Ordinal.
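A minimal sketch of both points above: the sigmoid squashes any real-valued score into a probability between 0 and 1, and scikit-learn's LogisticRegression applies it to binary classification (synthetic data for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    """The S-shaped curve: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-2.0, 0.0, 2.0])))  # ~[0.119, 0.5, 0.881]

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

# Probabilistic values between 0 and 1 rather than exact class labels.
print(model.predict_proba(X[:3]))
```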
Classification
• Classification is defined as the process of recognizing, understanding, and grouping objects and ideas into preset categories or “sub-populations.” With the help of these pre-categorized training datasets, classification programs in machine learning leverage a wide range of algorithms to classify future datasets into the relevant categories.
• Classification algorithms used in machine learning utilize input training data to predict the likelihood or probability that the data that follows will fall into one of the predetermined categories.
• One of the most common applications of classification is filtering emails into “spam” or “non-spam”, as done by today’s top email service providers.
• In short, classification is a form of “pattern recognition”: classification algorithms applied to the training data find the same patterns (similar number sequences, words or sentiments, and the like) in future data sets.
Learner in Classification Problem
• Lazy Learners: A lazy learner simply stores the training dataset and waits for the test dataset to arrive. The classification is then carried out using the most appropriate data in the stored training dataset. Less time is spent on training, but more time is spent on predictions. Examples include case-based reasoning and the KNN algorithm.
• Eager Learners: Eager learners build a classification model from the training dataset before receiving a test dataset. They spend more time training and less time predicting. Examples include ANN, Naive Bayes, and Decision Trees (see the sketch below).
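A hedged sketch of the contrast (scikit-learn assumed): KNN, a lazy learner, essentially stores the training set at fit time, while a decision tree, an eager learner, builds its whole model up front:

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Lazy learner: fit() mostly stores the data; prediction does the real work.
lazy = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# Eager learner: fit() builds the full tree; prediction is a fast traversal.
eager = DecisionTreeClassifier(random_state=0).fit(X, y)

print(lazy.predict(X[:3]), eager.predict(X[:3]))
```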
Types of Classification in Machine Learning
• Binary Classification
• Multi-Class Classification
• Multi-Label Classification
• Imbalanced Classification
Binary Classification
• Those classification jobs with only two class labels are referred to as binary classification.
• Examples comprise - Prediction of conversion (buy or not), Churn forecast (churn or not), and
Detection of spam email (spam or not).
• Binary classification problems often require two classes, one representing the normal state and the
other representing the aberrant state.
• The following are well-known binary classification algorithms:
• Logistic Regression
• Support Vector Machines
• Naive Bayes
• Decision Trees
Multi-Class Classification
• Classification tasks that use more than two class labels are referred to as multi-class classification.
• Examples: face categorization, classification of plant species, and optical character recognition.
• The multi-class classification does not have the idea of normal and abnormal outcomes, in contrast
to binary classification. Instead, instances are grouped into one of several well-known classes.
• The following well-known algorithms can be used for multi-class classification (a small example follows the list):
• Gradient Boosting
• Decision Trees
• K-Nearest Neighbors
• Random Forest
• Naive Bayes
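A minimal multi-class sketch on the classic iris dataset, which has three species and no notion of a “normal” class:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # three species = three class labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```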
Multi-Label Classification
• Multi-label classification problems are those that feature two or more class labels and allow for the
prediction of one or more class labels for each example.
• Think about the photo classification example. Here a model can predict the existence of many
known things in a photo, such as “person”, “apple”, "bicycle," etc. A particular photo may have
multiple objects in the scene.
• Common multi-label classification algorithms include (see the sketch after this list):
• Multi-label Gradient Boosting
• Multi-label Random Forests
• Multi-label Decision Trees
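A minimal multi-label sketch: each example can carry several labels at once (as in the photo example above), here with synthetic data and a per-label random forest wrapper:

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Each row of Y is a 0/1 vector -- one bit per possible label.
X, Y = make_multilabel_classification(n_samples=200, n_classes=3, random_state=0)

model = MultiOutputClassifier(RandomForestClassifier(random_state=0)).fit(X, Y)
print(model.predict(X[:3]))  # e.g. [[1 0 1] ...]: multiple labels per example
```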
Imbalanced Classification
• The term "imbalanced classification" describes classification jobs where the distribution of
examples within each class is not equal.
• Imbalanced classification tasks are generally binary classification tasks in which the majority of the training dataset's instances belong to the normal class and a minority belong to the abnormal class.
• Examples comprise: Clinical diagnostic procedures, Detection of outliers, and Fraud investigation.
• Specialized cost-sensitive algorithms for such tasks include (see the sketch after this list):
• Cost-sensitive Support Vector Machines
• Cost-sensitive Decision Trees
• Cost-sensitive Logistic Regression
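A minimal cost-sensitive sketch: class_weight="balanced" raises the penalty for mistakes on the rare (abnormal) class (synthetic 95/5 class split for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# An imbalanced distribution: ~95% normal class, ~5% abnormal class.
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)

# class_weight="balanced" makes minority-class errors costlier during training.
model = LogisticRegression(class_weight="balanced").fit(X, y)
print("predicted abnormal cases:", int(model.predict(X).sum()))
```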
Neural Network
• Neural networks reflect the behavior of the human brain, allowing computer programs to
recognize patterns and solve common problems in the fields of AI, machine learning, and deep
learning.
• Neural networks, also known as artificial neural networks (ANNs) or simulated neural networks
(SNNs), are a subset of machine learning and are at the heart of deep learning algorithms. 
• Artificial neural networks (ANNs) are composed of node layers: an input layer, one or more hidden layers, and an output layer (a minimal sketch follows).
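A minimal NumPy sketch of that layer structure, with random placeholder weights (a real network would learn these during training):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(4)                          # input layer: 4 feature values

W1, b1 = rng.random((8, 4)), np.zeros(8)   # input layer -> hidden layer (8 nodes)
W2, b2 = rng.random((3, 8)), np.zeros(3)   # hidden layer -> output layer (3 nodes)

hidden = np.tanh(W1 @ x + b1)              # hidden-layer activations
output = W2 @ hidden + b2                  # output-layer scores
print(output)
```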
LSTM Algorithm
• Long Short-Term Memory (LSTM) is a kind of recurrent neural network (RNN). In an RNN, the output from the previous step is fed as input to the current step.
• LSTM networks are well-suited to classifying, processing, and making predictions based on time series data, since there can be lags of unknown duration between important events in a time series.
LSTM Algorithm Structure
Working of LSTM Model
• Firstly, at a basic level, the output of an LSTM at a particular point in time is dependent on three
things:
• The current long-term memory of the network — known as the cell state.
• The output at the previous point in time — known as the previous hidden state
• The input data at the current time step.
• LSTMs use a series of ‘gates’ which control how the information in a sequence of data comes into,
is stored in and leaves the network.
• There are three gates in a typical LSTM: the forget gate, the input gate, and the output gate.
• These gates can be thought of as filters, and each is its own neural network.
Step 1
• The first step in the process is the forget gate. Here we will decide which bits of the cell state
(long term memory of the network) are useful given both the previous hidden state and new input
data.
• To do this, the previous hidden state and the new input data are fed into a neural network. This
network generates a vector where each element is in the interval [0,1] (ensured by using the
sigmoid activation). This network (within the forget gate) is trained so that it gives outputs close to
0 when a component of the input is deemed irrelevant and closer to 1 when relevant. It is useful to
think of each element of this vector as a sort of filter/sieve which allows more information through
as the value gets closer to 1.
• In summary, the forget gate decides which pieces of the long-term memory should now be
forgotten (have less weight) given the previous hidden state and the new data point in the
sequence.
Step 2
• The next step involves the new memory network and the input gate. The goal of this step is to determine what new information should be added to the network's long-term memory (cell state), given the previous hidden state and new input data.
• The new memory network is a tanh activated neural network which has learned how to combine
the previous hidden state and new input data to generate a ‘new memory update vector’. This
vector essentially contains information from the new input data given the context from the
previous hidden state. This vector tells us how much to update each component of the long-term
memory (cell state) of the network given the new data.
• Note that we use a tanh here because its values lie in [-1,1] and so can be negative. The possibility
of negative values here is necessary if we wish to reduce the impact of a component in the cell
state.
• The input gate is a sigmoid activated network which acts as a filter, identifying which components
of the ‘new memory vector’ are worth retaining. This network will output a vector of values in
[0,1] (due to the sigmoid activation), allowing it to act as a filter through pointwise multiplication.
Cont.
• The outputs of the two networks above (the new memory update vector and the input gate's filter vector) are pointwise multiplied. This regulates the magnitude of the new information, setting components to 0 where need be. The resulting combined vector is then added to the cell state, so the long-term memory of the network is updated.
Step 3
• The final step is the output gate, which decides the new hidden state. To decide this, we use three things: the newly updated cell state, the previous hidden state, and the new input data.
• The step-by-step process for this final step is as follows:
• Apply the tanh function to the current cell state pointwise to obtain the squished cell state, which
now lies in [-1,1].
• Pass the previous hidden state and current input data through the sigmoid activated neural network
to obtain the filter vector.
• Apply this filter vector to the squished cell state by pointwise multiplication.
• Output the new hidden state!
Working in Simple Words (Important)
• The first part chooses whether the information coming from the previous timestamp is to be
remembered or is irrelevant and can be forgotten.
• In the second part, the cell tries to learn new information from the input to this cell.
• At last, in the third part, the cell passes the updated information from the current timestamp to the
next timestamp.
• Just like a simple RNN, an LSTM also has a hidden state where H(t-1) represents the hidden state of the
previous timestamp and Ht is the hidden state of the current timestamp.
• In addition, an LSTM also has a cell state, represented by C(t-1) and C(t) for the previous and current timestamps, respectively.
• It is interesting to note that the cell state carries information across all the timestamps (see the sketch below).
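Below is a hedged NumPy sketch of a single LSTM cell step combining the three gates described above; the weights are random placeholders (a trained LSTM learns them by backpropagation), and the dimensions are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One time step: returns the new hidden state h_t and cell state c_t."""
    z = np.concatenate([h_prev, x_t])     # previous hidden state + current input
    f = sigmoid(W["f"] @ z + b["f"])      # forget gate: what to drop from memory
    i = sigmoid(W["i"] @ z + b["i"])      # input gate: which new info to retain
    g = np.tanh(W["g"] @ z + b["g"])      # new memory update vector, in [-1, 1]
    c_t = f * c_prev + i * g              # updated long-term memory (cell state)
    o = sigmoid(W["o"] @ z + b["o"])      # output gate: filter for hidden state
    h_t = o * np.tanh(c_t)                # squished cell state, filtered
    return h_t, c_t

# Toy dimensions: 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_h = 3, 4
W = {k: rng.normal(size=(n_h, n_h + n_in)) for k in "figo"}
b = {k: np.zeros(n_h) for k in "figo"}

h, c = np.zeros(n_h), np.zeros(n_h)
for t in range(5):                        # run the cell over a short sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
print(h)
```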

• Q. Explain the working of LSTM Algorithm?
