
What is AI Project Cycle

The AI Project Cycle is a step-by-step process that a person should follow to develop an AI project to solve a problem. It provides us with an appropriate framework that can lead us to our goal.

The AI Project Cycle mainly has five stages:

• Problem Scoping

• Data Acquisition

• Data Exploration

• Modelling

• Evaluation

Problem Scoping

Identifying a problem and having a vision to solve it is called Problem Scoping. Scoping a problem is not easy, as we need a deeper understanding of it so that the picture becomes clearer while we are working to solve it. So we use the 4Ws Problem Canvas to understand the problem in a better
way.

Problem Canvas

The 4Ws Problem canvas helps in identifying the key elements related to the problem. The 4Ws are :

 Who
 What
 Where
 Why

1. Who: This block helps in analysing the people who are affected, directly or indirectly, by the
problem. Under this, we find out who the ‘Stakeholders’ are (the people who face this
problem and would benefit from the solution). Below are the questions that
we need to discuss under this block.

 Who are the stakeholders?


 What do you know about them?
2. What: This block helps to determine the nature of the problem. What is the problem, and how do
we know that it is a problem? Under this block, we also gather evidence to prove that the problem
we have selected actually exists. Below are the questions that we need to discuss under this block.

 What is the problem?


 How do you know that it is a problem?

3. Where: This block helps us look into the situation in which the problem arises, its context,
and the locations where it is prominent. Below is the question that we need to discuss under this block.

 What is the context/situation in which the stakeholders experience the problem?

4. Why: In the “Why” canvas, we think about the benefits which the stakeholders would get from the
solution, and how it would help them as well as society. Below are the questions that we need to
discuss under this block.

 What would be of key value to the stakeholders?


 How would it improve their situation?

What is Data Acquisition

This is the second stage of the AI Project Cycle. As the name suggests, this stage is about acquiring data
for the project. Whenever we want an AI project to be able to predict an output, we need to train it
first using data.

For example, if you want to make an artificially intelligent system which can predict the salary of an
employee based on his previous salaries, you would feed the data of his previous salaries into the
machine. The data used to train the model is known as Training Data, while the data used to check
the model's predictions is known as the Testing Data.
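The split between training and testing data can be sketched in a few lines of Python (the salary figures below are hypothetical, made up purely for illustration):

```python
# A minimal sketch of splitting acquired data into training and testing sets.
# The salary figures are hypothetical, used only for illustration.
salaries = [30000, 32000, 35000, 37000, 40000, 42000, 45000, 47000, 50000, 52000]

split = int(len(salaries) * 0.8)   # keep 80% of the data for training
training_data = salaries[:split]   # used to train the model
testing_data = salaries[split:]    # used to check the model's predictions

print(len(training_data), len(testing_data))  # 8 2
```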

Data features refer to the type of data you want to collect. In the above example, the data features would be
salary amount, increment percentage, increment period, bonus, etc. There can be various ways to
collect the data. Some of them are:

 Surveys
 Web Scraping
 Sensors
 Cameras
 Observations
 API (Application Program Interface)

One of the most reliable and authentic sources of information is the open-source websites
hosted by the government. Some of these Govt. portals are: data.gov.in, india.gov.in

What is Data Exploration


While acquiring data, we must have noticed that data is a complex entity: it is full of numbers,
and anyone who wants to make sense of it has to work the patterns out of it. Thus,
to analyse the data, you need to visualise it in some user-friendly format so that you can:

 Quickly get a sense of the trends, relationships and patterns contained within the data.
 Define the strategy for which model to use at a later stage.
 Communicate the same to others effectively.

To visualise data, we can use various types of visual representations such as bar graphs,
histograms, line charts and pie charts.
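As a rough illustration of putting data into a user-friendly format, here is a minimal sketch that renders category counts (hypothetical values) as a text-based bar chart using only standard Python:

```python
# A minimal sketch of data exploration: rendering category counts
# (hypothetical values) as a text bar chart with only the standard library.
def bar_chart(data):
    """Return one text line per category, with one '#' per unit of count."""
    return [f"{label:>12} | {'#' * count} {count}" for label, count in data.items()]

counts = {"Surveys": 12, "Sensors": 7, "Web Scraping": 9, "APIs": 4}
for line in bar_chart(counts):
    print(line)
```

Even this crude chart makes the largest and smallest categories visible at a glance, which is exactly what visualisation is for.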

What is Data Modelling

A graphical representation makes data understandable for humans, as we can discover trends
and patterns in it, but a machine can analyse data only when it is in the most basic form
of numbers (which is binary: 0s and 1s). The ability to mathematically describe the relationship
between parameters is the heart of every AI model.

Generally, AI models can be classified as follows:

Rule Based Approach

It refers to AI modelling where the rules are defined by the developer. The machine follows the
rules or instructions mentioned by the developer and performs its task accordingly.

In this approach, we feed the data along with the rules to the machine, and the machine, after being
trained on them, is able to predict answers. A drawback of this approach is that the
learning is static.
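A minimal sketch of the rule-based approach, using a hypothetical keyword list for spam detection (the hand-written rule, not learned patterns, decides the answer):

```python
# A minimal sketch of the rule-based approach: the developer writes the
# rule, and the machine only applies it. The keyword list is hypothetical.
SPAM_WORDS = {"lottery", "winner", "free", "prize"}

def is_spam(email_text):
    """Rule: an email is spam if it contains any of the listed words."""
    words = set(email_text.lower().split())
    return bool(words & SPAM_WORDS)

print(is_spam("You are the lucky winner of a free prize"))  # True
print(is_spam("Meeting rescheduled to Monday"))             # False
```

Because the rules are hand-written, the learning is static: catching a new kind of spam means a developer must edit the rule, not retrain a model.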

What are advantages of the rule-based system in AI?

1. A rule-based system is generally cost-efficient and accurate in terms of its results.
2. The outputs generated by the system depend on the rules, so the output responses are
stable and not random.
3. Although the coverage of different circumstances is limited, whatever scenarios are covered
by the rule-based system are handled with high accuracy. The error rate goes down because
of the predefined rules.
4. It is feasible to reduce the amount of risk in terms of system accuracy.
5. Optimizing the speed of the system is easier, as you know all the parts. So providing instant
outputs is not a big issue.
What are the disadvantages of the rule-based system in AI?

1. A rule-based system is built upon a lot of data, deep knowledge of the domain, and a lot of
manual work.
2. Writing and generating rules for a complex system is quite challenging and time-consuming.
3. The self-learning capacity of a rule-based system is limited, as it generates results strictly
according to the rules.
4. Complex pattern identification is a challenging task in the rule-based method, as it takes a lot
of time and analysis.

Learning Based Approach

It refers to AI modelling where the machine learns by itself. In this approach, the AI model gets
trained on the data fed to it and is then able to adapt to changes in the
data. An advantage of this approach is that the learning is dynamic. The learning-based approach
can further be divided into three parts:

Supervised Learning: In a supervised learning model, the dataset fed to the machine is
labelled. A label is some information which can be used as a tag for the data. For example, students get
grades according to the marks they secure in examinations. These grades are labels which categorise
the students according to their marks. There are two types of Supervised Learning models:

 Classification: Where the data is classified according to the labels. This model works on a
discrete dataset, which means the data need not be continuous.
The best example to understand the classification problem is email spam detection. The
model is trained on millions of emails with different parameters, and whenever it
receives a new email, it identifies whether the email is spam or not. If the email is spam,
it is moved to the Spam folder.

 Regression: Such models work on continuous data. For example, if we wish to predict our
next salary, then we would put in the data of our previous salary, any increments, etc., and
would train the model. Here, the data which has been fed to the machine is continuous.

The task of the Regression algorithm is to find the mapping function to map the input
variable(x) to the continuous output variable(y).
Example: Suppose we want to do weather forecasting; for this, we will use a regression
algorithm. In weather prediction, the model is trained on past data, and once the
training is completed, it can easily predict the weather for future days.
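The regression idea above can be sketched as a least-squares line fit in plain Python (the salary figures are hypothetical):

```python
# A minimal regression sketch: fit a straight line y = a*x + b to past
# salaries by least squares, then predict the next one. Values are hypothetical.
years = [1, 2, 3, 4]
salaries = [30000, 34000, 38000, 42000]  # rises by 4000 each year

n = len(years)
mean_x = sum(years) / n
mean_y = sum(salaries) / n
# Slope a = covariance(x, y) / variance(x); intercept b = mean_y - a*mean_x.
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, salaries)) / \
    sum((x - mean_x) ** 2 for x in years)
b = mean_y - a * mean_x

print(a * 5 + b)  # predicted salary for year 5: 46000.0
```

The best-fit line here is y = 4000x + 26000, so the continuous output for year 5 is 46000, exactly the mapping from input variable (x) to continuous output variable (y) described above.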

Comparison between Regression and Classification algorithms:

 In Regression, the output variable must be of continuous nature or a real value; in
Classification, the output variable must be a discrete value.
 The task of the regression algorithm is to map the input value (x) to a continuous output
variable (y); the task of the classification algorithm is to map the input value (x) to a discrete
output variable (y).
 Regression algorithms are used with continuous data; classification algorithms are used with
discrete data.
 In Regression, we try to find the best-fit line, which can predict the output more accurately;
in Classification, we try to find the decision boundary, which can divide the dataset into
different classes.
 Regression algorithms can be used to solve regression problems such as weather prediction
and house price prediction; classification algorithms can be used to solve classification
problems such as identification of spam emails, speech recognition and identification of
cancer cells.
 Regression algorithms can be further divided into Linear and Non-linear Regression;
classification algorithms can be divided into Binary classifiers and Multi-class classifiers.
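The classification side can likewise be sketched with a toy nearest-neighbour model (the features and labels are hypothetical: each email is described by a count of exclamation marks and of suspicious words):

```python
# A minimal classification sketch: 1-nearest-neighbour on a tiny labelled
# dataset. Features per email are hypothetical:
# (exclamation marks, suspicious words) -> label.
train = [((5, 4), "spam"), ((4, 5), "spam"),
         ((0, 1), "not spam"), ((1, 0), "not spam")]

def classify(features):
    """Predict the discrete label of the closest training example."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(train, key=lambda item: dist(item[0], features))
    return nearest[1]

print(classify((4, 4)))  # spam
print(classify((0, 0)))  # not spam
```

The output here is one of a fixed set of discrete labels, in contrast to the continuous value the regression sketch produces.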

Unsupervised Learning: An unsupervised learning model works on an unlabelled dataset. This means
that the data fed to the machine is random. This model is used to identify relationships,
patterns and trends in the data fed into it. It helps the user understand what the
data is about and what major features the machine identifies in it.

Unsupervised learning models can be further divided into two categories:

 Clustering: Clustering or cluster analysis is a machine learning technique which groups an
unlabelled dataset. It can be defined as "a way of grouping the data points into different
clusters consisting of similar data points. The objects with possible similarities remain
in a group that has few or no similarities with another group."
 Clustering works by finding similar patterns in the unlabelled dataset, such as shape, size,
colour or behaviour, and divides the data as per the presence or absence of those patterns.
 It is an unsupervised learning method; hence no supervision is provided to the algorithm,
and it deals with an unlabelled dataset.
 After applying the clustering technique, each cluster or group is given a cluster ID, which the
ML system can use to simplify the processing of large and complex datasets.

 The clustering technique is widely used in various tasks. Some of the most common uses of
this technique are:
o Market segmentation
o Statistical data analysis
o Social network analysis
o Image segmentation
o Anomaly detection, etc.
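As a rough sketch of how clustering groups an unlabelled dataset, here is a minimal one-dimensional k-means with two clusters (the data points and starting centroids are hypothetical):

```python
# A minimal clustering sketch: 1-D k-means with k = 2 on unlabelled data.
# The data points and starting centroids are hypothetical.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centroids = [1.0, 9.0]  # initial guesses, one per cluster

for _ in range(10):  # a few refinement passes are enough here
    # Assignment step: each point joins the cluster of its nearest centroid.
    clusters = [[], []]
    for p in points:
        nearest = min(range(2), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    # Update step: each centroid moves to the mean of its cluster.
    centroids = [sum(c) / len(c) for c in clusters]

print(clusters)   # [[1.0, 1.5, 2.0], [8.0, 8.5, 9.0]]
print(centroids)  # [1.5, 8.5]
```

No labels were given, yet the algorithm splits the points into two similarity groups on its own, which is the essence of unsupervised clustering.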

 Dimensionality Reduction: We humans are able to visualise up to three dimensions only, but
there are various entities which exist beyond three dimensions. For example, in Natural Language
Processing, words are considered to be N-dimensional entities. So, to make sense of
them, a dimensionality reduction algorithm is used to reduce their dimensions.

Benefits of applying Dimensionality Reduction

Some benefits of applying dimensionality reduction technique to the given dataset are given
below:

o By reducing the dimensions of the features, the space required to store the dataset also gets
reduced.

o Less computation/training time is required for the reduced dimensions of features.

o Reduced dimensions of features of the dataset help in visualizing the data quickly.

o It removes the redundant features (if present) by taking care of multicollinearity.

Disadvantages of dimensionality Reduction

There are also some disadvantages of applying the dimensionality reduction, which are given
below:

o Some data may be lost due to dimensionality reduction.

o In the PCA dimensionality reduction technique, the number of principal components to
consider is sometimes not known.
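A minimal sketch of dimensionality reduction on hypothetical 2-D data: each point is projected onto the principal axis, so two coordinates become one (a hand-rolled 2-D version of PCA):

```python
import math

# A minimal dimensionality-reduction sketch: 2-D points projected onto
# their principal axis (2-D PCA by hand). The data points are hypothetical.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]

n = len(points)
mx = sum(x for x, _ in points) / n
my = sum(y for _, y in points) / n
# Covariance matrix entries of the centred data.
sxx = sum((x - mx) ** 2 for x, _ in points) / n
syy = sum((y - my) ** 2 for _, y in points) / n
sxy = sum((x - mx) * (y - my) for x, y in points) / n
# For 2-D data, the principal axis angle t satisfies tan(2t) = 2*sxy / (sxx - syy).
t = 0.5 * math.atan2(2 * sxy, sxx - syy)
axis = (math.cos(t), math.sin(t))

# Each 2-D point becomes a single coordinate along the principal axis.
reduced = [(x - mx) * axis[0] + (y - my) * axis[1] for x, y in points]
print([round(r, 2) for r in reduced])
```

The six numbers per pair of coordinates shrink to one per point, while the ordering of the points along their main direction of variation is preserved.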
Reinforcement Learning: Reinforcement learning is a type of machine learning method where an
intelligent agent (a computer program) interacts with the environment and learns to act within it.
How a robotic dog learns the movement of its limbs is an example of reinforcement learning.

Applications of Reinforcement Learning

1. Robotics: Robots with pre-programmed behaviour are useful in structured environments, such as
the assembly line of an automobile manufacturing plant, where the task is repetitive in nature.

2. A master chess player makes a move. The choice is informed by planning: anticipating
possible replies and counter-replies.

3. An adaptive controller adjusts the parameters of a petroleum refinery's operation in real time.

RL can be used in large environments in the following situations:

1. A model of the environment is known, but an analytic solution is not available;

2. Only a simulation model of the environment is given (the subject of simulation-based
optimization);

3. The only way to collect information about the environment is to interact with it.
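The reinforcement-learning loop can be sketched with a hypothetical two-action environment: the agent learns the value of each action purely by trying actions and observing rewards (a fixed periodic exploration schedule stands in for the usual epsilon-greedy strategy):

```python
# A minimal sketch of reinforcement learning: the agent has no rules and
# no labelled data; it learns the value of each action by interacting with
# the environment. The environment and reward values are hypothetical.
rewards = {"left": 0.2, "right": 1.0}   # environment: reward for each action
q = {"left": 0.0, "right": 0.0}         # agent's value estimate per action
alpha = 0.5                             # learning rate

for step in range(100):
    if step % 10 == 0:
        # Periodic exploration (a simplification of epsilon-greedy):
        # try each action in turn so nothing is overlooked.
        action = ["left", "right"][(step // 10) % 2]
    else:
        action = max(q, key=q.get)      # exploit the best-known action
    # Update the estimate toward the reward the environment returned.
    q[action] += alpha * (rewards[action] - q[action])

print(max(q, key=q.get))  # the agent settles on "right"
```

Note how sensitive this loop is to the reward values: if the rewards were poorly designed, the agent would faithfully learn the wrong behaviour, which is the reward-function pitfall listed below.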

Advantages and Disadvantages of Reinforcement Learning

Advantages of Reinforcement learning

1. Reinforcement learning can be used to solve very complex problems that cannot be solved by
conventional techniques.

2. The model can correct the errors that occurred during the training process.

3. In RL, training data is obtained via the direct interaction of the agent with the environment.

4. Reinforcement learning can handle environments that are non-deterministic, meaning that the
outcomes of actions are not always predictable. This is useful in real-world applications where the
environment may change over time or is uncertain.

5. Reinforcement learning can be used to solve a wide range of problems, including those that
involve decision making, control, and optimization.

6. Reinforcement learning is a flexible approach that can be combined with other machine learning
techniques, such as deep learning, to improve performance.

Disadvantages of Reinforcement learning

1. Reinforcement learning is not preferable for solving simple problems.

2. Reinforcement learning needs a lot of data and a lot of computation.

3. Reinforcement learning is highly dependent on the quality of the reward function. If the reward
function is poorly designed, the agent may not learn the desired behaviour.

4. Reinforcement learning can be difficult to debug and interpret. It is not always clear why the agent
is behaving in a certain way, which can make it difficult to diagnose and fix problems.

Comparison between the Rule Based and Learning Based approaches:

 The rule-based approach refers to AI modelling where the rules are defined by the
developer; the learning-based approach refers to AI modelling where the machine learns by
itself.
 In the rule-based approach, learning is static; in the learning-based approach, learning is
dynamic.
 Once trained, a rule-based machine does not take into consideration any changes made in
the original training dataset; a learning-based machine does.

Comparison between Classification and Clustering:

 Type: Classification is used for supervised learning; clustering is used for unsupervised
learning.
 Basic process: Classification classifies input instances based on their corresponding class
labels; clustering groups instances based on their similarity, without the help of class labels.
 Need: Classification has labels, so a training and testing dataset is needed to verify the
model created; clustering needs no training and testing dataset.
 Complexity: Classification is more complex as compared to clustering; clustering is less
complex as compared to classification.

What is Evaluation

Once a model has been made and trained, it needs to go through proper testing so that one can
calculate the efficiency and performance of the model. Hence, the model is tested with the help of
Testing Data, and the efficiency of the model is calculated on the basis of the parameters mentioned
below:

 Accuracy
 Precision
 Recall
 F1 Score (the F1 score is a machine learning evaluation metric that combines precision and recall into a single measure)
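These parameters can be sketched in plain Python from hypothetical model predictions on a spam-detection test set (1 = spam, 0 = not spam):

```python
# A minimal sketch of the four evaluation parameters, computed from
# hypothetical predictions on a spam-detection test set (1 = spam).
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # true positives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # false negatives

accuracy  = (tp + tn) / len(actual)
precision = tp / (tp + fp)            # of the emails flagged spam, how many were?
recall    = tp / (tp + fn)            # of the actual spam, how much was caught?
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, round(f1, 2))  # 0.8 0.75 0.75 0.75
```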

Neural Network

Neural networks are loosely modelled on how neurons in the human brain behave. The key
advantage of neural networks is that they are able to extract data features automatically, without
needing input from the programmer. This makes them a fast and efficient way to solve problems for
which the dataset is very large, such as images.
As seen in the figure given above, larger neural networks tend to perform better with larger
amounts of data, whereas traditional machine learning algorithms stop improving after a certain
saturation point.

How Neural Network works

A Neural Network is divided into multiple layers and each layer is further divided into several blocks
called nodes. The first layer of a Neural Network is known as the input layer. Its job is to acquire data
and feed it to the Neural Network. No processing occurs at the input layer. Next to it are the hidden
layers. Hidden layers are the layers in which the whole processing occurs. These layers are hidden
and are not visible to the user. There can be multiple hidden layers in a neural network system. The
last hidden layer passes the final processed data to the output layer which then gives it to the user as
the final output.
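The layer-by-layer flow described above can be sketched as a forward pass in plain Python (all weights and biases are hypothetical; a real network would learn them during training):

```python
import math

# A minimal sketch of a forward pass: input layer -> one hidden layer ->
# output layer. All weights and biases here are hypothetical.
def sigmoid(x):
    """Squash a node's weighted sum into the range (0, 1)."""
    return 1 / (1 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each node weighs every input, adds its bias, then applies sigmoid."""
    return [sigmoid(sum(w * v for w, v in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

inputs = [0.5, 0.8]                                             # input layer
hidden = layer(inputs, [[0.4, -0.6], [0.7, 0.2]], [0.1, -0.3])  # two hidden nodes
output = layer(hidden, [[1.2, -0.8]], [0.05])                   # one output node

print(round(output[0], 3))  # a value between 0 and 1
```

The input layer does no processing; all the computation happens in the `layer` calls, and the last layer's result is what the user sees as the final output.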

Some of the features of a Neural Network are listed below:

 Neural Network Systems are modelled on the human brain and nervous system.
 They are able to automatically extract features without input from the programmer.
 Every neural network node is essentially a machine learning algorithm.
 It is useful when solving problems for which the data set is very large.
