Professional Documents
Culture Documents
Unit 1
Unit 1
UNIT 1
Artificial Intelligence (AI) refers to the ability of machines or computer systems to perform tasks that
typically require human intelligence, such as learning, problem-solving, perception, decision making,
and natural language processing.
1 Machine Learning: This is a technique of teaching machines to learn from data without being
explicitly programmed. It involves feeding large amounts of data to a computer system, which then
uses statistical algorithms to identify patterns and make predictions or decisions.
2. Natural Language Processing (NLP): This is the ability of machines to understand and process
human language. NLP involves teaching computers to analyze, interpret, and generate human
language in a way that is useful and meaningful.
3 Computer Vision: This involves teaching machines to recognize and interpret visual information,
such as images and videos. Computer vision systems use complex algorithms and deep learning
models to analyze images and identify objects, people, and other features within them.
4. Robotics: This is the field of AI that deals with the design, construction, and operation of robots.
Robotics involves developing intelligent machines that can perceive and interact with their
environment, perform tasks autonomously, and learn from their experiences.
5. Expert Systems: These are computer programs that mimic the decision-making abilities of a
human expert in a particular domain. Expert systems use knowledge bases and inference engines to
reason and make decisions based on the available data.
6. Deep learning: Deep learning is a type of machine learning that uses neural networks with
multiple layers to learn more complex patterns in data. Deep learning is often used for tasks like
image recognition, natural language processing, and speech recognition.
These are just a few of the fundamentals of AI. As AI continues to evolve, new techniques and
technologies are likely to emerge, expanding the scope and capabilities of intelligent machines.
Data science is an interdisciplinary field that combines statistical analysis, machine learning, and
computer science to extract insights from data. Data science is increasingly important in many fields,
including business, healthcare, social sciences, and more. Here are some common needs and
applications of data science:
Decision-making: Data science can be used to inform decisions by providing insights into patterns
and trends. This can be particularly useful in fields like finance, where data analysis can help identify
investment opportunities or risks.
Predictive modeling: Data science can be used to build models that predict future outcomes based
on historical data. This is particularly useful in fields like healthcare, where predictive modeling can
be used to identify patients at risk of developing certain conditions.
Customer analysis: Data science can be used to analyze customer behavior and preferences,
enabling businesses to better target their products and services. This can lead to improved customer
satisfaction and increased sales.
Fraud detection: Data science can be used to detect fraud by analyzing patterns and anomalies in
data. This is particularly useful in fields like finance, where fraudulent activity can have significant
financial consequences.
Natural language processing: Data science can be used to analyze and process human language,
enabling applications like language translation, sentiment analysis, and speech recognition.
Image and video analysis: Data science can be used to analyze images and videos, enabling
applications like facial recognition, object detection, and autonomous driving.
These are just a few examples of the many needs and applications of data science. As the field
continues to develop, new applications and use cases will continue to emerge.
Data Mining:
Data mining is the process of extracting useful information or patterns from large datasets. It is a
subset of the broader field of data science, and is typically used in combination with other
techniques like machine learning and statistical analysis.
Data mining involves analyzing data from multiple sources, such as databases, data warehouses, and
the internet, to discover patterns, trends, and relationships that can be used to inform business
decisions or improve operational efficiency.
Data collection: Gathering relevant data from various sources, including databases, data
warehouses, and the internet.
Data cleaning: Identifying and removing errors, duplicates, and inconsistencies in the data.
Data transformation: Converting the data into a format that can be analyzed, such as standardizing
units of measurement or converting text data into numerical values.
Data mining: Applying algorithms to the data to identify patterns, trends, and relationships.
Evaluation: Assessing the results of the data mining process to determine their usefulness and
accuracy.
Deployment: Incorporating the results of the data mining process into decision-making processes or
operational workflows.
Data mining has numerous applications in a variety of fields, including marketing, finance,
healthcare, and manufacturing. For example, in marketing, data mining can be used to identify
customer segments with similar characteristics, allowing businesses to better target their marketing
efforts. In healthcare, data mining can be used to identify patterns in patient data that can help
identify risk factors for certain diseases.
Data Preparation:
Data preparation, also known as data preprocessing, is the process of cleaning, transforming, and
organizing raw data in preparation for analysis. It is a critical step in the data science workflow, as
the quality of the results of data analysis and modeling depend heavily on the quality of the data
being analyzed.
Data cleaning: This involves identifying and handling missing, inconsistent, or incorrect data. Missing
data can be imputed or removed, while inconsistent or incorrect data can be corrected or deleted.
Data transformation: This involves converting data into a suitable format for analysis. This can
include encoding categorical data, scaling numerical data, and reducing dimensionality.
Data integration: This involves combining data from multiple sources into a single dataset. This can
be challenging, as different data sources may use different formats or structures.
Data reduction: This involves reducing the amount of data to be analyzed by selecting a subset of
features or observations that are most relevant for the analysis.
Data formatting: This involves formatting the data in a way that is suitable for the analysis. This can
include reshaping the data or converting it into a different file format.
Data splitting: This involves splitting the data into training, validation, and testing sets for machine
learning applications.
Data preparation can be a time-consuming and labor-intensive process, but it is critical for ensuring
that the data being analyzed is accurate, consistent, and suitable for the analysis at hand. Good data
preparation practices can help reduce the risk of errors and ensure that the results of the analysis
are trustworthy and actionable.
Based on the methods and way of learning, machine learning is divided into mainly
four types, which are:
The main goal of the supervised learning technique is to map the input
variable(x) with the output variable(y). Some real-world applications of supervised
learning are Risk Assessment, Fraud Detection, Spam filtering, etc.
Categories of Supervised Machine Learning
Supervised machine learning can be classified into two types of problems, which are
given below:
o Classification
o Regression
a) Classification
Classification algorithms are used to solve the classification problems in which the
output variable is categorical, such as "Yes" or No, Male or Female, Red or Blue,
etc. The classification algorithms predict the categories present in the dataset. Some
real-world examples of classification algorithms are Spam Detection, Email
filtering, etc.
b) Regression
Disadvantages:
o Image Segmentation:
Supervised Learning algorithms are used in image segmentation. In this
process, image classification is performed on different image data with pre-
defined labels.
o Medical Diagnosis:
Supervised algorithms are also used in the medical field for diagnosis
purposes. It is done by using medical images and past labelled data with
labels for disease conditions. With such a process, the machine can identify a
disease for the new patients.
o Fraud Detection - Supervised Learning classification algorithms are used for
identifying fraud transactions, fraud customers, etc. It is done by using historic
data to identify the patterns that can lead to possible fraud.
o Spam detection - In spam detection & filtering, classification algorithms are
used. These algorithms classify an email as spam or not spam. The spam
emails are sent to the spam folder.
o Speech Recognition - Supervised learning algorithms are also used in speech
recognition. The algorithm is trained with voice data, and various
identifications can be done using the same, such as voice-activated
passwords, voice commands, etc.
In unsupervised learning, the models are trained with the data that is neither
classified nor labelled, and the model acts on that data without any supervision.
So, now the machine will discover its patterns and differences, such as colour
difference, shape difference, and predict the output when it is tested with the test
dataset.
o Clustering
o Association
1) Clustering
The clustering technique is used when we want to find the inherent groups from the
data. It is a way to group the objects into a cluster such that the objects with the
most similarities remain in one group and have fewer or no similarities with the
objects of other groups. An example of the clustering algorithm is grouping the
customers by their purchasing behaviour.
2) Association
Some popular algorithms of Association rule learning are Apriori Algorithm, Eclat,
FP-growth algorithm.
Disadvantages:
3. Semi-Supervised Learning
Semi-Supervised learning is a type of Machine Learning algorithm that lies
between Supervised and Unsupervised machine learning. It represents the
intermediate ground between Supervised (With Labelled training data) and
Unsupervised learning (with no labelled training data) algorithms and uses the
combination of labelled and unlabeled datasets during the training period.
Disadvantages:
4. Reinforcement Learning
Reinforcement learning works on a feedback-based process, in which an AI
agent (A software component) automatically explore its surrounding by hitting
& trail, taking action, learning from experiences, and improving its
performance. Agent gets rewarded for each good action and get punished for each
bad action; hence the goal of reinforcement learning agent is to maximize the
rewards.
The reinforcement learning process is similar to a human being; for example, a child
learns various things by experiences in his day-to-day life. An example of
reinforcement learning is to play a game, where the Game is the environment, moves
of an agent at each step define states, and the goal of the agent is to get a high
score. Agent receives feedback in terms of punishment and rewards.
Due to its way of working, reinforcement learning is employed in different fields such
as Game theory, Operation Research, Information theory, multi-agent systems.
o VideoGames:
RL algorithms are much popular in gaming applications. It is used to gain
super-human performance. Some popular games that use RL algorithms
are AlphaGO and AlphaGO Zero.
o ResourceManagement:
The "Resource Management with Deep Reinforcement Learning" paper
showed that how to use RL in computer to automatically learn and schedule
resources to wait for different jobs in order to minimize average job
slowdown.
o Robotics:
RL is widely being used in Robotics applications. Robots are used in the
industrial and manufacturing area, and these robots are made more powerful
with reinforcement learning. There are different industries that have their
vision of building intelligent robots using AI and Machine learning technology.
o TextMining
Text-mining, one of the great applications of NLP, is now being implemented
with the help of Reinforcement Learning by Salesforce company.
Disadvantage
o RL algorithms are not preferred for simple problems.
o RL algorithms require huge data and computations.
o Too much reinforcement learning can lead to an overload of states which can
weaken the results.
The curse of dimensionality limits reinforcement learning for real physical systems.