AI Fundamentals

What is Artificial
Intelligence?
UNDERSTANDING ARTIFICIAL INTELLIGENCE
Iván Palomares Carrascosa

Senior Data Science & AI Manager
What is AI?

First things first: Computer Science
The body of technical knowledge needed for the automatic processing of information by
computers: hardware, software, data, networks, ...



Artificial Intelligence (AI)
Machines that learn to mimic reasoning, decision-making, and in general exhibit some
degree of human-like intelligence to solve a problem.
EU Commission, 2019:
"Systems that, given a goal, perceive their environment, interpret the collected data, reason
to derive knowledge, and decide the best action(s) to achieve the goal".

AI vs Artificial General Intelligence
Artificial Intelligence (AI)
Perceives, interprets and learns from data.
Reasons and makes decisions
Excels at solving specific tasks
Artificial General Intelligence (AGI)
Equals or exceeds average human intelligence
Solves a breadth of tasks intelligently

AI vs Artificial General Intelligence
Examples of AI
Voice assistants
Facial recognition
Personalized recommendations
Autonomous industrial robots
"Halfway" examples towards AGI
Self-driving cars
AlphaGo
Generative AI: Language Models (e.g. GPT)

Let's practice!
What AI can -and
cannot- do?

Things AI can do

Predictions and inference
Machine Learning: Learn from data how to make predictions or inferences.
Predictions: forecasting what will happen in the future, e.g. weather forecast.
Inference: determine output based on data inputs (predictors), e.g. books you may like.

Pattern recognition
Identify patterns in the data to help make
decisions:
Clustering (segmentation)
Anomaly detection
Data generation (Generative AI)

Optimization
Find the best possible solution for a
problem at a minimum cost, under
constraints.
Logistics and delivery: smart routing
Energy: Power grid operation and control
Tourism: flights and hotel pricing
Marketing: maximum-revenue campaigns
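
The smart routing case can be illustrated with a minimal sketch. It assumes a made-up depot and three delivery stops (all coordinates and names are hypothetical), and it simply tries every visiting order, which only works for a handful of stops; real routing systems rely on far more scalable methods.

```python
from itertools import permutations
from math import dist

# Hypothetical depot and delivery stops as (x, y) coordinates.
DEPOT = (0.0, 0.0)
STOPS = {"A": (2.0, 0.0), "B": (2.0, 2.0), "C": (0.0, 2.0)}


def route_length(order):
    """Total distance of depot -> stops in the given order -> depot."""
    points = [DEPOT] + [STOPS[name] for name in order] + [DEPOT]
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


def best_route(max_length=None):
    """Try every visiting order; keep the cheapest one that meets the constraint."""
    best = None
    for order in permutations(STOPS):
        length = route_length(order)
        if max_length is not None and length > max_length:
            continue  # violates the constraint, so it is not a feasible solution
        if best is None or length < best[1]:
            best = (order, length)
    return best


order, length = best_route()
print(order, length)
```

Here the cost is the travelled distance and `max_length` plays the role of a constraint; `best_route` returns `None` when no route satisfies it.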

Automation
Automation: follow a set of rules to perform
(usually repetitive) tasks.
Classifying documents, photos, etc.
Job application screening
Parcel management robots
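
As a rough illustration of rule-based automation, the sketch below routes documents to folders by keyword. The rules, folder names and function name are invented for the example and are not taken from any specific product.

```python
# Hypothetical routing rules: keyword -> folder. The first matching rule wins.
RULES = [
    ("invoice", "finance"),
    ("resume", "recruiting"),
    ("contract", "legal"),
]


def classify_document(text, default="inbox"):
    """Apply the fixed rules in order and return the folder of the first match."""
    lowered = text.lower()
    for keyword, folder in RULES:
        if keyword in lowered:
            return folder
    return default


print(classify_document("Invoice #123 attached"))
```

Nothing is learned here: the behavior is entirely determined by the hand-written rules, which is what makes the task repetitive and easy to automate.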

Limitations of AI
Social skills: emotional intelligence, empathy
Bias: making unfair decisions to some groups
New, unseen situations, e.g. new items to recommend
Data ...

Let's practice!
Areas and related
disciplines of AI

Subdomains of Artificial Intelligence

Machine Learning: learn from data; predictions, inference
Deep Learning: neural networks; solve most challenging AI problems
Knowledge representation and reasoning: reason, communicate with other AI systems
Robotics: act and manipulate physical environment
Computer Vision: visually perceiving objects in the environment
Natural Language Processing: analyze, understand, communicate human language

Examples of AI applications
Personalized product recommendations: Machine Learning
Warehouse management: Robotics, Computer Vision, Reasoning
Medical diagnosis: Computer Vision, Deep Learning
Smart voice assistants: NLP, Deep Learning

Related disciplines


Video takeaways
AI is an umbrella discipline with several popular areas.
Present AI systems and applications combine principles from multiple areas.
Math, Data Science, and Statistics are closely related disciplines to AI.

Let's practice!
Algorithms and AI
systems demystified

What is an algorithm?
Algorithm: a set of (computer) instructions
to solve a problem or perform an action.





Algorithms in Computer Science vs AI algorithms


AI algorithms: learn by themselves to produce better outputs or processes from input data
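
A small sketch may help contrast the two kinds of algorithm. The 25-degree rule, the labelled examples and both function names are assumptions made up for illustration: the first function follows instructions a programmer wrote, the second derives its rule from data.

```python
def fixed_rule(temperature_c):
    """Traditional algorithm: the programmer hard-codes the rule."""
    return "hot" if temperature_c >= 25 else "cold"


def learn_threshold(examples):
    """Learning algorithm: pick the threshold with the fewest mistakes on labelled data."""
    candidates = sorted(t for t, _ in examples)

    def errors(threshold):
        return sum(
            ("hot" if t >= threshold else "cold") != label for t, label in examples
        )

    return min(candidates, key=errors)


# Hypothetical labelled observations: (temperature in Celsius, how people described it)
examples = [(12, "cold"), (17, "cold"), (19, "hot"), (24, "hot"), (30, "hot")]
threshold = learn_threshold(examples)
print(threshold)
```

With different examples the learned threshold changes without anyone editing the code, whereas `fixed_rule` always applies the same hand-written cut-off.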

What is an AI system?
AI system: infrastructure and components needed to implement and deploy AI algorithms in
the real world


Let's practice!
Acquiring data

AI functions and areas involved





Data acquisition: sensing the environment
Collect outside sensory information
through sensors: mimic human senses
Transform perceptions into data
Occurs in:
NLP and audio: capturing speech, sounds.
Computer Vision: satellite images, fingerprint, etc.
Robotics and sensors: temperature, touch, motion, gravity, etc.

Data acquisition: datasets
Dataset: a collection of data samples or instances of a given type of data
Structured: tabular format, spreadsheets
Unstructured: images, audio, videos, text, ...




Let's practice!
Learning from data



Enter Machine Learning (ML)
Machine Learning: learn from data and identify patterns to perform inference tasks:
predictions, classifications, clustering, ...





Supervised Learning: classification
Classification: assign each data observation the category (class) it may belong to
Binary classification: two classes, e.g. positive/negative, male/female, etc.
Multi-class classification: several mutually exclusive classes, e.g. multiple species

Supervised learning: data annotation (obtaining labelled observations whose class is known a
priori) is needed to train a model capable of making inferences
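
To make multi-class classification concrete, here is a minimal sketch of a 1-nearest-neighbour classifier, chosen only because it is short. The measurements and species labels are made-up annotated observations, and `predict` is a hypothetical helper name.

```python
from math import dist

# Hypothetical labelled observations: (petal length, petal width) -> species
training_data = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.5, 1.5), "versicolor"),
    ((4.7, 1.4), "versicolor"),
    ((5.9, 2.1), "virginica"),
    ((6.1, 2.3), "virginica"),
]


def predict(features):
    """Return the class of the closest labelled observation."""
    _, label = min(training_data, key=lambda item: dist(item[0], features))
    return label


print(predict((4.6, 1.5)))
```

The labels in `training_data` are what makes this supervised: without them there would be no class to copy from the nearest neighbour.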

Machine Learning algorithm vs Model


Supervised Learning: regression and forecasting
Regression: assign each data observation a numerical output or label based on its inputs
Time series forecasting: predict future values of a variable, based on its past behavior
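
One of the simplest possible forecasts is a moving average, sketched below under the assumption of made-up monthly sales figures. It is a baseline for illustration, not a recommendation over proper forecasting models.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("not enough history for the chosen window")
    recent = history[-window:]
    return sum(recent) / window


monthly_sales = [100, 104, 110, 108, 115, 119]  # made-up past values
next_month = moving_average_forecast(monthly_sales)
print(next_month)
```

The prediction depends only on the variable's own past values, which is the defining trait of time series forecasting.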

Unsupervised and reinforcement learning
Clustering: find subgroups of data with similar characteristics (e.g. k-means algorithm)
Anomaly detection: detecting abnormal data observations e.g. unusual card transactions
Association rule discovery: find common co-occurrences of items in transaction data
Reinforcement learning: learn by experience (trial and error) to master a complex task
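
The k-means idea mentioned under clustering can be sketched in one dimension. The spend values and starting centroids are invented, and the function is a simplified illustration rather than a production implementation.

```python
def k_means_1d(values, centroids, iterations=10):
    """Alternate between assigning values to the nearest centroid and re-centering."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


spend = [10, 12, 11, 95, 100, 105]  # made-up customer spend, no labels attached
centroids, clusters = k_means_1d(spend, centroids=[0, 50])
print(centroids, clusters)
```

No labels are involved: the two groups emerge purely from how close the values are to each other.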

How about Deep Learning?
Highly sophisticated models based on deep neural networks: solve very challenging tasks
where classical ML models become limited.
Need a lot of data to learn: sometimes millions of observations.


Let's practice!
Interacting with the
Environment



Robotics
Sensing and perception: collecting data or perceiving signals
Mobility: moving in the environment guided by perceptions of surroundings
Manipulation: the robot modifies its environment
Human-robot interaction: e.g. conversational robots endowed with NLP

Computer Vision
Image processing: intelligently enhance images and video
Object detection: identify subjects in images/video for surveillance, logistics, etc.
Motion analysis: extract motion information like speed and direction of objects
Image and video generation: create realistic visual data from human text

Natural Language Processing (NLP)
Text-based
Text classification
Sentiment analysis: extract positive and negative feelings in text, e.g. customer reviews.
Question answering (chatbots)
Text summarization
Speech-based
Text-to-speech
Speech-to-text
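
As a toy illustration of the sentiment analysis task listed above, the sketch below counts words from two tiny hand-made lists. The word lists and the `sentiment` function are assumptions for the example; practical systems usually learn these associations from data.

```python
import re

# Tiny hand-made word lists, invented for this example.
POSITIVE = {"great", "love", "excellent", "fast", "friendly"}
NEGATIVE = {"bad", "slow", "broken", "rude", "terrible"}


def sentiment(review):
    """Label a review by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", review.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("Great product, fast delivery!"))
```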

Chapter summary
Takeaways from this chapter:
Algorithms are the building blocks of AI systems, along with data, hardware and other
components
Acquiring data, learning and reasoning from data, and interacting with the environment,
are three key functions in AI systems
Data collected into datasets is the fuel of most AI systems, especially those guided by
Machine Learning and Deep Learning

Let's practice!
Establishing an AI
culture

The value of AI in organizations





AI for personalization example: identifying customers' shopping habits leads to more loyalty
and increased sales.

Building an AI-driven organization
1. Roadmap: obtain leadership support and a clear vision for AI adoption

2. Data strategy: plan to collect, use, and govern data for AI

3. Infrastructure resources: scalable computing infrastructure and AI tools

4. Roles: talented AI, Machine Learning, and Data Science roles

5. Collaboration: cross-functional AI projects

6. Success: define and pursue success metrics, e.g. customer-centric, impact on revenue, etc.

7. AI & Data literacy: continuous AI and data evangelization for everyone

8. Responsible AI: ethical, secure, and accountable use of AI and data

8 elements, 3 dimensions

AI-driven organization: roadmap



Example: insurance company AI roadmap
1. Objective: efficient claim processing
2. Resources: data scientists; ML experts; cloud infrastructure; customer, policy and
claim data
3. Implementation: ML model for automated fraud detection and claim classification,
extendable to customer service

Let's practice!
Data strategy,
resources, and
people

Data strategy and governance
Data strategy: design and development of
data-centric approaches for information
extraction and business decision-making
Data strategy steps:
1. Setting data-oriented objectives
2. Find out necessary data
3. Determine data sources and types
4. Predictive and prescriptive analysis
5. Operationalize data-driven processes

Resources: AI infrastructure
Cloud-based AI infrastructure
Scalable computing resources, data storage, AI & ML development tools and
pre-built models. Elastic, on-demand
Pros: High scalability, Cost-effectiveness
Cons: Data location, Internet needed
On premises (self-hosted) AI infrastructure
Organizations own their hardware, software, data, and network resources to
support AI operations
Pros: Enhanced data control, lower latency
Cons: Upfront costs, limited scalability
1 Left image: Google Cloud Platform, Microsoft Azure, and Amazon Web Services logos

Resources: MLOps methodology
Machine Learning Operations (MLOps): efficient and reliable management and operation
of ML (AI) systems in the enterprise


People: AI-related roles
AI Architect
Data Scientist
Machine Learning and Data Engineer
Others: AI Ethicist, Project Manager
1 Icon made by Freepik, juicy_fish, deemakdaksina from www.flaticon.com

Building your AI team
Leadership and management
AI manager / team lead
AI project manager(s)
Execution & MLOps
AI architects
Data scientists
ML & data engineers
Support
AI ethicist; domain experts.

End-to-end data scientists: responsible for whole MLOps lifecycle, over-ambitious skills
Dedicated teams: Dev + Ops teams, strong communication and collaboration needed

Let's practice!
Is your deployed AI
system successful?

AI course instructor, DataCamp
When to measure success?


Measuring performance offline - accuracy





Beyond accuracy - error and other metrics
Metrics for search and recommendation engines: ranking quality (relevance of ranked items
to the user), diversity in search results or recommendations, etc.
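
Two of the offline metrics named in this part, accuracy and error, can be written in a few lines. This is a minimal sketch with made-up labels and values; mean absolute error is used here as one common choice of error metric.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average size of the gap between predicted and true numeric values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up classification and regression results.
print(accuracy(["spam", "ham", "spam", "ham"], ["spam", "ham", "ham", "ham"]))
print(mean_absolute_error([10, 20, 30], [12, 18, 33]))
```

Accuracy fits classification outputs, while an error metric such as this one fits numeric predictions.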

Measuring success in production
AI/ML metrics: accuracy, error, relevance, diversity, ...
Model degradation: the measured metric value gets worse over time
Business metrics: Key Performance Indicators (KPIs)

Indicator of performance and progress of organization objectives
Example KPIs: conversion rate, satisfaction (retail); turnaround time (healthcare)
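
Model degradation in production can be monitored with a simple check like the sketch below. The weekly accuracy values, the tolerance and the window size are all assumptions picked for illustration; real monitoring setups define their own thresholds.

```python
def degraded(metric_history, baseline, tolerance=0.05, window=3):
    """Flag degradation when the recent average falls below baseline - tolerance."""
    recent = metric_history[-window:]
    return sum(recent) / len(recent) < baseline - tolerance


weekly_accuracy = [0.91, 0.90, 0.89, 0.86, 0.83, 0.80]  # made-up production values
print(degraded(weekly_accuracy, baseline=0.91))
```

Averaging over a window avoids raising an alert because of a single unusually bad week.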

Risks: what could possibly go wrong?
Possible risks include:
Data bias
Lack of transparency
Ethical concerns
Dubious system reliability
Vulnerability to cyber threats
Proof-of-Concept (PoC):
Pilot demonstrator to validate feasibility and potential value + early risk identification

Let's practice!
Challenges and
success stories

Challenges
Challenges to build an AI-driven organization
Resources: people, infrastructure, budget
Data: availability, quality, governance, privacy
Culture: rigid mindset, siloed operations
Awareness: "Why is AI critical to the business?"

Success stories: Google
Challenge:
Data quality and accessibility issues
Solution:
Data governance frameworks and data integration strategies, to leverage large
volumes of data effectively
1 More info: https://www.youtube.com/watch?v=iCVJdFedSv4

Success stories: Airbnb
Challenge:
Talent needed to become AI-driven
Solution:
Talent acquisition and talent development through upskill training in AI and ML
1 More info: https://www.linkedin.com/pulse/what-made-airbnb-data-team-special-5-traits-i-look-when-claire-lebarz/

Success stories: IBM
Challenge:
Address ethical and regulatory AI issues
Solution:
AI Ethics Board for responsible AI, guidelines to mitigate algorithmic bias,
engagement with policymakers
1 More info: https://www.ibm.com/downloads/cas/4DPJK92W

Success stories: Netflix
Challenge:
Large-scale computing infrastructure needed
Solution:
Cloud infrastructure investments, AI tools for recommendation, data processing
workflows
1 More info: https://valohai.com/blog/building-machine-learning-infrastructure-at-netflix/

Let's practice!
Democratizing
Artificial Intelligence

AI democratization
AI is deeply impacting our lives
How to bring AI benefits to everyone and eliminate its potentially harmful side?
Access to:
Use of AI-based systems and solutions
Design of AI tools anyone can effortlessly use to supplement their tasks

AI literacy
AI literacy: individuals and organizations'
understanding of AI concepts,
technologies, and their implications in:
Organizations
Society, economy and the environment

How does AI literacy contribute to AI democratization?
Empowered individuals: equipped with knowledge and skills to engage with AI
Ethics awareness: fairness, privacy, transparency, responsible AI
Inclusive participation: engage in AI-related activities and participatory decisions
Critical thinking: ability to evaluate AI systems and make informed judgments
1 Icons made by Paul J.& Freepik from www.flaticon.com

Data democratization
In organizations
Make information underlying data accessible to all roles
Competitive market advantage
Optimizing activities
Proactive strategic mindset
Data upskilling is crucial

Data democratization
In society
Make information accessible to individuals
Enable access, use and contribution to data-driven insights, through:

Open data policies and data-sharing
Data visualization and literacy
Empower communities with data

Let's practice!
Explainability and
interpretability

Explainability and interpretability
Explainability: humans' ability to access and understand AI outputs, e.g. predictions, decisions
Interpretability: understand AI systems' internal processes: algorithm, model, data workflow

White-box vs black-box AI systems
White-box: transparent and easily interpretable models/systems



Black-box: higher complexity, little or no degree of understandability


Basic Explainable AI (XAI) tools
XAI: methods and tools to increase AI systems and models' transparency and explainability
Model introspection: examining internal model parameters to understand decisions


Model documentation: shareable architecture and design considerations


Model visualization: human-friendly representation of data features and model outputs

1 Heatmap source: https://towardsdatascience.com/

XAI tools: feature importance
Feature importance: impact or contribution of features (predictors) in model outputs
Understand how data-driven models (ML/DL) make decisions
Detect and mitigate issues, e.g. biases
Impact on model performance if a feature were removed
SHAP (SHapley Additive exPlanations)
Feature importance visualizations toolbox
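
The "impact on model performance if a feature were removed" idea can be sketched directly. Note that this is a simple drop-one-feature importance, not SHAP. The data, feature names and the nearest-neighbour model with leave-one-out evaluation are all assumptions chosen to keep the example small.

```python
from math import dist

# Hypothetical data: (hours_studied, shoe_size) -> passed the exam (1) or not (0)
data = [
    ((1.0, 40.0), 0), ((2.0, 44.0), 0), ((3.0, 37.0), 0),
    ((7.0, 43.0), 1), ((8.0, 38.0), 1), ((9.0, 41.0), 1),
]
FEATURES = ["hours_studied", "shoe_size"]


def loo_accuracy(rows, keep):
    """Leave-one-out accuracy of a 1-nearest-neighbour model using only `keep` feature indexes."""
    def project(values):
        return [values[j] for j in keep]

    hits = 0
    for i, (x, y) in enumerate(rows):
        others = rows[:i] + rows[i + 1:]
        _, predicted = min(others, key=lambda r: dist(project(r[0]), project(x)))
        hits += predicted == y
    return hits / len(rows)


baseline = loo_accuracy(data, keep=[0, 1])
# Importance of a feature = accuracy lost when that feature is removed.
importance = {
    name: baseline - loo_accuracy(data, keep=[j for j in range(len(FEATURES)) if j != i])
    for i, name in enumerate(FEATURES)
}
print(importance)
```

In this made-up data, removing `hours_studied` destroys the model's accuracy while removing `shoe_size` changes nothing, which is the kind of signal feature importance is meant to surface.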

Practical implications of XAI
Algorithmic transparency:
How algorithms process data and make decisions
Local and global interpretability:
Understand system behavior for a specific prediction, vs
Understand system overall behavior on a dataset or problem
Ethical considerations:
XAI to address ethical AI concerns: biases, discrimination, compliance, etc.
Human-AI collaboration:
Reliable collaboration based on trust and feedback

Let's practice!
Social challenges:
ethics, fairness and
privacy

Responsible AI
Responsible AI: Ethical and accountable development and use of AI systems, with regard to
societal impact


Ethics and fairness
AI ethics: adhere to ethical guidelines and
principles:
Fairness
Transparency
Privacy
Accountability
Liability for AI decisions


Bias in AI systems: examples
Screening job resumes
Biased training data: mostly male hires
Unfair treatment of female candidates
Solutions: active data collection, bias-correction algorithms
E-commerce recommendations
Popular products are overly promoted
New or different products are disregarded
Solutions: techniques and metrics for diverse and fair recommendations

Data privacy in AI systems
Data privacy: safeguarding sensitive or personal information from unauthorized access and
misuse
1 GDPR: General Data Protection Regulation. CCPA: California Consumer Privacy Act.

Let's practice!
Social challenges:
the future of AI

How may AI shape our present (and future) society?
Healthcare: advanced diagnosis, personalized treatments, surgical robots, etc.
Governments and Law: Generative AI; new regulations about responsible use
Finance and cybersecurity: risk management, fraud detection
Sustainable Development Goals (SDGs)
1 Icon made by mynamepong, surang (flaticon.com). Image generated in https://stablediffusionweb.com/

AI and sustainability
1 Source: United Nations (https://sdgs.un.org/goals)


AI and the future of workforce
Challenges:
Job displacement, rapidly evolving skillsets
Example: Large Language Models (e.g. ChatGPT), prompt engineering skills
Opportunities:
New jobs, e.g. AI ethicist, AI educator

Industry transformation

AI and the future of education
Challenges:
Skills gap: educational institutions need to keep up to date with relevant AI
training
Digital divide: ensure universal access to AI-powered education
Opportunities:
Personalized learning
Automation of time-consuming
administrative tasks

AI and the future of the environment
Challenges:
Ecological degradation: energy consumption, electronic waste, carbon
footprint
Opportunities:
Understand climate change
Optimize use of natural resources
Optimize renewable energy use

Let's practice!
One journey ends,
another begins

Chapter 1: What is Artificial Intelligence

Chapter 2: Tasks AI can solve

Chapter 3: Harnessing AI in organizations

Chapter 4: The human side of AI

What to learn next?
AI Essentials Skill Track
An exciting 6-course pathway to consolidate your AI literacy.
Implementing AI Solutions in Business

From use cases to proofs-of-concept, explore AI system deployment in business.
Artificial Intelligence (AI) Concepts in Python

A gentle and practical introduction to implementing AI and ML systems.
1 Image by upklyak (www.freepik.com)

Congratulations!
What is ChatGPT?
INTRODUCTION TO CHATGPT
James Chapman
Curriculum Manager, DataCamp
What is ChatGPT?
AI Chatbot application:
Answer questions
Perform tasks
User-inputted text
What is ChatGPT?
Traditional chatbots
Predetermined responses
Limited questions
ChatGPT
More generalizable
Uses its understanding of language to interpret the question and respond
Wide range of potential applications
Generative AI
Subset of AI and Machine Learning
Generates new content
Uses patterns in information it has already seen
From prompt to response
Step 1: User writes a question or instruction: prompt
Step 2: ChatGPT interprets the prompt
Step 3: Generates new, relevant language in response
Step 4: Response is returned to the user
Summarizing text
ChatGPT is great at summarizing text and explaining concepts
Save time when summarizing reports
Interpret complex information more easily
Creating marketing content
Why utilize ChatGPT?
ChatGPT can perform many tasks with greater efficiency
New workflow: AI → Human
Save time and money
Greater personalization
Let's practice!
Limitations of
ChatGPT
ChatGPT under the hood
Demystifying the LLM
Limitation 1 - Knowledge cutoff
Trained on data from up to a certain date:

GPT 3.5: January 2022
GPT 4: April 2023
Isn't aware of events beyond this date
Limitation 2 - Training data bias
ChatGPT was trained on a huge text dataset, including:
Books
Articles
Websites
Model may learn the biases from the training data
Could bias the responses
Limitation 3 - Context tracking
Struggles to keep track of the context if the focus shifts
Can lead to inaccurate or irrelevant results
Tip: Keep conversations to a single topic
Limitation 4 - Hallucination
Model confidently provides inaccurate information
Often occurs when trying to go beyond the model's knowledge or abilities
1 https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)
Limitation 5 - Legal and ethical considerations
Example: Creating a song in the style of an existing artist
Who owns the new song?
Easy to fall into a legal gray area
Ownership and privacy → Chapter 2
Let's practice!
Writing effective
prompts
Garbage in, garbage out
How does ChatGPT interpret a prompt?
1. Identify the topic
2. Understand the prompt
3. Generate response
Prompt engineering
Prompt engineering is the process of writing prompts to maximize the quality
and relevance of the response
Writing tips for prompt engineering
Be clear and specific
Include any necessary information
Example: in a summarizing task, specify the desired length
Keep it concise
Remove any information that doesn't provide useful context
Use correct grammar and spelling
ChatGPT uses grammar when interpreting the task
Provide examples if necessary...
Can be a much quicker way of providing context
Example: Generating example customers
Want the form:
Full Name, Age (Occupation)
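
The writing tips can be applied programmatically when prompts are assembled in code. The sketch below is a hypothetical helper (`build_prompt` and its parameters are invented for illustration) that combines a clear task, the desired output form and an example into one prompt string; it does not call ChatGPT.

```python
def build_prompt(task, output_format, examples=(), extra_context=""):
    """Assemble a clear, specific prompt from its parts."""
    parts = [task.strip()]
    if extra_context:
        parts.append(f"Context: {extra_context.strip()}")
    parts.append(f"Use this format: {output_format}")
    if examples:
        parts.append("Examples:")
        parts.extend(f"- {example}" for example in examples)
    return "\n".join(parts)


prompt = build_prompt(
    task="Generate five example customers for a fictional bookshop.",
    output_format="Full Name, Age (Occupation)",
    examples=["Jane Doe, 34 (Occupation: Teacher)".replace("Occupation: ", "")],
)
print(prompt)
```

Showing one example line in the wanted form is often quicker than describing the form in words, which is the point of the "provide examples" tip.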
Let's practice!
Enabling people to
use ChatGPT
Augmenting workflows
Workflow: Standardized series of tasks to achieve an end goal
Aims:
Highest-quality output
Shortest timeframe
A standard workflow
Example: Summarizing a project scoping document
30 pages → summary of key findings
We extract the key findings and compile the summary
We use a spelling and grammar checker to proofread
A ChatGPT-powered workflow
Example: Summarizing a project scoping document
30 pages → summary of key findings
ChatGPT allows us to reverse the roles
Human now becomes the proofreader
Huge time-savings!
Allows us to focus on more creative tasks
Leaders
Use cases:
Compose emails
Draft presentations
Brainstorm strategic ideas
Summarize meeting notes
Technical roles
Use cases:
Recall code syntax
Generate examples
Explain code
Troubleshoot errors
Write documentation
HR and people teams
Use cases:
Brainstorm employee engagement and wellbeing initiatives
Communicate more effectively and efficiently
Marketing
Use cases:
Write social media posts
Copyedit content
Generate marketing copy
Search Engine Optimization (SEO)
Sales
Use cases:
Generate outreach templates
Personalize outreach content
Brainstorm strategies
Summarize information
Let's practice!
Identifying use cases
for ChatGPT
Coming up...
Can be inaccurate
No predictability in responses
Subject matter expertise is still very important!
Rule-of-thumb: Don't ask ChatGPT to do something that we couldn't do ourselves
1 https://openai.com/policies/terms-of-use
Need consent to process the data
Must adhere to data governance laws, such as GDPR
Legal counsel may be able to enable the use case
Can claim ownership over ChatGPT output
Other considerations such as copyright
infringement may prevent ownership
Legal and ethics → Coming up!
Example 1: Brainstorming ideas in HR
Improve employee wellbeing
Is this a suitable use?
1. Situation doesn't require definitive answers
2. HR Manager can verify response

3. Sensitive data isn't required
4. Response won't be used
Example 2: Healthcare recommendations
Customers input their symptoms and receive recommended action
Is this a suitable use?
Use case requires certainty due to implications of poor recommendations
ChatGPT cannot provide this level of certainty
Let's practice!
Ownership and
privacy
Ownership and privacy
Ownership and privacy are key considerations when validating the
suitability of ChatGPT
Neglecting them can risk financial penalties, lawsuits, and brand damage
Who owns the response?
... As between you and OpenAI, and to the
extent permitted by applicable law... We
hereby assign to you all our right, title, and
interest, if any, in and to Output.
Assuming compliance, users can claim ownership over the response
... output may not be unique and other
users may receive similar output from our
Services. Our assignment above does not
extend to other users' output or any Third
Party Output.
Factual questions or generating small text snippets → cannot claim ownership
Represent that Output was human-generated when it was not.
Use our Services in a way that infringes, misappropriates or violates anyone's rights.
Includes copyright infringement
OpenAI's terms of use are updated frequently
Ownership and copyright
Copyright: the rights of the owner of the intellectual property (IP) to use or distribute
the material
If generated content resembles copyrighted property, infringement claims can be made
Who owns the prompt?
As between the parties and to the extent permitted by applicable law, you own all Input
Prompt privacy
ChatGPT is being continuously developed and improved
OpenAI may use prompts and responses for performance improvements
May need to opt-out of usage agreement
Risk of breaching data governance laws
Data governance
Govern how data can be collected, stored, and used
Example: GDPR governs data usage impacting EU citizens and residents
Use cases must adhere to data governance laws
AI ethics
Ensure data is used with people's and society's best interests in mind
Ask whether the use will negatively or positively impact people
Let's practice!
Advancements in
generative AI
Coming up...
What's to come in generative AI?
What challenges need to be overcome?
Performance improvements
More human-like content
Handle more complexity

Greater reliability
What's driving the improvements?
Large Language Models (LLMs)

Learns from a huge text dataset
Algorithms detect patterns in text
Fine-tune the model by rating responses
Amount of training data will increase
Usage data will help in fine-tuning
Building balanced datasets
Challenge: Ensuring data is high quality and balanced
Quantity of data makes detecting bias prior to training difficult
Goal: Develop more robust bias mitigation procedures
Opportunities for misuse
Misrepresenting AI-generated content
Creating malicious content (e.g., spam)
Intervention by lawmakers:
Regulations could help or hinder AI advancement
From generalized to specialized
ChatGPT is a generalizable model
Generative AI models will become more specialized
Example: a model specifically designed to write long and complex code
Other types of generative AI
1 DALL-E 3
AI for everyone!
Accessibility is key to ChatGPT's success
Democratization of AI tools
Everyone should benefit from the technology
Let's practice!
Congratulations!
Chapter 1 - Interacting with ChatGPT
What can ChatGPT do?
What are its limitations?
How to write effective prompts → prompt engineering
Chapter 2 - Adopting ChatGPT
Augmenting business workflows
Identifying appropriate use cases

Legal and ethical considerations
The future of generative AI
Where next?
Courses:
Understanding Prompt Engineering
Generative AI Concepts
Large Language Models (LLMs) Concepts
AI Ethics
Artificial Intelligence (AI) Strategy
Implementing AI Solutions in Business
Skill Tracks:
AI Fundamentals
AI Business Fundamentals
Congratulations!
What is machine
learning?
UNDERSTANDING MACHINE LEARNING
Lis Sulmont
Artificial intelligence (AI)
A huge set of tools for making computers behave intelligently
Machine learning is the most prevalent subset of AI

Defining machine learning:
A set of tools for making inferences and predictions from data

Defining machine learning: what can it do?
Predict future events
Will it rain tomorrow?
Yes (75% probability)
Infer the causes of events and behaviors

Why does it rain?
Time of the year, humidity levels, temperature, location, etc
Infer patterns
What are the different types of weather conditions?
Rain, sunny, overcast, fog, etc

Defining machine learning: how does it work?
Interdisciplinary mix of statistics and computer science
Ability to learn without being explicitly programmed
Learns patterns from existing data and applies them to new data
Relies on high-quality data
... more to come throughout the course!

Data science
Data science is about discovering and communicating insights from data
Machine learning is often an important tool for data science work

Machine learning model
A statistical representation of a real-world process based on data




Let's practice!
Machine learning
concepts
Three types of machine learning
1) Reinforcement learning
2) Supervised learning
3) Unsupervised learning

Training data
Training data: existing data to learn from
Training a model: when a model is being built from training data
Can take nanoseconds to weeks

Supervised learning training data





After training (supervised learning)



Supervised vs unsupervised learning
Supervised learning
Training data is "labeled"
Unsupervised learning
Training data only has features
Useful for:
Anomaly detection
Clustering, e.g., dividing data into groups

Unsupervised learning training data


After training (unsupervised learning)

Unsupervised Learning
In reality, data doesn't always come with labels
Requires manual labor to label
Labels are unknown
No labels: model is unsupervised and finds its own patterns

Let's practice!
Machine learning
workflow
Lis Sulmont
Machine learning workflow

Our scenario
Our dataset: NYC property sales from 2015-2019
Includes:
Square feet
Neighborhood
Year built
Sale price
And more!
Our target: Sale price

Step 1: Extract features

Step 2: Split dataset

Step 3: Train model

Step 3: Train model

Step 4: Evaluate

Step 4: Evaluate
Test dataset: "unseen" data
Many ways to evaluate:

What is the average error of the predictions?
What percent of apartments did the model accurately predict within a 10% margin?

Step 4: Evaluate

Step 4: Evaluate

Step 4: Evaluate
If not, tune the model and re-train it:

e.g., change the model's options, add/remove features


Summary of steps
1. Extract features
Choosing features and manipulating the dataset
2. Split dataset
Train and test dataset
3. Train model
Input train dataset into a machine learning model
4. Evaluate
If desired performance isn't reached: tune the model and repeat Step 3
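
The four steps can be sketched end to end in plain Python. This is only an illustration under stated assumptions: the property records are made up (not real NYC sales), the model is a one-feature least-squares line rather than whatever the course uses, and the evaluation mirrors the two questions from Step 4 (average error, share of predictions within a 10% margin).

```python
import random

# Hypothetical toy records (square feet, sale price); not real NYC sales data.
properties = [
    (500, 410_000), (650, 520_000), (800, 655_000), (950, 760_000),
    (1100, 890_000), (1250, 1_000_000), (1400, 1_130_000), (1550, 1_240_000),
]

# Step 1: extract the feature (square feet) and the target (sale price).
rows = [(float(sqft), float(price)) for sqft, price in properties]

# Step 2: split into train and test datasets (75% / 25%) with a fixed seed.
random.Random(0).shuffle(rows)
split = int(len(rows) * 0.75)
train, test = rows[:split], rows[split:]

# Step 3: train a one-feature linear model by ordinary least squares.
def fit_line(data):
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var = sum((x - mean_x) ** 2 for x, _ in data)
    slope = cov / var
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(train)

# Step 4: evaluate on the unseen test dataset.
predictions = [(slope * x + intercept, y) for x, y in test]
mean_abs_error = sum(abs(p - y) for p, y in predictions) / len(predictions)
within_10_pct = sum(abs(p - y) <= 0.10 * y for p, y in predictions) / len(predictions)

print(f"Average error: ${mean_abs_error:,.0f}")
print(f"Predicted within a 10% margin: {within_10_pct:.0%}")
```

If the numbers are not good enough, the tuning loop from Step 4 would change `fit_line` (or the chosen features) and repeat Step 3.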

Let's practice!
Supervised learning
Hadrien Lacroix
Content Developer at DataCamp
Modeling

Types

What is supervised learning?

Classification and regression

Classification

Classification
Classification = assigning a category
Will this customer cancel their subscription?
Yes, No
Is this mole cancerous?
Yes, No
What kind of wine is that?
Red, White, Rosé
What flower is that?
Rose, Tulip, Carnation, Lily

Observations

Features

Target

Graphing our data

Splitting data

Manual classifier

Support vector machine - linear classifier

Support vector machine - polynomial classifier

Regression

Regression
Regression = predicting a continuous value
How much will this stock be worth?
What is this exoplanet's mass?
How tall will this child be as an adult?

Predicting temperature

Training data

Linear regression

Model

Given humidity...

...find temperature

Testing data

Classification vs regression
Regression = continuous
Any value within a finite (height) or infinite (time) interval
20°F, 20.1°F, 20.01°F...
Classification = category
One of few specific values
Cold, Mild, Hot

Let's practice!
Unsupervised
learning
Hadrien Lacroix

Unsupervised learning = no target column
No guidance
Looks at the whole dataset
Tries to detect patterns

Applications

Clustering

Clustering example

Species cluster

Color cluster

Origin cluster

Clustering models
K Means:
Specify the number of clusters
DBSCAN (density-based spatial clustering of applications with noise):

Specify what constitutes a cluster
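
As a rough sketch of what "specify the number of clusters" means for K-Means, the following plain-Python version alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The function name `k_means`, the fixed seed, and the sample points are illustrative assumptions, not part of the course material.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Cluster 2-D points into k groups with basic K-Means (Lloyd's algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters

# Hypothetical measurements forming two well-separated groups.
points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]
centroids, clusters = k_means(points, k=2)
print(sorted(len(c) for c in clusters))
```

Choosing `k` is the user's job here; asking for more clusters than the data naturally contains splits real groups, which is what the 4-cluster versus 3-cluster comparison on the Iris data illustrates.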

Iris table

K-Means with 4 clusters

K-Means with 3 clusters

Ground truth

Anomaly detection

Detecting outliers
Anomaly detection = detecting outliers
Outliers = observations that differ from the rest
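
One simple way to flag observations that differ from the rest is a z-score rule: measure how many standard deviations each value sits from the mean. This is a minimal sketch of that one technique, not necessarily the method the course has in mind; the threshold of 2.0 and the device lifetimes are assumptions chosen for illustration.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    return [v for v in values if spread and abs(v - mean) / spread > threshold]

# Hypothetical device lifetimes in months; one device lasts far longer.
lifetimes = [24, 25, 23, 26, 24, 25, 22, 60]
print(find_outliers(lifetimes))
```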

Outliers

Removing outliers

Some anomaly detection use cases
Discover devices that fail faster or last longer
Discover fraudsters that manage to trick the system
Discover patients that resist a fatal disease
...

Association

Association

Let's practice!
Evaluating
performance
Hadrien Lacroix
Evaluate step

Overfitting
Performs great on training data
Performs poorly on testing data
Model memorized training data and can't generalize learnings to new data
Use testing set to check model performance

Illustrating overfitting

Accuracy
Accuracy = correctly classified observations / all observations
48 / 50 = 96%

Limits of accuracy: fraud example
Accuracy of this model:
28 correctly classified / 30 total points = 93.33%
Misses majority of fraudulent transactions
Need a better metric

Confusion matrix

True positives

True positives

False negatives

False negatives

Remembering False Negatives

Fill out the rest...

False positives, true negatives

Remembering False Positives
1 https://www.flickr.com/photos/59632563@N04/6104068209

Sensitivity
How many fraudulent transactions did we classify correctly?
Sensitivity = true positives / (true positives + false negatives) = 1/3 = 33.33%
Rather mark legitimate transactions as suspicious than authorize fraudulent transactions

Specificity
Specificity = true negatives / (true negatives + false positives)
Spam filter:
Rather send spam to inbox than send real emails to the spam folder
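
The three metrics can be computed from the four confusion-matrix counts. The counts below are inferred from the fraud example rather than stated on the slides: 1 true positive and 2 false negatives give the 1/3 sensitivity, and 27 true negatives with 0 false positives are assumed so that 28 of 30 points are classified correctly.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of actual positives caught
        "specificity": tn / (tn + fp),  # share of actual negatives cleared
    }

# Counts inferred from the fraud example: 30 points, 28 correct, sensitivity 1/3.
metrics = classification_metrics(tp=1, fn=2, fp=0, tn=27)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

High accuracy with low sensitivity is exactly the pattern the fraud example warns about: the model looks good overall while missing most fraudulent transactions.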

Evaluating regression

Evaluating regression
Error = distance between point (actual value) and line (predicted value)
Many ways to calculate this, e.g., root mean square error
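
Root mean square error can be sketched in a few lines: square each error, average the squares, and take the square root. The temperature values are hypothetical and only serve to exercise the function.

```python
import math

def root_mean_square_error(actual, predicted):
    """Square each error, average the squares, then take the square root."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical temperatures: actual readings vs. values read off the fitted line.
actual = [20.0, 22.0, 25.0, 30.0]
predicted = [21.0, 22.0, 23.0, 30.0]
print(root_mean_square_error(actual, predicted))
```

Because errors are squared before averaging, one large miss raises the score more than several small ones.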

1 https://www.flickr.com/photos/micahdowty/8540188997

Let's practice!
Improving
performance
Hadrien Lacroix

Several options
Dimensionality reduction
Hyperparameter tuning
Ensemble methods

Dimensionality reduction
Reducing the number of features

Dimensionality reduction: example
Irrelevance: some features don't carry useful information

Dimensionality reduction: example
Correlation: some features carry similar information
Keep only one feature

e.g. height and shoe size --> height
Collapse multiple features into one underlying feature

e.g. height and weight --> Body Mass Index






Hyperparameter tuning: example
SVM algorithm hyperparameters:
kernel : "linear" --> "poly"
degree
gamma
shrinking
coef0
tol
...

Ensemble methods

Ensemble methods: classification

Ensemble methods: regression

Let's practice!
Deep learning
Sara Billen
What is deep learning?
AKA: Neural Networks
Basic unit: neurons (nodes)
Special area of Machine Learning
Requires more data
Best when inputs are images or text

Predicting box office revenue









Deep learning
Neural networks are much larger
Deep learning: neural network with many neurons
Can solve complex problems

When to use deep learning?
Lots of data
Access to processing power
Lack of domain knowledge
Complex problems
Computer vision
Natural language processing

Let's practice!
The process
Sara Billen
Computer vision
Helps computers see and understand the content of digital images

Image data

Image data

Training the neural network

Applications
Facial recognition
Self-driving vehicles
Automatic detection of tumors in CT scans
Deepfakes
...

Let's practice!
Natural Language
Processing
Sara Billen
Curriculum Manager at DataCamp
Natural Language Processing (NLP)
The ability for computers to understand the meaning of human language

Bag of words

Bag of words
Word     "U2 is a great band"    "Queen is a great band"
U2       1                       0
Queen    0                       1
is       1                       1
a        1                       1
great    1                       1
band     1                       1

Bag of words: n-grams
"That book is not great"

Word     Count    2-gram (bi-gram)    Count
That     1        That book           1
book     1        book is             1
is       1        is not              1
not      1        not great           1
great    1
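
The word and bi-gram counts above can be reproduced with a short sketch. The helper name `bag_of_words` and the whitespace tokenization are simplifying assumptions; real pipelines normalize case and punctuation first.

```python
from collections import Counter

def bag_of_words(text, n=1):
    """Count n-grams (single words when n=1) in a whitespace-tokenized text."""
    words = text.split()
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return Counter(ngrams)

unigrams = bag_of_words("That book is not great")
bigrams = bag_of_words("That book is not great", n=2)
print(unigrams)
print(bigrams)
```

The bi-gram "not great" keeps the negation attached to the word it modifies, which single-word counts lose.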

Bag of words: limitations
Word counts don't help us consider synonyms
Example: "blue"
"sky-blue"
"aqua"
"cerulean"
Want to group as a single feature

Word embeddings
Word embeddings
Create features that group similar words
Features have a mathematical meaning:
king - man + woman = queen
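
The arithmetic can be made concrete with toy vectors. The two dimensions (royalty, masculinity) and every number below are invented for illustration; real embeddings are learned from text and have hundreds of dimensions.

```python
import math

# Hypothetical 2-D embeddings: (royalty, masculinity). Real embeddings are learned.
embeddings = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "prince": (0.8, 0.9),
    "apple": (0.0, 0.1),
}

def cosine_similarity(a, b):
    """Return 1.0 for identical directions, -1.0 for opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# king - man + woman, computed component by component.
target = tuple(
    k - m + w
    for k, m, w in zip(embeddings["king"], embeddings["man"], embeddings["woman"])
)

# Look for the closest word that was not part of the query itself.
candidates = {w: v for w, v in embeddings.items() if w not in {"king", "man", "woman"}}
closest = max(candidates, key=lambda w: cosine_similarity(candidates[w], target))
print(target, closest)
```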

Language translation

Applications
Language translation
Chatbots
Personal assistants
Sentiment analysis
...

Deep learning
Two types of problems
Computer vision
Natural language processing
Why deep learning?

Complex problems
Automatic feature extraction
Lots of data

Let's practice!
Limits of machine
learning
Sara Billen
Data quality
Garbage in garbage out
Output quality depends on input quality

How it can go horribly wrong
Amazon's gender-biased recruiting tool
Recruiting software to help review resumes
Preferred men because it learned from historic data when more men were hired
It downgraded resumes that
contained the word "women"
implied the applicant was female

How it can go horribly wrong
Microsoft's AI chatbot

Beware
Don't blindly trust your model
Awareness is key
Pay attention to your data
A machine learning model is only as good as the data you give it

Quality assurance
High-quality data requires:
Data analysis
Review of outliers
Domain expertise
Documentation

Explainability

Explainability
Transparency to increase trust, clarity, and understanding
Use cases: business adoption, regulatory oversight, minimizing bias

Explainable AI
Black box                      Explainable AI
Deep learning                  Traditional machine learning
Better for "What?"             Better for "Why?"
Highly accurate predictions    Understandable by humans

Example: Explainable AI
1. Prediction: Will the patient get diabetes?
2. Inference: Why will this happen?

Example: Inexplicable AI
Prediction only: Which letter is this likely to be?

Let's practice!
Congratulations!
Lis Sulmont
Chapter 1
What is machine learning?
Machine learning concepts and workflow

Chapter 2
Different types of machine learning
How we evaluate and improve machine learning models

Chapter 3
Deep learning, including computer vision and natural language processing
Limits of machine learning

What's next?

What's next?
Machine Learning Scientist
Machine Learning Fundamentals
Supervised Machine Learning
Unsupervised Machine Learning

Congrats!
The rise of LLMs in
the AI landscape
LARGE LANGUAGE MODELS (LLMS) CONCEPTS
Vidhi Chugh
AI strategist and ethicist
Rapid developments in AI
1 Freepik, Tesla Youtube Channel

AI-powered recommendations
1 Netflix blog, Medium

AI and data-driven tasks
Sentiment analysis, fraud detection, and more
Still, lacked human-like interaction
Enter Large Language Models
1 Unsplash

The AI landscape

Definition of LLMs
Large
Training data and resources
1 Freepik

Definition of LLMs
Large
Training data and resources
Language
Human-like text
1 Freepik

Definition of LLMs
Large
Training data and compute power
Language
Human-like text
Models
Learn complex patterns using text data
1 Freepik

The defining moment

Popular language generators
1 https://zapier.com/blog/best-ai-chatbot/

Applications
Sentiment analysis
Identifying themes
Translating text or speech
Generating code
Next-word prediction

What shall this course cover?
Conceptual understanding of LLMs
Training data considerations

Ethical, privacy and environmental concerns
The future of LLMs

Let's practice!
Real-world
applications
Vidhi Chugh
Business opportunities
Benefits
Automate tasks
Improve efficiency
Create revenue streams
Enable new capabilities
The possibilities are endless!

Transforming finance industry
Unstructured data or text: data that lacks definition and is presented free-form
1 Freepik



Challenges in healthcare
Doctors' notes:
Jargon
Abbreviations
Domain expertise
Varying writing style
Challenges:
Hard to understand terms
Difficult to interpret
Difficult to describe patient files
Varied text data and acronyms

Revolutionizing healthcare sector
Analyze patient data to offer personalized recommendations
Must adhere to privacy laws
1 Freepik

Education
Personalized coaching and feedback
Interactive learning experience

AI-powered tutor
Ask questions
Receive guidance
Discuss ideas
1 Freepik

Personalizing education: text generation

Defining multimodal
Multimodal: many types of processing or generation
Non-multimodal: one type of processing or generation
1 Freepik

Visual question answering
Answers to questions about visual content
Object identification & relationships

Scene description
Recognizes the zebra image
Responds with additional information
Makes a joke
1 https://arxiv.org/abs/2302.14045

Let's practice!
Challenges of
language modeling
Vidhi Chugh
Sequence matters!
"I only follow a healthy lifestyle."
"Only I follow a healthy lifestyle."
Different positions = different meanings
1 Freepik

Context modeling

Long-range dependency
Recognize and connect distant words in a sentence
Challenging for traditional language models

Single-task learning
Time and resource expensive

Less flexible compared to modern LLMs

Multi-task learning
Improved performance on each individual task
Might impact accuracy and efficiency

Less training data needed because data is shared

To recap
Challenges of language modeling:
Word sequences
Understanding context
Long-range dependency
Single-task learning:
Task-specific
Less flexible
Traditional models and early LLMs
Multi-task learning:
Versatile
Multiple tasks
More developed LLMs

Let's practice!
Novelty of LLMs
Vidhi Chugh
Using text data
Unstructured data - messy and inconsistent
1 Freepik

Machines do not understand language!
1 Freepik

Need for NLP
1 Freepik

Unique capabilities of LLMs
Linguistic subtleties
Irony
Humor
Pun
Sarcasm
Intonation
Intent
1 Freepik

What's your favorite book?
Natural response: "Oh, that's a tough one!"
Personal opinion: "My all-time favorite book is To Kill a Mockingbird by Harper Lee."
Supporting statement: "It's a powerful story about prejudice, justice, and the human experience."
Follow-up question: "Have you read it?"

Linguistic subtleties
Sarcasm: "Oh great, another meeting."
Traditional language model:
Response: "What's the meeting about?"
Neutral
Does not pick up sarcasm
Large language model:
Response: "Sounds like you're looking forward to it!"
Playful
Engaging
Matches the sarcasm

How do LLMs understand?
Trained on vast amounts of data
Largeness of LLMs: parameters

Parameters represent the patterns and rules
More parameters -> complex patterns
Generates sophisticated and accurate responses

Parameters
Small number of bricks -> limited structures
Larger number of bricks -> complex and detailed structures

Emergence of new capabilities
Emergent abilities
only present in large-scale models
Scale:
The volume of training data
The number of model parameters

Emergence of new capabilities

Building blocks of LLMs

To recap
LLMs:
Overcome data's unstructured nature
Outperform traditional models
Understand linguistic subtleties
How?
LLMs' "largeness"
Extensive training data
Many parameters
Emergent abilities

Let's practice!
Generalized
overview of NLP
Vidhi Chugh
Where are we?

Text pre-processing
Can be done in a different order as they are independent

Tokenization
Splits text into individual words, or tokens
Text:
"Working with natural language processing techniques is tricky."
Tokenization:
["Working", "with", "natural", "language", "processing", "techniques", "is", "tricky", "."]
Converts into a list

Stop word removal
Stop words do not add meaning
Eliminated through stop word removal
Before stop word removal:

["Working", "with", "natural", "language", "processing", "techniques", "is", "tricky", "."]
After stop word removal:

["Working", "natural", "language", "processing", "techniques", "tricky", "."]

Lemmatization
Group slightly different words with similar meaning
Reduces words to their base form
Mapped to root word
Talking -> Talk
Talked -> Talk
Talk -> Talk
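
The three pre-processing steps (tokenization, stop word removal, lemmatization) can be sketched together. The stop word set and the lemma lookup table are deliberately tiny, hypothetical stand-ins; real pipelines rely on much larger curated resources.

```python
import re

# Hypothetical, deliberately tiny resources; real pipelines use larger lists.
STOP_WORDS = {"with", "is"}
LEMMAS = {"talking": "talk", "talked": "talk"}

def tokenize(text):
    """Split text into word tokens and standalone punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word set (case-insensitive)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

def lemmatize(tokens):
    """Map each token to its base form when the lookup table knows it."""
    return [LEMMAS.get(t.lower(), t) for t in tokens]

tokens = tokenize("Working with natural language processing techniques is tricky.")
print(tokens)
print(remove_stop_words(tokens))
print(lemmatize(["Talking", "Talked", "Talk"]))
```

Since each function takes and returns a list of tokens, stop word removal and lemmatization can be applied in either order once the text is tokenized.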

Text representation

Text representation
Text data into numerical form
Bag-of-words
Word embeddings

Bag-of-words
Text into a matrix of word counts
0 represents the absence of a word

Limitations of bag-of-words
Does not capture the order or context
Can lead to incorrect interpretations
Similar sentences but opposite meaning

"The cat chased the mouse swiftly."
"The mouse chased the cat."
Does not capture the semantics between the words

Treats related words as independent
Like "cat" and "mouse"

Word embeddings
Capture the semantic meanings as numbers
Predator-prey relationship:
             Cat     Mouse
Plant        -0.9    -0.8
Furry         0.9     0.7
Carnivore     0.9    -0.8
Cat [-0.9, 0.9, 0.9]
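
Using the values from the table above, a short sketch shows how the two vectors can be compared numerically. Cosine similarity is one common comparison and is an assumption here; the slides do not name a specific measure.

```python
import math

FEATURES = ("Plant", "Furry", "Carnivore")
cat = (-0.9, 0.9, 0.9)     # values from the table above
mouse = (-0.8, 0.7, -0.8)  # values from the table above

def cosine_similarity(a, b):
    """Return 1.0 for identical directions, -1.0 for opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Per-feature products show where the two words agree (+) or disagree (-).
agreement = {f: c * m for f, c, m in zip(FEATURES, cat, mouse)}
print(agreement)
print(round(cosine_similarity(cat, mouse), 2))
```

The two words agree on "Plant" and "Furry" and disagree on "Carnivore", which is the predator-prey contrast the table encodes.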

Machine-readable form
Start with text pre-processing

Machine-readable form
Convert pre-processed text to numerical format

Let's practice!
Fine-tuning
Vidhi Chugh
Where are we?

Pre-training: school education
Fine-tuning: university specialization
1 Freepik

"Largeness" challenges
Fine-tuning can help
Powerful computers
Efficient model training methods
Large amounts of training data

Computing power
Memory
Processing power
Infrastructure
Expensive
LLM:
100,000s of Central Processing Units (CPUs)
10,000s of Graphics Processing Units (GPUs)
A personal computer: 4-8 CPUs and 1-2 GPUs
1 Freepik

Efficient model training
Training time is huge
May take weeks or even months
Efficient model training = faster training time
355 years of processing time on a single GPU

Data availability
Need for high-quality data
To learn the complexities and subtleties of language
A few hundred gigabytes (GBs) of text data
More than a million books
Massive amount of data

Overcoming the challenges
Fine-tuning
Addresses some of these challenges
Adapts a pre-trained model
Pre-trained model
Learned from general-purpose datasets
Not optimized for specific tasks
Can be fine-tuned for a specific problem

Fine-tuning vs. Pre-training
                 Fine-tuning          Pre-training
Compute          1-2 CPUs and GPUs    Thousands of CPUs and GPUs
Training time    Hours to days        Weeks to months
Data             ~1 gigabyte          Hundreds of gigabytes

Let's practice!
Learning techniques
Vidhi Chugh
Where are we?

Getting beyond data constraints
Fine-tuning: training a pre-trained model for a specific task
But, what if there is little to no labeled data?
N-shot learning: zero-shot, few-shot, and multi-shot

Transfer learning
Learn from one task and transfer to a related task
Transferring knowledge from piano to guitar
Reading musical notes
Understanding rhythm
Grasping musical concepts
N-shot learning
Zero-shot - no task-specific data
Few-shot - little task-specific data

Multi-shot - relatively more training data

Zero-shot learning
No explicit training
Uses language understanding and context
Generalizes without any prior examples
1 Freepik

Few-shot learning
Learn a new task with a few examples
One-shot learning: fine-tuning from one example
Prior knowledge to answer new question

Multi-shot learning
Requires more examples than few-shot
Previous tasks, plus new examples
For example, a model trained on Golden Retriever
1 Freepik

Multi-shot learning
Model output: Labrador Retriever
Saves time in collecting and labeling data
No compromise on accuracy
1 Freepik

Building blocks so far
Data preparation workflow
Fine-tuning
N-shot learning techniques
Next up: pre-training

Let's practice!
Building blocks to
train LLMs
Vidhi Chugh
Where are we?

Generative pre-training
Trained using generative pre-training

Input data of text tokens
Trained to predict the tokens within the dataset
Types:
Next word prediction
Masked language modeling

Next word prediction
Supervised learning technique
Model trained on input-output pairs
Predicts next word and generates coherent text
Captures the dependencies between words
Training Data
Pairs of input and output examples

Training data for next word prediction
Input                                          Output
The quick brown                                fox
The quick brown fox                            jumps
The quick brown fox jumps                      over
The quick brown fox jumps over                 the
The quick brown fox jumps over the             lazy
The quick brown fox jumps over the lazy        dog
The quick brown fox jumps over the lazy dog    .
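
Input-output pairs like these can be generated mechanically from a sentence. The sketch below assumes whitespace-separated tokens, treats the final period as its own token, and starts from a three-word context; the function name `next_word_pairs` is hypothetical.

```python
def next_word_pairs(sentence, min_context=3):
    """Turn one sentence into (input, output) pairs for next-word prediction."""
    words = sentence.split()
    return [
        (" ".join(words[:i]), words[i])
        for i in range(min_context, len(words))
    ]

pairs = next_word_pairs("The quick brown fox jumps over the lazy dog .")
for context, target in pairs:
    print(f"{context!r} -> {target!r}")
```

No manual labeling is needed: each "label" is simply the next word already present in the text.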

Which word relates more with pizza?
More examples = better prediction
For example:
I love to eat pizza with _ _ _ _ _ _
Cheese is more related to pizza than anything else

Masked language modeling
Hides a selected word
Trained model predicts the masked word
Original Text: "The quick brown fox jumps over the lazy dog."
Masked Text: "The quick [MASK] fox jumps over the lazy dog."
Objective: predict the missing word
Based on learnings from training data
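
Building one masked training example is mostly string handling, as the sketch below shows. The helper `mask_word` is hypothetical, and splitting on whitespace is a simplification of real tokenization.

```python
def mask_word(sentence, position, mask_token="[MASK]"):
    """Hide the word at `position` and return (masked text, hidden word)."""
    words = sentence.split()
    hidden = words[position]
    words[position] = mask_token
    return " ".join(words), hidden

masked_text, label = mask_word("The quick brown fox jumps over the lazy dog.", 2)
print(masked_text)
print(label)
```

The model sees `masked_text` as input and is trained to recover `label`, using words on both sides of the mask.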

Let's practice!
Introducing the
transformer
Vidhi Chugh
Where are we?

What is a transformer?
"Attention Is All You Need"
Revolutionized language modeling
Transformer architecture
Relationship between words
Components: Pre-processing, Positional Encoding, Encoders, and Decoders
1 arXiv: Attention Is All You Need

Inside the transformer
Input: Jane, who lives in New York and works as a software
Output: engineer, loves exploring new restaurants in the city.

Transformers are like an orchestra

Text pre-processing and representation
Text preprocessing: tokenization, stop word removal, lemmatization
Text representation: word embedding

Positional encoding
Information on the position of each word
Understand distant words
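
The slides do not say how position information is computed. One well-known option is the sinusoidal scheme from the cited "Attention Is All You Need" paper, sketched here as an assumption about what is meant: even dimensions use a sine, odd dimensions a cosine, and each pair of dimensions shares a wavelength.

```python
import math

def positional_encoding(num_positions, dim):
    """Sinusoidal encodings: sine on even indices, cosine on odd indices."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            # Each pair of dimensions (2k, 2k+1) shares one wavelength.
            angle = pos / (10000 ** ((i - i % 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

encodings = positional_encoding(num_positions=4, dim=4)
for pos, row in enumerate(encodings):
    print(pos, [round(v, 3) for v in row])
```

Each position gets a distinct vector, which is added to the word embedding so the model can tell "I only" from "Only I".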

Encoders
Attention mechanism: directs attention to specific words and relationships
Neural network: process specific features

Decoders
Includes attention and neural networks
Generates the output

Transformers and long-range dependencies
Initial challenge: long-range dependency
Attention: focus on different parts of the input
Example: "Jane, who lives in New York and works as a software engineer, loves exploring new restaurants in the city."
"Jane" --- "loves exploring new restaurants"

Processes multiple parts simultaneously
Limitation of traditional language models:
Sequential - one word at a time
Transformers:
Process multiple parts simultaneously
Faster processing
For example:
"The cat sat on the mat"
Processes "cat," "sat," "on," "the," and "mat" at the same time

Let's practice!
Attention
mechanisms
Vidhi Chugh
Attention mechanisms
Understand complex structures
Focus on important words
Book reading analogy:

Clues in a mystery book
Focus on relevant content
Concentrate on crucial input data

Self-attention and multi-head attention
Self-attention
Weighs the importance of each word
Captures long-range dependencies
Multi-head attention
Next level of self-attention
Splits input into multiple heads with each head focusing on different aspects
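
"Weighs the importance of each word" can be illustrated with scaled dot-product self-attention. This sketch makes strong simplifying assumptions: queries, keys, and values are all the raw input vectors (real transformers apply learned projections first), there is a single head, and the three word vectors are invented.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = input."""
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    weights = [
        softmax([sum(q * k for q, k in zip(query, key)) / scale for key in vectors])
        for query in vectors
    ]
    # Each output is a weighted average of all input vectors.
    outputs = [
        [sum(w * v[d] for w, v in zip(row, vectors)) for d in range(dim)]
        for row in weights
    ]
    return weights, outputs

# Hypothetical 2-D vectors for three words; real models learn these.
words = ["boy", "he", "store"]
vectors = [[1.0, 0.2], [0.9, 0.3], [-0.5, 1.0]]
weights, outputs = self_attention(vectors)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Multi-head attention would run several such computations in parallel, each on a different learned projection of the input, and merge the results.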

Attention in a party
Attention: Self and multi-head
Example:
Group conversation at a party
Selective attention to relevant speaker
Filter noise
Focus on key points
1 Freepik

Party continues
Self-attention
Focus on each person's words
Evaluate and compare their relevance
Weigh each speaker's input
Combines for a comprehensive understanding
Multi-head attention
Split attention into "multiple" channels
Focus on different aspects of conversation
Speaker's emotions, primary topic, and related side-topics
Process each aspect and merge

Multi-head attention advantages
"The boy went to the store to buy some groceries, and he found a discount on his favorite cereal."
Attention: "boy," "store," "groceries," and "discount"
Self-attention: "boy" and "he" -> same person
Multi-head attention: multiple channels

Character ("boy")
Action ("went to the store," "found a discount")
Things involved ("groceries," "cereal")

Let's practice!
Advanced fine-tuning
Vidhi Chugh
Where are we?

Reinforcement Learning from Human Feedback
Pre-training
Fine-tuning
Reinforcement Learning from Human Feedback (RLHF)

Pre-training
Large amounts of text data:
Websites, books and articles
Transformer architecture
Learns general language patterns, grammar, and facts
Next-word prediction
1 Freepik

Fine-tuning
N-shot training
Small labeled dataset for related task

But, why RLHF?
General-purpose training data lacks quality
Noise
Errors
Inconsistencies
Reduced accuracy
Example of reduced accuracy:
Trained on data from online discussion forums
Unvalidated opinions and facts
Needs external expert validation

Starts with the need to fine-tune
Pre-training
Learns underlying language patterns
Doesn't capture context-specific complexities
Fine-tuning
Quality labeled data improves performance
Enter RLHF!
Human feedback

Simplifying RLHF
Model output reviewed by human
Updates model based on the feedback
Step 1:
Receives a prompt
Generates multiple responses

Enter the human expert
Step 2:
Human expert checks these responses
Ranks the responses based on quality

Accuracy
Relevance
Coherence

Time for feedback
Step 3:
Learns from expert's ranking
To align its future responses with their preferences
And it goes on!

Continues to generate responses
Receives expert's rankings
Adjusts the learning

Recap
Pre-training to learn general language knowledge
Fine-tuning for specific tasks
RLHF techniques to enhance fine-tuning through human feedback
Combination is highly effective!

Completing the LLM

Let's practice!
Data concerns and
considerations
Vidhi Chugh
Data considerations
Data volume and compute power
Data quality
Labeling
Bias
Privacy

LLMs need a lot of data
Similar to a child learning to talk
570 GB, ~1.3 million books
1 Freepik

LLMs need a lot of data
Similar to a child learning to talk
570 GB, ~1.3 million books
Extensive computing power; think of the energy consumption
Can cost millions of dollars!

Data quality
Quality data is essential
Accurate data = better learning = improved response quality = increased trust
A child learning to talk

Gibberish-in -> gibberish-out

Labeled data
Correct data label: accurate learning, generalize patterns, accurate responses
Labor-intensive: assigning correct label to each article
Incorrect labels impact model performance
Address errors: identify -> analyze -> iterate

Data bias
Influenced by societal stereotypes
Lack of diversity in training data
Discrimination and unfair outcomes
Spot and deal with the biased data

Evaluate data imbalances
Promote diversity
Bias mitigation techniques: more diverse examples
Example:
"The nurse said that..." -> "she" or "her"

Data privacy
Compliance with data protection and privacy regulations
Privacy is a concern
Training on data without permission can lead to a breach
Legal, financial and reputational harm
Sensitive or personally identifiable information (PII)
Get permission

Let's practice!
Ethical and
environmental
concerns
Vidhi Chugh
Ethical concerns
Transparency risk
Accountability risk
Information hazards

Transparency risk
Challenging to understand the output
Difficult to identify issues

Bias
Errors
Misuse
Black box
Example: reasoning behind predicting disease outcomes

Accountability risk
Responsibility of LLMs' actions
Who is responsible?
Incorrect and harmful advice
Model developer or the company?
Game without rules

No transparency
No accountability
1 Freepik

Information hazards
Disseminating harmful information
Harmful content generation
Misinformation spread
Malicious use
Toxicity

Information hazards
Harmful content generation
Harmful, offensive, or inappropriate
Prompt or biased training data
Example:
Bullying vs. friendly school environment
Distressing and harmful
Misinformation spread
Generate text on any topic
But, no verification!
Example:
"What's a good diet for losing weight?"
Unsubstantiated diet plan

Information hazards
Malicious use
Bad actors exploiting LLMs
Generate deceptive content
Example:
Fabricated news
Manipulating public and causing unrest
Toxicity
Inappropriate content
Training or through manipulated prompts
Example:
Insensitive response
Stereotype

Environmental concerns
Ecological footprint of LLMs
Substantial energy resources to train
Impact through carbon emissions
1 Freepik

Cooling requires electricity too!
Produce considerable heat that needs cooling
Imagine thousands of laptops overheating

Require complex cooling systems
Adds to environmental impact
Balance the cost and benefits

Use renewable energy
Energy-efficient tech
1 Freepik

Let's practice!
Where are LLMs
heading?
Vidhi Chugh
Journey so far

Model explainability
How do they arrive at their outputs?
Road-trip planning
Why this particular route?
Why these specific spots?
Builds trust and transparency
Identify and correct the biases or errors

1 Freepik

Efficiency
Computational efficiency
High-quality output with less compute
Faster and efficient

Model compression
Optimization
Benefits: better storage, lower energy use
Accessibility and sustainability

Promotes green AI
Reduces operating costs
1 Freepik

Unsupervised bias handling
Biased data -> discrimination
Unsupervised bias handling
Bias detection and mitigation techniques, automatically
No need for explicit human-labeled data
Identifies and reduces bias by analyzing patterns
Challenge
Subtle, difficult to detect
Might introduce new biases

Enhanced creativity
Creativity in text-based and visual art forms
Artistic content: learned patterns, not emotional understanding
Lack human-like comprehension of art or emotions
Demonstrate human-like emotional behavior
Future: emotion inference
1 https://arxiv.org/pdf/2302.09582.pdf

Let's practice!
Time to wrap up
Vidhi Chugh
How far we have come!
LLMs transforming interaction with technology

How far we have come!
Substantial data requirements
Challenges and risks - privacy, ethics, and environmental implications
Future research and development

There is more to it
Entire teams devoted to understanding LLMs
Exciting times ahead
Stay updated with the latest developments

More on data ethics
Introduction to ChatGPT

Congratulations!
What is generative
AI?
GENERATIVE AI CONCEPTS
Daniel Tedesco
Data Lead, Google
We've long dreamed of tools that can create
In ancient stories... ...and modern virtual worlds
Unparalleled creative tools
AI Images AI Chatbots
1 Cosmopolitan Magazine, Anthropic PBC
What is generative AI?
Machine learning models that generate new content
1 Google Bard
1 Facebook Make-a-Video
1 Replit Ghostwriter
How does it work?
Create images
1 Cosmopolitan Magazine using the Dall-E model
Hold conversations
1 Facebook's LLaMA model
Input more than text
1 Runway ML's InPainting Tool, https://runwayml.com/inpainting/
Real-world applications
Generative AI will impact a variety of industries and functions:
Sales: draft sales outreach emails
Finance: analyze financial data

Marketing: generate marketing ads to test
Legal: Simply explain complex regulations
Education: Customize learning for individual students
Medical: Read and analyze medical data
Industrial: Automate repetitive tasks for industrial engineering and design
Games & Entertainment: Create 3D models and scenes
The end of work?
Lots of implementation challenges and risks
Still, lots of opportunity to utilize
Course goals
We'll learn how these models:
Generate content
Present new legal and ethical considerations
Impact society in the coming years
Let's practice!
Generative AI in the
machine learning
landscape
Daniel Tedesco
Data Lead, Google
Models that analyze
Discriminative models
Answer closed-ended questions
Learn from training data
Guess correct answer or categorize
1 Wikimedia Commons
Bagels and puppies
1 Wikimedia Commons
Guessing with confidence
1 Puppy image from DALL·E 2
Models that imagine
Generative models
Guess data for a prediction
Still require training

Generate new content
1 Puppy image from Dall-E 2
Mixing for effect
Generative AI:
Combines generative models with other ML
Models must work together like parts of a machine
Produce complex creative work
1 Cosmopolitan Magazine
Generative adversarial networks (GANs)
Generators try to trick discriminators

Compare notes and get better in multiple rounds
Bagel Puppy GAN
1 https://twitter.com/teenybiscuit/media
Artificial general intelligence (AGI)
An AI that exhibits intelligence like a human would
Scope of knowledge
Reasoning across domains

Social skills
Creative thinking
Other cognitive competencies (vision, language)
Use the right tool for the job
Discriminative Models
Predict tomorrow's weather
Categorize books
Determine if a picture is a puppy or a bagel
Generative AI
Write code for a website
Answer unique customer service questions
Draw a picture of a cat scuba diving
Artificial General Intelligence
Complete traditionally human jobs
Let's practice!
The evolution of
generative AI
Daniel Tedesco
Data Lead, Google
Generative AI burst on the scene in 2023
1 Yahoo Finance
Key factors driving development
Several factors drive generative AI development:
Computing power
Dataset availability
Competitive interests
Model design
Computational power allowed large models
Parallelization and specialized hardware

Graphics Processing Units (GPUs)
Tensor Processing Units (TPUs)
Cloud computing
Hardware-software optimization
1 Compute Trends Across Three Eras of Machine Learning, https://arxiv.org/abs/2202.05924
Models improved with massive datasets
Global Datasphere Growth
1 IDC's Global DataSphere, 2021
Competitive pressures encouraged faster development
Commercial Political
GANs unleashed high quality generation
Transformers brought context and coherence
'it' refers to 'animal' 'it' refers to 'street'
1 https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
Transformers brought context and coherence
Transformers:
Grasp the context of a given text
Analyze relationships between words
Generate responses that feel natural and informative
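As a rough illustration of how a transformer can analyze relationships between words, the sketch below computes scaled dot-product attention weights for the word "it" against two candidate nouns. The two-dimensional word vectors and the helper names (softmax, attention_weights) are made up for this example; real models learn much larger vectors, plus separate query and key projections, during training.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product scores of one query vector against each key vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Hypothetical 2-d word vectors, hand-picked for illustration only.
words = ["animal", "street", "it"]
vectors = {
    "animal": [1.0, 0.2],
    "street": [0.1, 1.0],
    "it": [0.9, 0.3],
}

weights = attention_weights(vectors["it"], [vectors[w] for w in words])
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

With these hand-picked vectors, "it" puts more weight on "animal" than on "street". In a trained model, the surrounding sentence is what shifts that weight one way or the other.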
RLHF engaged user feedback
Reinforcement Learning from Human Feedback (RLHF):
Reinforcement learning trains models through trial-and-error
Human feedback comes from users scoring model responses
RLHF engaged user feedback
1 Midjourney
Let's practice!
Model design and
data collection
Daniel Tedesco
Data Lead, Google
Know how to fill the tank
1 GM Fairfax Assembly Plant
Developing a model
Model Development Steps
1. Research and design
2. Training data collection

3. Model training
4. Model evaluation
Stable Diffusion's research and development
Example output from Stable Diffusion Stable Diffusion's R&D
Purpose: Decide on image generation
Architecture: Settle on diffusion model

Resources: 256 GPUs, 150k hours, $600k
1 Stability AI, Emad Mostaque Twitter post
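A quick back-of-the-envelope check on the resource figures above. It assumes the 150k hours are total GPU-hours summed over all 256 GPUs and that the $600k is the compute bill; both are interpretations of the slide, not something it states.

```python
gpus = 256
gpu_hours = 150_000       # assumption: total GPU-hours across all devices
total_cost_usd = 600_000  # assumption: compute cost only

cost_per_gpu_hour = total_cost_usd / gpu_hours
wall_clock_hours = gpu_hours / gpus  # if all GPUs run in parallel the whole time
wall_clock_days = wall_clock_hours / 24

print(f"${cost_per_gpu_hour:.2f} per GPU-hour")
print(f"{wall_clock_hours:.0f} hours, about {wall_clock_days:.1f} days of wall-clock time")
```

Under those assumptions the figures work out to roughly $4 per GPU-hour and a little over three weeks of continuous training.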
Data collection: not your typical ML model
Training data preparation
Massive amounts required
Diverse, context-rich data

Requires preprocessing
1 Laion blog
Data collection: privacy and security are critical
Training data preparation
Personally Identifiable Information (PII)
Anonymize or aggregate
Store in secure location with controlled access
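One simple way to anonymize text before it enters a training set is pattern-based redaction. The sketch below is a minimal illustration, not a complete PII solution: the two regular expressions and the redact_pii helper are assumptions made for this example, and they only catch e-mail addresses and phone-like digit runs.

```python
import re

# Illustrative patterns only; real pipelines need far broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_pii(text):
    """Replace e-mail addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

sample = "Contact Ana at ana.lopez@example.com or +1 (555) 010-2233."
print(redact_pii(sample))
```

Note that the name "Ana" survives the redaction. Names, addresses, and free-text identifiers generally need dedicated tooling or human review.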
Let's practice!
Model training
Daniel Tedesco
Data Lead, Google
Pick your mode of train-sportation
1. Hardware
2. Time
Dataset size
Model complexity
Rounds of training
3. Cost
Graduate to advanced techniques
A foundation generative AI model is a first step
Advanced techniques specialize it for specific contexts
Key advanced techniques:
1. Transfer learning and fine-tuning
2. RLHF
3. Custom embeddings
From cats to lions
Transfer learning transfers knowledge from one task to another
Fine-tuning is a type of transfer learning that adapts a pretrained model using a small dataset
1 Creative Commons Attribution-Share Alike 4.0, Bing image generator
Where does your feedback go?
1 Google Bard, 2023
Thumbs up for better responses
1 Google Bard, 2023
Embeddings as fingerprints
Similar to recognizing a fingerprint
Unique representations of data entities
Capture meaning, context, and relationships in compact form
1 Wikimedia commons
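To make the fingerprint analogy concrete, the sketch below compares embeddings with cosine similarity, a common way to measure how close two vectors point. The three-dimensional vectors are invented for illustration; real embeddings come from a trained model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, hand-written for this example.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "lion": [0.8, 0.9, 0.2],
    "bagel": [0.1, 0.2, 0.9],
}

cat_lion = cosine_similarity(embeddings["cat"], embeddings["lion"])
cat_bagel = cosine_similarity(embeddings["cat"], embeddings["bagel"])
print(f"cat vs lion:  {cat_lion:.2f}")
print(f"cat vs bagel: {cat_bagel:.2f}")
```

Related entities end up with a higher similarity score than unrelated ones, which is what lets a model match or retrieve items by meaning rather than by exact wording.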
Embedding Dan
1 Daniel Tedesco, 2023
Embedded Dan
1 Daniel Tedesco portraits from vana.com
Let's practice!
Model evaluation
Daniel Tedesco
Data Lead, Google
Why evaluate anyway?
Assess performance and effectiveness of a model:
Measure progress
Rigorous model comparison
Benchmark human performance
Evaluating generative AIs
Quantitative Metrics:
Discriminative model evaluation metrics
Generative model-specific metrics
Human-centric Metrics:
Human performance comparison
Intelligent evaluation
Discriminative model evaluation techniques
Measure performance on well-defined tasks
Pros:
Widely accepted and understood
Easy to calculate and compare
Cons:
Do not capture subjective nature of generated content
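As an example of a well-defined task where these metrics are easy to calculate, the sketch below scores a binary puppy-or-bagel classifier with accuracy, precision, and recall. The labels and predictions are invented, and classification_metrics is a helper written for this illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical data: 1 = puppy, 0 = bagel.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Scores like these compare models cleanly on a labeled task, but they say nothing about whether a generated image or paragraph is any good, which is the limitation noted above.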
Generative model-specific metrics
Customized for particular generative tasks
Pros:
Nuanced criteria, like realism, diversity, and novelty
Many well-known metrics
Cons:
Cannot capture many subjective elements
Often do not generalize
Human performance comparison
Pros:
Benchmarks against human abilities
Demonstrates practical applicability
Cons:
Unfair comparison
Award-winning AIs
Human Competitions Human Standardized Tests
1 https://twitter.com/colostatefair/status/1565486317839863809, OpenAI
The gold standard
Intelligent evaluation by humans or other AIs
Pros:
Captures subjective aspects
Cons:
Slow, costly, and difficult to standardize
Subject to human biases and irregularity
Turing's classic test
Proposed by computer scientist Alan Turing
Human evaluator judges AI-generated content
Passes if evaluator cannot distinguish AI from human
But human behavior is not always the right standard
Let's practice!
Evaluating and
mitigating social
bias
Daniel Tedesco
Data Lead, Google
What do we mean by social bias?
Systematic unfairness in generative AI
Serious societal consequences
Fairness can be subjective
Focus on broadly shared values
Where bias appears
Training data
The model itself
How the model is used
Bias in data
Skewed or unrepresentative information in the training dataset
Bias in models
Pursuing goals that result in biased outcomes
Bias in use
Applying AI in wrong or malicious ways
Identifying bias in data and models
Representation analysis compares how the model refers to different groups
Fairness metrics evaluate models for equal treatment, opportunity, and accuracy across groups
Human audits ask real people to review a model's outputs to identify bias
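One widely used fairness metric is the demographic parity difference: the gap between the rates at which different groups receive a favorable outcome. The sketch below computes it on invented data; the group names, outcomes, and helper functions are assumptions for illustration only.

```python
def selection_rate(outcomes):
    """Share of favorable outcomes (1) in a list of 0/1 decisions."""
    return sum(outcomes) / len(outcomes)

def demographic_parity_difference(outcomes_by_group):
    """Gap between the highest and lowest selection rate across groups."""
    rates = {g: selection_rate(o) for g, o in outcomes_by_group.items()}
    return max(rates.values()) - min(rates.values()), rates

# Hypothetical model decisions (1 = approved) for two groups.
outcomes_by_group = {
    "group_a": [1, 1, 0, 1, 0],
    "group_b": [1, 0, 0, 0, 0],
}

gap, rates = demographic_parity_difference(outcomes_by_group)
print(rates)
print(f"demographic parity difference: {gap:.2f}")
```

A gap of zero means every group is approved at the same rate. How large a gap is acceptable depends on the context, and other metrics (such as comparing error rates per group) can disagree with this one.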
Mitigating bias in data and models
Diversify data collection
Adjust model to prioritize different data

Adversarial training
Continuous improvement
Let's practice!
Copyright and
ownership
Daniel Tedesco
Data Lead, Google
Who won?
The person who wrote the prompt
The company that built the model
The artists whose works trained the model
The AI that generated the art
1 Colorado State Fair
Law vs. AI
Legal landscape is evolving to meet rapid AI advancement:
1. Intellectual property
2. Privacy implications
3. Evolving norms and regulations
Follow IP best practices
Check copyright status of training data
Seek legal guidance about use

Stay informed of regulatory dynamics
Privacy implications with every prompt
Read terms of service: understand how data is stored and used
Consider what we share: user data may be included in future training
Local alternatives: many generative AIs can be run at home
Evolving norms
Different responses across industries
Norms in one context might not apply in another
Evolving regulations
Differ across jurisdictions
May depend on location of users, servers, and developers

Stay informed as landscape rapidly evolves
Let's practice!
Responsible
generative AI
applications
Daniel Tedesco
Data Lead, Google
On the eve of the election
Types of malicious use
Deepfakes
Misinformation campaigns
AI-enhanced hacking
1 Pablo Xavier
Detection and prevention
Key usage principles
Human-in-the-loop
Harm prevention
Continuous monitoring
Points of Detection and Prevention
Access
AI can unintentionally aid criminal groups, even in their non-criminal activities.
Avoid supporting malicious groups
Know Your Customer (KYC)

Verify user identity
Prompts and responses
Moderating prompts:
Similar to website or chat group moderation
Jailbreaking prompts can still subvert developer guidelines
Moderating responses:
Screen or filter responses before showing them to the user
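The two moderation points can be sketched as a wrapper that checks the prompt before calling the model and checks the response before showing it. Everything here is hypothetical: the blocklist, the messages, and toy_model (a stand-in that simply echoes the prompt). Production systems typically use trained classifiers rather than keyword matching.

```python
# Hypothetical blocklist, for illustration only.
BLOCKED_TERMS = ("steal a password", "make a weapon")

def is_allowed(text):
    """Return True when the text contains none of the blocked terms."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

def moderated_reply(prompt, generate):
    """Check the prompt first, then check the response before showing it."""
    if not is_allowed(prompt):
        return "Sorry, this request cannot be processed."
    response = generate(prompt)
    if not is_allowed(response):
        return "Sorry, the response was withheld."
    return response

def toy_model(prompt):
    """Stand-in for a generative model: it simply echoes the prompt."""
    return f"You asked: {prompt}"

print(moderated_reply("How do I bake a bagel?", toy_model))
print(moderated_reply("How do I steal a password?", toy_model))
```

Keyword checks like this are easy to evade with rephrasing, which is one reason jailbreaking prompts remain a problem.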
Applications
Malicious actors can apply benign responses to illegal or unethical activity.
Invisible watermarks can help determine source of content
May require law enforcement intervention
Communication and feedback
Clear usage guidelines
Feedback loops
User studies and stakeholder roundtables
Partner with civil society organizations
Feedback opportunities in product
Let's practice!
Artificial general
intelligence (AGI)
Daniel Tedesco
Data Lead, Google
Revisiting AGI
An AI that exhibits intelligence like a human would:
Scope of knowledge
Reasoning across domains

Social skills
Creative thinking
Other cognitive competencies (vision, language)
Immense pros
Productivity
Research progress
Engineering solutions
Companionship and wisdom
Severe cons
Negative economic disruption
Malicious use
Value alignment problems
Existential catastrophe
The safety debate
AGI can empower
AGI can have negative consequences
Controlling AGI outcomes
Requirements for aligning AGI and human values:
Clear rules and expectations
Constructive feedback
1. Hard constraints
2. Alignment strategies
3. Government intervention
Hard constraints
1. Boxing restricts access to the wider world
2. Interruptibility adds a stop or off switch
Alignment strategies
Iterative development
Constitutional AI
Multi-stakeholder engagement
1 Dave Gray
Government intervention
CEOs of various AI companies meeting with UK PM Rishi Sunak in 2023
Beneficial regulations
Safety regulations
Rules for testing and oversight
Transparency standards
International collaboration
1 UK Prime Minister
Let's practice!
Bringing new AI into
old workflows
Daniel Tedesco
Data Lead, Google
Meeting our "replacement"
Advantages and limitations
Advantages:
Knowledge of trained fields
Very fast
Inexpensive
Limitations:
Hallucination and potential bias
No common sense
Implementation challenges
Augmentation
Co-creation
Replacement
A novel implementation
Augmentation: AI suggests edits, human decides
Co-creation: AI and human collaboratively write a novel
Replacement: AI generates and publishes social media posts
Identify opportunity
Decompose the process
Test an AI solution
Scale up
A new way of working
Treat AI as partners rather than competitors
Prepare for lifelong learning, with AI support
Be patient with integration
Let's practice!
Progress in
generative AI
Daniel Tedesco
Data Lead, Google
A collaborative effort
Universities
Governments and civic organizations
Open-source communities
Startups and large companies
Universities
New research, such as invention of GANs
New researchers
Partnerships with other sectors
1 University of Montreal website
Governments and civic organizations
Governments
Establish and enforce regulatory environment
Fund basic research
Civic institutions
Provide independent analyses
Resources, such as datasets
Project funding
1 National Defense Magazine, https://www.nationaldefensemagazine.org/articles/2016/5/11/darpa-shows-off-technology-at-demo-day
Open-source communities
Some generative AI Open-source projects
Provide open access to tools and models
Lower barriers to entry, experimentation, and sharing
Are difficult to sustain and maintain quality
Raise risks of misuse
Startups and large companies
Seek competitive advantage
Bring generative AI to broad adoption
Showcase advances to attract talent and impress investors
Large companies additionally:
Fund research, such as introduction of transformers by Google
Acquire startups
Offer hardware and cloud resources
The openness challenge
Pros:
Developer support
Talent attraction
Broader feedback
Cons:
Lose competitive advantage
Risk liability for misuse
The boundaries of generative AI development
Accelerators:
Decreasing hardware costs
Research developments
Competitive and geopolitical pressures
Decelerators:
Technological limits
Overbearing regulation
Closed ecosystems
Limited economic resources
Let's practice!
Preparing for a
future of generative
AI
Daniel Tedesco
Data Lead, Google
Do more with less
Individuals become teams
Small teams create big things
Bureaucracies become more streamlined
The AI divide
Access: availability, cost
Literacy: mindset, capability
1 International Telecommunications Union, 2023
Education and jobs
Replacement, augmentation, and co-creation, too
Reshape around generative AI
In education: access to AI, move away from memorization
In the workplace: support from AI partner
Difficult transition
1 Various headlines from BBC, The Economist, Forbes, Business Insider, and Gitnux
Media and entertainment
Creative explosion
Personalized media
Requires new forms of trust
Which is real and which is AI-generated?
1 https://www.reddit.com/r/midjourney/comments/12uij2l/one_is_a_real_photo_and_one_is_ai_generated_can/
Science and technological progress
Faster discoveries
Faster technology transfer
Human direction still needed
1 Deepmind website, https://www.deepmind.com/research/highlighted-research/alphafold
Values: do they think and feel like us?
Let's practice!
You made it!
Daniel Tedesco
Data Lead, Google
Congratulations
Four chapters of fun
Chapter 1: Got to know generative AI
Chapter 2: Learned how these models are developed

Chapter 3: Learned how to use generative AI and its content responsibly
Chapter 4: Got ourselves ready for the Age of Generative AI
The learning just started
Explore more DataCamp courses:
Introduction to ChatGPT
Large Language Models (LLMs) Concepts
Experiment with generative AI in your own workflows
Stay up to date by following topical sources:

DataCamp's DataFramed Podcast
My podcast: www.youtube.com/@thecraftpodcast and Twitter @dtedesco1
Congratulations!
AI ethics: What's the
buzz?
AI ETHICS
Joe Franklin
Associate Data Literacy and Essentials Manager, DataCamp
Intro to ethics
AI growth and the surge in public attention
AI ethics: A crucial discussion
Definition of ethics: Guiding behavior based on moral principles
The intersection of AI and ethics
1 Icons made by surang & Flowicon from www.flaticon.com
AI meets ethics
AI revolutionizes various sectors:
Healthcare
Media
Insurance
Examples
AI in healthcare: improves surgical accuracy, early disease detection
AI in finance: automates processing, fraud detection
1 Icons made by Freepik, wanicon, Pixel perfect from www.flaticon.com
Why AI ethics?
Risks of unchecked biases, illustrated by an insurance claim denial scenario
Human influence on AI
Biases seep into decision-making
Wider impact
Legal professions, judiciary, public decision-making
Ethics in practice
Aligning AI systems with ethical principles
Example:
Fairness in insurance model
Equal treatment for all claims
Guidelines in insurance, finance, and banking sectors
Ethical boundaries for fair, unbiased results
The potential of ethically built AI
1 Icons made by Freepik & noomath from www.flaticon.com
The big picture
AI ethics: Beyond avoiding harm or bias
Importance of accountability: Who is responsible for AI's outcomes and construction?
The role of transparency: Understanding AI's decision-making process
Need for transparency and literacy in AI development and results
1 Icons made by Freepik & Wichai.wi from www.flaticon.com
Wrapping up
AI ethics: The guiding beacon in an AI-driven world
Ensures benefits of AI without compromising moral values
Ethical AI: Not just good practice, but good business
Let's practice!
Digging deeper: AI
ethics principles
Joe Franklin
Manager, DataCamp
Meet MedTech Innovations
MedTech Innovations
Healthcare company
Using AI to improve patient care and uphold ethics
Personal reflections
Consider the application of AI ethics in...
Personal life
Career
Known businesses
The principle of fairness
MedTech's AI in patient care
Personalized treatment plans
Challenge:
Unintentional bias in AI systems
Potential discrimination
Principle of fairness
Equal treatment
Avoidance of discrimination
The principle of accountability
Scenario: MedTech's AI system mistake affecting patient treatment
Principle of accountability:
Someone should always be accountable for AI outcomes
The principle of transparency
Scenario: MedTech's AI system recommends a specific treatment
Challenge: Understanding why AI made the decision
Principle of transparency:
Decisions by AI should be explainable and comprehensible
Sharing knowledge and information across different stakeholders
Applying AI ethics
Fairness:
Continual testing of AI systems to detect and rectify bias
Accountability:
Clear responsibilities defined for each AI system's outcomes
Transparency:
Make AI systems explainable and understandable
Commitment:
Ethical adherence builds trust, mitigates risks
Why do they matter?
Understanding & applying principles: Ensures ethical AI use
Building trust: Transparency and accountability foster patient trust
Mitigating risks: Ethical AI use reduces potential risks
Promoting AI: Transparency and knowledge-building enhance societal trust and utilization of AI
1 Icons made by Freepik & Smashicons from www.flaticon.com
Let's practice!
AI ethics: where's the
line?
Joe Franklin
Manager, DataCamp
The privacy-personalization paradox
AI personalizes user experiences, enhancing appeal
The privacy-personalization paradox:
Personalization can compromise user privacy
Solution:
AI literacy
Clear privacy policies
Example: Spotify
The bias-fairness conundrum
Bias-fairness conundrum:
AI learns from data that can carry societal biases
Result:
AI may unintentionally amplify these biases
Example:
Early versions of ChatGPT
Solution:
Train AI models with fairer, bias-free data
The transparency-complexity trade-off
Transparency-complexity trade-off:
Complex AI models lack transparency but are often more accurate
Simpler models are more transparent but often less accurate
AI literacy is vital for comprehending models and their ethical implications
1 Icon made by Freepik from www.flaticon.com
The autonomy-control dilemma
Autonomy-control dilemma:
AI can act autonomously but might operate outside human control
Question:
Should we prioritize autonomy or control?
No one-size-fits-all answer
Example:
Tesla's Autopilot system emphasizes driver vigilance and readiness to take control
Navigating the challenges
Navigating ethical dilemmas in AI requires thoughtful trade-offs
Importance of human element in decision-making
Striving for better decisions in complex situations
Need for diverse stakeholders' involvement and continuous AI monitoring
Let's practice!
Unpacking the
black box:
Transparency
Joe Franklin
Llama Enthusiast
Black-box nature
AI implementations are often black boxes
A black box in AI:
Known inputs and outputs, but opaque internal workings
Ambiguousness is non-ideal
Ambiguity in AI: Ethical challenge
Question of trust:
Can we validate AI decisions without understanding them?
Transparency:
Making an AI's decision-making process understandable
Example:
Factors in AI sales model
Throughout the AI life cycle
Transparency in AI involves all stages of the AI life cycle
Purpose:
Understand the workings of the AI system
Gauge comfort level with its operation
A deciding factor
Current state:
Transparency in AI is uncommon
Hesitation in AI adoption
Future implications:
Transparency will become a deciding factor in users' choice of AI systems
Actionable:
Organizations should prioritize transparency
1 Icon made by Eucalyp from www.flaticon.com
Openness is key
Openness about AI challenges and learnings is key
Transparency encourages innovation in AI
It leads to more advanced, reliable AI systems
Embracing transparency in AI
Transparency in AI can be intimidating but is beneficial for businesses
Transparency leads to predictable regulations and public perception
Companies can compete based on strengths, culture, customer relationships rather than secrecy
Let's practice!
AI fairness: not just a
dream
Joe Franklin
Manager, DataCamp
Fairness in AI
Fairness: Ensure no group is favored over another
Concerns race, gender, socioeconomic status, etc.
AI should predict patient outcomes equitably
There should be no bias towards any specific group
Why does fairness matter?
AI's rapid processing can result in large-scale impacts
Fairness prevents negative targeting of vulnerable populations
Essential for responsible AI implementation, ensures equitable consideration for all
1 Icons made by noomtah & Parzival' 1997 from www.flaticon.com
Promoting fairness
Fairness promotion is challenging but possible
Fairness through unawareness reduces potential bias by omitting certain variables
Variables include race, gender, age, socioeconomic status, sexual orientation, religion
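Omitting variables can be as simple as dropping the sensitive fields from each record before it reaches the model. The sketch below assumes a hypothetical applicant record and a hand-written PROTECTED set; neither comes from the course.

```python
# Hypothetical set of sensitive attributes to withhold from the model.
PROTECTED = {"race", "gender", "age", "religion"}

def drop_protected(record):
    """Return a copy of the record without the protected attributes."""
    return {key: value for key, value in record.items() if key not in PROTECTED}

# Invented example record.
applicant = {"age": 42, "gender": "F", "postcode": "64101", "income": 52000}

features = drop_protected(applicant)
print(features)
```

The remaining fields can still act as proxies: a postcode, for instance, may correlate with a protected attribute, so dropping columns alone does not guarantee a fair model.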
Unintentional issues exist
Even with unawareness, unintentional bias can still occur
Robust strategies needed to ensure fairness
1 Icons made by Freepik from www.flaticon.com
Minimizing bias
The main objective of AI fairness is minimizing bias
The first step is acknowledging bias exists
Remain skeptical and vigilant of AI
Conduct frequent monitoring and audits for fairness
Let's practice!
Safeguarding AI:
Accountability
Joe Franklin
Manager, DataCamp
Define accountability
Accountability:
Assigning responsibility for AI outcomes
Critical in AI's development, deployment, and use
AI isn't a responsibility-evading "magic wand"
Accountability is vital
People trust AI systems more when there is accountability
Accountability ensures ethical use and mitigates potential harm
Accountability means not absolving humans from responsibility
The paradox of accountability
Increasing AI accountability can improve trust
Yet, excessive trust in AI can lead to misguided decisions
Example:
Georgia Tech study where participants followed misguided robot guidance
The Tesla story
Misunderstanding of Autopilot's capabilities among consumers
Criticism of Tesla's insufficient safeguards
Both Tesla and consumers share responsibility
Achieving accountability
AI producers:
Achieving accountability involves transparency and solving the 'Black Box' problem
Attributing responsibility is key
AI consumers:
'Trust but verify'
Producers and consumers both play a role in creating ethical AI
Challenges are opportunities for innovation
1 Icons made by Eucalyp & Sumitsaengtong from www.flaticon.com
No one-size-fits-all
Accountability in AI is a continuous journey
With each AI advancement, the accountability conversation evolves
No one-size-fits-all approach; varies across industries
Let's practice!
Explainable AI
Joe Franklin
Manager, DataCamp
What's explainable AI?
AI systems whose internal workings can be understood by humans
Goal: Making AI's decision-making clear, understandable, and explainable
Helps understand why and how AI makes decisions
Major step towards ethical AI usage
1 Icon made by vectorsmarket15 from www.flaticon.com
The central pillars
Transparency, fairness, accountability are central
AI conclusions should be accessible and logical to humans
Models built with explainability at their core
Uses interpretable models like decision trees or linear regression
Power in seeing the process, despite possibly lower performance
1 Icons made by juicy_fish & Becris from www.flaticon.com
How does it work?
Local Interpretable Model-agnostic Explanations (LIME)
LIME acts as a translator that helps the model communicate
Creates a simpler version of the model's decision process for a specific prediction
Example:
Explains a movie's hit prediction based on factors like director popularity and high budget
SHapley Additive exPlanations (SHAP)
SHAP: A detective of AI, revealing feature importance
SHAP in Action
Director: 50%
Cast: 30%
Genre: 15%
Budget: 5%
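SHAP builds on Shapley values from game theory, which credit each feature with its average marginal contribution to a prediction. The brute-force sketch below is not the shap library's API; it computes exact Shapley values for a toy additive "hit score" model whose weights are simply borrowed from the percentages above as hypothetical numbers.

```python
from itertools import permutations

FEATURES = ["director", "cast", "genre", "budget"]
# Hypothetical weights for a toy additive model (not a trained model).
WEIGHTS = {"director": 0.50, "cast": 0.30, "genre": 0.15, "budget": 0.05}

def hit_score(present):
    """Toy model: sum the weights of the features that are present."""
    return sum(WEIGHTS[f] for f in present)

def shapley_values(features, value_fn):
    """Average each feature's marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        seen = []
        for feature in order:
            before = value_fn(seen)
            seen.append(feature)
            totals[feature] += value_fn(seen) - before
    return {f: total / len(orderings) for f, total in totals.items()}

values = shapley_values(FEATURES, hit_score)
for feature, value in values.items():
    print(f"{feature}: {value:.0%}")
```

Because the toy model is purely additive, each feature's Shapley value equals its weight. For real models with interacting features the enumeration grows factorially, which is why practical tools rely on approximations.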
Future of XAI
Many more techniques and approaches exist in XAI
The gap between XAI and traditional AI is shrinking
Ongoing research is improving AI interpretability
Let's practice!
Ethical frameworks
Joe Franklin
Manager
The background story
Numerous ethical frameworks guide AI decision-making
Deontological vs. consequentialist approaches
No universal framework for applying AI ethics
The diversity is actually beneficial
Ethical framework defined
Ethical frameworks provide scaffolding for ethical decisions
Example: AI in healthcare needs to respect privacy and ensure fairness
Organizational benefits
Benefits of ethical framework
Allows foresight in AI decision impact
Provides a clear starting point for AI usage
Ethical frameworks & innovation
AI ethics no longer feared for stifling innovation
Seen as promoting innovation by alleviating ambiguity and challenges
1 Icon made by Prosymbols Premium from www.flaticon.com
Meet AgroTech!
Ethical frameworks vary across industries
AgroTech
New agricultural company, innovating crop harvesting
Ethical framework pillars: environmental sustainability, economic viability, social equity
Guides the development of their Smart Harvester drone series
Meet AgroTech! (2)
Smart harvesters shouldn't focus only on expensive, resource-intensive crops
A potential risk: farmers incentivized to plant specific crops, threatening sustainability
Solutions must also be economically viable to be used effectively
Challenges are unavoidable
Balancing ethical considerations is complex
Cultural and regional variations complicate frameworks
Frameworks guide AI development from a human perspective
Aid in building trust with AI systems
1 Icons made by Flat Icons & Freepik from www.flaticon.com
Let's practice!
The value of ethical
AI
Joe Franklin
Manager, DataCamp
Balancing the scale
AI brings fast decision-making with widespread impact
Responsible AI use prevents crossing unanticipated barriers
Potential brand risk from system misbehavior is significant
AI ethics isn't optional
AI ethics: a necessity, not an accessory
Balances the immense benefits and potential pitfalls of AI
Case study: financial services industry
AI is indispensable
Absence could lead to disastrous consequences
Bring in tangible impacts
Not just a defensive but an offensive strategy
Propels organizations ahead of the curve
Creates trusted entities, enhancing customer loyalty and brand reputation
Can lead to tangible impact on the bottom line
1 Icon made by Vectoricons from www.flaticon.com
New field, new challenges
Recent emergence with large-scale AI use
Challenging to find examples of AI gone bad
Too early in AI's evolution to see many ethical missteps
1 Icon made by ultimatearm from www.flaticon.com
Let's practice!
The future of AI
ethics
Joe Franklin
Manager, DataCamp
Understanding the present
Anticipating future ethical dilemmas
With each AI advancement, new ethical challenges emerge
Questions arise about data privacy, potential bias, and decision-making autonomy
Preparation is key to address future ethical dilemmas
Future of AI ethics is unpredictable, but patterns from history can guide us
The dynamic nature of AI ethics
Evolves with technological advancements
Ethical principles must adapt to new AI applications and societal values
Learn from advancements in data security, privacy, and ethics
Stay alert to emerging trends, new techniques, and potential pitfalls
1 Icons made by Freepik from www.flaticon.com
Ethical AI by Design
Awareness of technology and techniques is a prerequisite
Ethical AI by Design: integrating ethics from initial design stage of AI systems
Safeguards on data collection and storage
Place ethical principles at the forefront of the decision-making process
Ethical AI in practice
Healthcare AI systems:
Ensure transparency and explainability for trust in AI-driven diagnoses
Retail AI systems:
Avoid bias and ensure accountability
Let's practice!
Honing ethics by
design
Joe Franklin
Manager, DataCamp
Deceptively simple
AI Ethics by Design: Consider ethical ramifications of AI in advance
The big ones
Define objectives
Align with stakeholders
Collect and manage data
Design transparently
Evaluate bias
Address concerns
Review and iterate
Defining objectives
The gang's all here
The right data in the right place
Transparency in design
The end is only the beginning
It's a wonderful world out there
Introduction to Data Ethics by Shalini Kurapati
Explore the intersection of ethics and data. Learn valuable skills to collect and manage data ethically.
Forming Analytical Questions by Konstantinos Kattidis
Discover how to ask a good question and connect with stakeholders to drive change with analytics.
Thank you!