Reinforcement Learning (RL): by Abhiram Sharma (19311A12P0)

Reinforcement learning (RL) is an area of machine learning that uses reward-based training to solve complex sequential decision-making problems. RL differs from supervised learning in that the model is only given feedback on the success or failure of its actions rather than step-by-step feedback. This allows RL models to learn strategies for tasks like playing video games to a professional level by evaluating large numbers of possible outcomes and actions. Common RL algorithms include policy-based, value-based, and actor-critic methods. RL has applications in areas like self-driving cars, inventory management, and trading systems due to its ability to simulate systems and learn from failures. While powerful for complex problems, RL also requires large datasets and computation to

Reinforcement Learning (RL)
By Abhiram Sharma (19311A12P0)
Contents
• Introduction
• Reinforcement Learning vs Supervised Learning
• Performing Complex Tasks
• Major Types of RL Algorithms
• Applications
• Advantages
• Disadvantages
• Conclusion
• References
Introduction
• Reinforcement learning (RL) is an area of machine learning (ML) and a highly
interdisciplinary field of study.
• RL combines computer science, probability theory, and cognitive science.
• Cognitive science itself encompasses psychology, philosophy, linguistics,
and neuroscience.
RL vs Supervised Learning
• The main difference between supervised learning and RL: in supervised
learning, the model is given historical data together with the corresponding
ground-truth outputs, so every prediction is trained against its own known
outcome, and one prediction does not affect the next.
• In RL, the model is given only a task and a signal that says whether the
outcome was a success or a failure. Each step affects the next, and the
feedback may arrive only at the end of an episode, so RL models learn a
sequential decision-making process.
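The episode-level feedback described above can be sketched as follows. This is only an illustration under stated assumptions: the four-step episode, the function name discounted_returns, and the discount factor gamma are not from the slides. Discounting is a standard RL device for spreading a late reward back over the earlier steps that led to it.

```python
def discounted_returns(rewards, gamma=0.9):
    """Compute G_t = r_t + gamma * G_(t+1) for every step t of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each step can reuse the return of the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Hypothetical episode: no feedback for three steps, then "success" (reward 1).
episode_rewards = [0.0, 0.0, 0.0, 1.0]
returns = discounted_returns(episode_rewards, gamma=0.5)
print(returns)  # [0.125, 0.25, 0.5, 1.0]
```

Even though only the last step is rewarded, every earlier step receives some credit, with steps closer to the success receiving more. In supervised learning, by contrast, each example would carry its own label and no such propagation would be needed.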
Performing Complex Tasks
• With RL, agents can perform very complex tasks such as playing real-time
strategy (RTS) games (Dota 2 and StarCraft), which require an agent to
evaluate a large number of outcomes, anticipate its opponents' moves, and
choose its next step as part of a sound strategy.
• The AI company OpenAI built bots using RL (OpenAI Five) and pitted them
against professional Dota 2 players. The bots defeated the human players.
• This applies not only to Dota 2 but also to other games that require human
cognitive skills, such as StarCraft and Go, which take years to master
because there are millions of possible moves. The biggest challenge is that
the bots must come up with a better strategy than the professional players
they are playing against.
Here the bots use a move that hits the enemy players inside the circle,
giving their teammates outside the ring time to regain health.
Major Types of RL Algorithms
• There are two major types of algorithm, based on the learning style of
the agent:
• Policy-based learning: the policy is built explicitly and maintained
throughout learning.
• Value-based learning: the policy is implicit and can be derived
directly from the value function.
Actor-critic methods combine policy-based and value-based learning,
leveraging the best of both.
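A minimal sketch of the value-based idea, assuming tabular Q-learning (one well-known value-based method; the slides do not name a specific algorithm). The five-state corridor environment, the function names, and the hyperparameters below are hypothetical and chosen only for illustration. The agent never stores a policy; it learns a table of action values, and the policy is read off that table afterwards.

```python
import random

N_STATES = 5        # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right


def step(state, action):
    """Move along the corridor; reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q[state][action] with tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: random action
            else:
                best = max(q[state])  # exploit: best known action, random tie-break
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Derive the implicit policy: the highest-valued action in each non-goal state."""
    return [max(range(2), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train_q_table()
policy = greedy_policy(q_table)
print(policy)  # [1, 1, 1, 1], i.e. "move right" in every non-goal state
```

A policy-based method would instead keep the policy itself as the learned object and adjust it directly, while an actor-critic method would keep both: a policy (the actor) and a value estimate (the critic) that guides the policy updates.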
Applications
• Complex gameplay

• Self-driving cars

• Inventory management

• Delivery management

• Trading agents
Advantages
• RL can be used to solve complex problems that cannot be solved by
conventional techniques

• RL can be applied to robots

• RL can be run against a simulation of an entire system to test new actions
or approaches

• It changes course when a failure occurs
Disadvantages
• Excessive reinforcement learning can diminish the results

• RL needs a lot of data to learn

• RL is not well suited to simple problems

Conclusion
RL may be the way to make computers perform complex tasks that
otherwise require human cognitive skills and sequential decision-making
abilities, and to perform them with greater accuracy. Despite the many
obstacles the agent faces during training, it tries to find a way around
each obstacle and to achieve a success rate or reward greater than the
previous one. RL could therefore be the gateway to developing computers
that are independent, make their own decisions, and perform much more
complex tasks that need high accuracy.
References
• OpenAI Five defeats Dota 2 world champions
• Deep Reinforcement Learning, CS 285, UC Berkeley
• Stack Exchange: different types of RL algorithms
Any Questions?
Thank You!
