
Reinforcement Learning (RL)
By Abhiram Sharma (19311A12P0)
Contents
• Introduction
• Reinforcement Learning vs Supervised Learning
• Performing Complex Tasks
• Major Types of RL Algorithms
• Applications
• Advantages
• Disadvantages
• Conclusion
• References
Introduction
• Reinforcement learning (RL) is an area of machine learning (ML), which is a
highly interdisciplinary field of study.
• RL combines computer science, probability theory, and cognitive science.
• Cognitive science itself encompasses psychology, philosophy, linguistics,
and neuroscience.
RL vs Supervised Learning
• In supervised learning, the model is provided with historical data and the
corresponding ground-truth outputs. It is trained so that each prediction
corresponds to one outcome, and that outcome does not affect the next one.
• In RL, the model is given only a task and a signal that says whether the
outcome was a success or a failure. Each step affects the next, and the signal
often arrives only at the end of an episode, so RL models learn a sequential
decision-making process.
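The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not taken from the slides: the corridor environment, the function names (`run_episode`, `always_right`, `random_walk`) and the reward rule are all hypothetical. It shows each step changing the state seen by the next step, with a single success/failure signal at the end of the episode.

```python
import random

def run_episode(policy, length=5, seed=0):
    """Walk a hypothetical 1-D corridor; the reward arrives only at the end."""
    rng = random.Random(seed)
    position = 0
    trajectory = []
    for _ in range(length):
        action = policy(position, rng)   # +1 (step right) or -1 (step left)
        position += action               # the next state depends on this step
        trajectory.append((position, action))
    # Single success/failure signal for the whole episode (assumed rule:
    # success means the agent ended exactly at the far end of the corridor).
    reward = 1 if position == length else 0
    return trajectory, reward

def always_right(position, rng):
    """Hypothetical policy that always moves toward the goal."""
    return 1

def random_walk(position, rng):
    """Hypothetical policy that ignores the state and moves at random."""
    return rng.choice([-1, 1])

trajectory, reward = run_episode(always_right)
print(len(trajectory), reward)
```

In a supervised setting, each of the five steps would come with its own correct label; here the agent sees one number for the whole trajectory and has to work out which steps deserved the credit.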
Performing Complex Tasks
• With RL, agents can take on very complex tasks such as playing real-time
strategy games (Dota 2 and StarCraft), which require an agent to evaluate
a large number of outcomes and choose its next step after anticipating its
opponents' moves, building a strong strategy as it goes.
• An AI company, OpenAI, built bots using RL and pitted them against
professional players. The bots defeated the human players.
• RL has been applied not only to Dota but also to other games that require
human cognitive skills, such as StarCraft and Go, which take years to master
because there are millions of possible moves. The biggest challenge is that
the bots must come up with a better strategy than the professional players
they are playing against.
In the pictured example, the bots use a move that hits the enemy players
inside the circle, giving teammates outside the ring time to regain health.
Major Types of RL Algorithms
• There are two major types of algorithms, based on the learning style of
the agent.
• Policy-based learning: the policy is represented explicitly and kept
in memory during learning.
• Value-based learning: the policy is implicit and can be derived
directly from the value function.
Actor-critic methods combine policy-based and value-based learning,
leveraging the best of both.
Applications
• Complex gameplay

• Self driving cars

• Inventory management

• Delivery management

• Trading agents
Advantages
• RL can be used to solve complex problems that cannot be solved by
conventional techniques

• RL can be implemented on robots

• RL can simulate an entire system and test new actions or approaches;
it changes course when a failure happens
Disadvantages
• Too much reinforcement learning can diminish the results

• RL needs a lot of data to learn

• RL is not well suited to simple problems, where simpler methods suffice


Conclusion
RL might be the way to make computers perform complex tasks that
otherwise require human cognitive skills and sequential decision-making
abilities, and to perform them with greater accuracy. Despite the many
obstacles it faces during training, the agent tries to find a way around
each obstacle and to achieve a success rate or reward greater than the
previous one. RL can therefore be a gateway to developing computers that
are independent, make their own decisions, and perform much more complex
tasks that need high accuracy.
References
• OpenAI Five defeats Dota 2 world champions
• Deep Reinforcement Learning, CS 285, UC Berkeley
• Stack Exchange: different types of RL algorithms
Any Questions?
Thank You!
