
CASE STUDY ON DQN AGENT

SECTION: 418-B

NAME UID
VEDANT 23BCS11830
RUDRANSH SHARMA 23BCS11824
DIVYANSH PANDIT 23BCS11898
ARUSHI 23BCS11810
Introduction to Deep Q-Network (DQN)
Deep Q-Network (DQN) is an influential reinforcement learning algorithm that merges deep
neural networks with the Q-learning procedure. DQN has taken the field of artificial
intelligence to the next level by allowing agents to perform outstandingly in complex
decision-making tasks while learning from raw sensory data.
Reinforcement Learning and Markov Decision Processes

1 Reinforcement Learning
This machine learning paradigm occurs when an agent learns by interacting with its
environment under explicit rewards and penalties.

2 Markov Decision Process
A mathematical framework for modeling decision-making in dynamic, stochastic
environments, where the future state depends only on the current state and action.

3 Optimal Policy
In reinforcement learning, the aim is to seek the best policy: the one that maximizes the
total reward over the future.
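To make the framework concrete, here is a minimal sketch of a toy MDP in Python. The states, actions, transition probabilities, and rewards below are purely hypothetical and not part of this case study; the key point is that step() uses only the current state and action, which is the Markov property.

import random

# A tiny, hypothetical MDP with named states and actions.
STATES = ["cool", "warm", "overheated"]
ACTIONS = ["slow", "fast"]

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "cool": {"slow": [(1.0, "cool", 1.0)],
             "fast": [(0.5, "cool", 2.0), (0.5, "warm", 2.0)]},
    "warm": {"slow": [(0.5, "cool", 1.0), (0.5, "warm", 1.0)],
             "fast": [(1.0, "overheated", -10.0)]},
    "overheated": {"slow": [(1.0, "overheated", 0.0)],
                   "fast": [(1.0, "overheated", 0.0)]},
}

def step(state, action):
    """Sample the next state and reward; they depend only on (state, action)."""
    r, cumulative = random.random(), 0.0
    for prob, next_state, reward in transitions[state][action]:
        cumulative += prob
        if r <= cumulative:
            return next_state, reward
    return next_state, reward  # fall back to the last listed outcome

# A random policy interacting with the environment for a few steps.
state, total_reward = "cool", 0.0
for _ in range(10):
    action = random.choice(ACTIONS)
    state, reward = step(state, action)
    total_reward += reward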
The Q-Learning Algorithm

Q-Learning
Q-learning is a simple way to teach a computer to make choices without a model of the
environment. It learns a function, Q(s, a), that tells how good it is to pick an action in a
state.

Update Rule
The Q-learning update rule lets the computer improve its Q-values as it learns from the
rewards it gets and its guesses of future rewards.

Convergence
With enough exploration of every state-action pair and a suitable learning rate, tabular
Q-learning gradually settles on the optimal Q-values.
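As a rough illustration of the update rule described above (not code from the case study), a single tabular Q-learning step can be written as follows; the learning rate alpha and discount factor gamma are assumed example values.

import numpy as np

def q_learning_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular step: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * np.max(Q[next_state])  # reward plus discounted best guess of the future
    td_error = td_target - Q[state, action]             # how far off the current estimate is
    Q[state, action] += alpha * td_error                # nudge the estimate toward the target
    return Q

# Example with a hypothetical 5-state, 2-action Q-table.
Q = np.zeros((5, 2))
Q = q_learning_update(Q, state=0, action=1, reward=1.0, next_state=2)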
Limitations of Q-Learning

High-Dimensional Inputs
Q-learning finds it hard to work with vast inputs like raw image data, because the Q-table
grows too big too quickly.

Instability
The method can be shaky, easily thrown off, and picky about settings, especially when
combined with function approximation.

Limited Scalability
Its Q-table representation limits how well it can handle big, complex state spaces, pushing
the need for stronger approximation methods.
Introducing Deep Q-Network

1 Deep Neural Networks
DQN uses a deep neural network to approximate the Q-values, removing the limits of the
old Q-table.

2 Improved Stability
DQN brings in tricks like experience replay and a stable target network to keep learning
smooth and stop it from going off track.

3 End-to-End Learning
DQN learns the Q-values right from basic input, like screen pixels, skipping the step of
making features by hand.
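As a minimal sketch of what such a Q-value network could look like (assuming PyTorch; the class name, layer sizes, and input dimensions are illustrative and not taken from the case study):

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a state observation to one Q-value per action."""
    def __init__(self, state_dim: int, num_actions: int):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(state_dim, 128),    # hidden sizes are illustrative
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Linear(128, num_actions),  # one output per possible action
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.layers(state)

# Example: Q-values for a batch of 4 states with 8 features and 3 actions.
q_net = QNetwork(state_dim=8, num_actions=3)
q_values = q_net(torch.randn(4, 8))  # shape: (4, 3)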

Key Components of DQN

Experience Replay
DQN keeps memories of what happened (scene, move, score, next scene) in a memory
bank and picks from it when learning. This breaks the link between back-to-back
memories and makes learning better.

Target Network
DQN has an extra steady model, refreshed now and then, to figure out the goal scores.
This keeps the learning smooth and stops ups and downs.

Neural Network Architecture
The network setup in DQN often has many layers that can really get what's in pictures,
making it good at dealing with complex stuff.
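The first two components can be sketched in a few lines of Python (assuming PyTorch; the class and function names below are illustrative, not from the case study).

import random
from collections import deque, namedtuple

import torch

# A single stored memory: (scene, move, score, next scene, episode finished?)
Transition = namedtuple("Transition", "state action reward next_state done")

class ReplayBuffer:
    """Fixed-size memory bank; sampling at random breaks the link
    between back-to-back memories."""
    def __init__(self, capacity: int = 100_000):
        self.memory = deque(maxlen=capacity)

    def push(self, *args):
        self.memory.append(Transition(*args))

    def sample(self, batch_size: int):
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)

def sync_target(online_net: torch.nn.Module, target_net: torch.nn.Module):
    """Refresh the steady target model 'now and then' by copying the online weights."""
    target_net.load_state_dict(online_net.state_dict())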
Training and Optimization of DQN

1 Loss Function
DQN minimizes a mean-squared error loss function, which compares the predicted
Q-values with the target Q-values computed from the Bellman equation.

2 Optimization
DQN utilizes stochastic gradient descent and backpropagation for adjusting the neural
network parameters and enhancing the Q-value predictions.

3 Exploration-Exploitation
DQN uses a strategy called epsilon-greedy to find a balance between exploring new
actions and exploiting the best known option while learning.
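Putting these three pieces together, one training step might look roughly like the sketch below. It assumes PyTorch plus the QNetwork and ReplayBuffer sketched earlier; hyperparameters such as epsilon and gamma are illustrative defaults, not values from the case study.

import random

import torch
import torch.nn.functional as F

def epsilon_greedy(q_net, state, num_actions, epsilon=0.1):
    """Pick a random action with probability epsilon, otherwise exploit the best Q-value."""
    if random.random() < epsilon:
        return random.randrange(num_actions)
    with torch.no_grad():
        return int(q_net(state.unsqueeze(0)).argmax(dim=1).item())

def dqn_loss(q_net, target_net, states, actions, rewards, next_states, dones, gamma=0.99):
    """Mean-squared error between predicted Q-values and Bellman targets."""
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():  # targets come from the frozen target network
        q_next = target_net(next_states).max(dim=1).values
        q_target = rewards + gamma * q_next * (1.0 - dones)
    return F.mse_loss(q_pred, q_target)

# One gradient step via stochastic gradient descent and backpropagation:
# optimizer.zero_grad(); loss = dqn_loss(...); loss.backward(); optimizer.step()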
Applications and Limitations of DQN

Game Playing
Deep Q-Network (DQN) has proven effective in dominating intricate, high-dimensional
games like Atari video games and the ancient game of Go.

Robotics
DQN has the capability to control robots, allowing them to acquire intricate skills and
maneuver through difficult surroundings.

Control Systems
The Deep Q-Network (DQN) algorithm has been used in a wide range of control tasks,
including autonomous driving, energy management, and industrial automation.

Limitations
Training DQN can be quite costly in terms of computational resources, necessitating
substantial amounts of training data. Moreover, it may face challenges in environments
where rewards are few and far between or the dynamics are intricate.
