Deep Q-Network
LEARNING DEEP Q-NETWORKS
Presented by Group-8:
21BAI10156 AISHIKA RANJAN
WHAT IS A DEEP Q-NETWORK?
[Diagram: state → Q-Network → Q-values → action]
UNDERSTANDING DEEP Q-NETWORK
[Diagram: state → Q-Network → Q-values → action]

Qπ(s, a) = how good it is to perform action a in state s while following policy π
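The mapping a DQN implements (state in, one Q-value per action out, act greedily) can be sketched as follows; the toy network and its values here are made up for illustration:

```python
def select_action(q_network, state):
    """Greedy action: feed the state through the Q-network and pick
    the action with the highest predicted Q-value."""
    q_values = q_network(state)  # one Q-value per action
    return max(range(len(q_values)), key=q_values.__getitem__)

# Hypothetical stand-in for a trained network:
toy_q_network = lambda state: [0.1, 0.7, 0.2]
print(select_action(toy_q_network, state=[0.0, 1.0]))  # → 1
```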
UNDERSTANDING DEEP Q-NETWORK
[Diagram: transitions sampled from Replay Memory feed the Q-Network; its loss function drives backpropagation]
The use of replay memory ensures that the agent sees each data point multiple times before the data point is removed from memory. This is especially useful for environments where data samples are costly to collect.
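A minimal replay memory along these lines can be sketched with a fixed-size deque (the capacity and tuple layout here are assumptions, not part of the slides):

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, new_state) tuples.
    Once capacity is reached, the oldest transition is dropped, so each
    data point can be sampled multiple times before leaving memory."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, new_state):
        self.buffer.append((state, action, reward, new_state))

    def sample(self, batch_size):
        # Uniform random minibatch, as in standard DQN training.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```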
2 PHASES:
1. DATA COLLECTION — gather transitions of the form (state, action, reward, new state) and store them in replay memory.
2. TRAINING — update the Q-Network on batches of stored transitions.
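The data-collection phase can be sketched as a loop that records each (state, action, reward, new state) tuple. The `env_step(state, action) -> (reward, new_state)` interface below is a hypothetical stand-in for a real environment:

```python
def collect(env_step, policy, memory, initial_state, steps):
    """Phase 1 (data collection): act with the current policy and
    store every (state, action, reward, new_state) transition."""
    state = initial_state
    for _ in range(steps):
        action = policy(state)
        reward, new_state = env_step(state, action)  # assumed interface
        memory.append((state, action, reward, new_state))
        state = new_state
    return memory

# Toy usage with a made-up counting environment:
transitions = collect(env_step=lambda s, a: (1.0, s + 1),
                      policy=lambda s: 0,
                      memory=[], initial_state=0, steps=3)
print(transitions)  # → [(0, 0, 1.0, 1), (1, 0, 1.0, 2), (2, 0, 1.0, 3)]
```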
[Diagram: a sampled batch is fed to the Q-Network (predicted Q-value) and to the Target Network, the "ideal conscience" (target Q-value); the mean squared error between them (here 1.8821) is minimized via backpropagation through the Q-Network]

Repeat for a few more batches, and our Q-Network gets trained.
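The training step above can be sketched in plain Python: the frozen target network supplies the target value, and the mean squared error between predicted and target Q-values is what backpropagation minimizes. The discount factor `gamma` and the toy numbers are assumptions for illustration:

```python
def td_targets(batch, target_q, gamma=0.99):
    """Target Q-values from the target network ('ideal conscience'):
    reward + gamma * max over actions of Q_target(new_state)."""
    return [r + gamma * max(target_q(s2)) for (_s, _a, r, s2) in batch]

def mse_loss(predicted, target):
    """Mean squared error between predicted and target Q-values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

# Toy numbers (hypothetical): one transition, target net says max Q(s') = 2.0.
batch = [("s", 0, 1.0, "s2")]
targets = td_targets(batch, target_q=lambda s: [0.0, 2.0], gamma=0.5)
print(mse_loss([1.0], targets))  # (1.0 - 2.0)**2 → 1.0
```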
EPSILON-GREEDY POLICY
The epsilon-greedy approach selects the action with the highest estimated reward most of the time, and a random action otherwise. The aim is to strike a balance between exploration and exploitation.
r = random.random()
if r < epsilon:
    perform a random action (explore)
else:
    perform the action with the highest Q-value (exploit)
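A runnable version of this policy might look like the sketch below; the function name and Q-value list are assumptions:

```python
import random

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon explore (random action);
    otherwise exploit (action with the highest estimated Q-value)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

# epsilon=0 always exploits; epsilon=1 always explores.
```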
Deep Q-Networks have been used, for example, to train agents that can play games such as Atari and Go, and to control robots for tasks such as grasping and navigation.
THANK YOU!