
VISUAL RECOGNITION – PART 2

Lecture 6: Transformer-based Language Modelling


Transformers for Image Classification
http://jalammar.github.io/illustrated-transformer/
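As a refresher from the linked post: each token's output (the 𝑧 values referenced on the next slide) is an attention-weighted sum of value vectors. A minimal single-head sketch in plain PyTorch (function and variable names are ours, not from the lecture):

    import torch

    def self_attention(X, Wq, Wk, Wv):
        # X: (n_tokens, d_model); Wq, Wk, Wv: (d_model, d_k) learned projections.
        # Returns Z: (n_tokens, d_k), one output vector z_i per input token.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / K.shape[-1] ** 0.5    # (n, n): each query vs. every key
        weights = torch.softmax(scores, dim=-1)  # each row sums to 1
        return weights @ V                       # z_i = sum_j weights[i, j] * v_j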

Position Encoding in Transformers

Will changing the order of the input sequence affect the respective 𝑧 values?

Self-attention is permutation-equivariant, so we need to add position information to every token to preserve the sequence order.
https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
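The fix proposed in the cited paper is a fixed sinusoidal encoding added to each token embedding. A sketch of that formulation (max_len and d_model are illustrative):

    import math
    import torch

    def sinusoidal_positions(max_len, d_model):
        # PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(same)
        pos = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                        * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        return pe

    # Added to the token embeddings before the first encoder layer:
    # x = token_embeddings + sinusoidal_positions(seq_len, d_model)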

Machine Translation : Self Attention + Cross Attention
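In an encoder-decoder translation model, each decoder block combines self-attention over the target prefix with cross-attention, whose queries come from the decoder and whose keys/values come from the encoder output. A minimal sketch using torch.nn.MultiheadAttention (sizes are illustrative):

    import torch
    import torch.nn as nn

    d_model, n_heads = 512, 8
    self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
    cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    src = torch.randn(1, 12, d_model)  # encoder output for the source sentence
    tgt = torch.randn(1, 7, d_model)   # decoder states for the target prefix

    h, _ = self_attn(tgt, tgt, tgt)    # queries, keys, values all from the target
    out, _ = cross_attn(h, src, src)   # queries from decoder; keys/values from encoder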


Representation Learning with Self-Supervision + Transformers: BERT

Bi-directional modeling is done with the help of [MASK] tokens

Mask a percentage of input tokens at random (e.g. 15%)

Predict the masked tokens using a Transformer encoder architecture
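A sketch of the masking step (15% as above; for brevity this always substitutes [MASK], whereas full BERT replaces 10% of the chosen tokens with random ids and leaves 10% unchanged):

    import torch

    def mask_tokens(input_ids, mask_token_id, mask_prob=0.15):
        labels = input_ids.clone()
        chosen = torch.rand(input_ids.shape) < mask_prob
        labels[~chosen] = -100             # -100: ignored by cross-entropy loss
        masked_ids = input_ids.clone()
        masked_ids[chosen] = mask_token_id
        return masked_ids, labels          # encoder predicts labels at masked spots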

Sentence Embeddings: the final hidden state of the special [CLS] token (or a pooling over the token outputs) is commonly used as a fixed-length sentence embedding

Generative Modeling with Self-Supervision + Transformers: GPT
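Where BERT masks tokens and looks both ways, GPT trains a decoder-only Transformer on next-token prediction, using a causal mask so each position attends only to its past. A sketch of the mask (additive -inf form, as in common implementations):

    import torch

    def causal_mask(n):
        # Position i may attend to positions j <= i only.
        return torch.triu(torch.full((n, n), float("-inf")), diagonal=1)

    # Applied to the attention scores before the softmax:
    # scores = Q @ K.T / d_k ** 0.5 + causal_mask(n)
    # Training objective: predict token t+1 from tokens 1..t.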
Vision Transformer (ViT)

Transformers can replace CNNs in image recognition!


Vision Transformer Steps:

• Split the image into fixed-size patches
• Linearly embed each patch
• Add position embeddings
• Feed the resulting sequence of vectors to a standard Transformer encoder

AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE


ViT : Patch Creation

With patch size 𝑃, an image x ∈ ℝ^(H×W×C) is reshaped into a sequence of N = HW/P² flattened patches x_p1, …, x_pN, each of dimension P²·C. For example, a 224×224×3 image with P = 16 yields N = 196 patches of dimension 768.
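A sketch of the patchify step with plain tensor ops (shapes follow the example above):

    import torch

    img = torch.randn(3, 224, 224)                 # (C, H, W)
    P = 16
    # Cut into non-overlapping P x P patches, then flatten each patch.
    patches = img.unfold(1, P, P).unfold(2, P, P)  # (C, 14, 14, P, P)
    patches = patches.permute(1, 2, 0, 3, 4).reshape(-1, 3 * P * P)
    print(patches.shape)                           # torch.Size([196, 768])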



ViT : Patch Input Embedding
The input sequence z0 is built by projecting each flattened patch x_pi with a learnable embedding matrix E, prepending a learnable [class] token x_class, and adding learnable position embeddings E_pos:

z0 = [x_class; x_p1·E; x_p2·E; …; x_pN·E] + E_pos,  with E ∈ ℝ^((P²·C)×D) and E_pos ∈ ℝ^((N+1)×D)
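A sketch of building z0 from the patches produced above (D, the latent width, is illustrative; E, the [class] token, and E_pos are learnable, as on the slide):

    import torch
    import torch.nn as nn

    N, patch_dim, D = 196, 768, 512
    E = nn.Linear(patch_dim, D)                         # patch embedding projection
    cls_token = nn.Parameter(torch.zeros(1, D))         # learnable [class] token
    E_pos = nn.Parameter(torch.randn(N + 1, D) * 0.02)  # learnable position embeddings

    def embed(patches):                                 # patches: (N, patch_dim)
        tokens = torch.cat([cls_token, E(patches)], dim=0)
        return tokens + E_pos                           # z0: (N + 1, D)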



ViT : Encoder & Final MLP Head

The encoder output at the [class] token is passed through an MLP head to produce class probabilities (e.g. Cat, Dog, Horse, Pattern).
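A sketch of this final stage, reusing embed() and patches from the previous sketches (depth and widths are illustrative; norm_first=True gives pre-norm blocks, though PyTorch's stock encoder layer only approximates the paper's exact block):

    import torch.nn as nn

    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=512, nhead=8,
                                   norm_first=True, batch_first=True),
        num_layers=6,
    )
    mlp_head = nn.Sequential(nn.LayerNorm(512), nn.Linear(512, 4))  # 4 example classes

    z0 = embed(patches).unsqueeze(0)   # (1, N + 1, 512)
    z_L = encoder(z0)                  # same shape after all encoder blocks
    logits = mlp_head(z_L[:, 0])       # classify from the [class] token only
    probs = logits.softmax(dim=-1)     # the class probabilities shown above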



Vision Transformer – Attention Maps
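Maps like these can be approximated by reading the attention from the [class] token to the patch tokens and reshaping it onto the patch grid (a rough sketch, not necessarily the exact method behind the slide; attn is assumed captured via a forward hook, and attention-rollout-style aggregation across layers gives cleaner maps):

    # attn: (n_heads, N + 1, N + 1) attention weights from one encoder layer
    cls_to_patches = attn.mean(dim=0)[0, 1:]   # average heads; [class] -> patches
    attn_map = cls_to_patches.reshape(14, 14)  # back onto the 14 x 14 patch grid
    # Upsample to the image size (e.g. 224 x 224) to overlay on the input.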

