
Computer Vision

Lab 10: Introduction to Transformers & Transfer Learning

Based on the Lecture


Prepared by
Amjad Dife
2023 / 2024

Motivation

(motivational figures from the lecture slides)
Outline
❑ Transformers
➢ Introduction to Transformers as a neural network
➢ The need for the Word2Vec network
▪ The main usage
▪ Measuring similarities
▪ Different operations on vectors
▪ Drawbacks of Word2Vec, and the solution
➢ The main block in Transformers
▪ Self-attention block
▪ Multi-headed attention
❑ Transfer Learning
Introduction to Transformers

▪ The Transformer is a neural network that was mainly developed for NLP tasks and then applied to computer vision.
▪ It requires a huge amount of data to be trained.
▪ It began with the paper "Attention Is All You Need", by Google.
Word2Vec

▪ The main idea is to embed each word as a vector (512 numbers, for example) such that similar words lie nearest to each other in the vector space.

▪ How do we measure the similarity between two words' vectors? Using the dot product of the normalized vectors, i.e. the cosine similarity. A small sketch follows below.
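A minimal sketch of this similarity measure, using hand-made toy vectors (hypothetical; real Word2Vec embeddings are learned from text and have hundreds of dimensions):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 4-dimensional vectors (hypothetical; real embeddings have e.g. 512 dims).
cat = np.array([0.9, 0.8, 0.1, 0.0])
dog = np.array([0.8, 0.9, 0.2, 0.1])
car = np.array([0.1, 0.0, 0.9, 0.8])

print(cosine_similarity(cat, dog))  # high (~0.99): "cat" and "dog" are related
print(cosine_similarity(cat, car))  # low (~0.12): "cat" and "car" are unrelated
```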
Word2Vec

▪ Note that the higher the dot product, the nearer the vectors (the more related the words).
Word2Vec

▪ Different operations on the vectors:
➢ Paris – France + Italy ≈ Rome (see the sketch below)
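A sketch of this analogy arithmetic, again with hypothetical toy vectors; with real embeddings one searches the whole vocabulary for the word nearest to the result vector:

```python
import numpy as np

# Hypothetical toy embeddings; real ones come from a trained Word2Vec model.
vectors = {
    "Paris":  np.array([0.9, 0.1, 0.8]),
    "France": np.array([0.1, 0.9, 0.8]),
    "Italy":  np.array([0.1, 0.9, 0.2]),
    "Rome":   np.array([0.9, 0.1, 0.2]),
}

# capital - country + country ~= the other capital
query = vectors["Paris"] - vectors["France"] + vectors["Italy"]

def cos(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Find the word whose vector is nearest (by cosine similarity) to the query.
best = max(vectors, key=lambda w: cos(vectors[w], query))
print(best)  # "Rome" with these toy vectors
```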
Word2Vec

▪ The drawback of the Word2Vec network is that the vector of each word is fixed: it doesn't change based on the context.
▪ Note that in natural languages, the same word can be used in different contexts and have different meanings based on the context (the surrounding words).
▪ The solution is a representation that depends on the surrounding words, which is exactly what the Transformer's self-attention block provides.
Self Attention

▪ The main block in the Transformer is the "self-attention block".

▪ The self-attention block has three main steps:
➢ Find the alignment scores: by calculating the dot product (a matrix multiplication) between the embeddings.
➢ Get the weights: by normalizing the scores (from step 1) with softmax.
➢ Reweight the original embeddings: using the weights (from step 2). A minimal sketch of all three steps follows this list.
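A minimal NumPy sketch of the three steps, assuming the simplest variant where the embeddings themselves serve as queries, keys, and values (a full Transformer additionally applies learned projections and scales the scores before softmax):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)

# Toy "sentence" of 4 tokens, each embedded as a 6-dimensional vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))              # (num_tokens, embedding_dim)

# Step 1: alignment scores = dot product between every pair of embeddings.
scores = X @ X.T                         # (4, 4) alignment map

# Step 2: weights = the scores normalized row-wise with softmax.
weights = softmax(scores, axis=-1)       # each row sums to 1

# Step 3: reweight the original embeddings with the weights.
out = weights @ X                        # (4, 6) context-aware embeddings
print(out.shape)
```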
Self Attention | The Big Picture

(diagram of the full self-attention pipeline)
Self Attention | Step 1

1. Find the alignment scores: taking the dot product of every pair of word embeddings gives the alignment map.
Self Attention | Step 2

2. Get the weights: normalize the scores (from step 1) with softmax.
Self Attention | Step 3

3. Reweight the original embeddings using the weights (from step 2).
Multi-headed Attention

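The slide's diagram is not reproduced here. The idea: several self-attention "heads" run in parallel on (projections of) the same input and their outputs are concatenated, so different heads can attend to different relationships. A hedged sketch, reusing the self-attention steps above and random projections in place of the learned ones a real Transformer uses:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """The three steps from the previous section on one input matrix."""
    weights = softmax(X @ X.T, axis=-1)  # steps 1 and 2
    return weights @ X                   # step 3

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))              # 4 tokens, 8-dim embeddings
num_heads, head_dim = 2, 4

# Each head sees its own projection of the input (random here, learned in
# practice), so each head can capture a different kind of relationship.
heads = []
for _ in range(num_heads):
    W = rng.normal(size=(8, head_dim))
    heads.append(self_attention(X @ W))  # (4, head_dim) per head

out = np.concatenate(heads, axis=-1)     # (4, 8) concatenated head outputs
print(out.shape)
```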
Transfer Learning
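The figures on these slides are not reproduced here. The standard recipe: take a network pretrained on a large dataset (e.g. ImageNet), freeze its feature-extraction layers, and replace and retrain only the final classification layer for the new task. A minimal sketch, assuming PyTorch and torchvision are available (the 10-class head is a placeholder):

```python
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 pretrained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a new head for our task (10 classes assumed).
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters will be updated during training.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['fc.weight', 'fc.bias']
```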
Thank You

The image was generated by a neural network ☺
