
Additive Attention Mechanism Calculation

Encoder Hidden State Vectors

$h_{\text{turn}} = [0.1,\ 0.2,\ 0.3,\ 0.4]$
$h_{\text{off}} = [0.5,\ 0.6,\ 0.7,\ 0.8]$
$h_{\text{the}} = [0.9,\ 1.0,\ 1.1,\ 1.2]$
$h_{\text{light}} = [1.3,\ 1.4,\ 1.5,\ 1.6]$

Decoder Hidden State

$h_{\text{decoder1}} = [0.0,\ 0.4,\ 1.0,\ 0.3]$

Step 1: Concatenate Decoder Hidden State with Each Encoder Hidden State
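For $h_{\text{turn}}$:
$$\mathrm{concat}(h_{\text{decoder1}}, h_{\text{turn}}) = [0.0,\ 0.4,\ 1.0,\ 0.3,\ 0.1,\ 0.2,\ 0.3,\ 0.4]$$
For $h_{\text{off}}$:
$$\mathrm{concat}(h_{\text{decoder1}}, h_{\text{off}}) = [0.0,\ 0.4,\ 1.0,\ 0.3,\ 0.5,\ 0.6,\ 0.7,\ 0.8]$$
For $h_{\text{the}}$:
$$\mathrm{concat}(h_{\text{decoder1}}, h_{\text{the}}) = [0.0,\ 0.4,\ 1.0,\ 0.3,\ 0.9,\ 1.0,\ 1.1,\ 1.2]$$
For $h_{\text{light}}$:
$$\mathrm{concat}(h_{\text{decoder1}}, h_{\text{light}}) = [0.0,\ 0.4,\ 1.0,\ 0.3,\ 1.3,\ 1.4,\ 1.5,\ 1.6]$$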
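The concatenation step can be sketched in a few lines of plain Python. This is only an illustration: the dictionary layout and the names `encoder_states`, `decoder_state`, and `concat` are assumptions made for the example, while the numbers are the ones listed above.

```python
# Values copied from the worked example; names are illustrative assumptions.
encoder_states = {
    "turn":  [0.1, 0.2, 0.3, 0.4],
    "off":   [0.5, 0.6, 0.7, 0.8],
    "the":   [0.9, 1.0, 1.1, 1.2],
    "light": [1.3, 1.4, 1.5, 1.6],
}
decoder_state = [0.0, 0.4, 1.0, 0.3]


def concat(decoder, encoder):
    """Return a new list: the decoder state followed by the encoder state."""
    return decoder + encoder


# One 8-dimensional vector per source word.
concatenated = {word: concat(decoder_state, h) for word, h in encoder_states.items()}

for word, vec in concatenated.items():
    print(word, vec)
```

The decoder state comes first in every concatenated vector, so the first four entries are identical across words and only the last four differ.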

Step 2: Apply Weight Matrix W

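Assume $W$ is a weight matrix of appropriate dimensions. For simplicity, let $W$ be the $8 \times 8$ identity matrix for demonstration purposes, so each concatenated vector is left unchanged.
For $h_{\text{turn}}$:
$$W \cdot \mathrm{concat}(h_{\text{decoder1}}, h_{\text{turn}}) = [0.0,\ 0.4,\ 1.0,\ 0.3,\ 0.1,\ 0.2,\ 0.3,\ 0.4]$$
For $h_{\text{off}}$:
$$W \cdot \mathrm{concat}(h_{\text{decoder1}}, h_{\text{off}}) = [0.0,\ 0.4,\ 1.0,\ 0.3,\ 0.5,\ 0.6,\ 0.7,\ 0.8]$$
For $h_{\text{the}}$:
$$W \cdot \mathrm{concat}(h_{\text{decoder1}}, h_{\text{the}}) = [0.0,\ 0.4,\ 1.0,\ 0.3,\ 0.9,\ 1.0,\ 1.1,\ 1.2]$$
For $h_{\text{light}}$:
$$W \cdot \mathrm{concat}(h_{\text{decoder1}}, h_{\text{light}}) = [0.0,\ 0.4,\ 1.0,\ 0.3,\ 1.3,\ 1.4,\ 1.5,\ 1.6]$$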
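As a small sanity check of this step, the matrix-vector product can be written out by hand. The helpers `identity` and `matvec` below are hypothetical names introduced for this sketch. In a trained model, $W$ would hold learned values rather than the identity.

```python
def identity(n):
    """Build an n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


W = identity(8)

# concat(h_decoder1, h_turn) from Step 1.
x = [0.0, 0.4, 1.0, 0.3, 0.1, 0.2, 0.3, 0.4]
projected = matvec(W, x)

print(projected)
```

Because `W` is the identity here, `projected` has the same entries as `x`. Swapping in any other 8-column matrix would change the values fed into tanh in the next step.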

Step 3: Apply v Vector and tanh Activation

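Assume $v$ is a vector of size 8. For simplicity, let $v$ be a vector of ones: $v = [1,\ 1,\ 1,\ 1,\ 1,\ 1,\ 1,\ 1]$.

For $h_{\text{turn}}$:

$$\mathrm{score}(h_{\text{decoder1}}, h_{\text{turn}}) = v \cdot \tanh\big(W \cdot \mathrm{concat}(h_{\text{decoder1}}, h_{\text{turn}})\big)$$
$$= \tanh(0.0)+\tanh(0.4)+\tanh(1.0)+\tanh(0.3)+\tanh(0.1)+\tanh(0.2)+\tanh(0.3)+\tanh(0.4)$$
$$\approx 0.0 + 0.3799 + 0.7616 + 0.2913 + 0.0997 + 0.1974 + 0.2913 + 0.3799 = 2.4011$$

For $h_{\text{off}}$:

$$\mathrm{score}(h_{\text{decoder1}}, h_{\text{off}}) = v \cdot \tanh\big(W \cdot \mathrm{concat}(h_{\text{decoder1}}, h_{\text{off}})\big)$$
$$= \tanh(0.0)+\tanh(0.4)+\tanh(1.0)+\tanh(0.3)+\tanh(0.5)+\tanh(0.6)+\tanh(0.7)+\tanh(0.8)$$
$$\approx 0.0 + 0.3799 + 0.7616 + 0.2913 + 0.4621 + 0.5370 + 0.6044 + 0.6640 = 3.7003$$

For $h_{\text{the}}$:

$$\mathrm{score}(h_{\text{decoder1}}, h_{\text{the}}) = v \cdot \tanh\big(W \cdot \mathrm{concat}(h_{\text{decoder1}}, h_{\text{the}})\big)$$
$$= \tanh(0.0)+\tanh(0.4)+\tanh(1.0)+\tanh(0.3)+\tanh(0.9)+\tanh(1.0)+\tanh(1.1)+\tanh(1.2)$$
$$\approx 0.0 + 0.3799 + 0.7616 + 0.2913 + 0.7163 + 0.7616 + 0.8005 + 0.8337 = 4.5449$$

For $h_{\text{light}}$:

$$\mathrm{score}(h_{\text{decoder1}}, h_{\text{light}}) = v \cdot \tanh\big(W \cdot \mathrm{concat}(h_{\text{decoder1}}, h_{\text{light}})\big)$$
$$= \tanh(0.0)+\tanh(0.4)+\tanh(1.0)+\tanh(0.3)+\tanh(1.3)+\tanh(1.4)+\tanh(1.5)+\tanh(1.6)$$
$$\approx 0.0 + 0.3799 + 0.7616 + 0.2913 + 0.8617 + 0.8854 + 0.9051 + 0.9216 = 5.0066$$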
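The four scores can be reproduced with a short script. This is a minimal sketch under the same simplifying assumptions as the text ($W$ is the identity, $v$ is all ones). The function name `additive_score` and the dictionary layout are illustrative choices, not part of the original example.

```python
import math

encoder_states = {
    "turn":  [0.1, 0.2, 0.3, 0.4],
    "off":   [0.5, 0.6, 0.7, 0.8],
    "the":   [0.9, 1.0, 1.1, 1.2],
    "light": [1.3, 1.4, 1.5, 1.6],
}
decoder_state = [0.0, 0.4, 1.0, 0.3]
v = [1.0] * 8  # vector of ones, as assumed in the text


def additive_score(decoder, encoder, v):
    """Compute v . tanh(W [decoder; encoder]) with W taken as the identity."""
    z = decoder + encoder                 # concatenation (Step 1)
    activated = [math.tanh(z_k) for z_k in z]  # elementwise tanh
    return sum(v_k * a_k for v_k, a_k in zip(v, activated))


scores = {word: additive_score(decoder_state, h, v)
          for word, h in encoder_states.items()}

for word, s in scores.items():
    print(f"{word}: {s:.4f}")
```

The script works in full floating-point precision, so its results can differ from the hand calculation in the fourth decimal place, where the hand calculation sums already-rounded tanh values.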

Step 4: Apply Softmax to Scores

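The attention weights are the softmax of the scores:

$$\alpha_i = \mathrm{softmax}(\mathrm{score})_i = \frac{\exp(\mathrm{score}_i)}{\sum_j \exp(\mathrm{score}_j)}$$

Calculate the exponential values:

$$\exp(2.4011) \approx 11.0353,\quad \exp(3.7003) \approx 40.4594,\quad \exp(4.5449) \approx 94.1510,\quad \exp(5.0066) \approx 149.3959$$

Sum of exponentials:

$$11.0353 + 40.4594 + 94.1510 + 149.3959 = 295.0416$$

Calculate the softmax values:

$$\alpha_{\text{turn}} = \frac{11.0353}{295.0416} \approx 0.0374$$
$$\alpha_{\text{off}} = \frac{40.4594}{295.0416} \approx 0.1371$$
$$\alpha_{\text{the}} = \frac{94.1510}{295.0416} \approx 0.3191$$
$$\alpha_{\text{light}} = \frac{149.3959}{295.0416} \approx 0.5064$$

Step 5: Calculate Context Vector $c_t$

$$c_t = \alpha_{\text{turn}} \cdot h_{\text{turn}} + \alpha_{\text{off}} \cdot h_{\text{off}} + \alpha_{\text{the}} \cdot h_{\text{the}} + \alpha_{\text{light}} \cdot h_{\text{light}}$$

$$c_t = 0.0374 \cdot [0.1,\ 0.2,\ 0.3,\ 0.4] + 0.1371 \cdot [0.5,\ 0.6,\ 0.7,\ 0.8] + 0.3191 \cdot [0.9,\ 1.0,\ 1.1,\ 1.2] + 0.5064 \cdot [1.3,\ 1.4,\ 1.5,\ 1.6]$$

$$= [0.00374,\ 0.00748,\ 0.01122,\ 0.01496] + [0.06855,\ 0.08226,\ 0.09597,\ 0.10968] + [0.28719,\ 0.31910,\ 0.35101,\ 0.38292] + [0.65832,\ 0.70896,\ 0.75960,\ 0.81024]$$

$$= [1.0178,\ 1.1178,\ 1.2178,\ 1.3178]$$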
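The softmax and the weighted sum are easy to get wrong by hand, so the sketch below recomputes both from the Step 3 scores and the encoder vectors listed at the top. The names `softmax`, `weights`, and `context` are illustrative. Subtracting the maximum score before exponentiating is a common numerical-stability habit added here as an assumption, and it does not change the result.

```python
import math

# Scores from Step 3 and encoder vectors from the top of the example.
scores = {"turn": 2.4011, "off": 3.7003, "the": 4.5449, "light": 5.0066}
encoder_states = {
    "turn":  [0.1, 0.2, 0.3, 0.4],
    "off":   [0.5, 0.6, 0.7, 0.8],
    "the":   [0.9, 1.0, 1.1, 1.2],
    "light": [1.3, 1.4, 1.5, 1.6],
}


def softmax(values):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(values)
    exps = [math.exp(x - m) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


words = list(scores)
weights = dict(zip(words, softmax([scores[w] for w in words])))

# Context vector: attention-weighted sum of the encoder vectors.
dim = len(encoder_states["turn"])
context = [sum(weights[w] * encoder_states[w][k] for w in words)
           for k in range(dim)]

for w in words:
    print(f"alpha_{w} = {weights[w]:.4f}")
print("c_t =", [round(c, 4) for c in context])
```

The weights sum to 1, so each component of `context` lies between the smallest and largest value of that component across the four encoder vectors.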
