Multiprocessor and Parallel Systems
Zoran.Kalafatic@fer.hr, Tomislav.Hrkac@fer.hr
ZEMRIS, 2013/14
Introducing parallelism:
offers an alternative to raising performance by increasing the clock frequency
it is natural to raise system performance by distributing the work across multiple processing units
the idea has been present since the early days of computing
used in special-purpose computers: supercomputers
advances in technology: bringing high performance "to the desktop"
where next?
Cores: 3,120,000
Linpack: 33,862.7 TFlop/s
Power: 17,808.00 kW
Memory: 1,024,000 GB
Interconnect: TH Express-2
OS: Kylin Linux
\[ X(k) = \frac{1}{N} \sum_{n=0}^{N-1} x(n)\, e^{-j 2 \pi n k / N} \]
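A minimal sketch of the transform above, assuming the 1/N factor belongs to the forward transform as the slide writes it. The function name `dft` is illustrative and not taken from the slides. Each coefficient X(k) depends only on the input samples, so the N coefficients could in principle be computed independently on separate processing units.

```python
import cmath


def dft(x):
    """Direct O(N^2) evaluation of X(k) = (1/N) * sum_n x(n) * exp(-j*2*pi*n*k/N)."""
    N = len(x)
    return [
        # Each k is an independent unit of work.
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
        for k in range(N)
    ]


# A constant signal has only a k = 0 component.
X = dft([1.0, 1.0, 1.0, 1.0])
```

With this normalization, X(0) is the mean of the input samples.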
[Figure: task graph with rows of tasks B J R, C K S, D L T, E M U, F N V, G O W, H P X; inputs s(1) to s(7), outputs S(1) to S(7)]
P1: A D H M O S V
P2: B G I K N P T W
P3: C E F J L Q R U X
Time axis: 0 5 10 15 20 25
\[ E_p = \frac{S_p}{p} = \frac{2.88}{3} = 0.96 \]
for 24 processors:
\[ E_p = \frac{S_\infty}{p} = \frac{4.8}{24} = 0.2 \]
[Figure: timeline with time axis t, ticks at 3, 5, 7, 10, 11, 15]
P2: B L X
P3: C K S
P4: D N V
P5: E M U
P6: F P
P7: G O W
P8: H
Time axis: 0 5 10 15
P2: B M L W
P3: C G I K U
P4: D O S T
P5: E H N Q R V
Time axis: 0 5 10 15 17
\[ S_5 = \frac{72}{17} = 4.235 < S_\infty \]
\[ E_5 = \frac{4.235}{5} = 0.847 \]
P2: F O W
P3: D L V
P4: B K U
P5: A C E G M N I J Q R S T
Time axis: 0 5 10 15
\[ S_5 = \frac{72}{15} = 4.8 = S_\infty \]
\[ E_5 = \frac{4.8}{5} = 0.96 \]
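The speedup and efficiency figures quoted for these schedules can be re-derived with a short sketch. It assumes the sequential time is T1 = 72 (the numerator in the ratios above) and reads the schedule lengths 25, 17 and 15 from the time axes. Using length 15 for the 24-processor case is an assumption, inferred from S∞ = 4.8 = 72/15. The function names are illustrative.

```python
def speedup(t1, tp):
    """S_p = T_1 / T_p."""
    return t1 / tp


def efficiency(t1, tp, p):
    """E_p = S_p / p."""
    return speedup(t1, tp) / p


T1 = 72  # sequential time, taken from the 72/17 and 72/15 ratios

# (number of processors, schedule length)
cases = [(3, 25), (5, 17), (5, 15), (24, 15)]
results = {
    (p, tp): (round(speedup(T1, tp), 3), round(efficiency(T1, tp, p), 3))
    for p, tp in cases
}
for (p, tp), (s, e) in results.items():
    print(f"p={p:2d}  T_p={tp:2d}  S_p={s}  E_p={e}")
```

The two 5-processor schedules show that efficiency depends on the schedule and not only on the processor count, while the 24-processor case shows efficiency collapsing once the speedup is capped at S∞.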
\[ T_1 = p + s \]
\[ T_2 = \frac{p}{N} + s \]
\[ \text{speedup} = \frac{T_1}{T_2} = \frac{p + s}{p/N + s} \]
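This relation has the form commonly known as Amdahl's law. The sketch below evaluates it in the slide's notation, where s is the serial time, p the parallelizable time and N the number of processors (note that p here is not the processor count). The function name and the sample values s = 0.1, p = 0.9 are illustrative assumptions.

```python
def amdahl_speedup(s, p, n):
    """Speedup (p + s) / (p / n + s) for serial time s and parallelizable time p on n processors."""
    return (p + s) / (p / n + s)


# Illustrative values: 10 % serial, 90 % parallelizable.
for n in (1, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.1, 0.9, n), 3))
```

As N grows, the p/N term vanishes and the speedup approaches (p + s)/s, which is 10 for these sample values.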
\[ \text{Scaled speedup} = \frac{s' + p' \times N}{s' + p'} \]
with \( s' + p' = 1 \):
\[ \text{Scaled speedup} = s' + p' \times N = s' + (1 - s') \times N = N + (1 - N) \times s' \]
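The scaled-speedup expression, commonly attributed to Gustafson, can be checked numerically. This is a minimal sketch assuming s' is the serial fraction measured on the parallel system and s' + p' = 1. The function name and the sample value s' = 0.1 are illustrative.

```python
def scaled_speedup(s_prime, n):
    """Scaled speedup N + (1 - N) * s' for serial fraction s' on n processors."""
    return n + (1 - n) * s_prime


# Illustrative value: 10 % serial fraction.
for n in (1, 10, 100, 1000):
    print(n, round(scaled_speedup(0.1, n), 3))
```

Unlike the fixed-size case, the result grows linearly with N, because the parallel part of the workload is assumed to grow with the machine.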
4 convoys
vector multiply and vector load can execute concurrently
with 1 lane and vector length VL = 64 ⇒ 4 × 64 = 256 clock cycles
(4 cycles per result element)
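The cycle estimate above can be written as a small helper. This is a rough sketch that ignores start-up overhead and assumes each convoy takes one cycle per vector element per lane. The function name and the multi-lane generalization are assumptions, since the slide only states the 1-lane case.

```python
import math


def vector_cycles(convoys, vl, lanes=1):
    """Approximate cycle count: each convoy takes ceil(vl / lanes) cycles; start-up overhead ignored."""
    return convoys * math.ceil(vl / lanes)


cycles = vector_cycles(convoys=4, vl=64, lanes=1)
print(cycles, cycles / 64)  # total cycles and cycles per result element
```

Under the same assumption, raising `lanes` to 4 would reduce the estimate to 4 × 16 = 64 cycles.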
\[ y_1 = w_1 x_1 + w_2 x_2 + w_3 x_3 \]
\[ y_2 = w_1 x_2 + w_2 x_3 + w_3 x_4 \]
\[ y_3 = w_1 x_3 + w_2 x_4 + w_3 x_5 \]
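The three equations follow one pattern, y_i = w_1 x_i + w_2 x_{i+1} + w_3 x_{i+2}, a weighted sum over a sliding window. A minimal sequential sketch of that pattern follows. The function name and sample data are illustrative, and the code says nothing about how the slides map the computation onto hardware.

```python
def sliding_weighted_sum(w, x):
    """y[i] = sum_j w[j] * x[i + j] (0-based), one output per full window."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


# Three weights and five inputs give y1, y2, y3 as in the equations above.
y = sliding_weighted_sum([1, 2, 3], [1, 2, 3, 4, 5])
print(y)
```

Each output reuses two of the three inputs of its neighbour, which is the data reuse that a pipelined array of multiply-add cells can exploit.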
http://www.ttu.ee/users/nalle/SoC/socaX1 02.pdf
N. Gupta, "A VLSI Architecture for Image Registration in Real Time", IEEE Trans. VLSI Systems, 2007