Welcome to Scribd!

Advanced VLSI Architecture: Lecture 26: Data Level Parallelism

Uploaded by

0% found this document useful (0 votes)

4 views15 pages

The document discusses vector length registers, strip mining, and memory banks in vector processors. Vector length registers allow the vector length to vary at runtime. Strip mining breaks loops into pieces that fit into vector registers. Memory banks allow independent, parallel access to memory to support high bandwidth vector loads and stores from multiple processors.

Original Description:

Original Title

Lecture_26

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

0% found this document useful (0 votes)

4 views15 pages

Advanced VLSI Architecture: Lecture 26: Data Level Parallelism

Uploaded by

Udai Valluru

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

Jump to Page

You are on page 1of 15

Search inside document

CKV

Advanced VLSI Architecture

MEL G624

Lecture 26: Data Level Parallelism

CKV
Vector Length Registers

Vector length not known at compile time?

for (i=0; i<n; i=i+1)

C[i] = A[i] + B[i];

n value may not even be known at run time

Use Vector Length Register (VLR)

For n < Maximum Vector Length (MVL), set VLR to n

For n > MVL ? Strip Mining

CKV
Vector Length Registers

“Strip‐mining”

Break loops into pieces that fit into vector registers

Modify

for (i=0; i<n; i=i+1)

C[i] = A[i] + B[i];
CKV
Vector Length Registers
for (i=0; i<n; i=i+1)
C[i] = A[i] + B[i];
low = 0;
VL = (n % MVL); /*find odd-size piece using modulo op % */

for (j = 0; j <= (n/MVL); j=j+1) { /outer loop/

for (i = low; i < (low+VL); i=i+1) /runs for length VL/

C[i] = A[i] + B[i] ; /*main operation*/

low = low + VL; /start of next vector/

VL = MVL; /*reset the length to maximum vector length*/
}
CKV
Vector Length Registers

Write assembly code MTC1 VLR, R1

low = 0;
VL = (n % MVL); /*find odd-size piece using modulo op % */

for (j = 0; j <= (n/MVL); j=j+1) { /outer loop/

for (i = low; i < (low+VL); i=i+1) /runs for length VL/

C[i] = A[i] + B[i] ; /*main operation*/

low = low + VL; /start of next vector/

VL = MVL; /*reset the length to maximum vector length*/
}
CKV
Vector Length Registers
ANDI R1, N, 63 # N mod 64
MTC1 VLR, R1 # Do remainder
loop: LV V1, RA
LV V2, RB
ADDV.D V3, V1, V2
SV V3, RC
DSLL R2, R1, 3 # Multiply by 8
DADDU RA, RA, R2 # Bump pointer
DADDU RB, RB, R2
DADDU RC, RC, R2
DSUBU N, N, R1 # Subtract elements
LI R1, 64
MTC1 VLR, R1 # Reset full length
BGTZ N, loop # Any more to do?
CKV
Maximum Vector Length (MVL)

Determine the maximum number of elements in a vector
for a given architecture

All blocks but the first are of length MVL
Utilize the full power of the vector processor

Later generations may grow the MVL
No need to change the ISA
CKV
Vector‐Mask Register
for (i = 0; i < 64; i=i+1)
if (X[i] != 0)
X[i] = X[i] – Y[i];
CKV
Vector Mask Registers

A Boolean vector to control the execution of a vector
instruction

VMR is part of the architectural state

For vector processor, it relies on compilers to
manipulate VMR explicitly

For GPU, it gets the same effect using hardware

Both GPU and vector processor spend time on masking
CKV
Vector Mask Registers Example
for (i = 0; i < 64; i=i+1) SNEVS.D V1, F0
if (X[i] != 0)
X[i] = X[i] – Y[i];

LV V1, Rx ;load vector X into V1

LV V2, Ry ;load vector Y
L.D F0, #0 ;load FP zero into F0
SNEVS.D V1, F0 ;sets VM(i) to 1 if V1(i)!=F0
SUBVV.D V1, V1, V2 ;subtract under vector mask
SV Rx,V1 ;store the result in X
CKV
Vector‐Mask Control

Simple Implementation Density-time Implementation

Execute all N operations, turn off scan mask vector and only execute
result write-back according to mask elements with non-zero masks
CKV
Memory Banks

The start‐up time for a load/store vector unit is the time
to get the first word from memory into a register.

Penalties for start‐ups on load/
store units are higher than those for arithmetic unit

Memory system must be designed to support high bandwi
dth for vector loads and stores
CKV
Memory Banks
Vector processor use memory banks which allow multiple
independent access rather than memory interleaving
mainly to:
Support multiple loads or stores per clock. Be able to control
the addresses to the banks independently

Support (multiple) non‐sequential loads or stores

Support multiple processors sharing the same memory syste
m, processor will be generating its own independent stream
of addresses
CKV
Memory Banks
Example: Cray T90

32 processors, each generating 4 loads and 2 stores per
cycle

Processor cycle time is 2.167ns

SRAM cycle time is 15ns

How many memory banks needed?
CKV

Thank You for Attending

Node.js 63 Interview Questions and Answers
From Everand
Node.js 63 Interview Questions and Answers
John Edward Cooper Berg
No ratings yet
Mastering KVM Virtualization
From Everand
Mastering KVM Virtualization
Humble Devassy Chirammal
Rating: 5 out of 5 stars
5/5 (1)
Mastering VMware NSX for vSphere
From Everand
Mastering VMware NSX for vSphere
Elver Sena Sosa
No ratings yet
VLSI Interview Questions
Document7 pages
VLSI Interview Questions
Vlsi Guru
No ratings yet
Advanced VLSI Architecture: Lecture 25: Data Level Parallelism
Document19 pages
Advanced VLSI Architecture: Lecture 25: Data Level Parallelism
Udai Valluru
No ratings yet
Lecture 27
Document14 pages
Lecture 27
Udai Valluru
No ratings yet
Unit Iii Data-Level Parallelism in Vector, Simd, and Gpu Architectures
Document26 pages
Unit Iii Data-Level Parallelism in Vector, Simd, and Gpu Architectures
Shweta
No ratings yet
Advanced VLSI Architecture: Lecture 23 - 24: Data Level Parallelism
Document27 pages
Advanced VLSI Architecture: Lecture 23 - 24: Data Level Parallelism
Udai Valluru
No ratings yet
CS7103 - MultiCore Architecture Ppts Unit-II
Document43 pages
CS7103 - MultiCore Architecture Ppts Unit-II
Surya
No ratings yet
CH 04. Data-Level Parallelism in Vector, SIMD, and GPU Architectures
Document50 pages
CH 04. Data-Level Parallelism in Vector, SIMD, and GPU Architectures
Faheem Khan
No ratings yet
Data-Level Parallelism in Vector, SIMD, and GPU Architectures
Document58 pages
Data-Level Parallelism in Vector, SIMD, and GPU Architectures
Dhananjai Yadav
No ratings yet
XX-BSC Compact Vector Processing
Document49 pages
XX-BSC Compact Vector Processing
mheba11
No ratings yet
7-VECTOR PROCESSING-04-Jan-2020Material - I - 04-Jan-2020 - VECTOR - PROCESSING PDF
Document31 pages
7-VECTOR PROCESSING-04-Jan-2020Material - I - 04-Jan-2020 - VECTOR - PROCESSING PDF
ANTHONY NIKHIL REDDY
No ratings yet
Unit Iii - Aca
Document13 pages
Unit Iii - Aca
Anitha Denis
No ratings yet
TechTalk Kruppe Espasa RISC V Vectors and LLVM
Document23 pages
TechTalk Kruppe Espasa RISC V Vectors and LLVM
Lộc Nguyễn Tấn
No ratings yet
CA 13 VectorProcessors
Document16 pages
CA 13 VectorProcessors
Caroline Sam
No ratings yet
Real-Time VLSF Architecture For Video Compression: Fatemi and Panchanathan
Document4 pages
Real-Time VLSF Architecture For Video Compression: Fatemi and Panchanathan
कर्म सिंह मलिक
No ratings yet
UEC2302 Digital System Design: Unit 5: Lecture-9 Test Benches
Document50 pages
UEC2302 Digital System Design: Unit 5: Lecture-9 Test Benches
Amoga Lekshmi
No ratings yet
Advanced Computer Architecture: Nguyễn Kim Khánh
Document44 pages
Advanced Computer Architecture: Nguyễn Kim Khánh
tuan
No ratings yet
Admin No: U18EC073 Aim:: Practical 3
Document8 pages
Admin No: U18EC073 Aim:: Practical 3
Avinash
No ratings yet
Towards Automating Cryptographic Hardware Implementations: A Case Study of HQC
Document15 pages
Towards Automating Cryptographic Hardware Implementations: A Case Study of HQC
Ajay Rodge
No ratings yet
U18ec038 Vlsi Lab 3
Document8 pages
U18ec038 Vlsi Lab 3
Timir Patel
No ratings yet
Module 1.6
Document53 pages
Module 1.6
project mission
No ratings yet
Fundamentals of Parallel Processing: By: Solomon S
Document35 pages
Fundamentals of Parallel Processing: By: Solomon S
Amanuel
No ratings yet
Lecture 28
Document22 pages
Lecture 28
Udai Valluru
No ratings yet
CA Classes-201-205
Document5 pages
CA Classes-201-205
SrinivasaRao
No ratings yet
Vector
Document38 pages
Vector
Hareesh Ramachandran
No ratings yet
Last Modified On September 3, 2004 by Stanley Wang and Henry Jen
Document3 pages
Last Modified On September 3, 2004 by Stanley Wang and Henry Jen
Thomas George
No ratings yet
Lab Exercise - 14 Advance VLSI Aim: To Design and Simulate The Operation of A Schmitt Trigger Circuit and Show Its Application As An Oscillator
Document4 pages
Lab Exercise - 14 Advance VLSI Aim: To Design and Simulate The Operation of A Schmitt Trigger Circuit and Show Its Application As An Oscillator
rktiwary256034
No ratings yet
Ca Part 3
Document20 pages
Ca Part 3
Subhadip Das
No ratings yet
Lecture 20: Data Level Parallelism - Introduction and Vector Architecture
Document47 pages
Lecture 20: Data Level Parallelism - Introduction and Vector Architecture
Phani Kumar
No ratings yet
Very Large Scale Instruction Word
Document22 pages
Very Large Scale Instruction Word
rboy1993
No ratings yet
G Nageshwara Reddy - 13MVD1036
Document8 pages
G Nageshwara Reddy - 13MVD1036
Nagesh Reddy
No ratings yet
Module 5
Document37 pages
Module 5
Pavithra
No ratings yet
A Hybrid Block Based Watermarking Algorithm Using DWT-DCT-SVD Techniques For Color Images
Document7 pages
A Hybrid Block Based Watermarking Algorithm Using DWT-DCT-SVD Techniques For Color Images
Navdeep Goel
No ratings yet
Dic 2021
Document1 page
Dic 2021
akash rawat
No ratings yet
Digital Watermarking For Big Data
Document17 pages
Digital Watermarking For Big Data
Anirban Dasgupta
No ratings yet
How To Write FSM Is Verilog
Document8 pages
How To Write FSM Is Verilog
Ayaz Mohammed
No ratings yet
Lab 5 Counters and Shift Registers
Document8 pages
Lab 5 Counters and Shift Registers
Ahmed Razi Ullah
No ratings yet
Unit-5: Cocurrent Processors: Vector & Multiple Instruction Issue Processors
Document106 pages
Unit-5: Cocurrent Processors: Vector & Multiple Instruction Issue Processors
Jitender Garg
No ratings yet
20bce2114 Theory Da
Document8 pages
20bce2114 Theory Da
Mayank yadav
No ratings yet
Vector Processor
Document83 pages
Vector Processor
Lekshmi
No ratings yet
Working With Vectors
Document3 pages
Working With Vectors
Roger M. Idrovo
No ratings yet
NB:-Write The Answers in Your Own Way and Do Not Copy From Other
Document2 pages
NB:-Write The Answers in Your Own Way and Do Not Copy From Other
SayanMaiti
No ratings yet
EVL562 ECL523 Digital IC Design Dec2020
Document3 pages
EVL562 ECL523 Digital IC Design Dec2020
akash rawat
No ratings yet
Simulation: Definition, Motivation: Simulation: To Construct and Test A Computer
Document23 pages
Simulation: Definition, Motivation: Simulation: To Construct and Test A Computer
Naty Mat
No ratings yet
Lab2 ASIC
Document12 pages
Lab2 ASIC
pskumarvlsipd
No ratings yet
Data-Level Parallelism: Nima Honarmand
Document59 pages
Data-Level Parallelism: Nima Honarmand
warriors
No ratings yet
Vector Quantization: Data Compression and Data Retrieval
Document39 pages
Vector Quantization: Data Compression and Data Retrieval
Pooja Bharti
No ratings yet
Digital Microelectronic Digital Microelectronic Circuits Circuits ( (
Document24 pages
Digital Microelectronic Digital Microelectronic Circuits Circuits ( (
Gowtham Hari
No ratings yet
Mini Project Title: Group No
Document21 pages
Mini Project Title: Group No
vidhi Choudhari
No ratings yet
Homework 3 - Solutions
Document14 pages
Homework 3 - Solutions
Yahya Salam Saleh
No ratings yet
Vliw Processors
Document20 pages
Vliw Processors
RuneCoder
No ratings yet
Important Topics Solution On Systemverilog (SV) Needed in Frontend Domain of Vlsi
Document8 pages
Important Topics Solution On Systemverilog (SV) Needed in Frontend Domain of Vlsi
Salman M.s
No ratings yet
VLSI Lab Mannual
Document107 pages
VLSI Lab Mannual
Sadananda Krrish
No ratings yet
Aca305 2000
Document8 pages
Aca305 2000
Vanitha Vivek
No ratings yet
Verilog Interview Questions
Document7 pages
Verilog Interview Questions
Vishal K Srivastava
No ratings yet
Digital TEST 28.06.23
Document2 pages
Digital TEST 28.06.23
SUDHEER S
No ratings yet
Self-Commutating Converters for High Power Applications
From Everand
Self-Commutating Converters for High Power Applications
Jos Arrillaga
No ratings yet
DSP Applications Using C and the TMS320C6x DSK
From Everand
DSP Applications Using C and the TMS320C6x DSK
Rulph Chassaing
No ratings yet
Lecture 30 GPU Programming Loop Parallelism
Document16 pages
Lecture 30 GPU Programming Loop Parallelism
Udai Valluru
No ratings yet
2015 371813 Shrii-Lalitoo
Document452 pages
2015 371813 Shrii-Lalitoo
Udai Valluru
No ratings yet
Reading Assignment1
Document15 pages
Reading Assignment1
Udai Valluru
No ratings yet
FULLTEXT01
Document94 pages
FULLTEXT01
Udai Valluru
No ratings yet
Lecture 20
Document28 pages
Lecture 20
Udai Valluru
No ratings yet
Tu0401switcherki2009isic 140121090158 Phpapp01
Document128 pages
Tu0401switcherki2009isic 140121090158 Phpapp01
Udai Valluru
No ratings yet
Tu0404bgrqpldrki2009isic 140121090507 Phpapp02
Document146 pages
Tu0404bgrqpldrki2009isic 140121090507 Phpapp02
Udai Valluru
No ratings yet
Tu0403switcherki2009isic 140305063913 Phpapp02
Document40 pages
Tu0403switcherki2009isic 140305063913 Phpapp02
Udai Valluru
No ratings yet
Tu0402switcherki2009isic 140305063756 Phpapp01
Document48 pages
Tu0402switcherki2009isic 140305063756 Phpapp01
Udai Valluru
No ratings yet
Lecture 18
Document38 pages
Lecture 18
Udai Valluru
No ratings yet
Lecture 28
Document22 pages
Lecture 28
Udai Valluru
No ratings yet
Advanced VLSI Architecture: Lecture 7: Memory Hierarchy
Document20 pages
Advanced VLSI Architecture: Lecture 7: Memory Hierarchy
Udai Valluru
No ratings yet
Advanced VLSI Architecture: Lecture 8: Memory Hierarchy
Document35 pages
Advanced VLSI Architecture: Lecture 8: Memory Hierarchy
Udai Valluru
No ratings yet
Lecture 17
Document20 pages
Lecture 17
Udai Valluru
No ratings yet
9.ion Implantation
Document18 pages
9.ion Implantation
Udai Valluru
No ratings yet
Advanced VLSI Architecture: Lecture 25: Data Level Parallelism
Document19 pages
Advanced VLSI Architecture: Lecture 25: Data Level Parallelism
Udai Valluru
No ratings yet
Advanced VLSI Architecture: Lecture 23 - 24: Data Level Parallelism
Document27 pages
Advanced VLSI Architecture: Lecture 23 - 24: Data Level Parallelism
Udai Valluru
No ratings yet
Lecture 21
Document25 pages
Lecture 21
Udai Valluru
No ratings yet
Lecture 22
Document29 pages
Lecture 22
Udai Valluru
No ratings yet
MEL G611: Ic Fabrication Technology: Section X: Annealing
Document12 pages
MEL G611: Ic Fabrication Technology: Section X: Annealing
Udai Valluru
No ratings yet
(MEL G621) : Lecture 1: Introduction
Document8 pages
(MEL G621) : Lecture 1: Introduction
Udai Valluru
No ratings yet
Sec X, L 30 & 31: Annealing: MEL G611: Ic Fabrication Technology
Document17 pages
Sec X, L 30 & 31: Annealing: MEL G611: Ic Fabrication Technology
Udai Valluru
No ratings yet
Sec VII, L 18-21: TFD: MEL G611: Ic Fabrication Technology
Document22 pages
Sec VII, L 18-21: TFD: MEL G611: Ic Fabrication Technology
Udai Valluru
No ratings yet
Iot Architecture, Onem2M Model, Reference Model: L-T-P - (C) 4-0-0
Document25 pages
Iot Architecture, Onem2M Model, Reference Model: L-T-P - (C) 4-0-0
deva datta
No ratings yet
Preparation For Workshop: Satellite Data Analysis and Machine Learning Classification With Qgis
Document3 pages
Preparation For Workshop: Satellite Data Analysis and Machine Learning Classification With Qgis
nancer
No ratings yet
VLT Fieldbus Solutions
Document14 pages
VLT Fieldbus Solutions
asolelpancho
No ratings yet
Cloud Curriculum
Document7 pages
Cloud Curriculum
Vinay Kumar
No ratings yet
How To Decrypt IKE and ESP Packets On A Palo Alto Networks Device
Document3 pages
How To Decrypt IKE and ESP Packets On A Palo Alto Networks Device
Tomy Tomy
No ratings yet
BTS Startup Scenarios For FSMr3
Document14 pages
BTS Startup Scenarios For FSMr3
Quốc Minh Nguyễn
No ratings yet
Manual Testing Session-4: WWW - Pavanonlinetrainings
Document8 pages
Manual Testing Session-4: WWW - Pavanonlinetrainings
Abhishaker karri
No ratings yet
2201 3B
Document10 pages
2201 3B
dorinkarakoglu
No ratings yet
Smart Attendance Monitoring System Using Raspberry PI
Document3 pages
Smart Attendance Monitoring System Using Raspberry PI
Abhijeet Nangare
No ratings yet
E Office
Document22 pages
E Office
sumandas1963
No ratings yet
Micro Procesor Text
Document61 pages
Micro Procesor Text
Vinayak Saini
No ratings yet
Easy Explanation of Synology Router Setup Guide
Document1 page
Easy Explanation of Synology Router Setup Guide
synologylog router
No ratings yet
13.4.5 Packet Tracer - Troubleshoot WLAN Issues
Document4 pages
13.4.5 Packet Tracer - Troubleshoot WLAN Issues
Ivan Herrera Corona
No ratings yet
SD ProductDocumentation 021222 1513
Document2 pages
SD ProductDocumentation 021222 1513
Tibco Group
No ratings yet
Customer Tips: Xerox Network Scanning - Twain Configuration For The Workcentre 7328/7335/7345
Document12 pages
Customer Tips: Xerox Network Scanning - Twain Configuration For The Workcentre 7328/7335/7345
Jirawat Konanon
No ratings yet
Fortios v5.6.11 Release Notes
Document33 pages
Fortios v5.6.11 Release Notes
Ti BrokerSLZ
No ratings yet
Understanding Domain Name and Webhosting
Document3 pages
Understanding Domain Name and Webhosting
Learner's License
No ratings yet
Cloud Cryptography User End Encryption
Document4 pages
Cloud Cryptography User End Encryption
Yashil Patel
No ratings yet
Trifork QTI
Document4 pages
Trifork QTI
Nabeel Merchant
No ratings yet
Pad 116
Document34 pages
Pad 116
Michael T. Bello
No ratings yet
Project Documentation
Document29 pages
Project Documentation
Shaikh Mosin
No ratings yet
Data Warehousing and Data Mining - Thara - M.Tech Cse
Document11 pages
Data Warehousing and Data Mining - Thara - M.Tech Cse
thilaga
No ratings yet
Practical No.11:: Become Familiar With Looping in Assembly Programming
Document5 pages
Practical No.11:: Become Familiar With Looping in Assembly Programming
Vishal Kumar
No ratings yet
Sample Motivation Letter Master-Harvard University - en
Document2 pages
Sample Motivation Letter Master-Harvard University - en
Даниил
100% (1)
Inovasi Digital Di Industri Smart Farming: Konsep Dan Implementasi
Document7 pages
Inovasi Digital Di Industri Smart Farming: Konsep Dan Implementasi
jamban
No ratings yet
DNA Encoding and Channel Shuffling For Secured Encryption of Audio Data
Document24 pages
DNA Encoding and Channel Shuffling For Secured Encryption of Audio Data
Aman Kumar Trivedi
No ratings yet
RMR RITC II LTE Curs v0.4 06ian2015
Document36 pages
RMR RITC II LTE Curs v0.4 06ian2015
dan.artimon2791
No ratings yet
Ex. 3 Image Acquisition With Guymager - Kali LinuX
Document7 pages
Ex. 3 Image Acquisition With Guymager - Kali LinuX
Wonderweiss Margela
No ratings yet
O' & A' Level: Programming & Problem Solving Through
Document9 pages
O' & A' Level: Programming & Problem Solving Through
shrey patel
No ratings yet
Stonesoft Administrators Guide v5-6
Document1,398 pages
Stonesoft Administrators Guide v5-6
Bernardi Lucien
No ratings yet