TD Concepts
Training
By
Umaa S Krishnan
by Umaa S Krishnan 1
What is Teradata?
Teradata is an RDBMS designed for enterprise data
warehousing.
Massively Parallel Processing (MPP) system
Parallelism throughout the platform
“Shared Nothing” architecture
Linear Scalability
Shared Nothing Software
• Delivers linear scalability
– Maximizes utilization of SMP resources
– To any size configuration
– Allows flexible configurations
– Incremental upgrades
• Linear with a slope of 1 at any size
[Figure: Node1 to Node4, each running VPROCs (AMPs), linked by the BYNET; labels: Work, Users, Data]
[Figure: Retail customer, I/O utilization on 228 nodes, 8/1/2002 to 9/5/2002]
[Figure: PEs and AMPs (VPROCs) connected by BYNET 0 and BYNET 1]
PARSING ENGINE (PE)
The Parsing Engine does three things every time
you run an SQL statement.
• Checks the syntax of your SQL
• Checks the security to make sure you have
access to the table
• Comes up with a plan for the AMPs to follow
PARSING ENGINE – PE (CONTD)
• The PE creates a PLAN that tells the AMPs exactly
what to do in order to get the data.
• The PE knows how many AMPs are in the system,
how many rows are in the table, and the best
way to get to the data.
• The Query Optimizer is part of the Parsing Engine
• The Parsing Engine verifies SQL requests for
proper syntax, checks security, maintains up to
120 individual user sessions, and breaks down
the SQL requests into steps.
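The three PE steps (syntax check, security check, plan) and the 120-session limit can be sketched as a toy model. This is only an illustration: the access-rights table, the naive "SELECT ... FROM table" syntax rule, and the function names are all assumptions, not Teradata internals.

```python
import math

# Hypothetical toy catalog: table -> users allowed to read it.
ACCESS_RIGHTS = {"Order_Table": {"umaa"}}
SESSIONS_PER_PE = 120  # session limit quoted in the slide


def parse_request(user, sql, amp_count):
    """Toy version of the three PE steps: syntax, security, plan."""
    words = sql.strip().rstrip(";").split()
    upper = [w.upper() for w in words]
    # 1) Syntax check (deliberately naive: SELECT ... FROM <table>).
    if len(words) < 4 or upper[0] != "SELECT" or "FROM" not in upper:
        raise ValueError("syntax error")
    from_pos = upper.index("FROM")
    if from_pos + 1 >= len(words):
        raise ValueError("syntax error")
    table = words[from_pos + 1]
    # 2) Security check: does this user have access to the table?
    if user not in ACCESS_RIGHTS.get(table, set()):
        raise PermissionError(f"{user} has no access to {table}")
    # 3) Plan: one step per AMP (an all-AMP scan).
    return [f"AMP {amp}: scan {table}" for amp in range(amp_count)]


def pes_needed(sessions):
    """Minimum number of PEs for a given number of concurrent sessions."""
    return math.ceil(sessions / SESSIONS_PER_PE)
```

Calling parse_request("umaa", "SELECT * FROM Order_Table;", 4) yields one plan step per AMP, while an unknown user or a malformed statement is rejected before any plan is built.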
PARSING ENGINE
[Figure: a LOGON request arriving at the PE]
Access Module Processors (AMPs)
• The philosophy of parallel processing revolves around the AMPs.
Teradata takes each table and spreads its rows evenly among all
the AMPs. When data is requested from a table, each AMP
retrieves the rows of that table that it holds on its own disk.
• If the data is spread evenly, all AMPs retrieve their rows
simultaneously and finish at about the same time.
• That is what we mean when we say Teradata was born to be
parallel.
• The AMPs also perform output conversion, while the PE
performs input conversion.
• The AMPs do the physical work associated with retrieving an
answer set.
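The idea that every AMP scans only its own rows, at the same time as the others, can be sketched with a thread pool. This is a simplified model: the round-robin placement and the names distribute, amp_scan and parallel_select are assumptions for illustration (Teradata places rows by hashing the primary index, not round-robin).

```python
from concurrent.futures import ThreadPoolExecutor

AMP_COUNT = 4  # assumed system size


def distribute(rows, amp_count=AMP_COUNT):
    """Spread rows round-robin so every AMP holds an even share."""
    amps = [[] for _ in range(amp_count)]
    for i, row in enumerate(rows):
        amps[i % amp_count].append(row)
    return amps


def amp_scan(local_rows, predicate):
    """Work done by one AMP: scan only the rows it holds itself."""
    return [row for row in local_rows if predicate(row)]


def parallel_select(amps, predicate):
    """All AMPs scan concurrently; the partial results are then merged."""
    with ThreadPoolExecutor(max_workers=len(amps)) as pool:
        parts = list(pool.map(lambda rows: amp_scan(rows, predicate), amps))
    return [row for part in parts for row in part]
```

With 100 rows on 4 AMPs, each AMP scans 25 rows instead of one process scanning all 100, which is the point of spreading the data evenly.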
Data Management - Bottom Line
• No reorgs
– Don’t even have a reorg utility
• No index rebuilds
• No re-partitioning
• No detailed space management
• Easy database and table definition
• Minimum ongoing maintenance
– All performed automatically
FALLBACK CLUSTER
• IF 1 AMP IN A CLUSTER FAILS, NO PROBLEM
• IF MORE THAN 1 AMP IN A CLUSTER FAILS,
THEN A QUERY WILL STILL BE PROCESSED ONLY IF
THE FAILED AMPS ARE NOT USED IN THAT QUERY
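Why a single AMP failure is harmless can be seen from where the copies of each row live. The sketch below only tracks row placement; it says nothing about how the database itself reacts to failures. The rule "fallback copy goes to the next AMP in the cluster" and the function names are assumptions made for illustration.

```python
def place_rows(row_ids, cluster):
    """Assign each row a primary AMP and a fallback AMP in the same cluster."""
    placement = {}
    for row_id in row_ids:
        primary = cluster[row_id % len(cluster)]
        # Assumed rule: the fallback copy goes to the next AMP in the cluster.
        fallback = cluster[(row_id + 1) % len(cluster)]
        placement[row_id] = (primary, fallback)
    return placement


def unavailable_rows(placement, failed_amps):
    """Rows whose primary and fallback copies both sit on failed AMPs."""
    failed = set(failed_amps)
    return sorted(row for row, (primary, fallback) in placement.items()
                  if primary in failed and fallback in failed)
```

For a cluster [1, 2, 3, 4] and rows 0 to 7, losing any one AMP leaves every row readable from its other copy. Losing AMP 1 and AMP 4 together makes rows 3 and 7 unreachable under this placement rule, so only queries that avoid those rows could still return complete answers.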
[Figure: four AMPs in one fallback cluster]
FALLBACK CLUSTER REVIEW
• IF AMP 1 AND AMP 4 FAIL, GIVE 2 EXAMPLES
OF A QUERY THAT WILL FAIL
• GIVE 2 EXAMPLES OF A QUERY THAT WILL
RETURN VALID ROWS
[Figure: four AMPs in one fallback cluster]
CLIQUE
• A group of nodes connected by hardware is
a clique
• It provides fault tolerance in the event of a
node failure
• The AMP processes of the failed node
migrate to the remaining nodes in the clique
NODE IN A CLIQUE FAILS
[Figure: NODE 1, NODE 2, NODE 3 and NODE 4 in one clique]
All AMP processes of the failed node migrate to the surviving nodes
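The migration can be pictured as re-assigning the failed node's AMPs to the nodes that are still up. This is a minimal sketch: the round-robin hand-out and the name migrate_amps are assumptions, not the actual placement policy.

```python
def migrate_amps(node_amps, failed_node):
    """Return a new node -> AMPs mapping after one node in the clique fails."""
    survivors = sorted(node for node in node_amps if node != failed_node)
    if not survivors:
        raise RuntimeError("no surviving node in the clique")
    result = {node: list(node_amps[node]) for node in survivors}
    # Assumed policy: hand the orphaned AMPs out round-robin.
    for i, amp in enumerate(node_amps[failed_node]):
        result[survivors[i % len(survivors)]].append(amp)
    return result
```

In a four-node clique with three AMPs per node, a single node failure leaves each surviving node running four AMPs, so no AMP (and none of its data) is lost, only spare capacity.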
Exercise 1
1) Name two Operating Systems that the Teradata Database runs on
----- ---------
2) Which of the following represents a trillion bytes (1 TB) of data ?
a) 10⁶ b) 10⁹ c) 10¹² d) 10⁸
TERADATA – PRIMARY INDEX
• DATA IS DISTRIBUTED BASED ON THE PRIMARY INDEX
• Teradata's PE examines the Primary Index value of the row.
Teradata takes that Primary Index value and runs it
through a Hashing Algorithm. The output of the
Hashing Algorithm (i.e., a formula) is a 32-bit Row Hash.
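To make the "value in, 32-bit Row Hash out" step concrete, here is a minimal sketch. It uses CRC32 purely as a stand-in because it also produces a 32-bit result; Teradata's actual hashing algorithm is different, and the function name row_hash is an assumption.

```python
import zlib


def row_hash(primary_index_value):
    """Map a primary index value to a 32-bit integer.

    zlib.crc32 is only a stand-in; it is NOT Teradata's hashing algorithm.
    """
    data = str(primary_index_value).encode("utf-8")
    return zlib.crc32(data) & 0xFFFFFFFF
```

The property that matters is determinism: the same primary index value always yields the same hash, so a row can be found again without searching every AMP.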
PRIMARY INDEX
1) When a query is issued, the PE looks up the hash map and decides which AMPs participate in
the query
2) The hash value of a primary index value always remains the same.
3) So if EMP has a primary index on NAME, and the hash value for NAME = JOHN maps to 3 in the
hash map, then the algorithm will always return AMP 3
[Figure: the PE consulting the HASH MAP to pick one of four AMPs]
**** WHEN YOU ADD AMPS THEN HASH MAP WILL CHANGE ****
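The lookup and the effect of adding AMPs can be sketched as follows. The bucket count, the round-robin bucket assignment, the CRC32 stand-in hash, and all function names are assumptions chosen to keep the example small; they are not the real hash map layout.

```python
import zlib

BUCKETS = 16  # assumed and tiny on purpose; real hash maps are far larger


def bucket_of(value):
    """Stand-in hash (CRC32, not Teradata's algorithm) reduced to a bucket."""
    return zlib.crc32(str(value).encode("utf-8")) % BUCKETS


def build_hash_map(amp_count):
    """Assign every bucket to an AMP, round-robin."""
    return [bucket % amp_count for bucket in range(BUCKETS)]


def amp_for(value, hash_map):
    """Find the owning AMP: hash the value, then look the bucket up in the map."""
    return hash_map[bucket_of(value)]
```

The hash of JOHN never changes, so with a given map JOHN always resolves to the same AMP. Building the map for 5 AMPs instead of 4 changes some bucket assignments, which is why rows have to be redistributed when AMPs are added.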
PRIMARY INDEX
• Does not have to be unique
• Is the fastest way of retrieving data
• Choice of primary index is extremely important
• Criteria for Primary Index
- Relatively Non Volatile
- Even Distribution amongst AMPS
- Is frequently used as a selection criterion
- Must be a Date, Integer, char or varchar
CREATE TABLE SYNTAX
CREATE TABLE Order_Table
(Order_No Integer Not Null,
Cust_No Integer Not Null,
Order_Date Date,
Order_Total Decimal(10,2))
Primary Index (Order_No);
ROW DISTRIBUTION
[Figure: rows distributed across four AMPs by name. Values shown: DAVIS*, GATES*, SMITH, JOBS, DAVIS*, GATES*, KRISHNAN, MARKS, WOODS, BATES, SACHS (* marks a repeated value)]
ROW DISTRIBUTION
[Figure: rows distributed across four AMPs by gender. Values shown: MALE* and FEMALE*, each repeated three times (* marks a repeated value)]
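The difference between a name-like column and a two-valued column such as gender can be measured by counting rows per AMP. This is a rough sketch: CRC32 stands in for the real hashing algorithm, and rows_per_amp and skew are hypothetical helper names.

```python
import zlib
from collections import Counter


def rows_per_amp(values, amp_count=4):
    """Count how many rows each AMP receives for a candidate index column."""
    counts = Counter({amp: 0 for amp in range(amp_count)})
    for value in values:
        # CRC32 is a stand-in hash, not Teradata's algorithm.
        counts[zlib.crc32(str(value).encode("utf-8")) % amp_count] += 1
    return [counts[amp] for amp in range(amp_count)]


def skew(counts):
    """Largest AMP load divided by the average load (1.0 is perfectly even)."""
    return max(counts) / (sum(counts) / len(counts))
```

A column with only two distinct values can reach at most two AMPs, so on a 4-AMP system at least two AMPs sit idle and the skew ratio is 2.0 or worse. A column with many distinct values, such as an order number, will typically come out much closer to 1.0.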
PRIMARY INDEX
• Every row is identified by a Row ID made of the primary index
hash value + a uniqueness value
• UPI (Unique Primary Index) accesses 1 AMP
• UPI returns 0 or 1 row
• NUPI (Non Unique Primary Index) accesses 1
AMP
• NUPI returns 0 to many rows
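How the hash value and the uniqueness value work together can be sketched with a toy single-AMP store. The class ToyAmp, its methods, and the CRC32 stand-in hash are assumptions for illustration only.

```python
import zlib
from collections import defaultdict


class ToyAmp:
    """Stores rows under (row hash, uniqueness value) keys."""

    def __init__(self, unique_index):
        self.unique_index = unique_index   # True models a UPI, False a NUPI
        self.rows = {}                     # (row_hash, uniqueness) -> (value, row)
        self.next_uniq = defaultdict(int)  # last uniqueness value per row hash

    @staticmethod
    def row_hash(value):
        # CRC32 is a stand-in, not Teradata's hashing algorithm.
        return zlib.crc32(str(value).encode("utf-8")) & 0xFFFFFFFF

    def insert(self, pi_value, row):
        """Store a row; a UPI refuses a second row with the same index value."""
        if self.unique_index and self.lookup(pi_value):
            raise ValueError("duplicate value for unique primary index")
        h = self.row_hash(pi_value)
        self.next_uniq[h] += 1
        self.rows[(h, self.next_uniq[h])] = (pi_value, row)

    def lookup(self, pi_value):
        """Return every stored row whose primary index value matches."""
        h = self.row_hash(pi_value)
        return [row for (rh, _), (value, row) in self.rows.items()
                if rh == h and value == pi_value]
```

With a NUPI, two DAVIS rows share one row hash and are told apart by uniqueness values 1 and 2, so a lookup returns both. With a UPI, the second insert is refused, so a lookup returns at most one row.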
CREATE TABLE SYNTAX
CREATE TABLE Order_Table
(Order_No Integer Not Null,
Cust_No Integer Not Null,
Order_Date Date,
Order_Total Decimal(10,2))
Primary Index (Order_No)
PARTITION BY Order_Date;
PARTITION ELIMINATION CAN AVOID FULL TABLE SCANS
AMP1                    AMP2
EMP  DEPT  NAME         EMP  DEPT  NAME
99   10    TOM          44   10    JERRY
75   10    MIKE         32   10    MIKE
56   10    UMAA         12   10    ANITA
67   20    JAYRA        45   20    TOM
54   20    ANITA        16   20    SALLY
30   20    SASHA        22   20    SASHA
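Using the rows from the table above, the saving from partition elimination can be counted directly. This is a sketch under the assumption that each AMP keeps its rows grouped by DEPT; the names AMP_ROWS, PARTITIONED and select_dept are hypothetical.

```python
from collections import defaultdict

# Rows as (EMP, DEPT, NAME), copied from the slide.
AMP_ROWS = {
    "AMP1": [(99, 10, "TOM"), (75, 10, "MIKE"), (56, 10, "UMAA"),
             (67, 20, "JAYRA"), (54, 20, "ANITA"), (30, 20, "SASHA")],
    "AMP2": [(44, 10, "JERRY"), (32, 10, "MIKE"), (12, 10, "ANITA"),
             (45, 20, "TOM"), (16, 20, "SALLY"), (22, 20, "SASHA")],
}


def partition(rows):
    """Group one AMP's rows by DEPT, done once when the rows are stored."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[1]].append(row)
    return dict(parts)


# Each AMP keeps its rows grouped by partition (assumed storage layout).
PARTITIONED = {amp: partition(rows) for amp, rows in AMP_ROWS.items()}


def select_dept(partitioned, dept):
    """Return matching rows and how many rows were actually read."""
    found, rows_read = [], 0
    for parts in partitioned.values():
        part = parts.get(dept, [])  # other partitions are never touched
        rows_read += len(part)
        found.extend(part)
    return found, rows_read
```

A query for DEPT = 20 reads 6 rows (3 per AMP) instead of all 12, because the DEPT 10 partition on each AMP is skipped entirely.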
SET AND MULTISET TABLE
• A SET table, declared in CREATE TABLE, ensures there are no
duplicate rows
• A MULTISET table allows duplicate rows
• Syntax
Create set table emp (emp_id int,
name varchar(10));
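The difference in behavior can be sketched with a toy table class. ToyTable is hypothetical, and it signals a rejected duplicate with a return value; how the database itself reports the rejection is not modeled here.

```python
class ToyTable:
    """Toy table that is either SET (no duplicate rows) or MULTISET."""

    def __init__(self, kind="SET"):
        if kind not in ("SET", "MULTISET"):
            raise ValueError("kind must be SET or MULTISET")
        self.kind = kind
        self.rows = []

    def insert(self, row):
        """Insert a row; a SET table refuses an exact duplicate."""
        if self.kind == "SET" and row in self.rows:
            return False  # duplicate row rejected
        self.rows.append(row)
        return True
```

Inserting (1, "TOM") twice leaves one row in a SET table and two rows in a MULTISET table. The SET table pays for this guarantee with a duplicate check on every insert.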