BISP Teradata Basics

Teradata Basics
Manohar Krishna
1. Architecture
2. PI, SI, PPI
3. Data protection
4. Spaces and Tables
5. Other indexes
Teradata Database System
A Teradata Database system contains one or more nodes. A node is a term for
a processing unit under the control of a single operating system. The node is
where the processing occurs for the Teradata Database. There are two types of
Teradata Database systems:
Symmetric multiprocessing (SMP) - An SMP Teradata Database has a single
node that contains multiple CPUs sharing a memory pool.
Massively parallel processing (MPP) - Multiple SMP nodes working together
comprise a larger, MPP implementation of a Teradata Database. The nodes are
connected using the BYNET, which allows multiple virtual processors on
multiple nodes to communicate with each other.
Node Components
A node is the basic building block of a Teradata Database system, and contains
a large number of hardware and software components. A conceptual
diagram of a node and its major components is shown below. Hardware
components are shown on the left side of the node and software
components are shown on the right side.
Client Connections
Users can access data in the Teradata Database through an application on
both channel-attached and network-attached clients. Additionally, the node
itself can act as a client. Teradata client software is installed on each client
(channel-attached, network-attached, or node) and communicates with
RDBMS software on the node. You may hear either type of client referred to
by the term "host," though this term is not typically used in documentation
or product literature.
Trusted Parallel Application (TPA)
A Trusted Parallel Application (TPA) uses PDE to implement virtual
processors (vprocs). The Teradata Database is classified as a TPA. The four
components of the Teradata Database TPA are:
•AMP (Top Right)
•PE (Bottom Right)
•Channel Driver (Top Left)
•Teradata Gateway (Bottom Left)
BYNET Hardware and Software
The BYNET hardware and software handle the communication between the
vprocs and the nodes.
• Hardware: The nodes of an MPP system are connected with the BYNET
hardware, consisting of BYNET boards and cables.
• Software: The BYNET driver (software) is installed on every node. This BYNET
driver is an interface between the PDE software and the BYNET hardware.
SMP systems do not contain BYNET hardware. The PDE and BYNET software
emulate BYNET activity in a single-node environment.
BYNET Unique Features
The BYNET has several unique features:
• Scalable: As you add more nodes to the system, the overall network bandwidth
scales linearly. This linear scalability means you can increase system size without
performance penalty -- and sometimes even increase performance.
• High performance: An MPP system typically has two BYNET networks
(BYNET 0 and BYNET 1). Because both networks in a system are active, the system
benefits from having full use of the aggregate bandwidth of both networks.
• Fault tolerant: Each network has multiple connection paths. If the BYNET
detects an unusable path in either network, it will automatically reconfigure that
network so all messages avoid the unusable path. Additionally, in the rare case that
BYNET 0 cannot be reconfigured, hardware on BYNET 0 is disabled and messages are
re-routed to BYNET 1.
• Load balanced: Traffic is automatically and dynamically distributed between
both BYNETs.
Major Architecture Components & Functionalities
[Figure: an SQL request enters the Parsing Engine (Parser, Optimizer, Dispatcher), travels through the Message Passing Layer (PDE and BYNET) to the AMPs, and the answer set response is returned to the client.]
The Parsing Engine is responsible for:
• Managing individual sessions
• Parsing and optimizing your SQL requests
• Dispatching the optimized plan to the AMPs
• Sending the answer set response back to the requesting client
The Message Passing Layer is responsible for:
• Carrying messages between the AMPs and PEs
• Point-to-Point, Multi-Cast, and Broadcast communications
• Merging answer sets back to the PE
• Making Teradata parallelism possible
The AMPs are responsible for:
• Accessing storage using Teradata's file system
• Lock management
• Sorting rows
• Aggregating columns
• Join processing
• Output conversion and formatting
• Creating the answer set for the client
• Disk space management
• Accounting
• Special utility protocols
• Recovery processing
Teradata Parallelism
Teradata Competitive Advantages
• High-performance parallel processing
• Enormous capacity of processing
– Billions of rows
– Terabytes of data
• Network and mainframe connectivity
• Scalability (Manageable growth via modularity)
• High availability (Fault tolerance at all levels of hardware
and software)
Teradata –A Brief History
Teradata –Functional overview
TDP – Teradata Director Program: manages and balances the sessions at mainframe clients.
MTDP – Micro Teradata Director Program: manages sessions at network-attached clients.
MOSI – Micro Operating System Interface: provides transparency of the OS to the Teradata server.
Teradata Database…
• A Teradata database is a defined logical repository for:
• Tables
• Views
• Macros
• Triggers
• Stored Procedures
• Data Dictionary:
• The Data Dictionary is a set of relational tables that contains
information about the RDBMS and database objects within it. It
is like the metadata or "data about the data" for a Teradata
Database (except that it does not contain business rules, like
true metadata does). The Data Dictionary resides in Database
DBC. Some of the items it tracks are: Definitions, Owners,
Access and Disk Space
User: A Special Kind of Database
A user may be a collection of tables, views, macros, triggers, and stored
procedures. A user is a specific type of database, and has attributes in
addition to the ones listed above:
– User ID
– Password
So, a user is the same as a database except that a user can actually log on to
the database. To log on to a Teradata Database, you need to specify a user
(which is simply a database with a password).
Multiple Tables on Multiple AMP’s…
[Figure: rows of Table A and Table B spread across four AMPs]
• The rows of every table are distributed among all AMPs.
• Each AMP is responsible for a subset of the rows of each table.
• Ideally, each table will be evenly distributed among all AMPs.
• Evenly distributed tables result in evenly distributed workloads.
• The uniformity of distribution of the rows of a table depends on the choice of the Primary
Index.
Creating Primary Index
• A Primary Index is defined at table creation.
• It may consist of a single column, or a combination of columns.

UPI: If the index choice of column(s) is unique, we call this a UPI (Unique Primary Index). A UPI choice will result in even distribution of the rows of the table across all AMPs.

CREATE TABLE sample_1
(col_a INTEGER
,col_b INTEGER
,col_c INTEGER)
UNIQUE PRIMARY INDEX (col_b);

NUPI: If the index choice of column(s) isn't unique, we call this a NUPI (Non-Unique Primary Index). A NUPI choice will result in a distribution of the rows of the table whose evenness is proportional to the degree of uniqueness of the index.

CREATE TABLE sample_2
(col_x INTEGER
,col_y INTEGER
,col_z INTEGER)
PRIMARY INDEX (col_x);

NoPI: A NoPI choice will result in distribution of the data between AMPs based on a random generator code. A common use is for staging or intermediate tables used with load operations. NoPI is available from Teradata 13.0.

CREATE TABLE sample_3
(col_x INTEGER
,col_y INTEGER
,col_z INTEGER)
NO PRIMARY INDEX;

• Accessing the row by its Primary Index value is:
– always a one-AMP operation
– the most efficient way to access a row
Note: Changing the choice of Primary Index requires dropping and recreating the table.
Accessing Via a Unique Primary Index
Accessing Via a Non Unique Primary Index
Row Distribution Using a UPI – Case 1
Order table (Order_Number is the PK and the UPI):
Order_Number  Customer_Number  Order_Date  Order_Status
7325          2                4/13        O
7324          3                4/13        O
7415          1                4/13        C
7103          1                4/10        O
7225          2                4/15        C
7384          1                4/12        C
7402          3                4/16        C
7188          1                4/13        C
7202          2                4/09        C
• Often, but not always, the PK column(s) will be used as a UPI.
• PI values for Order_Number are known to be unique (it's a PK).
• Teradata will distribute different index values evenly across all AMPs.
• Resulting row distribution among AMPs is very uniform.
• Assures maximum efficiency for parallel operations.
Resulting distribution:
AMP AMP AMP AMP
o_# c_# o_dt o_st o_# c_# o_dt o_st o_# c_# o_dt o_st o_# c_# o_dt o_st
7202 2 4/09 C 7325 2 4/13 O 7188 1 4/13 C 7324 3 4/13 O
7415 1 4/13 C 7103 1 4/10 O 7225 2 4/15 C 7384 1 4/12 C
7402 3 4/16 C
Row Distribution Using a NUPI – Case 2
Order table (Order_Number is the PK, Customer_Number is the NUPI):
Order_Number  Customer_Number  Order_Date  Order_Status
7325          2                4/13        O
7324          3                4/13        O
7415          1                4/13        C
7103          1                4/10        O
7225          2                4/15        C
7384          1                4/12        C
7402          3                4/16        C
7188          1                4/13        C
7202          2                4/09        C
Notes:
• Customer_Number may be the preferred access column for the ORDER table, thus a good index candidate.
• Values for Customer_Number are somewhat non-unique.
• Choice of Customer_Number is therefore a NUPI.
• Rows with the same PI value distribute to the same AMP.
• Row distribution is less uniform, or skewed.
Resulting distribution (three of the four AMPs hold rows; the fourth receives none):
AMP                 AMP                 AMP
o_#  c_# o_dt o_st  o_#  c_# o_dt o_st  o_#  c_# o_dt o_st
7325 2   4/13 O     7384 1   4/12 C     7402 3   4/16 C
7202 2   4/09 C     7103 1   4/10 O     7324 3   4/13 O
7225 2   4/15 C     7415 1   4/13 C
                    7188 1   4/13 C
Row Distribution Using a Highly Non-Unique Primary Index (NUPI) – Case 3
Order table (Order_Number is the PK, Order_Status is the NUPI):
Order_Number  Customer_Number  Order_Date  Order_Status
7325          2                4/13        O
7324          3                4/13        O
7415          1                4/13        C
7103          1                4/10        O
7225          2                4/15        C
7384          1                4/12        C
7402          3                4/16        C
7188          1                4/13        C
7202          2                4/09        C
• Values for Order_Status are "highly" non-unique.
• Choice of the Order_Status column is a NUPI.
• Only two values exist, so only two AMPs will ever be used for this table.
• Table will not perform well in parallel operations.
• Highly non-unique columns are generally poor PI choices.
• The degree of uniqueness is critical to efficiency.
Resulting distribution (two of the four AMPs hold rows):
AMP                 AMP
o_#  c_# o_dt o_st  o_#  c_# o_dt o_st
7402 3   4/16 C     7103 1   4/10 O
7202 2   4/09 C     7324 3   4/13 O
7225 2   4/15 C     7325 2   4/13 O
7415 1   4/13 C
7188 1   4/13 C
7384 1   4/12 C
Differences between Keys and Indexes
Primary Key | Primary Index
A relational modeling convention used in a logical data model. | A Teradata Database mechanism used in a physical database design.
Uniquely identify a row (Primary Key). | Used for row distribution (Primary Index).
Establish relationships between tables (Foreign Key). | Used for row access (Primary Index and Secondary Index).
Values should not be changed | Values may be changed (Delete + Insert)
Cannot be NULL | Can be NULL
Must be unique | May be unique or non-unique
No limit on number of columns | 64-column limit

Hashing Primary Index values
The Hash map
Distributing Rows to AMPs…
Index value(s) → hashing algorithm → Row Hash → DSW or Hash Bucket # → Hash Map → AMP #
• Hashing algorithm: designed to ensure even distribution of unique values across all AMPs. Different hashing algorithms are used for different international character sets.
• Row Hash: the 32-bit result of applying a hashing algorithm to an index value.
• DSW or Hash Bucket #: represented by the high-order 16 bits of the Row Hash.
• Hash Map: uniquely configured for each system. It is an array of 65,536 entries (buckets) which associates bucket numbers with specific AMPs. Two systems with the same number of AMPs will have the same Hash Map. Changing the number of AMPs in a system requires a change to the Hash Map.
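The chain from index value to AMP can be inspected with Teradata's built-in hash functions HASHROW, HASHBUCKET, and HASHAMP. The query below is a minimal sketch: the literal 7325 is only an illustrative index value, and the numbers returned depend on the hash map of the system it runs on, so no output is shown.

```sql
-- Walk one Primary Index value through the hashing steps described above.
-- The literal 7325 is an illustrative value, not taken from a real table.
SELECT
    HASHROW(7325)                      AS row_hash     -- 32-bit Row Hash
   ,HASHBUCKET(HASHROW(7325))          AS hash_bucket  -- DSW / Hash Bucket number
   ,HASHAMP(HASHBUCKET(HASHROW(7325))) AS amp_no;      -- AMP that owns that bucket
```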
Duplicate Hash Values
It is possible for the hashing algorithm to end up with the same row hash value for
two different rows. There are two ways this could happen:
– Duplicate NUPI values: If a Non-Unique Primary Index is used, duplicate
NUPI values will produce the same row hash value.
– Hash synonym: Also called a hash collision, this occurs when the hashing
algorithm calculates an identical row hash value for two different Primary
Index values. Hash synonyms are rare. When using a Unique Primary Index,
you will still get uniform data distribution.
To differentiate each row in a table, every row is assigned a unique Row ID. The Row ID is the
combination of the row hash value and a uniqueness value.
Row ID = Row Hash Value + Uniqueness Value
The uniqueness value is used to differentiate between rows whose Primary Index
values generate identical row hash values. In most cases, only the row hash value
portion of the Row ID is needed to locate the row.
RowID
Row ID = Row Hash (32 bits) + Uniqueness ID (32 bits)
Each stored row has a Row ID as a prefix. Rows are logically maintained in Row ID sequence.
Row ID                      Row Data
Row Hash   Uniqueness ID    Emp_No  Last_Name  First_Name
3B11 5032  0000 0001        1018    Reynolds   Jane
3B11 5032  0000 0002        1020    Reynolds   Evan
3B11 5032  0000 0003        1031    Reynolds   Jason
3B11 5033  0000 0001        1014    Jacobs     Paul
3B11 5034  0000 0001        1012    Chevas     Jose
3B11 5034  0000 0002        1021    Carnet     Jean
:          :                :       :          :
Using Hash Functions to View Distribution
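One way to view distribution is to nest the hash functions over a candidate index column and count rows per AMP. This is a sketch that assumes the sample_2 table and its col_x column from the earlier Primary Index examples; any table and column can be substituted.

```sql
-- Count how many rows hash to each AMP when col_x is the Primary Index column.
-- AMPs that receive no rows do not appear in the result.
SELECT
    HASHAMP(HASHBUCKET(HASHROW(col_x))) AS amp_no
   ,COUNT(*)                            AS row_count
FROM sample_2
GROUP BY 1
ORDER BY 1;
```

Roughly equal row counts suggest an even distribution, while a few AMPs holding most of the rows indicate skew of the kind shown in Cases 2 and 3.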
Duplicate Rows
A duplicate row is a row in a table whose column values are identical to another
row in the same table. In other words, the entire row is the same, not just the
index.
Duplicate rows are allowed in the Teradata Database. When you create a
table, the following definitions determine whether or not it can contain
duplicate rows:
– MULTISET tables: May contain duplicate rows. The Teradata Database will not check
for duplicate rows.
– SET tables: The Teradata default. The Teradata Database checks for and does not permit
duplicate rows. If a SET table is created with a Unique Primary Index, the check for
duplicate rows is replaced by a check for duplicate index values.
Duplicate Rows…
A duplicate row is a row of a table whose column values are all identical to another row in the same table.
col_a  col_b  col_c
20     50     A
25     50     A     (duplicate rows)
25     50     A
• Because a PK uniquely identifies each row, ideally a relational table should not have duplicate rows!
• The ANSI standard, however, permits duplicate rows for specialized situations, thus Teradata permits them as well.
• You may select whether your table will or will not allow them.
The Teradata default:
CREATE SET TABLE table_A
:
Checks for* and disallows duplicate rows.
The ANSI default:
CREATE MULTISET TABLE table_B
:
Doesn't check for and allows duplicate rows.
* Note: If a UPI is selected on a SET table, the duplicate row check is replaced by a check for duplicate index values.
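The difference between the two table kinds can be seen by inserting the same row twice. The table names demo_set and demo_multiset are hypothetical, and the sketch describes the expected behaviour rather than captured output.

```sql
-- SET table: the duplicate row check applies.
CREATE SET TABLE demo_set
    (col_a INTEGER
    ,col_b INTEGER
    ,col_c CHAR(1))
PRIMARY INDEX (col_a);

INSERT INTO demo_set VALUES (25, 50, 'A');
INSERT INTO demo_set VALUES (25, 50, 'A');  -- expected to be rejected as a duplicate row

-- MULTISET table: no duplicate row check is performed.
CREATE MULTISET TABLE demo_multiset
    (col_a INTEGER
    ,col_b INTEGER
    ,col_c CHAR(1))
PRIMARY INDEX (col_a);

INSERT INTO demo_multiset VALUES (25, 50, 'A');
INSERT INTO demo_multiset VALUES (25, 50, 'A');  -- expected to succeed; both rows are stored
```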
Secondary Indexes
There are 3 general ways to access a table:
– Primary Index access (one AMP access)
– Secondary Index access (two or all AMP access)
– Full Table Scan (all AMP access)

• A Secondary Index provides an alternate path to the rows of a table.
• A Secondary Index can be used to impose uniqueness on a column or set of columns.
• A table can have from 0 to 32 secondary indexes. Each index can have up to 64 columns.
• Secondary Indexes:
– Do not affect table distribution.
– Add overhead, both in terms of disk space and maintenance.
– May be added or dropped dynamically as needed.
– Are chosen to improve table performance.
– Can be unique or non-unique.
– Can be NULL and updatable.

Choosing a Secondary Index
A Secondary Index may be defined ...
– at table creation (CREATE TABLE)
– following table creation (CREATE INDEX)

USI (Unique Secondary Index): If the index choice of column(s) is unique, it is called a USI. Accessing a row via a USI is a 2 AMP operation.
CREATE UNIQUE INDEX (Employee_Number) ON Employee;

NUSI (Non-Unique Secondary Index): If the index choice of column(s) is non-unique, it is called a NUSI. Accessing row(s) via a NUSI is an all AMP operation.
CREATE INDEX (Last_Name) ON Employee;

• Secondary Indexes cause an internal sub-table to be built.
Unique Secondary index access
Non Unique Secondary index access
How Secondary Indexes Are Stored
Secondary indexes are stored in index subtables. The subtables for USIs and NUSIs are
distributed differently:
1. USI: The Unique Secondary Indexes are hash distributed separately from the data
rows, based on their USI value. (As you remember, the base table rows are distributed
based on the Primary Index value). The sub table row may be stored on the same AMP
or a different AMP than the base table row, depending on the hash value.
2. NUSI: The Non-Unique Secondary Indexes are stored in subtables on the same AMPs
as their data rows. This reduces activity on the BYNET and essentially makes NUSI
queries an AMP-local operation - the processing for the sub table and base table are
done on the same AMP. However, in all NUSI access requests, all AMPs are activated
because the non-unique value may be found on multiple AMPs.
You can submit a request without specifying a Primary Index and still access
the data. The following access methods do not use a Primary Index:
– Unique Secondary Index (USI)
– Non-Unique Secondary Index (NUSI)
– Full-Table Scan
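The three non-PI access paths can be compared by running EXPLAIN on queries that differ only in their WHERE column. This sketch assumes the Employee table carries the USI on Employee_Number and the NUSI on Last_Name created earlier. First_Name is a hypothetical column with no index. The optimizer makes the final choice, so a NUSI may still be bypassed in favour of a full-table scan.

```sql
-- USI access: the plan is expected to show a two-AMP retrieve via the unique index.
EXPLAIN SELECT * FROM Employee WHERE Employee_Number = 1018;

-- NUSI access: the plan is expected to show an all-AMP retrieve via the index subtable.
EXPLAIN SELECT * FROM Employee WHERE Last_Name = 'Reynolds';

-- No usable index on the predicate column: expected to be an all-AMP full-table scan.
EXPLAIN SELECT * FROM Employee WHERE First_Name = 'Jane';
```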
Comparison of Primary and Secondary indexes
* Not required with NoPI table in Teradata 13.0

Rules for Keys and Indexes
A summary of the rules for keys (in the relational model) and indexes (in the Teradata Database) is shown
below.
Rule | Primary Key | Foreign Key | Primary Index | Secondary Index
1 | One PK | Multiple FKs | One PI | 0 to 32 SIs
2 | Unique values | Unique or non-unique | Unique or non-unique | Unique or non-unique
3 | No NULLs | NULLs allowed | NULLs allowed | NULLs allowed
4 | Values should not change | Values may be changed | Values may be changed (redistributes row) | Values may be changed
5 | Column should not change | Column should not change | Column cannot be changed (drop and recreate table) | Index may be changed (drop and recreate index)
6 | No column limit | No column limit | 64-column limit | 64-column limit

Data types
Partitioned Primary Index
Partitioned Primary Index…
Partitioning by range (RANGE_N):
CREATE TABLE CLAIM
( C_CLAIMID INTEGER NOT NULL
,C_CUSTID INTEGER NOT NULL
,C_CLAIMDATE DATE NOT NULL
,C_CLAIM_AM DECIMAL (18,2)
…)
PRIMARY INDEX (C_CLAIMID)
PARTITION BY RANGE_N (C_CLAIMDATE BETWEEN
DATE '2001-01-01' AND DATE '2010-12-31' EACH INTERVAL '1' MONTH );
Partitioning by case (CASE_N):
CREATE TABLE CLAIM
( C_CLAIMID INTEGER NOT NULL
,C_CUSTID INTEGER NOT NULL
,C_CLAIMDATE DATE NOT NULL
,C_CLAIM_AM DECIMAL (18,2)
…)
PRIMARY INDEX (C_CLAIMID)
PARTITION BY CASE_N (C_CLAIM_AM < 100,
C_CLAIM_AM < 1000,
C_CLAIM_AM < 10000,
NO CASE OR UNKNOWN);
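Once a partitioned table is loaded, the system-derived PARTITION column shows how rows fall into partitions, and a predicate on the partitioning column lets the optimizer skip partitions that cannot match. This is a sketch assuming the RANGE_N version of CLAIM above; the date range is an arbitrary illustration.

```sql
-- Row counts per partition; PARTITION is the system-derived partition number.
SELECT PARTITION AS partition_no
      ,COUNT(*)  AS row_count
FROM CLAIM
GROUP BY 1
ORDER BY 1;

-- A date-range predicate on the partitioning column: the EXPLAIN output is
-- expected to show that only the matching monthly partition is scanned.
EXPLAIN
SELECT C_CLAIMID, C_CLAIM_AM
FROM CLAIM
WHERE C_CLAIMDATE BETWEEN DATE '2005-03-01' AND DATE '2005-03-31';
```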
Partition Primary Index…
Potential Disadvantages of PPI
• PPI rows are 2 bytes longer, so the table uses more PERM space.
• Joins to non-partitioned tables with the same PI may be degraded.

Disk level protection
Disk level protection: RAID1
RAID: Redundant Array of Independent Disks
RAID 1 characteristics:
• Data is fully replicated on a mirror disk.
• Provides high data availability and performance, but storage costs are high.
Disk level protection: RAID1…
AMP level protection: FALLBACK CLUSTER
A cluster is a group of AMPs that act as a single fallback unit.
A Fallback row is a copy of a "Primary row" which is stored on a different AMP within
the same cluster.
After the loss of any AMP, a Down-AMP Recovery Journal is started automatically.
Its purpose is to log any changes to rows which reside on the down AMP. Any inserts, updates, or
deletes affecting rows on the down AMP are applied to the Fallback copy within the cluster. The AMP
that holds the Fallback copy logs the Row ID in its Recovery Journal.
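Fallback protection is requested per table as a table option. The sketch below uses a hypothetical table name, claim_fb, with columns borrowed from the CLAIM examples.

```sql
-- Create a table whose rows each get a Fallback copy on another AMP in the cluster.
CREATE TABLE claim_fb, FALLBACK
    (c_claimid INTEGER NOT NULL
    ,c_custid  INTEGER NOT NULL)
PRIMARY INDEX (c_claimid);

-- Fallback can later be removed (or added back with ", FALLBACK") without recreating the table.
ALTER TABLE claim_fb, NO FALLBACK;
```

Fallback roughly doubles the PERM space the table needs, since every row is stored twice.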
RAID1 and FALLBACK
Node level protection: CLIQUE
Node level protection: CLIQUE…
Hot stand by Node: CLIQUE…
Data integrity protection: LOCKS
Data integrity protection: LOCKS…
Locking Modifier:
LOCKING ROW FOR ACCESS
SELECT * FROM Table_A;

LOCKING TABLE Table_B FOR EXCLUSIVE
UPDATE Table_B SET A = 2009;

Lock requests are queued behind all outstanding incompatible lock requests for the
same object.
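The same modifier accepts the other lock levels, and the NOWAIT option changes the queuing behaviour: instead of waiting behind an incompatible lock, the request is aborted. This sketch reuses the Table_A and Table_B names from above; the value 2010 is arbitrary.

```sql
-- READ lock: other readers are allowed, writers wait until the SELECT finishes.
LOCKING TABLE Table_A FOR READ
SELECT COUNT(*) FROM Table_A;

-- WRITE lock with NOWAIT: if the lock cannot be granted immediately,
-- the request is expected to abort rather than join the lock queue.
LOCKING TABLE Table_B FOR WRITE NOWAIT
UPDATE Table_B SET A = 2010;
```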
Transaction level protection: Journals
Transaction level protection: Journals…
Spaces In Teradata
[Figure: Relation of PERM and SPOOL Space]
3 Types of Spaces in Teradata
1) Perm space: The space occupied by tables, indexes, and stored procedures.
2) Spool space: Work space used to hold intermediate answer sets. Any Perm Space currently unassigned is available as Spool Space.
3) Temp space: The space occupied by Global temporary tables.
Perm Spaces distribution Spool space distribution
Space terminology
Assigning Perm and Spool Limits
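Space limits are assigned when a user or database is created. The sketch below assumes a hypothetical parent database demo_parent that owns enough PERM space to give away. The user name, password, and byte values are placeholders, and DBC.DiskSpaceV is assumed to be the disk space view available on the release in use.

```sql
-- Carve a new user out of the parent's space; all limits are in bytes.
CREATE USER demo_user FROM demo_parent AS
    PERM      = 1000000000   -- upper limit for tables, indexes, stored procedures
   ,SPOOL     = 5000000000   -- upper limit for intermediate answer sets
   ,TEMPORARY = 2000000000   -- upper limit for materialized global temporary tables
   ,PASSWORD  = ChangeMe_123;

-- Compare the assigned limits with current usage, summed over all AMPs.
SELECT DatabaseName
      ,SUM(MaxPerm)     AS max_perm
      ,SUM(CurrentPerm) AS current_perm
      ,SUM(MaxSpool)    AS max_spool
      ,SUM(PeakSpool)   AS peak_spool
FROM DBC.DiskSpaceV
WHERE DatabaseName = 'demo_user'
GROUP BY 1;
```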
Types of temp Tables
There are three types of temporary tables implemented in Teradata:
1) Derived
2) Volatile temporary
3) Global temporary
Derived Tables
• It is local to the query: it exists only for the duration of the query.
• When the query is done the table is discarded.
• It is incorporated into SQL query syntax.
• Spooled rows, which populate the table, are also discarded when the query finishes.
• There is no data dictionary involvement, therefore less system overhead.
• User spool space is used to materialize the derived tables.
• No DDL, not in DBC.tables.

Example 1: overall average as a derived table
SELECT prod_id, sale_date, amount, AVGSALE
FROM sales_table,
     (SEL AVG(amount) FROM sales_table) AS TEMP (AVGSALE)
ORDER BY 3 DESC;

Prod_id  Sale_date  Amount  AVGSALE
20       1/4/2012   30000   11571
20       1/3/2012   16000   11571
20       1/2/2012   14000   11571
10       1/4/2012   7000    11571
10       1/1/2012   6000    11571
20       1/5/2012   5000    11571
10       1/2/2012   3000    11571

Example 2: per-product average as a derived table
SELECT sales_table.prod_id, sale_date, amount, AVGSALE
FROM sales_table,
     (SEL prod_id, AVG(amount) FROM sales_table GROUP BY 1) AS MYTEMP (prod_id, AVGSALE)
WHERE MYTEMP.prod_id = sales_table.prod_id
ORDER BY 3 DESC;
Volatile Tables
Local to a session: it exists throughout the entire session, not just a single query.
• It must be explicitly created using the CREATE VOLATILE TABLE syntax
• It is discarded automatically at the end of the session
• There is no data dictionary involvement
• Data is materialized in spool, definition is kept in cache
• Can have up to 1,000 volatile tables in a single session
• Generally used to hold small amounts of data derived through complex calculations
Ex:
CREATE VOLATILE TABLE vt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2))
ON COMMIT PRESERVE ROWS;
In the example above, we stated ON COMMIT PRESERVE ROWS. This option allows us to use the
volatile table again for other queries in the session. The default is ON COMMIT DELETE
ROWS, which means the data is deleted when the transaction is committed. Since this is rarely what is
intended, it is common to include ON COMMIT PRESERVE ROWS in the table creation statement.
The following commands are not applicable to volatile tables:
• COLLECT/DROP/HELP STATISTICS (possible from TD 13 onward)
• CREATE/DROP INDEX
• ALTER TABLE
• GRANT/REVOKE privileges
• DELETE DATABASE/USER (does not drop volatile tables)
Volatile tables also cannot be loaded with the MultiLoad or FastLoad utilities.
Working with Volatile Table
Step 1:Creation
CREATE VOLATILE TABLE vt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT)
ON COMMIT PRESERVE ROWS;
Step 2: Population
INSERT INTO vt_deptsal
SELECT dept ,AVG(sal) ,MAX(sal) ,MIN(sal) ,SUM(sal) ,COUNT(emp)
FROM emp
GROUP BY 1;
Step 3: Using
Show all employees who make the minimum salary in their department.
Note: this joins the volatile table with a permanent table.
SELECT emp, last, dept, sal
FROM emp INNER JOIN vt_deptsal
ON dept=deptno
WHERE sal=minsal
ORDER BY 3;
Global Temporary Table
Characteristics:
Global Temporary Tables are created using the CREATE GLOBAL TEMPORARY TABLE command. They require a
base definition which is stored in the Data Dictionary (DD).
Global Temporary Tables differ from Volatile Tables in these ways:
• Their base definition is permanent and kept in the DD.
• ALTER TABLE is possible because the definition is stored in the DD.
• Space is charged against the user's 'temporary space' allocation.
• A user can materialize up to 2,000 global tables per session.
• They can survive a system restart.
Global Temporary Tables are similar to Volatile Tables in these ways:
• Each instance of a global temporary table is local to a session.
• Materialized tables are dropped automatically at the end of the session. (But the base definition remains
in the DD.)
• They have ON COMMIT PRESERVE/DELETE options.
• Materialized table contents are not sharable with other sessions.
• The table always starts out empty at the beginning of a session.
NOTE: The term 'Temporary Table' can mean different things to different people. In
Teradata terminology, 'Temporary Tables' means 'Global Temporary Tables'.
Working with Global Temporary Table
CREATE GLOBAL TEMPORARY TABLE gt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT);

We can alter a Global Temporary Table:
ALTER TABLE gt_deptsal, ON COMMIT PRESERVE ROWS;

Populate the Global Temporary Table:
INSERT INTO gt_deptsal
SELECT dept ,AVG(sal) ,MAX(sal) ,MIN(sal) ,SUM(sal) ,COUNT(emp)
FROM emp GROUP BY 1;
The table is now materialized and a row is inserted into DBC.TempTables.

DELETE FROM gt_deptsal;
Rows are deleted, but the table remains materialized until it is dropped or until the session terminates.

DROP TEMPORARY TABLE gt_deptsal;
This drops the local instance of the table only.

DROP TABLE gt_deptsal;
This drops the base definition and the local instance of the table, if present.
It will fail if there are other instances of the table in the system.

DROP TABLE gt_deptsal ALL;
This drops the base table and all instances.
It will fail if any instance is in an active transaction.
Statistics
In Teradata, statistics can be understood as the landmarks that help locate an address:
the address corresponds to the large volume of data, and the statistics are the key
information about that data.
Statistics…
NOTE:
Use "DIAGNOSTIC HELPSTATS ON FOR SESSION;" so that the EXPLAIN plan includes
recommendations on which statistics to collect.
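A typical sequence is to switch on the diagnostic, run EXPLAIN to see the recommendations, then collect and review statistics. This sketch assumes the emp table and its dept column from the volatile table examples; the query is only an illustration.

```sql
-- Ask the optimizer to append statistics recommendations to EXPLAIN output.
DIAGNOSTIC HELPSTATS ON FOR SESSION;

EXPLAIN
SELECT dept, COUNT(*)
FROM emp
GROUP BY 1;

-- Collect statistics on a column used for grouping or joining, then review what exists.
COLLECT STATISTICS ON emp COLUMN (dept);
HELP STATISTICS emp;
```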
Other indexes : Join index
Compression: To save table storage space, Teradata stores repetitive column values
once in the table header instead of in every row. Bits placed at the front of each
row then indicate which compressed value the row holds.
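Compression is declared per column with the COMPRESS attribute. The sketch below uses a hypothetical table, order_demo, and compresses the two Order_Status values seen in the row distribution examples.

```sql
-- 'O' and 'C' are stored once in the table header; rows carry only presence bits for them.
CREATE TABLE order_demo
    (order_number INTEGER NOT NULL
    ,order_status CHAR(1) COMPRESS ('O', 'C'))
UNIQUE PRIMARY INDEX (order_number);
```

Columns with a small set of frequently repeated values benefit most.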
