Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 8

The Error Table contains 

information concerning:
- Data conversion errors Constraint violations and other error conditions:
Contains rows which failed to be manupulated due to constraint violations or Transalation error
Captures rows that contains duplicate Values for UPIs.
It logs errors & exceptions that occurs during the apply phase.
It logs errors that are occurs during the acquisition phase.

How teradata makes sure that there are no duplicate rows being inserted when its a SET table?
Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash
value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the
whole row, and find out if duplicate.
If its a duplicate it silently skips it without throwing any error.
When the target table has UPI then the Row hash of PI column will lead to the AMP and the
duplicate row is rejected with an error.

In case of a NUPI defined on a SET table the duplicate row is compared on all columns and if the
entire row is dups then it is rejected without an error code > 0.

Optimization is the technique of selecting the least expensive plan (fastest plan) for the
query to fetch results. 
Optimization is directly proportional to the availibility of --
1. CPU resources 
2. Systems resources - amps PEs etc.

Teradata performance tuning is a technique of improving the process in order for query to


perform faster with the minimal use of CPU resources.

What is the difference between Multiload & Fastload interms of Performance?

Fast Load: to load empty tables at high speed


MLoad: to insert,update and delete up to 5 different tables

If you want to load empty table then you use the fastload so it will very usefull than the
mutiload..because fastload performs the loading of the data in 2phase..and it does not need
a work table for loading the data.. so it is faster as well as it follows the below steps to load
the data in the table

Phase 1 (Acquisition):
raw data is sent in blocks to the next available session/AMP.
A receiver task on that AMP unpacks the blocks, does CHECKs/CASTs/FORMATs, hashes the
row and sends it to the final AMP, which writes unsorted blocks to disks.
Phase 2 (Sort):
Each AMP sorts it's blocks and writes it to disk .
After that FALLBACK rows are generated.

Multiload:
It does the loading in the 5 phases
Phase1:It will get the import file and checks the script
Phase2:It reads the record from the base table and store in the work table
Phase3:In this Application phase it locks the table header
Phase4:In the DML opreation will done in the tables
Phase 5: In this table locks will be released and work tables will be dropped.

The MultiLoad operation is performed in five phases:

Preliminary
1 Parses and validates all of the MultiLoad commands and
Teradata SQL statements in your MultiLoad job.
2 Establishes sessions and process control with the Teradata
RDBMS.
3 Submits special Teradata SQL requests to the Teradata
RDBMS.
4 Creates and protects temporary work tables and error tables
in the Teradata RDBMS.
DML Transaction
submits the DML statements specifying the insert, update and
delete tasks to the Teradata RDBMS.
Acquisition for an import task:
1 Imports data from the specified input data source.
2 Evaluates each record according to specified application
conditions.
3 Loads the selected records into the worktables in the
Teradata RDBMS.
(There is no acquisition phase activity for a MultiLoad delete
task.)
Application
1 Acquires locks on the specified target tables and views in the
Teradata RDBMS.
2 For an import task, inserts the data from the temporary work
tables into the target tables or views in the Teradata RDBMS.
3 For a delete task, deletes the specified rows from the target
table in the Teradata RDBMS.
4 Updates the error tables associated with each MultiLoad
task.
Cleanup
1 Forces an automatic restart/rebuild if an AMP went down
and came back online during the application phase.
2 Releases all locks on the target tables and views.
3 Drops the temporary work tables and all empty error tables
from the Teradata RDBMS.
4 Reports the transaction statistics associated with the import
and delete tasks.

What is the difference between Global temporary tables and Volatile temporary tables?
Global Temporary tables (GTT) -
1. When they are created, its definition goes into Data Dictionary.
2. When materialized data goes in temp space.
3. thats why, data is active upto the session ends, and definition will remain there upto its not
dropped using Drop table statement.
If dropped from some other session then its should be Drop table all;
4. you can collect stats on GTT.

Volatile Temporary tables (VTT) -


1. Table Definition is stored in System cache
2. Data is stored in spool space.
3. thats why, data and table definition both are active only upto session ends.
4. No collect stats for VTT.

 What is collect State in Teradata ? what it use and how it works??


The Optimizer plans an execution strategy for every
SQL query submitted to it. We have also seen that the
execution strategy for any query may be subject to change depending on various
factors. For the Optimizer to consistently choose the optimum strategy, it
must be provided with reliable, complete, and current demographic information
regarding all of these factors. The best way to assure that the Optimizer has
all the information it needs to generate optimum execution strategies is to
COLLECT STATISTICS.

-Statistics tell the Optimizer how many rows/ value there are.
– The Optimizer uses statistics to plan the best way to access data.
– May improve performance of complex queries and joins.
– NUSI Bit Mapping requires collected statistics.
– Helpful in accessing a column or index with uneven value distribution.
– Stale statistics may mislead the Optimizer into poor decisions.
– Statistics remain valid across a reconfiguration of the system.
– COLLECT is resource intensive and should be done during off hours.

How to find No. of Records present in Each AMP or a Node


for a given Table through SQL?

Sel HASHAMP(HASHBUCKET(HASHROW(PRIMARY_INDEX_COLUMNS)))
AS AMP_NO,
COUNT(*)
From DATABASENAME.TABLE_NAME
GROUP BY 1;

Primary indexes can be either unique or non-unique and partitioned or non-partitioned:

 Unique primary index (UPI)


 Non-unique primary index (NUPI)
 Non-partitioned primary index (NPPI)
 Partitioned primary index (PPI)

Partitioned primary indexes may have a single partitioning expression or multiple


partitioning expressions:

 Single-level PPI (SLPPI)


 Multilevel PPI (MLPPI)
Secondary Indexes provide alternate access paths to the data and may be Hash-ordered
or Value-ordered. Secondary indexes may be added or dropped as needed with the caveat
that building them requires some amount of system resources and you’ll want to check with
your DBA appropriate.

 Unique secondary index (USI)


 Nonunique secondary index (NUSI)

Join Indexes offer a variety of six subtypes that include:

 Single-table join index (STJI)


 Single-table aggregate join index (STAJI)
 Single-table sparse join index
 Multitable simple join index
 Multitable aggregate join index
 Multitable sparse join index

JOIN INDEX:
-----------

Join Index is nothing but pre-joining 2 or more tables or views which are
commonly joined in order to reduce the joining overhead. So teradata uses the
join index instead of resolving the joins in the participating base tables.
They increase the efficiency and performance of join queries. They can have
different primary indexes than the base tables and also are automatically
updated as and when the base rows are updated. they can have repeating values.

There are 3 types of join indexes:


1)Single table join index - here the rows are distributed
based on the foreign key hash value of the base table.
2) Multi table join index - joining two tables.
3) Aggregate join index - performing the aggregates but only
sum and count.

It returns the first NOT NULL value from the string of attributes. Often used in place of non-ANCII
Zeroifnull.

COALESCE(Col1,0) =  return col1 if not NULL, 0 if NULL.  


Can contain more then 2 return values COALESCE(Col1,Col2,col3, 0) Returns first of attributes that is not
null.

date: COALESCE(date_col, date '2007-03-19')


string: COALESCE(trim(date_col), 'blabla'
Look at DBC.Indices view. The IndexType is set to K for Primary Key.

IndexType: Returns the type of an index as:


• P (Nonpartitioned Primary)
• Q (Partitioned Primary)
• S (Secondary)
• J (join index)
• N (hash index)
• K (primary key)
• U (unique constraint)
• V (value ordered secondary)
• H (hash ordered ALL covering secondary)
• O (valued ordered ALL covering secondary)
• I (ordering column of a composite secondary index)
• M (Multi-column statistics)
• D (Derived column partition statistics)
• 1 (field1 column of a join or hash index)
• 2 (field2 column of a join or hash index) 

Also see dbc.ALL_RI_PARENTS, dbc.ALL_RI_Children views. 


It is all explained in Data Dictionary manual.

Teradata DBA's are responsible for:

1. Table design and index selection


2. Table implementations, maintenance, and backup
3. Problem support
4. Workload monitoring and control
5. Policies, procedures and guidelines that govern the teradata environment
6. SQL code review
7. Developer and user support and training
8. Capacity planning
9. System software testing and benchmarking
10. Support and coordination during hardware upgrades

Fact tables
A fact table is a table in a star or snowflake schema that stores facts that measure the business,
such as sales, cost of goods, or profit. Fact tables also contain foreign keys to the dimension
tables. These foreign keys relate each row of data in the fact table to its corresponding
dimensions and levels.

In the DB2 Alphablox Cube Server, the Measures fact table for a DB2 Alphablox cube can be specified in
the cube definition for new or existing cubes, available under the Cubes link in theAdministration tab of
the DB2 Alphablox Admin Pages. To specify the measures fact table, you must select a valid schema and
catalog combination to populate the list of available table options and then select the table or type a valid
table name.

Dimension Tables
A dimension table is a table in a star or snowflake schema that stores attributes that describe aspects of a
dimension. For example, a time table stores the various aspects of time such as year, quarter, month, and
day. A foreign key of a fact table references the primary key in a dimension table in a many-to-one
relationship.

he following figure shows a star schema with a single fact table and four dimension tables. A star schema
can have any number of dimension tables. The multiple branches at the end of the links connecting the
tables indicate a many-to-one relationship between the fact table and each dimension table.

Snowflake schemas
The following figure shows a snowflake schema with two dimensions, each having three levels. A
snowflake schema can have any number of dimensions and each dimension can have any number of
levels.

You might also like