Teradata Interview Questions
information concerning:
- Data conversion errors, constraint violations, and other error conditions:
Contains rows that failed to be manipulated because of constraint violations or translation errors.
Captures rows that contain duplicate values for UPIs.
It logs errors and exceptions that occur during the application phase.
It logs errors that occur during the acquisition phase.
How does Teradata make sure that no duplicate rows are inserted into a SET table?
Teradata routes each newly inserted row to its target AMP according to its PI (on the basis of the row hash
value). If it finds the same row hash value on that AMP (hash synonyms), it compares the
whole row to find out whether it is a duplicate.
If it is a duplicate, Teradata silently skips it without throwing any error.
When the target table has a UPI, the row hash of the PI column leads to the AMP and the
duplicate row is rejected with an error.
When a NUPI is defined on a SET table, the duplicate row is compared on all columns, and if the
entire row is a duplicate it is rejected without an error (the error code stays 0).
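The duplicate check described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Teradata's implementation: `NUM_AMPS`, `row_hash`, `amp_for`, and `SetTable` are hypothetical names, `zlib.crc32` stands in for Teradata's hashing algorithm, and the modulo mapping replaces the real hash-bucket-to-AMP map.

```python
import zlib

NUM_AMPS = 4  # assumed toy system size


def row_hash(pi_value):
    """Stand-in for the row hash of the PI value (not Teradata's real hash)."""
    return zlib.crc32(repr(pi_value).encode("utf-8"))


def amp_for(hash_value):
    """Map a row hash to an AMP number (toy modulo mapping, an assumption)."""
    return hash_value % NUM_AMPS


class SetTable:
    """Toy SET table: rows are tuples, the PI is the column at pi_index."""

    def __init__(self, pi_index=0):
        self.pi_index = pi_index
        # One dict per AMP: row hash -> list of rows stored under that hash.
        self.amps = [dict() for _ in range(NUM_AMPS)]

    def insert(self, row):
        """Return True if the row was stored, False if it was a duplicate."""
        h = row_hash(row[self.pi_index])
        bucket = self.amps[amp_for(h)].setdefault(h, [])
        # Only rows that share the row hash need a full-row comparison.
        if any(existing == row for existing in bucket):
            return False
        bucket.append(row)
        return True


table = SetTable(pi_index=0)
results = [
    table.insert((1, "alice")),
    table.insert((1, "bob")),    # same NUPI value, different row: kept
    table.insert((1, "alice")),  # entire row repeats: rejected
]
print(results)  # [True, True, False]
```

The sketch also hints at why a NUPI with many repeated values makes SET-table inserts expensive: every new row must be compared against all rows already stored under the same row hash.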
Optimization is the technique of selecting the least expensive (fastest) plan for a
query to fetch its results.
How well a query can be optimized depends on the availability of:
1. CPU resources
2. System resources (AMPs, PEs, etc.)
To load an empty table, use FastLoad. It is more useful than MultiLoad for this case
because it loads the data in 2 phases and does not need a work table, so it is faster.
It follows the steps below to load the data into the table.
Phase 1 (Acquisition):
Raw data is sent in blocks to the next available session/AMP.
A receiver task on that AMP unpacks the blocks, does CHECKs/CASTs/FORMATs, hashes each
row and sends it to the final AMP, which writes unsorted blocks to disk.
Phase 2 (Sort):
Each AMP sorts its blocks and writes them to disk.
After that, FALLBACK rows are generated.
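The two FastLoad phases can be illustrated with a small Python model. It is only a sketch: `NUM_AMPS`, `row_hash`, `acquisition`, and `sort_phase` are hypothetical names, `zlib.crc32` stands in for the real row hash, and block handling, CHECKs/CASTs, and FALLBACK generation are left out.

```python
import zlib

NUM_AMPS = 3  # assumed toy system size


def row_hash(pi_value):
    """Stand-in for Teradata's row hash (an assumption)."""
    return zlib.crc32(str(pi_value).encode("utf-8"))


def acquisition(rows, pi_index=0):
    """Phase 1: hash each row and append it, unsorted, to its final AMP."""
    amps = [[] for _ in range(NUM_AMPS)]
    for row in rows:
        h = row_hash(row[pi_index])
        amps[h % NUM_AMPS].append((h, row))
    return amps


def sort_phase(amps):
    """Phase 2: each AMP sorts its own rows by row hash."""
    return [sorted(amp, key=lambda pair: pair[0]) for amp in amps]


rows = [(i, f"name{i}") for i in range(10)]
loaded = sort_phase(acquisition(rows))
for amp_no, amp in enumerate(loaded):
    print(amp_no, [row for _, row in amp])
```

Deferring the sort to a second phase means each AMP sorts once over everything it received, rather than keeping its data ordered while rows are still arriving.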
MultiLoad:
It loads data in 5 phases:
Phase 1 (Preliminary): parses and checks the script, sets up sessions, and creates the work and error tables.
Phase 2 (DML Transaction): sends the DML statements to Teradata.
Phase 3 (Acquisition): reads the records from the import file and stores them in the work tables.
Phase 4 (Application): locks the target tables and applies the DML operations to them.
Phase 5 (Cleanup): releases the table locks and drops the work tables.
Preliminary
1 Parses and validates all of the MultiLoad commands and
Teradata SQL statements in your MultiLoad job.
2 Establishes sessions and process control with the Teradata
RDBMS.
3 Submits special Teradata SQL requests to the Teradata
RDBMS.
4 Creates and protects temporary work tables and error tables
in the Teradata RDBMS.
DML Transaction
submits the DML statements specifying the insert, update and
delete tasks to the Teradata RDBMS.
Acquisition for an import task:
1 Imports data from the specified input data source.
2 Evaluates each record according to specified application
conditions.
3 Loads the selected records into the worktables in the
Teradata RDBMS.
(There is no acquisition phase activity for a MultiLoad delete
task.)
Application
1 Acquires locks on the specified target tables and views in the
Teradata RDBMS.
2 For an import task, inserts the data from the temporary work
tables into the target tables or views in the Teradata RDBMS.
3 For a delete task, deletes the specified rows from the target
table in the Teradata RDBMS.
4 Updates the error tables associated with each MultiLoad
task.
Cleanup
1 Forces an automatic restart/rebuild if an AMP went down
and came back online during the application phase.
2 Releases all locks on the target tables and views.
3 Drops the temporary work tables and all empty error tables
from the Teradata RDBMS.
4 Reports the transaction statistics associated with the import
and delete tasks.
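For an import task, the acquisition, application, and cleanup phases listed above can be pictured with a toy Python pipeline. Everything here is illustrative: `acquisition`, `application`, and `run_import` are hypothetical names, the target table is a plain dict keyed on an assumed unique `id` column, and locking, sessions, and restart handling are not modeled.

```python
def acquisition(records, condition):
    """Evaluate each record against the apply condition; selected rows form the work table."""
    return [record for record in records if condition(record)]


def application(work_table, target, key="id"):
    """Insert work-table rows into the target; uniqueness violations go to an error table."""
    error_table = []
    for row in work_table:
        if row[key] in target:
            error_table.append(row)  # duplicate value for the unique key
        else:
            target[row[key]] = row
    return error_table


def run_import(records, condition, target):
    work_table = acquisition(records, condition)   # Acquisition
    error_table = application(work_table, target)  # Application
    del work_table                                 # Cleanup: drop the work table
    return error_table or None                     # Cleanup: drop an empty error table


target = {}
records = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": -5},  # fails the apply condition, never reaches the work table
    {"id": 1, "amount": 7},   # duplicate unique key, lands in the error table
]
errors = run_import(records, lambda r: r["amount"] > 0, target)
print(target)
print(errors)
```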
What is the difference between global temporary tables and volatile temporary tables?
Global temporary tables (GTT):
1. When a GTT is created, its definition goes into the Data Dictionary.
2. When it is materialized, its data goes into temp space.
3. That is why the data stays active only until the session ends, while the definition remains until it is
dropped with a DROP TABLE statement.
If it is dropped from some other session, the statement should be DROP TABLE table_name ALL;
4. You can collect statistics on a GTT.
– Statistics tell the Optimizer how many rows per value there are.
– The Optimizer uses statistics to plan the best way to access data.
– May improve performance of complex queries and joins.
– NUSI Bit Mapping requires collected statistics.
– Helpful in accessing a column or index with uneven value distribution.
– Stale statistics may mislead the Optimizer into poor decisions.
– Statistics remain valid across a reconfiguration of the system.
– COLLECT is resource intensive and should be done during off hours.
-- Count the rows stored on each AMP to see how evenly the primary index distributes the table.
SELECT HASHAMP(HASHBUCKET(HASHROW(PRIMARY_INDEX_COLUMNS))) AS AMP_NO,
       COUNT(*)
FROM DATABASENAME.TABLE_NAME
GROUP BY 1;
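What the query above reports can be mimicked in Python to show the difference between a well-distributed and a skewed primary index. This is a sketch under assumptions: `NUM_AMPS`, `amp_no`, and `rows_per_amp` are hypothetical names, and `zlib.crc32` with a modulo stands in for Teradata's HASHROW, HASHBUCKET, and HASHAMP functions.

```python
import zlib
from collections import Counter

NUM_AMPS = 4  # assumed toy system size


def amp_no(pi_value):
    """Toy stand-in for HASHAMP(HASHBUCKET(HASHROW(pi_value)))."""
    return zlib.crc32(str(pi_value).encode("utf-8")) % NUM_AMPS


def rows_per_amp(pi_values):
    """Row count per AMP; unlike the SQL GROUP BY, empty AMPs are listed with 0."""
    counts = Counter(amp_no(value) for value in pi_values)
    return {amp: counts.get(amp, 0) for amp in range(NUM_AMPS)}


unique_pi = rows_per_amp(range(1000))                  # nearly unique PI values
skewed_pi = rows_per_amp(["US"] * 900 + ["CA"] * 100)  # two distinct PI values only
print(unique_pi)
print(skewed_pi)
```

With only two distinct PI values, at most two AMPs can hold any rows, which is the kind of skew the per-AMP count is meant to expose.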
JOIN INDEX:
-----------
A join index is a pre-join of 2 or more tables or views that are commonly
joined, kept in order to reduce the joining overhead. Teradata uses the
join index instead of resolving the joins against the participating base tables.
Join indexes increase the efficiency and performance of join queries. They can have
different primary indexes than the base tables, and they are automatically
updated whenever the base rows are updated. They can have repeating values.
COALESCE returns the first non-NULL value from its list of arguments. It is often used in place of the
non-ANSI ZEROIFNULL.
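The behaviour described above (the ANSI COALESCE function) can be tried out with Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for Teradata, and the `sales` table and its columns are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, discount INTEGER, list_price INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("pen", None, 5), ("ink", 2, 9), ("pad", None, None)],
)
rows = conn.execute(
    # COALESCE(discount, 0) plays the role ZEROIFNULL(discount) would play;
    # the three-argument form returns the first non-NULL argument.
    "SELECT item, COALESCE(discount, 0), COALESCE(discount, list_price, -1) "
    "FROM sales ORDER BY item"
).fetchall()
conn.close()
print(rows)  # [('ink', 2, 2), ('pad', 0, -1), ('pen', 0, 5)]
```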
Fact tables
A fact table is a table in a star or snowflake schema that stores facts that measure the business,
such as sales, cost of goods, or profit. Fact tables also contain foreign keys to the dimension
tables. These foreign keys relate each row of data in the fact table to its corresponding
dimensions and levels.
In the DB2 Alphablox Cube Server, the Measures fact table for a DB2 Alphablox cube can be specified in
the cube definition for new or existing cubes, available under the Cubes link in the Administration tab of
the DB2 Alphablox Admin Pages. To specify the measures fact table, you must select a valid schema and
catalog combination to populate the list of available table options and then select the table or type a valid
table name.
Dimension Tables
A dimension table is a table in a star or snowflake schema that stores attributes that describe aspects of a
dimension. For example, a time table stores the various aspects of time such as year, quarter, month, and
day. A foreign key of a fact table references the primary key in a dimension table in a many-to-one
relationship.
The following figure shows a star schema with a single fact table and four dimension tables. A star schema
can have any number of dimension tables. The multiple branches at the end of the links connecting the
tables indicate a many-to-one relationship between the fact table and each dimension table.
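A minimal star schema can be built and queried with Python's built-in sqlite3 module. The sketch uses two dimension tables for brevity; the table and column names (`time_dim`, `product_dim`, `sales_fact`) are invented for the example, and SQLite stands in for the warehouse database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT);
-- The fact table stores the measure plus one foreign key per dimension.
CREATE TABLE sales_fact (
    time_id INTEGER REFERENCES time_dim(time_id),
    product_id INTEGER REFERENCES product_dim(product_id),
    sales INTEGER
);
INSERT INTO time_dim VALUES (1, 2023, 4), (2, 2024, 1);
INSERT INTO product_dim VALUES (10, 'widget'), (20, 'gadget');
INSERT INTO sales_fact VALUES (1, 10, 100), (1, 20, 50), (2, 10, 70), (2, 10, 30);
""")
totals = conn.execute("""
    SELECT t.year, p.name, SUM(f.sales)
    FROM sales_fact f
    JOIN time_dim t ON t.time_id = f.time_id
    JOIN product_dim p ON p.product_id = f.product_id
    GROUP BY t.year, p.name
    ORDER BY t.year, p.name
""").fetchall()
conn.close()
print(totals)  # [(2023, 'gadget', 50), (2023, 'widget', 100), (2024, 'widget', 100)]
```

Many fact rows point at the same dimension row (two 2024 widget sales here), which is the many-to-one relationship between the fact table and each dimension table.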
Snowflake schemas
The following figure shows a snowflake schema with two dimensions, each having three levels. A
snowflake schema can have any number of dimensions and each dimension can have any number of
levels.