Explaining The Explain
- Pratik Lakhpatwala
What is EXPLAIN?
• The EXPLAIN facility provides an "English" translation of the plan the SQL
Optimizer develops to service a request.
• May be used on any SQL statement, except EXPLAIN itself.
• Look for key words and phrases.
• Execution time and row count estimates depend on:
  − Are statistics collected?
• Actual execution time depends on:
  − Is DBS processing other requests?
  − Is channel or network busy?
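As a rough illustration of "look for key words and phrases", the sketch below scans EXPLAIN text for a handful of phrases. The phrase list is copied from the EXPLAIN output quoted later in this deck and is not a complete catalogue; the function name `find_key_phrases` is made up for this example.

```python
import re

# Phrases copied from the EXPLAIN output quoted in this document.
# The list is illustrative, not a complete catalogue.
KEY_PHRASES = [
    "all-AMPs RETRIEVE",
    "all-rows scan",
    "built locally on the AMPs",
    "duplicated on all AMPs",
    "redistributed by the hash code",
    "merge join",
    "eligible for synchronized scanning",
    "(Last Use)",
]


def find_key_phrases(explain_text):
    """Return the phrases from KEY_PHRASES that occur in explain_text."""
    # Collapse line breaks and repeated spaces so wrapped phrases still match.
    flat = re.sub(r"\s+", " ", explain_text).lower()
    return [phrase for phrase in KEY_PHRASES if phrase.lower() in flat]


sample = (
    "3) We do an all-AMPs RETRIEVE step from TFACT.daily_sales by way of an\n"
    "all-rows scan with no residual conditions into Spool 1 (group_amps),\n"
    "which is built locally on the AMPs."
)
print(find_key_phrases(sample))
```

Matching is case-insensitive and ignores line wrapping, because EXPLAIN text is usually pasted with arbitrary line breaks.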
How is EXPLAIN Text Generated?
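[Figure: an SQL request passes through the SYNTAXER, RESOLVER, SECURITY, GENERATOR, APPLY and DISPATCHER stages before reaching the AMPs. The DD cache is fed by the DD tables Dbase, AccessRights, RoleGrants (V2R5), TVM, TVFields and Indexes.]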
Information Known to Optimizer
• Number of nodes in system
• Number and type of CPUs per node
• Number of configured AMP Vprocs
• Disk array configuration
• Interconnect configuration
• Amount and configuration of memory
[Figure: first and second requests arriving at PEs; labelled step: determine table ID hash.]
:
3) We do a BMSMS (bit map set manipulation) step that builds a bit map for TFACT.Employee
by way of index # 4 "TFACT.E.Job_Code = 3500" which is placed in Spool 2. The estimated
time for this step is 0.01 seconds.
4) We do an all-AMPs RETRIEVE step from TFACT.E by way of index # 8
"TFACT.E.Dept_Number = 1310" and the bit map in Spool 2 (Last Use) with a residual
condition of ("TFACT.E.Job_Code = 3500") into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with low confidence to be 60 rows (4620
bytes). The estimated time for this step is 0.02 seconds.
5) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the
request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total
estimated time is 0.03 seconds.
Note:
Statistics were collected on the NUSIs Job_Code and Dept_Number.
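The total at the end of the plan above (0.03 seconds) is the sum of the two step estimates (0.01 + 0.02). A minimal sketch of that check follows. The function name `step_and_total_times` is hypothetical, and it is an assumption that a plain sum is only expected to match for a serial plan like this one.

```python
import re
from decimal import Decimal

# Patterns follow the wording of the EXPLAIN text quoted above.
STEP_TIME = re.compile(
    r"estimated\s+time\s+for\s+this\s+step\s+is\s+(\d+(?:\.\d+)?)\s+seconds"
)
TOTAL_TIME = re.compile(
    r"total\s+estimated\s+time\s+is\s+(\d+(?:\.\d+)?)\s+seconds"
)


def step_and_total_times(explain_text):
    """Return (sum of per-step estimates, reported total or None)."""
    steps = [Decimal(t) for t in STEP_TIME.findall(explain_text)]
    total = TOTAL_TIME.search(explain_text)
    reported = Decimal(total.group(1)) if total else None
    return sum(steps, Decimal("0")), reported


plan = (
    "3) We do a BMSMS (bit map set manipulation) step ... The estimated\n"
    "time for this step is 0.01 seconds.\n"
    "4) We do an all-AMPs RETRIEVE step ... The estimated time for this\n"
    "step is 0.02 seconds.\n"
    "-> The contents of Spool 1 are sent back to the user. The total\n"
    "estimated time is 0.03 seconds."
)
summed, reported = step_and_total_times(plan)
print(summed, reported)
```

Decimal is used instead of float so that 0.01 + 0.02 compares equal to 0.03 exactly.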
Synchronized Scanning
In the case of multiple users that access the same table at the same time,
the system can do a synchronized scan (sync scan) on the table.
[Figure: three queries scanning the same table of employee rows; Query 1, Query 2 and Query 3 each begin their scan at a different row.]
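The benefit of a sync scan can be sketched with a toy model. Everything here is an assumption made for illustration, not a description of the actual implementation: a single shared cursor reads one block per tick, a query that joins late starts at the cursor's current block, and it wraps around to pick up the blocks it missed. The function names are hypothetical.

```python
def sync_scan_order(num_blocks, start_block):
    """Order in which a query joining at start_block sees the blocks.

    Assumes the scan wraps around to cover the blocks before start_block.
    """
    return [(start_block + i) % num_blocks for i in range(num_blocks)]


def count_reads(num_blocks, join_ticks):
    """Compare physical block reads with and without a shared scan.

    join_ticks[i] is the tick at which query i starts. With sharing, one
    block is read per tick while at least one query is still scanning.
    Returns (shared_reads, independent_reads).
    """
    active_ticks = set()
    for start in join_ticks:
        # Each query needs num_blocks consecutive ticks to see every block.
        active_ticks.update(range(start, start + num_blocks))
    shared = len(active_ticks)
    independent = num_blocks * len(join_ticks)
    return shared, independent


print(sync_scan_order(8, 3))
print(count_reads(8, [0, 3, 6]))
```

In this model three overlapping scans of an 8-block table cost 14 block reads when shared and 24 when run independently; the closer together the queries start, the larger the saving.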
Synchronized Scanning (cont.)
EXPLAIN SELECT * FROM daily_sales ORDER BY 1;
:
3) We do an all-AMPs RETRIEVE step from TFACT.daily_sales by way of an all-rows scan
with no residual conditions into Spool 1 (group_amps), which is built locally on the
AMPs. Then we do a SORT to order Spool 1 by the sort key in spool field1
(TFACT.daily_sales.Item_id). The input table will not be cached in memory, but it is
eligible for synchronized scanning. The result spool file will not be cached in memory.
The size of Spool 1 is estimated with high confidence to be 76,685 rows (2,530,605 bytes).
The estimated time for this step is 0.09 seconds.
:
Understanding Row and Time Estimates
The EXPLAIN facility may express “confidence” for a retrieve from a table.
Some of the phrases used are:
. . . with no confidence . . .
− One input relation has no confidence.
− Statistics do not exist for either join field.
Understanding Row and Time Estimates (cont.)
5) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from TFACT.D by way of an all-rows scan with no
residual conditions into Spool 2 (all_amps), which is duplicated on all AMPs. The size
of Spool 2 is estimated with high confidence to be 19,642 rows (726,754 bytes). The
estimated time for this step is 0.02 seconds.
2) We do an all-AMPs RETRIEVE step from TFACT.J by way of an all-rows scan with no
residual conditions into Spool 3 (all_amps), which is duplicated on all AMPs. Then we
do a SORT to order Spool 3 by the hash code of (TFACT.J.Job_Code). The size of Spool
3 is estimated with high confidence to be 12,166 rows (450,142 bytes). The estimated
time for this step is 0.01 seconds.
6) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is
joined to TFACT.E by way of an all-rows scan with a condition of ("NOT (TFACT.E.Job_Code
IS NULL)"). Spool 2 and TFACT.E are joined using a single partition hash_ join, with a join
condition of ("TFACT.E.Dept_Number = Dept_Number"). The result goes into
Spool 4 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 4
by the hash code of (TFACT.E.Job_Code). The size of Spool 4 is estimated with low
confidence to be 26,000 rows (1,690,000 bytes). The estimated time for this step is 0.04
seconds.
7) We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of a RowHash match scan,
which is joined to Spool 4 (Last Use) by way of a RowHash match scan. Spool 3 and Spool 4
are joined using a merge join, with a join condition of ("Job_Code = Job_Code"). The result
goes into Spool 1 (group_amps), which is built locally on the AMPs. Then we do a SORT to
order Spool 1 by the sort key in spool field1 (TFACT.D.Dept_Name, TFACT.E.Last_Name,
TFACT.E.First_Name). The size of Spool 1 is estimated with low confidence to be 26,000
rows (3,822,000 bytes). The estimated time for this step is 0.08 seconds.
8) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
Query Cost Estimates
Row estimates:
• May be estimated using random samples, statistics or indexes
• Are assigned a confidence level - high, low or none
• Affect timing estimates - more rows, more time needed
Timings:
• Used to determine the ‘lowest cost’ plan
• A total cost is generated only if every processing step has an assigned cost
• Not intended to predict wall-clock time; useful for comparing plans
Miscellaneous Notes:
• Estimates too large to display are shown as three asterisks (***).
• The accuracy of the time estimate depends upon the accuracy of the row estimate.
Understanding Row and Time Estimates
EXPLAIN
INSERT INTO Employee_CharPI SELECT * FROM Employee;
:
4) We do an all-AMPs RETRIEVE step from TFACT.Employee by way of an all-rows
scan with no residual conditions into Spool 1 (all_amps), which is redistributed
by the hash code of (TFACT.Employee.Employee_Number (CHAR(10),
CHARACTER SET LATIN, NOT CASESPECIFIC, FORMAT '-(10)9')(CHAR(10),
CHARACTER SET LATIN, NOT CASESPECIFIC, NAMED Employee_Number,
FORMAT 'X(10)', NULL)) to all AMPs. Then we do a SORT to order Spool 1 by
row hash. The size of Spool 1 is estimated with high confidence to be 26,000
rows (1,950,000 bytes). The estimated time for this step is 0.06 seconds.
5) We do an all-AMPs MERGE into TFACT.Employee_CharPI from Spool 1 (Last
Use). The size is estimated with high confidence to be 26,000 rows. The
estimated time for this step is 1.38 seconds.
6) We spoil the parser's dictionary cache for the table.
7) Finally, we send out an END TRANSACTION step to all AMPs involved in
processing the request.
-> No rows are returned to the user as the result of statement 1.
Unexpected Full Table Scan
EXPLAIN
SELECT * FROM Employee_CharPI WHERE employee_number = 1104066 ;