Lab 06 Query Optimization: Compare Access Path
__2. Run tune01 to create two row-organized tables, PERSON_ROW and PRODUCT_ROW.
$ ./tune01
Note: We will run a query that uses both row-organized and column-organized
tables, check the access path that DB2 chooses, and compare it with the
access path for the same query using only column-organized tables.
__3. Run tune02 to load data into the row-organized PERSON_ROW and PRODUCT_ROW tables.
$ ./tune02
__4. Click [Administer Databases] on the task bar to bring focus to Data Studio. [Optional:
If you closed Data Studio, start it by double-clicking the Data Studio icon on the
desktop, click OK to accept the default workspace, and establish a connection to the COLDB
database. See Lab 02 for detailed instructions.]
__12. Click the up arrow on the separator bar to hide the panel.
__13. Use your mouse to select the first SQL statement and click the Open Visual Explain button.
__14. Double-click Access Plan Diagram to maximize the view. Click Finish.
__16. In the Attributes section (left-hand side), click the down arrow and select All attributes. Scroll all
the way down and check the origin of the TQ, which is the COLUMN-ORGANISED DATA for the
FACT_DX table.
__17. Notice that the data from FACT_DX (CTQ at node 4) is joined with PERSON_ROW, and
the final table queue (TQ at node 2) is formed after the HSJOIN.
__18. Click the TQ at node 2 and notice that it is a local table queue formed after the HSJOIN.
__19. Double-click Access Plan Diagram to restore the view to its original size.
__20. Select the second query and obtain its explain plan as in the previous steps.
__21. Check the explain plan for the same query using column-organized tables.
Note: The column table queue (CTQ at node 3) is now above the HSJOIN, and
that is the main difference between the two explain plans.
With column-organized tables, the goal is to see the CTQ above
every join. That was not the case for the first query, which joins
a row-organized table with a column-organized table.
__22. Repeat the same exercise for Queries 3 and 4 and compare the access paths.
__25. Repeat the same exercise for Queries 5 and 6 and compare the access paths.
__28. Notice that in all three cases the access path for the same query shows a lower cost
when all tables are column-organized. The most striking feature is the CTQ after the join
operation, which results in the lower cost.
__31. Type Work in the search box. Click Workload Table Organization Advisor.
__32. Change 30 to 0 and uncheck Prompt to run the Workload Statistics Advisor….
Click OK.
__34. Double-click tune04.sql and click OK. [Note: The file is in /home/db2psc/pot_blu/06tune.]
Note: We are selecting the ROWDB database, which has the same tables as
row-organized compressed tables. We will run the Workload Table Organization
Advisor for queries that use row-organized tables.
__37. Click the up arrow on the separator bar to collapse the panel.
__41. Right-click the Workload_0 line and click Invoke Workload Advisors and Tools.
__42. Check Re-collect EXPLAIN information before running workload advisors and
click Select What to Run…
__43. Check Table organization and click OK. [Statistics is checked by default.]
__45. The Workload Table Organization Advisor starts running. Wait for it to
complete.
__48. Click Show DDL Script. Notice that an ADMIN_MOVE_TABLE script is generated, which will
convert the table organization from row to column. Click OK.
__53. Run tune05 to run 15 queries against the ROWDB database, which has row-organized tables.
$ ./tune05
__54. Run tune06 to run the same 15 queries against the COLDB database, which has column-organized
tables.
$ ./tune06
__55. Run tune08 to compare the elapsed time for each SQL statement and note the percentage
improvement from using column-organized tables for the analytics workload.
Note: Overall, the workload ran 11 times faster than with the row-organized tables.
A few queries ran 50 times faster than with the row-organized tables.
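The two figures in the note measure different things: the workload figure compares total elapsed time, while the per-query figure compares individual statements. A minimal Python sketch of that arithmetic follows. The timings are made-up illustration values, not output from tune08, and the function names are hypothetical.

```python
def speedup(row_seconds, col_seconds):
    """Return how many times faster the column-organized run was."""
    return row_seconds / col_seconds


def percent_improvement(row_seconds, col_seconds):
    """Return the drop in elapsed time as a percentage of the row-organized time."""
    return (row_seconds - col_seconds) / row_seconds * 100.0


# Hypothetical elapsed times in seconds, one entry per query.
# These are NOT measurements from this lab.
row_times = [10.0, 4.0, 50.0]
col_times = [2.0, 4.0, 1.0]

# Per-query factor: each statement compared on its own.
per_query = [speedup(r, c) for r, c in zip(row_times, col_times)]

# Workload factor: total elapsed time compared across all statements.
overall = speedup(sum(row_times), sum(col_times))

for number, factor in enumerate(per_query, start=1):
    print(f"query {number}: {factor:.1f}x")
print(f"workload: {overall:.1f}x")
```

Because the workload factor is a ratio of totals, it can sit well below the best per-query factor.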
__57. The explain plan for the query against the ROWDB database is as shown.
__58. The details on Node 8 indicate that the Estimated Output Cardinality is approximately 60K.
__60. The explain plan for the query against the ROWDB database is as shown.
__64. The above analysis shows that the index scan selectivity for the query on the row-organized
tables is much lower than the selectivity for the same query on the column-organized tables.
__65. The data is in the buffer pool, but processing the higher selectivity takes longer than processing
the lower selectivity, so this particular query took less time with the row-organized tables.
__66. Notice that the elapsed times for the query with row-organized and column-organized tables are 2.86
and 3.04 seconds, respectively.
__67. The second execution of the same query took 0.22 and 2.98 seconds, respectively, because the
buffer pools were already primed by the previous execution.
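To put the timings from steps 66 and 67 side by side, the following Python sketch computes the ratios between them. The numbers are the ones quoted in those two steps; the variable and function names are hypothetical.

```python
# Elapsed seconds quoted in steps 66 and 67: (row-organized, column-organized).
runs = {
    "first execution": (2.86, 3.04),
    "second execution": (0.22, 2.98),
}


def column_over_row(row_seconds, col_seconds):
    """Return how many times longer the column-organized run took."""
    return col_seconds / row_seconds


ratios = {name: column_over_row(r, c) for name, (r, c) in runs.items()}

# Gain from the primed buffer pool, computed per table organization.
row_warm_gain = runs["first execution"][0] / runs["second execution"][0]
col_warm_gain = runs["first execution"][1] / runs["second execution"][1]

for name, ratio in ratios.items():
    print(f"{name}: column-organized took {ratio:.2f}x the row-organized time")
print(f"row-organized: second run {row_warm_gain:.1f}x faster than the first")
print(f"column-organized: second run {col_warm_gain:.2f}x faster than the first")
```

For this query, the row-organized run gains far more from the second execution than the column-organized run does.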
__68. The bottom line for column-organized tables is the absence of indexes, the ability to skip data,
access to only the required columns rather than full rows, SIMD processing to speed things
up, and in-memory processing of data with the ability to use both memory and disk if
the data does not fit in memory.
__69. You realize the maximum benefit by using column-organized tables for analytics workloads:
up to 10 times savings in storage and up to 50 times faster query execution.
__70. Note that column-organized tables are not a good fit for the low-selectivity, index-based scans
that are the hallmark of row-organized tables.
__71. Also note that you can use both row-organized and column-organized tables in the same database
and get the best of both worlds by running OLTP and analytics workloads in the same database.
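Two of the points in step 68, skipping data and reading only the required columns, can be illustrated with a toy model. The Python sketch below stores one column in fixed-size blocks, keeps the minimum and maximum value of each block, and skips any block whose range cannot contain the searched value. It illustrates the idea only and is not a description of how DB2 stores or scans column-organized data; the class, the function names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """A run of values from one column plus its min/max summary."""
    values: List[int]
    lo: int
    hi: int


def build_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into blocks and record each block's min and max."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def count_equal(blocks: List[Block], target: int):
    """Count rows equal to target; return (matches, blocks actually scanned)."""
    matches = 0
    scanned = 0
    for block in blocks:
        if target < block.lo or target > block.hi:
            continue  # the summary rules this block out, so its values are never read
        scanned += 1
        matches += block.values.count(target)
    return matches, scanned


# Hypothetical sample data: only the YEAR column is touched, never whole rows.
years = [2010] * 4 + [2011] * 4 + [2012] * 4
blocks = build_blocks(years, block_size=4)
matches, scanned = count_equal(blocks, 2011)
print(f"{matches} matching rows, {scanned} of {len(blocks)} blocks scanned")
```

Skipping works best when values are clustered, as in this sample; on randomly ordered data most block ranges overlap the target and few blocks are skipped.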