Lab 06 Query Optimization: Compare Access Path

Lab 06 – Query Optimization IBM Software
Lab 06 Query Optimization

Compare Access Path
__1. In the GNOME Command window, type cd6 to change to the Lab 06 directory.
$ cd6
__2. Type tune01 to create two row organized tables PERSON_ROW and PRODUCT_ROW.
$ ./tune01
Note: We will run a query that mixes row-organized and column-organized tables,
check the access path chosen by DB2, and compare it with the access path for
the same query using only column-organized tables.
IBM DB2 10.5 BLU Acceleration Page 55

__3. Type tune02 to load data in row organized PERSON_ROW and PRODUCT_ROW tables.
$ ./tune02
__4. Click [Administer Databases] on the task bar to bring focus to Data Studio. [Optional:
If you have closed Data Studio, start it by double-clicking the Data Studio icon on the
desktop, click OK to accept the default workspace, and connect to the COLDB
database. See Lab 02 for detailed instructions.]
__5. Click File > Open File…
__6. Double click pot_blu
Page 56 An IBM Proof of Technology

__7. Double click 06tune.
__8. Double click tune03.sql to open.
__9. Click OK to accept the default statement terminator.
__10. Click No Connection.

__11. Click COLDB to select and then click Finish.
__12. Click the up arrow on the separator bar to hide the panel.
__13. Use your mouse to select the first SQL statement and click the Open Visual Explain button.

__14. Double click Access Plan Diagram to maximize the view. Click Finish.
__15. Click TQ at node 4 to select it.

__16. In the Attributes section (left-hand side), click the down arrow and select All attributes. Scroll all
the way down and check the origin of the TQ, which is the COLUMN-ORGANISED DATA for the
FACT_DX table.
__17. Notice that the data from FACT_DX (CTQ at node 4) is joined with PERSON_ROW, and
the final table queue (TQ at node 2) is formed after the HSJOIN.
__18. Click TQ at node 2 and notice that it is a local table queue formed after HSJOIN.

__19. Double click Access Plan Diagram to bring the view to its original position.
__20. Select the 2nd query and obtain the explain plan as we did in previous steps.
__21. Check the explain plan for the same query using column organized tables.
Note: The Column Table Queue (CTQ at node 3) is now above the HSJOIN, and
that is the main difference between the two explain plans.
With column-organized tables, the goal is to see the CTQ above
every join. This was not the case for the first query, which joins
a row-organized table with a column-organized table.

__22. Repeat the same exercise for Query 3 and 4 and check the difference in the access path.
__23. Query 3 access path.
__24. Query 4 access path.

__25. Repeat same exercise for Query 5 and 6 and check the difference in the access path.
__26. Query 5 access path.
__27. Query 6 access path.

__28. Notice that in all three cases above, the access path for the same query shows a lower cost
when all tables are column organized. The most striking feature is that the CTQ comes after the
join operation, which results in the lower cost.
__29. Close tab tune03.sql.

Tune Analytics Workload

__30. Click Window > Preferences.
__31. Type Work in the search box. Click Workload Table Organization Advisor.
__32. Change 30 to 0 and uncheck Prompt to run the Workload Statistics Advisor….
Click OK.

__33. Click File > Open File…
__34. Double click tune04.sql and hit OK. [Note: The file is in /home/db2psc/pot_blu/06tune].
__35. Click No Connection.
__36. Select ROWDB database and click Finish.
Note: We select the ROWDB database, which holds the same tables as compressed
row-organized tables. We will run the Workload Table Organization Advisor
for queries that use the row-organized tables.

__37. Click up arrow key in the separator bar to collapse the panel.
__38. Click Start Tuning…
__39. Click Save All to Workload…
__40. Click OK.

__41. Right click on the Workload_0 line and click Invoke Workload Advisors and Tools.
Run Table Organization Advisor
__42. Check Re-collect EXPLAIN information before running workload advisors and
click Select What to Run…
__43. Check Table organization and click OK. [Statistics is checked by default.]

__44. In the next screen, click Start Explain.
__45. The Workload Table Organization Advisor starts running. Wait for it to
complete.
__46. View Summary and click Table organization tab.
__47. View the estimated performance improvements and recommendations.

__48. Click Show DDL Script. Notice that an ADMIN_MOVE_TABLE script is generated, which will
convert the table organization from row to column. Click OK.
__49. Close the Workload node view.
__50. Click Save and Exit.
__51. Click File and Exit to close the Data Studio.

Run Queries and Compare

__52. Run cat tune07.sql to see the 15 queries that we will run against ROWDB and COLDB
databases.
$ cat tune07.sql
__53. Run tune05 to run the 15 queries against the ROWDB database, which has row-organized tables.
$ ./tune05

__54. Run tune06 to run the same 15 queries against the COLDB database, which has column-organized
tables.
$ ./tune06
__55. Run tune08 to compare the elapsed time for each SQL statement and notice the percentage
improvement from using column-organized tables for the analytics workload.
Note: The workload ran 11 times faster than with the row-organized tables.
A few queries ran 50 times faster than with the row-organized tables.
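Note: An "N times faster" figure can be converted to a percentage improvement with simple arithmetic. The sketch below is illustrative only. The formula used by tune08 is not shown in this lab, so the sketch assumes that percentage improvement means the reduction in elapsed time relative to the row-organized run. The function name percent_reduction is hypothetical.

```python
def percent_reduction(speedup: float) -> float:
    """Return the percentage by which elapsed time shrinks for a speedup factor.

    Assumption: improvement = (1 - new_time / old_time) * 100,
    where old_time / new_time is the speedup factor.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) * 100.0


# The two speedup factors quoted in the note above.
for factor in (11, 50):
    reduction = percent_reduction(factor)
    print(f"{factor}x faster: elapsed time reduced by {reduction:.1f}%")
```

Under this assumption, large speedup factors all map to improvements close to 100%, so the "times faster" figure is the more informative one when comparing individual queries.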

Investigate Slower Query in Column Organized

__56. Note that all queries except two ran faster in the column-organized database. Statements 13
and 14 are the same SQL, and the second execution of that SQL ran much faster on the row-organized tables.
__57. The explain plan for the SQL for the query against ROWDB database is as shown.

__58. The details on Node 8 indicate that the Estimated Output Cardinality is approx. 60K.
__59. The Estimated Output Cardinality for Node 11 is 836K.

__60. The explain plan for the SQL for the query against the COLDB database is as shown.
__61. The Estimated Output Cardinality for Node 6 is 50 million.

__62. The Estimated Output Cardinality for Node 7 is 59753.
__63. The Estimated Output Cardinality for Node 5 is 60 million.
__64. The above analysis shows that the index scan selectivity for the query on the row-organized
tables is far lower than the selectivity for the same query on the column-organized tables.
__65. The data is in the buffer pool, but processing the higher-selectivity access takes longer than the
lower-selectivity one, so this particular query took less time on the row-organized tables.
__66. Notice that the elapsed time for the query on the row-organized and column-organized tables is 2.86
and 3.04 seconds, respectively.

__67. The second execution of the same query took 0.22 and 2.98 seconds, respectively, because the
buffer pools were already primed by the previous execution.
__68. The bottom line for column-organized tables is the absence of indexes, the ability to skip data,
access to only the required columns instead of full rows, SIMD processing to speed things
up, and in-memory processing of data, with the ability to use both memory and disk if
the data does not fit in memory.
__69. You will realize the maximum benefit by using column-organized tables for analytics workloads:
up to 10 times savings in storage and up to 50 times faster query execution.
__70. Note that column-organized tables are not well suited to the low-selectivity, index-based scans
that are the hallmark of row-organized tables.
__71. Note also that you can use both row-organized and column-organized tables in the same database
and get the best of both worlds, running OLTP and analytics workloads in the same database.
__72. Type clear in GNOME Terminal window.

$ clear
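The elapsed times quoted in steps 66 and 67 can be compared with a few lines of arithmetic. The sketch below uses only those four figures; the variable and function names are hypothetical and are not part of the lab scripts.

```python
# Elapsed seconds from steps 66 and 67 of this lab.
first_run = {"row": 2.86, "column": 3.04}   # first execution
second_run = {"row": 0.22, "column": 2.98}  # second execution, buffer pools primed


def ratio(slower: float, faster: float) -> float:
    """Return how many times longer the slower run took."""
    return slower / faster


# Column-organized versus row-organized, per execution.
col_vs_row_first = ratio(first_run["column"], first_run["row"])
col_vs_row_second = ratio(second_run["column"], second_run["row"])

# Gain from the primed buffer pool, per table organization.
row_warm_gain = ratio(first_run["row"], second_run["row"])
col_warm_gain = ratio(first_run["column"], second_run["column"])

print(f"First run:  column took {col_vs_row_first:.2f}x the row time")
print(f"Second run: column took {col_vs_row_second:.2f}x the row time")
print(f"Row tables:    second run {row_warm_gain:.2f}x faster than first")
print(f"Column tables: second run {col_warm_gain:.2f}x faster than first")
```

The ratios make the pattern in steps 66 and 67 explicit: the row-organized query gained far more from the second execution than the column-organized query did, which is why the gap between the two widens on the repeat run.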
** End of Lab 06: Query Optimization
