Lab 06 Query Optimization: Compare Access Path


Lab 06 – Query Optimization IBM Software

Lab 06 Query Optimization


Compare Access Path
__1. In the GNOME Terminal window, type cd6 to change to the Lab 06 directory.
$ cd6

__2. Type tune01 to create two row-organized tables, PERSON_ROW and PRODUCT_ROW.
$ ./tune01

Note: We will run a query that uses both row-organized and column-organized
tables, check the access path that DB2 chooses, and compare it with the access
path for the same query using only column-organized tables.

IBM DB2 10.5 BLU Acceleration Page 55



__3. Type tune02 to load data into the row-organized PERSON_ROW and PRODUCT_ROW tables.
$ ./tune02

__4. Click [Administer Databases] on the task bar to bring Data Studio into focus. [Optional:
If you closed Data Studio, start it by double-clicking the Data Studio icon on the
desktop, click OK to accept the default workspace, and establish a connection to the COLDB
database. See Lab 02 for detailed instructions.]

__5. Click File > Open File…

__6. Double click pot_blu

Page 56 An IBM Proof of Technology



__7. Double click 06tune.

__8. Double click tune03.sql to open.

__9. Click OK to accept the default statement terminator.

__10. Click No Connection.


__11. Click COLDB to select and then click Finish.

__12. Click the up arrow on the separator bar to hide the panel.

__13. Use your mouse to select the first SQL statement and click the Open Visual Explain button.


__14. Double click Access Plan Diagram to maximize the view. Click Finish.

__15. Click TQ at node 4 to select it.


__16. In the Attributes section (left-hand side), click the down arrow and select All attributes.
Scroll all the way down and check the origin of the TQ, which is COLUMN-ORGANISED DATA for the
FACT_DX table.

__17. Notice that the data from FACT_DX (CTQ at node 4) is joined with PERSON_ROW, and
the final table queue (TQ at node 2) is formed after the HSJOIN.

__18. Click TQ at node 2 and notice that it is a local table queue formed after HSJOIN.


__19. Double click Access Plan Diagram to bring the view to its original position.

__20. Select the 2nd query and obtain the explain plan as we did in previous steps.

__21. Check the explain plan for the same query using column organized tables.

Note: The Column Table Queue (CTQ at node 3) is now above the HSJOIN, and
that is the main difference between the two explain plans.

With column-organized tables, the goal is to see the CTQ above every join,
whatever the join type. That was not the case for the 1st query, which joins
a row-organized table with a column-organized table.


__22. Repeat the same exercise for Queries 3 and 4 and check the difference in the access path.

__23. Query – 3 access path.

__24. Query – 4 access path.


__25. Repeat the same exercise for Queries 5 and 6 and check the difference in the access path.

__26. Query 5 access path.


__27. Query – 6 access path.

__28. Notice that in all 3 cases above, the access path for the same query shows a lower cost
when all tables are column-organized. The most striking feature is that the CTQ comes after the
JOIN operation, which results in the lower cost.

__29. Close tab tune03.sql.
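
The rule from the note after step 21 (the CTQ should sit above every join) can be expressed as a small check on a plan tree. The Python sketch below is only an illustration: the `PlanNode` class, the `ctq_above_all_joins` helper, and both hand-built trees are hypothetical and simplified, and they are not produced by Visual Explain or by any DB2 API. Only the operator names come from the plans described in this lab.

```python
from dataclasses import dataclass, field
from typing import List

# Join operators to look for; HSJOIN is the one seen in this lab.
JOIN_OPS = {"HSJOIN", "NLJOIN", "MSJOIN"}


@dataclass
class PlanNode:
    """Hypothetical, minimal stand-in for one operator in an access plan."""
    op: str
    children: List["PlanNode"] = field(default_factory=list)


def ctq_above_all_joins(node: PlanNode, under_ctq: bool = False) -> bool:
    """Return True if every join in the plan has a CTQ somewhere above it."""
    if node.op in JOIN_OPS and not under_ctq:
        return False
    under_ctq = under_ctq or node.op == "CTQ"
    return all(ctq_above_all_joins(child, under_ctq) for child in node.children)


# Simplified shape of query 1: the CTQ feeds the HSJOIN from below.
mixed_plan = PlanNode("RETURN", [
    PlanNode("TQ", [
        PlanNode("HSJOIN", [
            PlanNode("CTQ", [PlanNode("TBSCAN")]),  # column-organized FACT_DX
            PlanNode("TBSCAN"),                     # row-organized PERSON_ROW
        ]),
    ]),
])

# Simplified shape of query 2: the CTQ sits above the HSJOIN.
column_plan = PlanNode("RETURN", [
    PlanNode("CTQ", [
        PlanNode("HSJOIN", [PlanNode("TBSCAN"), PlanNode("TBSCAN")]),
    ]),
])

print(ctq_above_all_joins(mixed_plan))   # False
print(ctq_above_all_joins(column_plan))  # True
```

The check walks the tree from the top and remembers whether a CTQ has already been passed, so a join reached before any CTQ fails the rule.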


Tune Analytics Workload


__30. Click Window > Preferences.

__31. Type Work in the search box. Click Workload Table Organization Advisor.

__32. Change 30 to 0 and uncheck Prompt to run the Workload Statistics Advisor….
Click OK.


__33. Click File > Open File…

__34. Double click tune04.sql and click OK. [Note: The file is in /home/db2psc/pot_blu/06tune].

__35. Click No Connection.

__36. Select ROWDB database and click Finish.

Note: We select the ROWDB database, which has the same tables as
row-organized, compressed tables. We will run the Workload Table
Organization Advisor for queries that use row-organized tables.


__37. Click the up arrow in the separator bar to collapse the panel.

__38. Click Start Tuning…

__39. Click Save All to Workload…

__40. Click OK.


__41. Right click on the Workload_0 line and click Invoke Workload Advisors and Tools.

Run Table Organization Advisor

__42. Check Re-collect EXPLAIN information before running workload advisors and
click Select What to Run…

__43. Check Table organization and click OK. [Statistics is checked by default.]


__44. In the next screen, click Start Explain.

__45. The Workload Table Organization Advisor starts running. Wait for it to
complete.

__46. View Summary and click Table organization tab.

__47. View the estimated performance improvements and recommendations.


__48. Click Show DDL Script. Notice that an ADMIN_MOVE_TABLE script is generated, which
converts the table organization from row to column. Click OK.

__49. Close the Workload node view.

__50. Click Save and Exit.

__51. Click File > Exit to close Data Studio.


Run Queries and Compare


__52. Run cat tune07.sql to see the 15 queries that we will run against ROWDB and COLDB
databases.
$ cat tune07.sql

__53. Run tune05 to run the 15 queries against the ROWDB database, which has row-organized tables.
$ ./tune05


__54. Run tune06 to run the same 15 queries against the COLDB database, which has column-organized
tables.
$ ./tune06

__55. Run tune08 to compare the elapsed time for each SQL statement and notice the percentage
improvement by the use of column organized tables for the analytics workload.

Note: The workload ran about 11 times faster on the column-organized tables
than on the row-organized tables.

A few queries ran 50 times faster than on the row-organized tables.
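
Figures like a per-statement percentage improvement and an overall workload speedup can be derived from two lists of elapsed times. The Python sketch below shows one plausible way to do that arithmetic; it is not the logic of the tune08 script, which this lab does not show. The function names and the three sample timings are hypothetical placeholders, so substitute the elapsed times you observe.

```python
def percent_improvement(row_secs: float, col_secs: float) -> float:
    """Percentage of the row-organized elapsed time saved by the column-organized run."""
    return (row_secs - col_secs) / row_secs * 100.0


def workload_speedup(row_times, col_times) -> float:
    """Total row-organized elapsed time divided by total column-organized elapsed time."""
    return sum(row_times) / sum(col_times)


# Hypothetical elapsed times in seconds, NOT results from this lab.
row_times = [10.0, 4.0, 0.5]
col_times = [0.2, 0.8, 1.0]

for number, (row_secs, col_secs) in enumerate(zip(row_times, col_times), start=1):
    change = percent_improvement(row_secs, col_secs)
    print(f"statement {number}: {change:7.1f}% improvement")

print(f"workload speedup: {workload_speedup(row_times, col_times):.2f}x")
```

A negative percentage marks a statement that ran slower on the column-organized tables. Because the workload figure is computed from totals, it can still show a large speedup when one or two statements regress.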


Investigate Slower Query in Column Organized


__56. Note that all queries except 2 ran faster in the column-organized database. Statements 13
and 14 are the same SQL, and the second execution of that SQL ran much faster on the row-organized
tables.

__57. The explain plan for the SQL for the query against ROWDB database is as shown.


__58. The details on Node 8 indicate that the Estimated Output Cardinality is approx. 60K.

__59. The Estimated Output Cardinality for Node 11 is 836K.


__60. The explain plan for the SQL for the query against COLDB database is as shown.

__61. The Estimated Output Cardinality for Node 6 is 50 million.


__62. The Estimated Output Cardinality for Node 7 is 59753.

__63. The Estimated Output Cardinality for Node 5 is 60 million.

__64. The above analysis shows that the index scan selectivity for the query on the row-organized
tables is far lower (far fewer rows qualify) than the selectivity for the same query on the
column-organized tables.

__65. The data is in the buffer pool, but processing the higher-selectivity access takes longer than
the lower-selectivity one, so this particular query took less time on the row-organized tables.

__66. Notice that the elapsed time for the query on the row-organized and column-organized tables is
2.86 and 3.04 seconds, respectively.


__67. The second execution of the same query took 0.22 and 2.98 seconds, respectively, because the
buffer pools were already primed by the previous execution.

__68. The bottom line for column-organized tables is the absence of indexes, the ability to skip
data, access to only the required columns instead of full rows, SIMD processing to speed things
up, and in-memory processing of data with the ability to use both memory and disk when
memory is not large enough to hold the data.

__69. You will realize the maximum benefit by using column-organized tables for analytics workloads:
up to 10 times savings in storage and up to 50 times faster query execution.

__70. Note that column-organized tables are not a good fit for low-selectivity, index-based scans,
which are the hallmark of row-organized tables.

__71. Also note that you can use both row-organized and column-organized tables in the same database
and get the best of both worlds, with OLTP and analytics workloads in the same database.

__72. Type clear in GNOME Terminal window.


$ clear
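
The elapsed times quoted in steps 66 and 67 can be put side by side to see how much each table organization gained from the primed buffer pool. The Python sketch below uses only the four numbers from those steps; the variable names and the `ratio` helper are illustrative.

```python
# Elapsed seconds quoted in steps 66 and 67 (first and second execution).
row_first, col_first = 2.86, 3.04
row_second, col_second = 0.22, 2.98


def ratio(slower: float, faster: float) -> float:
    """How many times longer the slower run took than the faster run."""
    return slower / faster


# Gain from the primed buffer pool, per table organization.
print(f"row-organized, first vs second run:    {ratio(row_first, row_second):.1f}x")
print(f"column-organized, first vs second run: {ratio(col_first, col_second):.2f}x")

# Gap between the two organizations on the second (warm) run.
print(f"second run, column vs row:             {ratio(col_second, row_second):.1f}x")
```

With these figures, the row-organized run speeds up roughly 13 times once the buffer pool is primed, while the column-organized run barely changes. That is why the gap between the two is so much larger on the second execution.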

** End of Lab 06: Query Optimization
