Adaptive Optimization of Very Large Join Queries
Thomas Neumann, Bernhard Radke
Conference: SIGMOD 2018
Thomas Neumann
Professor at the Technical University of Munich
Six papers from Thomas Neumann's group were accepted at VLDB 2020
Adaptive join for large queries: the project received funding from the European Research Council.
Motivation for adaptive optimization
Formalizing the problem
• Sum of sizes of intermediate results
• Follows
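The "sum of sizes of intermediate results" cost can be sketched as follows. This is a minimal illustration, not the authors' code: the `Rel` and `Join` classes, the independence assumption for cardinality estimates, and the example cardinalities and selectivities are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Rel:
    name: str
    card: float  # estimated number of tuples (hypothetical value)


@dataclass(frozen=True)
class Join:
    left: "Plan"
    right: "Plan"
    sel: float  # combined selectivity of the predicates applied at this join


Plan = Union[Rel, Join]


def card(p: Plan) -> float:
    """Estimated output size, assuming independent predicates."""
    if isinstance(p, Rel):
        return p.card
    return card(p.left) * card(p.right) * p.sel


def c_out(p: Plan) -> float:
    """Sum of the sizes of all intermediate results; base relations cost 0."""
    if isinstance(p, Rel):
        return 0.0
    return card(p) + c_out(p.left) + c_out(p.right)


# Two join orders for the chain A - B - C with made-up statistics.
a, b, c = Rel("A", 8), Rel("B", 16), Rel("C", 4)
ab_then_c = Join(Join(a, b, 0.5), c, 0.125)   # (A join B) join C
a_then_bc = Join(a, Join(b, c, 0.125), 0.5)   # A join (B join C)
print(c_out(ab_then_c), c_out(a_then_bc))     # 96.0 40.0
```

Both plans produce the same final result size, but the second one has a much smaller intermediate result, which is what this cost function rewards.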
Target: commercial databases (which also support non-inner joins, outer joins, the ASI property, etc.). The optimizer should adapt and perform well for both simple queries and complex queries with a large number of relations.
Query → Query Graph Generator → Query Graph (Q = (V, E)) → Query Optimizer
Queries can be divided into three categories (depending on the query graph and the number of relations)
Small Queries
CountCC(Q)
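One way to read CountCC(Q) is as a count of the connected subgraphs of the query graph that stops early once a budget is reached, so that the check stays cheap on large queries. The sketch below assumes a simple (non-hyper) graph stored as neighbor bitmasks and uses the subgraph enumeration scheme known from DPccp. The function name `count_cc` and the `budget` parameter are illustrative choices, not taken from the slides.

```python
def count_cc(adj, budget):
    """Count connected subgraphs of a simple graph, capped at `budget`.

    adj[i] is a bitmask of the neighbors of node i (hypothetical encoding).
    """
    n = len(adj)
    count = 0

    def neighborhood(s, x):
        # All neighbors of the node set s that are neither in s nor excluded by x.
        nb, m = 0, s
        while m:
            low = m & -m
            nb |= adj[low.bit_length() - 1]
            m ^= low
        return nb & ~s & ~x

    def rec(s, x):
        nonlocal count
        nb = neighborhood(s, x)
        sub = nb
        while sub:  # every non-empty subset of nb extends s to a new subgraph
            count += 1
            if count >= budget:
                return
            sub = (sub - 1) & nb
        sub = nb
        while sub:  # grow each extension further, excluding nb to avoid duplicates
            rec(s | sub, x | nb)
            if count >= budget:
                return
            sub = (sub - 1) & nb

    for i in range(n - 1, -1, -1):
        count += 1  # the single node {i}
        if count >= budget:
            break
        rec(1 << i, (1 << (i + 1)) - 1)  # exclude all nodes numbered <= i
        if count >= budget:
            break
    return min(count, budget)


chain = [0b010, 0b101, 0b010]              # 0 - 1 - 2
clique = [0b1110, 0b1101, 0b1011, 0b0111]  # 4 fully connected nodes
print(count_cc(chain, 10_000), count_cc(clique, 10_000), count_cc(clique, 5))
```

If the returned value is below the budget, the search space is small enough for exhaustive dynamic programming. Hitting the budget signals that a cheaper strategy is needed.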
Basic Idea of Hypergraphs
Hypergraph
Subgraph
Connected
Connected Subgraph
Connected Complement Subgraph
CSG-CMP pair
Neighborhood
Min(s) (and ordering)
Intuition of DPHyp
Image source: Dynamic Programming Strikes Back, by Guido Moerkotte and Thomas Neumann
Medium Queries
DPHyp
IKKBZ Algorithm (IK/KBZ)
Named after Toshihide Ibaraki and Tiko Kameda (IK), and Ravi Krishnamurthy, Haran Boral, and Carlo Zaniolo (KBZ)
Cout = (1-sel)/costs
Time Complexity: O(n^2)
LinearizedDP
[Figure: query graph]
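The DP phase over a linearized order can be pictured as a matrix-chain-style dynamic program that only combines contiguous subsequences of the order, which gives the cubic running time. The sketch below is a simplified illustration under stated assumptions: the linear order is given as input (for example from IKKBZ), cost is the sum of intermediate result sizes, predicates are independent, and pairs without a join predicate get selectivity 1 (so cross products are not filtered out here). The function name `linearized_dp` and all statistics are hypothetical.

```python
from math import inf


def linearized_dp(order, card, sel):
    """Interval DP over a fixed linear order of relations.

    order: list of relation names (the linearization, assumed given)
    card:  dict name -> estimated cardinality
    sel:   dict frozenset({a, b}) -> selectivity of the join edge a - b
    Returns (cost, plan), where plan is a nested tuple of relation names.
    """
    n = len(order)
    # size[i][j]: estimated cardinality of joining order[i..j]
    size = [[0.0] * n for _ in range(n)]
    for i in range(n):
        size[i][i] = float(card[order[i]])
        for j in range(i + 1, n):
            s = size[i][j - 1] * card[order[j]]
            for k in range(i, j):
                s *= sel.get(frozenset((order[k], order[j])), 1.0)
            size[i][j] = s

    cost = [[inf] * n for _ in range(n)]
    plan = [[None] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = 0.0
        plan[i][i] = order[i]

    # Three nested loops over (length, start, split point): O(n^3).
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                c = cost[i][k] + cost[k + 1][j] + size[i][j]
                if c < cost[i][j]:
                    cost[i][j] = c
                    plan[i][j] = (plan[i][k], plan[k + 1][j])
    return cost[0][n - 1], plan[0][n - 1]


cards = {"A": 8, "B": 16, "C": 4}
sels = {frozenset(("A", "B")): 0.5, frozenset(("B", "C")): 0.125}
print(linearized_dp(["A", "B", "C"], cards, sels))  # (40.0, ('A', ('B', 'C')))
```

Because only contiguous intervals are considered, the DP table has O(n^2) entries instead of one entry per connected subgraph, at the price of possibly missing plans that do not respect the linear order.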
Large Queries
The problem with LinearizedDP is its O(n^3) DP phase, which is fine for up to 100 relations.
Greedy approach + DP (idea from Iterative DP)
Greedy algorithm used: Greedy Operator Ordering (GOO)
GOO produces good bushy plans and runs efficiently.
Run LinearizedDP iteratively over subplans of size k (k = 100), each time choosing the subplan with maximum cost and size k.
LinearizedDP runs until the whole budget (size of the DP table) is used up.
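The greedy step can be sketched as follows: GOO keeps a forest of join trees and repeatedly merges the two connected trees whose join produces the smallest estimated result. This is a simplified quadratic-scan illustration, assuming independent predicates; the function name `goo`, the data layout, and the statistics are hypothetical. The later refinement of expensive subplans with LinearizedDP is not shown.

```python
def goo(card, sel):
    """Greedy Operator Ordering sketch.

    card: dict name -> estimated cardinality
    sel:  dict frozenset({a, b}) -> selectivity of the join edge a - b
    Returns (plan, total), where plan is a nested tuple (possibly bushy)
    and total is the sum of all intermediate result sizes.
    """
    # Each tree is (set of relations, plan, estimated cardinality).
    trees = [(frozenset([r]), r, float(c)) for r, c in card.items()]
    total = 0.0
    while len(trees) > 1:
        best = None
        for a in range(len(trees)):
            for b in range(a + 1, len(trees)):
                ra, _, ca = trees[a]
                rb, _, cb = trees[b]
                s, connected = 1.0, False
                for edge, f in sel.items():
                    x, y = tuple(edge)
                    if (x in ra and y in rb) or (x in rb and y in ra):
                        s *= f
                        connected = True
                if not connected:
                    continue  # never introduce a cross product
                size = ca * cb * s
                if best is None or size < best[0]:
                    best = (size, a, b)
        if best is None:
            raise ValueError("query graph is not connected")
        size, a, b = best
        merged = (trees[a][0] | trees[b][0], (trees[a][1], trees[b][1]), size)
        trees = [t for i, t in enumerate(trees) if i not in (a, b)] + [merged]
        total += size
    return trees[0][1], total


cards = {"A": 8, "B": 16, "C": 4, "D": 32}
sels = {
    frozenset(("A", "B")): 0.5,
    frozenset(("B", "C")): 0.125,
    frozenset(("C", "D")): 0.25,
}
plan, total = goo(cards, sels)
print(plan, total)
```

On this chain the cheapest pair (B, C) is joined first, then A is attached, then D, for a total intermediate size of 8 + 32 + 256 = 296.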
GOO-DP
Join ordering constructed with GOO
Image source: A New Heuristic for Optimizing Large Queries, Leonidas Fegaras
LinearizedDP++:
The adaptive algorithm works for non-inner and cross joins
Experimental Evaluation
Some Details
Median optimization time for different queries
[Figure: Median Optimization Time for Random Tree Queries of Sizes 10–100 (100 queries per size)]
[Figure: Median Optimization Time for Random Tree Queries of Sizes 10–1000 (100 queries per size)]
Comparison with existing database systems
Conclusion
Questions