OpenSAP Hanasql1 Week 2 Transcript en

openSAP
A First Step Towards SAP HANA Query Optimization
Week 2 Unit 1
00:00:05 Hello and welcome to the course, A First Step Towards SAP HANA Query Optimization.
00:00:11 My name is Helen Shin. Last week we talked about SAP HANA query processing.
00:00:18 In this week, we'll learn about column search and analysis tools. Today, in unit one of week
two, we'll focus on column search.
00:00:31 Let's have a look at an overview of column search. We have discovered that after a plan is
generated, depending on the plan,
00:00:40 the different execution engines take over the job. When the generated plan consists of
columnar operators,
00:00:51 the column engine handles the plan. Here is one example: T means table, J is JOIN, and G is
GROUP BY in the plan.
00:01:03 Let's say T1, T2, and T3 are all column store tables. Then we can tell that the JOINs J12 and
J13 are JOINs between column store tables.
00:01:18 This is a plan that consists of columnar operators. Therefore, the column engine handles
this generated plan.
00:01:32 When the column engine processes the data, it needs a composite operator. A composite
operator is one big package that contains one or several operators,
00:01:46 and those operators are executed within that one composite operator. We call this
a "column search".
00:01:54 As I mentioned earlier, column search is available only for columnar operators that process
column tables or other columnar operators.
00:02:05 If the plan consists of row store tables, then no column search is created. There is another
important concept
00:02:17 regarding column search, which is pushdown blockers. A pushdown blocker is anything
that prevents a parent operator
00:02:28 from being pushed down into the column search. For example, if there is a pushdown blocker,
00:02:35 as you can see, the operator B cannot be moved into column search #1. As in the diagram
on the right-hand side,
00:02:48 if there is a part that is processed by the row engine, this is also a pushdown blocker.
00:02:57 Let's have a look at how the operators are processed if there is a pushdown blocker.
00:03:04 If there is a pushdown blocker, data materialization occurs. Data materialization is an
operation
00:03:14 where the intermediate result is written to a physical temporary table. Usually, we can say
that data materialization is expensive.
00:03:25 In many out-of-memory events, it is found that a large part of the memory is allocated
00:03:31 for data materialization during the JOIN. Even for non-OOM performance issues,
00:03:38 materialization of the vast amount of the data normally takes a long time. Column search
processes the natively supported operators
00:03:53 in a predefined order. In this predefined order,
00:03:57 tables are processed first, and then the JOIN and GROUP BY operators are processed.
00:04:04 After that, the ORDER BY operator is handled. Let's have a look at an example.
00:04:13 Here's one column search that consists of two JOINs and there is GROUP BY at the top.
00:04:20 And there is another JOIN, J14, where the table T4 tries to be pushed down into column
search #1.
00:04:31 However, due to the predefined order in column search,
00:04:35 JOIN J14 with the table T4 cannot be pushed down into column search #1.
00:04:42 Therefore, here, GROUP BY is a pushdown blocker. Column search can process natively
supported operators,
00:04:55 such as columnar operators. However, there are some operators that the column search
does not support,
00:05:02 for example, outer cyclic joins or window functions. As an example, let's look at the column
search.
00:05:14 In this query plan, there is a JOIN, J12, between table T1 and table T2, and there is another
JOIN between J12 and table T3.
00:05:26 On top of this, there is a GROUP BY. Here, let's assume the JOIN J13 contains outer cyclic
joins;
00:05:36 then the column search is split as shown on the right-hand side. Let's look at another example.
00:05:47 Here is one column search that consists of the JOIN J12 with the tables T1 and T2,
00:05:55 and there is another JOIN, J13, between J12 and the table T3, and there is a GROUP BY at the
top.
00:06:04 Here, if J13 contains operators that cannot be processed in column searches in any way,
00:06:12 these are processed by the row engine. Now we'll discover the concept of stacked column
search.
00:06:24 Let's suppose that the two logical plans are available during cost-based query optimization.
00:06:31 Let's assume those are all columnar operators. With this plan, we can imagine the following
column search.
00:06:42 Here, as you can see, two column searches are made on the left-hand side
00:06:49 while one big column search is made on the right-hand side. The reason why two column
searches are made on the left-hand side
00:07:00 is because a JOIN, J13, cannot be placed after GROUP BY,
00:07:06 due to column search's predefined order. Therefore, two column searches are made.
00:07:17 A stacked column search is what results when a column search is split into multiple
column searches.
00:07:28 Stacked column searches cause data materialization, while single column searches do not
need to process intermediate results.
00:07:41 For example, records from the GROUP BY within column search #1 are materialized.
00:07:51 If the stacked column search is slow because of the data materialization, then we can
consider making it a single column search using SQL hints.
00:08:03 In this example, we can think about SQL hints like A_THRU_B. And among candidates of
A_THRU_B,
00:08:13 we can come up with the SQL hint, JOIN_THRU_AGGR (aggregation). With the hint
JOIN_THRU_AGGR in this example,
00:08:24 we can make the JOIN positioned before GROUP BY and one single column search is
made.
00:08:33 So we can prevent a plan from becoming a stacked column search. On the other hand,
a stacked column search has an advantage:
00:08:44 it can benefit from pushdown blockers that reduce the intermediate result. Let's assume a
single column search generates too many intermediate results.
00:08:57 If you want to split it, then you can also think about SQL hints. You turn one
single column search into multiple column searches.
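The hint usage described above can be sketched as follows. This is a minimal illustration, not the query from the slides: the tables T1, T2, T3 and their columns are hypothetical, and NO_JOIN_THRU_AGGR is assumed here to be the negated counterpart of the JOIN_THRU_AGGR hint named in the lecture. A hint only steers the optimizer, so the resulting plan should still be checked.

```sql
-- Hypothetical column store tables, for illustration only.
CREATE COLUMN TABLE T1 (A INT, B INT);
CREATE COLUMN TABLE T2 (A INT, VAL INT);
CREATE COLUMN TABLE T3 (A INT, C INT);

-- The JOIN with T3 sits on top of a GROUP BY. With JOIN_THRU_AGGR the
-- optimizer is asked to move the JOIN below the aggregation, so that
-- everything may fit into one single column search.
SELECT AGG.A, AGG.TOTAL, T3.C
FROM (
    SELECT T1.A, SUM(T2.VAL) AS TOTAL
    FROM T1
    INNER JOIN T2 ON T1.A = T2.A
    GROUP BY T1.A
) AGG
INNER JOIN T3 ON AGG.A = T3.A
WITH HINT (JOIN_THRU_AGGR);

-- Opposite direction (assumed hint name): keep the JOIN above the
-- GROUP BY, which tends to leave the plan as a stacked column search.
SELECT AGG.A, AGG.TOTAL, T3.C
FROM (
    SELECT T1.A, SUM(T2.VAL) AS TOTAL
    FROM T1
    INNER JOIN T2 ON T1.A = T2.A
    GROUP BY T1.A
) AGG
INNER JOIN T3 ON AGG.A = T3.A
WITH HINT (NO_JOIN_THRU_AGGR);
```

Comparing the explain plan of both variants shows whether the number of column searches actually changed.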
00:09:15 So we can summarize the characteristics of stacked column search and single column
search as follows.
00:09:27 In terms of stacked column search, data materialization can occur,
00:09:32 while we can also benefit from pushdown blockers that reduce the intermediate
result.
00:09:40 On the other hand, there is the advantage that the data materialization is not required for a
single column search.
00:09:51 However, there can be a large intermediate result generated. That's the end of the topic of
column search.
00:10:03 In the next unit, my colleague Jinyeon Lee will present to you the topic of SQL Trace.
00:10:11 Thank you for your attention. Bye.
Week 2 Unit 2
00:00:05 Hello and welcome. I'm Jinyeon Lee.

00:00:08 There are many useful analysis tools in HANA. Depending on your needs and situation,
00:00:13 you can collect traces using various analysis tools. As collecting traces is the first step in
analyzing a performance issue,
00:00:22 it is very important to capture the traces correctly. First of all, as a topic for unit two of week
two,
00:00:30 we will focus on SQL trace. The SQL trace captures every single
00:00:39 SQL statement that enters the database. When you work on a performance issue, the most
time-consuming part is
00:00:47 to understand the performance issue and business scenario. From a database point of view,
00:00:55 the very first scenario that we encounter is a query itself. Therefore, it is important to find
00:01:03 which queries come into the database layer, in which order, and which part causes the
slowness.
00:01:11 In this case, we can use SQL trace. There can be two use cases for SQL trace.
00:01:22 Let's say that a user runs a report from an application. This is a very simple job from the
application's perspective.
00:01:32 However, on the HANA side, many statements may come to HANA through the session
layer and be executed as part of the application job.
00:01:44 At this point, if the application job runs slowly, SQL trace helps us
to find the problematic statements by capturing all of its SQL statements.
00:02:01 If you cannot find the SQL statement in SQL trace, there are two possible reasons.
00:02:08 First, the statement never reached the database. Second, the statement never went
through the session layer
00:02:18 because it was executed internally. One example of this is when a procedure runs.
00:02:26 Although a procedure is a collection of multiple SQL statements, the statements inside a
procedure body are executed inside the HANA SQLScript engine.
00:02:39 When a statement of a procedure body is executed internally, it is not captured by the
default SQL trace setting.
00:02:49 However, this internal statement can be logged when you enable the "internal statement"
configuration setting.
00:02:57 I will explain it more in later slides. There are two ways to collect the SQL trace.
00:03:09 The first one is through HANA studio, and the other one is using SQL commands.
00:03:17 Firstly, we will look into the case using HANA studio. In order to activate SQL trace using
HANA studio,
00:03:25 go to the "Trace Configuration" tab in the "Administration" editor in HANA studio. Then you can
find "SQL Trace" at the top right of the Trace Configuration tab.
00:03:39 To collect the SQL trace, you need to click the pencil icon and enter the appropriate context
information.
00:03:48 First, choose the "Active" button to enable the trace, then you can set the trace level and
trace file name.
00:03:59 It is recommended that you use as many filters as possible, because otherwise the trace will
be very long
00:04:07 and it is hard to find the problematic query. So please make sure to use filters in order to get
efficient traces.
00:04:23 To turn off the SQL trace, you need to click the same pencil icon again.
00:04:28 Then set the trace to inactive. You can find the trace file on the "Diagnosis Files" tab.
00:04:38 In the next slides, I will talk about how to collect the SQL trace using SQL commands. There
are SQL commands to collect the SQL trace.
00:04:50 You can run ALTER SYSTEM commands in the SQL console instead of enabling the trace
in the Trace Configuration tab.
00:04:59 As I mentioned earlier, when statements are executed internally, they are
not captured with the basic trace option.
00:05:10 In this case, we can use the INTERNAL option to collect the statements executed internally.
And when you set the trace level to all_with_results,
00:05:23 we can see all the results of queries. The query_plan_trace option shows us the query plan of
each statement as well.
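As a rough sketch of the SQL-command route described above, the trace can be switched on and off through the sqltrace section of indexserver.ini. The parameter names trace, level, internal, and query_plan_trace follow the options named in the lecture; the exact values, the trace file name, and the user filter are assumptions for illustration and should be checked against the system's documentation before use.

```sql
-- Turn the SQL trace on with the options discussed in the lecture.
-- 'sqltrace_report_job' and 'REPORT_USER' are hypothetical values.
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
    SET ('sqltrace', 'trace')            = 'on',
        ('sqltrace', 'level')            = 'all_with_results',
        ('sqltrace', 'internal')         = 'true',  -- include internally executed statements
        ('sqltrace', 'query_plan_trace') = 'on',    -- include each statement's query plan
        ('sqltrace', 'tracefile')        = 'sqltrace_report_job',
        ('sqltrace', 'user')             = 'REPORT_USER'  -- filter to keep the trace short
    WITH RECONFIGURE;

-- ... reproduce the issue here ...

-- Turn the SQL trace off again.
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
    SET ('sqltrace', 'trace') = 'off'
    WITH RECONFIGURE;
```

Keeping the trace window short and filtered matters because all_with_results writes the result sets into the trace file as well.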
00:05:37 This is an example of a correctly captured SQL trace. If you'd like to find a specific SQL
statement,
00:05:50 it is important to know the time frame when the statement ran. The narrower it is, the easier
it will be.
00:06:01 If you don't know the thread number or the transaction ID associated with the statement,
schema name and table name are also very helpful to find your query.
00:06:16 The SQL trace can be used in reverse, for example, to find the precise timestamp of the
statement execution.
00:06:28 There can also be cases where you already know the thread number and you want to track
down the execution time or a statement string.
00:06:39 By juggling these key pieces of information, you should be able to find the details you are
looking for.
00:06:50 Now I will show you an SQL trace example with the query_plan_trace option on. As you can
see,
00:07:03 when you turn on the query_plan_trace option with an ALTER SYSTEM statement, you can find
each query's plan like this.
00:07:14 This option is useful when you need to check a query plan. Let's say you have an issue that
occurs sporadically.
00:07:24 And when you checked M_SQL_PLAN_CACHE, you found out that the preparation count
keeps increasing.
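A check of this kind could look like the following sketch. The LIKE pattern is a hypothetical placeholder for a fragment of the target statement; running the query twice, some time apart, shows whether PREPARATION_COUNT is still growing.

```sql
-- List cached plans for the statement of interest, most-prepared first.
-- '%T1%' is a hypothetical filter; replace it with part of the real query text.
SELECT PLAN_ID,
       PREPARATION_COUNT,
       EXECUTION_COUNT,
       LAST_PREPARATION_TIMESTAMP,
       STATEMENT_STRING
FROM M_SQL_PLAN_CACHE
WHERE STATEMENT_STRING LIKE '%T1%'
ORDER BY PREPARATION_COUNT DESC;
```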
00:07:31 That means the query may be compiled often. In this case, to check whether the plan is
compiled often,
00:07:40 and to find out which plan is good or bad, you can turn on the SQL trace with this
option
00:07:48 until the issue is reproduced. Also, you can set the SQL trace level to all_with_results
00:07:59 by this ALTER SYSTEM statement at the bottom of the screen. You can see all results of
the queries like this.
00:08:10 This is useful when you need to check the query results. In the case of procedure execution,
00:08:20 seeing the results of internal statements is helpful in understanding the data flow. Let's say you
encounter an issue where the application job generates wrong results.
00:08:33 However, when you run the same query in HANA studio, it generates the correct results.
00:08:40 In this case, we need to check the query results between the application and HANA, because there
can be cases where the client or application layer generates wrong results
00:08:54 even though HANA processed the query correctly. We can also compare the results between the
application and HANA
00:09:02 to see whether HANA processed the query incorrectly. That was the case of a simple SQL
trace.
00:09:10 Now, we are going to look at an example of SQL trace for a procedure. Here are the
example procedures, called PROC_INNER and PROC_OUTER.
00:09:23 The structure is as follows: PROC_OUTER is the main procedure. And within
PROC_OUTER,
00:09:32 there is another procedure called PROC_INNER, and there are two additional statements:
00:09:39 SELECT all FROM table variable TV2 and SELECT all FROM table variable T2.
00:09:46 In order to see the individual query within the procedure, I used the NO_INLINE SQL hint
here within the procedure.
00:09:56 When compiling the procedure, the SQLScript optimizer combines two or more statements
00:10:02 if their combined form is considered to be more efficient. This process is called inlining
00:10:10 and it is usually beneficial for most procedures. I will talk about inlining more in week three.
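To make the nesting and the NO_INLINE hint concrete, here is a minimal sketch of two such procedures. The procedure names come from the lecture, but the bodies, the table T1, and its columns are assumptions made for illustration and are not the procedures shown on the slide.

```sql
-- Hypothetical base table.
CREATE COLUMN TABLE T1 (A INT, B INT);

-- Inner procedure: returns a table variable.
CREATE PROCEDURE PROC_INNER (OUT TV1 TABLE (A INT, B INT))
LANGUAGE SQLSCRIPT AS
BEGIN
    -- NO_INLINE keeps this statement from being merged with others,
    -- so it shows up as an individual statement in the SQL trace.
    TV1 = SELECT A, B FROM T1 WITH HINT (NO_INLINE);
END;

-- Outer (main) procedure: calls PROC_INNER, then runs two more statements.
CREATE PROCEDURE PROC_OUTER ()
LANGUAGE SQLSCRIPT AS
BEGIN
    CALL PROC_INNER (TV1);
    TV2 = SELECT A, B FROM :TV1 WHERE A > 0 WITH HINT (NO_INLINE);
    SELECT * FROM :TV2;
END;

-- With the "internal" SQL trace option enabled, the statements inside
-- both procedure bodies should appear in the trace for this call.
CALL PROC_OUTER ();
```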
00:10:23 As you can see, all the statements within procedure PROC_OUTER and PROC_INNER are
collected.
00:10:31 And since HANA 2.0 SPS04, we can track the memory consumption using "sqltrace".
00:10:39 In order to track the memory consumption, you need to enable "resource_tracking" first.
00:10:44 After that, you can set details as "basic" and "resource_consumption" under "sqltrace". You
can configure this in the Configuration tab in HANA studio
00:10:57 or by executing ALTER SYSTEM ALTER CONFIGURATION commands. Then, as you can
see,
00:11:04 the execution information including CPU time and memory size is collected in SQL trace.
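The configuration steps just described might be issued as SQL like this. It is a sketch under assumptions: the resource_tracking parameters are taken to live in global.ini under the keys enable_tracking and memory_tracking, and the details value is written as a comma-separated list. Verify the exact keys for your revision before applying them.

```sql
-- Step 1: enable resource tracking (assumed keys in global.ini).
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
    SET ('resource_tracking', 'enable_tracking') = 'on',
        ('resource_tracking', 'memory_tracking') = 'on'
    WITH RECONFIGURE;

-- Step 2: ask the SQL trace to record resource consumption details.
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
    SET ('sqltrace', 'details') = 'basic,resource_consumption'
    WITH RECONFIGURE;
```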
00:11:13 That was about SQL trace. Thank you for your attention.
00:11:20 In the next unit, my colleague Helen Shin will talk about Explain Plan.
00:11:26 Goodbye.
Week 2 Unit 3
00:00:05 Hello and welcome back to unit three. Today we'll focus on explain plan.
00:00:13 Explain plan shows a compiled plan without executing the statement. It creates physical
data in a table called explain plan table,
00:00:24 and it selects the data for display and then deletes everything straight away because the
information does not need to be kept.
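The same create, select, delete cycle can also be run by hand in the SQL console. The following is a minimal sketch: the tables, the query, and the statement name 'demo_explain' are hypothetical, while EXPLAIN_PLAN_TABLE and the listed columns are the ones this unit talks about.

```sql
-- Hypothetical tables for the example query.
CREATE COLUMN TABLE T1 (A INT, B INT);
CREATE COLUMN TABLE T2 (A INT, C INT);

-- 1. Compile the statement (without executing it) and store its plan.
EXPLAIN PLAN SET STATEMENT_NAME = 'demo_explain' FOR
SELECT T1.A, COUNT(*) AS CNT
FROM T1
INNER JOIN T2 ON T1.A = T2.A
GROUP BY T1.A;

-- 2. Read the plan rows for display.
SELECT OPERATOR_NAME, OPERATOR_DETAILS, OPERATOR_PROPERTIES,
       EXECUTION_ENGINE, TABLE_NAME, TABLE_SIZE, OUTPUT_SIZE
FROM EXPLAIN_PLAN_TABLE
WHERE STATEMENT_NAME = 'demo_explain'
ORDER BY OPERATOR_ID;

-- 3. Remove the rows again, since they do not need to be kept.
DELETE FROM EXPLAIN_PLAN_TABLE WHERE STATEMENT_NAME = 'demo_explain';
```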
00:00:35 In order to capture explain plan in HANA studio, you can select the statement and right-click
and choose Explain Plan.
00:00:50 Then you can see the compiled plan in tabular form. It contains OPERATOR_NAME,
OPERATOR_DETAILS,
00:00:58 OPERATOR_PROPERTIES, EXECUTION_ENGINE, SCHEMA_NAME, TABLE_NAME,
TABLE_TYPE, TABLE_SIZE, OUTPUT_SIZE,
00:01:07 and location information. When you look at OPERATOR_NAME, you can see that it has a
hierarchical structure.
00:01:18 We'll look at the optimized plan using OPERATOR_NAME in the next slide. With the operator
details, we can see each plan's detailed information.
00:01:35 Operator properties contain enumeration information, recompilation information, and
parameter values.
00:01:45 With the EXECUTION_ENGINE column, we can check the execution engine used for each operator.
00:01:53 Explain plan also shows the table size and its estimated output size. And if the tables are
partitioned or located in different hosts,
00:02:04 we can get the location information as well. Now we are going to draw a query optimizer
plan using OPERATOR_NAME.
00:02:17 As I mentioned earlier, the OPERATOR_NAME column has a hierarchical structure. Based on
this, we can draw a query optimizer plan.
00:02:29 In order to draw a query optimizer plan, we can look for COLUMN SEARCH first. There are
three column searches in total in this example.
00:02:42 As this is a hierarchical structure, let's have a look at column search C first.
00:02:50 Column search C has aggregation at the top, and below the aggregation, there is a JOIN
between tables T2 and T3.
00:03:00 So we can draw the optimizer plan like this. Next, let's look at column search B.
00:03:10 Column search B is the upper column search of column search C. Under column search
B, there is a LIMIT,
00:03:18 and below the LIMIT operator, there is table T1, whose data is sorted by ORDER BY.
00:03:25 And as the final column search, column search A encloses column search B and column
search C.
00:03:34 When you look at column search A, there is a JOIN, and after the JOIN, the ORDER BY
operator is processed.
00:03:43 Therefore, we can draw the optimizer tree like this. Now let's look at the
OPERATOR_PROPERTIES column.
00:03:56 OPERATOR_PROPERTIES is another useful column in explain plan. When you look at the
OPERATOR_PROPERTIES column,
00:04:05 you can see the attached hint list. For example, if you specify a SQL hint at the end of the
statement,
00:04:15 then you can see the hint name in the OPERATOR_PROPERTIES column. This is available
as of HANA 2.0 SPS03.
00:04:27 Also, you can see the logical enumeration rules that are applied to the operator.
00:04:32 Here, the LIMIT_THRU_JOIN enumerator is applied. Also, you can check whether the plan
is a precompiled
00:04:41 or a recompiled plan. In last week's session,
00:04:47 we checked the parameterized query's compilation through explain plan. And if the
query is a parameterized query,
00:04:57 then we can also see the parameter values. As I mentioned before,
00:05:07 explain plan is deleted from the explain plan table right after it is created.
00:05:13 To revisit it, you need to recreate the explain plan.
00:05:18 In this context, it is more useful to extract the existing explain plan using the SQL plan
cache
00:05:26 because the caches are always stored for later use unless they are evicted. Therefore, I will
introduce how to find
00:05:36 existing explain plans using SQL plan cache. First, we need to search the target query in
M_SQL_PLAN_CACHE.
00:05:47 Here we need to know the PLAN_ID, so you can select the query string as well as the PLAN_ID.
00:05:55 After that, you can run the simple SQL commands to see the existing explain plan,
00:06:01 which is "EXPLAIN PLAN FOR SQL PLAN CACHE ENTRY <PLAN_ID>;". Here is an
example.
00:06:13 Firstly, we are going to search PLAN_ID for the target query in M_SQL_PLAN_CACHE.
00:06:25 After that, we know the PLAN_ID is 18450530003.
00:06:34 Using this information, we are searching for explain plan. So we use explain plan for SQL
plan cache entry plan ID.
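Put together, the two steps can be sketched as below. The PLAN_ID value is the one quoted in the lecture and will differ on any other system; the LIKE filter and the statement name 'cached_plan' are hypothetical.

```sql
-- Step 1: find the PLAN_ID of the target query in the plan cache.
-- The filter text is a placeholder for a fragment of the real statement.
SELECT PLAN_ID, STATEMENT_STRING, LAST_EXECUTION_TIMESTAMP
FROM M_SQL_PLAN_CACHE
WHERE STATEMENT_STRING LIKE '%LEFT OUTER JOIN%'
ORDER BY LAST_EXECUTION_TIMESTAMP DESC;

-- Step 2: extract the explain plan of that cached entry.
EXPLAIN PLAN SET STATEMENT_NAME = 'cached_plan' FOR
SQL PLAN CACHE ENTRY 18450530003;

-- Step 3: read it like any other explain plan.
SELECT OPERATOR_NAME, OPERATOR_DETAILS, OPERATOR_PROPERTIES, EXECUTION_ENGINE
FROM EXPLAIN_PLAN_TABLE
WHERE STATEMENT_NAME = 'cached_plan'
ORDER BY OPERATOR_ID;
```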
00:06:44 Now we can find the existing explain plan. Through this session,
00:06:51 you have learned what explain plan is, how to capture it, and how to find an existing explain plan.
00:06:59 That's all about explain plan as a tool for performance issue analysis. In the next unit,
00:07:05 we'll discover another useful analysis tool, which is visualized plan.
00:07:12 Thank you for your attention. Looking forward to meeting you there. Bye.
Week 2 Unit 4
00:00:05 Hello and welcome to unit four of week two. We will continue to introduce useful SAP HANA
analysis tools.
00:00:14 And today's topic is Visualized Plan. You might be more familiar with PlanViz.
00:00:25 As PlanViz is a tool to create a visualized plan, it creates a visual map of the operators
00:00:32 and their relationships and hierarchies. In order to capture a visualized plan,
00:00:42 you need to select the query first, right-click, and choose Visualize Plan,
00:00:46 then click Execute. The executed plan provides actual information
00:00:54 and not only planned information. On the other hand,
00:01:01 the prepared plan only shows the data that is available before the execution,
00:01:07 such as the estimated size. Most of the time, you will find the executed plan more useful.
00:01:17 So here, I will show you the case of the executed plan. You will see this screen when you
execute the plan visualizer
00:01:26 or open a plan visualizer file. As highlighted here, you can see the compilation time and
execution time.
00:01:38 The execution time indicates the significance of the issue. If the execution time is 2
seconds,
00:01:47 it is considered not very critical. But if it takes 3,000 seconds to run a single SQL statement,
00:01:56 this can be very critical because users need to wait 50 minutes for the result.
00:02:05 Of course, how critical the issue is depends on the business use case and requirements.
00:02:12 Another factor to consider is the compilation time. Most performance issues are due to slow
execution,
00:02:20 but there are cases where the performance issue is caused by slow compilation.
Therefore, you should consider the compilation time
00:02:31 as well as the execution time. And when you hover your mouse over SQL Query in the Context
field,
00:02:46 you can see the full SQL string. Or you can click the small SQL icon to see the full SQL
statement.
00:03:00 And PlanViz shows the dominant operators in the visualized plan. The Dominant
Operators section displays the three most expensive operators,
00:03:11 sorted by execution time. When you click on any operator name displayed in Dominant
Operators,
00:03:20 it goes to the corresponding visualized operator in the graph. Here, when you click
BWPopJoin1inwards,
00:03:30 it shows the operator in the graph, as you can see. Here, the BW operator in the column search is an execution
engine operator.
00:03:41 But for query issue analysis, understanding data flow at column search level
00:03:47 is more helpful than execution engine operator level. Now, let's move on to the Executed
Plan tab.
00:03:58 Then you can see the graphical view of the plan. From unit one of week two,
00:04:05 we learned about the concept of column search. Column search is a composite operator to
process data.
00:04:17 Therefore, understanding the relationship between column searches is very helpful in
understanding data flow.
00:04:29 Both column search and analytical search indicate the column search, but depending on the
execution engine involved,
00:04:37 the column search name is different. When it says column search in PlanViz,
00:04:45 join engines or calculation engines are involved. When it says analytical search,
00:04:52 the OLAP engine is used. These are the results of the child column searches sent to the parent column
search.
00:05:02 And these are visualized for development purposes, so we don't use these operators for
query performance issue analysis.
00:05:16 Now, let's have a look at each column search. Understanding data flow is very important in
terms of performance issue analysis.
00:05:28 Therefore, we are going to draw each column search to understand data flow.
00:05:36 The left-hand side is PlanViz, and we will draw the column search by simplifying it on the
right-hand side.
00:05:46 So as we draw the column search, we recommend you draw the column search
00:05:51 based on the information in PlanViz. We can recognize three column searches from this
example.
00:06:07 Let's look at the time of each column search. Inclusive time is the time taken to execute the
complete operation
00:06:17 including the time of the children operators and excluding compilation time.
00:06:25 On the other hand, exclusive time is the time taken to execute a single operation.
00:06:35 In most cases, we check the exclusive time to see the execution time of one single
operation.
00:06:47 You can also reach the most dominant operators by following the highlighted orange line
00:06:53 instead of clicking the dominant operator on the Overview page. Now, we will check out how
we could know the data flow in PlanViz.
00:07:10 When you look at the line from the box of the column search, you can see the figures.
00:07:15 The figure in the brackets is the estimated size by the optimizer and this is used during
query compilation.
00:07:24 And the figure without brackets is the actual data size, which is only available after
execution.
00:07:35 Actual execution information is only available in PlanViz. Explain Plan does not provide
actual execution information.
00:07:47 However, the actual size displayed in PlanViz is not always applicable to every case.
00:07:54 This is because actual execution information is only available after the plan has been
executed. Let's say there is a query that caused an out-of-memory event
00:08:08 but you don't have any traces. You know that PlanViz could help the issue analysis
00:08:15 but you cannot create it because the system would result in an out-of-memory situation
again.
00:08:23 So you will not get the actual execution information. Instead, you would get the visualized
plan at the cost of OOM.
00:08:36 And now, let's turn it into a simple diagram. There are three column searches.
00:08:43 And when you look at it, column search #1 returns 20 rows and its estimated size is 20
as well.
00:08:53 And column search #2 generates an intermediate result of 1,000,535 rows, and its estimated
size is 10 million.
00:09:05 Those intermediate results go into the column search #3 and are processed. You can see a
more detailed plan when you look inside the column search.
00:09:19 When you right-click on the column search, you can open the inner plan as logical or physical.
00:09:27 The logical plan gives you the big picture and an overview of the plan,
00:09:32 but it does not provide detailed information such as execution engine information.
00:09:41 On the other hand, physical plans contain more detailed information, including
information that is provided by the execution engines.
00:09:55 Physical plans are usually more complex than logical plans. Since the logical plan shows
the shape of the query optimizer tree
00:10:05 and contains structural information, we recommend you analyze the logical plan first
00:10:12 before you analyze the physical plan. So this is the logical plan of column search #1.
00:10:24 There is a base table called T1 and its result is ordered by T1.A in ascending order.
00:10:32 After that, a limit operation is applied. We can simplify its logical plan like this.
00:10:45 Now, let's have a look at the logical plans of column search #2 and column search #3.
00:10:56 As you can see, when we look at the logical plan of column search #2, we can see that
there is an INNER JOIN between table T2 and table T3.
00:11:06 Then there is a GROUP BY aggregation. And when we look at the logical plan of column
search #3,
00:11:18 there is a LEFT OUTER JOIN between column search #1 and column search #2. After that,
ORDER BY is processed.
00:11:32 Then we can check out the logical structure like this. There are three column searches.
00:11:39 And column search #3 is the result of the left outer join between column search #1 and
column search #2.
00:11:48 In column search #1, data is extracted from the table T1 and it is sorted in ascending
order.
00:11:56 After that, the limit operator is applied. In column search #2, there is an INNER JOIN between
table T2 and T3,
00:12:06 and the result is aggregated with GROUP BY. As in this example,
00:12:12 using PlanViz, you can understand the logical structure of the query plan and its data flow.
00:12:19 That's the end of the topic of Visualized Plan. In the next unit, my colleague Jinyeon Lee
00:12:26 will present a hands-on session about collecting traces for literal and parameterized queries.
00:12:33 Thank you for your attention. Goodbye.
Week 2 Unit 5
00:00:05 Hello, and welcome to unit five of week two. I'm Jinyeon Lee and I will present to you the
topic
00:00:11 of trace collection for parameterized queries, and we will have a hands-on exercise to
collect traces
00:00:17 for a literal query and a parameterized query. Okay, we know the main characteristic of
parameterized queries,
00:00:31 which is that a parameterized query is compiled twice: precompilation and recompilation.
00:00:37 Precompilation is done without bind variables. On the other hand, recompilation is done with
bind variables.
00:00:45 Here, regarding parameterized query trace collection, it is important to see the recompiled
plan for performance issue analysis.
00:00:59 When you run explain plan and collect PlanViz without plan cache, you are looking at the
precompiled plan.
00:01:07 And you are never going to get the final execution plan that puts the parameter values into
its analysis.
00:01:18 To get a recompiled plan, you need to execute a parameterized query before collecting the
trace. There is one more thing you should know when you run the parameterized query,
00:01:38 which is to put the parameterized query on a single line. This is because different
carriage returns across the interfaces
00:01:47 are likely to interfere with the usage of the plan cache. The HANA database server can handle the
same query
00:01:55 from Windows and Linux environments differently due to their different carriage returns.
00:02:02 So, in order to avoid plan cache confusion caused by different carriage returns, please
make sure the parameterized query is executed as a single line.
00:02:14 The easiest way to make a single-line parameterized query is to just copy the query into the
navigation bar in any Internet browser.
00:02:30 It is very important to see the recompiled plan when you analyze a performance issue,
because analyzing a precompiled plan instead can make a big difference in the plan analysis.
00:02:43 On the left, we have a precompiled plan. On the right-hand side, there is a recompiled plan.
00:02:54 For the performance issue analysis for parameterized query, it is correct to investigate the
plan on the right-hand side since it is the recompiled plan.
00:03:05 As you can see, the precompiled plan and recompiled plan have different logical plans in
this example.
00:03:16 Let us suppose that you have a performance issue with a parameterized query. In this case,
if you analyze the query with a precompiled plan, like the left-hand side,
00:03:28 it does not give you any help for the root cause analysis. So the safest way is to execute the
parameterized query once
00:03:39 before collecting the trace for the query. Here, make sure you use exactly the same
query string, including carriage returns
00:03:51 and white spaces, for query execution. Another thing to note: PlanViz itself does not
show you
00:04:01 whether the plan is precompiled or recompiled. So for the recompilation check, you need to
check explain plan.
00:04:15 Collecting HANA traces in the correct way gives a good start to the investigation. So now
you will do a simple trace collection hands on
00:04:26 for the literal query and parameterized query. Firstly, you will collect explain plan and then,
you will also collect visualized plan.
00:04:39 Here is a hands-on query. You can find more information in the hands-on information page.
00:04:46 Let's have a look at the query. There is a LEFT OUTER JOIN between table T1 and the
subquery with the JOIN key T1.A=T4.C.
00:05:01 The subquery has another join between table T2 and table T3. And the result is ordered by
T1.A with a limit of 20.
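A query with the shape just described might look like the sketch below. It is a reconstruction from the spoken description, not the actual hands-on query: only T1.A, the alias T4 with its column C, the joins, the ordering, and the limit of 20 come from the lecture, while the other columns, the join key between T2 and T3, and the aggregation are assumptions.

```sql
-- Hypothetical column store tables matching the description.
CREATE COLUMN TABLE T1 (A INT, B INT);
CREATE COLUMN TABLE T2 (C INT, D INT);
CREATE COLUMN TABLE T3 (D INT, E INT);

-- Literal query: LEFT OUTER JOIN between T1 and a subquery (alias T4)
-- that joins T2 and T3, ordered by T1.A and limited to 20 rows.
SELECT T1.A, T4.C, T4.CNT
FROM T1
LEFT OUTER JOIN (
    SELECT T2.C, COUNT(*) AS CNT
    FROM T2
    INNER JOIN T3 ON T2.D = T3.D   -- assumed join key
    GROUP BY T2.C
) T4 ON T1.A = T4.C
ORDER BY T1.A
LIMIT 20;
```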
00:05:13 Now, let's collect explain plan for this query. This is the result of explain plan.
00:05:23 Since this is a literal query, you can easily collect explain plan. You can select the query and
right-click, then choose Explain Plan.
00:05:35 You will see this explain plan. When you look at the operator name, you can imagine the
query structure.
00:05:48 There are three column searches in total. We can tell the INNER JOIN between table T2
and T3 is processed first.
00:05:59 After that, LEFT OUTER JOIN is processed. We can also see that the execution
was done by the OLAP engine at the beginning
00:06:08 and by the column engine at the end. Let's collect the PlanViz.
00:06:18 So here, we can check compilation time is 0.56 milliseconds and execution time is 1.7
seconds.
00:06:28 We can also find that the dominant operator is BwPopJoin1 and the number of tables used is 3.
And when we move on to the executed plan, we can see this diagram.
00:06:44 As in the explain plan, there are three column searches. And in the final result, we can see
that 20 rows are returned.
00:07:00 Here is the parameterized query. As I explained previously, it is important to put a
parameterized query
00:07:07 on a single line to avoid plan cache confusion. Here we use 20 as the bind variable value.
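To make the single-line advice concrete, here is a small Python sketch that collapses a multi-line statement into one line before it is executed and traced. The helper `to_single_line` is hypothetical. The statement only follows the shape of the hands-on query as described in this unit; the column names T2.C, T2.B, and T3.B, and the position of the bind variable, are assumptions, so please check the hands-on information page for the real query.

```python
import re


def to_single_line(sql: str) -> str:
    """Collapse every run of whitespace (spaces, tabs, CR, LF) into one space.

    Caution: this also rewrites whitespace inside string literals, so it is
    only safe for statements that contain no such literals.
    """
    return re.sub(r"\s+", " ", sql).strip()


# Hypothetical statement shaped like the hands-on query (assumed columns).
multi_line = """
SELECT T1.A, T4.C
  FROM T1
  LEFT OUTER JOIN (SELECT T2.C
                     FROM T2
                    INNER JOIN T3 ON T2.B = T3.B) T4
    ON T1.A = T4.C
 ORDER BY T1.A
 LIMIT ?
"""

single_line = to_single_line(multi_line)
```

Using the same single-line string for both the initial execution and the trace collection keeps the two steps on an identical query string.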
00:07:17 Let's collect explain plan for the parameterized query. If you are seeing this recompiled
explain plan, you are looking at the correct one.
00:07:28 From the last slides, we learned that we need to execute the parameterized query before
collecting the trace.
00:07:37 Then the plan will be precompiled, and when we collect the trace, we can see the
recompiled plan for the parameterized query.
00:07:49 When you look at the plan, you can see the difference from the literal query. Here, only
two column searches are created.
00:08:04 Let's check out PlanViz. This is a step to enter bind variable for the parameterized query.
00:08:18 So, as with the literal query, you can check the compilation time and execution time. Also check
the dominant operator, which is BwPopJoin1.
00:08:34 When you move on to the executed plan tab, then you will see this query structure. Unlike
the literal case, there are two column searches,
00:08:44 and joins are processed within the column search. Now we know how to collect traces such
as explain plan and PlanViz
00:08:56 for parameterized query and literal query. In the next unit, my colleague, Helen Shin, will
continue to explain
00:09:03 column searches in trace. Thank you for your attention.
00:09:07 Goodbye.
Week 2 Unit 6
00:00:05 Hello, and welcome to unit six of week two. I'm Helen Shin and I will present to you column
search in the traces today.
00:00:16 As column search is a HANA-specific physical operator, depending on the trace, it has
different shapes and different names.
00:00:27 You can easily check out column search in explain plan since it is explicitly stated as
column search.
00:00:35 However, when you look at the PlanViz, you may find different names for the column
search for different execution engines.
00:00:47 In PlanViz, column search is a column search sent to JOIN engine, and analytical search is
a column search sent to OLAP engine.
00:00:59 I will explain more details in later slides. If we simplify the column search from the trace,
00:01:09 it looks like this. It is important to check the relationship between column searches
00:01:16 to understand the data flow. The best advantage of explain plan compared to the other traces is
that it is fast and easy.
00:01:32 It gives you the execution plan without actually executing the statement. When you execute
explain plan, then it will give you a nicely organized table view
00:01:46 right after cache lookup and compilation. In explain plan, column search appears as if there
is a separate physical operator,
00:02:00 but it is a composite operator that embraces the other operators below. In this example, the
yellow column search at the bottom
00:02:13 is comprised of its absorbed operators, FILTER, JOIN, and the two column tables.
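The idea of a column search as a composite operator can be sketched in Python with a small operator tree. The rows below are made up for illustration (the IDs and the overall plan shape are assumptions, not output from a real explain plan); only the bottom column search with FILTER, JOIN, and two column tables follows the example described here.

```python
from collections import defaultdict

COLUMN_SEARCH = "COLUMN SEARCH"

# Hypothetical rows: (operator_id, parent_operator_id, operator_name).
rows = [
    (1, None, COLUMN_SEARCH),
    (2, 1, "JOIN"),
    (3, 2, COLUMN_SEARCH),
    (4, 3, "COLUMN TABLE"),
    (5, 2, COLUMN_SEARCH),
    (6, 5, "FILTER"),
    (7, 6, "JOIN"),
    (8, 7, "COLUMN TABLE"),
    (9, 7, "COLUMN TABLE"),
]

children = defaultdict(list)
names = {}
for op_id, parent_id, name in rows:
    names[op_id] = name
    if parent_id is not None:
        children[parent_id].append(op_id)


def absorbed_operators(column_search_id):
    """Names of the operators inside one column search, in plan order.

    Descends from the column search and stops at nested column searches,
    because each of those is its own composite operator.
    """
    absorbed = []
    stack = list(reversed(children[column_search_id]))
    while stack:
        op_id = stack.pop()
        if names[op_id] == COLUMN_SEARCH:
            continue
        absorbed.append(names[op_id])
        stack.extend(reversed(children[op_id]))
    return absorbed


column_search_ids = [i for i, n in names.items() if n == COLUMN_SEARCH]
bottom = absorbed_operators(5)
```

Counting the entries in `column_search_ids` gives the number of column searches in the plan, and `absorbed_operators` lists what each one contains.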
00:02:22 This one set of the column search is going to be transmitted to JOIN engine for the
execution, and a more precise execution strategy will be determined
00:02:33 by the local optimizer inside the JOIN engine. Therefore, what we know from this explain
plan is that there are three column searches
00:02:48 and, therefore, data materialization will happen three times. Column search
in PlanViz has two different names,
00:03:06 column search and analytical search. Column search in PlanViz is a column search sent to
the JOIN engine,
00:03:14 and analytical search is a column search sent to the OLAP engine. The choice between the JOIN and OLAP engines is
one of the most popular topics of discussion.
00:03:27 In some cases, the hint USE_OLAP_PLAN seems to cure all performance degradations so
easily.
00:03:37 However, there are several things you really need to keep in mind. First, it is not always true
that OLAP is faster than JOIN.
00:03:52 Second, OLAP never works without aggregation. That is, it needs aggregation within
column search.
00:04:01 Third, OLAP cannot handle intermediate results bigger than 2 billion records. Fourth,
supported features, size estimation algorithms,
00:04:14 reduction mechanisms, and JOIN strategies are very different between the two engines.
Lastly, there is no fixed answer to the JOIN or OLAP question, only general
rules.
00:04:33 For example, OLAP could be worth trying when it comes to star schema and heavy
aggregation. But basically, this heavily depends on the situation.
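The points above can be turned into a small Python sketch that checks the two hard constraints mentioned in this unit before appending the hint. The function names are hypothetical, and the threshold simply uses the "2 billion records" figure as stated here. Passing this check only means the hint may be worth testing, not that OLAP will be faster.

```python
# "2 billion records", taken as stated in this unit (assumed exact value).
OLAP_MAX_INTERMEDIATE_ROWS = 2_000_000_000


def olap_is_candidate(has_aggregation: bool, est_intermediate_rows: int) -> bool:
    """True if neither hard constraint from this unit rules OLAP out."""
    return has_aggregation and est_intermediate_rows <= OLAP_MAX_INTERMEDIATE_ROWS


def with_olap_hint(sql: str) -> str:
    """Append the USE_OLAP_PLAN hint to a statement, dropping a trailing ';'."""
    return f"{sql.rstrip().rstrip(';')} WITH HINT(USE_OLAP_PLAN)"


# Hypothetical aggregation query on a column table T1.
hinted = with_olap_hint("SELECT A, SUM(B) FROM T1 GROUP BY A;")
```

A query without aggregation, or one with a very large estimated intermediate result, is filtered out by `olap_is_candidate`; everything else still needs to be measured with and without the hint.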
00:04:48 So, we have discovered column search in the traces. That's it for unit six.
00:04:57 Please join unit seven, where my colleague Jinyeon Lee will explore with you how to analyze
out-of-memory dumps.
00:05:05 Thank you for your attention. Goodbye.
Week 2 Unit 7
00:00:05 Hello and welcome to the last unit of week two. I'm Jinyeon Lee.
00:00:11 In this unit, I will present to you how to analyze an out-of-memory dump. Let's start.
00:00:20 When SAP HANA requires additional memory and is not able to allocate new memory or
reclaim memory,
00:00:27 the transaction is aborted with an out-of-memory error and an out-of-memory dump is
generated.
00:00:40 In this unit, we will find out how to analyze an out-of-memory dump. First of all, let's look at the
dump structure.
00:00:50 When you look at the out-of-memory dump, it has the following structure.
00:00:56 Under the BUILD section, there is build information of your HANA instance.
00:01:01 And you can find all the running threads, including SQL statements and query plans, under the
THREAD section.
00:01:10 STACK_SHORT shows call stacks and pending exceptions of all threads. You can check
the process information under the PROCESS_INFO section.
00:01:21 There is the OS_MEMORY section and MEMORY_OOM information as well. By default,
SAP HANA creates only one out-of-memory dump within 24 hours.
00:01:38 This can be a disadvantage when you need to analyze several OOMs that
happened within less than 24 hours.
00:01:48 In this case, you can refer to the M_OUT_OF_MEMORY_EVENTS monitoring view.
00:01:55 It shows a list of the last 20 out-of-memory events. Now, let’s discover what a composite
out-of-memory (OOM) is.
00:02:08 Composite OOM is linked to the statement memory limit. We can set statement memory
limit to prevent single statements
00:02:19 from consuming too much memory. When a statement reaches the configured statement memory limit,
00:02:25 the composite OOM dump is generated and the statement is aborted.
00:02:35 You can set the statement memory limit using the following configuration. You need to turn
on the enable_tracking and memory_tracking configurations
00:02:45 under the resource_tracking section. After that, you can set the statement memory limit.
00:02:54 Of course, you can also create exceptions to statement memory limits for individual users
by setting a different statement memory limit for each individual.
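The configuration steps above can be sketched as a small Python helper that builds the corresponding SQL statements. The section and parameter names `resource_tracking`, `enable_tracking`, and `memory_tracking` come from this unit. The file `global.ini`, the `memorymanager` section, the `statement_memory_limit` parameter (in GB), and the `ALTER USER` form are taken from my reading of the SAP HANA documentation and should be verified for your release. The helper names and the user name are hypothetical.

```python
from typing import List


def set_ini(section: str, key: str, value: str) -> str:
    """Build one system-wide configuration change in global.ini (assumed file)."""
    return (
        "ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') "
        f"SET ('{section}', '{key}') = '{value}' WITH RECONFIGURE"
    )


def statement_memory_limit_statements(limit_gb: int) -> List[str]:
    """Turn on resource tracking first, then set the statement memory limit."""
    return [
        set_ini("resource_tracking", "enable_tracking", "on"),
        set_ini("resource_tracking", "memory_tracking", "on"),
        # Assumed section and parameter name; the value is in GB.
        set_ini("memorymanager", "statement_memory_limit", str(limit_gb)),
    ]


def user_limit_statement(user: str, limit_gb: int) -> str:
    """Per-user exception to the system-wide limit (assumed syntax)."""
    return f"ALTER USER {user} SET PARAMETER STATEMENT MEMORY LIMIT = '{limit_gb}'"


statements = statement_memory_limit_statements(50)
exception = user_limit_statement("REPORT_USER", 100)
```

The order of the generated statements mirrors the order given above: tracking is enabled before the limit is set.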
00:03:06 Let’s have a look at how to analyze a composite OOM dump. First of all, you can find the
composite limit value and root allocator
00:03:17 under the memory_limit_violation section. Then you check out how much the composite
limit is,
00:03:33 and the problematic query information. Also you can check out the top limited composite
allocator
00:03:51 which displays the allocators consuming the most memory in the current system. From the
top allocator in descending order by exclusive size in use,
00:04:10 you can find the root allocator's connection ID and statement ID. Using this information, you
can find the query.
00:04:23 In this example, we found one suspicious query with the connection ID 300387. The
query has thread ID 233230,
00:04:37 and its parent thread is 33511. With the given connection ID, we found the thread and its
parent thread.
00:04:52 Let's find the parent thread. So you can search the thread with the value 33511.
00:05:00 As you can see, the parent thread has the same connection ID, but there is no value for its
parent thread.
00:05:09 That means this thread is at the top of the hierarchy. So now, you have found the problematic query for
the composite OOM dump.
00:05:37 Once you find the problematic query that caused OOM, you can also see its explain plan.
00:05:45 To see the query plan in a tabular view, you can just copy and paste it from the OOM dump
to MS Excel.
00:05:58 That's the end of unit seven, which is the last unit of week two. Thank you for your attention.
00:06:07 In the following week, we will explore Methods for Query Performance Analysis. Looking
forward to meeting you there.
00:06:17 Goodbye.
www.sap.com/contactsap
© 2020 SAP SE or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company.
The information contained herein may be changed without prior notice. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.
National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP or its affiliated companies shall not be liable
for errors or omissions with respect to the materials. The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty statements
accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.
In particular SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality
mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are
all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation
to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are
cautioned not to place undue reliance on these forward-looking statements, and they should not be relied upon in making purchasing decisions.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other
countries. All other product and service names mentioned are the trademarks of their respective companies. See www.sap.com/copyright for additional trademark information and notices.
