Database Performance Tuning and Query Optimization: Discussion Focus


Ch11 Database Performance Tuning-and Query Optimization

Chapter 11

Database Performance Tuning and Query Optimization

Discussion Focus

This chapter focuses on the factors that directly affect database performance. Because performance-
tuning techniques can be DBMS-specific, the material in this chapter may not be applicable under all
circumstances, nor will it necessarily pertain to all DBMS types.

This chapter is designed to build a foundation for the general understanding of database performance-
tuning issues and to help you choose appropriate performance-tuning strategies.

• Start by reading about the basic database performance-tuning concepts (11.1). You are
  encouraged to use the web to search for information about the internal architecture (internal
  processes and database storage formats) of various database systems. Focus on the similarities to
  lay a common foundation.
• Be familiar with how a DBMS processes SQL queries in general terms and stress the importance
  of indexes in query processing.
• Step through the query processing example in section 11.4, Optimizer Choices.
• There are common practices used to write more efficient SQL code. Note that some practices are
  DBMS-specific. As technology advances, query optimization logic becomes increasingly
  sophisticated and effective. Therefore, some of the SQL practices illustrated in this chapter may
  not improve query performance as dramatically as they do in older systems.
• Step through the chapter material using the query optimization example in section 11.8.

Problem Solutions

Problems 1 and 2 are based on the following query:

SELECT EMP_LNAME, EMP_FNAME, EMP_AREACODE, EMP_SEX
FROM EMPLOYEE
WHERE EMP_SEX = 'F' AND EMP_AREACODE = '615'
ORDER BY EMP_LNAME, EMP_FNAME;

1. What is the likely data sparsity of the EMP_SEX column?

Because this column has only two possible values (“M” and “F”), the EMP_SEX column has low
sparsity.
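
As a rough illustration of why a two-valued column has low sparsity, the sketch below counts the distinct values in a column and expresses that count as a fraction of the row count. The helper name sparsity_ratio and the sample values are hypothetical, and the ratio is only one informal way to quantify sparsity, not a definition taken from the chapter.

```python
def sparsity_ratio(values):
    """Return distinct-value count divided by row count (an informal measure)."""
    values = list(values)
    if not values:
        raise ValueError("column has no rows")
    return len(set(values)) / len(values)


# Hypothetical sample: a column that can only hold 'M' or 'F'.
emp_sex = ["F", "M", "F", "F", "M", "M", "F", "M"]
# Hypothetical sample: a column where every row holds a different value.
emp_dob = ["1966-03-01", "1971-07-15", "1966-11-30", "1980-01-02",
           "1959-05-20", "1992-09-09", "1975-12-25", "1988-04-04"]

low = sparsity_ratio(emp_sex)   # 2 distinct values over 8 rows
high = sparsity_ratio(emp_dob)  # 8 distinct values over 8 rows
print(f"EMP_SEX ratio: {low:.2f}, EMP_DOB ratio: {high:.2f}")
```

A ratio near 0 suggests an index would filter out few rows per lookup; a ratio near 1 suggests each index lookup narrows the search considerably.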

2. What indexes should you create? Write the required SQL commands.

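I should create an index on EMP_AREACODE and a composite index on EMP_LNAME,
EMP_FNAME. The first index supports the EMP_AREACODE condition in the WHERE clause, and the
composite index supports the ORDER BY clause. An index on EMP_SEX is unlikely to help because
the column has low sparsity. In the following solution, I have named the two indexes EMP_NDX1
and EMP_NDX2, respectively. The required SQL commands are: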

CREATE INDEX EMP_NDX1 ON EMPLOYEE(EMP_AREACODE);
CREATE INDEX EMP_NDX2 ON EMPLOYEE(EMP_LNAME, EMP_FNAME);

3. Using Table 11.4 on page 499 as an example, create two alternative access plans (Plan A & Plan
B). Use the following assumptions:
a. There are 8,000 employees.
b. There are 4,150 female employees.
c. There are 370 employees in area code 615.
d. There are 190 female employees in area code 615.

The solution is shown in Table P11.3.

TABLE P11.3 COMPARING ACCESS PLANS AND I/O COSTS


Plan  Step  Operation                        I/O         I/O    Resulting  Total I/O
                                             Operations  Cost   Set Rows   Cost
----  ----  -------------------------------  ----------  -----  ---------  ---------
A     A1    Full table scan EMPLOYEE         8000        8000   190        8000
            Select only rows with
            EMP_SEX='F' and
            EMP_AREACODE='615'
A     A2    SORT Operation                   190         190    190        8190
B     B1    Index Scan Range of EMP_NDX1     370         370    370        370
B     B2    Table Access by RowID EMPLOYEE   370         370    370        740
B     B3    Select only rows with            370         370    190        930
            EMP_SEX='F'
B     B4    SORT Operation                   190         190    190        1120
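
The Total I/O Cost column is a running sum of the per-step I/O Cost values. The sketch below recomputes it from the per-step costs listed in Table P11.3; the variable and function names are illustrative only. Summing the listed costs gives 1110 after step B3 and 1300 after step B4, whereas the table shows 930 and 1120 (figures that would correspond to a B3 cost of 190), so the Plan B totals are best read as approximate. Plan B is cheaper than Plan A under either reading.

```python
from itertools import accumulate

# Per-step I/O costs as listed in the I/O Cost column of Table P11.3.
plans = {
    "A": [("A1", 8000), ("A2", 190)],
    "B": [("B1", 370), ("B2", 370), ("B3", 370), ("B4", 190)],
}


def running_totals(steps):
    """Return the cumulative I/O cost after each step of a plan."""
    return list(accumulate(cost for _, cost in steps))


totals = {plan: running_totals(steps) for plan, steps in plans.items()}

for plan, steps in plans.items():
    for (step, cost), total in zip(steps, totals[plan]):
        print(f"{step}: cost {cost:>5}, running total {total:>5}")

# The plan with the smallest final running total is the cheaper one.
cheapest = min(totals, key=lambda plan: totals[plan][-1])
print("Cheapest plan:", cheapest)
```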

Problems 4-6 are based on the following query:

SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, YEAR(EMP_DOB) AS YEAR
FROM EMPLOYEE
WHERE YEAR(EMP_DOB) = 1966;

4. What is the likely data sparsity of the EMP_DOB column?

Because the EMP_DOB column stores employees’ dates of birth, this column is very likely to have high
data sparsity.

5. Should you create an index on EMP_DOB? Why or why not?

I don’t think it is necessary to create an index on the EMP_DOB column because it would not help
this query: the WHERE clause applies the YEAR function to the column, so the DBMS cannot use an
index built on the column's stored values. However, if the same column is used in other
queries, then I might reconsider creating an index.
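
The effect of wrapping an indexed column in a function can be observed with a small experiment. The sketch below uses Python's built-in sqlite3 module purely as a convenient test bed; it assumes dates are stored as ISO-8601 text, substitutes SQLite's strftime for the YEAR function, and uses a hypothetical index name, EMP_DOB_NDX. Other DBMSs report access plans differently, so this is an illustration rather than a statement about any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (EMP_LNAME TEXT, EMP_FNAME TEXT, EMP_DOB TEXT)"
)
# Hypothetical index; the original solution does not create it.
conn.execute("CREATE INDEX EMP_DOB_NDX ON EMPLOYEE(EMP_DOB)")


def plan(sql):
    """Return the 'detail' strings from SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Function applied to the column: the plain index on EMP_DOB is not usable.
function_plan = plan(
    "SELECT EMP_LNAME, EMP_FNAME, EMP_DOB FROM EMPLOYEE "
    "WHERE strftime('%Y', EMP_DOB) = '1966'"
)
# Range predicate on the bare column (same rows, assuming ISO-8601 text
# dates): the index on EMP_DOB becomes usable.
range_plan = plan(
    "SELECT EMP_LNAME, EMP_FNAME, EMP_DOB FROM EMPLOYEE "
    "WHERE EMP_DOB >= '1966-01-01' AND EMP_DOB < '1967-01-01'"
)
print("Function predicate:", function_plan)
print("Range predicate:   ", range_plan)
```

If queries of this kind are frequent, rewriting the condition as a date range is one way to let an index on EMP_DOB contribute.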

Problems 7-10 are based on the ER model shown in Figure P11.7 and on the query shown after the
figure.

Figure P11.7 The Ch11_SaleCo ER Model

Given the following query:

SELECT P_CODE, P_PRICE
FROM PRODUCT
WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);

7. Assuming that there are no table statistics, what type of optimization will the DBMS use?

The DBMS will use rule-based optimization, because a cost-based optimizer depends on table
statistics that are not available here.

8. What type of database I/O operations will likely be used by the query? (See Table 11.3.)

The DBMS will likely use a full table scan to compute the average price in the inner subquery. The
DBMS is also very likely to use another full table scan of PRODUCT to execute the outer query.

TABLE 11.3 Sample DBMS Access Plan I/O Operations


Operation               Description
----------------------  ------------------------------------------------------------
Table Scan (Full)       Reads the entire table sequentially, from the first row to
                        the last row, one row at a time (slowest)
Table Access (Row ID)   Reads a table row directly, using the row ID value (fastest)
Index Scan (Range)      Reads the index first to obtain the row IDs and then accesses
                        the table rows directly (faster than a full table scan)
Index Access (Unique)   Used when a table has a unique index in a column
Nested Loop             Reads and compares a set of values to another set of values,
                        using a nested loop style (slow)
Merge                   Merges two data sets (slow)
Sort                    Sorts a data set (slow)

9. What is the likely data sparsity of the P_PRICE column?

Because each product is likely to have a different price, the P_PRICE column is likely to have high
sparsity.

10. Should you create an index? Why or why not?

Yes, I should create an index because the column P_PRICE has high sparsity and the column is very
likely to be used in many different SQL queries as part of a conditional expression.

Problems 11-14 are based on the following query:

SELECT P_CODE, SUM(LINE_UNITS)
FROM LINE
GROUP BY P_CODE
HAVING SUM(LINE_UNITS) > (SELECT MAX(LINE_UNITS) FROM LINE);

11. What is the likely data sparsity of the LINE_UNITS column?

The LINE_UNITS column in the LINE table represents the quantity purchased of a given product in
a given invoice. This column is likely to have many different values; therefore, the column is
very likely to have high sparsity.

12. Should you create an index? If so, what would the index column(s) be, and why would you
create that index? If not, explain your reasoning.

Yes, you should create an index on LINE_UNITS. This index is likely to help in the execution of the
inner query that computes the maximum value of LINE_UNITS.

13. Should you create an index on P_CODE? If so, write the SQL command to create that index. If
not, explain your reasoning.

Yes, creating an index on P_CODE will help in query execution. The SQL command would be:

CREATE INDEX LINE_NDX1 ON LINE(P_CODE);

14. Write the command to create statistics for this table.

ANALYZE TABLE LINE COMPUTE STATISTICS;
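
The syntax for gathering statistics varies by DBMS. As a hedged illustration of what such a command produces, the sketch below runs the SQLite form of the command (ANALYZE followed by the table name) through Python's built-in sqlite3 module and reads back the statistics that SQLite stores in its sqlite_stat1 table. The sample rows are hypothetical, and both the storage location and the format of the statistics are specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LINE (P_CODE TEXT, LINE_UNITS INTEGER)")
conn.execute("CREATE INDEX LINE_NDX1 ON LINE(P_CODE)")

# Hypothetical sample rows: four invoice lines covering two products.
conn.executemany(
    "INSERT INTO LINE (P_CODE, LINE_UNITS) VALUES (?, ?)",
    [("P-100", 1), ("P-100", 3), ("P-200", 2), ("P-200", 5)],
)

# SQLite's form of the statistics command, limited to the LINE table.
conn.execute("ANALYZE LINE")

# Each row holds the table name, the index name, and a statistics string
# whose first number is the row count seen in that index.
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
for tbl, idx, stat in stats:
    print(tbl, idx, stat)
```

Once statistics exist, a cost-based optimizer can use them to compare access plans instead of relying on fixed rules.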
