
SEOUC 2010

To Parallel or Not To Parallel


(that is the question..)

Romeo Vasileniuc
BB&T Specialized Lending
About Romeo..

 I have worked with Oracle Database since 1994
 I've been involved in almost all aspects of Oracle database
technologies, including RAC, ASM, Data Guard and Streams
 Designed and implemented many varieties of highly
available database environments using RAC on ASM, OCFS2 and
Tru64 CFS
 I enjoy performance tuning
 Proficient Perl developer
 In my current role as data warehouse architect with BB&T, I have
architected and implemented many business-driven solutions using
Oracle and other vendor products to meet critical business needs,
especially in the warehousing area
 Oracle Certified Master, OCP, OCE
Overview

 What is Parallel Execution?
 How/When to use it?
 How to Avoid Regression?
 Test Load Cases
 Oracle 11gR2 New Features
Introduction to Parallel Execution

 Parallel execution reduces response time for
DSS and DWH systems
 It can also be tried on certain OLTP systems
 Parallelism: breaking a task into multiple pieces
(the divide-and-conquer technique)
 Typically used for:
– Large table scans, joins, partitioned index scans
– Creation of large indexes
– Creation of large tables or materialized views
– Bulk inserts, updates, merges and deletes
Parallel Execution : System Requirements

 Symmetric multiprocessors (SMPs), clusters, or
massively parallel systems
 Sufficient CPU (systems where CPU usage is
less than 30%)
 Sufficient I/O bandwidth
 Data is spread across multiple disk drives
 Sufficient memory to support additional memory-intensive
processes (sorts, hashing, I/O buffers)
Parallel Execution : SQL Requirements

 SQL to be parallelized is long running or resource
intensive
 SQL performs at least one full table/index/partition
scan
 SQL is well tuned!
What Can Be Parallelized?

 Access Methods (table scans, index full scans,
partitioned index range scans)
 Join Methods (nested loop, sort merge, hash and
star transformation)
 DDL Statements (CTAS, create/rebuild
index/partition, move/split/coalesce partition)
 DML (insert as select, updates, deletes, merge)
 Parallel Query
 SQL*Loader (w/ direct path)
– sqlldr USERID=SCOTT/TIGER CONTROL=LOAD1.CTL DIRECT=TRUE PARALLEL=TRUE
– sqlldr USERID=SCOTT/TIGER CONTROL=LOAD2.CTL DIRECT=TRUE PARALLEL=TRUE
– sqlldr USERID=SCOTT/TIGER CONTROL=LOAD3.CTL DIRECT=TRUE PARALLEL=TRUE
How Parallel Execution Works

 Parallel execution divides the SQL statement into multiple
units, each executed by a separate process
 Partitioning and parallelism work together. If no partitioned
object is available, Oracle splits the work into
granules
 The main server process becomes the query coordinator
and does the following:
– Parses the query and determines the DOP
– Allocates one or two sets of slaves
– Controls the query and sends instructions to PQ slaves
– Determines which tables or indexes need to be scanned by the PQ slaves
– Produces the final output to the user
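
The divide-and-conquer idea above can be illustrated with a small Python sketch. It is only an analogy, not Oracle's implementation: the function names, the use of threads, and the final merge at the coordinator are all assumptions made for illustration (Oracle range-distributes rows between slave sets rather than merging at the QC).

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def split_into_granules(rows, dop):
    """Divide the row list into `dop` roughly equal granules (hypothetical helper)."""
    size, extra = divmod(len(rows), dop)
    granules, start = [], 0
    for i in range(dop):
        # The first `extra` granules take one additional row each.
        end = start + size + (1 if i < extra else 0)
        granules.append(rows[start:end])
        start = end
    return granules


def parallel_order_by(rows, dop=2):
    """Coordinator: give one granule to each worker, then merge the sorted pieces."""
    granules = split_into_granules(rows, dop)
    with ThreadPoolExecutor(max_workers=dop) as workers:
        sorted_pieces = list(workers.map(sorted, granules))
    # The coordinator combines the partial results into the final output.
    return list(merge(*sorted_pieces))


if __name__ == "__main__":
    print(parallel_order_by([5, 3, 9, 1, 7, 2], dop=2))
```

The coordinator does no sorting itself; it only splits the work and assembles the result, which mirrors the role of the query coordinator described above.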
Serial Execution

alter table TRIAD noparallel;

select * from TRIAD order by cdate;

Process 1: Read Rows from TRIAD Table -> SORT Rows -> Return Rows
Parallel Execution

alter table TRIAD parallel (degree 2 instances 1);

select * from TRIAD order by cdate;

Process 1, Process 2: Read Rows from TRIAD Table
Process 3, Process 4: SORT Rows
Process 5 (QC): Return Rows
Parallel Execution
Fetch Rows/Group By/Order By : DOP(3,1)

TRIAD Table
Read Rows from TRIAD Table (x3)
GROUP BY Rows (x3)
REUSE
SORT Rows (x3)
Query Coordinator
Degree of Parallelism (DOP)

 The query coordinator (QC) enlists two or
more of the instance's parallel execution servers to
process the SQL statement
 Degree of Parallelism: the number of parallel
execution servers working on a single operation
 If inter-operation parallelism is possible, the number of
parallel execution servers used can be twice the specified DOP!
 Ways to limit resource utilization:
– The adaptive multiuser algorithm
– User resource limits and profiles
– Database Resource Manager
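
As a rough sizing aid, the relationship between the requested DOP and the number of parallel execution servers can be sketched in Python. The function name and the `slave_sets` parameter are assumptions for illustration; the arithmetic agrees with the @px listings in the test cases later in this deck (4 slaves with one slave set at DOP(4,1), 8 with two sets, 32 at DOP(8,2)).

```python
def px_servers_needed(degree, instances=1, slave_sets=1):
    """Estimate how many parallel execution servers one statement may use.

    degree     -- DEGREE from the table definition or hint
    instances  -- INSTANCES from the table definition or hint (RAC)
    slave_sets -- 1, or 2 when inter-operation parallelism applies
    """
    if degree < 1 or instances < 1 or slave_sets not in (1, 2):
        raise ValueError("degree and instances must be >= 1; slave_sets must be 1 or 2")
    return degree * instances * slave_sets


if __name__ == "__main__":
    print(px_servers_needed(4, 1, 2))  # two slave sets on one instance
    print(px_servers_needed(8, 2, 2))  # two slave sets across two instances
```

The query coordinator session is not included in the count.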
How Parallel Execution Servers Communicate
Types of Parallelism

 Parallel Query (select, query portion of DDL,
external tables)
 Parallel DDL (create/rebuild index, CTAS,
move/split/coalesce partition)
 Parallel DML (parallel insert, update, delete
and merge)
 Parallel Execution of Functions
 Other Types (Recovery, Replication, Load)
Manual Parallelism?

 Difficult to use
 Lack of transactional properties
 Work division is complex
 DOP calculation is complex
 Lack of affinity and resource information
(RAC)
Typical Parallel DML Usage

 Refreshing Tables in a DWH system
 Create/Refresh Materialized Views
 Using Scoring Tables
 Update Historical Tables
 Running Batch Jobs (OLTP)
Initializing Parameters for Parallel Execution

 Parallel execution is enabled by default
 How to disable parallel execution
– PARALLEL_MAX_SERVERS=0
 Oracle defaults parallel settings based on
– CPU_COUNT and
– PARALLEL_THREADS_PER_CPU

 DOP Settings
– Statement level (SELECT /*+ PARALLEL(sales, 4) */ COUNT(*) FROM sales)
– Session level (ALTER SESSION FORCE PARALLEL QUERY PARALLEL 4)
– Table level (ALTER TABLE sales PARALLEL 4)
– Index Level (ALTER INDEX sales_uk PARALLEL 4)
Tuning General Parameters for Parallel Execution

 PARALLEL_MAX_SERVERS
– Default: CPU_COUNT x PARALLEL_THREADS_PER_CPU x (PGA_AGGREGATE_TARGET > 0 ? 2 : 1) x 5

 PARALLEL_MIN_SERVERS (=0)
 PARALLEL_MIN_PERCENT (=0)
 SHARED_POOL_SIZE
 PGA_AGGREGATE_TARGET
 PARALLEL_EXECUTION_MESSAGE_SIZE (=2K)
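
The PARALLEL_MAX_SERVERS formula on this slide can be worked through with a short Python sketch. The function name is hypothetical and the calculation only restates the slide's formula; the actual default may differ by Oracle version and memory configuration, so treat the result as an estimate.

```python
def default_parallel_max_servers(cpu_count, threads_per_cpu, pga_aggregate_target):
    """Apply the slide's formula:
    CPU_COUNT x PARALLEL_THREADS_PER_CPU x (2 if PGA_AGGREGATE_TARGET > 0 else 1) x 5
    """
    pga_factor = 2 if pga_aggregate_target > 0 else 1
    return cpu_count * threads_per_cpu * pga_factor * 5


if __name__ == "__main__":
    # Example values are illustrative: 8 CPUs, 2 threads per CPU, 1 GB PGA target.
    print(default_parallel_max_servers(8, 2, 1024 ** 3))
```

With 8 CPUs and 2 threads per CPU, the formula gives 160 servers when PGA_AGGREGATE_TARGET is set and 80 when it is not.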
Monitoring and Diagnosing
Parallel Execution Performance

 Regression?
 Plan Change? (Explain Plan)
 Parallel Plan? (utlxplp.sql / dbms_xplan)
 Serial Plan?
– Add Index
– Compute Statistics
– Use histograms for non-uniform distributions
– Bind Variables
– I/O or CPU bound

 Parallel Execution?
– V$SESSTAT
– V$PX_SESSTAT
– V$PQ_SYSSTAT
Monitoring and Diagnosing
Parallel Execution Performance (Cont..)

 Workload Evenly Distributed? (V$PQ_TQSTAT)
– Device Contention?
– Controller Contention?
– I/O bound: parallelism up to the number of devices
– CPU bound
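
One simple way to judge whether the workload is evenly distributed is to compare the busiest server with the average. The Python sketch below assumes you have already collected the per-server row counts (for example, the NUM_ROWS values that V$PQ_TQSTAT reports for one table queue); the function name and the sample figures are made up for illustration.

```python
def skew_ratio(rows_per_server):
    """Return busiest-server rows divided by the average; 1.0 means perfectly even."""
    if not rows_per_server:
        raise ValueError("need at least one server row count")
    average = sum(rows_per_server) / len(rows_per_server)
    if average == 0:
        return 1.0  # no rows at all, so nothing is skewed
    return max(rows_per_server) / average


if __name__ == "__main__":
    # Hypothetical row counts for four slaves in one slave set.
    print(skew_ratio([250_000, 248_000, 251_000, 249_000]))
    print(skew_ratio([900_000, 40_000, 30_000, 30_000]))
```

A ratio close to 1.0 suggests an even spread; a ratio approaching the DOP means one server is doing almost all of the work and the extra slaves add little.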
Affinity and Parallel Operations

 Disk Affinity (Preferred Read Failure Groups)
 Device to Node Affinity
 Process to Device Affinity
Atrium Overview

 Distributed Task Scheduler
 Executes tasks on Windows/Unix
 Jobs may contain different tasks executed on
multiple servers
 Unified repository for runtime statistics, logs
 Task Types:
– Command line
– File transfer
– SQL
– PL/SQL
– SQL Loader
– DWH Loads (insert/merge..)
– Oracle Warehouse Builder Jobs
Atrium System Architecture

[Diagram: Atrium Portal and Atrium DB; DWH01 and DWH02, each with Atrium Runtime, Crontab and Atrium Task; RSH connections to APP01, APP02 ... APPn, each running a Task]
Atrium – Web Interface
DWH – ETL Load Architecture

Stage Historical Data Data Mart

Extract External Table Load

External System SQL Loader DWH Table Merge DM Table

Materialized View
Serial Load – Object Definition

SH@sdwh:1:116>@size triad_s
TABLESPACE_NAME SIZE_MB
------------------------------ -------------
DWH_STAGE .06

SH@sdwh:1:116>@size triad_h
TABLESPACE_NAME SIZE_MB
------------------------------ -------------
DWH_HIST .06

SH@sdwh:1:116>@size triad_c
TABLESPACE_NAME SIZE_MB
------------------------------ -------------
DWH_DM .06

TABLE_NAME           INDEX_NAME           COL_NAME             COL_POS
-------------------- -------------------- -------------------- -------
TRIAD_S              TRIAD_S_UK           BRACCTNO                   1
TRIAD_H              TRIAD_H_UK           DWH_XDATE                  1
TRIAD_H              TRIAD_H_UK           BRACCTNO                   2
TRIAD_C              TRIAD_C_UK           BRACCTNO                   1
Serial Load – S->H : Execution Plan

SH@sdwh:1:141>@xp_cursor 7aamxvqa09s6t

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
SQL_ID 7aamxvqa09s6t, child number 0
-------------------------------------
insert /*+ parallel (TRIAD_H,1,1) */ into TRIAD_H (BRACCTNO, ACCTNO,
CL_AMT_DUE1, CL_AMT_PAY1, CL_AMT_DUE2, CL_AMT_PAY2, CL_AMT_DUE3,

Plan hash value: 3573361879

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 12426 (100)| |
| 1 | SEQUENCE | TRIAD_SEQ | | | | |
| 2 | TABLE ACCESS FULL| TRIAD_S | 872K| 399M| 12426 (3)| 00:02:30 |
--------------------------------------------------------------------------------

29 rows selected.
Serial Load – H->C : Execution Plan

SH@sdwh:1:141>@xp_cursor 57ccbj44gg5d4

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
SQL_ID 57ccbj44gg5d4, child number 0
-------------------------------------
MERGE /*+ parallel (b,1,1) */ INTO TRIAD_C b USING ( select /*+ parallel
(TRIAD_H,1,1) */ * from TRIAD_H where
dwh_xdate=to_date('02/20/2010','MM/DD/YYYY') ) a ON (a.BRACCTNO=b.BRACCTNO)
WHEN MATCHED THEN UPDATE SET b.ACCTNO=a.ACCTNO, b.CL_AMT_DUE1=a.CL_AMT_DUE1,

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | | | | 68454 (100)| |
| 1 | MERGE | TRIAD_C | | | | | |
| 2 | VIEW | | | | | | |
|* 3 | HASH JOIN OUTER | | 872K| 834M| 426M| 68454 (1)| 00:13:42 |
|* 4 | TABLE ACCESS FULL| TRIAD_H | 872K| 416M| | 13036 (3)| 00:02:37 |
| 5 | TABLE ACCESS FULL| TRIAD_C | 872K| 417M| | 13033 (3)| 00:02:37 |
----------------------------------------------------------------------------------------

SH@sdwh:1:141>
Serial Load – Statistics
Parallel Load – Object Definition

SH@sdwh:1:141>@size triad_s1
TABLESPACE_NAME SIZE_MB
------------------------------ -------------
DWH_P1 .06
DWH_P2 .06
DWH_P3 .06
DWH_P4 .06

SH@sdwh:1:141>@size triad_h1
TABLESPACE_NAME SIZE_MB
------------------------------ -------------
DWH_P1 .06
DWH_P2 .06
DWH_P3 .06
DWH_P4 .06

SH@sdwh:1:141>@size triad_c1
TABLESPACE_NAME SIZE_MB
------------------------------ -------------
DWH_P1 .06
DWH_P2 .06
DWH_P3 .06
DWH_P4 .06
Parallel Load – S1->H1 : Execution Plan : DOP(4,1)
Parallel Load – H1->C1 : Execution Plan : DOP(4,1)
Parallel Load – Statistics : DOP(4,1)

INST_ID Username QC/Slave Slave Set SID QC SID Requested DOP Actual DOP
---------- ------------ ---------- ---------- ------ ------ ------------- ----------
1 SH QC 135 135
1 - p003 (Slave) 1 137 135 4 4
1 - p002 (Slave) 1 134 135 4 4
1 - p001 (Slave) 1 136 135 4 4
1 - p000 (Slave) 1 129 135 4 4
Parallel Load – Insert from Select – Plan : DOP(4,1)
Parallel Load – Insert from Select – Statistics : DOP(4,1)
Serial/Parallel Load – Statistics
Parallel Load – 2 Threads – QC/Slaves

SYSTEM@sdwh:1:128>@px

INST_ID Username QC/Slave Slave Set SID QC SID Requested DOP Actual DOP
---------- ------------ ---------- ---------- ------ ------ ------------- ----------
1 SH QC 125 125
1 - p007 (Slave) 2 126 125 4 4
1 - p006 (Slave) 2 137 125 4 4
1 - p005 (Slave) 2 130 125 4 4
1 - p004 (Slave) 2 154 125 4 4
1 - p003 (Slave) 1 134 125 4 4
1 - p002 (Slave) 1 136 125 4 4
1 - p001 (Slave) 1 142 125 4 4
1 - p000 (Slave) 1 129 125 4 4
1 SH QC 139 139
1 - p015 (Slave) 2 121 139 4 4
1 - p014 (Slave) 2 119 139 4 4
1 - p013 (Slave) 2 118 139 4 4
1 - p012 (Slave) 2 120 139 4 4
1 - p011 (Slave) 1 122 139 4 4
1 - p010 (Slave) 1 123 139 4 4
1 - p009 (Slave) 1 135 139 4 4
1 - p008 (Slave) 1 124 139 4 4
Parallel Load – 2 Threads – ASH
Parallel Load – 2 Threads – Session Stats
Parallel Load – 2 Threads – Load Stats
Serial Load – 2 Threads – ASH
Serial Load – 2 Threads – Load Stats
Parallel RAC Load – 1 Thread – Execution Plan : DOP(1,1)

SH@atrium:1:131>alter table triad_s1 noparallel;

Table altered.

SH@atrium:1:131>alter table triad_h1 noparallel;

Table altered.

SYSTEM@atrium:1:116>@xp_cursor ct93u9mmqghm1

SQL_ID ct93u9mmqghm1, child number 0


-------------------------------------
insert into triad_h1 a select b.*,0,trunc(sysdate-1),sysdate from triad_s1 b order by bracctno

--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | | 563K(100)| | | |
| 1 | SORT ORDER BY | | 1112K| 2512M| 8690M| 563K (1)| 01:52:48 | | |
| 2 | PARTITION HASH ALL| | 1112K| 2512M| | 14230 (3)| 00:02:51 | 1 | 4 |
| 3 | TABLE ACCESS FULL| TRIAD_S1 | 1112K| 2512M| | 14230 (3)| 00:02:51 | 1 | 4 |
--------------------------------------------------------------------------------------------------------
Parallel RAC Load – 1 Thread – Load Statistics : DOP(1,1)
Parallel RAC Load – 1 Thread – Execution Plan : DOP(8,2)

SYSTEM@atrium:2:135>@xp_cursor ct93u9mmqghm1

SQL_ID ct93u9mmqghm1, child number 0


-------------------------------------
insert into triad_h1 a select b.*,0,trunc(sysdate-1),sysdate from triad_s1 b order by bracctno

---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |Pstart |Pstop
---------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | | 39162 (100)| | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (ORDER) | :TQ10001 | 1112K| 2512M| | 39162 (1)| 00:07:50 | | |
| 3 | SORT ORDER BY | | 1112K| 2512M| 8690M| 39162 (1)| 00:07:50 | | |
| 4 | PX RECEIVE | | 1112K| 2512M| | 986 (3)| 00:00:12 | | |
| 5 | PX SEND RANGE | :TQ10000 | 1112K| 2512M| | 986 (3)| 00:00:12 | | |
| 6 | PX BLOCK ITERATOR | | 1112K| 2512M| | 986 (3)| 00:00:12 | 1 | 4 |
|* 7 | TABLE ACCESS FULL| TRIAD_S1 | 1112K| 2512M| | 986 (3)| 00:00:12 | 1 | 4 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


---------------------------------------------------
7 - access(:Z>=:Z AND :Z<=:Z)
Parallel RAC Load – 1 Thread – QC/Slaves : DOP(8,2)

SH@atrium:1:131>alter table triad_s1 parallel (degree 8 instances 2);

Table altered.

SH@atrium:1:131>alter table triad_h1 parallel (degree 8 instances 2);

Table altered.

SYSTEM@atrium:2:135>@px

INST_ID Username QC/Slave Slave Set SID QC SID Requested DOP Actual DOP
---------- ------------ ---------- ---------- ------ ------ ------------- ----------
1 SH QC 121 121
1 - p015 (Slave) 2 114 121 16 16
1 - p014 (Slave) 2 115 121 16 16
1 - p013 (Slave) 2 116 121 16 16
.. 12 more
1 - p000 (Slave) 1 128 121 16 16
2 - p015 (Slave) 2 117 121 16 16
2 - p014 (Slave) 2 124 121 16 16
2 - p013 (Slave) 2 120 121 16 16
.. 12 more
2 - p000 (Slave) 1 132 121 16 16
Parallel RAC Load – 1 Thread – Sessions : DOP(8,2)

SYSTEM@atrium:2:135>@px2 SH

INST_ID USERNAME PROGRAM SID SERIAL# SPID COMMAND
---------- ---------- ---------------------------------------- -------- ---------- ------------ ---------
1 SH sqlplus@dwh01.bbandt-lob.com (TNS V1-V3) 121 722 25351 Insert
1 SH oracle@dwh01.bbandt-lob.com (P014) 115 101 25398 Insert
1 SH oracle@dwh01.bbandt-lob.com (P013) 116 164 25396 Insert
1 SH oracle@dwh01.bbandt-lob.com (P012) 117 368 25394 Insert
.. 10 more
1 SH oracle@dwh01.bbandt-lob.com (P000) 128 54 25353 Insert
1 SH oracle@dwh01.bbandt-lob.com (P015) 114 99 25400 Insert
2 SH oracle@dwh02.bbandt-lob.com (P015) 117 1 18197 Insert
2 SH oracle@dwh02.bbandt-lob.com (P000) 132 1474 18141 Insert
2 SH oracle@dwh02.bbandt-lob.com (P013) 120 1 18193 Insert
2 SH oracle@dwh02.bbandt-lob.com (P012) 122 1 18191 Insert
.. 10 more
2 SH oracle@dwh02.bbandt-lob.com (P001) 126 5856 18143 Insert
2 SH oracle@dwh02.bbandt-lob.com (P014) 124 8 18195 Insert

33 rows selected.
Parallel RAC Load – 1 Thread – ASH : DOP(8,2)
Parallel RAC Load – 1 Thread – Load Statistics : DOP(8,2)
Parallel RAC Report – 1 Thread – Execution Plan : DOP(4,1)

SYSTEM@atrium:1:116>@xp_cursor afdptzdfqb7x6
SQL_ID afdptzdfqb7x6, child number 0
-------------------------------------
insert into triad_report1 select /*+ parallel(a,4,1) */ bracctno, avg(cl_raw_score1) cl_raw_score1_avg,

from sh.triad_h1 a group by bracctno

------------------------------------------------------------------------------------------
| Id | Operation | Name | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | | | |
| 1 | PX COORDINATOR | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | | | Q1,01 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | | | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | | | Q1,01 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | | | Q1,00 | P->P | HASH |
| 6 | HASH GROUP BY | | | | Q1,00 | PCWP | |
| 7 | PX BLOCK ITERATOR | | 1 | 4 | Q1,00 | PCWC | |
|* 8 | TABLE ACCESS FULL| TRIAD_H1 | 1 | 4 | Q1,00 | PCWP | |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


---------------------------------------------------
8 - access(:Z>=:Z AND :Z<=:Z)
Parallel RAC Report – 1 Thread – Load Statistics : DOP(4,1)
Test Load Summary

Case  Operation  Threads  DB Instances  DOP  ASC  Time
----  ---------  -------  ------------  ---  ---  -----
1     Insert1    1        1             1    2    1:33
2     Insert1    1        1             4    7    1:33
5     Insert2    2        2             1    4    3:06
3     Insert2    1        1             4    7    1:05
4     Insert2    2        2             4    6    8:47
6     Insert2    1        2             1    2    4:30
7     Insert2    1        2             4    4    1:37
8     Insert2    1        2             8    16   5o:28
Database Resource Manager
Database Resource Manager : Execution time Limits..

SH_USER@sdwh:1:138>select
2 count(*)
3 from
4 sh.triad_s a
5 where
6 CL_BILL_BAL1 > (
7 select
8 avg(CL_BILL_BAL1)
9 from
10 sh.triad_s
11 where
12 cl_bill_bal1>a.cl_bill_bal2
13 )
14 ;
sh.triad_s a
*
ERROR at line 4:
ORA-07455: estimated execution time (50424 secs), exceeds limit (20 secs)

SH_USER@sdwh:1:138>
Database Resource Manager : DOP Limits..

SH_USER@sdwh:1:138>select /*+ parallel(a,32,1) */
2 count(*)
3 from
4 sh.triad_s a
5 where
6 cl_bill_bal1>0
7 ;

COUNT(*)
----------
542516

SYSTEM@sdwh:1:136>@px

INST_ID Username QC/Slave Slave Set SID QC SID Requested DOP Actual DOP
---------- ------------ ---------- ---------- ------ ------ ------------- ----------
1 SH_USER QC 138 138
1 - p003 (Slave) 1 135 138 32 4
1 - p002 (Slave) 1 145 138 32 4
1 - p001 (Slave) 1 134 138 32 4
1 - p000 (Slave) 1 137 138 32 4
Parallel Execution Tips

 Set Buffer Cache Size for Parallel Operations
(Update/Delete)
 Override the Default DOP
– Change PARALLEL_THREADS_PER_CPU
– Adjust DOP at Table/Session/Statement (Hints) Level
– Set PARALLEL_ADAPTIVE_MULTI_USER=TRUE

 Rewrite SQL Statements
 Create/Populate Tables in Parallel
 Create a Temporary Tablespace for Parallel Sort and Hash Join
 Use EXPLAIN PLAN to Show/Confirm Parallel Execution
 Be Aware of Parallel Restrictions (version specific)
 Test, Test, Test!
Oracle 11gR2 : New Features

 Automatic DOP
 Statement Queuing
 In memory Parallel Execution
Oracle 11gR2 : Automatic DOP
Oracle 11gR2 : Statement Queuing
Oracle 11gR2 : In Memory PX
Oracle 11gR2 : PARALLEL_DEGREE_POLICY

 MANUAL : Disables automatic degree of parallelism, statement
queuing, and in-memory parallel execution. This reverts the
behavior of parallel execution to what it was prior to Oracle
Database 11g Release 2 (11.2). This is the default.
 LIMITED : Enables automatic degree of parallelism for some
statements but statement queuing and in-memory Parallel
Execution are disabled. Automatic degree of parallelism is only
applied to those statements that access tables or indexes decorated
explicitly with the PARALLEL clause. Tables and indexes that have a
degree of parallelism specified will use that degree of parallelism.
 AUTO : Enables automatic degree of parallelism, statement
queuing, and in-memory parallel execution.
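
The three PARALLEL_DEGREE_POLICY settings can be summarized as a lookup table. The Python sketch below only restates the descriptions on this slide; the dictionary keys and the function name are hypothetical labels, not Oracle identifiers.

```python
# Feature summary per PARALLEL_DEGREE_POLICY value, as described on the slide.
PARALLEL_DEGREE_POLICY = {
    "MANUAL": {
        "auto_dop": "off",
        "statement_queuing": False,
        "in_memory_px": False,
    },
    "LIMITED": {
        # Only statements touching objects declared with the PARALLEL clause.
        "auto_dop": "objects with PARALLEL clause only",
        "statement_queuing": False,
        "in_memory_px": False,
    },
    "AUTO": {
        "auto_dop": "on",
        "statement_queuing": True,
        "in_memory_px": True,
    },
}


def describe_policy(value):
    """Return the feature summary for a policy name (case-insensitive)."""
    return PARALLEL_DEGREE_POLICY[value.upper()]


if __name__ == "__main__":
    for name in ("MANUAL", "LIMITED", "AUTO"):
        print(name, describe_policy(name))
```

Laid out this way, it is easy to see that LIMITED changes only the DOP calculation, while AUTO is the only setting that turns on statement queuing and in-memory parallel execution.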
Questions & Contact Info

 http://romeosoft.com/
 romeo@romeosoft.com
