Oracle10g數據庫自動診斷監視工具(ADDM)使用指南

[size=large][b]第一章 ADDM簡介[/b][/size]

在Oracle9i及之前,DBA們已經擁有了很多很好用的性能分析工具,比如,tkprof、sql_trace、statspack、set event 10046&10053等等。這些工具能夠幫助DBA很快的定位性能問題。但這些工具都只給出一些統計數據,然後再由DBA們根據自己的經驗進行優化。

那能不能由機器自動在統計數據的基礎上給出優化建議呢?Oracle10g中就推出了新的優化診斷工具:數據庫自動診斷監視工具(Automatic Database Diagnostic Monitor ADDM)和SQL優化建議工具(SQL Tuning Advisor STA)。這兩個工具的結合使用,能使DBA節省大量優化時間,也大大減少了系統宕機的危險。簡單點說,ADDM就是收集相關的統計數據到自動工作量知識庫(Automatic Workload Repository AWR)中,而STA則根據這些數據,給出優化建議。例如,一個系統資源緊張,出現了明顯的性能問題,由以往的辦法,做個一個statspack快照,等30分鐘,再做一次。查看報告,發現’ db file scattered read’事件在top 5 events裏面。根據經驗,這個事件一般可能是因爲缺少索引、統計分析信息不夠新、熱表都放在一個數據文件上導致IO爭用等原因引起的。根據這些經驗,我們需要逐個來定位排除,比如查看語句的查詢計劃、查看user_tables的last_analysed子段,檢查熱塊等等步驟來最後定位出原因,並給出優化建議。但是,有了STA以後,它就可以根據ADDM採集到的數據直接給出優化建議,甚至給出優化後的語句(搶了DBA的飯碗嘍)。

ADDM能發現定位的問題包括:
· 操作系統內存頁入頁出問題
· 由於Oracle負載和非Oracle負載導致的CPU瓶頸問題
· 導致不同資源負載的Top SQL語句和對象——CPU消耗、IO帶寬佔用、潛在IO問題、RAC內部通訊繁忙
· 按照PLSQL和JAVA執行時間排的Top SQL語句.
· 過多地連接 (login/logoff).
· 過多硬解析問題——由於shared pool過小、書寫問題、綁定大小不適應、解析失敗原因引起的。
· 過多軟解析問題
· 索引查詢過多導致資源爭用.
· 由於用戶鎖導致的過多的等待時間 (通過包dbms_lock加的鎖)
· 由於DML鎖導致的過多等待時間(例如鎖住表了)
· 由於管道輸出導致的過多等待時間(如通過包dbms_pipe.put進行管道輸出)
· 由於併發更新同一個記錄導致的過多等待時間(行級鎖等待)
· 由於ITL不夠導致的過多等待時間(大量的事務操作同一個數據塊)
· 系統中過多的commit和rollback(logfile sync事件).
· 由於磁盤帶寬太小和其他潛在問題(如由於logfile太小導致過多的checkpoint,MTTR設置問題,過多的undo操作等等)導致的IO性能問題I
· 對於DBWR進程寫數據塊,磁盤IO吞吐量不足
· 由於歸檔進程無法跟上redo日至產生的速度,導致系統變慢
· redo數據文件太小導致的問題
· 由於擴展磁盤分配導致的爭用
· 由於移動一個對象的高水位導致的爭用問題
· 內存太小問題——SGA Target, PGA, Buffer Cache, Shared Pool
· 在一個實例或者一個機羣環境中存在頻繁讀寫爭用的熱塊
· 在一個實例或者一個機羣環境中存在頻繁讀寫爭用的熱對象
· RAC環境中內部通訊問題


· LMS進程無法跟上導致鎖請求阻塞
· 在RAC環境中由於阻塞和爭用導致的實例傾斜
· RMAN導致的IO和CPU問題
· Streams和AQ問題
· 資源管理等待事件


有一點要記住:AWR收集的數據時放到內存中(share pool),通過一個新的後臺進程MMON定期寫到磁盤中。所以10g的share pool要求比以前版本更大,一般推薦比以前大15-20%。另外,還要求系統參數STATISTICS_LEVEL設置爲TYPICAL(推薦)或ALL;
ALTER SESSION SET STATISTICS_LEVEL= TYPICAL;

[b][size=large]第二章 工作採集、診斷過程[/size][/b]

Oracle10g提供了一個圖形化的界面(通過OEM),使這個工具使用起來非常簡單。下面這裏介紹一下如何通過sqlplus使用這個工具。這個工具的使用非常簡單,它是不需要安裝的。
[b]第一步:創建測試用的表[/b]
SQL> CREATE TABLE bigtab AS SELECT rownum as "id", a.* FROM dba_objects a;

Table created.
SQL> create table smalltab as select rownum as "id", a.* FROM dba_tables a;
Table created.
SQL> ALTER TABLE bigtab MODIFY (empno NUMBER);
Table altered.
SQL> DECLARE
n NUMBER;
BEGIN
FOR n IN 1..100
LOOP
INSERT INTO bigtab SELECT rownum as "id", a.* FROM dba_objects a;
COMMIT;
END LOOP;
END;
/

PL/SQL procedure successfully completed.

[b]第二步:採集一次工作量快照[/b]
SQL> begin
2 dbms_workload_repository.create_snapshot('TYPICAL');
3 end;
4 /

PL/SQL procedure successfully completed.

[b]第三步:進行一些高負荷操作[/b]
]DECLARE
v_var number;
BEGIN
FOR n IN 1..6
LOOP
select count(*) into v_var from bigtab b, smalltab a;

END LOOP;
END;

/

PL/SQL procedure successfully completed.
[b]
第四步:再次採集一次工作量快照[/b]

要注意的是:兩次快照之間的間隔時間必須足夠(一般推薦30分鐘左右),否則得到的ADDM報告中就會提示:THERE WAS NOT ENOUGH DATABASE TIME FOR ADDM ANALYSIS.
SQL> begin
2 dbms_workload_repository.create_snapshot('TYPICAL');
3 end;
4 /

PL/SQL procedure successfully completed.

[b]第五步:創建一個優化診斷任務並執行[/b]

先獲取到兩次快照的ID:
SQL> select snap_id from
2 (SELECT * FROM dba_hist_snapshot
3 ORDER BY snap_id desc)
4 where rownum <=2;

SNAP_ID
--------
66
65


然後創建優化任務,並執行。

DECLARE
task_name VARCHAR2(30) := 'DEMO_ADDM01';
task_desc VARCHAR2(30) := 'ADDM Feature Test';
task_id NUMBER;
BEGIN
dbms_advisor.create_task('ADDM', task_id, task_name, task_desc, null);
dbms_advisor.set_task_parameter(task_name, 'START_SNAPSHOT', 65);
dbms_advisor.set_task_parameter(task_name, 'END_SNAPSHOT', 66);
dbms_advisor.set_task_parameter(task_name, 'INSTANCE', 1);
dbms_advisor.set_task_parameter(task_name, 'DB_ID', 1712582900);
dbms_advisor.execute_task(task_name);
END;

/

PL/SQL procedure successfully completed.


其中,set_task_parameter是用來設置任務參數的。START_SNAPSHOT是起始快照ID,END_SNAPSHOT是結束快照ID,INSTANCE是實例號,對於單實例,一般是1,在RAC環境下,可以通過查詢視圖v$instance得到,DB_ID是數據庫的唯一識別號,可以通過查詢v$database查到。
[b]第六步:查看優化建議結果[/b]

通知函數dbms_advisor.get_task_report可以得到優化建議結果。

SQL> SET LONG 1000000 PAGESIZE 0 LONGCHUNKSIZE 1000
SQL> COLUMN get_clob FORMAT a80
SQL> SELECT dbms_advisor.get_task_report('DEMO_ADDM01', 'TEXT', 'ALL') FROM DUAL;

DBMS_ADVISOR.GET_TASK_REPORT('
--------------------------------------------------------------------------------
DETAILED ADDM REPORT FOR TASK 'DEMO_ADDM01' WITH ID 243
-------------------------------------------------------

Analysis Period: 23-NOV-2005 from 15:02:27 to 16:06:42
Database ID/Instance: 1712582900/1
Database/Instance Names: EDGAR/edgar
Host Name: HUANGED
Database Version: 10.2.0.1.0
Snapshot Range: from 65 to 66
Database Time: 1463 seconds
Average Database Load: .4 active sessions

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


FINDING 1: 100% impact (1463 seconds)
-------------------------------------
Significant virtual memory paging was detected on the host operating system.

RECOMMENDATION 1: Host Configuration, 100% benefit (1463 seconds)
ACTION: Host operating system was experiencing significant paging but no
particular root cause could be detected. Investigate processes that
do not belong to this instance running on the host that are consuming
significant amount of virtual memory. Also consider adding more
physical memory to the host.

FINDING 2: 100% impact (1463 seconds)
-------------------------------------
SQL statements consuming significant database time were found.

RECOMMENDATION 1: SQL Tuning, 68% benefit (998 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "064wqx7c5b81z". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 064wqx7c5b81z
DECLARE
v_var number;
BEGIN
FOR n IN 1..10000
LOOP
select count(*) into v_var from bigtab b, smalltab a;
END LOOP;
END;

RECOMMENDATION 2: SQL Tuning, 67% benefit (986 seconds)
ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
"fvqfghq71cqns".
RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
PLAN_HASH 3281046854
SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
times and had an average elapsed time of 166 seconds.

FINDING 3: 69% impact (1002 seconds)
------------------------------------
Time spent on the CPU by the instance was responsible for a substantial part
of database time.

RECOMMENDATION 1: SQL Tuning, 67% benefit (986 seconds)
ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
"fvqfghq71cqns".
RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
PLAN_HASH 3281046854
SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
times and had an average elapsed time of 166 seconds.
RATIONALE: Average CPU used per execution was 162 seconds.

RECOMMENDATION 2: SQL Tuning, 2.1% benefit (30 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
times and had an average elapsed time of 0.26 seconds.
RATIONALE: Average CPU used per execution was 0.24 seconds.

FINDING 4: 2.2% impact (33 seconds)
-----------------------------------
PL/SQL execution consumed significant database time.

RECOMMENDATION 1: SQL Tuning, 2.2% benefit (33 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
times and had an average elapsed time of 0.26 seconds.
RATIONALE: Average time spent in PL/SQL execution was 0.26 seconds.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ADDITIONAL INFORMATION
----------------------

Wait class "Application" was not consuming significant database time.
Wait class "Commit" was not consuming significant database time.
Wait class "Concurrency" was not consuming significant database time.
Wait class "Configuration" was not consuming significant database time.
Wait class "Network" was not consuming significant database time.
Wait class "User I/O" was not consuming significant database time.
Session connect and disconnect calls were not consuming significant database
time.
Hard parsing of SQL statements was not consuming significant database time.

The analysis of I/O performance is based on the default assumption that the
average read time for one database block is 10000 micro-seconds.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TERMINOLOGY
-----------

DATABASE TIME: This is the ADDM's measurement of throughput. From the user's
point of view: this is the total amount of time spent by users waiting for
a response from the database after issuing a call (not including
networking). From the database instance point of view: this is the total
time spent by forground processes waiting for a database resource (e.g.,
read I/O), running on the CPU and waiting for a free CPU (run-queue). The
target of ADDM analysis is to reduce this metric as much as possible,
thereby reducing the instance's response time.

AVERAGE DATABASE LOAD: At any given time we can count how many users (also
called 'Active Sessions') are waiting for an answer from the instance. This
is the ADDM's measurement for instance load. The 'Average Database Load' is
the average of the the load measurement taken over the entire analysis
period. We get this number by dividing the 'Database Time' by the analysis
period. For example, if the analysis period is 30 minutes and the 'Database
Time' is 90 minutes, we have an average of 3 users waiting for a response.

IMPACT: Each finding has an 'Impact' associated with it. The impact is the
portion of the 'Database Time' the finding deals with. If we assume that
the problem described by the finding is completely solved, then the
'Database Time' will be reduced by the amount of the 'Impact'.

BENEFIT: Each recommendation has a 'benefit' associated with it. The ADDM
analysis estimates that the 'Database Time' can be reduced by the 'benefit'
amount if all the actions of the recommendation are performed.

說明:

其中第五步到第六步可以直接執行$ORACLE_HOME/rdbms/admin/addmrpt.sql來得到,這個腳本的執行過程和statspack腳本執行過程類似:
SQL> @addmrpt

Current Instance
~~~~~~~~~~~~~~~~

DB Id DB Name Inst Num Instance
----------- ------------ -------- ------------
1712582900 EDGAR 1 edgar


Instances in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

DB Id Inst Num DB Name Instance Host
------------ -------- ------------ ------------ ------------
* 1712582900 1 EDGAR edgar HUANGED

Using 1712582900 for database Id
Using 1 for instance number


Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed. Pressing <return> without
specifying a number lists all completed snapshots.



Listing the last 3 days of Completed Snapshots

Snap
Instance DB Name Snap Id Snap Started Level
------------ ------------ --------- ------------------ -----
edgar EDGAR 7 22 Nov 2005 00:00 1
... ...
64 23 Nov 2005 15:02 1
65 23 Nov 2005 16:00 1
66 23 Nov 2005 16:06 1


Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 65
Begin Snapshot Id specified: 66

Enter value for end_snap: 66
End Snapshot Id specified: 66



Specify the Report Name
~~~~~~~~~~~~~~~~~~~~~~~
The default report file name is addmrpt_1_65_66.txt. To use this name,
press <return> to continue, otherwise enter an alternative.

Enter value for report_name:

Using the report name addmrpt_1_65_66.txt


Running the ADDM analysis on the specified pair of snapshots ...


Generating the ADDM report for this analysis ...

... ...


此外,如果是RAC環境下,可以執行$ORACLE_HOME/rdbms/admin/addmrpti.sql,這腳本的執行,會多出要求輸入DB ID和instance ID的要求。
第三章 診斷結果分析

我們從上面的建議結果看到了,ADDM Report的結果與Statspack Report的結果大不相同。Statspack Report的結果給出的都是統計數據、各種事件,然後由DBA根據這些數據給出優化建議,而ADDM Report的結果包含就已經是給出的優化建議了(汗!DBA以後要失業了!)
第一部分:
Analysis Period: 23-NOV-2005 from 15:02:27 to 16:06:42
Database ID/Instance: 1712582900/1
Database/Instance Names: EDGAR/edgar
Host Name: HUANGED
Database Version: 10.2.0.1.0
Snapshot Range: from 65 to 66
Database Time: 1463 seconds
Average Database Load: .4 active sessions


這一部分包括一些基礎信息,分析時間段、DB和instance ID&名字、主機名字、Oracle版本、快照範圍、數據庫消耗時間、多少個活動會話。
第二部分:

下面就是ADDM發現的問題,並給出的相應建議。在我們這個例子中總共發現4個問題,下面一一解釋一下。第一個問題:
FINDING 1: 100% impact (1463 seconds)
-------------------------------------
Significant virtual memory paging was detected on the host operating system.

RECOMMENDATION 1: Host Configuration, 100% benefit (1463 seconds)
ACTION: Host operating system was experiencing significant paging but no
particular root cause could be detected. Investigate processes that
do not belong to this instance running on the host that are consuming
significant amount of virtual memory. Also consider adding more
physical memory to the host.

先看第一行:100% impact (1463 seconds),這是這個問題所持續的實踐及其對系統的影響,它的時間是1463秒,和分析期間的數據庫消耗時間(在第一部分中)是一樣(1463秒),所以對系統的影響是1463/1463*100=100%的。

再看第二行:Significant virtual memory paging was detected on the host operating system.,這是ADDM發現的這個問題的具體描述:在操作系統中發現有顯著的虛擬內存頁入頁出的問題。

然後看ADDM給出的建議及其作用:Host Configuration, 100% benefit (1463 seconds)——更改主機配置,100%有效。

最後是具體該如何操作:略——在主機的操作系統上發現了明顯的頁入頁出,但是沒有發現明顯導致內存頻繁換如換出的根本原因。需要仔細檢查那些消耗大量虛擬內存的進程(除Oracle實例外)。此外,還可以考慮增大主機的物理內存。說明一下:我的這個實例是跑在我自己的PC機上,Oracle運行的同時有大量的其他消耗內存的程序(word等)在運行,所以肯定有大量的內存交換存在。

再看第二個問題:
FINDING 2: 100% impact (1463 seconds)
-------------------------------------
SQL statements consuming significant database time were found.

RECOMMENDATION 1: SQL Tuning, 68% benefit (998 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "064wqx7c5b81z". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 064wqx7c5b81z
DECLARE
v_var number;
BEGIN
FOR n IN 1..10000
LOOP
select count(*) into v_var from bigtab b, smalltab a;
END LOOP;
END;

ADDM發現有SQL語句在消耗大量數據庫時間,它的影響是100%的。給出的建議是優化SQL,能取得68%的效果。

具體操作是優化ADDM找到的PL/SQL塊,它的SQL_ID是"064wqx7c5b81z"(可以通過select sql_text from v$sql where sql_id=’064wqx7c5b81z’;查到)。至於如何優化SQL語句,可以參考Oracle文檔PL/SQL User's Guide and Reference中的Tuning PL/SQL Applications章節。下面的內容便是我們用來插入數據的測試語句。

下面是ADDM發現的其他問題語句:
FINDING 3: 69% impact (1002 seconds)
------------------------------------
Time spent on the CPU by the instance was responsible for a substantial part
of database time.

RECOMMENDATION 1: SQL Tuning, 67% benefit (986 seconds)
ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
"fvqfghq71cqns".
RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
PLAN_HASH 3281046854
SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
times and had an average elapsed time of 166 seconds.
RATIONALE: Average CPU used per execution was 162 seconds.

RECOMMENDATION 2: SQL Tuning, 2.1% benefit (30 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
times and had an average elapsed time of 0.26 seconds.
RATIONALE: Average CPU used per execution was 0.24 seconds.

這個問題的描述是,實例消耗的CPU事件佔據了大量的數據庫運行時間。由於發現了兩條問題語句,所以這裏有兩個建議。

第一個建議就是優化我們的測試語句。並且說明了這個問題的根本原因:這條語句總共執行過6次,平均每次消耗了166秒。平均這個問題消耗的CPU時間是162秒。

第二個建議實際上是針對一個系統過程,這個過程是用來讀取隊列信息的,消耗的資源比較小,我們這裏就不需要關心了。

再看最後一個問題:
FINDING 4: 2.2% impact (33 seconds)
-----------------------------------
PL/SQL execution consumed significant database time.

RECOMMENDATION 1: SQL Tuning, 2.2% benefit (33 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
times and had an average elapsed time of 0.26 seconds.
RATIONALE: Average time spent in PL/SQL execution was 0.26 seconds.

從內容上看,這個問題就是上一個問題中的第二個建議。但是,它導致的結果是不一樣的。看這個問題的描述:PL/SQL的執行次數消耗了大量的數據庫時間。它的根本原因是因爲執行次數太多(125次)。可見ADDM的問題檢查相當全面。
第三部分:

這一部分的內容是關於此次優化建議的一些附加信息:
ADDITIONAL INFORMATION
----------------------

Wait class "Application" was not consuming significant database time.
Wait class "Commit" was not consuming significant database time.
Wait class "Concurrency" was not consuming significant database time.
Wait class "Configuration" was not consuming significant database time.
Wait class "Network" was not consuming significant database time.
Wait class "User I/O" was not consuming significant database time.
Session connect and disconnect calls were not consuming significant database
time.
Hard parsing of SQL statements was not consuming significant database time.

The analysis of I/O performance is based on the default assumption that the
average read time for one database block is 10000 micro-seconds.

這是關於這次優化診斷對各類事件(在Oracle10g,新增了很多新的事件,主要是將原先一些較含糊的事件細化了,同時將所有事件進行了歸類。可以查看視圖V$SYSTEM_WAIT_CLASS)的一些總結:Application、Commit、Concurrency、Configuration、Network、User I/O類等待事件沒有顯著消耗數據庫時間;會話連接、斷連請求沒有消耗大量數據庫時間;對SQL語句的硬解析沒有消耗大量數據庫時間;對IO性能的分析是基於默認假設每次讀一個數據塊的時間是10000微秒的。
第四部分:

這部分是對診斷報告中用到的術語的解釋:
TERMINOLOGY
-----------

DATABASE TIME: This is the ADDM's measurement of throughput. From the user's
point of view: this is the total amount of time spent by users waiting for
a response from the database after issuing a call (not including
networking). From the database instance point of view: this is the total
time spent by forground processes waiting for a database resource (e.g.,
read I/O), running on the CPU and waiting for a free CPU (run-queue). The
target of ADDM analysis is to reduce this metric as much as possible,
thereby reducing the instance's response time.

AVERAGE DATABASE LOAD: At any given time we can count how many users (also
called 'Active Sessions') are waiting for an answer from the instance. This
is the ADDM's measurement for instance load. The 'Average Database Load' is
the average of the the load measurement taken over the entire analysis
period. We get this number by dividing the 'Database Time' by the analysis
period. For example, if the analysis period is 30 minutes and the 'Database
Time' is 90 minutes, we have an average of 3 users waiting for a response.

IMPACT: Each finding has an 'Impact' associated with it. The impact is the
portion of the 'Database Time' the finding deals with. If we assume that
the problem described by the finding is completely solved, then the
'Database Time' will be reduced by the amount of the 'Impact'.

BENEFIT: Each recommendation has a 'benefit' associated with it. The ADDM
analysis estimates that the 'Database Time' can be reduced by the 'benefit'
amount if all the actions of the recommendation are performed.


DATABASE TIME:是ADDM的度量數據。從用戶角度看:這是從向數據庫請求開始,消耗在用戶等待響應上的全部時間(不包括網絡響應時間);從數據庫實例角度看:前臺進程消耗在等待一種數據庫資源(例如,IO讀)、CPU運行和等待CPU釋放(隊列等待)的總共時間。ADDM分析的目標就儘量降低這個數字,也就是減少實例響應時間。

AVERAGE DATABASE LOAD:所有能統計到的有多少用戶(也稱爲“活動會話”)等待實例響應。這是實例負荷的度量指標。平均數據庫負荷是由整個分析計算出來的平均負荷。通過“Database Time”除以分析週期時間得到。例如,分析週期時30分鐘,而數據庫運行消耗時間是90分鐘,那就說明平均有3個用戶在等待響應。

IMPACT:每一個找到的問題都有“影響”這一項。“影響”是數據庫消耗時間用於處理這個問題的時間不分。假定我們所找到的這個問題完全解決,那麼數據庫消耗時間就會相應減少“影響”時間。

BENEFIT:每一個找到的問題都“受益”這一項。如果所有建議操作得到實施,ADDM分析估計數據庫消耗時間能減少“受益”的全部時間。
第四章 使用STA來優化語句

ADDM得出了診斷結果,並給出了優化建議。通常90%的性能問題都是由於應用引起的,而應用問題肯定離不開問題語句。那麼如何優化這些語句呢,以前靠的是DBA的經驗,現在就可以使用STA了。

備註:關於STA的使用,請參考我的另一篇文章《利用SQL優化器(STA)優化語句》

STA的使用非常簡單,從ADDM診斷報告中,發現有一個與句很影響系統性能,我們就對這條語句進行優化:
第一步:創建優化任務並執行
SQL> DECLARE
2 my_task_name VARCHAR2(30);
3 my_sqltext CLOB;
4 BEGIN
5 my_sqltext := 'select a.table_name, b.object_id from bigtab b, smalltab a';
6
7 my_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
8 sql_text => my_sqltext,
9 user_name => 'DEMO',
10 scope => 'COMPREHENSIVE',
11 time_limit => 60,
12 task_name => 'TEST_sql_tuning_task',
13 description => 'Task to tune a query on a specified PRODUCT');
14
15 dbms_sqltune.Execute_tuning_task (task_name => 'TEST_sql_tuning_task');
16 END;
17 /

PL/SQL procedure successfully completed


DBMS_SQLTUNE.CREATE_TUNING_TASK就是用來創建優化任務的函數。其中,sql_text是需要優化的語句,user_name是該語句通過哪個用戶執行,scope是優化範圍(limited或comprehensive),time_limit優化過程的時間限制,task_name優化任務名稱,description優化任務描述。

dbms_sqltune.Execute_tuning_task是執行優化的函數。
第二步:查看優化建議結果
SQL> set long 10000
SQL> set longchunksize 1000
Cannot SET LONGCHUNKSIZE

SQL> set linesize 100
SQL> SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('TEST_sql_tuning_task') FROM DUAL;

DBMS_SQLTUNE.REPORT_TUNING_TAS
--------------------------------------------------------------------------------
GENERAL INFORMATION SECTION
-------------------------------------------------------------------------------
Tuning Task Name : TEST_sql_tuning_task
Tuning Task Owner : DEMO
Scope : COMPREHENSIVE
Time Limit(seconds) : 60
Completion Status : COMPLETED
Started at : 11/24/2005 14:46:23
Completed at : 11/24/2005 14:46:24
Number of SQL Restructure Findings: 1

-------------------------------------------------------------------------------
Schema Name: DEMO
SQL ID : f5k29d73zdpsd
SQL Text : select a.table_name, b.object_id from bigtab b, smalltab a

-------------------------------------------------------------------------------
FINDINGS SECTION (1 finding)
-------------------------------------------------------------------------------

1- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
An expensive cartesian product operation was found at line ID 1 of the
execution plan.

Recommendation
--------------
- Consider removing the disconnected table or view from this statement or
add a join condition which refers to it.

Rationale
---------
A cartesian product should be avoided whenever possible because it is an
expensive operation and might produce a large amount of data.

-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------

1- Original
-----------
Plan hash value: 3479921507

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1474M| 32G| 4316K (2)| 14:23:21
| 1 | MERGE JOIN CARTESIAN| | 1474M| 32G| 4316K (2)| 14:23:21
| 2 | TABLE ACCESS FULL | SMALLTAB | 1223 | 23237 | 11 (0)| 00:00:01
| 3 | BUFFER SORT | | 1205K| 5887K| 4316K (2)| 14:23:21
| 4 | TABLE ACCESS FULL | BIGTAB | 1205K| 5887K| 3530 (2)| 00:00:43
--------------------------------------------------------------------------------

-------------------------------------------------------------------------------


優化建議結果分爲三部分,下面簡單介紹一下每個部分。
第一部分:

和所有Oracle的報告一樣,第一部分總是基礎信息J。
GENERAL INFORMATION SECTION
-------------------------------------------------------------------------------
Tuning Task Name : TEST_sql_tuning_task
Tuning Task Owner : DEMO
Scope : COMPREHENSIVE
Time Limit(seconds) : 60
Completion Status : COMPLETED
Started at : 11/24/2005 14:46:23
Completed at : 11/24/2005 14:46:24
Number of SQL Restructure Findings: 1

-------------------------------------------------------------------------------
Schema Name: DEMO
SQL ID : f5k29d73zdpsd
SQL Text : select a.table_name, b.object_id from bigtab b, smalltab a

這部分信息包括:任務名稱、任務所有者、任務範圍、任務執行時間限制(前面幾個信息實際上就是我們創建任務時指定的參數)、任務狀態、任務開始時間、完成時間、發現的需要重新構造的問題語句數量。接下來就是schema名稱、SQL ID和SQL內容。
第二部分:

這部分是優化器發現的需要優化的語句,在我們的例子中,肯定只有一條:
-------------------------------------------------------------------------------
FINDINGS SECTION (1 finding)
-------------------------------------------------------------------------------

1- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
An expensive cartesian product operation was found at line ID 1 of the
execution plan.

Recommendation
--------------
- Consider removing the disconnected table or view from this statement or
add a join condition which refers to it.

Rationale
---------
A cartesian product should be avoided whenever possible because it is an
expensive operation and might produce a large amount of data.


首先是在查詢計劃中發現的問題:在語句1的查詢計劃中發現有很大的笛卡爾積操作(顯然,我們的語句是一個外連接操作,做的就是迪卡爾積操作);

其次是針對問題給出的建議:考慮將不需要做連接查詢的表或視圖去掉,或者增加一個連接條件;

最後導致問題的根本原因:由於迪卡爾積是一個會產生很大數據量的消耗資源的操作,所以需要儘量避免迪卡爾積操作。
第三部分:

這一部分是SQL語句的查詢計劃。
-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------

1- Original
-----------
Plan hash value: 3479921507

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1474M| 32G| 4316K (2)| 14:23:21
| 1 | MERGE JOIN CARTESIAN| | 1474M| 32G| 4316K (2)| 14:23:21
| 2 | TABLE ACCESS FULL | SMALLTAB | 1223 | 23237 | 11 (0)| 00:00:01
| 3 | BUFFER SORT | | 1205K| 5887K| 4316K (2)| 14:23:21
| 4 | TABLE ACCESS FULL | BIGTAB | 1205K| 5887K| 3530 (2)| 00:00:43
--------------------------------------------------------------------------------

-------------------------------------------------------------------------------

這裏給出的是語句優化前的原始查詢計劃。如果優化器認爲這條語句需要重寫,它還會給出重寫後的查詢計劃。
第五章 總結

哎,不多說了,總之一個字:簡單!如果Oracle在這樣下去,這個世界上誰都是DBA,誰都不需要DBA了。

作者: fuyuncat
來源: www.HelloDBA.com
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章