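Continued from the previous post:
SQL Server Performance Tuning Series (4) -- Profiler (Part 1)
3. Analyzing the trace records
After the trace has run for a while, the trace file holds the captured data (including counters such as IO, Duration, CPU, Reads, Writes and RowCounts). The next step is to load that data into a table and analyze it. You can also open and inspect the trace data directly in Profiler, but that has limitations: few operations are available, the same SQL statements are repeated many times, and there is no aggregation.
3.1 Load the data into a table (the function fn_trace_gettable returns the trace data in tabular form; as an example, only the T-SQL code and Duration, the query run time, are selected for analysis)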
select CAST(textdata as nvarchar(max)) as tsql_code,duration
into Workload
from sys.fn_trace_gettable('C:\test\performancetrace_20100802.trc',NULL) as TT
3.2 Aggregate identical SQL statements
select tsql_code,SUM(duration) as total_duration from workload group by tsql_code
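(I ran this trace on Production, so for the sake of system security I cannot show the SQL code that was analyzed; my apologies. If you are interested, try this in your own test environment and discuss the results.)
Problem: after grouping and aggregating, queries that are logically the same but use different parameter values end up in different groups, because their filters contain different values. Queries with the same logic generally use the same execution plan, so they should be aggregated together in order to analyze the total query run time accurately.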
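To make the problem concrete, here is a minimal sketch that uses a table variable with two made-up rows (the table dbo.Orders and the durations are hypothetical, not taken from the trace). The two statements differ only in the literal, yet GROUP BY treats them as different strings:

```sql
-- Hypothetical sample rows standing in for trace output.
DECLARE @Sample TABLE (tsql_code NVARCHAR(MAX), duration BIGINT);

INSERT INTO @Sample (tsql_code, duration)
SELECT N'SELECT * FROM dbo.Orders WHERE OrderID = 1', 120
UNION ALL
SELECT N'SELECT * FROM dbo.Orders WHERE OrderID = 2', 80;

-- Each distinct string forms its own group, so this returns two rows
-- rather than one combined row for the shared query pattern.
SELECT tsql_code, SUM(duration) AS total_duration
FROM @Sample
GROUP BY tsql_code;
```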
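3.3 Solution 1 (rough prefix truncation)
SQL statements usually have the form SELECT plus a column list, so a large part on the left is identical. Based on the length of the SQL text, take a leading prefix and aggregate on it, for example the first 50, 100 or 150 characters. The method is simple and easy to apply, and it does merge some of the data, but a good length is hard to pick: the only option is to adjust the prefix length and test.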
select left(tsql_code,50) as t_sql,SUM(duration) as total_duration from workload group by left(tsql_code,50)
--or
select left(tsql_code,100) as t_sql,SUM(duration) as total_duration from workload group by left(tsql_code,100)
--or
select left(tsql_code,150) as t_sql,SUM(duration) as total_duration from workload group by left(tsql_code,150)
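One way to compare candidate prefix lengths is to count how many groups each one produces. The query below is only a sketch against the Workload table created in 3.1; the lengths 50, 100 and 150 are the ones tried above, and the column aliases are invented for illustration:

```sql
-- Fewer groups means more statements were merged together;
-- groups_full is the baseline with no truncation at all.
SELECT
  COUNT(DISTINCT LEFT(tsql_code, 50))  AS groups_50,
  COUNT(DISTINCT LEFT(tsql_code, 100)) AS groups_100,
  COUNT(DISTINCT LEFT(tsql_code, 150)) AS groups_150,
  COUNT(DISTINCT tsql_code)            AS groups_full
FROM dbo.Workload;
```

A very small group count for a short prefix may also mean that unrelated queries sharing the same opening text were merged, so the numbers are a hint rather than a final answer.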
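3.4 Solution 2 (more complex but precise: logically identical SQL, with the parameters replaced by a placeholder). This is the method described in the book Inside Microsoft SQL Server: T-SQL Querying (T-SQL查詢技術內幕). If you need a more detailed explanation, read the book; you will find a lot more inspiration there.
(1) The query signature (query pattern): it is identical for all queries that follow the same pattern.
- T-SQL function implementation
Create the function: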
CREATE FUNCTION [dbo].[fn_SQLSigTSQL]
(@p1 NTEXT, @parselength INT = 4000)
RETURNS NVARCHAR(4000)
-- This function will replace the parameters with '#'
-- This function is provided "AS IS" with no warranties,
-- and confers no rights.
-- Use of included script samples are subject to the terms specified at
-- http://www.microsoft.com/info/cpyright.htm
--
-- Strips query strings
AS
BEGIN
DECLARE @pos AS INT;
DECLARE @mode AS CHAR(10);
DECLARE @maxlength AS INT;
DECLARE @p2 AS NCHAR(4000);
DECLARE @currchar AS CHAR(1), @nextchar AS CHAR(1);
DECLARE @p2len AS INT;
SET @maxlength = LEN(RTRIM(SUBSTRING(@p1,1,4000)));
SET @maxlength = CASE WHEN @maxlength > @parselength
THEN @parselength ELSE @maxlength END;
SET @pos = 1;
SET @p2 = '';
SET @p2len = 0;
SET @currchar = '';
set @nextchar = '';
SET @mode = 'command';
WHILE (@pos <= @maxlength)
BEGIN
SET @currchar = SUBSTRING(@p1,@pos,1);
SET @nextchar = SUBSTRING(@p1,@pos+1,1);
IF @mode = 'command'
BEGIN
SET @p2 = LEFT(@p2,@p2len) + @currchar;
SET @p2len = @p2len + 1 ;
IF @currchar IN (',','(',' ','=','<','>','!')
AND @nextchar BETWEEN '0' AND '9'
BEGIN
SET @mode = 'number';
SET @p2 = LEFT(@p2,@p2len) + '#';
SET @p2len = @p2len + 1;
END
IF @currchar = ''''
BEGIN
SET @mode = 'literal';
SET @p2 = LEFT(@p2,@p2len) + '#''';
SET @p2len = @p2len + 2;
END
END
ELSE IF @mode = 'number' AND @nextchar IN (',',')',' ','=','<','>','!')
SET @mode= 'command';
ELSE IF @mode = 'literal' AND @currchar = ''''
SET @mode= 'command';
SET @pos = @pos + 1;
END
RETURN @p2;
END
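The function takes a query string and the length of code to parse, and returns the query signature, with every parameter replaced by a pound sign (#). A test call: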
select dbo.fn_SQLSigTSQL('select * from Sales.SalesOrderHeader where SalesOrderID=''43659'' and Status=''5'' ',500)
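As a further check, the sketch below compares the signatures of two statements that differ only in a numeric literal. The statements and the table dbo.Orders are hypothetical; if the function behaves as described, same_signature should come back as 1:

```sql
-- Two made-up statements with different numeric literals.
SELECT
  dbo.fn_SQLSigTSQL(N'select * from dbo.Orders where id=1', 4000)  AS sig_a,
  dbo.fn_SQLSigTSQL(N'select * from dbo.Orders where id=22', 4000) AS sig_b,
  CASE WHEN dbo.fn_SQLSigTSQL(N'select * from dbo.Orders where id=1', 4000)
          = dbo.fn_SQLSigTSQL(N'select * from dbo.Orders where id=22', 4000)
       THEN 1 ELSE 0 END AS same_signature;
```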
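- CLR implementation
CLR code is more efficient than T-SQL at iterative/procedural logic and string processing. The following shows how to produce the query signature with CLR.
a. Create a C# Class Library project containing the following functions: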
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;
public partial class SQLSignature
{
// fn_SQLSigCLR
[SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)]
public static SqlString fn_SQLSigCLR(SqlString querystring)
{
return (SqlString)Regex.Replace(
querystring.Value,
@"([\s,(=<>!](?![^\]]+[\]]))(?:(?:(?:(?# expression coming
)(?:([N])?(')(?:[^']|'')*('))(?# character
)|(?:0x[\da-fA-F]*)(?# binary
)|(?:[-+]?(?:(?:[\d]*\.[\d]*|[\d]+)(?# precise number
)(?:[eE]?[\d]*)))(?# imprecise number
)|(?:[~]?[-+]?(?:[\d]+))(?# integer
))(?:[\s]?[\+\-\*\/\%\&\|\^][\s]?)?)+(?# operators
))",
@"$1$2$3#$4");
}
// fn_RegexReplace - for generic use of RegEx-based replace
[SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)]
public static SqlString fn_RegexReplace(
SqlString input, SqlString pattern, SqlString replacement)
{
return (SqlString)Regex.Replace(
input.Value, pattern.Value, replacement.Value);
}
}
b. Load the intermediate language code from the .dll into the database
USE master;
CREATE ASSEMBLY SQLSignature
FROM 'C:\SQLSignature\SQLSignature\bin\Debug\SQLSignature.dll';
c. Register the functions fn_SQLSigCLR and fn_RegexReplace
CREATE FUNCTION dbo.fn_SQLSigCLR(@querystring AS NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
WITH RETURNS NULL ON NULL INPUT
EXTERNAL NAME SQLSignature.SQLSignature.fn_SQLSigCLR;
GO
CREATE FUNCTION dbo.fn_RegexReplace(
@input AS NVARCHAR(MAX),
@pattern AS NVARCHAR(MAX),
@replacement AS NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
WITH RETURNS NULL ON NULL INPUT
EXTERNAL NAME SQLSignature.SQLSignature.fn_RegexReplace;
GO
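Calling the registered functions requires CLR integration to be enabled on the instance, and it is off by default. If that applies to your server, a sketch like the following turns it on (this changes a server-level setting, so check with whoever administers the instance first):

```sql
-- Check the current setting: value_in_use = 1 means CLR is already enabled.
SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'clr enabled';

-- Enable CLR integration for the instance.
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
```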
d. Once the functions are registered, test them with the following code:
SELECT
dbo.fn_SQLSigCLR(tsql_code) AS sig_sql,
duration
FROM dbo.Workload;
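Every SQL statement in the result is reduced to its signature, with a pound sign (#) in place of each parameter.
(2) Use the functions created above to convert the traced T-SQL statements to signatures, then group and aggregate them.
a. From the query signature, generate an integer checksum (CHECKSUM) for each string, which makes the later aggregation simpler and more efficient: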
ALTER TABLE dbo.Workload ADD cs INT NOT NULL DEFAULT (0);
GO
UPDATE dbo.Workload
SET cs = CHECKSUM(dbo.fn_SQLSigCLR(tsql_code));
CREATE CLUSTERED INDEX idx_cl_cs ON dbo.Workload(cs);
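CHECKSUM is a hash, so two different signatures can occasionally share the same value. The sketch below, which assumes the cs column has been populated as above, lists any checksum that maps to more than one distinct signature; an empty result means no collisions in this workload:

```sql
-- Any row returned here is a checksum shared by several signatures,
-- whose durations would be wrongly merged in the later steps.
SELECT cs,
       COUNT(DISTINCT dbo.fn_SQLSigCLR(tsql_code)) AS distinct_sigs
FROM dbo.Workload
GROUP BY cs
HAVING COUNT(DISTINCT dbo.fn_SQLSigCLR(tsql_code)) > 1;
```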
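b. Aggregate the run time by each signature's checksum and populate the temporary table #AggQueries, including the percentage of total run time and a row number ordered by run time descending.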
IF OBJECT_ID('tempdb..#AggQueries') IS NOT NULL
DROP TABLE #AggQueries;
GO
SELECT cs, SUM(duration) AS total_duration,
100. * SUM(duration) / SUM(SUM(duration)) OVER() AS pct,
ROW_NUMBER() OVER(ORDER BY SUM(duration) DESC) AS rn
INTO #AggQueries
FROM dbo.Workload
GROUP BY cs;
CREATE CLUSTERED INDEX idx_cl_cs ON #AggQueries(cs);
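After aggregation the temporary table holds far fewer rows. Querying it shows the signature checksum, the total run time, the percentage of the overall run time, and the rank.
c. Filter and match: use the APPLY operator to retrieve the query signature and one sample query for each checksum.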
WITH RunningTotals AS
(
SELECT AQ1.cs,
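-- Note: trace files written by SQL Server 2005 and later store duration
-- in microseconds, so dividing by 1000. yields milliseconds; divide by
-- 1000000. instead if total_s should really be seconds.
CAST(AQ1.total_duration / 1000.
AS DECIMAL(12, 2)) AS total_s,
CAST(SUM(AQ2.total_duration) / 1000.
AS DECIMAL(12, 2)) AS running_total_s,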
CAST(AQ1.pct AS DECIMAL(12, 2)) AS pct,
CAST(SUM(AQ2.pct) AS DECIMAL(12, 2)) AS run_pct,
AQ1.rn
FROM #AggQueries AS AQ1
JOIN #AggQueries AS AQ2
ON AQ2.rn <= AQ1.rn
GROUP BY AQ1.cs, AQ1.total_duration, AQ1.pct, AQ1.rn
HAVING SUM(AQ2.pct) - AQ1.pct <= 90 -- percentage threshold
)
SELECT RT.rn, RT.pct, S.sig, S.tsql_code AS sample_query
FROM RunningTotals AS RT
CROSS APPLY
(SELECT TOP(1) tsql_code, dbo.fn_SQLSigCLR(tsql_code) AS sig
FROM dbo.Workload AS W
WHERE W.cs = RT.cs) AS S
ORDER BY RT.rn;
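4. With the query signatures, the sample queries, and the percentage and rank of the time they consume, you can start optimizing. In a similar way you can find the query patterns that produce large result sets or cause most of the I/O.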
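As a rough sketch of the I/O variant, the trace can be reloaded with the Reads and Writes columns and aggregated by signature checksum. The table name Workload_IO and the TOP (10) cutoff are assumptions made for illustration; the file path is the one used in 3.1, and dbo.fn_SQLSigCLR is assumed to be available in the current database:

```sql
-- Load the text plus the I/O counters into a separate, hypothetical table.
SELECT CAST(TextData AS NVARCHAR(MAX)) AS tsql_code, Reads, Writes
INTO dbo.Workload_IO
FROM sys.fn_trace_gettable('C:\test\performancetrace_20100802.trc', NULL) AS TT
WHERE TextData IS NOT NULL;

-- Rank the query patterns by total logical reads.
SELECT TOP (10)
       CHECKSUM(dbo.fn_SQLSigCLR(tsql_code)) AS cs,
       SUM(Reads)  AS total_reads,
       SUM(Writes) AS total_writes
FROM dbo.Workload_IO
GROUP BY CHECKSUM(dbo.fn_SQLSigCLR(tsql_code))
ORDER BY total_reads DESC;
```

The same shape works for RowCounts or CPU by swapping the aggregated column.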
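IV. Summary
Profiler is a handy tool for tracing system performance and workload. It lets you pinpoint the SQL that is actually worth optimizing, which makes the tuning work more efficient and saves a great deal of effort.
Originally published at: http://www.cnblogs.com/changbluesky/archive/2010/08/04/1791672.html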