Improving grouped-aggregation efficiency with GROUP BY GROUPING SETS

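A GROUP BY clause that uses GROUPING SETS can produce a result set equivalent to the UNION ALL of several simple GROUP BY queries.
Example (SQL Server 2008 R2):
Create a test database and table, then insert the test data.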

use master
CREATE DATABASE db_sales
go
use db_sales
go
CREATE TABLE [dbo].[tb_sale](
	[id] [int] IDENTITY(1,1) NOT NULL,
	[server] [nvarchar](50) NULL,
	[pname] [nvarchar](50) NULL,
	[pinpai] [nvarchar](50) NULL,
	[dates] [smalldatetime] NULL,
	[cnt] [int] NULL
) ON [PRIMARY]

go

-- Insert into the table created above (dbo.tb_sale in db_sales)
INSERT INTO [dbo].[tb_sale]([server],[pname],[pinpai],[dates],[cnt])
     VALUES('A','computer','hp','2012-01-01',1),
		   ('A','computer','hp','2012-01-02',3),
		   ('A','computer','hp','2012-01-03',5),
		   ('A','computer','hp','2012-01-04',1),
		   ('A','computer','hp','2012-01-05',3),
		   ('A','computer','hp','2012-01-06',5),
		   ('A','computer','dell','2012-01-01',2),
		   ('A','computer','dell','2012-01-02',4),
		   ('A','computer','dell','2012-01-03',6),
		   ('A','computer','dell','2012-01-04',7),
		   ('A','computer','dell','2012-01-05',2),
		   ('A','computer','dell','2012-01-06',4),
		   ('B','computer','hp','2012-01-01',3),
		   ('B','computer','hp','2012-01-02',3),
		   ('B','computer','hp','2012-01-03',3),
		   ('B','computer','hp','2012-01-04',3),
		   ('B','computer','hp','2012-01-05',3),
		   ('B','computer','hp','2012-01-06',2),
		   ('B','computer','dell','2012-01-01',2),
		   ('B','computer','dell','2012-01-02',2),
		   ('B','computer','dell','2012-01-03',2),
		   ('B','computer','dell','2012-01-04',2),
		   ('B','computer','dell','2012-01-05',1),
		   ('B','computer','dell','2012-01-06',1),
		   ('A','TV','hp','2012-01-01',1),
		   ('A','TV','hp','2012-01-02',3),
		   ('A','TV','hp','2012-01-03',5),
		   ('A','TV','hp','2012-01-04',1),
		   ('A','TV','hp','2012-01-05',3),
		   ('A','TV','hp','2012-01-06',5),
		   ('A','TV','dell','2012-01-01',2),
		   ('A','TV','dell','2012-01-02',4),
		   ('A','TV','dell','2012-01-03',6),
		   ('A','TV','dell','2012-01-04',7),
		   ('A','TV','dell','2012-01-05',2),
		   ('A','TV','dell','2012-01-06',4),
		   ('B','TV','hp','2012-01-01',3),
		   ('B','TV','hp','2012-01-02',3),
		   ('B','TV','hp','2012-01-03',3),
		   ('B','TV','hp','2012-01-04',3),
		   ('B','TV','hp','2012-01-05',3),
		   ('B','TV','hp','2012-01-06',2),
		   ('B','TV','dell','2012-01-01',2),
		   ('B','TV','dell','2012-01-02',2),
		   ('B','TV','dell','2012-01-03',2),
		   ('B','TV','dell','2012-01-04',2),
		   ('B','TV','dell','2012-01-05',1),
		   ('B','TV','dell','2012-01-06',1)
go
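Before running the aggregate queries, it may be worth confirming that the data loaded as expected. The check below is only a sketch and assumes the 48 rows listed above were inserted into dbo.tb_sale in db_sales; under that assumption it should report 48 rows and a total cnt of 140. The column aliases row_count and total_cnt are illustrative names, not part of the original schema.

```sql
USE db_sales;
GO
-- Sanity check on the test data: with the rows above, expect 48 rows and SUM(cnt) = 140.
SELECT COUNT(*) AS row_count,
       SUM(cnt) AS total_cnt
FROM dbo.tb_sale;
```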

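The task is to compute, in one result set: the sales for each day, the grand total, the total per salesperson, the total per product, the total per brand, the total per product and brand, and the total per salesperson broken down by product and brand.

Using plain UNION ALL statements: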

-- Column order in every branch: description, server, pname, pinpai, dates, total cnt
select N'Grand total' as description,null as [server],null as pname,null as pinpai,null as dates,SUM(cnt) as total_cnt from tb_sale
union all
select N'Daily sales',null,null,null,dates,SUM(cnt) from tb_sale group by dates
union all
select N'Total by salesperson',[server],null,null,null,SUM(cnt) from tb_sale group by [server]
union all
select N'Total by brand',null,null,pinpai,null,SUM(cnt) from tb_sale group by pinpai
union all
select N'Total by product',null,pname,null,null,SUM(cnt) from tb_sale group by pname
union all
select N'Total by product and brand',null,pname,pinpai,null,SUM(cnt) from tb_sale group by pname,pinpai
union all
select N'Total by salesperson, product and brand',[server],pname,pinpai,null,SUM(cnt) from tb_sale group by [server],pname,pinpai

Using GROUPING SETS and GROUPING_ID:

-- GROUPING_ID bit weights: [server]=8, pname=4, pinpai=2, dates=1.
-- A bit is 1 when that column is aggregated away in the current row.
select
[server] as salesperson,
pname as product,
pinpai as brand,
dates as sale_date,
SUM(cnt) as quantity,
(case
when GROUPING_ID([server],pname,pinpai,dates)=15 then N'Grand total'
when GROUPING_ID([server],pname,pinpai,dates)=14 then N'Daily sales'
when GROUPING_ID([server],pname,pinpai,dates)=13 then N'Total by brand'
when GROUPING_ID([server],pname,pinpai,dates)=11 then N'Total by product'
when GROUPING_ID([server],pname,pinpai,dates)=9 then N'Total by product and brand'
when GROUPING_ID([server],pname,pinpai,dates)=7 then N'Total by salesperson'
when GROUPING_ID([server],pname,pinpai,dates)=1 then N'Total by salesperson, product and brand'
end
)
as description
  from tb_sale
group by GROUPING sets(([server],pname,pinpai),(pname,pinpai),[server],pname,pinpai,dates,())
order by [server],pname,pinpai,dates
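The numbers compared against GROUPING_ID in the CASE expression can look arbitrary, so the sketch below shows where they come from. It assumes the same dbo.tb_sale table and the same grouping sets as the query above; the aliases g_server, g_pname, g_pinpai, g_dates, computed_id and gid are illustrative. GROUPING(col) returns 1 when the column is aggregated away in a row and 0 otherwise, and GROUPING_ID packs those bits in argument order, so for these four columns it equals 8, 4, 2 and 1 times the respective GROUPING values.

```sql
-- Decompose GROUPING_ID into its per-column GROUPING bits.
-- computed_id and gid should be identical on every row.
SELECT
    GROUPING([server]) AS g_server,   -- weight 8
    GROUPING(pname)    AS g_pname,    -- weight 4
    GROUPING(pinpai)   AS g_pinpai,   -- weight 2
    GROUPING(dates)    AS g_dates,    -- weight 1
    8 * GROUPING([server]) + 4 * GROUPING(pname)
      + 2 * GROUPING(pinpai) + GROUPING(dates)   AS computed_id,
    GROUPING_ID([server], pname, pinpai, dates)  AS gid,
    SUM(cnt)                                     AS total_cnt
FROM dbo.tb_sale
GROUP BY GROUPING SETS (([server], pname, pinpai), (pname, pinpai),
                        [server], pname, pinpai, dates, ())
ORDER BY gid;
```

For example, rows from the (pname, pinpai) set have [server] and dates aggregated away, so the bits are 1, 0, 0, 1 and the ID is 8 + 1 = 9, which is the value the CASE expression maps to the product-and-brand total.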


The UNION ALL approach tends to cost more I/O but less CPU and memory, whereas GROUPING SETS tends to cost less I/O but more CPU and memory.

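When several groupings are requested, GROUPING SETS reads all the required data from the table once, then performs the aggregations on that data in memory and produces the result. UNION ALL, by contrast, scans the table once per branch and concatenates the returned results. This is also why the two approaches return rows in a different order unless an explicit ORDER BY is applied.
Because of the way GROUPING SETS executes, the performance gain is most noticeable when the GROUP BY involves many columns.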
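One way to check these claims on your own data is to turn on SQL Server's I/O and timing statistics and run both forms side by side. The following is a minimal sketch, assuming the dbo.tb_sale table from above and using only two groupings to keep it short. With just 48 rows the difference will be negligible, so treat it as a measurement recipe rather than a benchmark.

```sql
SET STATISTICS IO ON;    -- reports logical reads and scan count per table
SET STATISTICS TIME ON;  -- reports CPU time and elapsed time per statement
GO

-- Variant 1: one branch (and one read of tb_sale) per grouping
SELECT [server], NULL AS pname, SUM(cnt) AS total_cnt
FROM dbo.tb_sale
GROUP BY [server]
UNION ALL
SELECT NULL, pname, SUM(cnt)
FROM dbo.tb_sale
GROUP BY pname;
GO

-- Variant 2: the same two groupings expressed as GROUPING SETS
SELECT [server], pname, SUM(cnt) AS total_cnt
FROM dbo.tb_sale
GROUP BY GROUPING SETS ([server], pname);
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

The figures appear in the Messages output; comparing the scan count and logical reads for tb_sale, along with the CPU time of each statement, shows how the trade-off plays out for a given table size and indexing.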


 


 

 

 

 
