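The pitfall of aggregating averages and ratios

City A: 100 orders per day in total (total_ord) across 20 vehicles (total_vid), so the daily average orders per vehicle (avg_ord) is 5.

City B: 300 orders per day in total across 30 vehicles, so the daily average orders per vehicle is 10.

Now compute the national figure (assume A and B are the only two cities).

Simply adding the two city averages and dividing by 2 gives (10+5)/2=7.5, which corresponds to avg(avg_ord) in SQL.

But the true value, computed from the totals, is (100+300)/(20+30)=8, which corresponds to sum(total_ord)/sum(total_vid) in SQL.

The two differ because avg(avg_ord) gives every city the same weight no matter how many vehicles it has. So whenever an average or a ratio is rolled up to a higher level, sum the numerators and denominators first and divide last.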
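The arithmetic above can be checked with a small Python sketch. The dictionary layout and variable names are illustrative assumptions; only the four numbers come from the example.

```python
# City-level figures from the example: (total orders, total vehicles).
cities = {"A": (100, 20), "B": (300, 30)}

# Per-city average orders per vehicle.
avg_ord = {c: orders / vids for c, (orders, vids) in cities.items()}

# Naive roll-up: every city counts equally, whatever its fleet size.
mean_of_averages = sum(avg_ord.values()) / len(avg_ord)

# Correct roll-up: pool numerators and denominators first, divide last.
total_ord = sum(orders for orders, _ in cities.values())
total_vid = sum(vids for _, vids in cities.values())
ratio_of_sums = total_ord / total_vid

# Weighting each city's average by its vehicle count recovers the pooled ratio.
weighted_mean = sum(avg_ord[c] * vids for c, (_, vids) in cities.items()) / total_vid

print(mean_of_averages, ratio_of_sums, weighted_mean)  # 7.5 8.0 8.0
```

The last line shows that an average of averages is only safe when each average is weighted by its own denominator.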

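Here is an example.

v_id is the vehicle id.

order_id is the order id.

Each vehicle can have multiple orders.

abnomal_flag (spelled this way in the table below) marks whether the order carries an abnormal tag; 1 means abnormal.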

create table avgtable(
city_id varchar(15) not null,
v_id int not null,
order_id int not null,
abnomal_flag int not null,
ptime datetime not null
);
insert into avgtable values('A',1,1001,0,'2019-07-01 10:00:00');
insert into avgtable values('A',1,1002,1,'2019-07-01 11:00:00');
insert into avgtable values('A',1,1003,1,'2019-07-01 12:00:00');
insert into avgtable values('A',1,1004,0,'2019-07-01 13:00:00');
insert into avgtable values('A',2,1005,0,'2019-07-01 10:00:00');
insert into avgtable values('A',2,1006,1,'2019-07-01 11:00:00');
insert into avgtable values('A',2,1007,1,'2019-07-01 12:00:00');
insert into avgtable values('B',3,1008,0,'2019-07-01 10:00:00');
insert into avgtable values('B',4,1009,1,'2019-07-01 11:00:00');
insert into avgtable values('B',4,1010,1,'2019-07-01 12:00:00');
insert into avgtable values('B',4,1011,0,'2019-07-01 13:00:00');
insert into avgtable values('B',5,1012,0,'2019-07-01 10:00:00');
insert into avgtable values('B',5,1013,1,'2019-07-01 11:00:00');
insert into avgtable values('B',5,1014,1,'2019-07-01 12:00:00');

Goal: compute the average number of orders per vehicle for each city on each day.

First approach:

Step one: count the total orders of each vehicle, per city and per day.

select 
	date_format(ptime,'%y-%m-%d') as pt
	,city_id
    ,v_id
    ,count(*) as total_ord
from avgtable
group by date_format(ptime,'%y-%m-%d'),city_id,v_id

Step two: use that result as a subquery and average total_ord over the vehicles of each city and day.

select 
	pt
    ,city_id
    ,avg(total_ord)
from
(
select 
	date_format(ptime,'%y-%m-%d') as pt
	,city_id
    ,v_id
    ,count(*) as total_ord
from avgtable
group by date_format(ptime,'%y-%m-%d'),city_id,v_id
) as day_vid_ord_cnt
group by pt,city_id;

Second approach: with count(distinct) the same result is obtained without nesting two queries.

select 
	date_format(ptime,'%y-%m-%d')
    ,city_id
    ,count(distinct order_id)
    ,count(distinct v_id)
    ,count(distinct order_id)/count(distinct v_id) as avg_ord_cnt
from avgtable
group by date_format(ptime,'%y-%m-%d'),city_id;
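The two approaches agree within a city because each order belongs to exactly one vehicle, so the mean of per-vehicle counts equals total orders divided by the number of vehicles. A minimal sketch to check this on the sample rows, under some assumptions: it uses Python's built-in sqlite3 instead of MySQL, drops ptime because every sample row falls on the same day, and multiplies by 1.0 because SQLite would otherwise do integer division.

```python
import sqlite3

# Sample rows from the article: (city_id, v_id, order_id, abnomal_flag).
rows = [
    ("A", 1, 1001, 0), ("A", 1, 1002, 1), ("A", 1, 1003, 1), ("A", 1, 1004, 0),
    ("A", 2, 1005, 0), ("A", 2, 1006, 1), ("A", 2, 1007, 1),
    ("B", 3, 1008, 0), ("B", 4, 1009, 1), ("B", 4, 1010, 1), ("B", 4, 1011, 0),
    ("B", 5, 1012, 0), ("B", 5, 1013, 1), ("B", 5, 1014, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table avgtable(city_id text, v_id int, order_id int, abnomal_flag int)"
)
conn.executemany("insert into avgtable values (?,?,?,?)", rows)

# First approach: per-vehicle counts in a subquery, then avg() per city.
nested = conn.execute("""
    select city_id, avg(total_ord)
    from (select city_id, v_id, count(*) as total_ord
          from avgtable group by city_id, v_id) as t
    group by city_id order by city_id
""").fetchall()

# Second approach: distinct orders divided by distinct vehicles, no nesting.
flat = conn.execute("""
    select city_id, 1.0 * count(distinct order_id) / count(distinct v_id)
    from avgtable group by city_id order by city_id
""").fetchall()

for (city, a), (_, b) in zip(nested, flat):
    print(city, round(a, 4), round(b, 4))
```

City A comes out at 3.5 (7 orders over 2 vehicles) and city B at about 2.3333 (7 orders over 3 vehicles) in both queries.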

Final requirement: use a with ... as statement to output the average orders per vehicle for each city plus the national figure.

with city_count as
(
select 
	date_format(ptime,'%y-%m-%d') as pt
    ,city_id
    ,count(distinct order_id) as c1
    ,count(distinct v_id) as c2
    ,count(distinct order_id)/count(distinct v_id) as avg_ord_cnt
from avgtable
group by date_format(ptime,'%y-%m-%d'),city_id
)
select pt,city_id,avg_ord_cnt from city_count
union all
select
	pt
    ,'0' as city_id -- '0' marks the national row
    ,sum(c1)/sum(c2) as avg_ord_cnt
from city_count
group by pt;

Note: for the national figure, do not simply take avg(avg_ord_cnt). The query below puts both calculations side by side.

select
	pt
    ,'0'
    ,sum(c1)
    ,sum(c2)
    ,avg(avg_ord_cnt)
    ,sum(c1)/sum(c2) as avg_ord_cnt
from 
(
select 
	date_format(ptime,'%y-%m-%d') as pt
    ,city_id
    ,count(distinct order_id) as c1
    ,count(distinct v_id) as c2
    ,count(distinct order_id)/count(distinct v_id) as avg_ord_cnt
from avgtable
group by date_format(ptime,'%y-%m-%d'),city_id
) as t
group by pt;

As you can see, the two results differ: avg(avg_ord_cnt) gives about 2.9, while sum(c1)/sum(c2) gives 2.8.
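The 2.9 versus 2.8 gap can be reproduced on the sample rows. This is a sketch under assumptions rather than the article's exact query: it runs on Python's built-in sqlite3 instead of MySQL, leaves out ptime since all rows share one day, and uses 1.0 * to avoid SQLite's integer division.

```python
import sqlite3

# Sample rows from the article: (city_id, v_id, order_id, abnomal_flag).
rows = [
    ("A", 1, 1001, 0), ("A", 1, 1002, 1), ("A", 1, 1003, 1), ("A", 1, 1004, 0),
    ("A", 2, 1005, 0), ("A", 2, 1006, 1), ("A", 2, 1007, 1),
    ("B", 3, 1008, 0), ("B", 4, 1009, 1), ("B", 4, 1010, 1), ("B", 4, 1011, 0),
    ("B", 5, 1012, 0), ("B", 5, 1013, 1), ("B", 5, 1014, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table avgtable(city_id text, v_id int, order_id int, abnomal_flag int)"
)
conn.executemany("insert into avgtable values (?,?,?,?)", rows)

# City level first (c1 orders, c2 vehicles), then two national roll-ups.
naive, pooled = conn.execute("""
    with city_count as (
        select city_id,
               count(distinct order_id) as c1,
               count(distinct v_id) as c2,
               1.0 * count(distinct order_id) / count(distinct v_id) as avg_ord_cnt
        from avgtable
        group by city_id
    )
    select avg(avg_ord_cnt),          -- average of city averages
           1.0 * sum(c1) / sum(c2)    -- pooled totals
    from city_count
""").fetchone()

print(round(naive, 1), round(pooled, 1))  # 2.9 2.8
```

The pooled value is 14 orders over 5 vehicles. The naive value averages 3.5 and about 2.33, so city A's two vehicles weigh as much as city B's three.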

Now a harder case: add columns that compute the average orders per vehicle using only orders whose abnomal_flag is 1.

select 
	date_format(ptime,'%y-%m-%d') as pt
    ,city_id
    ,count(distinct order_id) as c1
    ,count(distinct v_id) as c2
    ,count(distinct order_id)/count(distinct v_id) as avg_ord_cnt
    ,count(case when abnomal_flag = 1 then order_id else null end) as cc1
    ,count(distinct case when abnomal_flag = 1 then v_id else null end) as cc2
    ,count(case when abnomal_flag = 1 then order_id else null end)/count(distinct case when abnomal_flag = 1 then v_id else null end) as avg2
from avgtable
group by date_format(ptime,'%y-%m-%d'),city_id

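Note the idiom for counting values that must both satisfy abnomal_flag = 1 and be distinct: the case expression returns null for rows that do not match, and count ignores nulls, so count(distinct case when ... end) counts only the distinct matching vehicles. That is what makes count so convenient here.

After that, the with version is written the same way as above.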
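The conditional-count idiom can be tried in isolation. The sketch below makes the same assumptions as before: Python's built-in sqlite3 stands in for MySQL, ptime is omitted, and 1.0 * avoids SQLite's integer division. The `case` expressions have no `else` branch, which yields NULL just like the explicit `else null`.

```python
import sqlite3

# Sample rows from the article: (city_id, v_id, order_id, abnomal_flag).
rows = [
    ("A", 1, 1001, 0), ("A", 1, 1002, 1), ("A", 1, 1003, 1), ("A", 1, 1004, 0),
    ("A", 2, 1005, 0), ("A", 2, 1006, 1), ("A", 2, 1007, 1),
    ("B", 3, 1008, 0), ("B", 4, 1009, 1), ("B", 4, 1010, 1), ("B", 4, 1011, 0),
    ("B", 5, 1012, 0), ("B", 5, 1013, 1), ("B", 5, 1014, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table avgtable(city_id text, v_id int, order_id int, abnomal_flag int)"
)
conn.executemany("insert into avgtable values (?,?,?,?)", rows)

# CASE yields NULL for non-matching rows; COUNT skips NULLs, so the filter
# applies per column and the unfiltered counts can sit in the same query.
result = conn.execute("""
    select city_id,
           count(case when abnomal_flag = 1 then order_id end) as cc1,
           count(distinct case when abnomal_flag = 1 then v_id end) as cc2,
           count(distinct v_id) as c2,
           1.0 * count(case when abnomal_flag = 1 then order_id end)
               / count(distinct case when abnomal_flag = 1 then v_id end) as avg2
    from avgtable
    group by city_id
    order by city_id
""").fetchall()

print(result)  # [('A', 4, 2, 2, 2.0), ('B', 4, 2, 3, 2.0)]
```

In city B, vehicle 3 has no abnormal order, so cc2 is 2 while c2 is 3. A `where abnomal_flag = 1` filter would give the same cc1 and cc2 but would remove the rows needed for the unfiltered columns.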
