Hive advanced aggregation (supported since 0.10)
The advanced aggregation features are an enhanced form of GROUP BY.
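grouping sets
This keyword performs several GROUP BY operations over the same data set in one query. GROUPING SETS is in effect shorthand for multiple GROUP BY queries combined with UNION ALL, and it completes them in a single stage. If the GROUPING SETS clause contains the empty set (), that set stands for aggregation over the whole data set.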
select name, work_space[0] as main_place, count(employee_id) as emp_id_cnt
from employee
group by name, work_space[0] grouping sets((name, work_space[0]), name, ());
-- The statement above is equivalent to the statement below
select name, work_space[0] as main_place, count(employee_id) as emp_id_cnt
from employee
group by name, work_space[0]
UNION ALL
select name, NULL as main_place, count(employee_id) as emp_id_cnt
from employee
group by name
UNION ALL
select NULL as name, NULL as main_place, count(employee_id) as emp_id_cnt
from employee;
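rollup
ROLLUP aggregates level by level from left to right, so n columns yield n+1 grouping sets.
group by a,b,c with rollup <=> group by a,b,c grouping sets((a,b,c),(a,b),(a),())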
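As a rough illustration of the rollup equivalence, the sketch below applies it to the employee table from the grouping sets example. The subquery alias e and the reuse of main_place as a plain column name are assumptions made for readability, not something from the original notes; wrapping work_space[0] in a subquery simply keeps the grouping keys as plain columns.

```sql
-- Sketch only: reuses the employee table from the grouping sets example.
-- The subquery alias e and the column alias main_place are illustrative.
SELECT name, main_place, count(employee_id) AS emp_id_cnt
FROM (
  SELECT name, work_space[0] AS main_place, employee_id
  FROM employee
) e
GROUP BY name, main_place WITH ROLLUP;

-- Written out as explicit grouping sets, this should return the same rows:
SELECT name, main_place, count(employee_id) AS emp_id_cnt
FROM (
  SELECT name, work_space[0] AS main_place, employee_id
  FROM employee
) e
GROUP BY name, main_place
GROUPING SETS ((name, main_place), (name), ());
```

Rows from the coarser levels carry NULL in the columns that were rolled up: the per-name subtotals have a NULL main_place, and the grand total row has NULL in both columns.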
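cube
CUBE aggregates over every possible combination of the listed columns, so n columns yield 2^n grouping sets.
group by a,b,c with cube <=> group by a,b,c grouping sets((a,b,c),(a,b),(a,c),(b,c),(a),(b),(c),())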
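The cube equivalence can be sketched the same way with two grouping columns, which gives 2^2 = 4 grouping sets. As before, the employee table comes from the grouping sets example, while the subquery alias e and the main_place column name are illustrative assumptions.

```sql
-- Sketch only: two grouping columns, so CUBE expands to four grouping sets.
SELECT name, main_place, count(employee_id) AS emp_id_cnt
FROM (
  SELECT name, work_space[0] AS main_place, employee_id
  FROM employee
) e
GROUP BY name, main_place WITH CUBE;

-- Written out as explicit grouping sets, this should return the same rows:
SELECT name, main_place, count(employee_id) AS emp_id_cnt
FROM (
  SELECT name, work_space[0] AS main_place, employee_id
  FROM employee
) e
GROUP BY name, main_place
GROUPING SETS ((name, main_place), (name), (main_place), ());
```

Compared with a rollup over the same columns, the only extra grouping set here is (main_place), which gives per-place counts across all names.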
Reference: https://www.jianshu.com/p/9502e1d58f45