User visit table (visit_table)
user_id (user ID) | url (visited URL) |
1 | A |
1 | B |
2 | C |
2 | A |
1 | A |
SQL query: count the users who visited A and also visited B.
Implementation 1:
with user_visit as (
    select 1 as user_id, 'A' as url
    union all select 1 as user_id, 'A' as url
    union all select 1 as user_id, 'B' as url
    union all select 1 as user_id, 'C' as url
    union all select 2 as user_id, 'B' as url
    union all select 2 as user_id, 'B' as url
    union all select 3 as user_id, 'A' as url
    union all select 4 as user_id, 'A' as url
    union all select 4 as user_id, 'B' as url
    union all select 5 as user_id, 'C' as url
    union all select 1 as user_id, 'A' as url
)
-- Number of users who visited both page A and page B
select count(user_id)
from (
    -- Keep only A/B visits and collect each user's distinct URLs
    select user_id, collect_set(url) as url_set
    from user_visit
    where url = 'A' or url = 'B'
    group by user_id
) a
-- A set of size 2 means the user visited both A and B
where size(url_set) = 2
Implementation 2:
with user_visit as (
    select 1 as user_id, 'A' as url
    union all select 1 as user_id, 'A' as url
    union all select 1 as user_id, 'B' as url
    union all select 1 as user_id, 'C' as url
    union all select 2 as user_id, 'B' as url
    union all select 2 as user_id, 'B' as url
    union all select 3 as user_id, 'A' as url
    union all select 4 as user_id, 'A' as url
    union all select 4 as user_id, 'B' as url
    union all select 5 as user_id, 'C' as url
    union all select 1 as user_id, 'A' as url
)
-- Number of users who visited both page A and page B
select count(a.user_id)
from (
    -- Distinct users who visited A
    select distinct user_id, url
    from user_visit
    where url = 'A'
) a
join (
    -- Distinct users who visited B
    select distinct user_id, url
    from user_visit
    where url = 'B'
) b
-- Join on user_id so that only users present on both sides are counted.
-- Without this condition the join is a Cartesian product and counts every
-- (A visitor, B visitor) pair, so the
-- hive.strict.checks.cartesian.product=false setting is no longer needed.
on a.user_id = b.user_id
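A third variant, offered only as a sketch and not one of the original implementations: filter to A and B, count the distinct URLs per user, and keep the users whose count is 2. The alias url_cnt is a name introduced here for illustration, and the sample data is the same as in the two implementations. It should run on Hive as written, though this has not been checked against a specific Hive version.

```sql
with user_visit as (
    select 1 as user_id, 'A' as url
    union all select 1 as user_id, 'A' as url
    union all select 1 as user_id, 'B' as url
    union all select 1 as user_id, 'C' as url
    union all select 2 as user_id, 'B' as url
    union all select 2 as user_id, 'B' as url
    union all select 3 as user_id, 'A' as url
    union all select 4 as user_id, 'A' as url
    union all select 4 as user_id, 'B' as url
    union all select 5 as user_id, 'C' as url
    union all select 1 as user_id, 'A' as url
)
-- Number of users who visited both page A and page B
select count(user_id)
from (
    -- url_cnt (hypothetical alias): number of distinct pages among A and B
    -- that each user visited; it can only be 1 or 2 after the filter
    select user_id, count(distinct url) as url_cnt
    from user_visit
    where url in ('A', 'B')
    group by user_id
) t
where url_cnt = 2
```

On this sample data only users 1 and 4 visited both pages, so all three queries should return 2. Compared with the join, this form scans user_visit once and extends to more pages by widening the in list and raising the expected count.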