需求:求出每个域名的当月点击总量和累计点击总量
样本数据:
domain | time | traffic(T) |
---|---|---|
gifshow.com | 2019/01/01 | 5 |
yy.com | 2019/01/01 | 4 |
huya.com | 2019/01/01 | 1 |
gifshow.com | 2019/01/20 | 6 |
gifshow.com | 2019/02/01 | 8 |
yy.com | 2019/01/20 | 5 |
gifshow.com | 2019/02/02 | 7 |
解答思路1:
- 先求出每个域名的每月点击总量和。
- 将上一步生成的结果表进行自关联,做笛卡尔积。
- 从上一步的结果中,进行分组查询,分组的字段是a.domain a.month
求月累计值: 将b.month <= a.month的所有b.salary求和即是月累计值。
select A.domain, A.month, max(A.traffics) as traffics, sum(B.traffics) as totals from
(select t.domain, t.month, cast(sum(t.traffic) as Int) as traffics from (select domain, concat(substr(time,0,4),substr(time,6,2)) as month, traffic from user_flow)t group by domain, month) A
inner join
(select t.domain, t.month,cast(sum(t.traffic) as Int) as traffics from (select domain, concat(substr(time,0,4),substr(time,6,2)) as month, traffic from user_flow)t group by domain, month) B
on A.domain=B.domain
where B.month <= A.month
group by A.domain,A.month
order by A.domain,A.month;
解答思路2:(一般采用这种方式)
select t.domain,t.month,sum(t.traffic) as traffics,SUM(sum(t.traffic)) OVER (PARTITION BY domain ORDER BY month) as totals
from (select domain, from_unixtime(unix_timestamp(time,'yyyy/MM/dd'),'yyyy-MM') as month, traffic from user_flow) t
group by domain, month;
输出:
domain | time | traffics | totals |
---|---|---|---|
gifshow.com | 2019-01 | 11 | 11 |
gifshow.com | 2019-02 | 15 | 26 |
yy.com | 2019-01 | 9 | 9 |
huya.com | 2019-01 | 1 | 1 |
本文链接:https://my.lmcjl.com/post/15010.html
展开阅读全文
4 评论