I want a query to list the missing dates between two dates.
My data:
TABLE ORDER
DATE_order | AMOUNT
01/01/2020 | 500
01/01/2020 | 600
03/01/2020 | 100
05/01/2020 | 300
I would like the query to return:
01/01/2020 | 1100
02/01/2020 | 0
03/01/2020 | 100
04/01/2020 | 0
05/01/2020 | 300
I am using a Cassandra database with the Apache Hive connector.
Can anyone help me?
You can use lateral view with posexplode to generate the missing rows:
with your_data as ( -- sample data; replace this CTE with your own table
    -- dates are yyyy-MM-dd strings, the format datediff and date_sub expect
    select stack(4,
        '2020-01-01', 500,
        '2020-01-01', 600,
        '2020-01-03', 100,
        '2020-01-05', 300
    ) as (date_order, amount)
)
select date_sub(s.date_order, nvl(d.i, 0)) as date_order,
       case when d.i > 0 then 0 else s.amount end as amount
from
( -- find the previous date and the gap to it in days
    select date_order, amount,
           lag(date_order) over (order by date_order) as prev_date,
           datediff(date_order, lag(date_order) over (order by date_order)) as datdiff
    from
    ( -- aggregate amounts per date
        select date_order, sum(amount) as amount from your_data group by date_order
    ) a
) s
-- generate rows: i = 0 is the existing date, i > 0 are the missing dates before it
lateral view outer posexplode(split(space(s.datdiff - 1), ' ')) d as i, x
order by date_order;
Result:
date_order amount
2020-01-01 1100
2020-01-02 0
2020-01-03 100
2020-01-04 0
2020-01-05 300
Time taken: 10.04 seconds, Fetched: 5 row(s)
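To see why the row generation works, the expression can be run on its own. The query below is a minimal sketch, assuming a Hive session; the inline subquery and the column name n are made up for the demo, with n standing in for datdiff. space(n - 1) builds a string of n - 1 spaces, and splitting it on a single space yields n empty strings, so posexplode emits one row per position from 0 to n - 1.

```sql
-- Hypothetical demo: n plays the role of datdiff in the query above.
-- space(n - 1) gives n - 1 spaces; split on ' ' gives n empty strings;
-- posexplode returns each element with its position i (0-based).
select t.n, d.i
from (select 3 as n) t
lateral view posexplode(split(space(t.n - 1), ' ')) d as i, x
order by i;
```

With n = 3 this should return the positions 0, 1 and 2, the same way a gap of 2 days produced two rows per date in the result above. In the full query, date_sub subtracts i days from the later date, so position 0 keeps the existing row and the higher positions become the zero-amount filler rows. The outer keyword matters for the first date, whose datdiff is NULL: without it, that row would be dropped.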