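Is it possible to write a SQL query that collapses rows with start and end dates into rows with contiguous start and end dates?
The constraint is that it must be plain SQL: no CTEs, loops, or third-party tools. Only a SQL statement that begins with SELECT is allowed.
e.g.: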
ID, StartDate, EndDate
1001, Jan-1-2018, Jan-04-2018
1002, Jan-5-2018, Jan-13-2018
1003, Jan-14-2018, Jan-18-2018
1004, Jan-25-2018, Feb-05-2018
The desired output must be:
Jan-1-2018, Jan-18-2018
Jan-25-2018, Feb-05-2018
Thanks
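You can use window functions together with the gaps-and-islands concept. In your case, the contiguous dates are the islands, and the gaps are self-explanatory.
I have written the answer below in a verbose way to make clear what the query is doing, but it could most likely be written differently and more concisely. Please see the comments in the query, which explain what each step (subquery) does.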
--Determine final output
select min(c.StartDate) as StartDate
     , max(c.EndDate) as EndDate
from (
    --Assign a number to each group of contiguous records
    select b.ID
         , b.StartDate
         , b.EndDate
         , b.EndDatePrev
         , b.IslandBegin
         , sum(b.IslandBegin) over (order by b.ID asc) as IslandNbr
    from (
        --Determine if it's contiguous (IslandBegin = 1 means it's not contiguous with the previous record)
        select a.ID
             , a.StartDate
             , a.EndDate
             , a.EndDatePrev
             , case when a.EndDatePrev is NULL then 1
                    when datediff(d, a.EndDatePrev, a.StartDate) > 1 then 1
                    else 0
               end as IslandBegin
        from (
            --Determine previous end date
            select tt.ID
                 , tt.StartDate
                 , tt.EndDate
                 , lag(tt.EndDate, 1, NULL) over (order by tt.ID asc) as EndDatePrev
            from dbo.Table_Name as tt
        ) as a
    ) as b
) as c
group by c.IslandNbr
order by c.IslandNbr
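The same idea can be written more compactly. Below is a minimal sketch, assuming SQL Server 2012 or later (needed for lag and the rows frame). The sample rows from the question are inlined with a values list instead of reading dbo.Table_Name, and rows are ordered by StartDate rather than ID, which is my own assumption about what is safer when IDs are not guaranteed to follow date order. It also assumes that no period is nested entirely inside an earlier one.

```sql
-- Sketch only: sample rows from the question are inlined, so no table is required.
select min(g.StartDate) as StartDate
     , max(g.EndDate)   as EndDate
from (
    -- Running total of island starts gives every island its own number
    select f.StartDate
         , f.EndDate
         , sum(f.IslandBegin) over (order by f.StartDate
                                    rows unbounded preceding) as IslandNbr
    from (
        -- IslandBegin = 0 only when the row starts at most one day after the
        -- previous row ends; the first row (lag is NULL) falls through to 1
        select t.StartDate
             , t.EndDate
             , case when datediff(day,
                                  lag(t.EndDate) over (order by t.StartDate),
                                  t.StartDate) <= 1
                    then 0 else 1
               end as IslandBegin
        from (values (1001, cast('20180101' as date), cast('20180104' as date))
                   , (1002, cast('20180105' as date), cast('20180113' as date))
                   , (1003, cast('20180114' as date), cast('20180118' as date))
                   , (1004, cast('20180125' as date), cast('20180205' as date))
             ) as t (ID, StartDate, EndDate)
    ) as f
) as g
group by g.IslandNbr
order by min(g.StartDate);
```

On the sample data this should return the two rows asked for in the question (Jan 1 to Jan 18, and Jan 25 to Feb 5).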
I hope the following SQL query helps you identify the gaps and the covered dates for the given case.
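I did not use a CTE expression such as a dates table function. Instead, I used the numbers in master..spt_values to generate a dates table, which serves as the main table for a LEFT join. You can create your own numbers table or dates table if this does not fit your requirements.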
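If relying on master..spt_values is not an option (it is an undocumented system table), one possible alternative that still begins with SELECT is to build the numbers inline. This is a sketch, assuming SQL Server 2008 or later for the values list; the aliases ones, tens, and n are made up for illustration.

```sql
-- Hypothetical replacement for the master..spt_values subquery:
-- builds the numbers 0..99 from two digit lists, then turns them into dates.
select dateadd(day, n.number, '20180101') as [date]
from (
    select tens.d * 10 + ones.d as number
    from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) as ones (d)
    cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) as tens (d)
) as n
where dateadd(day, n.number, '20180101') <= '20180228'
order by [date];
```

Two digit lists cover 100 days, which is enough for the January to February range used here; a longer range would need a third list or a permanent numbers table.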
In the query, to capture the changes at the boundaries, I used the SQL LAG() function, which lets me compare a column with its previous value in a sorted list.
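To illustrate the LAG() comparison on its own, here is a small sketch with a few days picked around the gap in the question's data (the days in between are omitted for brevity). The flag values and the alias s are written by hand for illustration; in the full query they come from the exists check.

```sql
-- Hand-written flags: exist = 1 when the day is covered by some period.
select s.[date]
     , s.exist
     , lag(s.exist, 1, 0) over (order by s.[date]) as prev_exist
     , case when s.exist <> lag(s.exist, 1, 0) over (order by s.[date])
            then 1 else 0
       end as changed
from (values (cast('20180117' as date), 1)
           , (cast('20180118' as date), 1)
           , (cast('20180119' as date), 0)
           , (cast('20180124' as date), 0)
           , (cast('20180125' as date), 1)
     ) as s ([date], exist)
order by s.[date];
```

Here changed should be 1 on Jan 17 (the default previous value is 0), on Jan 19 (coverage stops, so the period ended the day before), and on Jan 25 (coverage resumes). Those are the rows the outer steps keep and pair up.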
select
    max(startdate) as startdate,
    max(enddate) as enddate
from (
    -- Pair each start row with the end row that follows it: rows 1-2 get rn 1, rows 3-4 get rn 2, ...
    select
        date,
        case when exist = 1 then date else null end as startdate,
        case when exist = 0 then dateadd(d, -1, date) else null end as enddate,
        (row_number() over (order by date) + 1) / 2 as rn
    from (
        -- Flag the dates where coverage switches on or off (default 0 means "not covered" before the first date)
        select date, exist,
               case when exist <> lag(exist, 1, 0) over (order by date) then 1 else 0 end as changed
        from (
            -- exist = 1 when the date falls inside any period
            select
                d.date,
                case when exists (select * from Periods where d.date between startdate and enddate) then 1 else 0 end as exist
            from (
                -- Calendar of dates from 2018-01-01 through 2018-02-28
                select dateadd(dd, number, '20180101') as date
                from master..spt_values
                where Type = 'P' and dateadd(dd, number, '20180101') <= '20180228'
            ) d
        ) cte
    ) tbl
    where changed = 1
) dates
group by rn
This is the result