I have a database containing ids and time intervals, for example:
timeid | start | end |
---|---|---|
1 | 0 | 10 |
1 | 2 | 13 |
1 | 11 | 21 |
1 | 15 | 30 |
2 | 0 | 10 |
2 | 2 | 13 |
2 | 11 | 21 |
2 | 15 | 30 |
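I want to write a query that selects non-overlapping intervals. Rows with different timeid values are treated as non-overlapping even if their [start, end] ranges overlap. So the expected output for the example above is
timeid | start | end |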
---|---|---|
1 | 0 | 10 |
1 | 11 | 21 |
2 | 0 | 10 |
2 | 11 | 21 |
Another possible output is
timeid | start | end |
---|---|---|
1 | 0 | 10 |
1 | 15 | 30 |
2 | 0 | 10 |
2 | 15 | 30 |
or
timeid | start | end |
---|---|---|
1 | 2 | 13 |
1 | 15 | 30 |
2 | 0 | 10 |
2 | 15 | 30 |
or
timeid | start | end |
---|---|---|
1 | 2 | 13 |
1 | 15 | 30 |
2 | 2 | 13 |
2 | 15 | 30 |
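What matters is that the output contains as many non-overlapping intervals as possible.
So basically there are two conditions: no two selected rows with the same timeid overlap, and no remaining row could be added without overlapping a selected one.
So, for example, the following output is not acceptable:
timeid | start | end |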
---|---|---|
1 | 2 | 13 |
1 | 15 | 30 |
2 | 2 | 13 |
since I could add the row
timeid | start | end |
---|---|---|
2 | 15 | 30 |
and it still would not overlap with any of the three rows in the output.
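The acceptance rule described above (no overlap within a timeid, and no leftover row that could still be added) can be checked mechanically. The following is only a minimal sketch in Python, not part of the original question: the names `overlaps`, `is_acceptable` and `ALL_ROWS` are made up for illustration, and it assumes closed intervals, so two intervals that merely touch at an endpoint count as overlapping.

```python
from collections import defaultdict

# The eight rows of the example table: (timeid, start, end).
ALL_ROWS = [
    (1, 0, 10), (1, 2, 13), (1, 11, 21), (1, 15, 30),
    (2, 0, 10), (2, 2, 13), (2, 11, 21), (2, 15, 30),
]


def overlaps(a, b):
    """Assumption: intervals are closed, so touching endpoints overlap."""
    return not (a[0] > b[1] or b[0] > a[1])


def is_acceptable(all_rows, chosen):
    """True if `chosen` has no overlap within a timeid and no row of
    `all_rows` could still be added without creating one."""
    by_id = defaultdict(list)
    for timeid, start, end in chosen:
        by_id[timeid].append((start, end))

    # Condition 1: chosen intervals sharing a timeid must not overlap.
    for intervals in by_id.values():
        for i, a in enumerate(intervals):
            for b in intervals[i + 1:]:
                if overlaps(a, b):
                    return False

    # Condition 2: every unchosen row must collide with a chosen row
    # of the same timeid; otherwise it could still be added.
    chosen_set = set(chosen)
    for timeid, start, end in all_rows:
        if (timeid, start, end) in chosen_set:
            continue
        if not any(overlaps((start, end), c) for c in by_id[timeid]):
            return False
    return True


print(is_acceptable(ALL_ROWS, [(1, 0, 10), (1, 11, 21), (2, 0, 10), (2, 11, 21)]))
print(is_acceptable(ALL_ROWS, [(1, 2, 13), (1, 15, 30), (2, 2, 13)]))
```

The first call checks the first expected output, the second the rejected three-row output, which fails because (2, 15, 30) could still be added.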
Sample database
CREATE TABLE "test" (
"timeid" INTEGER,
"start" INTEGER,
"end" INTEGER
);
INSERT INTO test VALUES(1, 0, 10);
INSERT INTO test VALUES(1, 2, 13);
INSERT INTO test VALUES(1, 3, 15);
INSERT INTO test VALUES(1, 11, 21);
INSERT INTO test VALUES(1, 15, 30);
INSERT INTO test VALUES(2, 0, 10);
INSERT INTO test VALUES(2, 2, 13);
INSERT INTO test VALUES(2, 11, 21);
INSERT INTO test VALUES(2, 15, 30);
What finally gave me some hope was
with recursive nol(id, start, end) AS (
select test.timeid, test.start, test.end from test inner join
( select timeid as ttid, min(start) as ttstart from test group by timeid) as tt on
tt.ttid = test.timeid and tt.ttstart = test.start
union all
select test.timeid , test.start, test.end from test inner join nol
on test.timeid = nol.id and test.start > nol.end
GROUP By test.timeid
Having min(test.start)
)
select * from nol
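Assumptions and problem
In the code above (and in my real case), you may notice that I assume the combination of timeid and start is unique, i.e. no two rows have the same timeid and start.
The problem with the code above is that it is not supported. I get the error (Result: recursive aggregate queries not supported).
Could you help me find an alternative to using an aggregate inside the recursion?
Thanks in advance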
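One possible way around the restriction, sketched here under assumptions rather than offered as a definitive fix: keep the recursive SELECT itself free of GROUP BY and aggregates, and move the "earliest next start" choice into a correlated scalar subquery that uses ORDER BY ... LIMIT 1. The sketch drives SQLite through Python's `sqlite3` module so it can be run as is. It relies on the stated assumption that (timeid, start) is unique, quotes `start` and `end` to keep the keywords unambiguous, and uses the made-up names `ROWS`, `QUERY` and `build_sample_db`.

```python
import sqlite3

# Rows of the sample database from the question: (timeid, start, end).
ROWS = [
    (1, 0, 10), (1, 2, 13), (1, 3, 15), (1, 11, 21), (1, 15, 30),
    (2, 0, 10), (2, 2, 13), (2, 11, 21), (2, 15, 30),
]

QUERY = '''
WITH RECURSIVE nol(id, "start", "end") AS (
    -- Seed: the earliest-starting row of each timeid. An aggregate is
    -- fine here because this is not the recursive part.
    SELECT t.timeid, t."start", t."end"
    FROM test AS t
    WHERE t."start" = (SELECT MIN(s."start") FROM test AS s
                       WHERE s.timeid = t.timeid)
    UNION ALL
    -- Step: the row whose start is the smallest one strictly after the
    -- previous end. No GROUP BY or aggregate in this SELECT; the choice
    -- is made by a correlated subquery with ORDER BY ... LIMIT 1.
    SELECT t.timeid, t."start", t."end"
    FROM nol
    JOIN test AS t ON t.timeid = nol.id
    WHERE t."start" = (SELECT s."start" FROM test AS s
                       WHERE s.timeid = nol.id AND s."start" > nol."end"
                       ORDER BY s."start" LIMIT 1)
)
SELECT id, "start", "end" FROM nol ORDER BY id, "start"
'''


def build_sample_db():
    """Create an in-memory copy of the question's sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE test ("timeid" INTEGER, "start" INTEGER, "end" INTEGER)'
    )
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", ROWS)
    return conn


rows = build_sample_db().execute(QUERY).fetchall()
print(rows)
```

When the subquery finds no later row it yields NULL, the comparison is not true, and the recursion stops for that timeid. On the sample data this should select (0, 10) and (11, 21) for each timeid, which is the first expected output. Taking the earliest start each time guarantees that no skipped row could still be added, but not necessarily the largest possible number of rows.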
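You could try preparing the data in a way that lets you identify possible overlaps, flag them, and filter the result accordingly.
With your sample data, the code's result matches one of your expected outputs. Note that the code below uses a table named tbl with columns starts and ends rather than test, start and end: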
-- prepare your table data for overlap testings
WITH
grid AS
( Select t.timeid, t.starts, t.ends,
( Select Max(starts) From tbl Where timeid = t.timeid And starts < t.starts ) as last_start,
( Select Max(ends) From tbl Where timeid = t.timeid And starts < t.starts ) as last_end,
t2.starts as starts_2, t2.ends as ends_2
From tbl t
Left Join tbl t2 ON( t2.timeid = t.timeid And
t2.starts > t.ends )
),
-- overlaps and flags - final preparation for main SQL
overlaps AS
( SELECT DISTINCT
g.timeid, g.starts, g.ends,
g2.starts as starts_overlap, g2.ends as ends_overlap,
Case When g2.starts < LAG(g2.ends) Over(Partition By g2.timeid Order By g2.starts)
Then 'OUT'
Else 'IN'
End as flag
FROM tbl t
INNER JOIN ( Select *
From grid
Where last_start Is Null OR starts > last_end
) g ON( ( g.timeid = t.timeid And g.starts = t.starts And g.ends = t.ends )
OR
( g.timeid = t.timeid And g.starts_2 = t.starts And g.ends_2 = t.ends )
)
LEFT JOIN grid g2 ON( g2.timeid = t.timeid And g2.starts > t.starts And g2.starts < t.ends )
ORDER BY t.timeid, t.starts
)
-- main SQL:
Select t.*
From overlaps o
Inner Join tbl t ON( t.timeid = o.timeid And
t.starts = Coalesce(o.starts_overlap, o.starts) And
t.ends = Coalesce(o.ends_overlap, o.ends) )
Where ( o.starts_overlap Is Null OR o.starts_overlap > o.ends ) And o.flag = 'IN'
Order By o.timeid, o.starts
Result:
timeid | start | end |
---|---|---|
1 | 0 | 10 |
1 | 15 | 30 |
2 | 0 | 10 |
2 | 15 | 30 |
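The result above is one acceptable selection. If the goal is read strictly as the largest possible number of intervals per timeid, a common cross-check outside SQL is the greedy rule from interval scheduling: sort by end and keep every interval that starts after the last kept end. The following is a minimal Python sketch, not part of the answer's query; the names `SAMPLE_ROWS` and `select_max_non_overlapping` are hypothetical, and touching endpoints are assumed to overlap.

```python
from itertools import groupby

# Rows of the question's sample database: (timeid, start, end).
SAMPLE_ROWS = [
    (1, 0, 10), (1, 2, 13), (1, 3, 15), (1, 11, 21), (1, 15, 30),
    (2, 0, 10), (2, 2, 13), (2, 11, 21), (2, 15, 30),
]


def select_max_non_overlapping(rows):
    """Greedy earliest-end selection, done independently per timeid."""
    chosen = []
    # Sort by timeid first so groupby sees each id as one run,
    # then by end so the greedy rule can scan left to right.
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for timeid, group in groupby(ordered, key=lambda r: r[0]):
        last_end = None
        for _, start, end in group:
            # Keep the interval only if it starts strictly after the
            # previously kept one ends (assumed closed intervals).
            if last_end is None or start > last_end:
                chosen.append((timeid, start, end))
                last_end = end
    return chosen


selected = select_max_non_overlapping(SAMPLE_ROWS)
print(selected)
```

On the sample rows this keeps two intervals per timeid, the same count as the result table above, so for this data the query's output is also a maximum-size selection.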