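I want to fill all null values in a query with the last known value. When the data is in a table rather than in a query, this is easy:
If I define and populate a table as follows: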
CREATE TABLE test_fill_null (
date INTEGER,
value INTEGER
);
INSERT INTO test_fill_null VALUES
(1,2),
(2, NULL),
(3, 45),
(4,NULL),
(5, null);
SELECT * FROM test_fill_null ;
date | value
------+-------
1 | 2
2 |
3 | 45
4 |
5 |
Then I just have to fill it in like this:
UPDATE test_fill_null t1
SET value = (
SELECT t2.value
FROM test_fill_null t2
WHERE t2.date <= t1.date AND value IS NOT NULL
ORDER BY t2.date DESC
LIMIT 1
);
SELECT * FROM test_fill_null;
date | value
------+-------
1 | 2
2 | 2
3 | 45
4 | 45
5 | 45
But now I have a query like this:
WITH
pre_table AS(
SELECT
id1,
id2,
tms,
CASE
WHEN tms - lag(tms) over w < interval '5 minutes' THEN NULL
ELSE id2
END as group_id
FROM
table0
window w as (partition by id1 order by tms)
)
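group_id is set to id2 when the previous point is more than 5 minutes away, and to null otherwise. By doing this, I hope to end up with groups of points spaced less than 5 minutes apart, with gaps of more than 5 minutes between groups.
Then I don't know how to proceed. I tried: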
SELECT distinct on (id1, id2)
t0.id1,
t0.id2,
t0.tms,
t1.group_id
FROM
pre_table t0
LEFT JOIN (
select
id1,
tms,
group_id
from pre_table t2
where t2.group_id is not null
order by tms desc
) t1
ON
t1.tms <= t0.tms AND
t1.id1 = t0.id1
WHERE
t0.id1 IS NOT NULL
ORDER BY
id1,
id2,
t1.tms DESC
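But in the final result, some groups contain two consecutive points that are more than 5 minutes apart. In that case, they should be two different groups.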
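A "select within a select" is more commonly called a "subselect" or "subquery". In your particular case, it is a "correlated subquery". LATERAL joins can largely replace correlated subqueries with more flexible solutions.
For your first case, this query is probably faster and simpler: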
SELECT date, max(value) OVER (PARTITION BY grp) AS value
FROM (
SELECT *, count(value) OVER (ORDER BY date) AS grp
FROM test_fill_null
) sub;
count() only counts non-null values, so grp increments with each non-null value, forming groups as desired. Picking the one non-null value per grp in the outer SELECT is then trivial.
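To make the grouping step concrete, here is a minimal sketch that runs the same query against the sample data and also exposes the intermediate grp column. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained; it needs an SQLite build with window-function support, 3.25 or later). The names FILL_SQL and filled are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_fill_null (date INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO test_fill_null VALUES (?, ?)",
    [(1, 2), (2, None), (3, 45), (4, None), (5, None)],
)

# Inner query: count(value) skips NULLs, so the running count only grows on
# rows that carry a value. Outer query: every grp contains exactly one
# non-null value, which max() picks up for all rows of that grp.
FILL_SQL = """
SELECT date, grp, max(value) OVER (PARTITION BY grp) AS value
FROM (
    SELECT *, count(value) OVER (ORDER BY date) AS grp
    FROM test_fill_null
) sub
ORDER BY date
"""

filled = conn.execute(FILL_SQL).fetchall()
for row in filled:
    print(row)  # (date, grp, value)
```

Rows 1 and 2 share grp 1 and rows 3 to 5 share grp 2, which is why the NULLs inherit 2 and 45 respectively. A leading NULL (before any known value) would land in grp 0 and stay NULL.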
For your second case, I'll assume the initial order of rows is determined by (id1, id2, tms), as indicated by one of your queries.
SELECT id1, id2, tms
, count(step) OVER (ORDER BY id1, id2, tms) AS group_id
FROM (
SELECT *, CASE WHEN lag(tms, 1, '-infinity') OVER (PARTITION BY id1 ORDER BY id2, tms)
< tms - interval '5 min' THEN true END AS step -- NULL when there is no gap, so count(step) skips the row
FROM table0
) sub
ORDER BY id1, id2, tms;
Adapt to your actual sort order. One of these might cover it:
PARTITION BY id1 ORDER BY id2 -- ignore tms
PARTITION BY id1 ORDER BY tms -- ignore id2
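The gap-based grouping can be tried out on a few made-up rows. The sketch below again uses Python's sqlite3 module in place of PostgreSQL (an assumption; SQLite 3.25 or later is needed for window functions). Since SQLite has no interval type, tms is stored as integer seconds and 300 stands for 5 minutes; a large negative number plays the role of '-infinity'. It uses the "ignore id2" ordering. The sample rows and the names GROUP_SQL and groups are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# tms is an integer number of seconds here (assumption; the original uses timestamps).
conn.execute("CREATE TABLE table0 (id1 INTEGER, id2 INTEGER, tms INTEGER)")
conn.executemany(
    "INSERT INTO table0 VALUES (?, ?, ?)",
    [
        (1, 10, 0),
        (1, 11, 60),
        (1, 12, 120),
        (1, 13, 1000),  # more than 300 s after the previous row: new group
        (1, 14, 1100),
        (2, 20, 30),    # first row of a new id1: new group
        (2, 21, 900),   # gap of 870 s: new group
    ],
)

# step is 1 where a new group starts and NULL elsewhere, so the running
# count(step) yields one increasing group number per island.
GROUP_SQL = """
SELECT id1, id2, tms,
       count(step) OVER (ORDER BY id1, tms) AS group_id
FROM (
    SELECT *,
           CASE WHEN lag(tms, 1, -1000000000) OVER (PARTITION BY id1 ORDER BY tms)
                     < tms - 300
                THEN 1 END AS step
    FROM table0
) sub
ORDER BY id1, tms
"""

groups = conn.execute(GROUP_SQL).fetchall()
for row in groups:
    print(row)  # (id1, id2, tms, group_id)
```

The important detail is that step must be NULL, not false, on rows that continue a group; a plain boolean comparison would be counted on every row and each row would get its own group_id.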
SELECT
t2.id1,
t2.id2,
t2.tms,
(
SELECT t1.group_id
FROM pre_table t1
WHERE
t1.tms <= t2.tms
AND t1.group_id IS NOT NULL
AND t1.id1 = t2.id1
ORDER BY t1.tms DESC
LIMIT 1
) as group_id
FROM
pre_table t2
ORDER BY
t2.id1,
t2.id2,
t2.tms
As I said, a select within a select.