The table tmch contains thousands of rows. Its definition:
CREATE TABLE IF NOT EXISTS public.tmch (
id bigserial NOT NULL,
year integer,
week integer,
my_number integer,
device_id bigint,
CONSTRAINT tmch_pkey PRIMARY KEY (id)
)
Sample data:
1716446 2024 37 13 2
1716447 2024 37 13 2
1716448 2024 37 0 3
1716449 2024 37 11 4
1716450 2024 37 12 4
1716451 2024 37 0 6
1716452 2024 37 0 6
1716453 2024 37 0 6
1716454 2024 37 1 6
1716455 2024 37 1 6
1716456 2024 37 9 7
Here is a query that counts how many times each my_number occurs per group of (week, year, device_id):
select count(my_number) c, my_number, device_id, year, week from tmch
group by my_number, device_id, year, week
order by device_id asc, c desc
The result of this query, based on a larger, different sample (for clarity):
6 16 2 2024 37
4 17 2 2024 37
4 15 2 2024 37
4 0 2 2024 37
3 11 2 2024 37
3 14 2 2024 37
2 13 2 2024 37
2 1 2 2024 37
2 18 2 2024 37
2 12 2 2024 37
1 10 2 2024 37
1 2 2 2024 37
8 15 3 2024 37
6 16 3 2024 37
5 14 3 2024 37
4 17 3 2024 37
4 12 3 2024 37
3 7 3 2024 37
3 20 3 2024 37
3 18 3 2024 37
3 19 3 2024 37
3 4 3 2024 37
3 5 3 2024 37
3 6 3 2024 37
1 21 3 2024 37
1 0 3 2024 37
1 3 3 2024 37
1 8 3 2024 37
How can I get only the row with the highest count of my_number for each group of (week, year, device_id)?
Desired result for the example above:
6 16 2 2024 37 -- because my_number=16 occurs 6 times for device_id=2 y=2024 w=37
8 15 3 2024 37 -- because my_number=15 occurs 8 times for device_id=3 y=2024 w=37
I tried row_number() over (partition by ...) but without success.
DISTINCT ON does what you want:
SELECT DISTINCT ON (device_id, week, year)
c, my_number, device_id, year, week
FROM (SELECT count(my_number) AS c,
my_number,
device_id,
year,
week
FROM tmch
GROUP BY my_number, device_id, year, week) AS sub
ORDER BY device_id, week, year, c DESC;
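In a SELECT query, DISTINCT ON is applied after aggregation with GROUP BY and aggregate functions (and even after window functions). So you can do it all in a single query level, without a subquery. Since you list my_number in the GROUP BY clause, it makes more sense to use count(*) instead of count(my_number). It is also a bit faster. The only logical difference: if my_number is NULL, count(my_number) returns 0 for that group, whereas count(*) counts its rows. The simplified query: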
SELECT DISTINCT ON (device_id, year, week)
count(*) AS c, my_number, device_id, year, week
FROM tmch
GROUP BY device_id, year, week, my_number
ORDER BY device_id, year, week, c DESC;
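To make the difference between count(*) and count(my_number) concrete, here is a small self-contained sketch. The inline VALUES rows are made up for illustration and are not taken from tmch.

```sql
-- Hypothetical rows: one group where my_number is NULL, one regular group.
SELECT my_number,
       count(*)         AS count_star,  -- counts every row in the group
       count(my_number) AS count_col    -- counts only non-null values
FROM  (VALUES (NULL::int), (NULL), (7), (7), (7)) AS t(my_number)
GROUP  BY my_number
ORDER  BY my_number NULLS FIRST;

-- my_number | count_star | count_col
-- NULL      |          2 |         0
-- 7         |          3 |         3
```

For groups where my_number is not NULL, both expressions return the same number.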
The same query with a more intuitive column order:
SELECT DISTINCT ON (device_id, year, week)
device_id, year, week, my_number, count(*) AS c
FROM tmch
GROUP BY device_id, year, week, my_number
ORDER BY device_id, year, week, c DESC;
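If two values of my_number tie for the highest count within a group, DISTINCT ON keeps an arbitrary one of them. One possible way to make the pick deterministic is to append a tie-breaker to ORDER BY. Preferring the smaller my_number here is only an assumption; the question does not say which row should win a tie.

```sql
SELECT DISTINCT ON (device_id, year, week)
       device_id, year, week, my_number, count(*) AS c
FROM   tmch
GROUP  BY device_id, year, week, my_number
ORDER  BY device_id, year, week
        , c DESC        -- highest count first
        , my_number;    -- assumed tie-breaker: smallest my_number wins
```

The extra ORDER BY expression comes after the DISTINCT ON expressions, so it only decides which row is kept within each group.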
Or with minimal syntax:
SELECT DISTINCT ON (1,2,3)
device_id, year, week, my_number, count(*) AS c
FROM tmch
GROUP BY 1, 2, 3, 4
ORDER BY 1, 2, 3, c DESC;
ORDER BY and DISTINCT ON just must not disagree.
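For comparison, the row_number() approach mentioned in the question can also be made to work, because window functions are evaluated after aggregation and may therefore order by count(*). A sketch in standard SQL; the alias rn and the tie-breaker on my_number are assumptions added for illustration:

```sql
SELECT device_id, year, week, my_number, c
FROM  (
   SELECT device_id, year, week, my_number
        , count(*) AS c
          -- rank each my_number within its group, highest count first
        , row_number() OVER (PARTITION BY device_id, year, week
                             ORDER BY count(*) DESC, my_number) AS rn
   FROM   tmch
   GROUP  BY device_id, year, week, my_number
   ) sub
WHERE  rn = 1     -- keep only the top-ranked my_number per group
ORDER  BY device_id, year, week;
```

Unlike DISTINCT ON, this form needs a subquery, since the result of a window function cannot be filtered in the WHERE clause of the same query level.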