Group rows by day in unixtime


I have a table with the following DDL:

CREATE TABLE public.test_table (
    id serial4 NOT NULL,
    "date" int8 NOT NULL,
    user_id int4 NOT NULL,
    device_id int4 NULL,
    CONSTRAINT test_table_date_user_id_device_id_key UNIQUE (date, user_id, device_id),
    CONSTRAINT test_table_pkey PRIMARY KEY (id)
);

ALTER TABLE public.test_table ADD CONSTRAINT test_table FOREIGN KEY (device_id) REFERENCES public.t_device(id);
ALTER TABLE public.test_table ADD CONSTRAINT test_table_user_id_fkey FOREIGN KEY (user_id) REFERENCES public.t_user(id) ON DELETE CASCADE;

Sample data:

| id       | date           | user_id |
| -------- | -------------- | ------- |
| 1886     | 1716625890     | 5       |
| 1887     | 1716626430     | 5       |
| 1888     | 1716627030     | 5       |

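I need to find the number of rows grouped by day, and I also need to print the first and last date of each day. As input, I have a date range and a user: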

from
to
user_id

I am trying the following SQL:

with calendar as (
select
    d
from
    generate_series(to_timestamp(:FROM)::date,
    to_timestamp(:TO)::date,
    interval '1 day') d)

select
    c.d::date as item_date,
    count(dt.id) as item_count,
    min(dt."date") as item_min,
    max(dt."date") as item_max
from
    test_table dt 
left join
     calendar c
     on
    to_timestamp(dt.date)::date >= c.d
    and
        to_timestamp(dt.date)::date < c.d + interval '1 day'
where dt.user_id = 5
group by
    c.d
order by
    c.d;

My current output:

|item_date |item_count|item_min     |item_max     |
|----------|----------|-------------|-------------|
|2024-05-25|144       |1,716,584,490|1,716,670,230|
|2024-05-26|144       |1,716,670,816|1,716,756,630|
|2024-05-27|144       |1,716,757,221|1,716,843,030|
|          |4,770     |1,716,286,230|1,719,514,890|

Output of explain(analyze, verbose, buffers, settings):

Sort  (cost=199067.86..199068.36 rows=200 width=36) (actual time=10.152..10.154 rows=3 loops=1)
  Output: ((d.d)::date), (count(dt.id)), (min(dt.date)), (max(dt.date)), d.d
  Sort Key: d.d
  Sort Method: quicksort  Memory: 25kB
  Buffers: shared hit=363
  ->  HashAggregate  (cost=199057.72..199060.22 rows=200 width=36) (actual time=10.144..10.146 rows=3 loops=1)
        Output: (d.d)::date, count(dt.id), min(dt.date), max(dt.date), d.d
        Group Key: d.d
        Buffers: shared hit=363
        ->  Nested Loop  (cost=0.30..192966.61 rows=609111 width=20) (actual time=0.292..10.074 rows=432 loops=1)
              Output: d.d, dt.id, dt.date
              Join Filter: (((to_timestamp((dt.date)::double precision))::date >= d.d) AND ((to_timestamp((dt.date)::double precision))::date < (d.d + '1 day'::interval)))
              Rows Removed by Join Filter: 15174
              Buffers: shared hit=363
              ->  Function Scan on pg_catalog.generate_series d  (cost=0.01..10.01 rows=1000 width=8) (actual time=0.006..0.008 rows=3 loops=1)
                    Output: d.d
                    Function Call: generate_series((('2024-05-25 11:31:30+03'::timestamp with time zone)::date)::timestamp with time zone, (('2024-05-27 14:50:30+03'::timestamp with time zone)::date)::timestamp with time zone, '1 day'::interval)
              ->  Materialize  (cost=0.29..1100.30 rows=5482 width=12) (actual time=0.011..1.198 rows=5202 loops=3)
                    Output: dt.id, dt.date
                    Buffers: shared hit=363
                    ->  Index Scan using test_table_date_user_id_device_id_key on public.test_table dt  (cost=0.29..1072.89 rows=5482 width=12) (actual time=0.032..2.070 rows=5202 loops=1)
                          Output: dt.id, dt.date
                          Index Cond: (dt.user_id = 5)
                          Buffers: shared hit=363
Settings: effective_cache_size = '1377MB', effective_io_concurrency = '200', max_parallel_workers = '1', random_page_cost = '1.1', search_path = 'public, public, "$user"', work_mem = '524kB'
Planning Time: 0.147 ms
Execution Time: 10.264 ms
  • Is the query optimal?
  • Is it possible to remove the last totals row from the result? (Update: fixed, my mistake, thanks Frank Heikens)
postgresql group-by
1 Answer

Is the query optimal?

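Not really. The CTE is unnecessary: you are already converting the Unix timestamp to a date with to_timestamp(dt.date)::date. I don't see a reason to generate the unmatched dates with generate_series().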
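
One case where the calendar may still be worth keeping is when days with no rows at all should show up with a count of 0. That is an assumption about the requirement, not something the question states. A minimal sketch under that assumption puts the calendar on the left side of the join and moves the user filter into the join condition, so empty days are not filtered away again:

```sql
-- Sketch only: assumes empty days should be listed with item_count = 0.
-- :FROM and :TO are the same client-side bind parameters as in the question.
with calendar as (
    select g.d::date as d
    from generate_series(to_timestamp(:FROM)::date,
                         to_timestamp(:TO)::date,
                         interval '1 day') as g(d)
)
select
    c.d            as item_date,
    count(dt.id)   as item_count,  -- counts only matched rows, so empty days give 0
    min(dt."date") as item_min,    -- NULL for empty days
    max(dt."date") as item_max     -- NULL for empty days
from
    calendar c
left join
    test_table dt
    on dt.user_id = 5              -- kept in ON, not WHERE, to preserve empty days
    and to_timestamp(dt."date")::date = c.d
group by
    c.d
order by
    c.d;
```

Because every calendar day is on the preserved side of the left join, no extra row with an empty item_date can appear here.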

Is it possible to remove the last totals row from the result?

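It is there because test_table is on the left side of the left join: the 4,770 rows for that user whose dates fall outside the generated calendar have no matching c.d, so they are all grouped under a NULL item_date (432 matched rows plus 4,770 unmatched ones give the 5,202 rows the plan reads for user 5). If you remove the CTE, you get rid of that last row and shorten, simplify and speed up the query: demo at db<>fiddle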

select
    to_timestamp(dt.date)::date as item_date,
    count(dt.id) as item_count,
    min(dt."date") as item_min,
    max(dt."date") as item_max
from
    test_table dt 
where dt.user_id = 5
--and to_timestamp(dt.date)::date >= to_timestamp(:FROM)::date
--and to_timestamp(dt.date)::date <  to_timestamp(:TO)::date
group by
    item_date
order by
    item_date;

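If generate_series() was meant to narrow down the target period, you can uncomment the two additional conditions above. Note that the upper bound there is exclusive, so rows from the day of :TO itself are left out, whereas the original calendar included that day.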
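
The two commented-out conditions wrap dt.date in to_timestamp(), so the planner cannot use them as a range condition on the raw int8 column. A possible variant, sketched here on the assumption that the day of :TO should be included as in the original calendar, converts the day boundaries back to epoch seconds once and compares the column directly. Whether the existing unique index on (date, user_id, device_id) actually gets used for this depends on your data and settings, so check with explain.

```sql
-- Sketch only: same grouping as above, with the period filter on the raw column.
-- Day boundaries follow the session TimeZone, like the ::date cast in the select list.
select
    to_timestamp(dt."date")::date as item_date,
    count(dt.id)                  as item_count,
    min(dt."date")                as item_min,
    max(dt."date")                as item_max
from
    test_table dt
where dt.user_id = 5
  -- midnight at the start of the :FROM day, as epoch seconds
  and dt."date" >= extract(epoch from (to_timestamp(:FROM)::date)::timestamptz)::bigint
  -- midnight after the :TO day (exclusive), so the :TO day itself is included
  and dt."date" <  extract(epoch from (to_timestamp(:TO)::date + 1)::timestamptz)::bigint
group by
    item_date
order by
    item_date;
```

The casts to bigint keep both sides of each comparison the same type as the column, which avoids an implicit cast of dt."date" to numeric.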
