Group by half-hour interval

Question  votes: 5  answers: 6

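I was lucky enough to find this awesome piece of code on Stack Overflow, but I want to change it so that it shows every half hour instead of every hour. Messing with it only makes me break the query, haha.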

This is the SQL:

SELECT CONCAT(HOUR(created_at), ':00-', HOUR(created_at)+1, ':00') as hours,
       COUNT(*)
FROM urls
GROUP BY HOUR(created_at)
ORDER BY HOUR(created_at) ASC

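How would I get a result for every half hour? :)

Another thing: if a half hour has no results, I would like it to return 0 instead of just skipping that step. It looks kind of weird when I do statistics over the query and it just skips an hour because there weren't any :P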

mysql sql
6 answers
5
votes

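If the format isn't too important, you can return two columns for the interval. You might even need only the start of the interval, which can be determined with: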

date_format(created_at - interval minute(created_at)%30 minute, '%H:%i') as period_start

The alias can be used in the GROUP BY and ORDER BY clauses. If you also need the end of the interval, a small modification is required:

SELECT
  date_format(created_at - interval minute(created_at)%30 minute, '%H:%i') as period_start,
  date_format(created_at + interval 30-minute(created_at)%30 minute, '%H:%i') as period_end,
  COUNT(*)
FROM urls
GROUP BY period_start
ORDER BY period_start ASC;

Of course you can also concatenate those values:

SELECT concat_ws('-',
           date_format(created_at - interval minute(created_at)%30 minute, '%H:%i'),
           date_format(created_at + interval 30-minute(created_at)%30 minute, '%H:%i')
       ) as period,
       COUNT(*)
FROM urls
GROUP BY period
ORDER BY period ASC;

Demo: http://rextester.com/RPN50688
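To check the rounding arithmetic outside the database, the same calculation can be sketched in Python. This is only an illustration: the helper name half_hour_bounds is made up here, and it assumes created_at arrives as a plain datetime value.

```python
from datetime import datetime, timedelta

def half_hour_bounds(created_at):
    """Mirror the SQL above: round down to the half hour, then find its end."""
    # created_at - INTERVAL MINUTE(created_at) % 30 MINUTE
    start = created_at - timedelta(minutes=created_at.minute % 30)
    # created_at + INTERVAL 30 - MINUTE(created_at) % 30 MINUTE
    end = created_at + timedelta(minutes=30 - created_at.minute % 30)
    # Python's '%H:%M' plays the role of '%H:%i' in MySQL's DATE_FORMAT
    return start.strftime("%H:%M"), end.strftime("%H:%M")

print(half_hour_bounds(datetime(2018, 3, 2, 8, 24, 48)))  # ('08:00', '08:30')
print(half_hour_bounds(datetime(2018, 3, 2, 23, 45, 0)))  # ('23:30', '00:00')
```

Seconds are ignored on purpose, because the format string drops them, just as in the SQL version.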

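"Another thing is, if a half hour has no results, I would like it to return 0"

If you consume the result in a procedural language, you can initialize all 48 rows with zero in a loop and then "inject" the non-zero rows from the result.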
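A minimal sketch of that idea in Python, assuming the query returns (period_start, count) pairs with period_start formatted as 'HH:MM'. The function name fill_half_hours and the sample rows are hypothetical.

```python
def fill_half_hours(rows):
    """rows: iterable of (period_start, count) pairs, e.g. ('08:30', 3)."""
    # Initialize all 48 half-hour slots of a day with zero.
    counts = {f"{h:02d}:{m:02d}": 0 for h in range(24) for m in (0, 30)}
    # "Inject" the non-zero rows returned by the query.
    for period_start, count in rows:
        counts[period_start] = count
    return counts

# Hypothetical query result: only slots that had rows come back.
filled = fill_half_hours([("00:00", 1), ("23:30", 3)])
print(len(filled))      # 48
print(filled["00:30"])  # 0
```

Because dictionaries keep insertion order, iterating over the result yields the slots from 00:00 to 23:30 without extra sorting.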

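However, if you need it done in SQL, you will need a table with at least 48 rows to LEFT JOIN with. That could be done inline with a "huge" UNION ALL statement, but (IMHO) that would be ugly. So I prefer to have a sequence table with one integer column, which is also very useful for reports. To create that table I usually use information_schema.COLUMNS, since it is available on any MySQL server and has at least a couple of hundred rows. If you need more rows, just join it with itself.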

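Now let's create that table:

drop table if exists helper_seq;
create table helper_seq (seq smallint auto_increment primary key)
    select null
    from information_schema.COLUMNS c1
       , information_schema.COLUMNS c2
    limit 100; -- adjust as needed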

Now we have a table with the integers from 1 to 100 (right now you only need 48, but this is for demonstration).

Using that table we can now create all 48 time intervals:

select time(0) + interval 30*(seq-1) minute as period_start,
       time(0) + interval 30*(seq)   minute as period_end
from helper_seq s
where s.seq <= 48;

We will get the following result:

period_start | period_end
00:00:00     | 00:30:00
00:30:00     | 01:00:00
...
23:30:00     | 24:00:00

Demo: http://rextester.com/ISQSU31450

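Now we can use it as a derived table (a subquery in the FROM clause) and LEFT JOIN your urls table:

select p.period_start, p.period_end, count(u.created_at) as cnt
from (
    select time(0) + interval 30*(seq-1) minute as period_start,
           time(0) + interval 30*(seq)   minute as period_end
    from helper_seq s
    where s.seq <= 48
) p
left join urls u
  on  time(u.created_at) >= p.period_start
  and time(u.created_at) <  p.period_end
group by p.period_start, p.period_end
order by p.period_start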

Demo: http://rextester.com/IQYQ32927

The last step (if really required) is to format the result. We can use CONCAT or CONCAT_WS together with TIME_FORMAT in the outer select. The final query is:

select concat_ws('-',
           time_format(p.period_start, '%H:%i'),
           time_format(p.period_end,   '%H:%i')
       ) as period,
       count(u.created_at) as cnt
from (
    select time(0) + interval 30*(seq-1) minute as period_start,
           time(0) + interval 30*(seq)   minute as period_end
    from helper_seq s
    where s.seq <= 48
) p
left join urls u
  on  time(u.created_at) >= p.period_start
  and time(u.created_at) <  p.period_end
group by p.period_start, p.period_end
order by p.period_start

The result looks like this:

period      | cnt
00:00-00:30 | 1
00:30-01:00 | 0
...
23:30-24:00 | 3

Demo: http://rextester.com/LLZ41445


2
votes

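Well, this may be a little verbose, but it works:

SELECT hours, SUM(count) as count FROM (
    -- counted rows; hours are zero-padded so the labels match the literal rows below
    SELECT CONCAT(LPAD(HOUR(created_at), 2, '0'), ':', LPAD(30 * FLOOR(MINUTE(created_at)/30), 2, '0'), '-',
                  LPAD(HOUR(DATE_ADD(created_at, INTERVAL 30 minute)), 2, '0'), ':', LPAD(30 * FLOOR(MINUTE(DATE_ADD(created_at, INTERVAL 30 minute))/30), 2, '0')) as hours,
           COUNT(*) as count
    FROM urls
    GROUP BY HOUR(created_at), FLOOR(MINUTE(created_at)/30)

    UNION ALL

    -- one zero row per half hour, so empty intervals still show up
    SELECT '00:00-00:30' as hours, 0 as count UNION ALL SELECT '00:30-01:00' as hours, 0 as count UNION ALL
    SELECT '01:00-01:30' as hours, 0 as count UNION ALL SELECT '01:30-02:00' as hours, 0 as count UNION ALL
    SELECT '02:00-02:30' as hours, 0 as count UNION ALL SELECT '02:30-03:00' as hours, 0 as count UNION ALL
    SELECT '03:00-03:30' as hours, 0 as count UNION ALL SELECT '03:30-04:00' as hours, 0 as count UNION ALL
    SELECT '04:00-04:30' as hours, 0 as count UNION ALL SELECT '04:30-05:00' as hours, 0 as count UNION ALL
    SELECT '05:00-05:30' as hours, 0 as count UNION ALL SELECT '05:30-06:00' as hours, 0 as count UNION ALL
    SELECT '06:00-06:30' as hours, 0 as count UNION ALL SELECT '06:30-07:00' as hours, 0 as count UNION ALL
    SELECT '07:00-07:30' as hours, 0 as count UNION ALL SELECT '07:30-08:00' as hours, 0 as count UNION ALL
    SELECT '08:00-08:30' as hours, 0 as count UNION ALL SELECT '08:30-09:00' as hours, 0 as count UNION ALL
    SELECT '09:00-09:30' as hours, 0 as count UNION ALL SELECT '09:30-10:00' as hours, 0 as count UNION ALL
    SELECT '10:00-10:30' as hours, 0 as count UNION ALL SELECT '10:30-11:00' as hours, 0 as count UNION ALL
    SELECT '11:00-11:30' as hours, 0 as count UNION ALL SELECT '11:30-12:00' as hours, 0 as count UNION ALL
    SELECT '12:00-12:30' as hours, 0 as count UNION ALL SELECT '12:30-13:00' as hours, 0 as count UNION ALL
    SELECT '13:00-13:30' as hours, 0 as count UNION ALL SELECT '13:30-14:00' as hours, 0 as count UNION ALL
    SELECT '14:00-14:30' as hours, 0 as count UNION ALL SELECT '14:30-15:00' as hours, 0 as count UNION ALL
    SELECT '15:00-15:30' as hours, 0 as count UNION ALL SELECT '15:30-16:00' as hours, 0 as count UNION ALL
    SELECT '16:00-16:30' as hours, 0 as count UNION ALL SELECT '16:30-17:00' as hours, 0 as count UNION ALL
    SELECT '17:00-17:30' as hours, 0 as count UNION ALL SELECT '17:30-18:00' as hours, 0 as count UNION ALL
    SELECT '18:00-18:30' as hours, 0 as count UNION ALL SELECT '18:30-19:00' as hours, 0 as count UNION ALL
    SELECT '19:00-19:30' as hours, 0 as count UNION ALL SELECT '19:30-20:00' as hours, 0 as count UNION ALL
    SELECT '20:00-20:30' as hours, 0 as count UNION ALL SELECT '20:30-21:00' as hours, 0 as count UNION ALL
    SELECT '21:00-21:30' as hours, 0 as count UNION ALL SELECT '21:30-22:00' as hours, 0 as count UNION ALL
    SELECT '22:00-22:30' as hours, 0 as count UNION ALL SELECT '22:30-23:00' as hours, 0 as count UNION ALL
    SELECT '23:00-23:30' as hours, 0 as count UNION ALL SELECT '23:30-00:00' as hours, 0 as count

) AS T
GROUP BY hours ORDER BY hours;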

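The hardest part of the query is outputting statistics for the intervals without any hits. SQL is all about querying and aggregating existing data; selecting or aggregating data that is missing from a table is a rather unusual task. That is why, as Wolph said in the comments, there is no perfect solution for this task.

I solved it by explicitly selecting all half-hour intervals of the day. This solution works as long as the number of intervals is limited. It cannot be used, however, if you aggregate over distinct dates across a long period of time.

I am not a fan of this solution, but I cannot suggest a better one. A more elegant solution could be achieved with a looping stored procedure, but it seems you wanted to solve it with a raw SQL query.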
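If typing the 48 literal rows by hand feels error-prone, they can be generated. The sketch below is an illustration in Python rather than part of the answer; the function name half_hour_labels is made up, and the labels follow the zero-padded 'HH:MM-HH:MM' style of the literal rows in the query.

```python
def half_hour_labels():
    """Return the 48 labels '00:00-00:30' ... '23:30-00:00'."""
    labels = []
    for slot in range(48):
        start = slot * 30                   # minutes since midnight
        end = (slot + 1) * 30 % (24 * 60)   # wraps to 00:00 after the last slot
        labels.append(
            f"{start // 60:02d}:{start % 60:02d}-{end // 60:02d}:{end % 60:02d}"
        )
    return labels

# Build the zero-count part of the UNION ALL query as text.
zero_rows = " UNION ALL\n".join(
    f"SELECT '{label}' AS hours, 0 AS count" for label in half_hour_labels()
)
print(zero_rows.splitlines()[0])  # SELECT '00:00-00:30' AS hours, 0 AS count UNION ALL
```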


1
votes
  1. Switch to seconds.
  2. Do arithmetic to get a number for each unit of time (use 30*60 for half an hour in your case).
  3. Have a table of consecutive numbers.
  4. Use a LEFT JOIN to get even the missing units of time.
  5. Do the GROUP BY.
  6. Convert back from units of time to actual time, for display.

(Steps 3 and 4 are optional. The question says "every", so I assume they are needed.)

Steps 1 and 2 are embodied in something like

FLOOR(UNIX_TIMESTAMP(created_at) / (30*60))

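For example:

mysql> SELECT NOW(), FLOOR(UNIX_TIMESTAMP(NOW()) / (30*60));
+---------------------+----------------------------------------+
| NOW()               | FLOOR(UNIX_TIMESTAMP(NOW()) / (30*60)) |
+---------------------+----------------------------------------+
| 2018-03-02 08:24:48 |                                 844448 |
+---------------------+----------------------------------------+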

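Step 3 needs to be done once and saved in a permanent table. Or, if you have MariaDB, use a "seq" pseudo-table; for example, seq_844448_to_900000 dynamically gives you a table that reaches far into the future.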

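Step 6 example:

mysql> SELECT DATE_FORMAT(FROM_UNIXTIME((844448) * 30*60), "%b %d %h:%i");
+-------------------------------------------------------------+
| DATE_FORMAT(FROM_UNIXTIME((844448) * 30*60), "%b %d %h:%i") |
+-------------------------------------------------------------+
| Mar 02 08:00                                                |
+-------------------------------------------------------------+
mysql> SELECT DATE_FORMAT(FROM_UNIXTIME((844448+1) * 30*60), "%b %d %h:%i");
+---------------------------------------------------------------+
| DATE_FORMAT(FROM_UNIXTIME((844448+1) * 30*60), "%b %d %h:%i") |
+---------------------------------------------------------------+
| Mar 02 08:30                                                  |
+---------------------------------------------------------------+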
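The arithmetic of steps 1, 2 and 6 can be checked outside the database. The Python sketch below is an illustration only: the function names are made up, and the UTC-8 time zone is an assumption inferred from the example numbers (bucket 844448 corresponding to 08:00), not something the answer states.

```python
from datetime import datetime, timedelta, timezone

BUCKET_SECONDS = 30 * 60  # half an hour, as in step 2

def bucket_of(ts):
    """Steps 1 and 2: seconds since the epoch, floor-divided by the bucket size."""
    return int(ts.timestamp()) // BUCKET_SECONDS

def bucket_start(bucket, tz):
    """Step 6: convert a bucket number back to the time at which it starts."""
    return datetime.fromtimestamp(bucket * BUCKET_SECONDS, tz)

# Assumption: the server in the example ran at UTC-8.
server_tz = timezone(timedelta(hours=-8))
now = datetime(2018, 3, 2, 8, 24, 48, tzinfo=server_tz)

b = bucket_of(now)
print(b)                                                 # 844448
print(bucket_start(b, server_tz).strftime("%H:%M"))      # 08:00
print(bucket_start(b + 1, server_tz).strftime("%H:%M"))  # 08:30
```

Because the bucket number is derived from the full timestamp, buckets from different days never collide, which is what makes the consecutive-number table in step 3 usable across dates.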

0
votes

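You can add some math to compute 48 intervals instead of 24 and put the result into another field to group and sort by. The slot number is wrapped in parentheses before applying % 2, because % binds more tightly than +.

SELECT HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30) as interval48,
    if((HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30)) % 2 = 0,
    CONCAT(HOUR(created_at), ':00-', HOUR(created_at), ':30'),
    CONCAT(HOUR(created_at), ':30-', HOUR(created_at)+1, ':00')
       )  as hours,
      count(*)
FROM urls
GROUP BY HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30)
ORDER BY HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30) ASC

Sample result:

0 0:00-0:30 2017
1 0:30-1:00 1959
2 1:00-1:30 1830
3 1:30-2:00 1715
4 2:00-2:30 1679
5 2:30-3:00 1688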

The original query posted by Jazerix gives this result:

0:00-1:00 3976
1:00-2:00 3545
2:00-3:00 3367
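As a cross-check, the slot arithmetic can be reproduced in a few lines of Python. The names interval48 and label are hypothetical helpers for illustration; the counts are the ones listed in this answer.

```python
def interval48(hour, minute):
    """Slot number 0..47: two slots per hour, the second from minute 30 on."""
    return hour * 2 + minute // 30

def label(slot):
    """Unpadded label for a slot, e.g. '1:00-1:30'."""
    hour = slot // 2
    if slot % 2 == 0:
        return f"{hour}:00-{hour}:30"
    return f"{hour}:30-{hour + 1}:00"

print(interval48(1, 15), label(interval48(1, 15)))    # 2 1:00-1:30
print(interval48(23, 59), label(interval48(23, 59)))  # 47 23:30-24:00

# The half-hour counts listed above add up to the hourly counts.
half_hour = [2017, 1959, 1830, 1715, 1679, 1688]
hourly = [half_hour[i] + half_hour[i + 1] for i in range(0, len(half_hour), 2)]
print(hourly)  # [3976, 3545, 3367]
```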

0
votes

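A different approach that does not require creating additional tables. It may look like a hack :-)

Step 1: Generate the time table on the fly

Assumption: the INFORMATION_SCHEMA database is available and has a COLLATIONS table, which usually has more than 100 records. You can use any table that has at least 48 records.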

Query:

[The query is missing from this copy of the answer.]

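The query above gives a table of from-time and to-time values in 30-minute intervals.

Step 2: Use the first query and join it with the urls table to produce the desired result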

Query:

[The query is missing from this copy of the answer.]


0
votes

I hope this will be useful:

[The code is missing from this copy of the answer.]