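I'm new to Spark SQL. I want to generate a series of start and end times at 5-second intervals for the current date. Say I run my job on 1 January 2018: I want a series of start and end times that are 5 seconds apart, so one day would have 17,280 records.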
START TIME | END TIME
-----------------------------------------
01-01-2018 00:00:00 | 01-01-2018 00:00:04
01-01-2018 00:00:05 | 01-01-2018 00:00:09
01-01-2018 00:00:10 | 01-01-2018 00:00:14
.
.
01-01-2018 23:59:55 | 01-01-2018 23:59:59
01-02-2018 00:00:00 | 01-02-2018 00:00:04
I know I can use a Scala `for` loop to generate this DataFrame, but I can only use SQL queries to do this.
Is there any way to create this data structure using a `select` construct?
You can use the sequence function to generate the series and the explode function to turn the resulting array into one row per element, for example:
select explode(sequence(to_date('2024-04-15'),
                        to_date('2024-04-20'),
                        interval 1 day)) as date
The same approach with a 5-second step:
select START_TIME as `START TIME`,
       from_unixtime(unix_timestamp(cast(START_TIME as STRING), 'yyyy-MM-dd HH:mm:ss') + 4, 'yyyy-MM-dd HH:mm:ss') as `END TIME`
from (
  select explode(sequence(to_timestamp('2024-04-15 00:00:00'),
                          to_timestamp('2024-04-20 00:00:00'),
                          interval 5 seconds)) as START_TIME
) x;
+-------------------+-------------------+
|START TIME |END TIME |
+-------------------+-------------------+
|2024-04-15 00:00:00|2024-04-15 00:00:04|
|2024-04-15 00:00:05|2024-04-15 00:00:09|
|2024-04-15 00:00:10|2024-04-15 00:00:14|
|2024-04-15 00:00:15|2024-04-15 00:00:19|
|2024-04-15 00:00:20|2024-04-15 00:00:24|
|2024-04-15 00:00:25|2024-04-15 00:00:29|
|2024-04-15 00:00:30|2024-04-15 00:00:34|
|2024-04-15 00:00:35|2024-04-15 00:00:39|
|2024-04-15 00:00:40|2024-04-15 00:00:44|
|2024-04-15 00:00:45|2024-04-15 00:00:49|
|2024-04-15 00:00:50|2024-04-15 00:00:54|
|2024-04-15 00:00:55|2024-04-15 00:00:59|
|2024-04-15 00:01:00|2024-04-15 00:01:04|
|2024-04-15 00:01:05|2024-04-15 00:01:09|
|2024-04-15 00:01:10|2024-04-15 00:01:14|
|2024-04-15 00:01:15|2024-04-15 00:01:19|
|2024-04-15 00:01:20|2024-04-15 00:01:24|
|2024-04-15 00:01:25|2024-04-15 00:01:29|
|2024-04-15 00:01:30|2024-04-15 00:01:34|
|2024-04-15 00:01:35|2024-04-15 00:01:39|
+-------------------+-------------------+
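If you need the MM-dd-yyyy layout shown in the question, one possible sketch is to format both columns with date_format. This assumes Spark 2.4 or later (where sequence is available) and uses 2018-01-01, the example day from the question; the end bound 23:59:55 is my choice so that only that day's windows are generated. Note that date_format returns strings, not timestamps.

```sql
-- Sketch: 5-second windows for one fixed day, formatted as MM-dd-yyyy.
-- The literal dates are illustrative; the alias x is arbitrary.
SELECT date_format(start_time, 'MM-dd-yyyy HH:mm:ss') AS start_time,
       date_format(start_time + interval 4 seconds, 'MM-dd-yyyy HH:mm:ss') AS end_time
FROM (
  SELECT explode(sequence(
           to_timestamp('2018-01-01 00:00:00'),
           to_timestamp('2018-01-01 23:59:55'),  -- last window start of the day
           interval 5 seconds)) AS start_time
) x;
```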
sequence can create an array of timestamps spaced 5 seconds apart, and explode turns that array into rows:
WITH CTE (start_time) AS
(SELECT explode(sequence(timestamp(current_date()), timestamp(current_date()+1), interval 5 seconds)))
SELECT start_time, start_time + interval 4 seconds end_time
FROM CTE
+-------------------+-------------------+
|start_time |end_time |
+-------------------+-------------------+
|2024-09-11 00:00:00|2024-09-11 00:00:04|
|2024-09-11 00:00:05|2024-09-11 00:00:09|
|2024-09-11 00:00:10|2024-09-11 00:00:14|
|2024-09-11 00:00:15|2024-09-11 00:00:19|
...
|2024-09-11 23:59:45|2024-09-11 23:59:49|
|2024-09-11 23:59:50|2024-09-11 23:59:54|
|2024-09-11 23:59:55|2024-09-11 23:59:59|
|2024-09-12 00:00:00|2024-09-12 00:00:04|
+-------------------+-------------------+
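This table will have 17,281 records, one more than the 17,280 you expected, because sequence includes its end bound (midnight of the next day):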
WITH CTE (start_time) as
(SELECT explode(sequence(timestamp(current_date()), timestamp(current_date()+1), interval 5 seconds)))
SELECT count(1)
FROM CTE
+--------+
|count(1)|
+--------+
|17281 |
+--------+
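If you want only the current day's windows, a minimal sketch is to stop the sequence 5 seconds before the next midnight, since sequence includes both of its bounds. On a regular 24-hour day this should give 86400 / 5 = 17,280 rows; a day with a daylight-saving shift in the session time zone may differ.

```sql
-- Sketch: same CTE as above, but the upper bound is pulled back by one step
-- so the next day's 00:00:00 row is not generated.
WITH CTE (start_time) AS
(SELECT explode(sequence(
          timestamp(current_date()),
          timestamp(current_date() + 1) - interval 5 seconds,
          interval 5 seconds)))
SELECT count(1)
FROM CTE
```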