How to get common time intervals in pandas

I am using pandas version 1.0.5.

import pandas as pd
dat1 = [
    ['2023-12-27','2023-12-27 00:00:00','2023-12-27 02:14:00'],
    ['2023-12-27','2023-12-27 03:16:00','2023-12-27 04:19:00'],
    ['2023-12-27','2023-12-27 18:11:00','2023-12-27 20:13:00'],
    ['2023-12-28','2023-12-28 01:16:00','2023-12-28 02:14:00'],
    ['2023-12-28','2023-12-28 02:16:00','2023-12-28 02:28:00'],
    ['2023-12-28','2023-12-28 02:30:00','2023-12-28 02:56:00'],
    ['2023-12-28','2023-12-28 18:45:00','2023-12-28 19:00:00'],
    ['2023-12-29','2023-12-29 01:16:00','2023-12-29 02:13:00'],
    ['2023-12-29','2023-12-29 04:16:00','2023-12-29 05:09:00'],
    ['2023-12-29','2023-12-29 05:11:00','2023-12-29 05:14:00'],
    ['2023-12-29','2023-12-29 18:00:00','2023-12-29 19:00:00']
]
df = pd.DataFrame(dat1,columns = ['date','Start_tmp','End_tmp'])
df["Start_tmp"] = pd.to_datetime(df["Start_tmp"])
df["End_tmp"] = pd.to_datetime(df["End_tmp"])

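This gives me a DataFrame with the columns date, Start_tmp and End_tmp, one row per interval.

I need to find the common, or overlapping, intervals between the timestamps.

For example, one of the time ranges shared by all three dates (highlighted in yellow) is 1:16 - 2:13. Another (highlighted in blue) is 18:45 - 19:00.

So my expected output is: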

[57,15]

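57 is the number of minutes between 1:16 and 2:13.

15 is the number of minutes between 18:45 and 19:00.

Any pointers on how to achieve this output would be appreciated. Thanks.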

python-3.x pandas dataframe
1 Answer

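Since you are only interested in the time of day, I would convert each datetime to a time and track the current interval as a tuple holding its start, its end, and whether it has already been merged with another interval:
(start_time: datetime.time, end_time: datetime.time, already_merged: bool)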

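After sorting, we can loop over the intervals and check whether each one overlaps the current interval. If it does, we keep only the intersection, that is, the later of the two start times and the earlier of the two end times, and mark the current interval as merged.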

intervals = [(x.time(), y.time()) for x, y in zip(df["Start_tmp"], df["End_tmp"])]
intervals = sorted(intervals)

def time_to_minutes(t):
    return t.hour * 60 + t.minute

result = []
cur = (intervals[0][0], intervals[0][1], False)
for i in range(1, len(intervals)):
    # Is the current interval overlapping with the iterated one?
    if intervals[i][0] <= cur[1]:
        cur = (max(cur[0], intervals[i][0]), min(cur[1], intervals[i][1]), True)
    else:
        if cur[2]:
            result.append(time_to_minutes(cur[1]) - time_to_minutes(cur[0]))
        cur = (intervals[i][0], intervals[i][1], False)

if cur[2]:
    result.append(time_to_minutes(cur[1]) - time_to_minutes(cur[0]))

print(f"result = {result}")  # prints: result = [57, 3, 15]
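
Note that the loop above reports an overlap between any two intervals, whatever their dates. That is why its output contains a 3 (03:16 - 04:19 on 2023-12-27 against 04:16 - 05:09 on 2023-12-29) that the question's expected [57, 15] does not. If the goal is only the time ranges shared by every date, one option is to intersect the per-date interval lists instead. The following is a minimal sketch, not part of the original answer: the helper names to_minutes, intersect and common_minutes are hypothetical, and it assumes that intervals within a single date do not overlap each other and that no interval crosses midnight. It uses plain datetime objects so that it runs without pandas.

```python
from collections import defaultdict
from datetime import datetime
from functools import reduce

def to_minutes(t):
    # Minutes since midnight; works for datetime.datetime and pandas.Timestamp.
    # Seconds are ignored, as in the answer above.
    return t.hour * 60 + t.minute

def intersect(a, b):
    # a and b are sorted lists of (start, end) minute pairs.
    # Assumption: the intervals inside each list do not overlap each other.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:  # keep only overlaps of non-zero length
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def common_minutes(rows):
    # rows: iterable of (date, start, end); needs at least one row.
    per_date = defaultdict(list)
    for date, start, end in rows:
        per_date[date].append((to_minutes(start), to_minutes(end)))
    common = reduce(intersect, (sorted(v) for v in per_date.values()))
    return [hi - lo for lo, hi in common]

dat1 = [
    ['2023-12-27', '2023-12-27 00:00:00', '2023-12-27 02:14:00'],
    ['2023-12-27', '2023-12-27 03:16:00', '2023-12-27 04:19:00'],
    ['2023-12-27', '2023-12-27 18:11:00', '2023-12-27 20:13:00'],
    ['2023-12-28', '2023-12-28 01:16:00', '2023-12-28 02:14:00'],
    ['2023-12-28', '2023-12-28 02:16:00', '2023-12-28 02:28:00'],
    ['2023-12-28', '2023-12-28 02:30:00', '2023-12-28 02:56:00'],
    ['2023-12-28', '2023-12-28 18:45:00', '2023-12-28 19:00:00'],
    ['2023-12-29', '2023-12-29 01:16:00', '2023-12-29 02:13:00'],
    ['2023-12-29', '2023-12-29 04:16:00', '2023-12-29 05:09:00'],
    ['2023-12-29', '2023-12-29 05:11:00', '2023-12-29 05:14:00'],
    ['2023-12-29', '2023-12-29 18:00:00', '2023-12-29 19:00:00'],
]
rows = [(d, datetime.fromisoformat(s), datetime.fromisoformat(e)) for d, s, e in dat1]
print(common_minutes(rows))  # [57, 15]
```

With the DataFrame from the question, the same function should accept zip(df["date"], df["Start_tmp"], df["End_tmp"]), since pandas Timestamp objects also expose hour and minute.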