Python: time in progress between start datetime and end datetime, at the hourly level

Question

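For the past year I have been tracking my gaming sessions, just to collect data I care about and to learn Python. Now I would like to know (and plot, but that is not important yet) during which hours of the day (0 to 23) I played the most, across the whole period and all games, averaged per day since tracking began.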

Sample:

session_id	game_id	start_datetime	end_datetime
001	74	2023-02-22 13:15:00	2023-02-22 15:30:00
002	127	2023-02-23 13:30:00	2023-02-23 13:45:00
003	74	2023-02-24 14:40:00	2023-02-24 15:00:00

In the end I would like to see this information (the calculation column is not needed):

hour_of_day	sum_hours_played	avg_hours_played_per_day	calculation
13	1.00	0.33	(0.75 + 0.25) / 3 days
14	1.33	0.44	(1.00 + 0.33) / 3 days
15	0.50	0.17	(0.50) / 3 days

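In short, I do not just want to see whether I played during a given hour (played: 1, not played: 0), but what fraction of that particular hour I played.

I have seen some approaches online, but almost all of them only count or sum single yes/no events per month or per day. They do not compute the proportion of a day or an hour.

So I would be glad for any hints.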

Answer 1

Setup:

import pandas as pd

# Load your data into a DataFrame
data = {
    'session_id': [1, 2, 3],
    'game_id': [74, 127, 74],
    'start_datetime': ['2023-02-22 13:15:00', '2023-02-23 13:30:00', '2023-02-24 14:40:00'],
    'end_datetime': ['2023-02-22 15:30:00', '2023-02-23 13:45:00', '2023-02-24 15:00:00']
}

df = pd.DataFrame(data)

# Convert the 'start_datetime' and 'end_datetime' columns to datetime objects
df['start_datetime'] = pd.to_datetime(df['start_datetime'])
df['end_datetime'] = pd.to_datetime(df['end_datetime'])

# Calculate the duration of each gaming session
df['duration'] = df['end_datetime'] - df['start_datetime']

# Initialize an empty dictionary to store the hours played
hours_played = {i: 0 for i in range(24)}

The trick is to split each session into clock hours:

# Break down each session into hours and sum the proportion of hours played
for _, row in df.iterrows():
    start = row['start_datetime']
    end = row['end_datetime']
    duration = row['duration']

    # Loop over the hours involved
    while start < end:

        # Calculate the end of the hour currently considered
        hour_start = start.replace(minute=0, second=0)
        hour_end = hour_start + pd.Timedelta(hours=1)

        played = min(hour_end, end) - start  # Here take what ends first (the hour or the session) and substract the start time
        hours_played[start.hour] += played.total_seconds() / 3600  # Here add the time played to the current value in the dictionary
        
        start = hour_end  # For the (possible) next iteration of the while look, set the start to the end of the hour currently considered

# Calculate the average hours played per day, counting calendar days from the first to the last session (inclusive)
total_days = (df['end_datetime'].max().normalize() - df['start_datetime'].min().normalize()).days + 1
avg_hours_played = {hour: hours / total_days for hour, hours in hours_played.items()}

# Create a DataFrame to display the results
results = pd.DataFrame(list(avg_hours_played.items()), columns=['hour_of_day', 'avg_hours_played_per_day'])
results['sum_hours_played'] = [hours_played[hour] for hour in results['hour_of_day']]
results = results[['hour_of_day', 'sum_hours_played', 'avg_hours_played_per_day']]
print(results)
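
To check the splitting step on its own, it can be pulled out into a small function and run on the first sample session from the question. This is only a sketch: the name `split_session_by_hour` is made up for illustration and is not part of pandas.

```python
import pandas as pd


def split_session_by_hour(start, end):
    """Return {hour_of_day: hours played} for a single session.

    Hypothetical helper for illustration; start and end are pd.Timestamp.
    """
    played = {}
    while start < end:
        # End of the clock hour that contains `start`
        hour_end = start.floor('h') + pd.Timedelta(hours=1)
        # The piece ends at whichever comes first: the hour or the session
        piece = min(hour_end, end) - start
        played[start.hour] = played.get(start.hour, 0.0) + piece.total_seconds() / 3600
        start = hour_end
    return played


first_session = split_session_by_hour(
    pd.Timestamp('2023-02-22 13:15:00'),
    pd.Timestamp('2023-02-22 15:30:00'),
)
print(first_session)  # {13: 0.75, 14: 1.0, 15: 0.5}
```

These are the 0.75, 1.00 and 0.5 terms from the calculation column in the question. A session that crosses midnight is handled the same way, because each piece is keyed by the hour of its own start.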

I hope my comments are understandable.
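
If the row-by-row loop gets slow on a full year of sessions, the same split can probably be done with `explode` instead. The following is a rough sketch under the assumption that the columns are named as in the setup above. The `bucket` column and the variable names are made up for illustration, and I have only reasoned it through on the three sample rows.

```python
import pandas as pd

sessions = pd.DataFrame({
    'start_datetime': pd.to_datetime(['2023-02-22 13:15:00', '2023-02-23 13:30:00', '2023-02-24 14:40:00']),
    'end_datetime': pd.to_datetime(['2023-02-22 15:30:00', '2023-02-23 13:45:00', '2023-02-24 15:00:00']),
})
one_hour = pd.Timedelta(hours=1)

# For each session, list the starts of all clock hours it touches,
# then expand to one row per (session, clock hour).
sessions['bucket'] = [
    list(pd.date_range(s.floor('h'), e, freq=one_hour))
    for s, e in zip(sessions['start_datetime'], sessions['end_datetime'])
]
pieces = sessions.explode('bucket', ignore_index=True)
pieces['bucket'] = pd.to_datetime(pieces['bucket'])

# Overlap of the session with its clock hour: later start, earlier end.
bucket_end = pieces['bucket'] + one_hour
piece_start = pieces['start_datetime'].where(pieces['start_datetime'] > pieces['bucket'], pieces['bucket'])
piece_end = pieces['end_datetime'].where(pieces['end_datetime'] < bucket_end, bucket_end)
hours = (piece_end - piece_start).dt.total_seconds() / 3600

# Sum per hour of day, then average over the calendar days covered.
sum_hours_played = hours.groupby(pieces['bucket'].dt.hour).sum()
total_days = (sessions['end_datetime'].max().normalize()
              - sessions['start_datetime'].min().normalize()).days + 1
avg_hours_played_per_day = sum_hours_played / total_days
print(avg_hours_played_per_day.round(2))
```

A session that ends exactly on the hour produces one extra bucket with zero overlap, which adds nothing to the sums. Hours of the day that were never played are simply missing from the result here, while the dictionary version lists them with 0.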
