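I have a DataFrame datetime column that I want to convert from the Europe/Copenhagen time zone to UTC, but I keep getting duplicate entries in the UTC column. This happens because of the way I build the datetime column.
My data arrives as a DataFrame containing a date (with the start time set to 00:00), a position, a resolution, an ID, and a quantity. For a given date, each ID has 23 to 25 numbered positions, depending on the date. For simplicity, let's assume the resolution is always hourly and that there is a single ID.
I have a function that converts the date, resolution, and position to a datetime, sets it to the desired time zone, and then converts it to UTC. The problem is that the UTC date column contains a duplicated date instead of the expected one, 2024-10-27 01:00.
Example: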
import pytz
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
# Set up date range from 25th October 2024 to 28th October 2024
date_range = pd.date_range(start="2024-10-25", end="2024-10-28", freq="D")
# Initialize an empty list to collect data
data = []
# Populate data for each date
for date in date_range:
    # Set number of positions based on date
    if date == datetime(2024, 10, 27):
        positions = range(1, 26) # 25 positions on 27th October
    else:
        positions = range(1, 25) # 24 positions on other dates
    # Generate data entries for each position
    for pos in positions:
        data.append({
            "Date": date,
            "position": pos,
            "ID": "A",
            "quantity": np.random.randint(1, 100),
            "resolution": 'PT1H'
        })
# Create DataFrame
df = pd.DataFrame(data)
copenhagen_tz = pytz.timezone('Europe/Copenhagen')
df['Date'] = pd.to_datetime(df['Date'],utc=False)
# Define a dictionary to map resolution text to timedelta values
resolution_map = {
    'PT1H': timedelta(hours=1),
    'PT30M': timedelta(minutes=30),
    'PT15M': timedelta(minutes=15),
    'PT5M': timedelta(minutes=5)
}
# Calculate the exact datetime by applying the resolution offset
def calculate_exact_date(row):
    base_time = row['Date']
    offset = (row['position'] - 1) * resolution_map[row['resolution']]
    # Handle DST ambiguity by using `ambiguous=True` to interpret ambiguous times consistently
    exact_date = base_time + offset
    return copenhagen_tz.normalize(exact_date)
# Convert 'Date' to datetime in UTC, then localize to Europe/Copenhagen
df['Date'] = pd.to_datetime(df['Date']).dt.tz_localize('CET')
df['Date_Local'] = df.apply(calculate_exact_date, axis=1)
df['Date_UTC'] = pd.to_datetime(df['Date_Local']).dt.tz_convert('UTC')
Any ideas on how I could do this differently? I cannot change the format of the data I receive; it always has a position and a start datetime.
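Because of the change from CEST to CET, there are two 02:00 local times on that day. If you do the time arithmetic in local time, you end up with two identical UTC times. Do the arithmetic in UTC instead: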
import pandas as pd
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo
zone = ZoneInfo('Europe/Copenhagen')
date_range = pd.date_range(start='2024-10-27', end='2024-10-27', freq='D', tz=zone)
data = []
for date in date_range:
    for pos in range(1, 26):
        data.append({'Date': date, 'position': pos})
df = pd.DataFrame(data)
def calculate_exact_date(row):
    base_time = row['Date']
    offset = (row['position'] - 1) * timedelta(hours=1)
    exact_date = base_time + offset
    return exact_date.astimezone(zone)
df['Date'] = pd.to_datetime(df['Date']).dt.tz_convert('UTC')
df['Date_Local'] = df.apply(calculate_exact_date, axis=1)
df['Date_UTC'] = pd.to_datetime(df['Date_Local']).dt.tz_convert('UTC')
del df['Date'] # To see Date_Local and Date_UTC in the default print below
print(df)
Output:
    position                Date_Local                  Date_UTC
0          1 2024-10-27 00:00:00+02:00 2024-10-26 22:00:00+00:00
1          2 2024-10-27 01:00:00+02:00 2024-10-26 23:00:00+00:00
2          3 2024-10-27 02:00:00+02:00 2024-10-27 00:00:00+00:00
3          4 2024-10-27 02:00:00+01:00 2024-10-27 01:00:00+00:00
4          5 2024-10-27 03:00:00+01:00 2024-10-27 02:00:00+00:00
5          6 2024-10-27 04:00:00+01:00 2024-10-27 03:00:00+00:00
6          7 2024-10-27 05:00:00+01:00 2024-10-27 04:00:00+00:00
7          8 2024-10-27 06:00:00+01:00 2024-10-27 05:00:00+00:00
8          9 2024-10-27 07:00:00+01:00 2024-10-27 06:00:00+00:00
9         10 2024-10-27 08:00:00+01:00 2024-10-27 07:00:00+00:00
10        11 2024-10-27 09:00:00+01:00 2024-10-27 08:00:00+00:00
11        12 2024-10-27 10:00:00+01:00 2024-10-27 09:00:00+00:00
12        13 2024-10-27 11:00:00+01:00 2024-10-27 10:00:00+00:00
13        14 2024-10-27 12:00:00+01:00 2024-10-27 11:00:00+00:00
14        15 2024-10-27 13:00:00+01:00 2024-10-27 12:00:00+00:00
15        16 2024-10-27 14:00:00+01:00 2024-10-27 13:00:00+00:00
16        17 2024-10-27 15:00:00+01:00 2024-10-27 14:00:00+00:00
17        18 2024-10-27 16:00:00+01:00 2024-10-27 15:00:00+00:00
18        19 2024-10-27 17:00:00+01:00 2024-10-27 16:00:00+00:00
19        20 2024-10-27 18:00:00+01:00 2024-10-27 17:00:00+00:00
20        21 2024-10-27 19:00:00+01:00 2024-10-27 18:00:00+00:00
21        22 2024-10-27 20:00:00+01:00 2024-10-27 19:00:00+00:00
22        23 2024-10-27 21:00:00+01:00 2024-10-27 20:00:00+00:00
23        24 2024-10-27 22:00:00+01:00 2024-10-27 21:00:00+00:00
24        25 2024-10-27 23:00:00+01:00 2024-10-27 22:00:00+00:00
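The same idea can probably be applied without a row-wise apply, and with the resolution column from the question taken into account. The following is only a sketch under some assumptions: the function name add_utc_columns and the RESOLUTION_MINUTES mapping are made up for illustration, the Date column is assumed to hold naive local midnights, and the column names follow the question. Local midnight in Europe/Copenhagen is never ambiguous, so it can be localized safely before the offset is added in UTC.

```python
import pandas as pd

# Hypothetical mapping (not from the original post): minutes per position
# for each ISO 8601 resolution code used in the question.
RESOLUTION_MINUTES = {"PT1H": 60, "PT30M": 30, "PT15M": 15, "PT5M": 5}


def add_utc_columns(df, tz="Europe/Copenhagen"):
    """Return a copy of df with Date_UTC and Date_Local columns added."""
    out = df.copy()
    # Treat the naive start-of-day dates as local midnight, then move to UTC.
    start_utc = pd.to_datetime(out["Date"]).dt.tz_localize(tz).dt.tz_convert("UTC")
    # Elapsed time since local midnight: (position - 1) * resolution.
    minutes = (out["position"] - 1) * out["resolution"].map(RESOLUTION_MINUTES)
    # Add the offset in UTC, where every hour is exactly one hour long.
    out["Date_UTC"] = start_utc + pd.to_timedelta(minutes, unit="min")
    # Derive the local wall-clock time from UTC, never the other way round.
    out["Date_Local"] = out["Date_UTC"].dt.tz_convert(tz)
    return out


# 25 hourly positions on the day the clocks go back.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2024-10-27"] * 25),
    "position": range(1, 26),
    "resolution": "PT1H",
})
result = add_utc_columns(df)
print(result[["position", "Date_Local", "Date_UTC"]])
print(result["Date_UTC"].is_unique)
```

Because the offsets are added to a UTC start time, a day with 23 or 25 positions should need no special handling; only the conversion back to local time shows the repeated or skipped hour.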