我有 5 分钟间隔的 OHLC 数据。我想将其转换为每小时,但我希望这些值是在整点一刻,即 9:15-10:15、10:15-11:15、11:15-12:15 等之间。 以下代码可以在点钟测量之间完成此操作。如何修改代码以获得我想要的行为?
import pandas as pd
df = pd.read_csv('data.csv')
df['Datetime'] = pd.to_datetime(df['Datetime'])
df.set_index('Datetime', inplace=True)
hourly_data = df.resample('H').agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last', 'Volume': 'sum'})
样本数据.csv:
Datetime,Open,High,Low,Close,Volume
2015-02-03 09:15:00+05:30,200.4,202,199.1,200.6,286228
2015-02-03 09:20:00+05:30,200.6,201.5,200.45,201.15,110749
2015-02-03 09:25:00+05:30,201.2,201.7,201.15,201.65,108342
2015-02-03 09:30:00+05:30,201.75,202.5,201.45,202.2,162976
2015-02-03 09:35:00+05:30,202.2,202.2,201.4,202,118370
2015-02-03 09:40:00+05:30,201.9,202,201.5,201.7,49873
2015-02-03 09:45:00+05:30,201.7,201.85,201.6,201.65,27798
2015-02-03 09:50:00+05:30,201.65,201.7,201.5,201.5,41560
2015-02-03 09:55:00+05:30,201.5,201.75,201.5,201.65,105351
2015-02-03 10:00:00+05:30,201.65,201.9,201.6,201.8,79346
2015-02-03 10:05:00+05:30,201.65,201.9,201.65,201.75,23715
2015-02-03 10:10:00+05:30,201.65,202.85,201.65,202.8,112551
2015-02-03 10:15:00+05:30,202.85,203.2,202.65,202.95,165270
2015-02-03 10:20:00+05:30,202.95,203.25,202.8,203,105614
2015-02-03 10:25:00+05:30,203,203.4,203,203.2,51944
2015-02-03 10:30:00+05:30,203.2,203.8,203,203.75,78474
2015-02-03 10:35:00+05:30,203.75,204.35,203.75,203.75,125514
2015-02-03 10:40:00+05:30,203.75,205,203.75,204.9,249231
2015-02-03 10:45:00+05:30,204.9,206.2,204.6,205.6,388224
2015-02-03 10:50:00+05:30,205.85,205.85,204.7,205,118267
2015-02-03 10:55:00+05:30,205,205.25,204.4,204.9,72310
2015-02-03 11:00:00+05:30,204.9,205.35,204.05,205,94266
2015-02-03 11:05:00+05:30,205,205.4,204.8,205,53333
resample
与 'h'
加上 offset
的 15m
:
hourly_data = df.resample("h", offset="15m").agg(
{"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}
)
Open High Low Close Volume
Datetime
2015-02-03 09:15:00+05:30 200.40 202.85 199.10 202.8 1226859
2015-02-03 10:15:00+05:30 202.85 206.20 202.65 205.0 1502447
基本上有两种选择。
您将
origin
从默认值(时间序列中第一天的午夜)更改为 15 分钟后。如果您的时间序列始终在我们的时间之后 15 分钟开始,这很简单:df.resample('H', origin='start')...
您使用参数
offset
向原点添加 15 分钟:df.resample('H', offset='15min')...