I've read several posts similar to my question, but I still can't figure it out. I have a pandas df that looks like this (spanning multiple days):
Out[1]:
price quantity
time
2016-06-08 09:00:22 32.30 1960.0
2016-06-08 09:00:22 32.30 142.0
2016-06-08 09:00:22 32.30 3857.0
2016-06-08 09:00:22 32.30 1000.0
2016-06-08 09:00:22 32.35 991.0
2016-06-08 09:00:22 32.30 447.0
...
To calculate the VWAP I can do:
df['vwap'] = (np.cumsum(df.quantity * df.price) / np.cumsum(df.quantity))
However, I'd like the calculation to start over every day (groupby), and I can't work out how to make that work with a (lambda?) function.
df['vwap_day'] = df.groupby(df.index.date)['vwap'].apply(lambda ...
Speed is of the essence. Any help would be appreciated :)
Option 0
Plain vanilla approach
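def vwap(df):
    # Work on the underlying NumPy arrays to avoid index alignment overhead
    q = df.quantity.values
    p = df.price.values
    # Cumulative traded value divided by cumulative quantity
    return df.assign(vwap=(p * q).cumsum() / q.cumsum())

# Group by calendar date so the cumulative sums restart each day
df = df.groupby(df.index.date, group_keys=False).apply(vwap)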
df
price quantity vwap
time
2016-06-08 09:00:22 32.30 1960.0 32.300000
2016-06-08 09:00:22 32.30 142.0 32.300000
2016-06-08 09:00:22 32.30 3857.0 32.300000
2016-06-08 09:00:22 32.30 1000.0 32.300000
2016-06-08 09:00:22 32.35 991.0 32.306233
2016-06-08 09:00:22 32.30 447.0 32.305901
Option 1
Adding a little eval
df = df.assign(
    vwap=df.eval(
        'wgtd = price * quantity', inplace=False
    ).groupby(df.index.date).cumsum().eval('wgtd / quantity')
)
df
price quantity vwap
time
2016-06-08 09:00:22 32.30 1960.0 32.300000
2016-06-08 09:00:22 32.30 142.0 32.300000
2016-06-08 09:00:22 32.30 3857.0 32.300000
2016-06-08 09:00:22 32.30 1000.0 32.300000
2016-06-08 09:00:22 32.35 991.0 32.306233
2016-06-08 09:00:22 32.30 447.0 32.305901
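Since the question stresses speed, here is a minimal sketch of the same per-day reset without apply or eval, using two grouped cumulative sums. The sample frame is hypothetical: it reuses the six trades shown above and adds two made-up trades on the following day so that the reset is visible. The column name vwap_day follows the question.

```python
import pandas as pd

# Hypothetical sample: the six trades from the question plus two invented
# trades on the next day, so that the daily reset can be seen.
idx = pd.to_datetime(
    ["2016-06-08 09:00:22"] * 6 + ["2016-06-09 09:00:05"] * 2
)
df = pd.DataFrame(
    {
        "price": [32.30, 32.30, 32.30, 32.30, 32.35, 32.30, 33.00, 34.00],
        "quantity": [1960.0, 142.0, 3857.0, 1000.0, 991.0, 447.0, 100.0, 300.0],
    },
    index=pd.Index(idx, name="time"),
)

# One grouping key per row: the midnight timestamp of that row's day.
day = df.index.normalize()

# Cumulative traded value and cumulative quantity, both restarting each day.
traded_value = df["price"] * df["quantity"]
df["vwap_day"] = (
    traded_value.groupby(day).cumsum() / df["quantity"].groupby(day).cumsum()
)
print(df)
```

Whether this is actually faster than Option 0 or Option 1 depends on the data size and the number of days, so it is worth timing on the real frame.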
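I've used this approach before as well, but it isn't very accurate if you want to limit the window. Instead, I found that the TA Python library works very well: https://technical-analysis-library-in-python.readthedocs.io/en/latest/index.html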
from ta.volume import VolumeWeightedAveragePrice
# ...
def vwap(dataframe, label='vwap', window=3, fillna=True):
    dataframe[label] = VolumeWeightedAveragePrice(
        high=dataframe['high'],
        low=dataframe['low'],
        close=dataframe['close'],
        volume=dataframe['volume'],
        window=window,
        fillna=fillna,
    ).volume_weighted_average_price()
    return dataframe
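To illustrate what limiting the window means, here is a rough sketch of a rolling VWAP in plain pandas, built on the typical price (high + low + close) / 3. The helper name rolling_vwap, the sample bars, and the choice of min_periods=1 are assumptions made for illustration; this is not claimed to reproduce the library's output exactly.

```python
import pandas as pd

def rolling_vwap(frame, window=3):
    """Rolling VWAP over the last `window` bars, based on the typical price."""
    typical = (frame["high"] + frame["low"] + frame["close"]) / 3
    # min_periods=1 (an assumption) yields values before the window is full.
    traded_value = (typical * frame["volume"]).rolling(window, min_periods=1).sum()
    volume = frame["volume"].rolling(window, min_periods=1).sum()
    return traded_value / volume

# Hypothetical bars, chosen so the typical prices are 10, 11, 12 and 13.
bars = pd.DataFrame({
    "high": [11.0, 12.0, 13.0, 14.0],
    "low": [9.0, 10.0, 11.0, 12.0],
    "close": [10.0, 11.0, 12.0, 13.0],
    "volume": [100.0, 200.0, 300.0, 400.0],
})
bars["vwap"] = rolling_vwap(bars, window=3)
print(bars)
```

Unlike the cumulative versions above, the last row only uses the three most recent bars, so the first bar no longer contributes to it.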
I use the HLC3 method for VWAP, and this formula works for me. It adapts a snippet someone shared on this platform, using HLC3 instead of the close.
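import yfinance as yf

# Assumes the download returns flat 'High', 'Low', 'Adj Close' and 'Volume'
# columns; the column layout can differ between yfinance versions.
data = yf.download('AAPL', start='2020-01-01', end='2024-08-15', interval='1d')

def vwap(df):
    # Calculate HLC3 (average of High, Low, and Adj Close)
    hlc3 = (df['High'] + df['Low'] + df['Adj Close']) / 3
    q = df['Volume'].values  # Use 'Volume' column for quantity
    p = hlc3.values  # Use HLC3 for price
    # VWAP calculation using HLC3
    vwap = (p * q).cumsum() / q.cumsum()
    # Assign the calculated VWAP as a new column
    return df.assign(VWAP=vwap)

# Apply the VWAP calculation to the data, restarting on each calendar date.
# Note: with interval='1d' every date holds a single bar, so VWAP equals that
# bar's HLC3; an intraday interval is needed for the reset to matter.
data = data.groupby(data.index.date, group_keys=False).apply(vwap)
data.head()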
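A per-date groupby only has something to accumulate when each date contains several rows. The sketch below shows the HLC3 variant on made-up hourly bars, so it runs without a network download. The prices and volumes are invented, and it uses a plain Close column rather than Adj Close.

```python
import pandas as pd

# Hypothetical hourly bars over two days (invented numbers).
idx = pd.to_datetime([
    "2024-08-13 10:00", "2024-08-13 11:00",
    "2024-08-14 10:00", "2024-08-14 11:00",
])
bars = pd.DataFrame({
    "High": [12.0, 15.0, 21.0, 24.0],
    "Low": [9.0, 12.0, 18.0, 21.0],
    "Close": [9.0, 12.0, 21.0, 24.0],
    "Volume": [100.0, 300.0, 200.0, 200.0],
}, index=idx)

# HLC3 price per bar: 10, 13, 20 and 23 for the rows above.
hlc3 = (bars["High"] + bars["Low"] + bars["Close"]) / 3

# Cumulative sums restart at the first bar of each day.
day = bars.index.normalize()
bars["VWAP"] = (
    (hlc3 * bars["Volume"]).groupby(day).cumsum()
    / bars["Volume"].groupby(day).cumsum()
)
print(bars)
```

The first bar of each day has a VWAP equal to its own HLC3, and later bars in the same day blend in earlier ones by volume.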