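I'm a beginner trying to run a simple causal impact analysis on some stock data. However, every time I try to plot the impact, I get a ValueError like this:
raise ValueError("{point} not present in input data index.".format(
ValueError: 20201019 not present in input data index.
It says it cannot find that date in my data index, even though the date is clearly there when I print the DataFrame index. I have tried everything I could find online and in the documentation without any luck, so I don't know what to do next. Any help would be greatly appreciated.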
import yfinance as yf
from causalimpact import CausalImpact
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller
training_start = '2020-09-01'
training_end = '2020-10-19'
treatment_start = '2020-10-20'
treatment_end = '2020-10-23'
end_stock = '2020-10-24'
y = ['BTC-USD']
y = yf.download(tickers = y,
                start = training_start,
                end = end_stock,
                interval = '1d')
y = y['Adj Close'].rename('y')
stocks = ['ZAL.DE', 'SQ', 'CRSP', 'JD', 'DE', 'KTOS', 'GOOG', 'TPB']
x = yf.download(tickers = stocks,
                start = training_start,
                end = end_stock,
                interval = '1d')
x = x.iloc[:,:len(stocks)]
x.columns = x.columns.droplevel()
x.index = x.index.tz_localize(None)
df = pd.concat([y,x], axis=1).dropna()
df_training = df[df.index <= training_end]
test = adfuller(x = df_training.y)[1]
if test < 0.05:
    print('The time series is stationary')
else:
    print('The time series is not stationary')
differencing = df_training.pct_change().dropna()
test = adfuller(x = differencing.y)[1]
if test < 0.05:
    print('The time series is stationary')
else:
    print('The time series is not stationary')
plt.figure(figsize = (8,6))
sns.set(font_scale = 1.2)
sns.heatmap(differencing.corr(),
            annot = True,
            fmt = '.1g',
            cmap = 'YlOrBr',
            center = True,
            linewidth = 1,
            linecolor = 'black')
#plt.show()
df_final = df.drop(columns = ['ZAL.DE'])
df_final = df.set_index(pd.date_range(start='2020-09-01', periods=len(df_final.index)))
pre_period = [pd.to_datetime(training_start), pd.to_datetime(training_end)]
post_period = [pd.to_datetime(treatment_start), pd.to_datetime(treatment_end)]
impact = CausalImpact(data=df_final, pre_period=pre_period, post_period=post_period)
impact.plot()
ERROR MESSAGE:
Traceback (most recent call last):
File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\Google Causal Impact.py", line 71, in <module>
impact = CausalImpact(data=df_final, pre_period=pre_period, post_period=post_period)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\server\Lib\site-packages\causalimpact\main.py", line 206, in __init__
processed_input = cidata.process_input_data(data, pre_period, post_period,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\server\Lib\site-packages\causalimpact\data.py", line 120, in process_input_data
pre_data, post_data = process_pre_post_data(fmt_data, pre_period, post_period)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\server\Lib\site-packages\causalimpact\data.py", line 266, in process_pre_post_data
checked_pre_period = process_period(pre_period, data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\server\Lib\site-packages\causalimpact\data.py", line 389, in process_period
raise ValueError("{point} not present in input data index.".format(
ValueError: 20201019 not present in input data index.
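One thing that may be worth checking is which index actually reaches CausalImpact. The script prints dates from the merged frame, but `df_final` gets a new index from `set_index(pd.date_range(...))`. The sketch below is only an illustration under stated assumptions: it uses synthetic weekday data instead of the yfinance download, and the names `prices`, `relabelled` and `last_label` are made up for the example. It shows how relabelling weekday rows with consecutive calendar days can push a date such as 2020-10-19 out of the index.

```python
import pandas as pd

# Synthetic stand-in for the merged price data: one row per weekday.
# Assumption: bdate_range ignores exchange holidays, so real row counts may differ.
trading_days = pd.bdate_range(start="2020-09-01", end="2020-10-23")
prices = pd.DataFrame({"y": range(len(trading_days))}, index=trading_days)

# Same relabelling pattern as in the script above:
# consecutive calendar days counted from the start date, one per row.
relabelled = prices.set_index(
    pd.date_range(start="2020-09-01", periods=len(prices.index))
)

training_end = pd.Timestamp("2020-10-19")
in_original = training_end in prices.index        # membership in the weekday index
in_relabelled = training_end in relabelled.index  # membership after relabelling
last_label = relabelled.index.max()               # last date the new index reaches

print("rows:", len(prices))
print("training_end in original index:", in_original)
print("training_end in relabelled index:", in_relabelled)
print("last relabelled date:", last_label.date())
```

If the same membership check comes back False for the real `df_final.index`, comparing `df.index` with `df_final.index` side by side may show where the dates diverge.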