Causal impact analysis error: ValueError: 20201019 not present in input data index


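I'm a beginner trying to run a simple causal impact analysis on some stock data. However, every time I try to plot the impact, I get a ValueError like the one below: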

raise ValueError("{point} not present in input data index.".format(
ValueError: 20201019 not present in input data index.

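It says it cannot find the date above in my data index, even though the date is clearly there when I print the DataFrame's index. I have tried everything I could find online and in the documentation, without any luck, so I don't know what to do next. Any help would be greatly appreciated.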

import yfinance as yf
from causalimpact import CausalImpact
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

training_start = '2020-09-01'
training_end = '2020-10-19'
treatment_start = '2020-10-20'
treatment_end = '2020-10-23'
end_stock = '2020-10-24'

y = ['BTC-USD']
y = yf.download(tickers = y,
                start = training_start,
                end = end_stock,
                interval = '1d')
y = y['Adj Close'].rename('y')

stocks = ['ZAL.DE', 'SQ', 'CRSP', 'JD', 'DE', 'KTOS', 'GOOG', 'TPB']
x = yf.download(tickers = stocks,
                start = training_start,
                end = end_stock,
                interval = '1d')
x = x.iloc[:,:len(stocks)]

x.columns = x.columns.droplevel()

x.index = x.index.tz_localize(None)

df = pd.concat([y,x], axis=1).dropna()

df_training = df[df.index <= training_end]

test = adfuller(x = df_training.y)[1]

if test < 0.05:
    print('The time series is stationary')
else:
    print('The time series is not stationary')

differencing = df_training.pct_change().dropna()

test = adfuller(x = differencing.y)[1]

if test < 0.05:
    print('The time series is stationary')
else:
    print('The time series is not stationary')


plt.figure(figsize = (8,6))
sns.set(font_scale = 1.2)
sns.heatmap(differencing.corr(),
            annot = True,
            fmt = '.1g',
            cmap = 'YlOrBr',
            center =  True,
            linewidth = 1,
            linecolor = 'black')
#plt.show()

df_final = df.drop(columns = ['ZAL.DE'])


df_final = df.set_index(pd.date_range(start='2020-09-01', periods=len(df_final.index)))
pre_period = [pd.to_datetime(training_start), pd.to_datetime(training_end)]
post_period = [pd.to_datetime(treatment_start), pd.to_datetime(treatment_end)]

impact = CausalImpact(data=df_final, pre_period=pre_period, post_period=post_period)
impact.plot()


ERROR MESSAGE:

Traceback (most recent call last):
  File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\Google Causal Impact.py", line 71, in <module>
    impact = CausalImpact(data=df_final, pre_period=pre_period, post_period=post_period)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\server\Lib\site-packages\causalimpact\main.py", line 206, in __init__
    processed_input = cidata.process_input_data(data, pre_period, post_period,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\server\Lib\site-packages\causalimpact\data.py", line 120, in process_input_data
    pre_data, post_data = process_pre_post_data(fmt_data, pre_period, post_period)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\server\Lib\site-packages\causalimpact\data.py", line 266, in process_pre_post_data
    checked_pre_period = process_period(pre_period, data)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\MGerdes\PycharmProjects\EconometricsCausalInference\server\Lib\site-packages\causalimpact\data.py", line 389, in process_period
    raise ValueError("{point} not present in input data index.".format(
ValueError: 20201019 not present in input data index.
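
A possible cause, judging only from the code above and not verified against the actual download: after `dropna()` the merged frame has one row per common trading day, but the `set_index(pd.date_range(start='2020-09-01', periods=...))` step relabels those rows with consecutive calendar days. With fewer rows than calendar days, the new index would end well before 2020-10-19. Printing `df.index` would still show the date, while the relabelled `df_final.index` that is passed to `CausalImpact` would not. The sketch below reproduces this with synthetic business-day data, which is an assumption standing in for the yfinance download (real calendars also skip holidays). `missing_points` is a hypothetical helper, not part of any library.

```python
import pandas as pd

# Synthetic stand-in for the merged price table: one row per business day.
# Assumption: real data would also be missing exchange holidays.
trading_days = pd.bdate_range(start="2020-09-01", end="2020-10-23")
df = pd.DataFrame({"y": range(len(trading_days))}, index=trading_days)

# The re-indexing step from the question: one consecutive calendar day
# per row, starting on 2020-09-01.
relabelled = df.set_index(
    pd.date_range(start="2020-09-01", periods=len(df.index))
)

training_end = pd.Timestamp("2020-10-19")

print("rows:", len(df))
print("last date, original index:  ", df.index.max().date())
print("last date, relabelled index:", relabelled.index.max().date())
print("training_end in original:  ", training_end in df.index)
print("training_end in relabelled:", training_end in relabelled.index)


def missing_points(index, points):
    """Hypothetical helper: list the period boundaries absent from an index."""
    return [p for p in points if p not in index]


boundaries = [pd.Timestamp(d) for d in
              ("2020-09-01", "2020-10-19", "2020-10-20", "2020-10-23")]
print("missing from relabelled:", missing_points(relabelled.index, boundaries))
```

If this is what is happening, removing the `set_index` call and keeping the real trading-day index would leave 2020-10-19 in place. Whether the package then accepts an index with weekend gaps is something I have not checked. Note also that the call is made on `df` rather than on `df_final`, so the earlier `drop(columns=['ZAL.DE'])` is discarded.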


python pycharm