LinearRegression using the last n rows as input

Question

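I'm currently working on a time series model. Put simply, I feed in the last row of OHLC (open, high, low, close) values and try to predict the next closing price. Simple and useless. What I'd like to do instead is feed in the last 10 days to predict tomorrow's price. I know it will be inaccurate, but that's what I'm trying to do.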

Here is how I build NextClose and apply it to a linear regression model:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

df = pd.read_csv("./EURUSD.csv")
days = 1
df['NextClose'] = df['Close'].shift(-days)
df = df.dropna()
total = len(df)
test_ratio = 0.30
test_size = int(total * test_ratio)

X = df[['Open', 'High', 'Low', 'Close']]
y = df[['NextClose']]
#build test and train data
X_train = X[:-test_size]
y_train = y[:-test_size]
X_test = X[-test_size:]
y_test = y[-test_size:]
# build model
lr = LinearRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)
plt.scatter(y_pred, y_test)
plt.show()

In this case I'm only feeding in a single row (the last one). What I want is to feed in the last 10 to 20 rows.

python pandas scikit-learn linear-regression
1 Answer

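I believe this is a data transformation similar to the one described in the function create_dataset on this MachineLearningMastery page (see the section "LSTM for Regression Using the Window Method").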

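The goal is to use the data in rows t:(t + days), that is rows t through t + days - 1, to predict the closing price of row t + days, the row immediately after the window.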

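Each row of the X_train matrix has days * X.shape[1] columns. With days = 10 and the four OHLC columns, each row holds 40 values: 10 days of data flattened into one vector.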
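As a quick sanity check of that shape, here is a minimal sketch on a toy frame. The frame `toy` and the window length `look_back = 2` are made up for illustration and are not part of the original data; the values are chosen only so that each cell is easy to trace.

```python
import numpy as np
import pandas as pd

# Toy OHLC frame (hypothetical): 5 rows holding the integers 0..19.
toy = pd.DataFrame(
    np.arange(20).reshape(5, 4),
    columns=['Open', 'High', 'Low', 'Close'],
)

look_back = 2  # small window length, chosen only for readability

# One flattened window per starting row i: rows i .. i + look_back - 1.
windows = np.array([
    toy.iloc[i:i + look_back].values.flatten()
    for i in range(len(toy) - look_back)
])

# The target for window i is the Close of the row right after the window.
targets = toy['Close'].values[look_back:]

print(windows.shape)  # (3, 8), i.e. look_back * 4 columns per sample
print(windows[0])     # [0 1 2 3 4 5 6 7]
print(targets)        # [11 15 19]
```

Each sample has look_back * 4 columns, and the first target (11) is the Close of the third row, the one that follows the first two-row window.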

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# generate random data to test
df = pd.DataFrame(np.random.normal(size=(2000, 4)))
df.columns = ['Open', 'High', 'Low', 'Close']

days = 10
df = df.dropna()

total = len(df)
test_ratio = 0.30
test_size = int(total * test_ratio)
X = df[['Open', 'High', 'Low', 'Close']]
y = df['Close'].shift(-days)

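# This function is based on the MachineLearningMastery page mentioned above.
# Each sample flattens look_back consecutive rows of X into one feature vector.
# y is expected to be shifted by -look_back already, so y.iloc[i] is the
# close of the row right after the window that starts at row i.
def create_dataset(X, y, look_back=1):
    dataX, dataY = [], []
    for i in range(X.shape[0] - look_back):
        a = X.iloc[i:(i + look_back), :].values.flatten()
        dataX.append(a)
        dataY.append(y.iloc[i])

    return np.array(dataX), np.array(dataY)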

#build test and train data
X_train, y_train = create_dataset(X[:-test_size], y[:-test_size], look_back=days)
X_test, y_test = create_dataset(X[-test_size:], y[-test_size:], look_back=days)
# build model
lr = LinearRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)
plt.scatter(y_pred, y_test)
plt.show()
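
The code above evaluates the model on held-out windows. To get a single forecast for tomorrow's close from the last `days` rows, something like the following sketch might work. It is an illustration under assumptions, not part of the original answer: it uses random stand-in data, fits on all available windows for brevity (keep the train/test split when evaluating), and the names `X_all`, `y_all`, `latest_window` and `next_close` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Random stand-in data, as in the answer; replace with the real OHLC frame.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(500, 4)),
                  columns=['Open', 'High', 'Low', 'Close'])

days = 10
values = df[['Open', 'High', 'Low', 'Close']].values
close = df['Close'].values

# Window i covers rows i .. i + days - 1; its target is the close of row i + days.
X_all = np.array([values[i:i + days].flatten()
                  for i in range(len(df) - days)])
y_all = close[days:]

lr = LinearRegression().fit(X_all, y_all)

# The most recent `days` rows have no known target yet: that is "tomorrow".
latest_window = values[-days:].flatten().reshape(1, -1)
next_close = lr.predict(latest_window)[0]
print(next_close)
```

The last window is reshaped to (1, days * 4) because predict expects a 2D array with one row per sample.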