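I am working on a prediction problem. Here is an Excel spreadsheet with several columns of data:
https://drive.google.com/file/d/1fWf6dX8kOCRB3GpX42AF6UvTmd0g9zXp/view?usp=sharing
I am trying to predict the values in column F from the values in columns A to E. The code is given below: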
import numpy as np
import pandas as pn
from keras.layers import Dense, Activation
from keras.models import Sequential
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import linear_model
import matplotlib.pyplot as plt
dataset = pn.read_excel(r"G:\Machine learning\data\database.xlsx", "Sheet5")
dataset.columns = ['A','B','C','D','E','F']
print (dataset)
#check= dataset.iloc[0:,3 :13]
X = dataset.iloc[0:,0 :5]
print(X)
Y = dataset.iloc[0:, 5 :6]
print(Y)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.15, random_state = 0)
print(X_test)
print(Y_test)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
#
model = Sequential()
##
### Adding the input layer and the first hidden layer
model.add(Dense(32, activation = 'relu', input_dim = 5, kernel_initializer='normal'))
##
### Adding the second hidden layer
model.add(Dense(units = 16, activation = 'relu'))
model.add(Dense(units = 64, activation = 'relu'))
model.add(Dense(units = 8, activation = 'relu'))
model.add(Dense(units = 16, activation = 'relu'))
#model.add(Dense(units = 8, activation = 'linear'))
###
#### Adding the third hidden layer
#model.add(Dense(units = 16, activation = 'relu'))
#model.add(Dense(units = 16, activation = 'relu'))
#model.add(Dense(units = 16, activation = 'relu'))
##
### Adding the output layer
model.add(Dense(units = 1))
##
model.add(Dense(units = 1))
##
model.add(Dense(1))
### Compiling the ANN
model.compile(optimizer = 'nadam', loss = 'mean_squared_error',metrics= ['accuracy'])
##
### Fitting the ANN to the Training set
history = model.fit(X_train, Y_train, epochs=125, batch_size=5, verbose=1, validation_split=0.1)
##
y_pred = model.predict(X_test)
##
y_pred1 = model.predict(X_train)
print (y_pred1)
Y_test.reset_index(drop= True, inplace= True)
print (Y_train)
Y_train.reset_index(drop= True, inplace= True)
plt.plot(y_pred1)
plt.plot(Y_train)
plt.show()
plt.plot(y_pred)
plt.plot(Y_test)
plt.show()
print (y_pred)
print (Y_test)
plt.plot((Y_test-y_pred)*100/Y_test)
plt.show()
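The fit I get from this code is shown below. [Figure: Fit] Now, when I predict, the error is large in some cases, as shown below. [Figure: Prediction]
Can anyone guide me on how to improve the code to get better predictions?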

For a dataset as small as the one you provided, your neural network may simply be too complex (five hidden layers with up to 64 neurons each).
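To put a rough number on "too complex", the trainable parameters of a stack of fully connected layers can be tallied by hand. The sketch below is a minimal illustration, assuming the layer widths read off the question's code (5 inputs, hidden layers of 32, 16, 64, 8 and 16 units, then three single-unit layers). `count_dense_params` and the `small_net` layout are hypothetical, introduced only for comparison; they are not part of Keras.

```python
def count_dense_params(layer_sizes):
    """Count weights and biases in a stack of fully connected (Dense) layers.

    layer_sizes[0] is the number of input features; every later entry is the
    number of units in one Dense layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


# Layer widths taken from the model.add(...) calls in the question.
question_net = [5, 32, 16, 64, 8, 16, 1, 1, 1]
# A much smaller alternative (an assumption, chosen only for illustration).
small_net = [5, 8, 1]

print(count_dense_params(question_net))  # 2493
print(count_dense_params(small_net))     # 57
```

If the spreadsheet holds only a few hundred rows, a few thousand parameters give the network room to memorize the training set, which would be consistent with a good training fit and large errors on some test points.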
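You can try reducing the number of layers by hand and see whether the prediction accuracy improves. In the future, though, if you want a more systematic way to tune hyperparameters or optimize the model, you should consider methods such as random or grid search, cross-validation, and regularization.
Random search: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html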
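As a hedged illustration of how random search, cross-validation, and regularization fit together, here is a minimal sketch. It makes several assumptions: it swaps the Keras model for scikit-learn's `MLPRegressor` so that `RandomizedSearchCV` can be used directly, it uses synthetic data in place of columns A to E and F, and the candidate layer sizes and `alpha` values (the L2 regularization strength) are arbitrary choices, not recommendations.

```python
import numpy as np
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for columns A-E (X) and column F (y); replace with real data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.5, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipeline = make_pipeline(
    StandardScaler(),
    MLPRegressor(max_iter=1000, random_state=0),
)

# Candidate architectures (all far smaller than the original) and L2 penalties.
param_distributions = {
    "mlpregressor__hidden_layer_sizes": [(4,), (8,), (16,), (8, 4), (16, 8)],
    "mlpregressor__alpha": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1],
}

search = RandomizedSearchCV(
    pipeline,
    param_distributions,
    n_iter=6,                          # number of random configurations to try
    cv=5,                              # 5-fold cross-validation on the training set
    scoring="neg_mean_squared_error",  # regression metric; higher (closer to 0) is better
    random_state=0,
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Held-out negative MSE:", search.score(X_test, y_test))
```

The test split is touched only once, after the search has picked a configuration, so the final score is not biased by the tuning itself. The same idea should carry over to a Keras model through a scikit-learn compatible wrapper, but that wiring is not shown here.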