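There is a problem with this linear regression prediction model. The scatter plot shows data points with negative values even though the dataset contains no negative values. I have checked the shape and the minimum values, so the plot should not show these negatives, but I cannot work out why the scatter plot suggests they exist.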
Code for the metric definitions:
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def evaluate_model(y_test, price_pred):
    # Note: this reads the module-level `price_linear`, not a model passed in as an argument
    gradient = price_linear.coef_
    intercept = price_linear.intercept_
    mile_mae = mean_absolute_error(y_test, price_pred)
    mile_mse = mean_squared_error(y_test, price_pred)
    mile_rmse = np.sqrt(mile_mse)
    mile_r2 = r2_score(y_test, price_pred)
    print(f'Gradient: {gradient} \n Intercept: {intercept}')
    print(f'\n Mean absolute error: {mile_mae} \n Mean squared error: {mile_mse} \n Root mean squared error: {mile_rmse} \n Coefficient of determination: {mile_r2}')
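Code for the linear regression model:
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
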
numerical_inputs = ['Mileage', 'Year of manufacture', 'Engine size']
x = df[numerical_inputs]
y = df['Price']
#splitting of the data
x_num_train, x_num_test, y_price_train, y_price_test = train_test_split(x, y, test_size=0.2, random_state=42)
#scaling the numerical data
scale = StandardScaler()
#fitting only to train data to prevent data leakage
scale.fit(x_num_train)
num_train_scaled = scale.transform(x_num_train)
num_test_scaled = scale.transform(x_num_test)
multi_price_linear = LinearRegression()
multi_price_linear.fit(num_train_scaled, y_price_train)
multi_price_pred = multi_price_linear.predict(num_test_scaled)
evaluate_model(y_price_test, multi_price_pred)
#plt.show()
plt.figure(figsize=(14,8))
plt.scatter(y_price_test, multi_price_pred, alpha = 0.6)
plt.plot([min(y_price_test), max(y_price_test)], [min(y_price_test), max(y_price_test)], color='red')
plt.ylabel('Actual Price')
plt.xlabel('Predicted Price')
plt.title('Predicted Price vs Actual Price')
plt.show()
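The output is as follows:
Gradient: [-2720.41736808 9520.41488938 6594.02448017]
Intercept: 13854.628699999997
Mean absolute error: 6091.458141656242
Mean squared error: 89158615.76017143
Root mean squared error: 9442.38400829851
Coefficient of determination: 0.671456306417368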
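This is an image of what the scatter plot shows:
I don't want to simply restrict the plot from showing negative values if this points to a problem with the data or the code. Thanks!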
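You accidentally swapped the axis labels. plt.scatter(y_price_test, multi_price_pred) puts the actual values on the X axis and the predicted values on the Y axis, so the labels should be plt.xlabel('Actual Price') and plt.ylabel('Predicted Price'). The negative points are therefore predictions, not data: a linear regression is not constrained to positive outputs, so it can predict a negative price even when every price in the dataset is positive.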
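To illustrate why negative values can appear on the prediction axis, here is a minimal sketch using made-up data rather than the asker's dataset. The `mileage` and `price` arrays are assumptions chosen for illustration (strictly positive prices that decay with mileage), and `np.polyfit` stands in for scikit-learn's `LinearRegression` to keep the example dependency-light.

```python
import numpy as np

# Synthetic, strictly positive "prices" that decay with mileage (illustrative data only)
mileage = np.linspace(0, 200_000, 50)
price = 30_000 * np.exp(-mileage / 40_000)

# Ordinary least-squares straight line: price ~ slope * mileage + intercept
# (np.polyfit returns coefficients from highest degree to lowest)
slope, intercept = np.polyfit(mileage, price, 1)
predicted = slope * mileage + intercept

# The targets never go below zero, but the fitted line does at high mileage
print(f"Smallest actual price:    {price.min():.2f}")
print(f"Smallest predicted price: {predicted.min():.2f}")
```

The straight line has to undershoot the curved data at high mileage, so some predictions fall below zero although no actual price does. If negative predictions are undesirable, one common option is to fit the model to a transformed target such as the logarithm of the price, though whether that suits this dataset is a separate question.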