Why am I getting this error? Found input variables with inconsistent numbers of samples: [1, 15]

Question · 0 votes · 3 answers

I am trying to solve the following problem, but I keep getting an error.

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics.regression import r2_score
import numpy as np

degrees = np.arange(0, 9)
np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10
for i in degrees:
    poly = PolynomialFeatures(i)
    x_poly = poly.fit_transform(x)
    X_train, X_test, y_train, y_test = train_test_split(x_poly, y, random_state = 0)
    linreg = LinearRegression().fit(X_train, y_train)
    r2_train = linreg.r2_score(X_train, y_train)
    r2_test = linreg.r2_train(X_test, y_test)

Found input variables with inconsistent numbers of samples: [1, 15]

What is causing this error?

python scikit-learn linear-regression
3 Answers
0 votes

There are three errors in the code:

  1. You need to reshape x into a 2D numpy array using x.reshape(-1,1).
  2. linreg.r2_score is not valid. There is also no need to use r2_score here; just use linreg.score, which returns the coefficient of determination R^2 of the prediction (reference).
  3. With degree 0 the expansion is just a constant column, so the training r2_score will be 0; use PolynomialFeatures(i+1) inside the loop unless you really intend to use a degree-0 polynomial expansion. Keep in mind that if an input sample is two-dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2]; see the short check after this list.
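
A quick way to see that expansion for yourself (a small standalone sketch, separate from the fix below):

from sklearn.preprocessing import PolynomialFeatures
import numpy as np

# One 2-D sample [a, b] = [2, 3]; its degree-2 features are [1, a, b, a^2, ab, b^2]
sample = np.array([[2.0, 3.0]])
print(PolynomialFeatures(2).fit_transform(sample))
# [[1. 2. 3. 4. 6. 9.]]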

Complete working example:

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import r2_score   # public import path; not strictly needed below, since linreg.score is used
import numpy as np
from sklearn.model_selection import train_test_split


degrees = np.arange(0, 9)
np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10
for i in degrees:
    poly = PolynomialFeatures(i+1)                # i+1 fits degrees 1..9 instead of starting at degree 0
    x_poly = poly.fit_transform(x.reshape(-1,1))  # reshape to (n_samples, 1): each value is one sample with one feature
    X_train, X_test, y_train, y_test = train_test_split(x_poly, y, random_state = 0)
    linreg = LinearRegression().fit(X_train, y_train)
    r2_train = linreg.score(X_train, y_train)
    r2_test = linreg.score(X_test, y_test)

0 votes

You did not reshape x; x should have the shape (n_samples, n_features). Also, linreg.r2_score does not exist. I have modified the code below; a quick shape check follows it.

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split
import numpy as np

degrees = np.arange(0, 9)
np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10

x = x.reshape(-1, 1)

for i in degrees:
    poly = PolynomialFeatures(i)
    x_poly = poly.fit_transform(x)
    X_train, X_test, y_train, y_test = train_test_split(x_poly, y, random_state = 0)
    linreg = LinearRegression().fit(X_train, y_train)
    r2_train = linreg.score(X_train, y_train)
    r2_test = linreg.score(X_test, y_test)
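
For reference, a quick shape check (a small standalone sketch) showing where the [1, 15] mismatch came from: the 1-D x was apparently treated as a single sample with 15 features, while y has 15 samples.

import numpy as np

x = np.linspace(0, 10, 15)
print(x.shape)                 # (15,)  -- 1-D; older scikit-learn versions interpret this as 1 sample with 15 features
print(x.reshape(-1, 1).shape)  # (15, 1) -- 2-D; 15 samples with 1 feature each, matching the 15 values in y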

-1 votes

Your code has quite a few errors and typos. It would help to first practice on some well-known problems, such as iris or a house-price regression problem.

Corrected code:

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import r2_score   # public import path for r2_score
from sklearn.model_selection import train_test_split
import numpy as np

degrees = np.arange(0, 9)
np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10

#### convert x into 2D matrix  #####
x= x.reshape(-1,1)

for i in degrees:
  poly = PolynomialFeatures(i)
  x_poly = poly.fit_transform(x)
  X_train, X_test, y_train, y_test = train_test_split(x_poly, y, random_state = 0)
  linreg = LinearRegression().fit(X_train, y_train)
  r2_train = r2_score(y_train,linreg.predict(X_train))
  r2_test = r2_score(y_test ,linreg.predict(X_test))

#### linreg.score(X_train, y_train) can also be used to calculate r2_score
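
To confirm the two agree, a minimal check (a sketch that simply continues the snippet above, reusing linreg, X_test and y_test from the last loop iteration):

print(np.isclose(r2_score(y_test, linreg.predict(X_test)),
                 linreg.score(X_test, y_test)))   # True: both compute the same R^2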