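I have written some code for a linear regression model to predict house prices. I followed the tutorial video exactly. With random_state=42 the code runs without any error, but when I change random_state to any other number, it raises the error below.
Here is the code: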
from sklearn.model_selection import train_test_split
X = data.drop('SalesPrice', axis = 1)
y = data['SalesPrice']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X_train, y_train)
predictions = lr.predict(X_test)
print("Actual value of the house: ", y_test[0])
print("Model prediction value: ", predictions[0])
Here is the error:
KeyError Traceback (most recent call last)
File C:\ProgramData\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:3653, in Index.get_loc(self, key)
3652 try:
-> 3653 return self._engine.get_loc(casted_key)
3654 except KeyError as err:
File C:\ProgramData\anaconda3\Lib\site-packages\pandas\_libs\index.pyx:147, in pandas._libs.index.IndexEngine.get_loc()
File C:\ProgramData\anaconda3\Lib\site-packages\pandas\_libs\index.pyx:176, in pandas._libs.index.IndexEngine.get_loc()
File pandas\_libs\hashtable_class_helper.pxi:2606, in pandas._libs.hashtable.Int64HashTable.get_item()
File pandas\_libs\hashtable_class_helper.pxi:2630, in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
Cell In[66], line 3
1 predictions = lr.predict(X_test)
----> 3 print("Actual value of the house: ", y_test[0])
4 print("Model prediction value: ", predictions[0])
File C:\ProgramData\anaconda3\Lib\site-packages\pandas\core\series.py:1007, in Series.__getitem__(self, key)
1004 return self._values[key]
1006 elif key_is_scalar:
-> 1007 return self._get_value(key)
1009 if is_hashable(key):
1010 # Otherwise index.get_value will raise InvalidIndexError
1011 try:
1012 # For labels that don't resolve as scalars like tuples and frozensets
File C:\ProgramData\anaconda3\Lib\site-packages\pandas\core\series.py:1116, in Series._get_value(self, label, takeable)
1113 return self._values[label]
1115 # Similar to Index.get_value, but we do not fall back to positional
-> 1116 loc = self.index.get_loc(label)
1118 if is_integer(loc):
1119 return self._values[loc]
File C:\ProgramData\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:3655, in Index.get_loc(self, key)
3653 return self._engine.get_loc(casted_key)
3654 except KeyError as err:
-> 3655 raise KeyError(key) from err
3656 except TypeError:
3657 # If we have a listlike key, _check_indexing_error will raise
3658 # InvalidIndexError. Otherwise we fall through and re-raise
3659 # the TypeError.
3660 self._check_indexing_error(key)
KeyError: 0
As the traceback shows, the error originates from:
print("Actual value of the house: ", y_test[0])
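y_test[0] looks up the row whose index label is 0, not the first row. train_test_split keeps the original index labels, so y_test[0] only works if the row labeled 0 happens to land in the random 20% of the data that becomes the test set. That is why it works for some random_state values but not for most.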
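To make the label-versus-position point concrete, here is a minimal sketch. The prices and the index labels 7, 2, 9 are made up for illustration; they only mimic the kind of shuffled index a split leaves behind.

```python
import pandas as pd

# Hypothetical test-set target: three rows whose original labels were 7, 2 and 9.
y_test = pd.Series([250000, 180000, 320000], index=[7, 2, 9], name="SalesPrice")

# No row carries the label 0, so a label lookup for 0 has nothing to find.
has_label_zero = 0 in y_test.index

try:
    y_test[0]               # label-based lookup on an integer index
    lookup_failed = False
except KeyError:
    lookup_failed = True    # same KeyError: 0 as in the traceback

print(has_label_zero, lookup_failed)
```

If the label 0 had been among the index values, the same lookup would have succeeded, which matches the behavior seen with random_state=42.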
Usually you want positional access instead, using either of:
y_test.to_list()[0]
y_test.iloc[0]
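As a rough check that positional access does not depend on the seed, the sketch below repeats the split with a few arbitrary random_state values. The small data frame and its Area column are hypothetical stand-ins for the question's data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the question's `data` frame.
data = pd.DataFrame({
    "Area": [50, 60, 70, 80, 90, 100, 110, 120, 130, 140],
    "SalesPrice": [100, 120, 140, 160, 180, 200, 220, 240, 260, 280],
})
X = data.drop("SalesPrice", axis=1)
y = data["SalesPrice"]

first_values = []
for seed in (0, 1, 7):  # arbitrary seeds, chosen only for illustration
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # Both forms read the first row by position, whatever its label is.
    first_values.append((y_test.iloc[0], y_test.to_list()[0]))

print(first_values)
```

Both expressions return the same value for every seed. Since lr.predict returns a NumPy array, predictions[0] is already positional, so y_test.iloc[0] lines up with it.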
TL;DR: replace y_test[0] in the print statement with y_test.iloc[0].