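I have downloaded and labeled the data from http://archive.ics.uci.edu/ml/datasets/pamap2+physical+activity+monitoring
My task is to gain insight into the given data. I have 34 attributes in the DataFrame (all with no NaN values)
and I want to train a model that predicts one target attribute, 'heart_rate', from the remaining attributes (all of them are measurements of a participant performing various activities).
I wanted to use a linear regression model, but for some reason I can't use my DataFrame with it. That said, I don't mind starting from scratch if you think I'm doing it wrong.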
My DataFrame columns:
> Index(['timestamp', 'activity_ID', 'heart_rate', 'IMU_hand_temp',
>        'hand_acceleration_16_1', 'hand_acceleration_16_2',
>        'hand_acceleration_16_3', 'hand_gyroscope_rad_7',
>        'hand_gyroscope_rad_8', 'hand_gyroscope_rad_9',
>        'hand_magnetometer_μT_10', 'hand_magnetometer_μT_11',
>        'hand_magnetometer_μT_12', 'IMU_chest_temp', 'chest_acceleration_16_1',
>        'chest_acceleration_16_2', 'chest_acceleration_16_3',
>        'chest_gyroscope_rad_7', 'chest_gyroscope_rad_8',
>        'chest_gyroscope_rad_9', 'chest_magnetometer_μT_10',
>        'chest_magnetometer_μT_11', 'chest_magnetometer_μT_12',
>        'IMU_ankle_temp', 'ankle_acceleration_16_1', 'ankle_acceleration_16_2',
>        'ankle_acceleration_16_3', 'ankle_gyroscope_rad_7',
>        'ankle_gyroscope_rad_8', 'ankle_gyroscope_rad_9',
>        'ankle_magnetometer_μT_10', 'ankle_magnetometer_μT_11',
>        'ankle_magnetometer_μT_12', 'Intensity'],
>       dtype='object')
First 5 rows:
      timestamp activity_ID heart_rate IMU_hand_temp hand_acceleration_16_1 hand_acceleration_16_2 hand_acceleration_16_3 hand_gyroscope_rad_7 hand_gyroscope_rad_8 hand_gyroscope_rad_9 ... ankle_acceleration_16_1 ankle_acceleration_16_2 ankle_acceleration_16_3 ankle_gyroscope_rad_7 ankle_gyroscope_rad_8 ankle_gyroscope_rad_9 ankle_magnetometer_μT_10 ankle_magnetometer_μT_11 ankle_magnetometer_μT_12 Intensity
2928  37.66 lying 100.0 30.375 2.21530 8.27915 5.58753 -0.004750 0.037579 -0.011145 ... 9.73855 -1.84761 0.095156 0.002908 -0.027714 0.001752 -61.1081 -36.8636 -58.3696 low
2929  37.67 lying 100.0 30.375 2.29196 7.67288 5.74467 -0.171710 0.025479 -0.009538 ... 9.69762 -1.88438 -0.020804 0.020882 0.000945 0.006007 -60.8916 -36.3197 -58.3656 low
2930  37.68 lying 100.0 30.375 2.29090 7.14240 5.82342 -0.238241 0.011214 0.000831 ... 9.69633 -1.92203 -0.059173 -0.035392 -0.052422 -0.004882 -60.3407 -35.7842 -58.6119 low
2931  37.69 lying 100.0 30.375 2.21800 7.14365 5.89930 -0.192912 0.019053 0.013374 ... 9.66370 -1.84714 0.094385 -0.032514 -0.018844 0.026950 -60.7646 -37.1028 -57.8799 low
2932  37.70 lying 100.0 30.375 2.30106 7.25857 6.09259 -0.069961 -0.018328 0.004582 ... 9.77578 -1.88582 0.095775 0.001351 -0.048878 -0.006328 -60.2040 -37.1225 -57.8847 low
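If you check the timestamp attribute, you'll see that the data was acquired at millisecond-scale intervals (the rows above are 0.01 apart), so it might be a good idea to take only one row every 2-5 seconds from this DataFrame and train the model on that.
Also, as an option, I would like to use one of the following models for this task: linear, polynomial, multiple linear, agglomerative clustering, or k-means clustering.
My code: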
target = subject1.DataFrame(data.target, columns=["heart_rate"])
X = df
y = target["heart_rate"]
lm = linear_model.LinearRegression()
model = lm.fit(X,y)
predictions = lm.predict(X)
print(predictions)[0:5]
Error:
AttributeError                            Traceback (most recent call last)
<ipython-input-93-b0c3faad3a98> in <module>()
      3 #heart_rate
      4 # Put the target (housing value -- MEDV) in another DataFrame
----> 5 target = subject1.DataFrame(data.target, columns=["heart_rate"])

c:\python36\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5177             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5178                 return self[name]
-> 5179             return object.__getattribute__(self, name)
   5180
   5181     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'DataFrame'
To fix the error I tried:
subject1.columns = subject1.columns.str.strip()
But it still didn't work.
Thanks, and sorry if I wasn't precise enough.
Try this:
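The traceback says that `subject1` is already a DataFrame, so `subject1.DataFrame(...)` is looked up as an attribute or column named `DataFrame`, which does not exist. The constructor lives on the `pandas` module, and since `heart_rate` is already a column there is no need to build a second frame from `data.target` (that line appears to come from a housing-data tutorial, judging by the MEDV comment). Stripping the column names does not help, because whitespace is not the problem. Below is a minimal sketch of one way to set this up. It runs on a small synthetic stand-in for `subject1` with made-up values and only a few of the columns, and dropping the text columns via `select_dtypes` is an assumption about what you want as features, not the only option.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the asker's `subject1` (values are made up, an assumption).
rng = np.random.default_rng(0)
n = 200
subject1 = pd.DataFrame({
    "timestamp": np.arange(n) * 0.01,
    "activity_ID": ["lying"] * n,
    "IMU_hand_temp": rng.normal(30.0, 0.5, n),
    "hand_acceleration_16_1": rng.normal(2.0, 1.0, n),
    "Intensity": ["low"] * n,
})
# Made-up linear relation so the fit has something to recover.
subject1["heart_rate"] = (
    60.0
    + 1.5 * subject1["IMU_hand_temp"]
    + 2.0 * subject1["hand_acceleration_16_1"]
)

# The target is simply a column of the existing DataFrame.
y = subject1["heart_rate"]

# Features: everything except the target, numeric columns only.
# Text columns such as activity_ID and Intensity would need encoding
# before LinearRegression could use them, so they are left out here.
X = subject1.drop(columns=["heart_rate"]).select_dtypes(include="number")

lm = LinearRegression()
lm.fit(X, y)
predictions = lm.predict(X)

# Slice the array before printing; print() itself returns None.
print(predictions[0:5])
```

Two further notes. In your original code, `print(predictions)[0:5]` would fail next, because it tries to slice the `None` returned by `print`. And if the rows really are evenly spaced 0.01 s apart, as the first five suggest, then `subject1.iloc[::200]` would keep one row every 2 seconds before fitting. This sketch predicts on the same rows it was trained on, so it says nothing yet about how well the model generalizes.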