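I need to plot a ROC curve for a binary classification problem, but the predictor has to be supplied as a numeric or ordered vector. Because I have already done the classification, my predictions are a factor with levels (0, 1).
Is there any way to work around this?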
> rfCarseats
Call:
 randomForest(formula = Salesdic ~ ., data = train_Carseats, proximity = TRUE)
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 3
        OOB estimate of  error rate: 20%
Confusion matrix:
    0  1 class.error
0 153 17   0.1000000
1  39 71   0.3545455
> prediction_rf_Carseats
2 3 4 6 10 13 15 19 24 28 32 45 46 52 54 56 60 61 66 67 69 70 73 76 79 81 101 106 111
1 1 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1
116 121 128 130 139 143 149 155 161 162 163 164 167 168 171 172 176 179 186 188 189 190 191 194 195 201 203 204 206
0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 1 1 1 1 0 0 0 0
207 208 211 215 220 221 225 229 232 233 234 236 239 243 249 251 253 257 258 264 267 274 279 283 290 295 297 300 301
0 0 0 0 1 1 0 0 1 1 1 0 0 0 0 1 0 0 0 0 1 1 1 1 0 1 1 1 1
304 306 307 308 311 312 316 318 321 323 326 331 332 336 338 339 340 346 353 356 362 363 369 370 372 374 376 385 388
1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 1 1 0 0 1 1 0 0 0 1 0
392 396 397 399
0 1 0 0
Levels: 0 1
> train_Carseats$Salesdic
[1] 1 0 1 1 0 1 0 0 0 0 0 1 1 0 0 0 1 1 0 0 1 0 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 0 1 0 0 0 0 1 0 1 0 1 0 1 0
[57] 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 1 0 0 0 0 0 1 0 1 1 0 0 1 0 1 1 0 0 0 0 0 1 1 1 1 0 1
[113] 0 1 1 0 0 1 1 0 1 0 1 1 1 0 0 1 1 1 1 0 1 0 1 1 0 0 0 1 0 1 0 0 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 1 1 1 1
[169] 0 1 0 0 1 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 0 1 0 1 0 1 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 1
[225] 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 1
Levels: 0 1
Edit (problem solved): since I have a randomForest object, I can take the numeric scores from its output and use the code below.
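library(pROC)
# rfCarseats$votes holds the out-of-bag vote fractions, one column per class level;
# the column for class "1" is used here as the numeric predictor.
ROC_Carseats_RF <- roc(train_Carseats$Salesdic, rfCarseats$votes[ , "1"],
                       smoothed = TRUE,
                       ci=TRUE, ci.alpha=0.9, stratified=FALSE,
                       plot=TRUE, auc.polygon=TRUE, max.auc.polygon=TRUE, grid=TRUE,
                       print.auc=TRUE, show.thres=TRUE)
# Redraw the curve with the AUC printed on the plot
plot.roc(ROC_Carseats_RF, print.auc = TRUE)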
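For each training observation, this uses the fraction of out-of-bag trees in the forest that voted for a class as the numeric score, instead of the hard 0/1 prediction.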
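
The same idea can be applied to the held-out observations rather than the training set. The following is only a minimal sketch: it assumes that calling predict() with type = "prob" on the forest gives one probability column per class level, and it uses a synthetic data frame (the names dat, x1, x2, train_dat, test_dat and rf are made up for illustration) in place of the Carseats split shown in the question.

```r
library(randomForest)
library(pROC)

set.seed(1)

# Synthetic stand-in for the Carseats data (hypothetical predictors x1, x2)
n <- 400
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$Salesdic <- factor(as.integer(dat$x1 + 0.5 * dat$x2 + rnorm(n) > 0),
                       levels = c(0, 1))

# 280 training rows and 120 test rows, mirroring the sizes in the question
train_idx <- sample(n, 280)
train_dat <- dat[train_idx, ]
test_dat  <- dat[-train_idx, ]

rf <- randomForest(Salesdic ~ ., data = train_dat)

# type = "prob" returns class probabilities instead of a 0/1 factor
prob_test <- predict(rf, newdata = test_dat, type = "prob")

# Use the probability of class "1" as the numeric predictor
roc_test <- roc(response = test_dat$Salesdic,
                predictor = prob_test[, "1"],
                levels = c("0", "1"), direction = "<")

auc(roc_test)
plot(roc_test, print.auc = TRUE)
```

With the objects from the question, the equivalent call would presumably be predict(rfCarseats, newdata = <test data>, type = "prob"), where the test data frame is whatever was used to produce prediction_rf_Carseats.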