I am testing for outliers using the iris dataset


But I get this error:

Error in `mutate()`:
ℹ In argument: `res_stud_large = as.numeric(!between(res_stud, -2, 2))`.
Caused by error:
! length(g) must match nrow(X)
Backtrace:
  1. dplyr::mutate(...)
 13. base::stop(`<Rcpp::xc>`)
> 
I checked the studentized residuals:

> str(rstudent(mod))
 Named num [1:150] -0.0113 -1.2776 0.0609 -0.0142 0.6545 ...
 - attr(*, "names")= chr [1:150] "1" "2" "3" "4" ...

Could the names attribute on this vector be what causes the error?
I tried using the subset function, but without success.

Using data.table::between seems to work:
> iris2 <-
+   iris |> 
+   transform(res_stud=rstudent(mod)) |> 
+   transform(res_stud_large=as.numeric(!data.table::between(res_stud, -2, 2)))
> 
> summary(iris2)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
 Median :5.800   Median :3.000   Median :4.350   Median :1.300  
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
       Species      res_stud         res_stud_large
 setosa    :50   Min.   :-2.809756   Min.   :0.00  
 versicolor:50   1st Qu.:-0.606569   1st Qu.:0.00  
 virginica :50   Median :-0.010823   Median :0.00  
                 Mean   :-0.001546   Mean   :0.06  
                 3rd Qu.: 0.620524   3rd Qu.:0.00  
                 Max.   : 2.294517   Max.   :1.00 
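
A possible explanation, offered as an assumption because the post does not show which packages are attached: the message "length(g) must match nrow(X)" and the Rcpp stop in the backtrace do not look like something dplyr::between would produce, so another attached package may be masking between. That would also fit with the namespace-qualified data.table::between working, and it would mean the names on rstudent(mod) are probably not the cause. A minimal base R sketch that avoids between altogether follows; the model formula is hypothetical, since the original mod is not shown.

```r
# Hypothetical model: the original post does not show how `mod` was fitted.
mod <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)

# List the attached packages (if any) that define a function called `between`.
# More than one entry, or an unexpected one, would point to masking.
find("between")

# unname() drops the "1", "2", ... names that rstudent() attaches.
res_stud <- unname(rstudent(mod))

iris2 <- transform(
  iris,
  res_stud = res_stud,
  # abs(x) > 2 flags the same rows as !between(x, -2, 2) for non-missing x
  res_stud_large = as.numeric(abs(res_stud) > 2)
)

# Count of flagged (1) and unflagged (0) observations.
table(iris2$res_stud_large)
```

If find("between") reports a package other than the one intended, calling dplyr::between or data.table::between explicitly removes the ambiguity.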

