Creating 5 completed datasets from each incomplete dataset in a simulation study with the mice package in R


For a study, I need to generate five completed datasets for each of 100 incomplete datasets with the help of the mice package in R.

This code works fine (when you have a single dataset, df1):

df1_imp <- mice(df1, m = 5, method = 'logreg', print = F)
Then we can access the five completed datasets like this:

dataset1 <- complete(df1_imp, 1)
dataset2 <- complete(df1_imp, 2)
dataset3 <- complete(df1_imp, 3)
dataset4 <- complete(df1_imp, 4)
dataset5 <- complete(df1_imp, 5)

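That works fine. However, I have 100 incomplete datasets. Each one will produce 5 completed datasets (500 in total). How can I access these 500 datasets? I need them because I want to analyze them.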

dfs: my list of datasets (each one must produce 5 completed datasets, 3 x 5 = 15)

list(structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, NA, NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0), dim = 6:5))
    
r simulation missing-data r-mice
1 Answer
In complete, choose action='all', and set include=FALSE to exclude the unimputed dataset. For a simulation study, you may also want to specify a seed:

> library(mice)
> seed. <- 42
> lapply(raw_data, mice, m=5, method='pmm', seed=seed., printFlag=FALSE) |>
+   lapply(complete, action='all', include=FALSE)
[[1]]
$`1`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  1  1
6  0  0  1  0  1

$`2`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  1  1
6  0  0  1  0  1

$`3`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  1  1
6  0  0  1  0  1

$`4`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  1  1
6  0  0  1  0  1

$`5`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  0  1
6  0  0  1  0  1

attr(,"class")
[1] "mild" "list"

[[2]]
$`1`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  0  1
5  1  1  0  0  1
6  0  0  1  1  1

$`2`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  0  1
5  1  1  0  0  1
6  0  0  1  1  1

$`3`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  0  1
5  1  1  0  0  1
6  0  0  1  1  1

$`4`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  0  1
5  1  1  0  0  1
6  0  0  1  1  1

$`5`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  1  1
5  1  1  0  0  1
6  0  0  1  1  1

attr(,"class")
[1] "mild" "list"

[[3]]
$`1`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

$`2`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

$`3`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

$`4`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

$`5`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

attr(,"class")
[1] "mild" "list"

Warning messages:
1: Number of logged events: 30
2: Number of logged events: 30
3: Number of logged events: 2
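The pipeline above only prints its result. To work with the completed datasets afterwards, the result has to be stored. The following is a minimal sketch of one way to do that, not the only one: the names imps, completed and flat, and the data1, data2, ... labels, are made up for illustration, and the seed is simply the 42 used above. It keeps the nested structure (one list of 5 data frames per incomplete dataset) and also flattens it into a single list.

```r
library(mice)

# Example data copied from the question (three incomplete 6 x 5 matrices).
raw_data <- list(
  structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1,
              0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1,
              1, 0, NA, NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
              NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0), dim = 6:5)
)

# One mids object per incomplete dataset (imps is a hypothetical name).
imps <- lapply(raw_data, mice, m = 5, method = "pmm", seed = 42,
               printFlag = FALSE)

# One list of 5 completed data frames per incomplete dataset.
completed <- lapply(imps, complete, action = "all", include = FALSE)
names(completed) <- paste0("data", seq_along(completed))

# Imputation 2 of incomplete dataset 1.
completed[["data1"]][[2]]

# Flatten one level: 3 x 5 = 15 data frames named data1.1, data1.2, ...
flat <- unlist(completed, recursive = FALSE)
length(flat)
```

With 100 incomplete datasets the same code would give a flat list of 500 data frames, which can then be passed to lapply for the analysis. Keeping imps around is useful as well, because the mids objects carry the diagnostics of each imputation run.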

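Note that in your example, imputation of the third dataset failed because of collinearity. You can investigate this by setting printFlag=TRUE and not piping the result into complete.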
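As a rough sketch of such an investigation, the third dataset can be imputed on its own and the resulting mids object inspected. The names raw3 and imp3 are hypothetical; loggedEvents is the component in which mice records what it flagged or removed, and the exact entries may differ between mice versions.

```r
library(mice)

# Third dataset from the question.
raw3 <- structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
                    NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0), dim = 6:5)

# Keep the mids object instead of piping it straight into complete().
imp3 <- mice(raw3, m = 5, method = "pmm", seed = 42, printFlag = FALSE)

# Events that mice logged for this run (for example collinear variables).
imp3$loggedEvents

# TRUE if the first completed dataset still contains missing values.
anyNA(complete(imp3, 1))

# The first two columns of this dataset are exact duplicates of each other.
identical(raw3[, 1], raw3[, 2])
```

In a simulation with 100 datasets, a check like anyNA on every completed dataset is a cheap way to find the runs where imputation did not fill in all values.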
Data:

> dput(raw_data)
list(structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5),
    structure(c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, NA, NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5),
    structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0), dim = 6:5))
