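This is the same problem as "Apply prop.test to each row in a data frame", but I'm looking for a data.table answer that doesn't rely on another package such as broom::tidy. Short question: how do I run prop.test on every row of a data.table and assign the results to columns?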
# test data
dt1 <- structure(list(Strategy = c("active immunization", "cell regeneration/restoration",
"cellular senescence", "combination", "energy metabolism", "epigenome/transcription",
"general neuroprotection", "immune response", "lipid metabolism",
"metal ion modulation", "microtubule stabilization", "neuromodulator/transmission",
"neurotrophin pathway", "non-pharmacological", "oxidative stress",
"passive immunization", "proteostasis network", "tau aggregation",
"tau enzyme/PTM", "tau isoform imbalance correction", "tau propagation",
"tau reduction"), Outcome = c("DE", "DE", "DE", "DE", "DE", "DE",
"DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE",
"DE", "DE", "DE", "DE", "DE"), X = c(14L, 1L, 1L, 1L, 7L, 2L,
25L, 32L, 5L, 1L, 6L, 15L, 4L, 18L, 9L, 46L, 33L, 26L, 43L, 2L,
1L, 6L), N = c(18L, 4L, 1L, 2L, 7L, 2L, 25L, 39L, 8L, 3L, 7L,
19L, 5L, 22L, 10L, 63L, 38L, 27L, 48L, 3L, 2L, 6L)), class = c("data.table",
"data.frame"), row.names = c(NA, -22L), sorted = c("Strategy",
"Outcome"))
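I can usually use unlist to flatten a test object or formula and slice out what I need, but that doesn't work as expected with the object returned by prop.test. Converting it to a tibble with broom::tidy solves this, although I imagine that is slower and it adds a dependency: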
test <- prop.test(dt1[[1,3]], dt1[[1,4]])
unlist(test)[c(1, 3, 6:7)]
statistic.X-squared p.value conf.int1 conf.int2
"4.5" "0.0338948535246893" "0.519186146108054" "0.926276897802783"
# does not work
dt1[, c("chi", "p", "lower", "upper") := unlist(prop.test(x = X, n = N))[c(1, 3, 6:7)], by = .I]
# does work, but...
dt1[, c("chi", "p", "lower", "upper") := broom::tidy(prop.test(x = X, n = N))[c(2:3, 5:6)], by = .I]
> dt1
Key: <Strategy, Outcome>
Strategy Outcome X N chi p lower upper
<char> <char> <int> <int> <num> <num> <num> <num>
1: active immunization DE 14 18 4.500000 3.389485e-02 0.51918615 0.9262769
2: cell regeneration/restoration DE 1 4 0.250000 6.170751e-01 0.01319116 0.7805735
3: cellular senescence DE 1 1 0.000000 1.000000e+00 0.05462076 1.0000000
4: combination DE 1 2 0.000000 1.000000e+00 0.09453121 0.9054688
5: energy metabolism DE 7 7 5.142857 2.334220e-02 0.56093387 1.0000000
6: epigenome/transcription DE 2 2 0.500000 4.795001e-01 0.19786746 1.0000000
7: general neuroprotection DE 25 25 23.040000 1.586656e-06 0.83422699 1.0000000
8: immune response DE 32 39 14.769231 1.215020e-04 0.65890539 0.9189739
9: lipid metabolism DE 5 8 0.125000 7.236736e-01 0.25894769 0.8975920
10: metal ion modulation DE 1 3 0.000000 1.000000e+00 0.01765279 0.8746655
11: microtubule stabilization DE 6 7 2.285714 1.305700e-01 0.42007834 0.9924972
12: neuromodulator/transmission DE 15 19 5.263158 2.178146e-02 0.53902028 0.9302931
13: neurotrophin pathway DE 4 5 0.800000 3.710934e-01 0.29879105 0.9894700
14: non-pharmacological DE 18 22 7.681818 5.577994e-03 0.58992882 0.9400827
15: oxidative stress DE 9 10 4.900000 2.685670e-02 0.54115398 0.9947577
16: passive immunization DE 46 63 12.444444 4.192370e-04 0.60131748 0.8306991
17: proteostasis network DE 33 38 19.184211 1.186911e-05 0.71116190 0.9505271
18: tau aggregation DE 26 27 21.333333 3.859616e-06 0.79110819 0.9980636
19: tau enzyme/PTM DE 43 48 28.520833 9.269570e-08 0.76556981 0.9610103
20: tau isoform imbalance correction DE 2 3 0.000000 1.000000e+00 0.12533447 0.9823472
21: tau propagation DE 1 2 0.000000 1.000000e+00 0.09453121 0.9054688
22: tau reduction DE 6 6 4.166667 4.122683e-02 0.51681705 1.0000000
Strategy Outcome X N chi p lower upper
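One likely reason the unlist approach misbehaves: the object returned by prop.test is a list that mixes numeric components with character ones (alternative, method, data.name), so unlist coerces everything to character. That is why the values from unlist(test) print in quotes in the question. The following is a minimal sketch, assuming only base R, that pulls the four numbers directly from the named components instead; the names flat and vals are only for illustration.

```r
# Same single test as in the question: 14 successes out of 18 trials
test <- prop.test(14, 18)

# unlist() flattens a list that also holds character components,
# so the whole result is coerced to character
flat <- unlist(test)
is.character(flat)

# Extracting the components by name keeps them numeric
vals <- c(
  chi   = test$statistic[[1]],  # X-squared statistic, name dropped
  p     = test$p.value,
  lower = test$conf.int[1],
  upper = test$conf.int[2]
)
is.numeric(vals)
vals
```

Because vals stays numeric, no as.numeric round trip through text is needed.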
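Use lapply to iterate over the unlist result, converting each element back to numeric so that := receives a list with one element per new column: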
dt1[, c("chi", "p", "lower", "upper") := lapply(unlist(prop.test(X, N))[
c("statistic.X-squared", "p.value", "conf.int1", "conf.int2")], \(x)
as.numeric(x)), by = .I]
There were 12 warnings (use warnings() to see them)
> dt1
Key: <Strategy, Outcome>
Strategy Outcome X N chi p
<char> <char> <int> <int> <num> <num>
1: active immunization DE 14 18 4.500000 3.389485e-02
2: cell regeneration/restoration DE 1 4 0.250000 6.170751e-01
3: cellular senescence DE 1 1 0.000000 1.000000e+00
4: combination DE 1 2 0.000000 1.000000e+00
5: energy metabolism DE 7 7 5.142857 2.334220e-02
6: epigenome/transcription DE 2 2 0.500000 4.795001e-01
7: general neuroprotection DE 25 25 23.040000 1.586656e-06
8: immune response DE 32 39 14.769231 1.215020e-04
9: lipid metabolism DE 5 8 0.125000 7.236736e-01
10: metal ion modulation DE 1 3 0.000000 1.000000e+00
11: microtubule stabilization DE 6 7 2.285714 1.305700e-01
12: neuromodulator/transmission DE 15 19 5.263158 2.178146e-02
13: neurotrophin pathway DE 4 5 0.800000 3.710934e-01
14: non-pharmacological DE 18 22 7.681818 5.577994e-03
15: oxidative stress DE 9 10 4.900000 2.685670e-02
16: passive immunization DE 46 63 12.444444 4.192370e-04
17: proteostasis network DE 33 38 19.184211 1.186911e-05
18: tau aggregation DE 26 27 21.333333 3.859616e-06
19: tau enzyme/PTM DE 43 48 28.520833 9.269570e-08
20: tau isoform imbalance correction DE 2 3 0.000000 1.000000e+00
21: tau propagation DE 1 2 0.000000 1.000000e+00
22: tau reduction DE 6 6 4.166667 4.122683e-02
Strategy Outcome X N chi p
lower upper
<num> <num>
1: 0.51918615 0.9262769
2: 0.01319116 0.7805735
3: 0.05462076 1.0000000
4: 0.09453121 0.9054688
5: 0.56093387 1.0000000
6: 0.19786746 1.0000000
7: 0.83422699 1.0000000
8: 0.65890539 0.9189739
9: 0.25894769 0.8975920
10: 0.01765279 0.8746655
11: 0.42007834 0.9924972
12: 0.53902028 0.9302931
13: 0.29879105 0.9894700
14: 0.58992882 0.9400827
15: 0.54115398 0.9947577
16: 0.60131748 0.8306991
17: 0.71116190 0.9505271
18: 0.79110819 0.9980636
19: 0.76556981 0.9610103
20: 0.12533447 0.9823472
21: 0.09453121 0.9054688
22: 0.51681705 1.0000000
lower upper
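A variation on the same idea, offered as a sketch rather than a definitive solution: have j return a list of numerics built directly from the test object's components, which skips the unlist and as.numeric steps. The helper name prop_cols is hypothetical, and the three rows are a hard-coded subset of the question's data (rows 1, 2 and 5). Two assumptions are worth flagging. First, by = seq_len(nrow(dt)) is used instead of by = .I because, as far as I know, by = .I needs a fairly recent data.table release. Second, suppressWarnings hides the "Chi-squared approximation may be incorrect" warning that prop.test raises for small samples, which is presumably the source of the 12 warnings above; drop it if you would rather see them.

```r
library(data.table)

# Subset of the question's data: rows 1, 2 and 5
dt <- data.table(X = c(14L, 1L, 7L), N = c(18L, 4L, 7L))

# Hypothetical helper: one prop.test call -> list of four plain numerics
prop_cols <- function(x, n) {
  # Assumption: small-sample warnings are silenced on purpose here
  ht <- suppressWarnings(prop.test(x, n))
  list(
    chi   = ht$statistic[[1]],
    p     = ht$p.value,
    lower = ht$conf.int[1],
    upper = ht$conf.int[2]
  )
}

# One group per row; := gets a list with one element per new column
dt[, c("chi", "p", "lower", "upper") := prop_cols(X, N),
   by = seq_len(nrow(dt))]

print(dt)
```

Returning a list keeps every column numeric from the start, so the result does not depend on the position or naming of elements in the flattened vector.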