How can I display the results of a loop of Wilcoxon tests in one tidy output?

Question

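I ran several Wilcoxon tests on subcategories of my data set. R runs the tests, but each one prints a large block of output. I would like a single output, e.g. a table, that summarizes all 14 Wilcoxon tests in a tidy way (name of the analyzed subset, test statistic, p-value, and result such as "alternative hypothesis: ...").

I have tried a lot of tips I found online, but since I'm not very familiar with R I can't diagnose the problem, and it just doesn't work. A friend told me: "Stack Overflow is your friend. Ask for help!" Can you help me get further?

Best, Roman

Here is the code I ran to get the output: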

strFlaecheNames<-c(df_summary$Flaeche)
varResult<-array(vector("list",10000),1000)

for(i in 1:14){ 

varResult[i]<-wilcox.test(df1$y,data=df1,subset(df1$y, df1$x ==  strFlaecheNames[i]))

print((wilcox.test(df1$y,data=df1,subset(df1$y, df1$x == strFlaecheNames[i]))))

}

One of my 14 outputs looks like this:

Wilcoxon rank sum test with continuity correction

data:  df1$y and subset(df1$y, df1$x == strFlaecheNames[i])
W = 1170300, p-value = 4.888e-13
alternative hypothesis: true location shift is not equal to 0

Here is a code example. I also have it as a reprex, but somehow I can't post that; since the code runs, I suppose it is OK to post it like this?

ed_exp2 <- structure(list(
  x = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
                  1L, 1L, 1L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L),
                .Label = c("Area1", "Area10", "Area11", "Area12", "Area13",
                           "Area14", "Area2", "Area3", "Area4", "Area5",
                           "Area6", "Area7", "Area8", "Area9"),
                class = "factor"),
  y = c(0L, 0L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 2L, 1L,
        2L, 0L, 1L, 0L, -2L, 2L, 0L, 2L, 1L, 2L, 2L, -2L, 0L, 0L)),
  .Names = c("x", "y"),
  row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
                15L, 16L, 17L, 18L, 169L, 170L, 171L, 172L, 173L, 174L, 175L,
                176L, 177L, 178L, 179L, 180L, 181L),
  class = "data.frame")
#load libraries
library("stats")
library("dplyr")
library("ggpubr")
library("tidyverse")
library("reprex")

strAreaNames<-c("Area1","Area2")

##required size of memory for output unclear - therefore "10000),1000)"
varResult<-array(vector("list",10000),1000)
#run wilcox.test
for(i in 1:2){
  varResult[i]<-wilcox.test(ed_exp2$y,data=ed_exp2,subset(ed_exp2$y, ed_exp2$x == strAreaNames[i]))
  print((wilcox.test(ed_exp2$y,data=ed_exp2,subset(ed_exp2$y, ed_exp2$x == strAreaNames[i]))))
}

Answer 1

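Yes, most statistical functions in R return some kind of nested list, usually because their output is more complex than a single row of data. The broom package is designed to take the main output of the most common statistical functions (lm, etc.) and "tidy" it into a row of data.

If I understand correctly, you want to test whether y within each level of x differs from y overall, using an unpaired wilcox.test (rank-sum test) for the comparison. I'm not sure this is a meaningful thing to do, because your samples are clearly not independent (each subset is part of the full sample it is compared against), but I'll show you how to do it anyway.

Tidying the output of `wilcox.test`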

library(broom)
tidy(wilcox.test(90:100, 94:100, exact = FALSE))

# A tibble: 1 x 4
  statistic p.value method                                            alternative
      <dbl>   <dbl> <chr>                                             <chr>      
1      24.5   0.220 Wilcoxon rank sum test with continuity correction two.sided

A tibble is a slightly nicer kind of data frame and is part of the tidyverse.

How it works: as described in ?wilcox.test, the function actually returns a list:

> str(wilcox.test(90:100, 94:100, exact = FALSE))

List of 7
 $ statistic  : Named num 24.5
  ..- attr(*, "names")= chr "W"
 $ parameter  : NULL
 $ p.value    : num 0.22
 $ null.value : Named num 0
  ..- attr(*, "names")= chr "location shift"
 $ alternative: chr "two.sided"
 $ method     : chr "Wilcoxon rank sum test with continuity correction"
 $ data.name  : chr "90:100 and 94:100"
 - attr(*, "class")= chr "htest"

The broom package simply extracts the main parts that fit into one row of a data frame.
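
To make that extraction concrete, here is a minimal base R sketch of the same idea without broom. The helper name `htest_row` is hypothetical (it is not part of broom or any package); it just picks the same four fields out of the "htest" list shown above.

```r
# Hypothetical helper (not part of broom): flatten an "htest" list into one row.
htest_row <- function(h) {
  data.frame(
    statistic   = unname(h$statistic),  # drop the "W" name attribute
    p.value     = h$p.value,
    method      = h$method,
    alternative = h$alternative,
    stringsAsFactors = FALSE
  )
}

res <- wilcox.test(90:100, 94:100, exact = FALSE)
row <- htest_row(res)
row
```

Because the result is an ordinary one-row data frame, several of these rows can be stacked with rbind().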

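Split-apply-combine, the base R way

Now you want to do this for each unique value of x (Area1, Area2, etc.) and then collect the results in a data frame that shows which subset each result belongs to. There are many ways to do this in base R; here is one: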

# Example data frame (I'm calling it "d" for brevity)
d <- data.frame(x = c("Area1", "Area1", "Area1", "Area2", "Area2"), y = c(1, 2, 3, 2, 3))
# Empty list to hold our results
L <- list()
for (i in unique(d$x)) {
  # Run Wilcoxon test, "tidy" the result, and assign it to element i of L
  L[[i]] <- tidy(wilcox.test(d$y, d[d$x == i, "y"], exact = FALSE))
}
L
# Combine the results
results <- do.call(rbind, L)  # Same as rbind(L[[1]], L[[2]], ...)
# Add a column identifying the subsets
results$area <- names(L)
results

# A tibble: 2 x 5
  statistic p.value method                                            alternative area 
*     <dbl>   <dbl> <chr>                                             <chr>       <chr>
1       8.5   0.875 Wilcoxon rank sum test with continuity correction two.sided   Area1
2       4     0.834 Wilcoxon rank sum test with continuity correction two.sided   Area2
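
As a variation on the loop above, the following sketch uses split() and lapply() and does not need broom. It also adds a "result" column of the kind asked for in the question. The column name `significant` and the 0.05 cutoff are assumptions made for illustration, not something the question specifies.

```r
# Same toy data as above.
d <- data.frame(x = c("Area1", "Area1", "Area1", "Area2", "Area2"),
                y = c(1, 2, 3, 2, 3))

# Split: a named list holding the y values of each area.
groups <- split(d$y, d$x)

# Apply: one test per area, each reduced to a one-row data frame.
rows <- lapply(names(groups), function(area) {
  h <- wilcox.test(d$y, groups[[area]], exact = FALSE)
  data.frame(
    area        = area,
    statistic   = unname(h$statistic),
    p.value     = h$p.value,
    alternative = h$alternative,
    significant = h$p.value < 0.05,  # assumed threshold, adjust as needed
    stringsAsFactors = FALSE
  )
})

# Combine: stack the rows into a single summary table.
summary_table <- do.call(rbind, rows)
summary_table
```

Building the area column inside the function keeps each row labelled from the start, so the table does not depend on the order in which the results are combined.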

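Split-apply-combine, the tidyverse way

The split-apply-combine workflow is very common but somewhat cumbersome to implement in base R. The tidyverse/dplyr way is more concise and, with a little practice, easier to read: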

library("tidyverse")
library("broom")

d %>% 
  group_by(x) %>% 
  do(wilcox.test(d$y, .$y, exact = FALSE) %>% tidy)

# A tibble: 2 x 5
# Groups:   x [2]
  x     statistic p.value method                                            alternative
  <fct>     <dbl>   <dbl> <chr>                                             <chr>      
1 Area1       8.5   0.875 Wilcoxon rank sum test with continuity correction two.sided  
2 Area2       4     0.834 Wilcoxon rank sum test with continuity correction two.sided

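Notes:

The pipe %>% is syntactic sugar: x %>% f(a) means f(x, a). This makes it easier to write nested function calls in a readable way. Read the documentation on pipes for a deeper understanding.
Usually in dplyr you refer to data frame columns by their unquoted names, without naming the data frame. Your case needs d$y to get all of y regardless of the group_by, but that is an exception. Again, read the manual.
The dot inside do() refers to the current subset of the data frame, so .$y is y within the current group.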
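
To illustrate the first and last notes, this sketch writes the pipeline once with %>% and once as plain nested calls, then checks that both give the same table. It assumes the dplyr and broom packages are installed and reuses the toy data frame d from the base R example.

```r
library(dplyr)   # provides %>%, group_by() and do()
library(broom)   # provides tidy()

d <- data.frame(x = c("Area1", "Area1", "Area1", "Area2", "Area2"),
                y = c(1, 2, 3, 2, 3))

# Piped form: read top to bottom.
piped <- d %>%
  group_by(x) %>%
  do(tidy(wilcox.test(d$y, .$y, exact = FALSE)))

# Nested form: the same calls, read inside out.
# In both forms, .$y is y in the current group and d$y is all of y.
nested <- do(group_by(d, x), tidy(wilcox.test(d$y, .$y, exact = FALSE)))

all.equal(as.data.frame(piped), as.data.frame(nested))
```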
