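I have drawn several boxplots on a single chart, comparing three markers at three locations. I want to add statistical significance marks for each marker across all locations. However, I do not want to draw significance between markers.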
library(ggplot2)
library(dplyr)
library(tidyr)
library(ggpubr)
data <- data.frame(
  fat = rep(c("Lake A", "Lake B", "Lake C"), each = 50),
  zon = c(rnorm(50, mean = 5, sd = 1), rnorm(50, mean = 5.5, sd = 1), rnorm(50, mean = 4.5, sd = 1)),
  cal = c(rnorm(50, mean = 6, sd = 1.5), rnorm(50, mean = 6.5, sd = 1.5), rnorm(50, mean = 5.5, sd = 1.5)),
  si = c(rnorm(50, mean = 7, sd = 1), rnorm(50, mean = 7.5, sd = 1), rnorm(50, mean = 6.5, sd = 1)),
  other1 = rnorm(150),
  other2 = rnorm(150)
)
data_selected <- data %>% select(place = fat, zon, cal, si)
data_long <- pivot_longer(data_selected, cols = c("zon", "cal", "si"), names_to = "category", values_to = "value")
comparisons <- list(unique(data_long$place))
ggplot(data_long, aes(x = place, y = value, fill = category)) +
  geom_boxplot() +
  labs(title = NULL,
       x = "Location",
       y = "Marker level (Log-values)",
       fill = "Marker") +
  theme_bw() +
  theme(legend.position = "top") +
  ggpubr::stat_compare_means(comparisons = comparisons, aes(label = ..p.signif..), method = "t.test", size = 5, vjust = .5)
This example produces only a single comparison. How can I make multiple comparisons?
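If you read the documentation, it says the `comparisons` object should be:
A list of length-2 vectors. The entries in the vector are either the names of 2 values on the x-axis or the 2 integers that correspond to the index of the groups of interest, to be compared.
Whereas you have a list holding a single vector that contains all three lakes: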
dput(comparisons)
#> list(c("Lake A", "Lake B", "Lake C"))
So your comparisons should look like this:
comparisons <- list(c("Lake A", "Lake B"),
                    c("Lake A", "Lake C"),
                    c("Lake B", "Lake C"))
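If you would rather not type the pairs by hand, the same list can probably be built with base R's `combn()`. This is only a minimal sketch: the `places` vector is written out here for illustration, and it assumes the location names are a character vector.

```r
# The three locations from the question's data, written out for illustration
places <- c("Lake A", "Lake B", "Lake C")

# combn() with simplify = FALSE returns a list with one length-2 vector per pair
comparisons <- combn(places, 2, simplify = FALSE)

# Every element has length 2, as stat_compare_means() expects
lengths(comparisons)
#> [1] 2 2 2
```

With the question's data, `combn(as.character(unique(data_long$place)), 2, simplify = FALSE)` should give the same three pairs; the `as.character()` call is only a precaution in case `place` is a factor.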
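Note that `stat_compare_means` only compares values between x-axis positions; it does not compare the different groups within each x-axis position. See, for example, this discussion on GitHub. As I understand it, you want all three site comparisons for each of the three markers, 9 different comparisons in total.
The author suggests using facets in this situation. This actually gives you a tidier plot than one with a stack of 9 comparison brackets: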
ggboxplot(data_long, x = "place", y = "value", facet.by = "category",
          fill = "category") +
  labs(title = NULL,
       x = "Location",
       y = "Marker level (Log-values)",
       fill = "Marker") +
  stat_compare_means(comparisons = comparisons,
                     aes(label = ..p.signif.., group = category),
                     method = "t.test", size = 5) +
  theme_bw() +
  theme(legend.position = "top")
There are more complicated ways to get the plot you describe, but I'm not sure they are worth the effort you would need to put in.
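If you do want to go down that road, one possible starting point is to compute the nine p-values yourself and decide afterwards how to place them. The sketch below uses only base R and simulated data with the same column names as the question (`place`, `category`, `value`). The name `p_table` and the simulated numbers are assumptions for illustration, and the code only computes p-values; it does not draw anything.

```r
# Simulated long-format data with the question's column names;
# the values themselves are arbitrary.
set.seed(1)
data_long <- expand.grid(
  rep = 1:50,
  place = c("Lake A", "Lake B", "Lake C"),
  category = c("zon", "cal", "si"),
  stringsAsFactors = FALSE
)
data_long$value <- rnorm(nrow(data_long), mean = 5)

# Every pair of places, as a list of length-2 character vectors
pairs <- combn(unique(data_long$place), 2, simplify = FALSE)

# One Welch t-test per marker and per pair of places (p-values are unadjusted)
p_table <- do.call(rbind, lapply(unique(data_long$category), function(marker) {
  d <- data_long[data_long$category == marker, ]
  do.call(rbind, lapply(pairs, function(pr) {
    test <- t.test(d$value[d$place == pr[1]], d$value[d$place == pr[2]])
    data.frame(category = marker, group1 = pr[1], group2 = pr[2],
               p = test$p.value)
  }))
}))

# One row per marker and pair of places: 3 markers x 3 pairs = 9 rows
p_table
```

`ggpubr::compare_means()` with its `group.by` argument should produce a similar table in a single call, if you prefer to stay within ggpubr.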