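I can't understand why my Dunn's post hoc test, run after a significant Kruskal-Wallis test, detects a significant difference between 2022 and 2023, when those two years appear to have the most similar distributions of total abundance across sites. I would have expected the 2021-2022 and 2021-2023 differences to be the large ones. Can anyone explain? My results: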
> shapiro.test(summary$Abundance)
data: summary$Abundance
W = 0.49864, p-value = 1.742e-09
> kruskal.test(Abundance ~ Year, data=summary)
data: Abundance by Year
Kruskal-Wallis chi-squared = 9.817, df = 2, p-value = 0.007384
> dunnTest(summary$Abundance, summary$Year, method="bonferroni")
   Comparison          Z     P.unadj      P.adj
1 2021 - 2022 -0.8256766 0.408987553 1.00000000
2 2021 - 2023  2.2707925 0.023159544 0.06947863
3 2022 - 2023  2.8414068 0.004491497 0.01347449 # How??
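As a side check on the P.adj column, the adjusted values can be reproduced from P.unadj with base R's `p.adjust()`. The sketch below is only illustrative: the vector names are my own, and the p-values are copied by hand from the output above. It also shows the Holm adjustment, which is what `FSA::dunnTest()` reports using when no `method` is given (see the second run further down).

```r
# Unadjusted p-values copied from the dunnTest() output above
p_unadj <- c("2021 - 2022" = 0.408987553,
             "2021 - 2023" = 0.023159544,
             "2022 - 2023" = 0.004491497)

# Bonferroni: multiply each p-value by the number of comparisons (3), capped at 1
p_bonf <- p.adjust(p_unadj, method = "bonferroni")

# Holm: step-down procedure, never more conservative than Bonferroni
p_holm <- p.adjust(p_unadj, method = "holm")

print(round(cbind(p_unadj, p_bonf, p_holm), 8))
```

The choice of adjustment changes the 2021 - 2023 conclusion at the 0.05 level, but not the 2022 - 2023 one, since the smallest p-value is multiplied by 3 under both methods.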
Here is my data table. The Abundance column is the number of individuals collected at each site.
> dput(summary)
structure(list(Genus = c("Ceratina", "Ceratina", "Ceratina",
"Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina",
"Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina",
"Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina",
"Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina",
"Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina", "Ceratina"
), Sex = c("M", "M", "M", "M", "M", "M", "M", "M", "M", "M",
"M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M",
"M", "M", "M", "M", "M", "M", "M", "M", "M", "M"), Year = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), levels = c("2021", "2022", "2023"), class = "factor"), SiteID = structure(c(1L,
3L, 8L, 10L, 13L, 17L, 18L, 23L, 25L, 27L, 28L, 3L, 8L, 13L,
17L, 18L, 19L, 27L, 1L, 3L, 6L, 12L, 13L, 17L, 18L, 19L, 20L,
22L, 23L, 24L, 25L, 26L, 27L), levels = c("H02", "H03", "H04",
"H05", "H06", "H07", "H08", "H09", "L01", "L02", "L04", "L05",
"L06", "L07", "L08", "L09", "L10", "L12", "M01", "M02", "M03",
"M04", "M05", "M06", "M08", "M09", "M10", "M11", "M12"), class = "factor"),
ITD.avg = c(1.089, 1.155, 1.199, 1.32, 1.21, 1.17203225806452,
1.3695, 1.188, 1.089, 1.1913, 1.111, 1.21025, 1.3068125,
1.17591666666667, 1.3081, 1.27319444444444, 1.17591666666667,
1.326125, 1.245, 1.162, 1.1952, 1.2948, 1.2948, 1.23788571428571,
1.245, 1.1205, 1.1205, 1.245, 1.0458, 1.2699, 1.1703, 1.245,
1.245), WW.Avg = c(0.666666666666667, 2, 1, 0, 3.33333333333333,
0.709677419354839, 0, 0, 2, 3.9, 0, 1.75, 0.25, 1.5, 0.95,
1.16666666666667, 1.33333333333333, 3, 1, 2.16666666666667,
4, 5, 4.5, 2.07142857142857, 0.75, 5, 3.25, 2, 1.5, 3, 3.5,
1, 4), Abundance = c(3, 2, 3, 1, 6, 31, 2, 1, 2, 10, 3, 2,
4, 3, 10, 9, 3, 2, 1, 3, 1, 1, 1, 7, 2, 2, 2, 1, 1, 2, 2,
1, 1)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -33L), groups = structure(list(Genus = c("Ceratina",
"Ceratina", "Ceratina"), Sex = c("M", "M", "M"), Year = structure(1:3, levels = c("2021",
"2022", "2023"), class = "factor"), .rows = structure(list(1:11,
12:18, 19:33), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -3L), .drop = TRUE, class = c("tbl_df",
"tbl", "data.frame")))
If I plot the distribution of total abundance per site across years, 2022 and 2023 look the most similar...
source("~/so.txt")
library(ggplot2)
kruskal.test(Abundance ~ Year, summ)
#>
#> Kruskal-Wallis rank sum test
#>
#> data: Abundance by Year
#> Kruskal-Wallis chi-squared = 9.817, df = 2, p-value = 0.007384
FSA::dunnTest(Abundance ~ Year, summ)
#> Dunn (1964) Kruskal-Wallis multiple comparison
#> p-values adjusted with the Holm method.
#>    Comparison          Z     P.unadj      P.adj
#> 1 2021 - 2022 -0.8256766 0.408987553 0.40898755
#> 2 2021 - 2023  2.2707925 0.023159544 0.04631909
#> 3 2022 - 2023  2.8414068 0.004491497 0.01347449
ggplot(summ, aes(Year, Abundance)) +
geom_boxplot() +
theme_bw()
Created on 2024-10-30 with
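One way to see where the 2022 - 2023 result comes from, as a rough sketch: the Kruskal-Wallis test and Dunn's test work on the ranks of the pooled observations, not on the raw counts that the boxplot displays. The snippet below re-enters the Abundance values from the `dput()` output by hand (the names `abundance`, `year`, `mean_rank` and `med` are my own, not from the original post) and compares mean ranks and medians by year.

```r
# Abundance values copied from the dput() output, in row order
abundance <- c(3, 2, 3, 1, 6, 31, 2, 1, 2, 10, 3,                    # 2021 (11 sites)
               2, 4, 3, 10, 9, 3, 2,                                 # 2022 (7 sites)
               1, 3, 1, 1, 1, 7, 2, 2, 2, 1, 1, 2, 2, 1, 1)          # 2023 (15 sites)
year <- factor(rep(c("2021", "2022", "2023"), times = c(11, 7, 15)))

# Rank all 33 observations together; tied values share their average rank
r <- rank(abundance)

# Dunn's test compares these per-group mean ranks pairwise
mean_rank <- tapply(r, year, mean)
med <- tapply(abundance, year, median)

print(mean_rank)
print(med)
```

On these data the mean rank comes out highest for 2022 and lowest for 2023, which is consistent with the sign and size of the Z values above. The single value of 31 in 2021 stretches the boxplot, but in a rank-based test it counts only as one top rank, so 2021 ends up between the other two years.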