Plotting confidence intervals by group

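I'm looking for help with my code. I have a dataset in which a number of people were asked to rate 5 different scenarios on a scale from -5 to +5.

I then divided the respondents into 2 groups, S and A, because I want to compare the two. The dataset is as follows: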

Score<-c(-2, 3, 4, -1, 3, 4, 5, -1, 3, 5, -3, 3, 5, 1, -4, 5, -2, 
         1, 3, 4, -4, 2, -1, 3, 4)

Group<-c( "S", "S", "A", "S", "A", "S", "A", "S", "S", "A", "S", "A", "S", "A", 
          "S", "S", "A", "S", "A", "S", "A", "S", "S", "A", "S"
           )

Scenerio_ID <-c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 
                1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5)


CombinedTable<-data.frame(Score,Group, Scenerio_ID )

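I am trying to do the following:

  1. Create a new column or table containing the mean score for each Scenerio_ID, together with the 95% confidence interval (upper and lower limits), separately for group "A" and group "S".

  2. Draw a plot with the score on the y-axis and the scenario on the x-axis, showing group A and group S for each scenario. I want a point for the mean plus the upper and lower limits of the 95% confidence interval, very much like the picture I have attached.

My attempt at computing the mean and confidence interval for each scenario by group is this: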

library(dplyr)
library(Rmisc)
library(ggplot2)

MeansCombinedTable <- 
  CombinedTable %>%
  group_by(Group, Scenerio_ID) %>%
  dplyr::summarise(avg_Score = mean(Score), 
                   uci_Score = CI(Score)[1], 
                   lci_Score = CI(Score)[3]) %>%
  mutate(Scenerio_ID = Scenerio_ID %>% as.factor())

My attempt at the plot is this:

CombinedTable %>%
  ggplot(aes(x = Group, y = avg_Score, fill = Scenerio_ID)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = lci_Score, ymax = uci_Score), position = "dodge")

Any help with this would be greatly appreciated; I have been working on it for a while and it has become a real sticking point. Many thanks.

r ggplot2 data-manipulation confidence-interval
1 Answer

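You are summarising the data and then plotting the original dataset, and that is the main problem: CombinedTable has no avg_Score, lci_Score or uci_Score columns, so the plot has to start from MeansCombinedTable.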

suppressPackageStartupMessages({
  library(dplyr)
  library(Rmisc)
  library(ggplot2)
})

MeansCombinedTable <- 
  CombinedTable %>%
  group_by(Group, Scenerio_ID) %>%
  dplyr::summarise(avg_Score = mean(Score), 
                   uci_Score = CI(Score)[1], 
                   lci_Score = CI(Score)[3],
                   .groups = "drop") %>%
  mutate(Scenerio_ID = paste("Scenerio", Scenerio_ID))
#> Warning: There were 4 warnings in `dplyr::summarise()`.
#> The first warning was:
#> ℹ In argument: `uci_Score = CI(Score)[1]`.
#> ℹ In group 1: `Group = "A"`, `Scenerio_ID = 1`.
#> Caused by warning in `qt()`:
#> ! NaNs produced
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 3 remaining warnings.

MeansCombinedTable %>%
  ggplot(aes(x = Group, y = avg_Score, fill = Scenerio_ID)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = lci_Score, ymax = uci_Score), position = "dodge") +
  labs(x = "", y = "Score") +
  facet_wrap(~ Scenerio_ID, nrow = 1L, strip.position = "bottom") +
  theme_classic()

Created on 2023-02-20 with reprex v2.0.2
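
The warnings in the output above come from cells that contain a single score: with one observation the standard deviation is undefined and qt() is called with 0 degrees of freedom, so no interval can be computed. The sketch below, in base R only, counts the observations in each Group by Scenerio_ID cell and recomputes the interval by hand. The helper ci_t is a hypothetical name introduced here for illustration. It uses the usual t-based formula, mean +/- qt(0.975, n - 1) * sd / sqrt(n), which to my knowledge is also what Rmisc::CI computes, but it returns NA bounds explicitly for single-observation cells instead of raising a warning.

```r
# Data from the question
Score <- c(-2, 3, 4, -1, 3, 4, 5, -1, 3, 5, -3, 3, 5, 1, -4, 5, -2,
           1, 3, 4, -4, 2, -1, 3, 4)
Group <- c("S", "S", "A", "S", "A", "S", "A", "S", "S", "A", "S", "A", "S",
           "A", "S", "S", "A", "S", "A", "S", "A", "S", "S", "A", "S")
Scenerio_ID <- rep(1:5, times = 5)
CombinedTable <- data.frame(Score, Group, Scenerio_ID)

# Number of observations in each Group x Scenerio_ID cell.
# Group A has a single score in scenarios 1 and 3.
cell_sizes <- table(CombinedTable$Group, CombinedTable$Scenerio_ID)
print(cell_sizes)

# Hypothetical helper: t-based confidence interval for a mean.
# Returns NA bounds when fewer than two observations are available.
ci_t <- function(x, level = 0.95) {
  n <- length(x)
  m <- mean(x)
  if (n < 2) {
    return(c(upper = NA_real_, mean = m, lower = NA_real_))
  }
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(upper = m + half, mean = m, lower = m - half)
}

# One observation: bounds are NA, no warning
with(CombinedTable, ci_t(Score[Group == "A" & Scenerio_ID == 1]))

# Four observations: a proper interval
with(CombinedTable, ci_t(Score[Group == "S" & Scenerio_ID == 1]))
```

Whether to drop those cells, show the bare mean without an error bar, or collect more data is a judgment call; the point is only that the NaN values are a consequence of the data, not of the plotting code.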


Edit

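To select only some of the scenarios, first create a vector, wanted_scenerios; this makes the code more flexible. I changed the summary code slightly so that the string "Scenerio" is pasted onto the variable Scenerio_ID only after the data have been filtered, because the filter compares the numeric IDs against wanted_scenerios.

To plot the means as points, just use geom_point in place of geom_bar, and do not map fill to Scenerio_ID.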

suppressPackageStartupMessages({
  library(dplyr)
  library(Rmisc)
  library(ggplot2)
})

wanted_scenerios <- c(1, 2, 4)

MeansCombinedTable <- 
  CombinedTable %>%
  group_by(Group, Scenerio_ID) %>%
  dplyr::summarise(avg_Score = mean(Score), 
                   uci_Score = CI(Score)[1], 
                   lci_Score = CI(Score)[3],
                   .groups = "drop") 
#> Warning: There were 4 warnings in `dplyr::summarise()`.
#> The first warning was:
#> ℹ In argument: `uci_Score = CI(Score)[1]`.
#> ℹ In group 1: `Group = "A"`, `Scenerio_ID = 1`.
#> Caused by warning in `qt()`:
#> ! NaNs produced
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 3 remaining warnings.
  
MeansCombinedTable %>%
  filter(Scenerio_ID %in% wanted_scenerios) %>%
  mutate(Scenerio_ID = paste("Scenerio", Scenerio_ID)) %>%
  ggplot(aes(x = Group, y = avg_Score)) +
  geom_point(size = 2) +
  geom_errorbar(aes(ymin = lci_Score, ymax = uci_Score), position = "dodge") +
  labs(x = "", y = "Score") +
  facet_wrap(~ Scenerio_ID, nrow = 1L, strip.position = "bottom") +
  theme_classic()

Created on 2023-02-21 with reprex v2.0.2
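
The question asks for the scenarios on the x-axis with groups A and S shown for each one. The plots above use facets with Group on the x-axis instead, so here is a sketch of the other layout. It is an alternative, not the code above: it maps colour to Group and shifts the two groups apart with position_dodge(). The names summary_tbl, half_width and dodge are introduced here for illustration, and the interval is computed by hand with the t-based formula so that only dplyr and ggplot2 are needed.

```r
library(dplyr)
library(ggplot2)

# Data from the question
CombinedTable <- data.frame(
  Score = c(-2, 3, 4, -1, 3, 4, 5, -1, 3, 5, -3, 3, 5, 1, -4, 5, -2,
            1, 3, 4, -4, 2, -1, 3, 4),
  Group = c("S", "S", "A", "S", "A", "S", "A", "S", "S", "A", "S", "A", "S",
            "A", "S", "S", "A", "S", "A", "S", "A", "S", "S", "A", "S"),
  Scenerio_ID = rep(1:5, times = 5)
)

# Mean and 95% t-based interval per Group x Scenerio_ID cell.
# Cells with a single score get NA bounds (no interval can be estimated).
summary_tbl <- CombinedTable %>%
  group_by(Group, Scenerio_ID) %>%
  dplyr::summarise(
    n = length(Score),
    avg_Score = mean(Score),
    half_width = if (n > 1) qt(0.975, df = n - 1) * sd(Score) / sqrt(n) else NA_real_,
    .groups = "drop"
  ) %>%
  dplyr::mutate(
    lci_Score = avg_Score - half_width,
    uci_Score = avg_Score + half_width,
    Scenerio_ID = factor(Scenerio_ID)
  )

# Use the same dodge for points and error bars so they stay aligned
dodge <- position_dodge(width = 0.5)

p <- ggplot(summary_tbl, aes(x = Scenerio_ID, y = avg_Score, colour = Group)) +
  geom_point(size = 2, position = dodge) +
  geom_errorbar(aes(ymin = lci_Score, ymax = uci_Score),
                width = 0.2, position = dodge) +
  labs(x = "Scenario", y = "Score") +
  theme_classic()

# ggplot2 warns that the error bars with NA bounds are dropped;
# the corresponding mean points are still drawn
print(p)
```

Sharing one position_dodge() object between the two layers is what keeps each error bar centred on its point; passing position = "dodge" separately to each geom can leave them misaligned, because the default dodge width depends on the geom.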
