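I use R regularly, but I keep struggling to understand why these two plots look so different, and what I can do to make geom_line match the display produced by stat_summary (which is the better one). Bonus question: is there a sound reason for geom_line() to keep working this way?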
library(tidyverse)
df = structure(list(year_completed_cat = structure(c(5L, 4L, 5L, 4L,
4L, 6L, 6L, 4L, 6L, 4L, 6L, 5L, 4L, 4L, 4L, 5L, 6L, 5L, 6L, 5L,
6L, 6L, 6L, 5L, 4L, 6L, 6L, 6L, 6L, 5L, 4L, 6L, 6L, 5L, 5L, 6L,
6L, 4L, 4L, 6L, 6L, 6L, 6L, 5L, 4L, 6L, 5L, 6L, 6L, 5L), levels = c("18",
"19", "20", "21", "22", "23", "24"), class = "factor"), asqse_quest = structure(c(6L,
7L, 7L, 7L, 7L, 6L, 6L, 5L, 5L, 5L, 5L, 6L, 6L, 5L, 7L, 6L, 5L,
6L, 7L, 5L, 6L, 7L, 6L, 7L, 7L, 7L, 5L, 7L, 5L, 5L, 6L, 7L, 5L,
5L, 7L, 5L, 7L, 6L, 6L, 5L, 6L, 5L, 6L, 6L, 6L, 5L, 6L, 6L, 5L,
5L), levels = c("2", "6", "12", "18", "24", "30", "36", "48",
"60"), class = "factor"), asqse_total = c(205, 40, 80, 60, 40,
60, 120, 0, 20, 20, 70, 70, 35, 35, 225, 140, 80, 215, 230, 110,
180, 155, 25, 165, 75, 60, 20, 85, 20, 75, 30, 35, 25, 55, 160,
70, 140, 35, 140, 30, 40, 40, 25, 40, 75, 5, 35, 205, 5, 40)), row.names = c(NA,
-50L), class = "data.frame")
ggplot(df, aes(x = year_completed_cat, y = asqse_total,
group = asqse_quest, color = asqse_quest)) +
geom_line() + geom_point()
ggplot(df, aes(x = year_completed_cat, y = asqse_total,
group = asqse_quest, color = asqse_quest)) +
stat_summary(geom = "line", fun = mean)
Created on 2024-07-07 with reprex v2.1.0
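Within each group, every year_completed_cat has several asqse_total values, so when you draw geom_line() it connects those points in order along the x axis: first all the points at year_completed_cat 21 (a vertical line), then a step to the next position on the x axis (a diagonal line), then the points at year_completed_cat 22 (another vertical line), and so on.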
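To make geom_line() itself reproduce the stat_summary() display, one option is to collapse the data to a single mean per group and year before plotting. The following is a minimal sketch, not the only way to do it: df_demo is a small hypothetical stand-in for the question's df (same column names, invented values), and df_means is an illustrative name.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the question's df: two groups, each with
# several asqse_total values per year (values invented for illustration).
df_demo <- data.frame(
  year_completed_cat = factor(c("21", "21", "22", "22", "21", "21", "22", "22")),
  asqse_quest        = factor(c("24", "24", "24", "24", "30", "30", "30", "30")),
  asqse_total        = c(10, 30, 20, 60, 100, 140, 50, 70)
)

# Collapse to one mean per group and year, so every line has exactly
# one y value at each x position.
df_means <- df_demo %>%
  group_by(asqse_quest, year_completed_cat) %>%
  summarise(asqse_total = mean(asqse_total), .groups = "drop")

# With one row per group and year, geom_line() has nothing to stack
# vertically and simply joins the means.
p_means <- ggplot(df_means, aes(x = year_completed_cat, y = asqse_total,
                                group = asqse_quest, color = asqse_quest)) +
  geom_line()
```

Printing p_means should give the same lines as stat_summary(geom = "line", fun = mean) applied to the unaggregated data, since both compute the mean of asqse_total within each group and year.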
If you want one point per observation and a line joining the means, you can combine the two approaches: geom_point() plus stat_summary(..., geom = "line").
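A minimal sketch of that combination, with raw observations drawn as points and a line through the per-group means, is shown here. It uses a small hypothetical data frame, df_demo, with the same column names as the question's df and invented values; with the real data you would pass df instead.

```r
library(ggplot2)

# Hypothetical stand-in for the question's df (values invented for illustration).
df_demo <- data.frame(
  year_completed_cat = factor(c("21", "21", "22", "22", "21", "21", "22", "22")),
  asqse_quest        = factor(c("24", "24", "24", "24", "30", "30", "30", "30")),
  asqse_total        = c(10, 30, 20, 60, 100, 140, 50, 70)
)

p <- ggplot(df_demo, aes(x = year_completed_cat, y = asqse_total,
                         group = asqse_quest, color = asqse_quest)) +
  geom_point() +                            # one point per observation
  stat_summary(geom = "line", fun = mean)   # one line per group through the means

# Data computed for the second layer: one row per group and year.
summary_layer <- layer_data(p, 2)
```

Because the points come from the raw rows and the line from the summarised values, the spread within each year stays visible without distorting the lines.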