Percentage histogram with facet_wrap

Question

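I am trying to combine a percentage histogram with facet_wrap, but the percentages are calculated from all of the data rather than per group. I want each histogram to show the distribution within one group, not relative to the whole population. I know I could make several plots and combine them with multiplot.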

library(ggplot2)
library(scales)
library(dplyr)

set.seed(1)
df <- data.frame(age = runif(900, min = 10, max = 100),
                 group = rep(c("a", "b", "c", "d", "e", "f", "g", "h", "i"), 100))

tmp <- df %>%
  mutate(group = "ALL")

df <- rbind(df, tmp)

ggplot(df, aes(age)) + 
  geom_histogram(aes(y = (..count..)/sum(..count..)), binwidth = 5) + 
  scale_y_continuous(labels = percent ) + 
  facet_wrap(~ group, ncol = 5)

Output: [plot image not included]

Answer 1

Try y = stat(density) (or y = ..density.. prior to ggplot2 version 3.0.0) instead of y = (..count..)/sum(..count..):

ggplot(df, aes(age, group = group)) + 
  geom_histogram(aes(y = stat(density) * 5), binwidth = 5) + 
  scale_y_continuous(labels = percent ) +
  facet_wrap(~ group, ncol = 5)

From ?geom_histogram, under "Computed variables":

density: density of points in bin, scaled to integrate to 1

We multiply by 5 (the bin width) because the y-axis is a density (the area integrates to 1), not a percentage (the heights sum to 1). See Hadley's comment (thanks to @MariuszSiatka).
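The arithmetic behind that multiplication can be checked without ggplot2. The following is a minimal sketch in base R, assuming a single hypothetical group of 100 values and bin edges chosen for illustration (ggplot2 may place its bin edges differently). It shows that density times bin width is the same as each bin's share of the group.

```r
# Check, in base R, that density * bin width equals each bin's share of the group.
set.seed(1)
age <- runif(100, min = 10, max = 100)   # one hypothetical group

binwidth <- 5
# Breaks are an assumption for this sketch, not ggplot2's own bin placement.
h <- hist(age, breaks = seq(10, 100, by = binwidth), plot = FALSE)

share_from_density <- h$density * binwidth       # density rescaled by bin width
share_from_counts  <- h$counts / sum(h$counts)   # plain per-bin proportion

all.equal(share_from_density, share_from_counts) # the two vectors agree
sum(share_from_density)                          # bar heights add up to 1
```

Without the multiplication, the bar heights would add up to 1 / binwidth instead of 1, which is why the unscaled density axis reads too low when labelled as a percentage.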

Answer 2

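While it seems that facet_wrap does not run the geom_histogram percentage calculation within each subset, consider building a list of plots separately and then arranging them together in a grid.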

Specifically, call by to run the ggplot code on each subset of group, and then call gridExtra::grid.arrange() (an actual package function) to roughly mimic facet_wrap:

library(ggplot2)
library(scales)
library(gridExtra)

...

grp_plots <- by(df, df$group, function(sub){
  ggplot(sub, aes(age)) + 
    geom_histogram(aes(y = (..count..)/sum(..count..)), binwidth = 5) + 
    scale_y_continuous(labels = percent ) + ggtitle(sub$group[[1]]) +
    theme(plot.title = element_text(hjust = 0.5))
})

grid.arrange(grobs = grp_plots, ncol=5)

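However, to avoid repeated y-axes and x-axes, consider setting theme conditionally inside the by call, assuming you know your groups ahead of time and there is a reasonable number of them.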

grp_plots <- by(df, df$group, function(sub){

  # BASE GRAPH
  p <- ggplot(sub, aes(age)) + 
    geom_histogram(aes(y = (..count..)/sum(..count..)), binwidth = 5) + 
    scale_y_continuous(labels = percent ) + ggtitle(sub$group[[1]])

  # CONDITIONAL theme() CALLS
  if (sub$group[[1]] %in% c("a")) {
    p <- p + theme(plot.title = element_text(hjust = 0.5), axis.title.x = element_blank(), 
                  axis.text.x = element_blank(), axis.ticks.x = element_blank())
  }
  else if (sub$group[[1]] %in% c("f")) {
    p <- p + theme(plot.title = element_text(hjust = 0.5))
  }
  else if (sub$group[[1]] %in% c("b", "c", "d", "e")) {
    p <- p + theme(plot.title = element_text(hjust = 0.5), axis.title.y = element_blank(), 
                   axis.text.y = element_blank(), axis.ticks.y = element_blank(),
                   axis.title.x = element_blank(), axis.text.x = element_blank(), 
                   axis.ticks.x = element_blank())
  }
  else {
    p <- p + theme(plot.title = element_text(hjust = 0.5), axis.title.y = element_blank(), 
                   axis.text.y = element_blank(), axis.ticks.y = element_blank())
  }
  return(p)
})

grid.arrange(grobs=grp_plots, ncol=5)
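The hard-coded group names in those branches can, in principle, be derived from each plot's position in the grid. The sketch below is an illustration only: axis_flags is a hypothetical helper that is not part of the answer or of any package, and it assumes the grid is filled row by row. For nine plots in five columns it reproduces the same decisions as the branches above ("a" keeps only the y-axis, "f" keeps both, "b" to "e" keep neither, the rest keep only the x-axis).

```r
# Hypothetical helper (not from the answer): decide which axes a plot keeps
# from its position in a grid that is assumed to be filled row by row.
axis_flags <- function(i, n, ncol) {
  n_rows <- ceiling(n / ncol)
  row <- (i - 1) %/% ncol + 1
  col <- (i - 1) %% ncol + 1
  list(keep_y = col == 1,        # only the left-most column keeps the y-axis
       keep_x = row == n_rows)   # only the last row keeps the x-axis
}

# Nine plots in five columns, matching groups "a" to "i"
flags <- lapply(1:9, axis_flags, n = 9, ncol = 5)
names(flags) <- letters[1:9]
str(flags[c("a", "f", "g")])
```

Inside the by call, the flags could then select which theme() elements to blank, instead of testing sub$group against fixed names.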

Answer 3

Following my comment on @markus's answer, I saw some requests to post it as a standalone answer.

ggplot(df, aes(age)) + 
  geom_histogram(aes(y = stat(width*density)), binwidth = 10) + 
  scale_y_continuous(labels = percent ) +
  facet_wrap(~ group, ncol = 5)

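Compared with my original comment, I added binwidth for flexibility. Because width is the computed bin width, the scaling follows whatever binwidth is set, with no hard-coded multiplier.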

Credit: Claus Wilke.
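In newer ggplot2 releases, stat() and the ..var.. notation are deprecated in favour of after_stat(). The following is a sketch of the same plot written that way, assuming ggplot2 3.3.0 or later and the question's data without the extra "ALL" group. It also uses layer_data() to check that the bar heights in each panel add up to 1.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(age = runif(900, min = 10, max = 100),
                 group = rep(letters[1:9], 100))

# after_stat() replaces stat() in newer ggplot2 releases (assumed >= 3.3.0 here)
p <- ggplot(df, aes(age)) +
  geom_histogram(aes(y = after_stat(density * width)), binwidth = 10) +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(~ group, ncol = 5)

# layer_data() returns the computed bars; heights in each panel should sum to 1
bars <- layer_data(p)
panel_totals <- tapply(bars$y, bars$PANEL, sum)
panel_totals
```

Printing p draws the faceted plot; panel_totals is only a sanity check and can be dropped.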
