I am trying to create a frequency bar chart (in % terms) using the following data:
>fulldata
Type Category
Sal 0
Sal 0
Sal 1
Sal 0
Sal 1
Sal 1
Self 1
Self 0
Self 1
Self 0
Self 0
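So I am trying to create a bar chart (using ggplot) that shows both the percentage of Sal and Self in the full data and the percentage of Sal and Self within Category==1, with the % values as labels. I tried creating a separate data frame by filtering Category==1 from the full data, but the bars overlap each other. Here is what I tried: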
> Category1 = fulldata[which(fulldata$Category==1),]
ggplot(fulldata, aes(x=Type,y = (..count..)/sum(..count..)))+
geom_bar()+
geom_label(stat = "count", aes(label=round(..count../sum(..count..),3)*100),
vjust=1.2,size=3, format_string='{:.1f}%')+
scale_y_continuous(labels = scales::percent)+
labs(x = "Type", y="Percentage")+
geom_bar(data = Category1, position = "dodge", color = "red")
*The original data has about 80,000 rows.
One possible solution is to start by calculating all the proportions outside of ggplot2. Here is a made-up example:
df <- data.frame(Type = sample(c("Sal","Self"),100, replace = TRUE),
Category = sample(c(0,1),100, replace = TRUE))
We can calculate each proportion as follows to get the final data frame:
library(tidyr)
library(dplyr)
df %>%
  group_by(Category, Type) %>%
  count() %>%
  pivot_wider(names_from = Category, values_from = n) %>%
  mutate(Total = `0` + `1`) %>%
  pivot_longer(-Type, names_to = "Category", values_to = "n") %>%
  group_by(Category) %>%
  mutate(Percent = n / sum(n))

# A tibble: 6 x 4
# Groups:   Category [3]
  Type  Category     n Percent
  <fct> <chr>    <int>   <dbl>
1 Sal   0           27   0.458
2 Sal   1           22   0.537
3 Sal   Total       49   0.49
4 Self  0           32   0.542
5 Self  1           19   0.463
6 Self  Total       51   0.51
Then, if you pipe this sequence into ggplot2, you can get the bar graph in a single pipeline:
library(ggplot2)
df %>%
  group_by(Category, Type) %>%
  count() %>%
  pivot_wider(names_from = Category, values_from = n) %>%
  mutate(Total = `0` + `1`) %>%
  pivot_longer(-Type, names_to = "Category", values_to = "n") %>%
  group_by(Category) %>%
  # Percent is the share of each Type within each Category (0, 1, Total)
  mutate(Percent = n / sum(n)) %>%
  ggplot(aes(x = reorder(Category, desc(Category)), y = Percent, fill = Type)) +
  geom_col() +
  # Place each label in the middle of its stacked segment
  geom_text(aes(label = scales::percent(Percent)), position = position_stack(0.5)) +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percentage", x = "Category")
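If side-by-side bars are preferred over stacked ones, the same idea can be sketched on the sample data from the question. This is only an illustration under a few assumptions: `fulldata` is rebuilt by hand from the rows posted above, and the names `cat1`, `total`, `props`, and the `Group` column are made up for this example. The proportions are again computed before plotting, and the two sets of bars are dodged so they do not overlap.

```r
library(dplyr)
library(ggplot2)

# Sample data copied from the question (11 rows)
fulldata <- data.frame(
  Type = c(rep("Sal", 6), rep("Self", 5)),
  Category = c(0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0)
)

# Share of each Type in the full data
total <- fulldata %>%
  count(Type) %>%
  mutate(Percent = n / sum(n), Group = "Total")

# Share of each Type among rows with Category == 1
cat1 <- fulldata %>%
  filter(Category == 1) %>%
  count(Type) %>%
  mutate(Percent = n / sum(n), Group = "Category 1")

# One long data frame, so ggplot2 can dodge the bars by Group
props <- bind_rows(total, cat1)

p <- ggplot(props, aes(x = Type, y = Percent, fill = Group)) +
  geom_col(position = "dodge") +
  # Dodge the labels by the same width as the bars so they line up
  geom_text(aes(label = scales::percent(Percent, accuracy = 0.1)),
            position = position_dodge(width = 0.9), vjust = -0.3) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Type", y = "Percentage")
p
```

Since only the small summary table reaches ggplot2, this should work the same way on the full data of about 80,000 rows.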
Does this answer your question?