Imagine having this dataset:
Country Energy_Source Twh Tot
Italy Biofuel 25 100
Italy Nuclear 15 100
Italy Gas 40 100
Italy Hydro 20 100
France Biofuel 20 120
France Nuclear 75 120
France Gas 10 120
France Hydro 5 120
France Wind 10 120
Note: Tot is the sum of Twh within each Country.
dataset1 <- data.frame(
"Country" = c(rep(x = "Italy", times = 4), rep(x = "France", times = 5)),
"Energy_Source" = c("Biofuel", "Nuclear", "Gas", "Hydro", "Biofuel", "Nuclear", "Gas", "Hydro", "Wind"),
"Twh" = c(25, 15, 40, 20, 20, 75, 10, 5, 10),
"Tot" = c(rep(x = 100, times = 4), rep(x = 120, times = 5))
)
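As a quick check of the note above, the Tot column can be recomputed from Twh instead of being typed in by hand. This is only a sketch in base R; the data frame name `energy` is made up for the illustration and holds the same values as dataset1.

```r
# Rebuild the example data without the Tot column (same values as dataset1).
energy <- data.frame(
  Country = c(rep("Italy", 4), rep("France", 5)),
  Energy_Source = c("Biofuel", "Nuclear", "Gas", "Hydro",
                    "Biofuel", "Nuclear", "Gas", "Hydro", "Wind"),
  Twh = c(25, 15, 40, 20, 20, 75, 10, 5, 10)
)

# ave() returns, for every row, the sum of Twh over all rows
# that share the same Country.
energy$Tot <- ave(energy$Twh, energy$Country, FUN = sum)
```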
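Now, we would like ggplot2 to interpret dataset1 as the following (dataset2), without having to run pivot_longer on dataset1.
The new dataset2 holds exactly the same information as dataset1, but with duplicated rows, so that ggplot2 interprets the number of occurrences of each element as its proportion: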
Country Energy_Source Twh Tot
Italy Biofuel 25 100
Italy Biofuel 25 100
Italy Biofuel 25 100
.
.
. (22 more rows)
Italy Nuclear 15 100
. (14 more rows)
Italy Gas 40 100
. (etcetera)
dataset2 <- data.frame(
"Country" = c(rep(x = "Italy", times = 100), rep(x = "France", times = 120)),
"Energy_Source" = c(rep(x = "Biofuel", times = 25), rep(x = "Nuclear", times = 15),
rep(x = "Gas", times = 40), rep(x = "Hydro", times = 20), rep(x = "Biofuel", times = 20),
rep(x = "Nuclear", times = 75), rep(x = "Gas", times = 10), rep(x = "Hydro", times = 5),
rep(x = "Wind", times = 10)),
"Tot" = c(rep(x = 100, times = 100), rep(x = 120, times = 120))
)
Now, we would normally use the following code to draw the bar chart:
ggplot(data = dataset2, mapping = aes(
x = Tot,
y = reorder(Country, Tot),
fill = Energy_Source
)) +
geom_col()
The result looks like this (image not included):
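But is it possible to use dataset1 instead of dataset2 to create the same graph with ggplot2?
In other words:
How can the bars be filled not according to the number of occurrences of an element in the dataset, but according to its value in another variable?
Thanks!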
I tried pivot_longer from the tidyr package, but it is too expensive for my Shiny app.
Two options:

1. Multiply Tot by Twh:

ggplot(data = dataset1, mapping = aes(
x = Tot*Twh,
y = reorder(Country, Tot),
fill = Energy_Source
)) +
geom_col()
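Why the product works: geom_col() stacks the x values of all rows in a group, so in the duplicated layout each segment is Tot added up Twh times. The sketch below checks that arithmetic in base R only; the names `expanded`, `segment_expanded`, `segment_direct` and `check` are made up for the illustration.

```r
dataset1 <- data.frame(
  Country = c(rep("Italy", 4), rep("France", 5)),
  Energy_Source = c("Biofuel", "Nuclear", "Gas", "Hydro",
                    "Biofuel", "Nuclear", "Gas", "Hydro", "Wind"),
  Twh = c(25, 15, 40, 20, 20, 75, 10, 5, 10),
  Tot = c(rep(100, 4), rep(120, 5))
)

# Duplicated layout: repeat each row Twh times (one row per TWh).
expanded <- dataset1[rep(seq_len(nrow(dataset1)), times = dataset1$Twh), ]

# Segment length in the duplicated layout: sum of Tot over the repeated rows.
segment_expanded <- aggregate(Tot ~ Country + Energy_Source,
                              data = expanded, FUN = sum)
names(segment_expanded)[3] <- "Length_expanded"

# Segment length computed directly from the compact data.
segment_direct <- transform(dataset1, Length = Tot * Twh)

# Line the two up per Country and Energy_Source and compare.
check <- merge(segment_expanded, segment_direct,
               by = c("Country", "Energy_Source"))
all(check$Length_expanded == check$Length)
```

Under these assumptions the nine-row data frame carries everything the plot needs, which matters when the expansion itself is the expensive step.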
2. tidyr::uncount is what you want. This replicates your dataset2 approach. I added a border to show how the many small bars are stacked on top of each other.

ggplot(data = dataset1 |> tidyr::uncount(Twh), mapping = aes(
x = Tot,
y = reorder(Country, Tot),
fill = Energy_Source
)) +
geom_col(color = "gray50")
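Both options reproduce the dataset2 plot, whose bar lengths are Tot times Tot per country (10000 for Italy, 14400 for France). If the goal is instead a bar whose total length equals Tot, split by energy source, a possible variant is to map Twh directly to x. This is a sketch under that assumption, not a replacement for the options above; the object name `p` is made up.

```r
library(ggplot2)

dataset1 <- data.frame(
  Country = c(rep("Italy", 4), rep("France", 5)),
  Energy_Source = c("Biofuel", "Nuclear", "Gas", "Hydro",
                    "Biofuel", "Nuclear", "Gas", "Hydro", "Wind"),
  Twh = c(25, 15, 40, 20, 20, 75, 10, 5, 10),
  Tot = c(rep(100, 4), rep(120, 5))
)

# geom_col() stacks the Twh values within each country, so each
# bar ends at that country's Tot without duplicating any rows.
p <- ggplot(data = dataset1, mapping = aes(
  x = Twh,
  y = reorder(Country, Tot),
  fill = Energy_Source
)) +
  geom_col()

print(p)
```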