Stacked bar chart where each stacked layer is subdivided by an additional variable

Question (votes: 0, answers: 1)

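I would like to create a stacked bar chart where the x axis is a bin representing the number of unique sample IDs that share a clone, and the y axis is the number of clones in each bin (I don't want to show the clone_id itself, only the number of clones within each bin category). Each clone should then be further divided by the status of the cells that make up the clone. For example, if clone A consists of 10 cells, of which 4 have status red and 6 have status blue, I would like to show this by coloring the clone A layer of the stack 40% red and 60% blue. The next layer of the stack, clone B, might be 66% red and 33% blue, and so on. I have not found an example of what I am looking for, but if anyone knows a way to split the color of each individual layer of a bar chart, I would appreciate your suggestions!

Example data: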

new_df <- structure(list(
  clone_id = c(101, 101, 101, 101, 102, 102, 103, 103, 103, 103, 104, 104, 104, 104, 104, 104, 104, 104),
  sample_id = c(201, 201, 202, 202, 203, 204, 205, 206, 206, 206, 207, 207, 207, 207, 207, 207, 207, 208),
  status = c("red", "red", "blue", "blue", "red", "blue", "red", "blue", "blue", "blue", "red", "red", "red", "red", "red", "red", "red", "blue"),
  bin_id = c(4, 4, 4, 4, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8, 8, 8),
  perc_red = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25, 0.875, 0.875, 0.875, 0.875, 0.875, 0.875, 0.875, 0.875),
  perc_blue = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125)),
  class = "data.frame", row.names = c(NA, -18L))

Stacked bar chart code:

library(ggplot2)

# Uses the new_df data frame defined above
ggplot(new_df, aes(fill = clone_id, y = clone_id, x = bin_id)) +
  geom_bar(position = "stack", stat = "identity")

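With the code above I can make a stacked bar chart, but without the extra level of detail. The y axis is also confusing, because it is built from the clone IDs themselves, whereas I only want the number of clones in each bin's stack.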
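For the y-axis part alone, here is a minimal sketch, assuming dplyr is available. With `y = clone_id` and `stat = "identity"` the bar heights are sums of the ID values, so one option is to count the distinct clones per bin first and plot that count. The names `cells`, `clones_per_bin` and `p_counts` are illustrative and do not come from the original post; `cells` holds the same `clone_id` and `bin_id` values as `new_df`.

```r
library(dplyr)
library(ggplot2)

# Same clone_id / bin_id values as new_df (assumption: other columns not needed here)
cells <- data.frame(
  clone_id = rep(c(101, 102, 103, 104), times = c(4, 2, 4, 8)),
  bin_id   = rep(c(4, 2, 4, 8), times = c(4, 2, 4, 8))
)

# One row per (bin, clone), then count clones within each bin
clones_per_bin <- cells %>%
  distinct(bin_id, clone_id) %>%
  count(bin_id, name = "n_clones")

# Bar height is now the number of clones in the bin, not a sum of IDs
p_counts <- ggplot(clones_per_bin, aes(x = factor(bin_id), y = n_clones)) +
  geom_col() +
  labs(x = "bin_id", y = "Number of clones")
p_counts
```

This only addresses the axis; it does not yet split each layer by status.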

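I have not been able to find any example of the type of plot I am looking for, and I do not know how to split the color of each stacked layer.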

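Edit: I have simplified the data frame above.

I want to show, for each layer of the stacked bar chart, how much of it belongs to each of two categories (red and blue in this example). I am looking for something like this crude PowerPoint example.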

r ggplot2 charts bar-chart stacked-bar-chart
1 answer
0 votes

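What you describe sounds like a "mosaic plot" to me. I'm not sure what the expected result is for the data you provided (i.e. what the plot should look like), but here is my best guess.

Example data: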

library(tidyverse)
library(ggmosaic)

data <- structure(list(
  clone_id = c(101, 101, 101, 101, 101, 101, 101, 101, 101, 101,
               102, 102, 102, 102,
               103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103,
               104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
               105, 105, 105, 105, 105, 105, 105,
               106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106, 106,
               107, 107, 107),
  sample_id = c(201, 201, 201, 201, 202, 202, 204, 204, 204, 209,
                209, 210, 210, 210,
                203, 203, 204, 205, 206, 207, 208, 208, 208, 208, 208, 208, 208, 208, 209, 209, 210,
                207, 207, 207, 204, 204, 204, 204, 204, 204, 204,
                203, 203, 206, 207, 208, 208, 208,
                201, 201, 201, 202, 203, 203, 204, 204, 204, 204, 205, 205, 206, 207, 208, 208, 208, 208, 208, 208,
                204, 205, 205),
  status = c("red", "red", "red", "red", "blue", "blue", "blue", "blue", "blue", "red",
             "red", "blue", "blue", "blue",
             "red", "red", "blue", "red", "blue", "red", "blue", "blue", "blue", "blue", "blue", "blue", "blue", "blue", "red", "red", "blue",
             "red", "red", "red", "blue", "blue", "blue", "blue", "blue", "blue", "blue",
             "red", "red", "blue", "red", "blue", "blue", "blue",
             "red", "red", "red", "blue", "red", "red", "blue", "blue", "blue", "blue", "red", "red", "blue", "red", "blue", "blue", "blue", "blue", "blue", "blue",
             "blue", "red", "red"),
  bin_id = c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
             2, 2, 2, 2,
             8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
             2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
             4, 4, 4, 4, 4, 4, 4,
             8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
             2, 2, 2),
  perc_red = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
               0.25, 0.25, 0.25, 0.25,
               0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35, 0.35,
               0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,
               0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43,
               0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4,
               0.66, 0.66, 0.66),
  perc_blue = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
                0.75, 0.75, 0.75, 0.75,
                0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65,
                0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
                0.57, 0.57, 0.57, 0.57, 0.57, 0.57, 0.57,
                0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6,
                0.33, 0.33, 0.33)),
  class = "data.frame", row.names = c(NA, -71L))

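# Note: the .by argument of mutate() requires dplyr >= 1.1.0
data %>%
  # number of distinct samples in which each clone occurs
  mutate(unique_samples = n_distinct(sample_id), .by = clone_id) %>%
  # number of distinct clones found in each sample
  mutate(unique_clones = n_distinct(clone_id), .by = sample_id) %>%
  ggplot() +
  geom_mosaic(aes(x = product(unique_clones), conds = product(unique_samples),
                  fill = status), offset = 0.05, na.rm = FALSE) +
  theme_classic() +
  # status values ("red", "blue") are used directly as fill colors
  scale_fill_identity()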

Created on 2024-03-20 with reprex v2.1.0

Is this what you are trying to do? If not, what changes would you make to get the "correct" answer?
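If a mosaic plot is not what is wanted, a different technique is to draw the layers by hand with `geom_rect()`. The following is only a sketch under stated assumptions: each clone gets one unit of height in its bin, and each layer is split left to right in proportion to its red and blue cells. The names `cells`, `bins`, `bar_width`, `layers` and `p_layers` are illustrative and not part of the question; `cells` holds the same `clone_id`, `status` and `bin_id` values as the question's `new_df`.

```r
library(dplyr)
library(ggplot2)

# Same values as the question's new_df, reduced to the columns used here
cells <- data.frame(
  clone_id = rep(c(101, 102, 103, 104), times = c(4, 2, 4, 8)),
  status   = c("red", "red", "blue", "blue", "red", "blue",
               "red", "blue", "blue", "blue", rep("red", 7), "blue"),
  bin_id   = rep(c(4, 2, 4, 8), times = c(4, 2, 4, 8))
)

bins <- sort(unique(cells$bin_id))  # bins in x-axis order
bar_width <- 0.9                    # illustrative bar width

layers <- cells %>%
  count(bin_id, clone_id, status, name = "n_cells") %>%
  group_by(bin_id, clone_id) %>%
  arrange(desc(status), .by_group = TRUE) %>%   # red on the left, blue on the right
  mutate(frac  = n_cells / sum(n_cells),        # share of the clone's cells
         right = cumsum(frac),
         left  = right - frac) %>%
  group_by(bin_id) %>%
  mutate(layer = dense_rank(clone_id)) %>%      # 1, 2, ... = position in the stack
  ungroup() %>%
  mutate(x_mid = match(bin_id, bins))           # bar center on a numeric x axis

p_layers <- ggplot(layers) +
  geom_rect(aes(xmin = x_mid - bar_width / 2 + left * bar_width,
                xmax = x_mid - bar_width / 2 + right * bar_width,
                ymin = layer - 1, ymax = layer, fill = status),
            colour = "black") +
  scale_x_continuous(breaks = seq_along(bins), labels = as.character(bins)) +
  scale_fill_identity() +
  labs(x = "bin_id", y = "Number of clones") +
  theme_classic()
p_layers
```

With this layout the height of each bar equals the number of clones in the bin, and the fractions computed from the cell counts match the `perc_red` and `perc_blue` columns of `new_df`. The stacking order within a bin follows `clone_id`, which is an arbitrary choice.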
