Beautifying and ordering some variables in a Sankey/alluvial plot using R

Question (votes: 0, answers: 3)

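I'm working on improving my data visualization skills, and I have almost got what I want. At some point, though, I got stuck and couldn't move forward. Note that I did extensive research here first to try to resolve my doubts, and it helped me a lot.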

This is my dataset:

https://app.box.com/s/pp5p5chgypn6ba33anotie7wlxvdu01v

This is my code:

library(tidyverse)
library(ggalluvial)
library(alluvial)

A_col <- "firebrick3"
B_col <- "darkorange"
C_col <- "aquamarine2"
D_col <- "dodgerblue2"
E_col <- "darkviolet"
F_col <- "chartreuse2"
G_col <- "goldenrod1"
H_col <- "gray73"
set.seed(39)

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions, color = Positions), 
        width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum(width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", label.strata = TRUE) +
  scale_x_continuous(breaks = 1:3, 
       labels = c("Activity", "Category", "Positions/Movements"), expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  scale_fill_manual(values  = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  scale_color_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  ggtitle("Physical Activity during the week and weekend") +
  theme_minimal() +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))

# I also have this code that I run without pre-choosing the colours.
# I like this one because the flow diagram doesn't have any border.

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  scale_x_discrete(limits = c("Activity", "Category", "Positions/Movements"), 
       expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum() + geom_text(stat = "stratum", label.strata = TRUE) +
  theme_minimal() +
  ggtitle("Physical Activity during the week and weekend") +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))

This is the resulting visualization: [image]

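There are three things I really can't manage to do:

  1. Order Category to get a clear view of what comes after week and weekend, e.g. Working, Non Working, Sleep Week, Leisure, Sleep Weekend.

  2. Order the positions/movements, e.g. Sitting, Lying, Standing, Moving, Stairs, Walk Slow, Walk Fast, Running. I would also like to fill the boxes of this column with the same colours as the flows. Another thing: some names don't have enough space, and I don't know whether the space can be resized to fit them or whether they can be placed outside, with an arrow pointing to the box they belong to. I almost forgot: is there a way to assign a colour to each variable manually, e.g. black to Walk Slow? Also, if possible, I would like to remove the lines at the edges of the flows.

  3. Is there a way to stack the names Positions and Movements?

Is there any way to improve this visualization and make it look nice?

Thanks in advance, Luis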

r ggplot2 charts graph-visualization sankey-diagram
3 answers
3 votes

Here is a solution to some of your questions.

df <- read_csv('Desktop/plot_alluvial_category_position_plus_moviments.csv')
positions <- c("Sitting", "Lying", "Standing", "Moving", "Stairs", "Walk Slow",
               "Walk Fast", "Running")
df$Positions <- factor(df$Positions, levels = positions, labels = positions)
category <- c("Working", "Non Working", "Sleep Week", "Leisure", 
              "Sleep Weekend")
df$Category <- factor(df$Category, levels = category, labels = category)

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions), 
                width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum(width = 4/12, color = "grey36") +
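  # label.strata is deprecated in recent ggalluvial releases; after_stat(stratum) is the current form
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), min.height = 100) +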
  scale_x_continuous(breaks = 1:3, 
                     labels = c("Activity", "Category", "Positions\nMovements"), expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  scale_fill_manual(values  = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  scale_color_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  ggtitle("Physical activity during the week and weekend") +
  theme_minimal() +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))
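
  1. To order the strata, convert the Category and Positions columns to factors with the levels set in the order you want.
  2. To remove the borders of the flows, just drop color = Positions from the aes() call in geom_alluvium.
  3. You can stack the names "Positions" and "Movements" by adding a line break to the label.
  4. You can assign colours to the strata, but only if the categories are always the same (see some of the examples in the ggalluvial documentation).
  5. To avoid overlapping labels on small strata, you can use the min.height argument in geom_text, which was introduced in ggalluvial version 0.9.2.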
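As a minimal sketch of point 1, the base R snippet below shows how explicit factor levels replace the default alphabetical order. The toy data frame and its Time values are made up for illustration and are not the asker's dataset.

```r
# Toy data standing in for the asker's file (values are made up).
toy <- data.frame(
  Category = c("Leisure", "Working", "Sleep Week", "Non Working", "Sleep Weekend"),
  Time = c(120, 480, 420, 300, 480),
  stringsAsFactors = FALSE
)

# Without explicit levels, factor() sorts the labels alphabetically.
default_levels <- levels(factor(toy$Category))

# Explicit levels fix the order that sorting and plotting code will see.
category <- c("Working", "Non Working", "Sleep Week", "Leisure", "Sleep Weekend")
toy$Category <- factor(toy$Category, levels = category)

# Ordering rows by the factor follows the level order, not the alphabet.
ordered_toy <- toy[order(toy$Category), ]

print(default_levels)
print(as.character(ordered_toy$Category))
```

The same idea applies to the Positions column: whatever order is given in levels is the order the factor carries into the plot.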

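0 votes

Very helpful, thanks for posting! I found a workaround for #4 in @Arienrhod's answer (sorry, I can't just comment because of my low reputation). You can create a factor with the same length as your data, assign the individual categories in the right order, and map it in geom_stratum(aes(fill = your.factor), width = 4/12, color = "grey36"), then use scale_fill_manual() as shown above. It's tedious, but it works.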
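A different route from the factor-per-row workaround, offered only as a sketch: give scale_fill_manual() a named colour vector, and fill the strata through the computed stratum variable. The toy data, the Week/Weekend labels, and the choice of na.value are assumptions for illustration. Filling by after_stat(stratum) is assumed to behave here as it does in commonly shared ggalluvial examples.

```r
library(ggplot2)
library(ggalluvial)

positions <- c("Sitting", "Lying", "Standing", "Moving", "Stairs",
               "Walk Slow", "Walk Fast", "Running")

# Named palette: colours are matched to levels by name, so "Walk Slow"
# can be set to black regardless of its place in the vector.
pal <- c(Sitting = "firebrick3", Lying = "darkorange", Standing = "aquamarine2",
         Moving = "dodgerblue2", Stairs = "darkviolet", `Walk Slow` = "black",
         `Walk Fast` = "goldenrod1", Running = "gray73")

# Made-up toy data reusing the column names from the question.
set.seed(39)
toy <- data.frame(
  Activity = rep(c("Week", "Weekend"), each = 8),
  Positions = factor(rep(positions, times = 2), levels = positions),
  Time = sample(10:100, 16)
)

p <- ggplot(toy, aes(y = Time, axis1 = Activity, axis2 = Positions)) +
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5) +
  # Assumption: strata are filled by the computed `stratum` variable;
  # strata without an entry in `pal` fall back to na.value.
  geom_stratum(aes(fill = after_stat(stratum)), width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_fill_manual(values = pal, na.value = "grey90") +
  theme_minimal() +
  theme(legend.position = "none")

print(p)
```

Because the palette is matched by name, reordering the factor levels later does not change which colour each position gets.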


0 votes

I have a similar problem. I tried to solve it without much success. I'll paste my code here to see if you can help me.

# Levels must be unique, so take the distinct values in order of first appearance
tempo_1 <- unique(as.character(Sankey$Tempo_1))
Sankey$Tempo_1 <- factor(Sankey$Tempo_1, levels = tempo_1, labels = tempo_1)
tempo_2 <- unique(as.character(Sankey$Tempo_2))
Sankey$Tempo_2 <- factor(Sankey$Tempo_2, levels = tempo_2, labels = tempo_2)

ggplot(Sankey, aes(axis1 = Tempo_1, axis2 = Tempo_2, y = Valor)) +
  geom_alluvium(aes(fill = Tempo_1)) +
  geom_stratum(aes(fill = NULL), color = NA) +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  theme_minimal() +
  theme(panel.border = element_blank(), legend.position = "none") +
  labs(title = "Important Cytokines in time")
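
When factor levels are taken from the data itself, each level may appear only once. The base R sketch below, using made-up values in place of a column such as Sankey$Tempo_1, shows why unique() matters in that step.

```r
# Made-up values standing in for a column such as Sankey$Tempo_1.
x <- c("T1", "T2", "T1", "T3", "T2")

# Passing the raw column as levels fails when values repeat,
# because factor() rejects duplicated levels (R >= 3.4.0).
raw_attempt <- tryCatch(
  factor(x, levels = x),
  error = function(e) conditionMessage(e)
)

# unique() keeps each value once, in order of first appearance.
f <- factor(x, levels = unique(x))

print(raw_attempt)
print(levels(f))
```

If a specific order is wanted instead of first appearance, the levels can be written out by hand, as in the first answer.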
