Labeling specific points on a volcano plot


I can't get my points of interest to show up on a ggplot. When I select my genes in geom_label_repel(), only 3 genes are labeled, even though I know 57 of them are in the list.

Figure 1. This is the result I get when I run geom_label_repel.

Figure 2. This is what I get with geom_text, but some of the text over the blue points is hard to read, so I am trying to put the labels in boxes.

Figure 3. I would like Figure 2 to look like this, but with the genes from Figure 2 shown in boxes so they are easier to read.

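I tried changing geom_label_repel to geom_text, but the boxes drawn by geom_label_repel make the text easier to read. I also isolated the points of interest and plotted them on their own, which displayed all of them, so I know the problem is somewhere in the geom_label_repel call.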

  results_df$Gene <- rownames(resultsLFC_df)
  
  #create named character vector for color palette
  palette <- c("Upregulated" = "red",
               "Downregulated" = "blue",
               "Not_significant" = "gray")

  # show volcano plot
  results_df %>%
    mutate(Expression = if_else(padj < custom_alpha & log2FoldChange > 0, "Upregulated",
                               if_else(padj < custom_alpha & log2FoldChange < 0, "Downregulated", "Not_significant"))) %>%
    ggplot(aes(x = log2FoldChange, y = -log10(padj), color = Expression)) +
    geom_point(alpha = 0.8, size = 0.5) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    geom_hline(yintercept = -log10(custom_alpha), linetype = "dashed") +
    

    geom_text(
      aes(label = ifelse(Gene %in% genes_to_label, as.character(Gene), "")), color = "black",
      arrow = arrow(length = unit(0.02, "npc")),
      box.padding=.5, point.padding=0.5, segment.color="black", show.legend=FALSE, max.overlaps = 10,
      hjust=0,vjust=0) +

    
    # geom_label_repel(
    #   aes(label = if_else(Gene %in% genes_to_label, Gene, "")),
    #   arrow = arrow(length = unit(0.02, "npc")),
    #   box.padding=.1, point.padding=0.5, segment.color="gray70", show.legend=FALSE, max.overlaps = 20
    # ) +
    
    labs(title = condition_contrast, x = "log2(Fold Change)", y = "-log10(padj)") +
    scale_color_manual(values = palette, limits = names(palette))+
    theme_classic()

I have included both my geom_text and my geom_label_repel attempts, since I have been trying to get either one to work.

r ggplot2 geom-text
1 Answer

Here are two options: the first uses geom_text, and the second uses ggrepel::geom_label_repel.

With geom_text, changing the horizontal justification of the text helps a lot with overplotting (in my opinion).

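With geom_label_repel, you can use the nudge_x and nudge_y arguments, and also raise max.overlaps above its default of 10. The likely reason your labels are not showing is that max.overlaps is too low. In my experience this works well for volcano plots in reports, even though a label may not sit right next to the point it describes, because otherwise there would have to be a lot of overplotting.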
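
Before raising the limit, it may be worth ruling out a simpler cause: requested labels that never match a row in the data. The sketch below is a minimal check, not the asker's actual data. The results_df and genes_to_label objects here are small made-up stand-ins that only reuse the names from the question.

```r
# Hypothetical stand-ins for the question's objects (not the real data).
results_df <- data.frame(Gene = paste0("G", 1:1000))
genes_to_label <- c("G1", "G5", "G999", "not_in_data")

# Genes that can actually receive a label, and genes that match nothing.
present <- intersect(genes_to_label, results_df$Gene)
missing <- setdiff(genes_to_label, results_df$Gene)

length(present)  # number of requested genes found in the data
missing          # requested genes absent from the data

# If every gene is found, the missing labels are being dropped by ggrepel.
# The limit can be lifted per layer (max.overlaps = Inf) or, as here,
# for the whole session through the ggrepel.max.overlaps option.
options(ggrepel.max.overlaps = Inf)
```

When ggrepel drops labels it normally emits a warning about unlabeled data points and suggests increasing max.overlaps, so the console output is also worth checking.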

library(ggplot2)
library(dplyr)  # for if_else()

set.seed(123)
df <- data.frame(
  padj = runif(1000),
  log2FoldChange = rnorm(1000),
  Gene = paste0("G", 1:1000)
)
palette <- c("Upregulated" = "red",
             "Downregulated" = "blue",
             "Not_significant" = "gray")
custom_alpha <- 0.05
df$Expression <- if_else(df$padj < custom_alpha & df$log2FoldChange > 0, "Upregulated",
                         if_else(df$padj < custom_alpha & df$log2FoldChange < 0,
                                 "Downregulated", "Not_significant"))
genes_to_label <- df[df$Expression != "Not_significant", "Gene"]

ggplot(df, aes(x = log2FoldChange, y = -log10(padj), color = Expression)) +
  geom_point(alpha = 0.8, size = 0.5) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_hline(yintercept = -log10(custom_alpha), linetype = "dashed") +
  geom_text(
    data = df[df$Gene %in% genes_to_label,],
    aes(label = Gene), color = "black",
    show.legend=FALSE,
    # varying hjust by row might help make this easier to read by biasing the downregulated labels
    # leftward and the upregulated labels rightward.
    hjust = ifelse(df[df$Gene %in% genes_to_label, "Expression"] == "Upregulated", 0, 1),
    size = 3, vjust = -0.5) +
  labs(title = "geom text", x = "log2(Fold Change)", y = "-log10(padj)") +
  scale_color_manual(values = palette, limits = names(palette))+
  theme_classic()


ggplot(df, aes(x = log2FoldChange, y = -log10(padj), color = Expression)) +
  geom_point(alpha = 0.8, size = 0.5) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_hline(yintercept = -log10(custom_alpha), linetype = "dashed") +
  # here the geom_label_repel call allows unlimited overlaps and nudges the labels based on
  # the gene regulation (by 3 or -3). You could also just call geom_label_repel twice.
  ggrepel::geom_label_repel(
    data = df[df$Gene %in% genes_to_label,],
    aes(label = Gene),
    arrow = arrow(length = unit(0.02, "npc")),
    nudge_x = 3 * ifelse(df[df$Gene %in% genes_to_label, "Expression"] == "Upregulated", 1, -1),
    nudge_y = 1,
    box.padding=.1, point.padding=0.5, segment.color="gray70", show.legend=FALSE, max.overlaps = Inf
  ) +
  labs(title = "geom label repel", x = "log2(Fold Change)", y = "-log10(padj)") +
  scale_color_manual(values = palette, limits = names(palette))+
  theme_classic()

Figures: the geom_text version and the geom_label_repel version.
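
The comment in the second plot mentions calling geom_label_repel twice instead of passing a vector to nudge_x. A minimal sketch of that variant, on the same kind of simulated data, might look like the following. The up and down subsets are names introduced here for illustration, and the nudge values are simply the ones used above, not tuned recommendations.

```r
library(ggplot2)
library(ggrepel)

# Simulated data, same shape as in the answer's example.
set.seed(123)
df <- data.frame(
  padj = runif(1000),
  log2FoldChange = rnorm(1000),
  Gene = paste0("G", 1:1000)
)
custom_alpha <- 0.05
df$Expression <- ifelse(
  df$padj < custom_alpha & df$log2FoldChange > 0, "Upregulated",
  ifelse(df$padj < custom_alpha & df$log2FoldChange < 0,
         "Downregulated", "Not_significant")
)
palette <- c("Upregulated" = "red",
             "Downregulated" = "blue",
             "Not_significant" = "gray")

# One subset per direction, so each layer gets a single scalar nudge.
up   <- df[df$Expression == "Upregulated", ]
down <- df[df$Expression == "Downregulated", ]

p <- ggplot(df, aes(x = log2FoldChange, y = -log10(padj), color = Expression)) +
  geom_point(alpha = 0.8, size = 0.5) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_hline(yintercept = -log10(custom_alpha), linetype = "dashed") +
  # Upregulated labels are pushed to the right.
  geom_label_repel(
    data = up, aes(label = Gene),
    nudge_x = 3, nudge_y = 1,
    box.padding = 0.1, point.padding = 0.5,
    segment.color = "gray70", show.legend = FALSE, max.overlaps = Inf
  ) +
  # Downregulated labels are pushed to the left.
  geom_label_repel(
    data = down, aes(label = Gene),
    nudge_x = -3, nudge_y = 1,
    box.padding = 0.1, point.padding = 0.5,
    segment.color = "gray70", show.legend = FALSE, max.overlaps = Inf
  ) +
  labs(title = "geom label repel, two calls",
       x = "log2(Fold Change)", y = "-log10(padj)") +
  scale_color_manual(values = palette, limits = names(palette)) +
  theme_classic()

p
```

Splitting the data keeps each call simple to read, at the cost that the two layers are repelled independently, so labels from different layers may still overlap each other.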
