Getting ggpairs to display plots of all predictors against a single response


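I'm trying to figure out a way to use ggpairs for easier exploratory data analysis. My real dataset has about 50 predictors and one response, so I can't just make a standard scatterplot matrix with ggpairs. Thanks to this question, I've figured out how to plot only the top row of the ggpairs plot (minus the diagonal density plot for the response):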

library(ggplot2)
library(GGally)
data(mtcars)
mtcars$vs <- as.factor(mtcars$vs)
mtcars$am <- as.factor(mtcars$am)
mtcars$gear <- as.factor(mtcars$gear)
mtcars$carb <- as.factor(mtcars$carb)
mtcars$cyl <- as.factor(mtcars$cyl)

primary_var <- "mpg"
pairs <- ggpairs(mtcars, columns = c(1:11), 
                 upper = list(continuous = "points", combo = "box_no_facet", discrete = "count", na = "na"),
                 lower = list(continuous = "cor", combo = "facethist", discrete = "facetbar", na = "na"))
pvar_pos <- match(primary_var, pairs$yAxisLabels)
plots <- lapply(2:pairs$ncol, function(j) getPlot(pairs, i = pvar_pos, j = j))
ggmatrix(
  plots,
  nrow = 1,
  ncol = pairs$ncol-1,
  xAxisLabels = pairs$xAxisLabels[-1],
  yAxisLabels = primary_var
)

a ten-panel figure (1 row, 10 col) of scatterplots and boxplots showing mpg on the y-axis and various predictors from mtcars on the x-axis

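This works well, but I'd like to turn it into a 2x5 plot matrix rather than a 1x10 matrix (scale it up to my ~50 predictors to see why). This seems possible, since nrow and ncol are arguments (with no defaults), but wrapping the panels like this appears to be incompatible with labelling them with strip titles: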

ggmatrix(
  plots,
  nrow = 2,
  ncol = 5,
  xAxisLabels = pairs$xAxisLabels[-1],
  yAxisLabels = primary_var
)
Error in pmg$grobs[[grob_pos]] <- axis_panel : 
  attempt to select more than one element in integerOneIndex

If we remove the xAxisLabels and yAxisLabels arguments, it does what I want, but without labels, and it ignores the proper x-axis tick marks for the top row...

ggmatrix(
  plots,
  nrow = 2,
  ncol = 5
)

a ten-panel figure (2 row, 5 col) of scatterplots and boxplots with no axis labels

Is there a way to both wrap the set of panels and keep the variable and axis labels?

r ggplot2 ggally ggpairs
1 Answer
One potential solution is to split the list of plots into n rows and then "stitch" the rows together with the patchwork package, e.g.

library(tidyverse)
library(GGally)
library(patchwork)

data(mtcars)
mtcars$vs <- as.factor(mtcars$vs)
mtcars$am <- as.factor(mtcars$am)
mtcars$gear <- as.factor(mtcars$gear)
mtcars$carb <- as.factor(mtcars$carb)
mtcars$cyl <- as.factor(mtcars$cyl)

primary_var <- "mpg"
pairs <- ggpairs(mtcars, columns = c(1:11),
                 upper = list(continuous = "points", combo = "box_no_facet", discrete = "count", na = "na"),
                 lower = list(continuous = "cor", combo = "facethist", discrete = "facetbar", na = "na"))
pvar_pos <- match(primary_var, pairs$yAxisLabels)
plots <- lapply(2:pairs$ncol, function(j) getPlot(pairs, i = pvar_pos, j = j))

plot_ggpairs_primary_vars <- function(plots = plots, n_rows = NULL, n_cols = NULL) {
  if (is.null(n_rows) | is.null(n_cols)) {
    stop("n_rows and n_cols must be specified")
  }
  lst <- split(plots, 1:n_rows)
  output <- list()
  for (i in 1:length(lst)) {
    output[[i]] <- wrap_elements(ggmatrix_gtable(ggmatrix(
      lst[[i]],
      nrow = 1,
      ncol = n_cols,
      xAxisLabels = unlist(map(1:n_cols, ~pluck(lst[[i]], .x, "labels", "x"))),
      yAxisLabels = primary_var
    )))
  }
  wrap_plots(output, nrow = n_rows)
}

plot_ggpairs_primary_vars(plots, n_rows = 2, n_cols = 5)

2 by 5 pairs plot

plot_ggpairs_primary_vars(plots, n_rows = 5, n_cols = 2)

5 by 2 pairs plot

plot_ggpairs_primary_vars(plots, n_rows = 3, n_cols = 5)
#> Warning in split.default(plots, 1:n_rows): data length is not a multiple of
#> split variable

uneven number pairs plot
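
A note on panel order: split(plots, 1:n_rows) recycles the grouping vector 1:n_rows, so panels are dealt out to the rows in turn (panel 1 to row 1, panel 2 to row 2, and so on) rather than filling each row left to right. That recycling is also what triggers the warning when the number of panels is not a multiple of n_rows. The following is a minimal sketch of one alternative grouping, using only base R. The helper name chunk_in_order is hypothetical (it is not part of GGally or patchwork), and the predictor names stand in for the list of panels so the grouping is easy to inspect.

```r
# Hypothetical helper (not from GGally or patchwork): group a list or vector
# into consecutive chunks of at most n_cols elements, preserving order.
chunk_in_order <- function(x, n_cols) {
  stopifnot(n_cols >= 1)
  # Elements 1..n_cols get group 1, the next n_cols get group 2, and so on
  groups <- ceiling(seq_along(x) / n_cols)
  unname(split(x, groups))
}

# Stand-in for the ten panels: the predictor names in column order
predictors <- setdiff(names(mtcars), "mpg")

# Round-robin grouping, as in split(plots, 1:n_rows) with n_rows = 2
interleaved <- split(predictors, 1:2)

# Consecutive grouping with five panels per row
in_order <- chunk_in_order(predictors, n_cols = 5)

print(interleaved)
print(in_order)
```

Under this assumption, replacing split(plots, 1:n_rows) with a call such as chunk_in_order(plots, n_cols) would keep the predictors in their original column order, and a short final row would not produce the recycling warning. Whether ggmatrix lays out a short row as desired still needs checking against your own data.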

Created on 2024-12-20 with reprex v2.1.0
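
If the ggmatrix strip labels are not essential, a simpler variant may be enough: the objects returned by getPlot are ordinary ggplot objects, so they can be labelled individually and handed straight to patchwork. This is a sketch of a different technique from the function above, not a drop-in replacement. It assumes, as the question's code does, that the response is the first column, and it puts the variable names in the axis titles of each panel instead of in strips.

```r
library(ggplot2)
library(GGally)
library(patchwork)

data(mtcars)
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
mtcars[factor_cols] <- lapply(mtcars[factor_cols], as.factor)

primary_var <- "mpg"
pairs <- ggpairs(
  mtcars, columns = 1:11,
  upper = list(continuous = "points", combo = "box_no_facet",
               discrete = "count", na = "na")
)
pvar_pos <- match(primary_var, pairs$yAxisLabels)

# Pull the top-row panels and give each one explicit axis titles.
# Assumes the response is column 1, so the predictors are columns 2..ncol.
panels <- lapply(2:pairs$ncol, function(j) {
  getPlot(pairs, i = pvar_pos, j = j) +
    labs(x = pairs$xAxisLabels[j], y = primary_var)
})

# Wrap the ten panels into rows of five; each panel keeps its own axes
wrapped <- wrap_plots(panels, ncol = 5)
wrapped
```

The trade-off is that every panel repeats the y-axis title and tick labels, so the result is less compact than a ggmatrix row, but the number of panels no longer has to match nrow times ncol.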
