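I have an elaborate plotting routine that generates box plots with an overlaid scatter layer and adds them to a list of plots.
The routine produces correct plots if they are printed directly inside the for loop via print(current_plot_complete).
However, if the plots are added to a list during the for loop and the list is only printed at the end, the plots are wrong: the final index is used to generate all of them (instead of the index that was current when each plot was created). This seems to be default ggplot2 behaviour, and I am looking for a way to circumvent it in my use case.
The problem seems to lie in y = eval(parse(text=(paste0(COL_i)))), where the global environment is used (and therefore the final index value) rather than the value that was current while the loop was executing.
I have tried various approaches to make eval() use the correct variable value, such as local(…) or specifying an environment, but without success.
A heavily simplified MWE is provided below.
The original routine is much more elaborate than this MWE, so the for loop cannot easily be replaced with members of the apply family.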
library(tidyr)    # gather()
library(dplyr)    # %>%, filter(), select(), all_of()
library(ggplot2)  # ggplot(), aes(), geom_boxplot(), geom_point()

# create some random data
data_temp <- data.frame(
  "a" = sample(x = 1:100, size = 50),
  "b" = rnorm(n = 50, mean = 45, sd = 1),
  "c" = sample(x = 20:70, size = 50),
  "d" = rnorm(n = 50, mean = 40, sd = 15),
  "e" = rnorm(n = 50, mean = 50, sd = 10),
  "f" = rnorm(n = 50, mean = 45, sd = 1),
  "g" = sample(x = 20:70, size = 50)
)
COLs_current <- c("a", "b", "c", "d", "e") # define COLs of data to include in box plots
choice_COLs <- c("a", "d") # define COLs of data to add scatter to
plot_list <- list() # start with an empty list; plots are stored by index below
plot_index <- 1
for (COL_i in choice_COLs) {
  COL_i_index <- which(COL_i == COLs_current)
  # Generate "basis boxplot" (to plot scatterplot on top)
  boxplot_scores <- data_temp %>%
    gather(COL, score, all_of(COLs_current)) %>%
    ggplot(aes(x = COL, y = score)) +
    geom_boxplot()
  # Get relevant data of COL_i for scattering: data of 4th quartile
  quartile_values <- quantile(data_temp[[COL_i]])
  threshold <- quartile_values["75%"] # threshold = third quartile value
  data_temp_filtered <- data_temp %>%
    filter(.data[[COL_i]] > threshold) %>% # keep the rows in the 4th quartile
    dplyr::select(all_of(COLs_current))
  # Create layer of scatter for 4th quartile of COL_i
  scatter_COL_i <- geom_point(
    data = data_temp_filtered,
    mapping = aes(x = COL_i_index, y = eval(parse(text = (paste0(COL_i))))),
    color = "orange"
  )
  # add geom objects to create final plot for COL_i
  current_plot_complete <- boxplot_scores + scatter_COL_i
  print(current_plot_complete)
  plot_list[[plot_index]] <- current_plot_complete
  plot_index <- plot_index + 1
}
plot_list
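I think the problem is that ggplot uses lazy evaluation. By the time the list is rendered, the loop index has its final value, and that value is used to evaluate every element in the list.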
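The effect can be reproduced on a much smaller scale than the MWE. The following is only a sketch with made-up data (df, col and plots are hypothetical names, not part of the original routine): the mapping is stored unevaluated by aes() and is only evaluated when the plot is built, here via layer_data().

```r
library(ggplot2)

# Tiny illustrative data set (hypothetical, not the MWE data)
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

plots <- list()
for (col in c("a", "b")) {
  # aes() stores the expression itself; `col` is not looked up yet
  plots[[col]] <- ggplot(df, aes(x = a, y = eval(parse(text = col)))) +
    geom_point()
}

# layer_data() builds the plot, and only then is the y mapping evaluated.
# At this point `col` holds its final value "b" for both plots.
y_a <- layer_data(plots[["a"]])$y
y_b <- layer_data(plots[["b"]])$y

print(y_a)
print(y_b)
```

Both vectors should contain the values of column b, which matches what the question describes for the stored plots.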
This post is related.
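One possible way around this, sketched here under the assumption that a ggplot2 version with tidy evaluation support in aes() (3.0.0 or later) is available, is to inject the current loop value into the mapping with `!!` at the moment aes() is called. The data and names below (df, col, plots) are again hypothetical and much simpler than the original routine.

```r
library(ggplot2)

# Tiny illustrative data set (hypothetical, not the MWE data)
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

plots <- list()
for (col in c("a", "b")) {
  # `!!col` inserts the current string into the expression right away,
  # so the stored mapping no longer depends on the loop variable
  plots[[col]] <- ggplot(df, aes(x = a, y = .data[[!!col]])) +
    geom_point()
}

# Building the plots after the loop now gives one column per plot
y_a <- layer_data(plots[["a"]])$y
y_b <- layer_data(plots[["b"]])$y

print(y_a)
print(y_b)
```

Applied to the MWE, the analogous change would presumably be mapping = aes(x = !!COL_i_index, y = .data[[!!COL_i]]) in the geom_point() call, since COL_i_index is looked up just as late as COL_i. I have not tested this against the full original routine.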