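Hi colleagues,
I am trying to build a distribution plot that satisfies the following conditions:
Problem: To visualize everything "over the limit", I have to make the x-axis discrete; otherwise the last bar could be practically endless, covering every value from the limit up to the maximum. But to place a vertical intercept at a specified point, the x-axis should be continuous.
Any ideas on how to solve this?
Code: Here is a code sample: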
library(dplyr)
library(ggplot2)

# Simulated elapsed times: 1000 values between 0 and 1000, rounded to integers
data <- data.frame(value = runif(1000, min = 0, max = 1000))
data$value <- round(data$value, digits = 0)
median_elapsed <- median(data$value)

# 10 equal-width bins from 0 to the median, plus one open-ended bin above it
bin_breaks <- c(seq(0, median_elapsed, length.out = 11), Inf)

# Label each regular bin by its lower edge; the last label marks the overflow bin
bin_labels <- c(seq(0,
                    median_elapsed - (median_elapsed / 10),
                    length.out = 10),
                paste0("> ", median_elapsed))

data$bins <- cut(data$value,
                 breaks = bin_breaks,
                 labels = bin_labels,
                 include.lowest = TRUE,
                 right = FALSE)

# Share of observations per bin, in percent
get_home_data_percent <- data %>%
  group_by(bins) %>%
  summarize(count = n()) %>%
  mutate(percentage = count / sum(count) * 100)

# just = 0 makes each bar start at its axis break (needs ggplot2 >= 3.4.0)
ggplot(get_home_data_percent, aes(x = bins, y = percentage)) +
  geom_bar(stat = "identity", just = 0) +
  scale_x_discrete(drop = FALSE) +
  labs(x = "Elapsed Time",
       y = "Percentage",
       title = "Histogram of Elapsed Time") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Plot: So here I have almost everything I need, except the vertical line at the median, because the x-axis is discrete.
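One possible workaround that keeps the discrete axis: ggplot2 places the levels of a discrete scale at the integer positions 1, 2, 3, ..., and geom_vline() accepts a numeric xintercept on such a scale. The sketch below rebuilds the same binning with a fixed seed (an assumption, only for reproducibility) and draws the line at the position of the overflow bin. The names pct and overflow_pos are introduced here for illustration, and the line position assumes that just = 0 makes each bar start at its axis break.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
data <- data.frame(value = round(runif(1000, min = 0, max = 1000)))
med <- median(data$value)

# Same binning idea as in the question: 10 regular bins plus one overflow bin
bin_breaks <- c(seq(0, med, length.out = 11), Inf)
bin_labels <- c(seq(0, med - med / 10, length.out = 10), paste0("> ", med))
data$bins <- cut(data$value, breaks = bin_breaks, labels = bin_labels,
                 include.lowest = TRUE, right = FALSE)

# Percentage per bin; .drop = FALSE keeps empty bins as rows with n = 0
pct <- data %>%
  count(bins, .drop = FALSE) %>%
  mutate(percentage = n / sum(n) * 100)

# Discrete levels sit at positions 1, 2, ..., so the overflow bin (the last
# level) is at position length(bin_labels). With just = 0 the bar starts
# there, which is where the median boundary lies.
overflow_pos <- length(bin_labels)  # hypothetical helper name

p <- ggplot(pct, aes(x = bins, y = percentage)) +
  geom_bar(stat = "identity", just = 0) +
  geom_vline(xintercept = overflow_pos, colour = "red") +
  scale_x_discrete(drop = FALSE) +
  labs(x = "Elapsed Time", y = "Percentage")
p
```

This only works for lines that fall on a bin boundary or at a position you can compute by hand; a value in the middle of a bin would have to be interpolated between the integer positions.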
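# Alternative: keep the x-axis continuous and fold everything at or above
# the median into a single bin that starts at the median.
# Uses `data` from the question; `.by` needs dplyr >= 1.1.0.
med = median(data$value)
# Note: med / 11 gives 11 bins below the median; use med / 10 to match
# the 10 bins in the question.
bin_size = med / 11

data %>%
  # Values below the median go to the lower edge of their bin,
  # everything else goes to the overflow bin at the median
  mutate(bin = if_else(value < med,
                       value %/% bin_size * bin_size, med)) %>%
  summarize(n = n(), .by = bin) %>%
  # Shift by half a bin so each bar is centered within its bin
  ggplot(aes(bin + bin_size/2, n)) +
  geom_col() +
  geom_vline(xintercept = med) +
  scale_x_continuous(breaks = scales::breaks_width(bin_size),
                     labels = scales::number_format(accuracy = 0.1))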
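With a continuous axis the last bar is labeled like any other bin, so it is not obvious that it collects everything from the median upward. A minimal sketch of one way to label it, assuming 10 regular bins as in the question and a fixed seed for reproducibility; n_bins, edges, edge_labels and binned are names made up for this example:

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
data <- data.frame(value = round(runif(1000, min = 0, max = 1000)))

med <- median(data$value)
n_bins <- 10                # assumption: 10 regular bins, as in the question
bin_size <- med / n_bins

binned <- data %>%
  mutate(bin = if_else(
    value < med,
    # pmin() guards against floating-point rounding pushing a value
    # just below the median into a bin index of n_bins
    pmin(floor(value / bin_size), n_bins - 1) * bin_size,
    med
  )) %>%
  summarize(n = n(), .by = bin)

# Put the axis breaks on the bin edges and give the last edge (the median)
# a label that marks the open-ended overflow bin
edges <- (0:n_bins) * bin_size
edge_labels <- c(scales::number(edges[-length(edges)], accuracy = 0.1),
                 paste0(">= ", med))

p <- ggplot(binned, aes(bin + bin_size / 2, n)) +
  geom_col(width = bin_size) +
  geom_vline(xintercept = med, colour = "red") +
  scale_x_continuous(breaks = edges, labels = edge_labels) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
p
```

Setting width = bin_size makes neighboring bars touch, so the overflow bar spans exactly one bin width to the right of the median line.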