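I have a tibble that contains train route data, along with whether each rider is a member or not. Using a ggplot bar plot, I put the starting station name on x, the count on y, and set the fill colour by whether the rider is a member.
But there are over 700 stations, so the chart is a mess. I would like to pick the top 10 (most frequent) and the bottom 10 (least frequent) stations. The problem is that I don't think I can use the standard slice_min and slice_max, because the count column doesn't exist: I am relying on ggplot's default behaviour of putting the count on the y axis rather than having a count column in the data.
Is there a way to select only the top 10 and bottom 10 counts so that the chart isn't crowded, and additionally to show the top and the bottom as 2 subplots?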
# A tibble: 6 × 3
  starting_station_name                ending_station_name                      member_status
  <chr>                                <chr>                                    <chr>
1 American University East Campus      39th & Veazey St NW                      member
2 Washington & Independence Ave SW/HHS Independence Ave & L'Enfant Plaza SW/DOE member
3 15th St & Massachusetts Ave SE       12th St & Pennsylvania Ave SE            member
4 New Hampshire Ave & Ward Pl NW       14th & Rhode Island Ave NW               casual
5 11th & Girard St NW                  Georgia & New Hampshire Ave NW           member
6 15th & W St NW                       California St & Florida Ave NW           member
Using the code:
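library(ggplot2)

# Keep columns 5, 7 and 8 of rides_cleaned (the three columns shown above)
rides_stations <- subset(rides_cleaned, select = c(5, 7, 8))
# geom_bar() counts the rows per station itself, so the data has no count column
q1 <- ggplot(rides_stations, aes(x = starting_station_name, fill = member_status)) +
  geom_bar()
q1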
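One approach is to count the rides first, so that the count column exists, and then keep only the 10 most frequent stations with fct_lump():

library(dplyr); library(forcats); library(ggplot2)

# Simulated data: 26 stations (letters) with skewed frequencies
data.frame(starting_station_name = sample(letters, 500, TRUE, prob = 26:1),
           member_status = sample(c("casual", "member"), 500, TRUE)) |>
  # Create the count column n explicitly instead of relying on geom_bar()
  count(starting_station_name, member_status) |>
  mutate(starting_station_name = factor(starting_station_name) |>
           fct_lump(n = 10, w = n) |>  # keep the 10 most frequent, lump the rest into "Other"
           fct_reorder(-n, sum)) |>    # order stations from most to least frequent
  filter(starting_station_name != "Other") |>
  ggplot(aes(starting_station_name, n, fill = member_status)) +
  geom_col()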
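
The code above covers only the top 10. For the bottom 10 and the two subplots asked about in the question, a minimal sketch could look like the following. It assumes the same column names as in the question and uses simulated data in place of rides_stations. The names station_totals, top10, bottom10, selected, plot_data and the group labels are made up for illustration. Ties at the cut-off are broken arbitrarily by with_ties = FALSE, which may or may not be what you want.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Simulated stand-in for rides_stations (hypothetical data, 40 stations)
set.seed(42)
rides_stations <- data.frame(
  starting_station_name = sample(paste("Station", 1:40), 2000, TRUE, prob = 40:1),
  member_status = sample(c("casual", "member"), 2000, TRUE)
)

# Total rides per station: the count column that geom_bar() would otherwise compute
station_totals <- count(rides_stations, starting_station_name, name = "total")

# slice_max()/slice_min() now work because the count column exists;
# with_ties = FALSE returns exactly 10 rows each
top10 <- slice_max(station_totals, total, n = 10, with_ties = FALSE) |>
  mutate(group = "Top 10")
bottom10 <- slice_min(station_totals, total, n = 10, with_ties = FALSE) |>
  mutate(group = "Bottom 10")
selected <- bind_rows(top10, bottom10)

# Per-station, per-status counts, restricted to the 20 selected stations
plot_data <- rides_stations |>
  count(starting_station_name, member_status) |>
  inner_join(selected, by = "starting_station_name") |>
  mutate(
    group = factor(group, levels = c("Top 10", "Bottom 10")),
    starting_station_name = fct_reorder(starting_station_name, -total)
  )

# One panel per group; free scales so each panel shows only its own stations
p <- ggplot(plot_data, aes(starting_station_name, n, fill = member_status)) +
  geom_col() +
  facet_wrap(~ group, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
p
```

scales = "free" matters here: without it both panels share one x axis with all 20 stations, and the bottom 10 bars are dwarfed by the y range of the top 10.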