Select only the top 10 bars in a ggplot2 bar chart when the count is not a column

Question

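I have a tibble containing train route data and whether each rider is a member. Using a ggplot bar chart, I put the starting station name on x and the count on y, with the fill color based on whether the rider is a member.

However, there are over 700 stations, so the chart is cluttered. I would like to pick the top 10 (most frequent) and the bottom 10 (least frequent) stations. The problem is that I don't think I can use the standard slice_min and slice_max, because no count column exists: I am relying on ggplot's default behavior of putting the count on the y axis rather than on a count column.

Is there a way to select only the top 10 and bottom 10 counts so that the chart is not crowded, and additionally to display the top and bottom as 2 subplots?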

# A tibble: 6 × 3
  starting_station_name                ending_station_name                      member_status
  <chr>                                <chr>                                    <chr>        
1 American University East Campus      39th & Veazey St NW                      member       
2 Washington & Independence Ave SW/HHS Independence Ave & L'Enfant Plaza SW/DOE member       
3 15th St & Massachusetts Ave SE       12th St & Pennsylvania Ave SE            member       
4 New Hampshire Ave & Ward Pl NW       14th & Rhode Island Ave NW               casual       
5 11th & Girard St NW                  Georgia & New Hampshire Ave NW           member       
6 15th & W St NW                       California St & Florida Ave NW           member

Code used:

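library(ggplot2)

# Keep the three columns shown above (columns 5, 7 and 8 of rides_cleaned)
rides_stations <- subset(rides_cleaned, select = c(5,7,8))

# geom_bar() counts the rows per station itself, so no count column is needed
q1 <- ggplot(rides_stations, aes(x=starting_station_name, fill = member_status)) + 
  geom_bar()

q1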

This produces a severely crowded chart.

Answer 1

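library(dplyr); library(forcats); library(ggplot2)

# Simulated data standing in for rides_stations
data.frame(starting_station_name = sample(letters, 500, TRUE, prob = 26:1),
           member_status = sample(c("casual", "member"), 500, TRUE)) |>
  # Make the count an explicit column named n
  count(starting_station_name, member_status) |> 
  mutate(starting_station_name = factor(starting_station_name) |>
           # Keep the 10 most frequent stations (weighted by n), lump the rest into "Other"
           fct_lump(n = 10, w = n) |>
           # Order the bars from most to least frequent
           fct_reorder(-n, sum)) |>
  # Drop the lumped remainder so only the top stations are plotted
  filter(starting_station_name != "Other") |>
  ggplot(aes(starting_station_name, n, fill = member_status)) + 
  geom_col()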
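
The question also asks for the bottom 10 stations and for the two groups to appear as separate subplots. Here is a minimal sketch of one way to do that, assuming the data has at least 20 distinct stations so the two groups do not overlap. It uses simulated data, and the names rides, totals, top10, bottom10, selected and plot_data are illustrative rather than taken from the original post. The idea is to create the per-station count explicitly with count(), after which slice_max() and slice_min() can be used as usual.

```r
library(dplyr)
library(ggplot2)

# Simulated stand-in for rides_stations (hypothetical data, 26 fake stations)
set.seed(1)
rides <- data.frame(
  starting_station_name = sample(letters, 2000, TRUE, prob = 26:1),
  member_status = sample(c("casual", "member"), 2000, TRUE)
)

# Total rides per station: this is the count column slice_max()/slice_min() need
totals <- count(rides, starting_station_name, name = "total")

# with_ties = FALSE keeps exactly 10 rows even when several stations tie
top10 <- slice_max(totals, total, n = 10, with_ties = FALSE) |>
  mutate(group = "Top 10")
bottom10 <- slice_min(totals, total, n = 10, with_ties = FALSE) |>
  mutate(group = "Bottom 10")

# Fix the facet order so the top group is drawn first
selected <- bind_rows(top10, bottom10) |>
  mutate(group = factor(group, levels = c("Top 10", "Bottom 10")))

# Keep only rides from the selected stations, then count per member status
plot_data <- rides |>
  inner_join(selected, by = "starting_station_name") |>
  count(group, starting_station_name, total, member_status)

# One panel per group; free scales let each panel show only its own stations
p <- ggplot(plot_data,
            aes(reorder(starting_station_name, -total), n, fill = member_status)) +
  geom_col() +
  facet_wrap(~ group, scales = "free") +
  labs(x = "starting_station_name", y = "count")
p
```

With the real data, rides would be replaced by rides_stations. Free scales are used because the bottom 10 counts are likely far smaller than the top 10 and would be hard to see on a shared y axis.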
