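I'm working with border crossing data from the Bureau of Transportation Statistics, and I'm trying to get proficient in R by replicating a dashboard I built in Tableau (published here). I'd like to use the same colors in my R work, but I'm struggling to force specific colors onto the values of the "Measure" field (Personal Vehicles, Pedestrians, Trucks, etc.).
Here is my code: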
library(tidyverse)  # provides read_csv(), %>%, mutate(), separate(), ggplot()
data_00_base <- read_csv("Border_Crossing_Entry_Data.csv")
data_00_base <- data.frame(data_00_base)
data_00_base$Value <- as.numeric(gsub("[^0-9.-]", "", data_00_base$Value))
# Make this happen
# Line Chart of Border Crossings per Month, Mexico vs Canada
# Bar Charts stratified by Modes of Crossing the Border, MEX vs CAN
# Nuances
# Any drastic changes in the 20+ years of data collection. COVID hit in March 2020. How long was the recovery.
# Did 9/11 impact border crossings
# MEX vs CAN, Volume, cyclical patterns, almost no Pedestrians from Canada, and MEX about 4:1 on most Modes. CAN wins for Trains.
#
# Personal Vehicles
# Strong cyclical for CAN
# MEX steady # of cars but decreasing # of Passengers. Show graph that shows ratio of Passengers/Vehicle vs time
# MEX peaks in December (holidays), trough in Feb
# rm(data_10a)
# Add some features to the BASE data frame. This is the BASE that all sub-queries (data frames) will pull from.
data_00_base_mod <- data_00_base %>%
mutate(Border_Code = ifelse(grepl("Canada", Border, fixed = TRUE), "CAN", "MEX")) %>%
separate(Date, c("Month", "Year"), remove = FALSE) %>%
mutate(Measure = factor(Measure, ordered = TRUE, levels = c("Bus Passengers","Buses","Pedestrians","Personal Vehicle Passengers","Personal Vehicles","Rail Containers Empty","Rail Containers Loaded","Train Passengers","Trains","Truck Containers Empty","Truck Containers Loaded","Trucks")))
# Augment with a dimension table to help control colors and sorting in the presentation tier
# This is all stuff I decided on when building the Tableau dashboard. I want to follow that lead with R to make it easier to compare/contrast output side-by-side
dim_measure <- data.frame(
Measure=c("Bus Passengers","Buses","Pedestrians","Personal Vehicle Passengers","Personal Vehicles","Rail Containers Empty","Rail Containers Loaded","Train Passengers","Trains","Truck Containers Empty","Truck Containers Loaded","Trucks")
,mode_main_sort=c(8,7,3,2,1,11,10,12,9,6,5,4)
,mode_color_code=c("#C799BC","#8074A8","#4E79A7","#F59C3C","#C14F22","#CDCECD","#5B6570","#89C8CC","#848E93","#F4D166","#B2C25B","#34844A")
)
# Left Join the Base table with supplemental information about the Measures to help with the presentation layer
data_00_base_mod <- merge(x = data_00_base_mod, y = dim_measure, by = "Measure", all.x = TRUE)
# Round #1
# High-level summary of CAN vs MEX vs time vs Mode
data_10a_core <- data_00_base_mod %>%
select(Year, Border_Code, Measure, Value, mode_main_sort, mode_color_code) %>%
filter(Measure %in% c("Personal Vehicles", "Pedestrians", "Trucks", "Buses", "Trains")) %>%
group_by(Year, Border_Code, Measure, mode_main_sort, mode_color_code) %>%
summarise(annual_crossings = sum(Value), record_cnt = n(), .groups = 'keep') %>%
arrange(Border_Code, Year, mode_main_sort, Measure)
#############################################################################
# NOT Working as Expected
# I've tried lots of different options, but can't figure out how to assign
# the values in the dim_measure data frame to the Measures field.
#
# Many thanks to anybody who can get me over this hurdle.
#############################################################################
ggplot(data_10a_core) +
geom_col(aes(x=Year, y=annual_crossings, color = mode_color_code, fill = mode_color_code)) +
facet_grid(cols = vars(Border_Code)) +
theme(axis.text.x = element_text(angle=90)) +
scale_y_continuous("Annual Inbound Border Crossings",
# breaks = scales::breaks_extended(8),
labels = scales::label_number()
)
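By default, ggplot2 treats the values you map through aes() as categories and assigns its own palette to them, so your hex codes are never used as colors. If you want the geoms to use the fill and color values you mapped through aes() as they are, add: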
scale_fill_identity()+
scale_color_identity()
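To make this concrete, here is a minimal sketch of two ways to get the Tableau colors onto the bars. It runs on a small toy data frame with made-up values (`toy`, `measure_colors`, `p_identity`, and `p_manual` are names introduced only for this illustration), so treat it as a starting point rather than a drop-in replacement. Note that identity scales draw no legend by default. The first option requests one with `guide = "legend"`, and the second maps `Measure` itself and supplies the colors as a named vector.

```r
library(ggplot2)

# Subset of the dimension table from the question
dim_measure <- data.frame(
  Measure = c("Pedestrians", "Personal Vehicles", "Trucks"),
  mode_color_code = c("#4E79A7", "#C14F22", "#34844A")
)

# Toy data: invented values, NOT taken from the BTS file
toy <- merge(
  expand.grid(
    Year = c("2019", "2020"),
    Border_Code = c("CAN", "MEX"),
    Measure = dim_measure$Measure,
    stringsAsFactors = FALSE
  ),
  dim_measure,
  by = "Measure"
)
toy$annual_crossings <- seq_len(nrow(toy)) * 1000

# Option 1: map the hex codes and use them as-is via an identity scale.
# breaks/labels make the legend show Measure names instead of hex codes.
p_identity <- ggplot(toy) +
  geom_col(aes(x = Year, y = annual_crossings, fill = mode_color_code)) +
  facet_grid(cols = vars(Border_Code)) +
  scale_fill_identity(
    name = "Measure",
    guide = "legend",
    breaks = dim_measure$mode_color_code,
    labels = dim_measure$Measure
  )

# Option 2: map Measure and look the colors up from a named vector.
measure_colors <- setNames(dim_measure$mode_color_code, dim_measure$Measure)

p_manual <- ggplot(toy) +
  geom_col(aes(x = Year, y = annual_crossings, fill = Measure)) +
  facet_grid(cols = vars(Border_Code)) +
  scale_fill_manual(values = measure_colors)
```

Printing `p_identity` or `p_manual` draws the chart. The second option tends to be easier to maintain, because the color columns no longer need to be carried through `group_by()` and `summarise()`.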