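I want to compare two vegetation maps (.shp) using an alluvial plot. One map is from 2010, the other from 2023. The mapping units are the same in both years. The mapped areas, however, are not identical, so I intersected the two maps.
After thoroughly restructuring the data frame, my data now looks like this (a random subset of 15 rows; the full dataset has 9220 rows):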
ID value total Jaar
<int> <chr> <dbl> <dbl>
1 1927 H0000 2.33e- 8 2023
2 1030 H4030 7.64e+ 5 2010
3 2447 H7120 3.65e- 5 2023
4 301 H0000 2.47e- 8 2023
5 611 H0000 2.73e-17 2023
6 4021 H0000 1.17e+ 5 2010
7 1531 H0000 3.11e+ 4 2023
8 759 H0000 2.84e- 4 2010
9 1339 H6230 6.51e- 7 2010
10 2848 H9999 2.23e- 5 2010
11 1740 H4010A 3.17e- 7 2023
12 335 H4030 5.90e- 5 2023
13 4182 H7120 1.47e- 3 2023
14 2676 H0000 3.81e+ 4 2023
15 2828 H9999 2.89e+ 5 2010
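ID = unique ID that links the vegetation in a polygon of the 2010 map to the vegetation in the same polygon in 2023. Up to three different vegetation types can occur in each polygon, and a polygon may contain only one type in 2010 but three types in 2023. In that case, ID 611 would appear once for 2010 and three times for 2023.
value = vegetation type (habitat type)
total = total surface area (m²)
Jaar = year the map was made
I use the following code to make the alluvial plot: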
data_alluv_long %>%
  mutate(Jaar = factor(Jaar, levels = c("2010", "2023")),
         value = factor(value, levels = c("H9999", "H0000", "H91D0", "H7110B",
                                          "H4030", "H7120", "H4010A", "H3160",
                                          "H6230", "H7150", "H7110A", "H2320",
                                          "H0410A", "H0401A"))) %>%
  ggplot(
    aes(x = Jaar,
        stratum = value,
        alluvium = ID,
        y = total)) +
  geom_alluvium(aes(fill = value),
                alpha = 0.7) +
  geom_stratum() +
  theme_minimal() +
  labs(
    title = "Veranderingen in Habitattypen",
    x = "Jaar",
    y = "Oppervlakte"
  )
I keep getting this error:
Error in `geom_alluvium()`:
! Problem while computing stat.
ℹ Error occurred in the 1st layer.
Caused by error in `setup_data()`:
! Data is not in a recognized alluvial form (see `help('alluvial-data')` for details).
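As far as I can tell from the help page, my data frame is in the correct format. Some values are very small (artifacts of the intersection), so I tried to drop them by adding
filter(total > 0.001)
but I got the same error.
What I want from the alluvial plot: two bars, one for 2010 and one for 2023. The fill of the bars must be the vegetation type (habitat type). The flows must show how some types stayed the same over the 13 years and how others changed.
My question: where does the error come from? Why is my data structure incorrect?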
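Checking the data with is_lodes_form() shows that you have duplicated id-axis pairings. I suspected this could happen, given your description that some polygons have a different number of vegetation types in each year.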
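The duplicated pairings can also be located directly by counting rows per ID and year. The following is a minimal sketch on made-up data that only mimics the structure described in the question (the rows and areas are invented for illustration, and the names toy and dupes are hypothetical):

```r
library(dplyr)

# Made-up rows: polygon 611 has one habitat type in 2010 and three in 2023
toy <- data.frame(
  ID    = c(611, 611, 611, 611, 759, 759),
  value = c("H4030", "H4030", "H7120", "H0000", "H0000", "H0000"),
  total = c(300, 100, 150, 50, 200, 200),
  Jaar  = c(2010, 2023, 2023, 2023, 2010, 2023)
)

# With alluvium = ID, each (ID, Jaar) pair may occur only once.
# Count the rows per pair and keep the pairs that occur more often.
dupes <- toy %>%
  count(ID, Jaar, name = "n_rows") %>%
  filter(n_rows > 1)

print(dupes)
```

Here the only offending pair is ID 611 in 2023, which has three rows. Running the same count on the real data should list every polygon and year that breaks the one-row-per-ID-per-year requirement.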
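Instead of requiring each ID to appear only once per year, we create a unique flow_id that combines the original ID with an index for each vegetation type within that ID.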
# First, let's load required libraries
library(tidyverse)
library(ggalluvial)
setwd(dirname(rstudioapi::getSourceEditorContext()$path)) # set the current script's location as working directory
# Read in data from csv
data_alluv_long <- read.csv("alluv_long.csv")
# Fix 1: Filter out very small values that might cause issues
data_filtered <- data_alluv_long %>%
  filter(total > 0.001)
# Fix 2: Ensure each ID appears in both years
data_complete <- data_filtered %>%
  group_by(ID) %>%
  filter(n_distinct(Jaar) == 2) %>%
  ungroup()
is_lodes_form(
  data_complete,
  Jaar,
  value,
  ID
)
# This returns FALSE with the message "Duplicated id-axis pairings.":
# the same ID occurs more than once within a year, and that is the source of your error
# Check if IDs appear in both years
id_counts <- data_complete %>%
  group_by(ID) %>%
  summarise(n_years = n_distinct(Jaar))
table(id_counts$n_years)
# Rows beyond one per ID per year; a value > 0 means some IDs occur more than once in a year
nrow(data_complete) - 2 * nrow(id_counts)
# Fix it
# Function to prepare the data
prepare_alluvial_data <- function(data) {
  # Step 1: Create a unique identifier for each ID-vegetation combination
  data_prepared <- data %>%
    group_by(ID, Jaar) %>%
    # Number the vegetation types within each ID and year
    mutate(veg_index = row_number(),
           # Combine ID and index into a unique flow identifier
           flow_id = paste(ID, veg_index, sep = "_")) %>%
    ungroup()
  # Step 2: Ensure the data is properly structured for the alluvial diagram
  data_prepared <- data_prepared %>%
    # Convert Jaar to factor
    mutate(Jaar = factor(Jaar),
           # Convert value to a factor (pass levels = ... here if a specific order is needed)
           value = factor(value))
  return(data_prepared)
}
# Using your data:
data_ready <- prepare_alluvial_data(data_alluv_long)
data_ready <- data_ready %>%
  filter(total > 0.001)
# Create the plot with the modified data
ggplot(data_ready,
       aes(x = Jaar,
           stratum = value,
           alluvium = flow_id, # Using the new flow_id instead of ID
           y = total,
           fill = value)) +
  geom_flow(alpha = 0.7) +
  geom_stratum(alpha = 0.8) +
  scale_x_discrete(expand = c(0.1, 0.1)) +
  theme_minimal() +
  labs(
    title = "Veranderingen in Habitattypen",
    x = "Jaar",
    y = "Oppervlakte (m²)"
  ) +
  scale_fill_discrete(name = "Habitattype") +
  theme(legend.position = "right")
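To confirm that flow_id removes the duplicated pairings, the same kind of count can be repeated on flow_id. The sketch below again uses made-up rows (toy, toy_ready, n_dupes and unpaired are hypothetical names) and builds flow_id the same way as prepare_alluvial_data() above:

```r
library(dplyr)

# Made-up rows: polygon 611 has one habitat type in 2010 and three in 2023
toy <- data.frame(
  ID    = c(611, 611, 611, 611, 759, 759),
  value = c("H4030", "H4030", "H7120", "H0000", "H0000", "H0000"),
  total = c(300, 100, 150, 50, 200, 200),
  Jaar  = c(2010, 2023, 2023, 2023, 2010, 2023)
)

# Same construction as in prepare_alluvial_data(): index rows within ID and year
toy_ready <- toy %>%
  group_by(ID, Jaar) %>%
  mutate(flow_id = paste(ID, row_number(), sep = "_")) %>%
  ungroup()

# Number of (flow_id, Jaar) pairs that occur more than once; should be 0
n_dupes <- toy_ready %>%
  count(flow_id, Jaar) %>%
  filter(n > 1) %>%
  nrow()

# Rows whose flow_id exists in only one of the two years
unpaired <- toy_ready %>%
  group_by(flow_id) %>%
  filter(n_distinct(Jaar) < 2) %>%
  ungroup()

print(n_dupes)
print(unpaired)
```

Two things are worth keeping in mind with this approach. The index comes from row order, so which 2023 row is paired with the single 2010 row of a polygon is arbitrary unless the data is sorted deliberately first. And a flow_id that exists in only one year (here 611_2 and 611_3) still adds area to that year's bar, but as far as I understand it has no counterpart in the other year to draw a flow to.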