| Site | Family | Lat | Long | Class | Presence | Year |
|---|---|---|---|---|---|---|
| KAV01 | nass | -17.4 | 18.5 | InSecta | 0 | 2023 |
| | nass | -17.4 | 18.5 | InSecta | 1 | 2021 |
| KAV02 | | -17.7 | 18.7 | InSecta | 0 | 2023 |
| KAV02 | nass | | 18.7 | InSecta | 0 | 2021 |
| KAV03 | nass | -17.8 | | InSecta | 0 | 2023 |
| KAV03 | nass | -17.8 | 19.1 | | | |
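There are many unique families, and my Presence values are 0 and 1, but the Family values are not fully covered. That is, not every Family value has been assigned a Presence of 0 or 1. To explain further: across the whole dataset there are 100 unique families, but site KAV03 has presence/absence values for only 90 of them. What I want is to make sure the remaining 10 families are also included for that site, with a Presence value of 0, of course. When the data are expanded, I also want to keep the remaining variables. I hope I have explained this well; let me know if you need more information. Here is the code I tried, which failed:

    library(dplyr)
    library(tidyr)

    MorphoData <- expand.grid(
      Site = unique(MorphoData$Site),
      Family = unique(MorphoData$Family),
      Year = unique(MorphoData$Year)
    ) %>%
      left_join(MorphoData, by = c("Site", "Family", "Year")) %>%
      group_by(Site, Family, Year) %>%
      mutate(
        Presence = replace_na(Presence, 0) # Ensure missing Presence values are 0
      ) %>%
      group_by(Site, Year) %>%
      fill(everything(.), .direction = "downup") %>% # Fill missing taxonomy/spatial data
      ungroup()
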
Here is a hypothetical example of the data I have:

    library(tibble)

    # tibble() replaces the deprecated data_frame()
    data <- tibble(
      Site = rep(c("KAV01", "KAV02", "KAV03"), each = 7),
      Family = sample(c("Fam1", "Fam2", "Fam3", "Fam4", "Fam5", "Fam6"), 21, replace = TRUE),
      Year = sample(c(2021, 2022, 2023), 21, replace = TRUE),
      Presence = sample(c(0, 1), 21, replace = TRUE),
      Lon = rnorm(n = 21, mean = 5, sd = 1),
      Lat = rnorm(n = 21, mean = 2, sd = 0.3)
    )
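One possible way to get the expanded table described above is `tidyr::complete()`, which adds the missing key combinations and can set a default for chosen columns. The following is only a minimal sketch under stated assumptions: the data frame `toy` and its values are made up for illustration, `Lon` and `Lat` are assumed to be constant within a site, and each Site/Family/Year combination is assumed to appear at most once.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical toy data (not the real MorphoData): one fixed Lon/Lat per site
toy <- tibble(
  Site     = c("KAV01", "KAV01", "KAV02", "KAV03"),
  Family   = c("Fam1",  "Fam2",  "Fam1",  "Fam3"),
  Year     = c(2021,    2023,    2021,    2023),
  Presence = c(1,       0,       1,       1),
  Lon      = c(18.5,    18.5,    18.7,    19.1),
  Lat      = c(-17.4,   -17.4,   -17.7,   -17.8)
)

completed <- toy %>%
  # Add every Site x Family x Year combination; added rows get Presence = 0
  complete(Site, Family, Year, fill = list(Presence = 0)) %>%
  # Site-level columns are NA on the added rows, so copy them within each site
  group_by(Site) %>%
  fill(Lon, Lat, .direction = "downup") %>%
  ungroup()

print(completed)
```

If only the Site and Year pairs that were actually surveyed should be expanded, `complete(nesting(Site, Year), Family, fill = list(Presence = 0))` may be a better fit, since `nesting()` restricts the expansion to combinations already present in the data. Columns that vary by Family rather than by Site would need their own grouping before `fill()`.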