I have a df like the one below, and I want to fill its NA values using a fairly complex condition, but I don't know how to code it.
df <- data.frame(
age = c("age15","age15","age16","age16"),
occo = c("ag","ag","uuse","ag"),
occn = c("tw","uuse","use","uuse"),
num = c(12,NA,567,NA),
occo2 = c("ag","use","ag","use"),
occn2 = c("tw","use","tw","use"),
num2 = c(2,45,67,789)
)
First, I want to find the NA cells in the num column where occo == "ag" & occn == "uuse", and then assign them the value of num2 from the row where occo2 == "ag" & occn2 == "tw", i.e. 2.
More importantly, this has to be done per age group, so the second NA will be replaced by 67.
The expected result would be:
df <- data.frame(
age = c("age15","age15","age16","age16"),
occo = c("ag","ag","uuse","ag"),
occn = c("tw","uuse","use","uuse"),
num = c(12,2,567,67),
occo2 = c("ag","use","ag","use"),
occn2 = c("tw","use","tw","use"),
num2 = c(2,45,67,789)
)
I tried something like this:
df2 <- df %>%
group_by(age) %>%
mutate(num = case_when(
occo == "ag" & occn == "uuse" ~ .[which(occo2 == "ag", occn2 == "tw"),][7]
))
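This produces the desired output, but clearly not in the intended way:
library(dplyr)
# Coincidentally correct: each NA happens to sit one row below the num2 value it needs
df |>
mutate(num = if_else(is.na(num), lag(num2), num))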
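My assumption about the rule is:
For each age group, there is one num2 value where occo2 == "ag" & occn2 == "tw", and one num that is NA where occo == "ag" & occn == "uuse". Set that num in each group to the num2 value from the same group that satisfies the condition above. It is still unclear, though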
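If the assumed rule above is right, it could be written out explicitly along the following lines. This is only a sketch under that assumption: donor is a hypothetical helper column introduced for illustration, and taking the first match with [1] is my own choice so that a group with no matching row yields NA instead of an error.

```r
library(dplyr)

df <- data.frame(
  age = c("age15", "age15", "age16", "age16"),
  occo = c("ag", "ag", "uuse", "ag"),
  occn = c("tw", "uuse", "use", "uuse"),
  num = c(12, NA, 567, NA),
  occo2 = c("ag", "use", "ag", "use"),
  occn2 = c("tw", "use", "tw", "use"),
  num2 = c(2, 45, 67, 789)
)

df2 <- df |>
  group_by(age) |>
  mutate(
    # donor (hypothetical helper): the num2 value in this age group where
    # occo2 == "ag" & occn2 == "tw"; [1] gives NA if the group has no such row
    donor = num2[occo2 == "ag" & occn2 == "tw"][1],
    # fill only the NA cells that meet the target condition; keep everything else
    num = if_else(is.na(num) & occo == "ag" & occn == "uuse", donor, num)
  ) |>
  ungroup() |>
  select(-donor)
```

Unlike the lag() version, this does not depend on row order within a group. If a group could contain more than one matching num2 row, the rule would need to say which one wins; here the first is taken.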