Conditional replacement while matching on a variable


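I want to replace the NA values of observations in one specific subgroup, but the observations within that group are not in the right order. So I'm wondering whether there is a dplyr/plyr command that lets me replace missing values in a column of one data frame with values from the corresponding column of another data frame, while matching on the values of a "key" column.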

This is what I have so far. I hope someone can shed some light on this. Thanks.

## data frame that contains missing values in "diff" column

df <- data.frame(type = c(1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3), 
diff = c(0.1, 0.3, NA, NA, NA, NA, NA, 0.2, 0.7, NA, 0.5, NA), 
name = c("A", "B", "C", "D", "E", "A", "B", "C", "F", "A", "B", "C"))

## replace with values from this smaller data frame

df2 <- data.frame(diff_rep = c(0.3, 0.2, 0.4), name = c("A", "B", "C"))

## replace using ifelse
df$diff <- ifelse(is.na(df$diff) & (df$type == 2), df2$diff_rep , df$diff)

df

   type diff name
1     1  0.1    A
2     1  0.3    B
3     1   NA    C
4     2  0.3    D
5     2  0.2    E
6     2  0.4    A
7     2  0.3    B
8     2  0.2    C
9     2  0.7    F
10    3   NA    A
11    3  0.5    B
12    3   NA    C

## desired output

   type diff name
1     1  0.1    A
2     1  0.3    B
3     1   NA    C
4     2   NA    D
5     2   NA    E
6     2  0.3    A
7     2  0.2    B
8     2  0.4    C
9     2  0.7    F
10    3   NA    A
11    3  0.5    B
12    3   NA    C
r if-statement replace dplyr na
1 Answer

Assuming row 9 is a mistake, you can do a left join first and then use if_else() and coalesce() to get the desired result. coalesce() returns the first non-missing value.

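library(dplyr)

left_join(df, df2, by = "name") %>% 
  # for type 2, fill a missing diff from diff_rep; other types keep diff as-is
  mutate(diff_wanted = if_else(type == 2,
                               coalesce(diff, diff_rep),
                               diff),
         # set every name that has no entry in df2 (here D, E and F) to NA
         diff_wanted = ifelse(name %in% df2$name,
                              diff_wanted,
                              NA)) %>% 
  select(type, diff_wanted, name)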
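
A note on why the attempt in the question goes wrong, plus a sketch of two variants: in `ifelse(..., df2$diff_rep, df$diff)` the three values of `df2$diff_rep` are recycled by row position, so they are never matched on `name`. Joining on `name` first avoids that. The desired-output table also shows row 8 (type 2, name "C") as 0.4 even though `df` already holds 0.2 there, so it is not obvious whether only NAs should be filled or whether `df2` should win wherever a name matches. The following is a minimal sketch of both readings under that assumption; the object names `joined`, `na_only` and `overwrite` are made up for illustration.

```r
library(dplyr)

# Data copied from the question
df <- data.frame(
  type = c(1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3),
  diff = c(0.1, 0.3, NA, NA, NA, NA, NA, 0.2, 0.7, NA, 0.5, NA),
  name = c("A", "B", "C", "D", "E", "A", "B", "C", "F", "A", "B", "C"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  diff_rep = c(0.3, 0.2, 0.4),
  name = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# Attach diff_rep to each row by name (NA where df2 has no such name)
joined <- left_join(df, df2, by = "name")

# Reading 1: within type 2, fill only the missing values
na_only <- joined %>%
  mutate(diff = if_else(type == 2, coalesce(diff, diff_rep), diff)) %>%
  select(type, diff, name)

# Reading 2: within type 2, use the df2 value wherever a name matches,
# and fall back to the existing diff otherwise
overwrite <- joined %>%
  mutate(diff = if_else(type == 2, coalesce(diff_rep, diff), diff)) %>%
  select(type, diff, name)
```

With the data above, reading 2 matches the desired-output table row for row (row 8 becomes 0.4 and row 9 keeps 0.7), while reading 1 leaves row 8 at 0.2. Neither variant needs the assumption about row 9.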