Why doesn't ifelse() work for creating a new column based on matching observations in two data frames?


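To preempt the obvious: I have already found a solution to my problem using left_join() (or merge()), but I don't quite understand why ifelse() doesn't work here. I would love to hear about any other ways to do this, or ways to improve my use of left_join(). Apologies if this is a long post.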

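Basically, I'm trying to create a new column in the data frame df1 by matching the observations in the column df1$code against the observations in the corresponding column of an index data frame, index.df$code. The new column df1$type should hold the value of index.df$type that corresponds to each df1$code value: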

#index data frame
index.df <- data.frame(
  code = c("c10", "c20", "c03", "c48", "c19"),
  id = c("apple", "strawberry", "pear", "banana", "blackberry"),
  type = c("pome", "aggregate", "pome", "berry", "aggregate")
)
> index.df
  code         id      type
1  c10      apple      pome
2  c20 strawberry aggregate
3  c03       pear      pome
4  c48     banana     berry
5  c19  blackberry aggregate

#df to add col to
df1 <- data.frame(
  code = c("c10", "c19", "c03", "c20", "c19", "c10", "c48", "c03", "c10", "c03"),
  id = c("apple", "blackberry", "pear","strawberry", "blackberry", "apple", "banana", "pear", "apple", "pear")
)
> df1
   code         id
1   c10      apple
2   c19 blackberry
3   c03       pear
4   c20 strawberry
5   c19 blackberry
6   c10      apple
7   c48     banana
8   c03       pear
9   c10      apple
10  c03       pear

This is the desired output:

> df2
   code         id      type
1   c10      apple      pome
2   c19 blackberry aggregate
3   c03       pear      pome
4   c20 strawberry aggregate
5   c19 blackberry aggregate
6   c10      apple      pome
7   c48     banana     berry
8   c03       pear      pome
9   c10      apple      pome
10  c03       pear      pome

I tried ifelse() like this:

df2 <- df1 %>%
  mutate(df1, type = ifelse(df1$code == index.df$code, index.df$type, NA))

> df2
   no code         id      type
1   1  c10      apple      pome
2   2  c19 blackberry      <NA>
3   3  c03       pear      pome
4   4  c20 strawberry      <NA>
5   5  c19 blackberry aggregate
6   6  c10      apple      pome
7   7  c48     banana      <NA>
8   8  c03       pear      pome
9   9  c10      apple      <NA>
10 10  c03       pear      <NA>

Why is this the output? Am I using ifelse() incorrectly? Thanks in advance!

Also, the (rather bulky) code I used to get the desired output is:

df1 <- data.frame(
  no = 1:10,
  code = c("c10", "c19", "c03", "c20", "c19", "c10", "c48", "c03", "c10", "c03"),
  id = c("apple", "blackberry", "pear","strawberry", "blackberry", "apple", "banana", "pear", "apple", "pear")
)

df2 <- index.df %>%
  left_join(df1, by = c("code", "id")) %>%
  arrange(no) %>%
  select(-no)
dplyr tidyverse
1 Answer

You are effectively evaluating:

c("c10", "c19", "c03", "c20", "c19", "c10", "c48", "c03", "c10", "c03") == c("c10", "c20", "c03", "c48", "c19") 

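This is valid R, but it probably doesn't do what you think: == compares the two vectors position by position, recycling the shorter one, so each row of df1 is checked against only one row of index.df instead of being looked up in the whole table.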
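To make the recycling visible, here is a minimal sketch using the data from the question. The names recycled_code, recycled_type, same_position and type_by_position are introduced purely for illustration; rep_len() is used here only to spell out what R does implicitly when the two vectors differ in length.

```r
# Data from the question (only the columns needed here)
index.df <- data.frame(
  code = c("c10", "c20", "c03", "c48", "c19"),
  type = c("pome", "aggregate", "pome", "berry", "aggregate")
)
df1 <- data.frame(
  code = c("c10", "c19", "c03", "c20", "c19", "c10", "c48", "c03", "c10", "c03")
)

# What == really compares against: index.df$code repeated up to 10 elements
recycled_code <- rep_len(index.df$code, nrow(df1))
recycled_type <- rep_len(index.df$type, nrow(df1))

# Row i of df1 is compared only with position i of the recycled vector
same_position <- df1$code == recycled_code

# Same result as the ifelse() call in the question
type_by_position <- ifelse(same_position, recycled_type, NA)

# Side-by-side view: a type appears only where the codes happen to line up
print(data.frame(
  code = df1$code,
  compared_with = recycled_code,
  type = type_by_position
))
```

Rows where the two code columns differ end up as NA, which is the pattern seen in the question's output.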

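Perhaps this is a bit more concise? match() returns, for each df1$code, the position of that code in index.df$code, and that position is then used to pick the corresponding index.df$type.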

df2 <- df1
df2$type <- index.df$type[match(df1$code, index.df$code)]
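Since the question also asks about tidying up the left_join() version, here is a sketch of one possible simplification, assuming dplyr is installed and that each code appears only once in index.df. Putting df1 on the left keeps every row of df1, and left_join() is documented to preserve the row order of its first argument as far as possible, so the helper column no and the arrange()/select() steps should not be needed.

```r
library(dplyr)

# Data from the question
index.df <- data.frame(
  code = c("c10", "c20", "c03", "c48", "c19"),
  id = c("apple", "strawberry", "pear", "banana", "blackberry"),
  type = c("pome", "aggregate", "pome", "berry", "aggregate")
)
df1 <- data.frame(
  code = c("c10", "c19", "c03", "c20", "c19", "c10", "c48", "c03", "c10", "c03"),
  id = c("apple", "blackberry", "pear", "strawberry", "blackberry",
         "apple", "banana", "pear", "apple", "pear")
)

# df1 is the left table: all of its rows are kept and type is pulled in
# from the matching row of index.df
df2 <- df1 %>%
  left_join(index.df, by = c("code", "id"))

print(df2)
```

A code that is missing from index.df would get NA in type rather than dropping the row, which is the same behaviour as the match() approach.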