Add a new column based on two other columns, with several conditions (character values)

Question

I want to add a new column to my data frame based on two other columns. The data looks like this:

df
job    honorary  

yes    yes
yes    no
no     yes
yes    yes
yes    NA
NA     no

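Now I want a third column that contains "both" if job and honorary are both "yes", "honorary" if only the honorary column contains "yes", "job" if only the job column contains "yes", and NA if both columns contain NA or if one column contains NA and the other "no". The third column should look like this: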

result

both
job
honorary
both
job
NA

I have tried writing the code with if and mutate, but I am very new to R and my code doesn't work at all. If I assign the values individually like this:

data_nature_fewmissing$urbandnat[data_nature_fewmissing$nature =="yes" & data_nature_fewmissing$urbangreen =="yes"] <- "yes"

it doesn't work, because at each step I overwrite the previous results.

Thanks for your help!

Answer 1

For these kinds of complex conditionals, I like case_when from dplyr.

df<-tibble::tribble(
   ~job, ~honorary,
  "yes",     "yes",
  "yes",      "no",
   "no",     "yes",
  "yes",     "yes",
  "yes",        NA,
     NA,      "no"
  )

library(dplyr)

df_new <- df %>%
  mutate(result=case_when(
    job=="yes" & honorary=="yes" ~ "both",
    honorary=="yes" ~ "honorary", 
    job=="yes" ~ "job", 
    is.na(honorary) & is.na(job) ~ NA_character_, 
    is.na(honorary) & job=="no" ~ NA_character_, 
    is.na(job) & honorary=="no" ~ NA_character_, 
    TRUE ~ "other"
  ))

df_new
#> # A tibble: 6 × 3
#>   job   honorary result  
#>   <chr> <chr>    <chr>   
#> 1 yes   yes      both    
#> 2 yes   no       job     
#> 3 no    yes      honorary
#> 4 yes   yes      both    
#> 5 yes   <NA>     job     
#> 6 <NA>  no       <NA>
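The output above depends on how R treats NA in comparisons, which can be checked in base R. A minimal sketch, assuming two stand-alone vectors (named `job` and `honorary` only for illustration) that mirror rows 5 and 6 of the example data:

```r
# Stand-alone vectors mirroring rows 5 and 6 of the example data
job      <- c("yes", NA)
honorary <- c(NA, "no")

# Comparing with NA gives NA, not FALSE
honorary == "yes"
#> [1]    NA FALSE

# TRUE & NA is NA, but NA & FALSE is FALSE
both <- job == "yes" & honorary == "yes"
both
#> [1]    NA FALSE
```

case_when treats a condition that evaluates to NA as not matched and moves on to the next one, which is why row 5 (job "yes", honorary NA) ends up as "job" rather than "both".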

Or in base R:


df_new<-df

df_new=within(df_new,{
  result=NA
  result[ honorary=="yes"] = "honorary"
  result[ job=="yes"] = "job"
  result[job=="yes" & honorary=="yes"]='both'
})

Created on 2022-01-16 by the reprex package (v2.0.1)
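The base R approach works because later assignments overwrite earlier ones, so the most specific rule goes last. A minimal sketch of the same idea without within(), assuming a plain data.frame copy of the example data; wrapping each condition in which() is an addition here so that NA conditions are dropped explicitly:

```r
# Toy data matching the question's example
df <- data.frame(
  job      = c("yes", "yes", "no", "yes", "yes", NA),
  honorary = c("yes", "no", "yes", "yes", NA, "no")
)

# Start with NA everywhere, then fill from least to most specific rule
df$result <- NA_character_
df$result[which(df$honorary == "yes")] <- "honorary"
df$result[which(df$job == "yes")] <- "job"
df$result[which(df$job == "yes" & df$honorary == "yes")] <- "both"

df$result
```

Reversing the order of the three assignments would let "job" or "honorary" overwrite "both", which is the overwriting problem described in the question.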

Answer 2

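Your code indexes the column vector rather than the rows of the data frame. When indexing a data frame, the syntax is df[rows, columns]. So, to index rows and assign into a single column, put the condition before the comma and the column name after it (which() drops rows where the condition is NA):

data_nature_fewmissing[which(data_nature_fewmissing$nature == "yes" & data_nature_fewmissing$urbangreen == "yes"), "urbandnat"] <- "yes"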
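As a small illustration of df[rows, columns] assignment, here is a sketch on a toy data frame. The three rows of data are made up for the example; only the column names come from the question:

```r
# Toy data frame; values are invented for illustration
df <- data.frame(
  nature     = c("yes", "no", NA),
  urbangreen = c("yes", "yes", "yes")
)

# Create the target column first so every row has a defined value
df$urbandnat <- NA_character_

# Rows where both columns are "yes"; which() drops rows where the test is NA
rows <- which(df$nature == "yes" & df$urbangreen == "yes")

# df[rows, columns]: assign into one column for the selected rows only
df[rows, "urbandnat"] <- "yes"

df$urbandnat
```

Only the first row satisfies the condition, so the other two rows keep their NA.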

However, a simpler approach is to use the tidyverse. We will use mutate to create the new column and case_when to handle the multiple if-else conditions.

library(tidyverse)

df = data_nature_fewmissing
df %>% mutate(result = case_when(
  job == 'yes' & honorary == 'yes' ~ 'both', 
  job == 'yes' & (honorary == 'no' | is.na(honorary)) ~ 'job',
  honorary == 'yes' & (job == 'no' | is.na(job)) ~ 'honorary',
  ))
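Note that this case_when has no catch-all condition, so rows that match none of the three rules get NA. A minimal sketch of that behaviour, assuming dplyr is installed and using a made-up three-row data frame:

```r
library(dplyr)

# Made-up rows: one match, one "no"/"no", one NA/"no"
df <- data.frame(
  job      = c("yes", "no", NA),
  honorary = c("yes", "no", "no")
)

out <- df %>%
  mutate(result = case_when(
    job == "yes" & honorary == "yes" ~ "both",
    job == "yes" ~ "job",
    honorary == "yes" ~ "honorary"
  ))

# Rows 2 and 3 match no condition, so result is NA for them
out$result
```

This matches what the question asks for: NA whenever neither column contains "yes".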

Answer 3

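I had a similar problem and am posting my solution here for anyone else looking for it! (I am also very new to R.) I can't comment yet, but this builds on Macgregor's answer.

Macgregor's mutate-based answer above worked for me with a small edit. When I used the mutate code as written, I got a messy printout and no column was added. I had to add "mydataframe <-" for it to work.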

df = SurveyData
SurveyData <- df %>% mutate(Gender.Id.3 = case_when(
   Q67 == '2' |  Q67 == '3' ~ 'Men',    
   Q67 == '1' | Q67 == '4' ~ 'Women',  
   Q67 == '8'     |  Q67 == '5' | Q67 == '6'| Q67 == '7' | Q67 == '12' ~ 'Other', 
))
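The assignment is needed because mutate returns a new data frame instead of modifying the original in place. A minimal sketch of the same recoding, assuming dplyr is installed; the five Q67 values are a hypothetical stand-in for the survey data, and %in% is swapped in for the chains of == as a shorter way to write them:

```r
library(dplyr)

# Hypothetical stand-in for the survey data
SurveyData <- data.frame(Q67 = c("1", "2", "5", "12", NA))

# mutate() returns a new data frame, so the result has to be assigned
SurveyData <- SurveyData %>%
  mutate(Gender.Id.3 = case_when(
    Q67 %in% c("2", "3") ~ "Men",
    Q67 %in% c("1", "4") ~ "Women",
    Q67 %in% c("5", "6", "7", "8", "12") ~ "Other"
  ))

SurveyData$Gender.Id.3
```

A missing Q67 matches none of the sets, so that row gets NA in the new column.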
