Most idiomatic way to mutate multiple similar columns?


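I am generating several columns in a single mutate() call. Each one comes from applying a function to 1) an existing column and 2) some value that differs for each output column. The code below produces the result I want, but it smells: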

library(dplyr)    # tibble(), mutate()
library(stringr)  # str_c()

df <- tibble(base_string = c("a", "b", "c"))
df_desired_result <- df |>
  mutate(
    one = str_c(base_string, "1"),
    two = str_c(base_string, "2"),
    three = str_c(base_string, "3")
  )
df_desired_result
# A tibble: 3 × 4
  base_string one   two   three
  <chr>       <chr> <chr> <chr>
1 a           a1    a2    a3   
2 b           b1    b2    b3   
3 c           c1    c2    c3   

This would be a poor solution if there were many more columns.

The best improvement I have come up with is:

library(tidyr)  # expand_grid(), pivot_wider()

df_also_desired_result <- df |>
  expand_grid(
    tibble(
      number_name = c("one", "two", "three"),
      number_string = c("1", "2", "3")
    )
  ) |>
  mutate(final_string = str_c(base_string, number_string)) |>
  pivot_wider(
    id_cols = base_string,
    names_from = number_name,
    values_from = final_string
  )
df_also_desired_result
# A tibble: 3 × 4
  base_string one   two   three
  <chr>       <chr> <chr> <chr>
1 a           a1    a2    a3   
2 b           b1    b2    b3   
3 c           c1    c2    c3   

But this seems too verbose. I would appreciate any suggestions for a better way to do this.

r dplyr tidyr
1 Answer

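I'm not entirely convinced how "idiomatic" it is, but an approach using a nested tibble is shorter. It is controlled by a named vector passed to purrr::map(): the names become the column names, and the items are the function arguments.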

library(dplyr)
library(tidyr)
library(purrr)
library(stringr)

df <- tibble(base_string = c("a", "b", "c"))

# named vector, column_name = "argument"
col_args <- c(one = "1", two = "2", three = "3") 

# add a tibble column with new columns, controlled by col_args; unnest
df |> 
  mutate(cols = map(col_args, \(x) str_c(base_string, x)) |> bind_cols()) |> 
  unnest(cols)
#> # A tibble: 3 × 4
#>   base_string one   two   three
#>   <chr>       <chr> <chr> <chr>
#> 1 a           a1    a2    a3   
#> 2 b           b1    b2    b3   
#> 3 c           c1    c2    c3

Created on 2024-07-27 with reprex v2.1.0
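
To make it clearer why the names of the vector end up as column names, here is a minimal sketch that pulls the inner step out on its own. It also shows a variant that skips unnest(): assuming dplyr 1.0.0 or later, an unnamed data frame passed to mutate() is spread into separate columns. The names new_cols and result are only illustrative and not part of the answer above.

```r
library(dplyr)
library(purrr)
library(stringr)

df <- tibble(base_string = c("a", "b", "c"))

# named vector, column_name = "argument" (same as in the answer)
col_args <- c(one = "1", two = "2", three = "3")

# map() keeps the names of its input, so this is a named list with
# one character vector per future column: one, two, three
new_cols <- map(col_args, \(x) str_c(df$base_string, x))
names(new_cols)

# Variant without unnest(): the data frame built by bind_cols() is passed
# to mutate() unnamed, so its columns are added directly to df
result <- df |>
  mutate(map(col_args, \(x) str_c(base_string, x)) |> bind_cols())
result
```

Whether this reads as more idiomatic than the nested-tibble version is a matter of taste; the only difference is that the intermediate cols column is never created.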
