Rename data frame columns specified by column index, as a function of those indices

Question

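As part of a pipeline, I want to take a data frame or tibble and rename a subset of its columns, specified by a vector of positional indices, with the new column names being a function of their indices rather than of their current names. I don't want to leave the pipe, store intermediate results, store the index vector, or have to type the index vector twice (an accident waiting to happen if I ever want to change it).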

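I can achieve my goal by piping into a hideous anonymous function that uses dplyr::rename_with or rlang::set_names. But surely there is a neater way to do this than what I have come up with?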

library(tidyverse)
# Base R does what I want: but not pipe-friendly
temp <- starwars |>
  head(c(2, 6))
idx <- c(2, 4:6)
colnames(temp)[idx] <- str_c("col_", idx, "_new")
print(temp)
#> # A tibble: 2 × 6
#>   name           col_2_new  mass col_4_new col_5_new col_6_new
#>   <chr>              <int> <dbl> <chr>     <chr>     <chr>    
#> 1 Luke Skywalker       172    77 blond     fair      blue     
#> 2 C-3PO                167    75 <NA>      gold      yellow

# Can repeat the vector of selected indices in the .fn argument of rename_with
# but surely there's a way to avoid writing c(2, 4:6) twice?
starwars |>
  head(c(2, 6)) |>
  rename_with(.cols = c(2, 4:6), ~ str_c("col_", c(2, 4:6), "_new"))
#> # A tibble: 2 × 6
#>   name           col_2_new  mass col_4_new col_5_new col_6_new
#>   <chr>              <int> <dbl> <chr>     <chr>     <chr>    
#> 1 Luke Skywalker       172    77 blond     fair      blue     
#> 2 C-3PO                167    75 <NA>      gold      yellow

# rename_with doesn't *quite* do what I want here
# Can specify cols by index, but .x is the column name not its index
starwars |>
  head(c(2, 6)) |>
  rename_with(.cols = c(2, 4:6), ~ str_c("col_", .x, "_new"))
#> # A tibble: 2 × 6
#>   name           col_height_new  mass col_hair_color_new col_skin_colo…¹ col_e…²
#>   <chr>                   <int> <dbl> <chr>              <chr>           <chr>  
#> 1 Luke Skywalker            172    77 blond              fair            blue   
#> 2 C-3PO                     167    75 <NA>               gold            yellow 
#> # … with abbreviated variable names ¹col_skin_color_new, ²col_eye_color_new

# Anonymous function avoids repeating c(2, 4:6) - supplying the external vector
# means using all_of() or any_of() depending on whether you want an error if
# an index is missing.
# But surely there's an easier way than this?
starwars |>
  head(c(2, 6)) |>
  (\(tbl, idx) rename_with(tbl, .cols = all_of(idx),
                           ~ str_c("col_", idx, "_new")))(c(2, 4:6))
#> # A tibble: 2 × 6
#>   name           col_2_new  mass col_4_new col_5_new col_6_new
#>   <chr>              <int> <dbl> <chr>     <chr>     <chr>    
#> 1 Luke Skywalker       172    77 blond     fair      blue     
#> 2 C-3PO                167    75 <NA>      gold      yellow

# There's also rlang::set_names ... but this is even uglier
starwars |>
  head(c(2, 6)) |>
  (\(tbl, idx) set_names(tbl, ifelse(seq_along(tbl) %in% idx,
                                     str_c("col_", seq_along(tbl), "_new"),
                                     colnames(tbl))))(c(2, 4:6))
#> # A tibble: 2 × 6
#>   name           col_2_new  mass col_4_new col_5_new col_6_new
#>   <chr>              <int> <dbl> <chr>     <chr>     <chr>    
#> 1 Luke Skywalker       172    77 blond     fair      blue     
#> 2 C-3PO                167    75 <NA>      gold      yellow

Related questions, but not duplicates, because they do not require the new names to be a function of the index: "R: dplyr - Rename column name by position instead of name" and "How to dplyr rename a column, by column index?"

Answer 1

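I don't think there is a canonical/clean way to do this without (i) using the index values twice, (ii) storing them in a temporary variable, or (iii) using a hacky approach that stores the values on the fly in a temporary variable or function and then reuses them.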

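I'd say one canonical approach is to create a lookup vector and use it inside rename(all_of()). When you come back to this code later, it is easy to see how the column names were recoded.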

library(tidyverse)

idx <- c(2, 4:6)
lookup_vec <- setNames(idx, str_c("col_", idx, "_new"))

starwars |>
  head(c(2, 6)) |>
  rename(all_of(lookup_vec))

#> # A tibble: 2 × 6
#>   name           col_2_new  mass col_4_new col_5_new col_6_new
#>   <chr>              <int> <dbl> <chr>     <chr>     <chr>    
#> 1 Luke Skywalker       172    77 blond     fair      blue     
#> 2 C-3PO                167    75 <NA>      gold      yellow
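To make the lookup vector less opaque, here is a small sketch of my own (not part of the original answer) showing what it contains. The names are the new column names and the values are the column positions, so rename(all_of(lookup_vec)) should amount to writing out the new_name = position pairs by hand. The object names small, via_lookup and by_hand are only for illustration, and the |> style used elsewhere assumes R 4.1 or later.

```r
library(dplyr)

idx <- c(2, 4:6)
# Names are the new column names; values are the column positions to rename.
lookup_vec <- setNames(idx, paste0("col_", idx, "_new"))

# First 2 rows and first 6 columns, as in the examples above.
small <- head(starwars, c(2, 6))

via_lookup <- rename(small, all_of(lookup_vec))
# The same rename written out by hand as new_name = position pairs.
by_hand <- rename(small, col_2_new = 2, col_4_new = 4,
                  col_5_new = 5, col_6_new = 6)

identical(colnames(via_lookup), colnames(by_hand))
#> [1] TRUE
```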

If you want to apply this kind of operation a lot and want to avoid temporary variables at all costs, a helper function might do the job:

rename_at_idx <- function(df, idx, before = "", after = "") {
  # Lookup vector: names are the new column names, values are the positions
  rename(df, all_of(setNames(idx, str_c(before, idx, after))))
}

starwars |>
  head(c(2, 6)) |>
  rename_at_idx(c(2, 4:6), "col_", "_new")
#> same output
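Since the question asks for new names that are an arbitrary function of the index, the helper could also take that function as an argument instead of a fixed prefix and suffix. The following is only a sketch under that assumption: rename_by_idx and its .fn argument are hypothetical names, not part of dplyr, and .fn is assumed to be vectorised over the index vector.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): rename the columns at positions
# `idx`, computing the new names by applying `.fn` to the index vector.
rename_by_idx <- function(df, idx, .fn) {
  new_names <- .fn(idx)
  # .fn must return exactly one name per index
  stopifnot(length(new_names) == length(idx))
  rename(df, all_of(setNames(idx, new_names)))
}

renamed <- starwars |>
  head(c(2, 6)) |>
  rename_by_idx(c(2, 4:6), \(i) paste0("col_", i, "_new"))

colnames(renamed)
#> [1] "name"      "col_2_new" "mass"      "col_4_new" "col_5_new" "col_6_new"
```

The index vector is typed only once and the pipe is never left, at the cost of defining the helper up front.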

Created on 2023-03-20 with reprex v2.0.2
