Convenient way to select columns by regular expression inside a function in dplyr

Question

I am trying to select columns by a regular expression inside a function that is called within a dplyr command.

Maybe I am overthinking this, or do I need to restructure my data first?

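I have provided a reproducible example. I would like to define the columns used inside length() with a regular expression. So in my example, col_of_interest should select all columns with an 'a' in their name...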

library(dplyr)

set.seed(1)
col_of_interest <- 'a'
df <- 
  data.frame(
    id=rep(1:2, each=5),
    a1=rnorm(10),
    a2=c(NA, NA, NA, rnorm(7)),
    b1=rnorm(10)
  )

df %>% 
  group_by(id) %>% 
  reframe(value_count=length(na.omit(c(a1, a2))))

Answer 1

I would first count the non-NA values per group for each matching column independently, then add those counts up with c_across:

df %>% 
  group_by(id) %>% 
  summarize(across(matches(col_of_interest), ~ length(na.omit(.x))),
            .groups = "keep") %>% 
  mutate(value_count = sum(c_across(matches(col_of_interest))),
         .keep = "none") # grouping by id is kept by summarize(), so .by is not supplied; everything() instead of matches(...) would work too
#   id value_count
# 1  1           7
# 2  2          10
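
As a possible alternative, the same count can probably be obtained in a single summarise() step by wrapping the selection in a small helper. This is only a sketch: the function name count_non_na and its pattern argument are made up for illustration, it assumes dplyr >= 1.1.0 (for pick()), and it assumes the grouping column is called id as in the question.

```r
library(dplyr)

# Same example data as in the question
set.seed(1)
df <-
  data.frame(
    id = rep(1:2, each = 5),
    a1 = rnorm(10),
    a2 = c(NA, NA, NA, rnorm(7)),
    b1 = rnorm(10)
  )

# Hypothetical helper: counts the non-NA values over all columns whose
# name matches `pattern`, separately for each value of `id`.
count_non_na <- function(data, pattern) {
  data %>%
    group_by(id) %>%
    summarise(
      # pick() returns the matching columns of the current group as a tibble;
      # is.na() turns that into a logical matrix, which sum() can count.
      value_count = sum(!is.na(pick(matches(pattern)))),
      .groups = "drop"
    )
}

res <- count_non_na(df, "a")
res
```

For the example data this should give the same counts as above (7 for id 1 and 10 for id 2). Note that matches() treats its argument as a regular expression, so a stricter pattern such as "^a" may be safer when other column names could also contain the letter.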
