Using a custom case_when function in dplyr mutate

Question

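I have looked at many posts related to my problem, but I can't seem to figure it out.

I have a basic table that will gain additional columns as data collection continues (over the NFL season). I can't get my function, which uses a week-based case_when, to run without returning "! object 'WK1' not found".

My data after week 2.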

NumCorrect <- data.frame(
  TEAM = LETTERS[1:5],
  WK1 = c(11,12,11,12,13),
  WK2 = c(7,7,9,10,7)
)

My code.

# Defined variable at the top of the code that I change as the weeks are played.
curWEEK = 2

# Simplified version of my dplyr code at the step that won't work. 
correct_d <- mutate(AVG_16 = average16(curWEEK))

My function.

average16 <- function(x) {case_when(x == 1 ~ WK1,
                                    x == 2 ~ round(mean((WK1:WK2), na.rm=TRUE),1),
                                    x == 3 ~ round(mean((WK1:WK3), na.rm=TRUE),1),
                                    x %in% 4:7 ~ round(mean((WK1:WK4), na.rm=TRUE),1),
                                    x %in% 8:11 ~ round(mean(c(WK1:WK4,WK8), na.rm=TRUE),1),
                                    x %in% 12:14 ~ round(mean(c(WK1:WK4,WK8,WK12), na.rm=TRUE),1),
                                    x == 15 ~ round(mean(c(WK1:WK4,WK8,WK12,WK15), na.rm=TRUE),1),
                                    x == 16 ~ round(mean(c(WK1:WK4,WK8,WK12,WK15:WK16), na.rm=TRUE),1),
                                    x == 17 ~ round(mean(c(WK1:WK4,WK8,WK12,WK15:WK17), na.rm=TRUE),1),
                                    x == 18 ~ round(mean(c(WK1:WK4,WK8,WK12,WK15:WK18), na.rm=TRUE),1)
                                    )}

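I tried battling it out with R and ChatGPT, but the function just kept getting more complicated, with no solution.

How can I use a function to average only certain columns of a table that is incomplete and will have columns added?

I have tried many versions, and I was able to modify the code to keep all unplayed week columns (WK1 through WK18) as NA, but I still get the "object not found" error.

It errors every time: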

Error in `mutate()`:
ℹ In argument: `AVG_16 = average16(curWEEK)`.
ℹ In row 1.
Caused by error in `case_when()`:
! Failed to evaluate the right-hand side of formula 1.
Caused by error:
! object 'WK1' not found

I also tried this:

average16 <- function(x) {if(x == 1) {WK1}
                          if(x == 2) {round(mean((WK1:WK2), na.rm=TRUE),1)}
                          if(x == 3) {round(mean((WK1:WK3), na.rm=TRUE),1)}
                          if(x %in% 4:7)   {round(mean((WK1:WK4), na.rm=TRUE),1)}
                          if(x %in% 8:11)  {round(mean(c(WK1:WK4,WK8), na.rm=TRUE),1)}
                          if(x %in% 12:14) {round(mean(c(WK1:WK4,WK8,WK12), na.rm=TRUE),1)}
                          if(x == 15) {round(mean(c(WK1:WK4,WK8,WK12,WK15), na.rm=TRUE),1)}
                          if(x == 16) {round(mean(c(WK1:WK4,WK8,WK12,WK15:WK16), na.rm=TRUE),1)}
                          if(x == 17) {round(mean(c(WK1:WK4,WK8,WK12,WK15:WK17), na.rm=TRUE),1)}
                          if(x == 18) {round(mean(c(WK1:WK4,WK8,WK12,WK15:WK18), na.rm=TRUE),1)}
                          }

Answer 1

If you are using dplyr, then a rather different approach might achieve your goal.

You can use a character vector to specify the columns to average, if they exist. If they don't exist, any_of will silently ignore them.

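library(tidyverse)
NumCorrect <- data.frame(
    TEAM = LETTERS[1:5],
    WK1 = c(11,12,11,12,13),
    WK2 = c(7,7,9,10,7)
)

# Names of every week that counts toward the average, played or not.
cols_to_avg <- paste0("WK", c(1:4, 8, 12, 15:18))

# any_of() keeps only the listed columns that exist in the data,
# so weeks that have not been added yet are skipped without an error.
NumCorrect |> 
    rowwise() |> 
    mutate(avg16 = round(mean(c_across(any_of(cols_to_avg))),1)) |> 
    ungroup()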
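
As for why the original average16 fails: the columns are only visible to expressions written directly inside the mutate call. Names such as WK1 inside a separately defined function are looked up in that function's environment and then the global environment, which is where "object 'WK1' not found" comes from. Range syntax like WK1:WK2 also only selects columns inside selection helpers, not inside mean. If the average should still depend on curWEEK, a minimal sketch could pass the data frame to the function explicitly. The helper names weeks_to_avg and add_avg16 below are hypothetical, and pick() assumes dplyr 1.1.0 or later.

```r
library(dplyr)

NumCorrect <- data.frame(
  TEAM = LETTERS[1:5],
  WK1 = c(11, 12, 11, 12, 13),
  WK2 = c(7, 7, 9, 10, 7)
)

# Hypothetical helper: names of the counted weeks up to and including cur_week.
# The counted weeks (1-4, 8, 12, 15-18) are taken from the question's case_when.
weeks_to_avg <- function(cur_week) {
  counted <- c(1:4, 8, 12, 15:18)
  paste0("WK", counted[counted <= cur_week])
}

# Hypothetical helper: receives the data frame as an argument, so the WK
# columns are selected by name from the data instead of being looked up
# as free variables inside the function body.
add_avg16 <- function(df, cur_week) {
  df |>
    mutate(AVG_16 = round(
      rowMeans(pick(any_of(weeks_to_avg(cur_week))), na.rm = TRUE), 1
    ))
}

curWEEK <- 2
correct_d <- add_avg16(NumCorrect, curWEEK)
correct_d
```

Using rowMeans over the selected columns avoids rowwise(), and na.rm = TRUE keeps the behaviour from the question when a week column exists but still holds NA.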
