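I have a data frame, df_3, in which I want to mutate several columns whose names start with Team_. I want to replace the 0 values in those columns with NA. I used code that has worked for me before, but now I get the following error: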
Error in `mutate()`:
ℹ In argument: `across(starts_with("Team_"), ~na_if(., "0"))`.
Caused by error in `across()`:
! Can't compute column `Team_Num_1`.
Caused by error in `na_if()`:
! Can't convert `y` <character> to match type of `x` <double>.
Backtrace:
1. df_3 %>% mutate(across(starts_with("Team_"), ~na_if(., "0")))
10. dplyr::na_if(Team_Num_1, "0")
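Any idea why this happens or how I can fix it? I have not changed anything in the original df or in the code that ran before, so I am not sure what changed.
Reproducible code: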
structure(list(Team_1 = c("0", "werg", "sdf"), Team_Desc_1 = c("wer",
"wtrb", "wergt"), Team_URL_1 = c("ewrg", "werg", "asd"), Team_Ver_1 = c("25",
"2523", "342"), Team_Num_1 = c(0, 23, 12), Team_Value_1 = c("aed",
"jfsa", "vsf"), Name_1 = c("etwbv", "werg", "sdfg"), Txt_1 = c("abc",
"bfh", "fse"), Head_1 = c("abc1", "bfh", "fse"), Team_2 = c("werh",
"wtt", "qwe"), Team_Desc_2 = c("sdfg", "wer", "sdfgv"), Team_URL_2 = c("qwe",
"gvre", "vrw"), Team_Ver_2 = c("4123", "5133", "4126"), Team_Num_2 = c(3,
0, 123), Team_Value_2 = c("aewed", "jfsbwa", "vsbf"), Name_2 = c("qwreg",
"gvr", "wref"), Txt_2 = c("rege", "wer", "vwr"), Head_2 = c("rege1",
"wer", "vwr")), row.names = c(NA, -3L), class = c("tbl_df", "tbl",
"data.frame"))
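According to the dplyr 1.1.0 changelog, na_if() now uses the vctrs package, which is stricter about type stability:
na_if() (#6329) now casts y to the type of x before comparing them, which makes it clearer that this function is type and size stable on x.
In the question, y is the string "0" while Team_Num_1 is a double column, so the cast fails with the error shown.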
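To make the casting rule concrete, here is a minimal sketch on plain vectors. It assumes dplyr 1.1.0 or later is installed; the vectors are taken from the question's data, and the tryCatch() wrapper is only there so the script keeps running after the failing call.

```r
library(dplyr)

# y has the same type as x, so both calls work
num_ok <- na_if(c(0, 23, 12), 0)             # double x, double y
chr_ok <- na_if(c("0", "werg", "sdf"), "0")  # character x, character y

# y is character but x is double: the cast of y to the type of x fails
msg <- tryCatch(
  {
    na_if(c(0, 23, 12), "0")
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(num_ok)
print(chr_ok)
print(msg)
```

The third call is the same situation as Team_Num_1 in the question: a numeric column compared against the string "0".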
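So the value passed as y has to match the type of the column. For character columns, use na_if(x, "0") (in the output below, every Team_ column in dat is character):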
library(dplyr)
dat %>%
mutate(across(starts_with("Team_"), ~ na_if(.x, "0")))
# # A tibble: 3 × 18
# Team_1 Team_Desc_1 Team_UR…¹ Team_…² Team_…³ Team_…⁴ Name_1 Txt_1 Head_1 Team_2
# <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 NA wer ewrg 25 aed aed etwbv abc abc1 werh
# 2 werg wtrb werg 2523 jfsa jfsa werg bfh bfh wtt
# 3 sdf wergt asd 342 vsf vsf sdfg fse fse qwe
# # … with 8 more variables: Team_Desc_2 <chr>, Team_URL_2 <chr>,
# # Team_Ver_2 <chr>, Team_Num_2 <chr>, Team_Value_2 <chr>, Name_2 <chr>,
# # Txt_2 <chr>, Head_2 <chr>, and abbreviated variable names ¹Team_URL_1,
# # ²Team_Ver_1, ³Team_Num_1, ⁴Team_Value_1
If you have a mix of character and numeric columns, you can handle each type separately:
# example data
dat2 <- tibble(
Team_1 = c("0", "werg", "sdf"),
Team_Desc_1 = c(0, 3, 4),
Name_1 = c("etwbv", "werg", "sdfg")
)
dat2 %>%
mutate(
across(starts_with("Team_") & where(is.character), ~ na_if(.x, "0")),
across(starts_with("Team_") & where(is.numeric), ~ na_if(.x, 0)),
)
# # A tibble: 3 × 3
# Team_1 Team_Desc_1 Name_1
# <chr> <dbl> <chr>
# 1 NA NA etwbv
# 2 werg 3 werg
# 3 sdf 4 sdfg
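The same idea can be folded into a single across() call by picking y according to the column type. This is only a sketch: zero_to_na is a hypothetical helper, not a dplyr function, df_small is a made-up subset of the question's data, and the helper assumes the selected columns are either character or numeric.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): choose a y that matches the type of x
zero_to_na <- function(x) {
  if (is.character(x)) na_if(x, "0") else na_if(x, 0)
}

# Small subset of the question's data: one character and one numeric Team_ column
df_small <- tibble(
  Team_1     = c("0", "werg", "sdf"),
  Team_Num_1 = c(0, 23, 12),
  Name_1     = c("etwbv", "werg", "sdfg")
)

out <- df_small %>%
  mutate(across(starts_with("Team_"), zero_to_na))

print(out)
```

Splitting by where(is.character) and where(is.numeric), as above, is more explicit; the helper version is shorter when the same rule is applied in several places.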
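I ran into the same problem and, for simplicity, went with the following approach, which should work regardless of the data class of the individual columns.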
all_data[all_data %in% c(-Inf, Inf)] <- NA
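For the zeros in the original question, a comparable type-agnostic variant can be sketched with base R's replace() inside across(). It works per column because %in% applied to a whole data frame compares each column as a unit rather than cell by cell. The sketch relies on R coercing 0 to "0" when it is compared with a character vector, so it only matches the exact string "0" (not "0.0" or "00"); df_small is a made-up subset of the question's data.

```r
library(dplyr)

# Made-up subset of the question's data; Name_1 contains a "0" that must stay
df_small <- tibble(
  Team_1     = c("0", "werg", "sdf"),
  Team_Num_1 = c(0, 23, 12),
  Name_1     = c("0", "werg", "sdfg")
)

# .x == 0 is evaluated per column: numeric columns compare numerically,
# character columns compare against the string "0"
out <- df_small %>%
  mutate(across(starts_with("Team_"), ~ replace(.x, .x == 0, NA)))

print(out)
```

Because no na_if() call is involved, no cast of y is attempted and the column types are left unchanged.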