Prevent sjlabelled from automatically copying labels when creating new variables with mutate


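When I create a new variable from labelled variables, the new variable seems to inherit the label of one of the variables I supplied. Is this a bug, and how can I prevent it? Obviously I could give the new variable a new label or remove the old one, but I don't want to do that for every variable I create.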

library(sjlabelled)
library(dplyr)

df <- data.frame(aa = c(1),
                 bb = c(2)) %>% 
  var_labels(
    aa = "label a",
    bb = "label b")
df

df <- df %>% 
  mutate(cc = aa + bb)


# I want to avoid having to relabel the variable
# df <- df %>% 
#   var_labels(
#   cc = "")


r dplyr label
1 Answer

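This behavior is by design rather than a bug, although it comes from base R more than from sjlabelled itself. sjlabelled stores a variable label as a "label" attribute on the column, and R's arithmetic operators copy the attributes of their operands to the result, with the first operand's attributes taking precedence when both carry the same attribute. That is why cc = aa + bb ends up with the label of aa. If you do not want new variables to inherit a label automatically, you can wrap mutate in a small helper that strips labels either from the newly created columns only or from the whole data frame: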

# install.packages("sjlabelled")  # run once if the package is not installed
library(sjlabelled)
library(dplyr)

df <- data.frame(aa = c(1),
                 bb = c(2)) %>% 
  var_labels(
    aa = "label a",
    bb = "label b")

# remove label from newly mutated columns
mutate_strip_labels <- function(.data, ...) {
  # Capture the state before mutation
  old_vars <- names(.data)
  
  # Perform the mutation
  new_data <- mutate(.data, ...)
  
  # Capture the state after mutation
  new_vars <- setdiff(names(new_data), old_vars)
        
  # Remove labels from any new variables
  for (var in new_vars) {
    # Directly modify the column's label
    new_data[[var]] <- set_label(new_data[[var]], label = "")
  }
  
  new_data
}

df <- df %>% 
  mutate_strip_labels(cc = aa + bb)

# or remove all labels when mutating
# (this strips labels from every column, including aa and bb)
mutate_no_labels <- function(.data, ...) {
  result <- mutate(.data, ...)
  result <- remove_all_labels(result)
  result
}

df <- df %>% 
  mutate_no_labels(cc = aa + bb)
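
The label propagation described in this answer can be reproduced without any packages, which may help to see where it comes from. The sketch below is an illustration only: it mimics labelled columns by setting a plain "label" attribute with structure(), on the assumption that this is how the label is stored, and the names cc and cc_plain are made up for the example. It also shows that as.numeric() drops attributes, which is one way to avoid the inherited label inside a single expression.

```r
# Base R only: mimic labelled columns with a plain "label" attribute
aa <- structure(1, label = "label a")
bb <- structure(2, label = "label b")

# Arithmetic copies the operands' attributes to the result;
# the first operand wins when both carry the same attribute
cc <- aa + bb
attr(cc, "label")        # "label a"

# as.numeric() returns a bare numeric vector without attributes
cc_plain <- as.numeric(aa + bb)
attr(cc_plain, "label")  # NULL
```

Inside a pipeline, the same idea would presumably look like mutate(cc = as.numeric(aa + bb)). That avoids a wrapper function, but it has to be written for each expression and it also drops any other attributes the result would otherwise carry.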