Anonymize counts below 3 in a gt table


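I am using RStudio. For data protection reasons, I am not allowed to publish the data I am working with if any cell in a table contains a count of fewer than three. However, as you can see in the table below, I get a count of one for underweight males without the disease.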

[Image: gt table without "<3"]

library(gt)
library(gtsummary)

df <- data.frame(
  disease = c("No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", 
              "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"),
  sex = c("Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male",
          "Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male", "Female"),
  bmi_class = c("Underweight", "Underweight", "Underweight", "Normal", "Normal", "Normal", "Underweight", "Normal", "Normal", "Normal", "Normal", "Normal",
                "Overweight", "Overweight", "Overweight", "Overweight", "Overweight", "Overweight", "Normal", "Normal", "Normal", NA, NA, NA, "Overweight"))

df |> 
  tbl_strata(
    strata = disease,
    .tbl_fun = 
      ~ .x |> 
      tbl_summary(by = sex) |> 
      add_overall()
  ) 

df |> 
  tbl_strata(
    strata = disease,
    .tbl_fun = 
      ~ .x |> 
      tbl_summary(by = sex) |> 
      add_overall()
  ) |> 
  as_gt() |> 
  # replace any cell whose count is 1 or 2, e.g. "1 (17%)", with "<3"
  sub_values(
    columns = all_stat_cols(),
    pattern = "^[12] \\(.*\\)$",
    replacement = "<3"
  )

[Image: gt table with "<3"]

I managed to replace every cell with a count of 1 or 2 with "<3" using sub_values. However, the count can still be derived from the proportions of the other levels, and from the missing-data count together with the header count. So I also need to "anonymize" the proportions and the missing-data count somehow. Can you help me work this out?

I have multiple variables. If the solution is to omit the missing-data count completely, I only want to do that for bmi_class in the disease == "Yes" stratum. The same goes for the proportions if they need to be omitted or altered (e.g. by including only the levels with 3 or more counts in the proportion calculation). Another option could be to replace the NA count with an interval, which would still give a sense of how many values are missing.

I considered creating a new variable in which the small count is set to NA (which would also change the proportions). However, I also use the stratified variable in tbl_uvregression, and I want that estimate to be calculated from the true values.
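To make the leak concrete, here is a minimal base R sketch of the back-calculation for the Male column of the disease == "No" stratum. The numbers are read off the example data above; the variable names are made up for illustration and are not part of gt or gtsummary.

```r
# Male column, disease == "No" stratum of the example data
n_header   <- 6   # column header: N = 6
n_normal   <- 5   # "Normal" cell is still printed as 5 (83%)
n_unknown  <- 0   # no "Unknown" row is printed, so nothing is missing
pct_normal <- 83  # printed percentage of the "Normal" cell

# Route 1: subtract everything that is visible from the header count
hidden_from_counts <- n_header - n_normal - n_unknown

# Route 2: the percentages sum to 100, so the complement gives the count
hidden_from_pct <- round((100 - pct_normal) / 100 * n_header)

hidden_from_counts  # 1
hidden_from_pct     # 1
```

Both routes recover the masked count of 1, which is why replacing the cell text alone is not enough.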

My brain has reached its limit trying to solve this. Could some of you help me with other perspectives or, even better, a solution?

Kind regards, Mathilde

Edit: A simplified (unstratified) example with two different versions of a potential solution. [Image: new example]

Tags: r, gtsummary, gt

1 answer

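You can follow the general pattern below to change how certain statistics are displayed in the table. In the example below I only modify the displayed n, but you can extend this and also change the function that formats the percentages.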

library(gtsummary)
library(tidyverse)

df <- data.frame(
  disease = c("No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", 
              "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"),
  sex = c("Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male",
          "Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male", "Female"),
  bmi_class = c("Underweight", "Underweight", "Underweight", "Normal", "Normal", "Normal", "Underweight", "Normal", "Normal", "Normal", "Normal", "Normal",
                "Overweight", "Overweight", "Overweight", "Overweight", "Overweight", "Overweight", "Normal", "Normal", "Normal", NA, NA, NA, "Overweight")) |> 
  slice(1:7)

# first build the table
tbl <- tbl_summary(df, include = c(sex, bmi_class), missing = "always")
as_kable(tbl)
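#> |Characteristic | N = 7   |
#> |:--------------|:-------:|
#> |sex            |         |
#> |Female         | 6 (86%) |
#> |Male           | 1 (14%) |
#> |Unknown        | 0       |
#> |bmi_class      |         |
#> |Normal         | 3 (43%) |
#> |Underweight    | 4 (57%) |
#> |Unknown        | 0       |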


# extract the ARD, and change the default formatting function
ard <-
  gather_ard(tbl) |> 
  cards::bind_ard() |> 
  dplyr::mutate(
    fmt_fn = 
      case_when(
        stat_name %in% "n" & stat %in% 0:2 ~ list(\(x) "<3"),
        stat_name %in% "N_miss" & !stat %in% 0:2 ~ list(\(x) ">2"),
        .default = fmt_fn
      )
  )

# recycle the ARD back through gtsummary
ard |> 
  tbl_ard_summary(missing = "always") |> 
  as_kable()
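#> |Characteristic | Overall  |
#> |:--------------|:--------:|
#> |sex            |          |
#> |Female         | 6 (86%)  |
#> |Male           | <3 (14%) |
#> |Unknown        | 0        |
#> |bmi_class      |          |
#> |Normal         | 3 (43%)  |
#> |Underweight    | 4 (57%)  |
#> |Unknown        | 0        |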
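In the output above the masked cell still shows its percentage (14%), so the count can be recovered from N = 7. As a rough illustration of the masking rule itself, here is a base R sketch that hides small counts together with their percentages, hides one additional cell so the hidden value cannot be found by subtraction, and bins a missing count into an interval. Both `format_masked()` and `missing_interval()` are hypothetical helpers, not gtsummary or cards functions, and the interval breaks are an arbitrary assumption. They work on a whole vector of level counts, so they would still need to be adapted to the per-statistic fmt_fn approach shown above, and they are not a complete disclosure-control check.

```r
# Hypothetical helper: format level counts as "n (p%)", masking small cells.
format_masked <- function(n, threshold = 3) {
  stopifnot(length(n) > 0, !anyNA(n))
  pct <- round(100 * n / sum(n))
  out <- sprintf("%d (%d%%)", as.integer(n), as.integer(pct))

  # Primary suppression: hide the count and the percentage of small cells
  small <- n < threshold
  out[small] <- paste0("<", threshold)

  # Complementary suppression: a single hidden cell could be recovered by
  # subtracting the visible cells from the total, so also hide the smallest
  # remaining cell
  if (sum(small) == 1 && any(!small)) {
    keep  <- which(!small)
    extra <- keep[which.min(n[keep])]
    out[extra] <- paste0(">=", threshold)
  }
  stats::setNames(out, names(n))
}

# Hypothetical helper: report a missing count as an interval
# (the breaks are an assumption; choose ones that suit your data)
missing_interval <- function(n_miss) {
  as.character(cut(n_miss,
                   breaks = c(0, 3, 6, 11, Inf),
                   labels = c("<3", "3-5", "6-10", ">10"),
                   right  = FALSE))
}

masked <- format_masked(c(Normal = 9, Overweight = 4, Underweight = 1))
masked
missing_interval(3)
```

With these made-up counts, the "Underweight" cell becomes "<3", the "Overweight" cell becomes ">=3", and only "Normal" keeps its count and percentage; a missing count of 3 is reported as "3-5".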

Created on 2024-11-04 with reprex v2.1.1
