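I am working in RStudio. For data protection reasons, I am not allowed to publish the data I am working with if any cell in a table contains a count of fewer than three. However, as you can see in the table below, I get a count of one for underweight males without the disease.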
library(gt)
library(gtsummary)
df <- data.frame(
disease = c("No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"),
sex = c("Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male",
"Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male", "Female"),
bmi_class = c("Underweight", "Underweight", "Underweight", "Normal", "Normal", "Normal", "Underweight", "Normal", "Normal", "Normal", "Normal", "Normal",
"Overweight", "Overweight", "Overweight", "Overweight", "Overweight", "Overweight", "Normal", "Normal", "Normal", NA, NA, NA, "Overweight"))
df |>
tbl_strata(
strata = disease,
.tbl_fun =
~ .x |>
tbl_summary(by = sex) |>
add_overall()
)
df |>
tbl_strata(
strata = disease,
.tbl_fun =
~ .x |>
tbl_summary(by = sex) |>
add_overall()
) |>
as_gt() |>
sub_values(
columns = all_stat_cols(),
pattern = "^1 \\(.*\\)$",
replacement = "<3"
) |>
sub_values(
columns = all_stat_cols(),
pattern = "^2 \\(.*\\)$",
replacement = "<3"
)
I managed to replace every cell value of 1 or 2 with "<3" using sub_values(). However, the count can still be derived from the proportions of the other levels and from the missing data count together with the header count. So I also need to "anonymize" the proportions and the missing data somehow. Can you help me work this out?

I have multiple variables, and if the solution is to omit the missing data count completely, I only want to do that for bmi_class in the disease == "Yes" stratum. The same goes for the proportions if they need to be omitted or altered (e.g. only including the levels with 3 or more counts in the proportion calculation). Another option could be to change the NA count to an interval, which would still give a sense of the magnitude of the NAs.

I considered creating a new variable in which the small count is changed to NA (which would also change the proportions). However, I am also using the stratified variable in tbl_uvregression, and I want that estimate to be calculated from the true values.

My brain has hit its limit trying to solve this. Could some of you help me with other perspectives or, even better, solutions?
Kind regards, Mathilde
Edit: A simplified (unstratified) example with two different versions of a potential solution: New example
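To make the disclosure risk concrete, here is a minimal base-R sketch of how a reader could recover a suppressed count. It uses the "Male" column of the disease == "No" stratum from the example data (header N = 6, underweight count 1), and it assumes the percentage is printed rounded to a whole number. The variable names are illustrative only.

```r
# Header count of the column (males without the disease in the example data)
N <- 6
# Percentage that is still printed next to the suppressed "<3" cell
pct <- 17
# Counts that are consistent with the "<3" label
candidates <- 0:2
# Keep only the candidates whose rounded percentage matches the printed one
recovered <- candidates[round(100 * candidates / N) == pct]
recovered
```

Only one candidate survives, so replacing the count alone does not protect the cell as long as the percentage and the header N are shown.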
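You can follow the general pattern below to change how certain statistics are displayed in the table. In the example below I have only modified the displayed n, but you can extend it further and change the function that formats the percentages as well.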
library(gtsummary)
library(tidyverse)
df <- data.frame(
disease = c("No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"),
sex = c("Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male",
"Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male", "Female"),
bmi_class = c("Underweight", "Underweight", "Underweight", "Normal", "Normal", "Normal", "Underweight", "Normal", "Normal", "Normal", "Normal", "Normal",
"Overweight", "Overweight", "Overweight", "Overweight", "Overweight", "Overweight", "Normal", "Normal", "Normal", NA, NA, NA, "Overweight")) |>
slice(1:7)
# first build the table
tbl <- tbl_summary(df, include = c(sex, bmi_class), missing = "always")
as_kable(tbl)
Characteristic | N = 7 |
---|---|
sex | |
Female | 6 (86%) |
Male | 1 (14%) |
Unknown | 0 |
bmi_class | |
Normal | 3 (43%) |
Underweight | 4 (57%) |
Unknown | 0 |
# extract the ARD, and change the default formatting function
ard <-
gather_ard(tbl) |>
cards::bind_ard() |>
dplyr::mutate(
fmt_fn =
case_when(
stat_name %in% "n" & stat %in% 0:2 ~ list(\(x) "<3"),
stat_name %in% "N_miss" & !stat %in% 0:2 ~ list(\(x) ">2"),
.default = fmt_fn
)
)
# recycle the ARD back through gtsummary
ard |>
tbl_ard_summary(missing = "always") |>
as_kable()
Characteristic | Overall |
---|---|
sex | |
Female | 6 (86%) |
Male | <3 (14%) |
Unknown | 0 |
bmi_class | |
Normal | 3 (43%) |
Underweight | 4 (57%) |
Unknown | 0 |
Created on 2024-11-04 with reprex v2.1.1
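The same pattern might be extended to the percentages and to the missing count with small formatting functions like the ones sketched below. Both helpers are hypothetical (they are not part of gtsummary or cards), and the interval break points are an arbitrary choice for illustration. Each takes a single numeric statistic and returns the string to display, which is the shape of function placed inside list(...) in the case_when() call above.

```r
# Hypothetical helper: hide a percentage entirely, e.g. for a level whose
# count is displayed as "<3"
mask_pct <- function(x) "-"

# Hypothetical helper: report a count as a coarse interval instead of the
# exact value; intervals are closed on the left, e.g. [3, 10) is "3-9"
count_interval <- function(x) {
  as.character(cut(
    x,
    breaks = c(0, 3, 10, Inf),
    labels = c("<3", "3-9", "10+"),
    right = FALSE
  ))
}

count_interval(0)
count_interval(3)
count_interval(12)
```

Note that n and the percentage sit on separate rows of the ARD, so deciding which percentages to mask requires looking up the matching n first; that step is left out of this sketch. With the default "{n} ({p}%)" template, a masked percentage would be displayed as "<3 (-%)".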