R table1 package: adjusting p-values in an extra column


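Is there a way to apply a p-value adjustment to all the p-values in the extra column, perhaps by incorporating the p.adjust function somewhere? I assume the pvalue function from the vignette is evaluated separately for each row, so it would not make sense to adjust there. Is it possible anywhere in the arguments of the table1 function?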

The pvalue function from the vignette (adapted for >2 groups):

pvalue <- function(x, ...) {
    # Construct vectors of data y, and groups (strata) g
    y <- unlist(x)
    g <- factor(rep(1:length(x), times=sapply(x, length)))
    if (is.numeric(y)) {
        # For numeric variables, perform a one-way ANOVA F-test
        # (this replaces the vignette's 2-sample t-test so that >2 groups work)
        p <- anova(lm(y ~ g))$`Pr(>F)`[1]
    } else {
        # For categorical variables, perform a chi-squared test of independence
        p <- chisq.test(table(y, g))$p.value
    }
    # Format the p-value, using an HTML entity for the less-than sign.
    # The initial empty string places the output on the line below the variable label.
    c("", sub("<", "&lt;", format.pval(p, digits=3, eps=0.001)))
}
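
To see what the function works with, it can be called by hand on a single variable. This is a minimal sketch, assuming table1 passes each variable as a list of per-stratum vectors (which is what the unlist(x) and sapply(x, length) calls above imply). The name raw_pvalue is hypothetical: it is a stripped-down version of the numeric branch that returns the number rather than formatted text.

```r
# Hypothetical stripped-down version of the numeric branch above:
# returns the raw p-value instead of formatted text.
raw_pvalue <- function(x) {
  y <- unlist(x)                                      # pooled values
  g <- factor(rep(seq_along(x), times = lengths(x)))  # stratum labels
  anova(lm(y ~ g))$`Pr(>F)`[1]                        # one-way ANOVA F-test
}

# One variable split by stratum: the shape the function is assumed to receive
x_one <- split(iris$Sepal.Length, iris$Species)
p_one <- raw_pvalue(x_one)
p_one
```

Each call sees only one variable, so a single call has no other p-values to adjust against.
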
The table1 example from the vignette (with the dataset replaced by iris to make it easier to reproduce):

table1(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width | Species,
data=iris, overall=F, extra.col=list(`P-value`=pvalue))

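As far as I understand the code above, you cannot use the p.adjust function (from the stats package) inside the pvalue function, because the p-values are calculated on the individual rows of the table. So can it be specified in the extra.col argument of the table1 function instead?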
extra.col=p.adjust(list(`P-value`=pvalue)))
extra.col=list(`P-value`=p.adjust(pvalue)))
gives the error:

Error in p.adjust(list(P-value = pvalue)) : 'list' object cannot be coerced to type 'double'
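
The error arises because p.adjust expects a numeric vector of p-values, not a list or a function. A minimal sketch of the underlying idea, computed outside table1 with base R only (the names vars, raw_p, and adj_p are illustrative, and the ANOVA test mirrors the numeric branch of pvalue):

```r
vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# One raw p-value per variable (one-way ANOVA across Species)
raw_p <- sapply(vars, function(v) {
  anova(lm(iris[[v]] ~ iris$Species))$`Pr(>F)`[1]
})

# Adjust all p-values together; "holm" is p.adjust's default method
adj_p <- p.adjust(raw_p, method = "holm")

format.pval(adj_p, digits = 3, eps = 0.001)
```

The adjustment needs every p-value at once, which is why it cannot happen inside a function that is called once per variable.
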

Link: https://cran.r-project.org/web/packages/table1/vignettes/table1-examples.html#example-a-column-of-p-values

r html-table statistics p-value statistical-test
1 Answer

Here is a post-processing idea using {rvest}:

library(table1)
library(rvest)
library(stringr)

set.seed(42)

# example data with better spread of p-values
data <- data.frame(
  x = factor(rep(paste("Group", 1:2), each = 10)),
  y1 = c(rnorm(10, -0.7), rnorm(10, 0.7)),
  y2 = c(rnorm(10, -0.9), rnorm(10, 0.9)),
  y3 = c(rnorm(10, -1.2), rnorm(10, 1.2))
)

# modified pvalue function to return raw p-values
pvalue <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times = sapply(x, length)))
  if (is.numeric(y)) {
    # For numeric variables, perform a one-way ANOVA F-test
    p <- anova(lm(y ~ g))$`Pr(>F)`[1]
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Return the raw (unformatted) p-value so it can be adjusted after the table is built.
  # The initial empty string places the output on the line below the variable label.
  c("", p) # originally: c("", sub("<", "&lt;", format.pval(p, digits=3, eps=0.001)))
}

my_table1 <- table1(
  ~ . | x,
  data = data,
  overall = FALSE,
  extra.col = list(`P-value` = pvalue)
)

# extract raw p-values from table
raw_p <- my_table1[1] %>%
  read_html() %>%
  html_elements("body > table > tbody > tr > td:nth-child(4)") %>% # p-value is 4th column in example
  as.character() %>%
  str_extract("\\d+\\.\\d+e?-?\\d*")

# format the p-values with experiment-wise adjustment
new_p <- raw_p %>%
  as.numeric() %>%
  p.adjust(n = sum(!is.na(.))) %>%
  format.pval(digits = 3, eps = 0.001)

# replace raw with formatted p-values
for (i in seq_along(raw_p)) {
  if (!is.na(raw_p[i]))
    # fixed() matches the raw p-value literally, so "." is not treated as a regex wildcard
    my_table1[1] <- my_table1[1] %>% str_replace(fixed(raw_p[i]), new_p[i])
}

# table as desired
my_table1

Created on 2024-09-17 with reprex v2.1.1
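
The extract, adjust, and format steps can also be tried in isolation. The following is a rough sketch in base R only, using made-up cell strings (cells) in place of the td elements pulled from the real table. It additionally escapes the less-than sign before the text goes back into HTML, which is an assumption about what the final markup should contain rather than something the code above does.

```r
# Mock cell contents, standing in for the <td> strings extracted from the table
cells <- c("<td></td>",
           "<td>0.0123456789</td>",
           "<td></td>",
           "<td>3.2e-05</td>",
           "<td></td>",
           "<td>0.4</td>")

# Same pattern as above: a decimal number, optionally in scientific notation
pattern <- "\\d+\\.\\d+e?-?\\d*"
m <- regexpr(pattern, cells, perl = TRUE)
raw_p <- rep(NA_character_, length(cells))
raw_p[m > 0] <- regmatches(cells, m)

# p.adjust ignores NAs, so the empty label rows stay NA
adj_p <- p.adjust(as.numeric(raw_p), method = "holm")

# Format, then escape "<" so the text can be pasted back into HTML
new_p <- sub("<", "&lt;", format.pval(adj_p, digits = 3, eps = 0.001), fixed = TRUE)
new_p[is.na(adj_p)] <- NA
new_p
```

Note that the pattern requires a decimal point, so a p-value printed as, say, 1e-05 would not be captured; that case is rare with full-precision output but worth checking.
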
