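Is there a way to apply a p-value adjustment to all p-values in the extra column, by incorporating the p.adjust function somewhere? I assume the pvalue function from the vignette is evaluated separately for each row, so it would not make sense there. Is it possible anywhere in the arguments of the table1 function?
The pvalue function from the vignette (adapted for >2 groups):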
pvalue <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times=sapply(x, length)))
  if (is.numeric(y)) {
    # For numeric variables, perform a one-way ANOVA F-test (works for >2 groups)
    p <- anova(lm(y ~ g))$`Pr(>F)`[1]
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Format the p-value, using an HTML entity for the less-than sign.
  # The initial empty string places the output on the line below the variable label.
  c("", sub("<", "&lt;", format.pval(p, digits=3, eps=0.001)))
}
The table1 example from the vignette (with the dataset replaced by the iris dataset, to make it more accessible):
table1(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width | Species,
       data=iris, overall=FALSE, extra.col=list(`P-value`=pvalue))
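As far as I understand the code above, you can't use the p.adjust function (stats R) inside the pvalue function, because pvalue is evaluated for individual rows of the table. So can it be specified in the extra.col argument of the table1 function instead?
extra.col=p.adjust(list(`P-value`=pvalue)))
or
extra.col=list(`P-value`=p.adjust(pvalue)))
gives the error:
Error in p.adjust(list(`P-value` = pvalue)) : 'list' object cannot be coerced to type 'double'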
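The error arises because p.adjust expects a numeric vector of p-values, not a function or a list of functions. As a minimal base R sketch of the input the adjustment needs, the following collects one raw p-value per variable with the same one-way ANOVA used in pvalue and adjusts them together. The names vars, raw_p and adj_p are illustrative only, and this does not by itself place the adjusted values into the table.

```r
# Variables shown in the table (the grouping factor is iris$Species)
vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# One raw p-value per variable, using the same one-way ANOVA as pvalue()
raw_p <- sapply(vars, function(v) {
  anova(lm(iris[[v]] ~ iris$Species))$`Pr(>F)`[1]
})

# p.adjust() operates on the whole numeric vector at once;
# "holm" is its default method and is spelled out here for clarity
adj_p <- p.adjust(raw_p, method = "holm")

# Format for display, as in the vignette
format.pval(adj_p, digits = 3, eps = 0.001)
```

The adjustment depends on all p-values jointly, which is why it cannot be done inside a function that only ever sees one variable at a time.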
Here is a post-processing idea using {rvest}:
library(table1)
library(rvest)
library(stringr)
set.seed(42)
# example data with better spread of p-values
data <- data.frame(
  x = factor(rep(paste("Group", 1:2), each = 10)),
  y1 = c(rnorm(10, -0.7), rnorm(10, 0.7)),
  y2 = c(rnorm(10, -0.9), rnorm(10, 0.9)),
  y3 = c(rnorm(10, -1.2), rnorm(10, 1.2))
)
# modified pvalue function to return raw p-values
pvalue <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times = sapply(x, length)))
  if (is.numeric(y)) {
    # For numeric variables, perform a one-way ANOVA F-test
    # (with 2 groups this is equivalent to an equal-variance t-test)
    p <- anova(lm(y ~ g))$`Pr(>F)`[1]
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Return the raw, unformatted p-value so it can be adjusted after the table is built.
  # The initial empty string places the output on the line below the variable label.
  c("", p) # formatted version: c("", sub("<", "&lt;", format.pval(p, digits=3, eps=0.001)))
}
my_table1 <- table1(
  ~ . | x,
  data = data,
  overall = FALSE,
  extra.col = list(`P-value` = pvalue)
)
# extract raw p-values from table
raw_p <- my_table1[1] %>%
  read_html() %>%
  html_elements("body > table > tbody > tr > td:nth-child(4)") %>% # p-value is 4th column in example
  as.character() %>%
  str_extract("\\d+\\.\\d+e?-?\\d*")
# format the p-values with experiment-wise adjustment (p.adjust defaults to Holm)
new_p <- raw_p %>%
  as.numeric() %>%
  p.adjust(n = sum(!is.na(.))) %>%
  format.pval(digits = 3, eps = 0.001) %>%
  str_replace(fixed("<"), "&lt;") # escape "<" for the HTML output
# replace raw with formatted p-values (fixed() matches the raw text literally)
for (i in seq_along(raw_p)) {
  if (!is.na(raw_p[i]))
    my_table1[1] <- my_table1[1] %>% str_replace(fixed(raw_p[i]), new_p[i])
}
# table as desired
my_table1
Created on 2024-09-17 with reprex v2.1.1
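To make the extract, adjust and replace steps above easier to follow in isolation, here is a small base R sketch run on a made-up vector of table cells rather than on real table1 output. The cell contents and the name cells are hypothetical, and regexpr/regmatches/sub stand in for the {stringr} calls so the sketch needs no packages.

```r
# Hypothetical cell contents, standing in for the extracted <td> elements
cells <- c("<td></td>", "<td>0.00412345</td>", "<td></td>", "<td>3.2e-05</td>")

# Pull out the numeric text; cells without a number stay NA
m <- regexpr("\\d+\\.\\d+e?-?\\d*", cells)
raw_p <- rep(NA_character_, length(cells))
raw_p[m > 0] <- regmatches(cells, m)

# Holm adjustment; NA entries are left as NA by p.adjust()
adj_p <- p.adjust(as.numeric(raw_p), method = "holm")

# Format, then escape "<" so a value such as "<0.001" remains valid HTML
new_p <- sub("<", "&lt;", format.pval(adj_p, digits = 3, eps = 0.001), fixed = TRUE)

# Replace each raw value literally (fixed = TRUE keeps "." from acting as a wildcard)
for (i in which(!is.na(raw_p))) {
  cells[i] <- sub(raw_p[i], new_p[i], cells[i], fixed = TRUE)
}
cells
```

Matching the raw text literally and escaping the less-than sign are small robustness choices. Note that the extraction pattern only recognises values containing a decimal point.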