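Given a data frame in R in which several columns can each serve as the grouping variable, I am trying to write a function that takes the data frame "df", a list or vector of variable names "vars", a time variable "time", and a status variable "status", and returns both the survival fit from "survfit" and the Kaplan-Meier curve from ggsurvplot.
The goal is to avoid copying and pasting the same code over and over.
Take the following data as an example: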
library(ggplot2)
library(survival)
library(survminer) # provides ggsurvplot()
library(dplyr)
df <- lung %>%
  transmute(time,
            status, # censoring status 1=censored, 2=dead
            Age = age,
            Sex = factor(sex, labels = c("Male", "Female")),
            ECOG = factor(ph.ecog),
            `Meal Cal` = as.numeric(meal.cal))
# help(lung)
# Turn status into (0=censored, 1=dead)
df$status <- ifelse(df$status == 2, 1, 0)
Of course, I can run a survival analysis like this:
fit <- survfit(Surv(time, status) ~ ECOG, data = df)
ggsurvplot(fit,
           pval = TRUE, pval.coord = c(750, 0.3),
           conf.int = FALSE,
           surv.median.line = "hv",
           legend = c(0.8, 0.6),
           legend.title = "",
           risk.table = "absolute",
           risk.table.y.text = FALSE,
           xlab = "Time (days)", ylab = "Survival",
           palette = "jco",
           title = "Overall Survival", font.title = c(16, "bold", "black")
)
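However, if I want to do the same for "Sex", I have to copy and paste everything again. So I would like to write a function in R that takes the data frame "df", a vector of variable names "vars", the time variable "time", and the status variable "status" as input, and returns the survival fit from "survfit" together with the Kaplan-Meier curve from "ggsurvplot", like this: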
vars <- c("ECOG", "Sex")
surv_plot_func <- function(df, vars, time, status) {
  results_list <- lapply(vars, function(var, time, status) {
    # Fit a survival model
    fit <- survfit(Surv(as.numeric(df[[time]]), as.logical(df[[status]])) ~ as.factor(df[[var]]), data = df)
    # Plot the Kaplan-Meier curve using ggsurvplot
    ggsurv <- ggsurvplot(fit, pval = TRUE, conf.int = TRUE,
                         risk.table = TRUE, legend.title = "",
                         surv.median.line = "hv", xlab = "Time", ylab = "Survival Probability")
    # Return the fit and ggsurv as a list
    list(fit = fit, ggsurv = ggsurv)
  })
  # Return the list of results
  results_list
}
res_list <- surv_plot_func(df, vars, "time", "status")
However, this did not work. Any ideas?
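The code below works for me.
As I mentioned in the comments, the error comes from ggsurvplot(): it cannot find "form" when "form" exists only inside the function passed to lapply(). Once I made "form" global with <<-, it finally worked.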
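To illustrate the scoping problem described above, here is a minimal sketch that uses only the survival package. The helper make_fit() is a hypothetical name introduced for this illustration. It shows that the fit stores the formula argument unevaluated, which appears to be what trips up ggsurvplot() when it later re-evaluates the stored formula.

```r
library(survival)

# Hypothetical helper, for illustration only
make_fit <- function(rhs) {
  form <- paste0("Surv(time, status) ~ ", rhs)  # local to make_fit()
  survfit(as.formula(form), data = lung)
}

fit <- make_fit("sex")

# The call keeps the expression as.formula(form), not the formula itself
fit$call$formula

# Re-evaluating that expression outside make_fit() only succeeds if an object
# named `form` can be found from here, for example in the global environment
tryCatch(eval(fit$call$formula), error = function(e) conditionMessage(e))
```

This is why assigning "form" globally makes the lookup succeed.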
library(survival)
library(survminer)
library(dplyr)
df <- lung %>%
  transmute(time,
            status, # censoring status 1=censored, 2=dead
            Age = age,
            Sex = factor(sex, labels = c("Male", "Female")),
            ECOG = factor(ph.ecog),
            `Meal Cal` = as.numeric(meal.cal))
vars <- c("ECOG", "Sex")
surv_plot_func <- function(df, vars, time, status) {
  results_list <- lapply(vars, \(x) {
    # Create the formula as a string; <<- assigns it globally so that
    # ggsurvplot() can find it later
    form <<- paste0("Surv(", time, ", ", status, ") ~ ", x)
    fit <- survfit(as.formula(form), data = df)
    # Plot the Kaplan-Meier curve using ggsurvplot
    ggsurv <- ggsurvplot(fit, pval = TRUE, conf.int = TRUE,
                         risk.table = TRUE, legend.title = "",
                         surv.median.line = "hv", xlab = "Time", ylab = "Survival Probability")
    # Return the fit and ggsurv as a list
    list(fit = fit, ggsurv = ggsurv)
  })
  # Return the list of results
  return(results_list)
}
res_list <- surv_plot_func(df, vars, "time", "status")
res_list[[1]]$ggsurv
res_list[[2]]$ggsurv
Created on 2023-04-12 with reprex v2.0.2
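If you would rather not write to the global environment, one possible variant is to overwrite the formula stored in the fit's call with the formula object itself, so nothing has to be looked up by name later. This is only a sketch under that assumption: surv_plot_local() is a hypothetical name, and passing data = df to ggsurvplot() is my addition rather than part of the code above.

```r
library(survival)
library(survminer)

# Hypothetical variant of surv_plot_func() that avoids <<-
surv_plot_local <- function(df, vars, time, status) {
  results <- lapply(vars, function(x) {
    form <- as.formula(paste0("Surv(", time, ", ", status, ") ~ ", x))
    fit <- survfit(form, data = df)
    # Replace the symbol `form` recorded in the call with the formula itself,
    # so code that re-evaluates fit$call$formula no longer needs `form`
    fit$call$formula <- form
    ggsurv <- ggsurvplot(fit, data = df, pval = TRUE, conf.int = TRUE,
                         risk.table = TRUE, legend.title = "",
                         surv.median.line = "hv",
                         xlab = "Time", ylab = "Survival Probability")
    list(fit = fit, ggsurv = ggsurv)
  })
  setNames(results, vars)  # allows access by variable name
}

df <- transform(lung,
                Sex = factor(sex, labels = c("Male", "Female")),
                ECOG = factor(ph.ecog))
res <- surv_plot_local(df, c("ECOG", "Sex"), "time", "status")
res$Sex$ggsurv
```

Naming the list elements means each plot can be retrieved as res$ECOG$ggsurv or res$Sex$ggsurv instead of by position.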
If you run into errors such as:
Error in as.formula(form) : object 'form' not found Called from: as.formula(form)
Error: object of type 'symbol' is not subsettable
which can happen, for example, inside a Shiny app, you can use eval(parse(...)) instead:
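library(survival)
library(survminer)
library(dplyr)
df <- lung %>%
  transmute(time,
            status, # censoring status 1=censored, 2=dead
            Age = age,
            Sex = factor(sex, labels = c("Male", "Female")),
            ECOG = factor(ph.ecog),
            `Meal Cal` = as.numeric(meal.cal))
categorical_variable <- "Sex"
# conf.int.survfit was not defined in the original snippet; 0.95 is an assumed value
conf.int.survfit <- 0.95
# Paste the whole survfit() call together as a string, then parse and evaluate it
fit_command <- paste0("survfit(Surv(time, status) ~ ", categorical_variable, ", data = df, conf.int = conf.int.survfit)")
fit <- eval(parse(text = fit_command))
ggsurvplot(fit, data = df)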
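The same idea can also be sketched without pasting strings, by building the call as a language object with bquote(). This is an alternative I am suggesting, not something the approach above requires; the data preparation is simplified and the conf.int argument is left out to keep the example short.

```r
library(survival)

df <- transform(lung, Sex = factor(sex, labels = c("Male", "Female")))
categorical_variable <- "Sex"  # for example, a value selected at run time

# as.name() turns the string into a symbol and bquote() splices it into the
# call, so the stored call contains the literal formula
fit_call <- bquote(
  survfit(Surv(time, status) ~ .(as.name(categorical_variable)), data = df)
)
fit <- eval(fit_call)

# The formula is recorded literally, as it is with eval(parse(...))
fit$call$formula
```

Because as.name() always yields a single symbol, a selected column name cannot smuggle extra code into the call the way a pasted string could.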