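Suppose I have a data frame in R with one column of sample names and three further columns for the different treatments those samples received. The values in the treatment columns are the day on which the treatment was received, or NA if the sample never received it. How can I transform each row so that the numeric values are replaced by the order in which the treatments were received? In other words, within each row, the column with the lowest value (day) should be mutated to "first_treatment", the second lowest to "second_treatment", and so on.
Example input data: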
df = data.frame(samples = LETTERS[1:5],
Treatment_A = c(NA, "3", NA, "10", "12"),
Treatment_B = c("12", NA, NA, "15", "5"),
Treatment_C = c(NA, NA, "5", "8", NA))
Desired output:
df_output = data.frame(samples = LETTERS[1:5],
Treatment_A = c(NA, "First", NA, "Second","Second"),
Treatment_B = c("First", NA, NA, "Third","First"),
Treatment_C = c(NA, NA, "First", "First", NA))
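*Edit: fixed what the output should look like.
This should work. Note that I ran
df <- type.convert(df, as.is = TRUE)
beforehand, because your input contains character vectors rather than numbers.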
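To illustrate why the conversion matters, here is a minimal sketch using a made-up vector (`days_chr` and `days_num` are names introduced only for this example). Ranking character strings compares them as text, so "10" and "12" come before "3":

```r
# Illustrative values only: the same days stored as text, as in the question's input.
days_chr <- c("3", "10", "12")

# Character ranking is lexicographic, so "3" ends up last.
rank(days_chr)
# [1] 3 1 2

# type.convert() turns the strings into integers, which rank as intended.
days_num <- type.convert(days_chr, as.is = TRUE)
rank(days_num)
# [1] 1 2 3
```

Without the conversion, a sample treated on day 3 and day 10 would have the two treatments labelled in the wrong order.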
cols <- grep("^Treatment", names(df), value = TRUE)
ranks <- c("First", "Second", "Third", "Fourth", "Fifth")
# Rank within each row (not each column); na.last = "keep" leaves NA as NA
df[cols] <- t(apply(df[cols], 1, \(x) ranks[rank(x, na.last = "keep")]))
df
#   samples Treatment_A Treatment_B Treatment_C
# 1       A        <NA>       First        <NA>
# 2       B       First        <NA>        <NA>
# 3       C        <NA>        <NA>       First
# 4       D      Second       Third       First
# 5       E      Second       First        <NA>
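As a self-contained check, the sketch below wraps the type conversion and a row-wise ranking in one function and compares the result with the desired output from the question. `rank_treatments`, its `prefix` and `labels` arguments, and the choice of `ties.method = "first"` are assumptions made for this illustration, not part of the original answer.

```r
# Hypothetical helper: replace each row's treatment days with ordinal labels.
rank_treatments <- function(data, prefix = "^Treatment",
                            labels = c("First", "Second", "Third", "Fourth", "Fifth")) {
  # Convert character columns such as "10" to numbers so ranking is numeric.
  data <- type.convert(data, as.is = TRUE)
  cols <- grep(prefix, names(data), value = TRUE)
  # apply(..., 1, f) returns one column per input row, hence the t().
  # ties.method = "first" (an assumption) keeps ranks whole numbers on ties.
  data[cols] <- t(apply(data[cols], 1, function(x) {
    labels[rank(x, na.last = "keep", ties.method = "first")]
  }))
  data
}

# Input and desired output copied from the question.
df <- data.frame(samples = LETTERS[1:5],
                 Treatment_A = c(NA, "3", NA, "10", "12"),
                 Treatment_B = c("12", NA, NA, "15", "5"),
                 Treatment_C = c(NA, NA, "5", "8", NA))
df_output <- data.frame(samples = LETTERS[1:5],
                        Treatment_A = c(NA, "First", NA, "Second", "Second"),
                        Treatment_B = c("First", NA, NA, "Third", "First"),
                        Treatment_C = c(NA, NA, "First", "First", NA))

result <- rank_treatments(df)
all.equal(result, df_output)
```

If a row could contain more treatments than there are entries in `labels`, the extra positions would come back as NA, so the label vector would need to be extended accordingly.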