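New R user here. I just finished a quality control script for some 2023 data at work, and I've attached it below. I'd like to make the process more automated so that people can easily use the script in the future. The only thing that should change is the year selected. Ideally (not sure whether this is possible or relatively easy), someone would open the script, be prompted for the year they want to run QC on (e.g. enter 2021, 2019, etc.), and the script would then run with that year inserted into the relevant parts of the code (which I have marked with asterisks). I'm new to R, so I'm not sure how easy this is, but any help would be greatly appreciated (along with some explanation, if you don't mind).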
# QC Check Function Building Blocks ---------------------------------------
#Bring in QC Results, QC Samples, and Results Tables and Filter Out Unneeded Columns
fn.importData(MDBPATH="C:/Users/h2edhmrs/Desktop/DASLER_TEST_COPY.mdb",
TABLES=c("Analytes"))
"QC-Samples" <- `QC Samples` %>% select(LOC_ID, QC_SAMPLE, SAMPLE_DEPTH, QC_TYPE, ASSOC_SAMP)
"QC-Results" <- `QC Results` %>% select(Loc_ID, QC_Sample, Units, Value, Text_Value, QC_Type, Storet_Num)
'Results_' <- `Results` %>% select(Loc_ID, Sample_Num, Units, Value, Storet_Num, Text_Value)
**#Only get 2023 records from QC Samples table
'QC-Samples' <- `QC-Samples` %>% filter(substr(QC_SAMPLE,1,4)=="2023")**
#Rename QC-Results QC_Sample column to match name in QC-Samples table
colnames(`QC-Results`)[2] <- 'QC_SAMPLE'
#Merge Results and Samples table to get full 2023 QC records
QC_Results <- merge(`QC-Results`, `QC-Samples`[ ,c("QC_SAMPLE", "ASSOC_SAMP")], by = "QC_SAMPLE")
#Now must get associated samples into table
#To do this, I will rename "Sample_Num" column in Results_ table to ASSOC_SAMP and then merge the two
colnames(Results_)[2] <- "ASSOC_SAMP"
QCandResults <- merge(QC_Results, Results_[,c("ASSOC_SAMP","Storet_Num", "Units", "Value", "Text_Value")], by = c("ASSOC_SAMP", "Storet_Num"))
#rename columns of QCandResults for clarity
colnames(QCandResults)[c(1,5,6,7,9,10,11)] <- c("Sample_Num", "Units_QC", "Value_QC", "Text_Value_QC", "Units_Results", "Value_Results", "Text_Value_Results")
#matching Storet_num to display the parameter name in the QCandResults table
colnames(Analytes)[1] <- "Storet_Num"
QCandResults <- (merge(Analytes[,c("Storet_Num", "anl_short")], QCandResults, by = "Storet_Num"))
QCandResults <- QCandResults[-1]
#Making only dups and splits in the table
QCandResults <- subset(QCandResults, QC_Type %in% c("DUP", "SPL")) # %in% tests membership; == would recycle the vector and silently drop rows
# Relative Percent Difference Function ------------------------------------
#Developing Relative Percent Difference Function
RPD = \(x1, x2) {
x1[is.na(x1)] = 0L; x2[is.na(x2)] = 0L
abs((x1 - x2) / ((x1 + x2) * 0.5)) * 100
}
QCandResults <- transform(QCandResults, RPD = RPD(Value_Results, Value_QC))
#Creating pass column and then creating stat for how many QC failed
QCandResults <- transform(QCandResults, Pass = if_else(RPD > 20, "N", ""))
(sum(QCandResults$Pass == "N", na.rm=T) / nrow(QCandResults))
#export to xl
write.xlsx(QCandResults, "QC2023.xlsx")
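If the only moving part is the year, you can wrap everything in a function and make the year an argument. I only swapped out `substr` for demonstration purposes. All you really need to do is use `rlang::englue` or `paste` to put the year wherever it is needed.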
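For instance, `paste0` can drop the year into both the filter and the output file name. This is only a minimal base R sketch on made-up data: the sample IDs and the `qc_year` variable are hypothetical, and only the `QC_SAMPLE` column name and the `QC2023.xlsx` naming pattern are taken from the question.

```r
# Hypothetical stand-in for the QC Samples table; sample IDs start with the year
qc_samples <- data.frame(
  QC_SAMPLE = c("2023-001", "2021-014", "2023-002"),
  QC_TYPE   = c("DUP", "SPL", "SPL")
)

qc_year <- 2023  # hypothetical: the only value a future user would change

# substr() returns character, so compare against the year as a string
keep <- substr(qc_samples$QC_SAMPLE, 1, 4) == as.character(qc_year)
qc_this_year <- qc_samples[keep, ]

# paste0() glues the year into the output file name
out_file <- paste0("QC", qc_year, ".xlsx")

print(qc_this_year)
print(out_file)
```

Keeping the year in a single variable means the filter and the export name cannot drift apart.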
library(gapminder)
library(dplyr)

# Build a text column that has the year embedded in a longer string
make_string = gapminder |>
  mutate(year_string = paste(country, "in", year))

# Keep the rows whose year_col contains the requested four-digit year
filter_year = \(data, year_col, year_to_select){
  year_query = rlang::englue('{year_to_select}')
  filter_data = data |>
    ### you could also totally just use paste here
    filter(sub(".*([0-9]{4}).*", '\\1' , {{year_col}}) == year_query)
  filter_data
}

examp = filter_year(make_string, year_col = year_string, year_to_select = '1952')

# Same filter done directly on the numeric year column, for comparison
check = make_string |>
  filter(year == 1952)

identical(examp, check)
#> [1] TRUE
Created on 2024-07-31 with reprex v2.1.1
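As for the prompt asked about in the question, base R's `readline()` can ask for the year when the script is run interactively (for example via `source()`). The sketch below is one possible approach, not a tested part of the QC script: the helper names `parse_year` and `ask_year` and the default of "2023" are assumptions made for illustration.

```r
# Validate a year typed as text; returns it as a four-character string
parse_year <- function(input) {
  input <- trimws(input)
  if (!grepl("^[0-9]{4}$", input)) {
    stop("Please enter a four-digit year, e.g. 2021", call. = FALSE)
  }
  input
}

# Ask for the year in an interactive session; otherwise use the default,
# because readline() cannot wait for input in non-interactive runs
ask_year <- function(default = "2023") {
  if (interactive()) {
    parse_year(readline("Year to run QC for: "))
  } else {
    parse_year(default)
  }
}

qc_year <- ask_year()
print(qc_year)
```

The resulting `qc_year` string can then be passed as the year argument of a function like `filter_year` above, or compared against `substr(QC_SAMPLE, 1, 4)` in the original script.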