Automating a script for ease of future use


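New R user here. I have just finished a quality control (QC) script for some 2023 data at my job, and I have attached it below. I would like to make the process more automated so that people can easily reuse the script in the future. The only thing that should change is the year selected. Ideally (I am not sure whether this is possible or relatively easy), opening the script would bring up a prompt asking which year to run QC for (i.e. the user types 2021, 2019, etc.), and the script would then run with that year inserted into the corresponding parts of the code (I have marked those with asterisks). I am new to R, so I am not sure how easy this is, but any help would be much appreciated (along with some explanation, if you don't mind).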

# QC Check Function Building Blocks ---------------------------------------

#Bring in QC Results, QC Samples, and Results Tables and Filter Out Unneeded Columns
fn.importData(MDBPATH="C:/Users/h2edhmrs/Desktop/DASLER_TEST_COPY.mdb",
              TABLES=c("Analytes"))

"QC-Samples" <- `QC Samples` %>% select(LOC_ID, QC_SAMPLE, SAMPLE_DEPTH, QC_TYPE, ASSOC_SAMP)
"QC-Results" <- `QC Results` %>%  select(Loc_ID, QC_Sample, Units, Value, Text_Value, QC_Type, Storet_Num)
'Results_' <- `Results` %>% select(Loc_ID, Sample_Num, Units, Value, Storet_Num, Text_Value)

**#Only get 2023 records from QC Samples table
'QC-Samples' <- `QC-Samples` %>% filter(substr(QC_SAMPLE,1,4)=="2023")**

#Rename QC-Results QC_Sample column to match name in QC-Samples table
colnames(`QC-Results`)[2] <- 'QC_SAMPLE'

#Merge Results and Samples table to get full 2023 QC records
QC_Results <- merge(`QC-Results`, `QC-Samples`[ ,c("QC_SAMPLE", "ASSOC_SAMP")], by = "QC_SAMPLE")

#Now must get associated samples into table
#To do this, I will rename "Sample_Num" column in Results_ table to ASSOC_SAMP and then merge the two
colnames(Results_)[2] <- "ASSOC_SAMP"
QCandResults <- merge(QC_Results, Results_[,c("ASSOC_SAMP","Storet_Num", "Units", "Value", "Text_Value")], by = c("ASSOC_SAMP", "Storet_Num"))

#rename columns of QCandResults for clarity
colnames(QCandResults)[c(1,5,6,7,9,10,11)] <- c("Sample_Num", "Units_QC", "Value_QC", "Text_Value_QC", "Units_Results", "Value_Results", "Text_Value_Results")

#matching Storet_num to display the parameter name in the QCandResults table
colnames(Analytes)[1] <- "Storet_Num"
QCandResults <- (merge(Analytes[,c("Storet_Num", "anl_short")], QCandResults, by = "Storet_Num"))
QCandResults <- QCandResults[-1]

#Making only dups and splits in the table
QCandResults <- subset(QCandResults, QC_Type %in% c("DUP", "SPL"))


# Relative Percent Difference Function ------------------------------------

#Developing Relative Percent Difference Function
RPD = \(x1, x2) {
  x1[is.na(x1)] = 0L; x2[is.na(x2)] = 0L
  abs((x1 - x2) / ((x1 + x2) * 0.5)) * 100
}

QCandResults <- transform(QCandResults, RPD = RPD(Value_Results, Value_QC))

#Creating pass column and then creating stat for how many QC failed
QCandResults <- transform(QCandResults, Pass = if_else(RPD > 20, "N", ""))

(sum(QCandResults$Pass == "N", na.rm=T) / nrow(QCandResults))

#export to xl
write.xlsx(QCandResults, "QC2023.xlsx")
1 Answer

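If the only moving part is the year, you can wrap everything in a function and define the year as an argument. I swapped out substr for sub just for demonstration purposes. Really, all you are doing is using rlang::englue or paste to put the year where it is needed.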

library(gapminder)
library(dplyr)

make_string = gapminder |>
  mutate(year_string = paste(country, "in", year))


filter_year = \(data, year_col, year_to_select){
   
  year_query = rlang::englue('{year_to_select}')
 
  filter_data = data |>
     ### you could also totally just use paste here
    filter(sub(".*([0-9]{4}).*", '\\1' , {{year_col}}) == year_query)

  filter_data
}

examp = filter_year(make_string, year_col = year_string, year_to_select = '1952')
  
check = make_string |>
  filter(year == 1952)

identical(examp, check)
#> [1] TRUE
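
Applied to the script in the question, the same idea might look like the minimal sketch below. It uses base R only and assumes, as the original filter does, that the first four characters of QC_SAMPLE hold the year. The names filter_qc_year and qc_output_file and the toy data frame are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: keep only rows whose QC_SAMPLE starts with the given year.
filter_qc_year <- function(qc_samples, year) {
  year <- as.character(year)  # accept either 2021 or "2021"
  qc_samples[substr(qc_samples$QC_SAMPLE, 1, 4) == year, , drop = FALSE]
}

# Hypothetical helper: build the output file name from the year, e.g. "QC2021.xlsx".
qc_output_file <- function(year) paste0("QC", year, ".xlsx")

# Toy data standing in for the QC Samples table (made-up values).
toy <- data.frame(
  QC_SAMPLE = c("2021-001", "2023-002", "2021-003"),
  QC_TYPE   = c("DUP", "SPL", "DUP"),
  stringsAsFactors = FALSE
)

qc_2021  <- filter_qc_year(toy, 2021)
out_file <- qc_output_file(2021)
```

With the year passed as an argument, both the filter and the export file name follow from a single value, so nothing else in the script needs to be edited by hand.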

Created on 2024-07-31 with reprex v2.1.1
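
For the prompt the question asks about, one option is base R's readline(), which asks for input in an interactive session, combined with commandArgs(trailingOnly = TRUE) for runs through Rscript. The following is a sketch under stated assumptions: get_year and is_valid_year are hypothetical names, and the four-digit check is an assumption about what counts as valid input.

```r
# Hypothetical validator: TRUE only for a single four-digit string such as "2021".
is_valid_year <- function(x) {
  length(x) == 1 && !is.na(x) && grepl("^[0-9]{4}$", x)
}

# Hypothetical helper: take the year from the command line (Rscript script.R 2021)
# or, in an interactive session, from a prompt.
get_year <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) >= 1) {
    year <- args[1]
  } else if (interactive()) {
    year <- trimws(readline("Year to run QC for (e.g. 2021): "))
  } else {
    stop("No year supplied; run as: Rscript script.R <year>")
  }
  if (!is_valid_year(year)) {
    stop("Year must be four digits, got: ", year)
  }
  year
}
```

Calling year <- get_year() at the top of the script would then give a character value that can be compared with substr(QC_SAMPLE, 1, 4) and pasted into the output file name. Note that readline() only waits for input in interactive use, which is why the sketch falls back to command-line arguments otherwise.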
