I have multiple files in the following format:
> Test1.txt
NameNo Team etc
1: AS001-A. 8 773
2: AS002-S. 7 631
3: AS003-G. 8 970
> Test2.txt
NameNo Team etc
1: AB001-A. 2 773
2: AB002-S. 6 631
3: AB003-G. 6 970
> Test3.txt
NameNo Team etc
1: AR001-A. 1 773
2: AR002-S. 1 631
3: AR003-G. 1 970
Here is my code; I list all the files in one go:
files <- list.files(pattern ="*.txt", full.names = TRUE)
files
> files
Test1.txt
Test2.txt
Test3.txt
Then I run table(Team) on each file, one at a time.
I need to summarise the Team column across files; the desired output looks like this:
Item Team
Test1 7,8
Test2 2,6
Test3 1
Thanks for your help.
# load purrr, readr, dplyr, stringr and tidyr (needs tidyr >= 1.3.0 and dplyr >= 1.1.0)
library(tidyverse)

# get all the text files starting with "Test"
list.files(pattern = "^Test.*\\.txt$") |>
  # loop through the files, reading each one in and adding a column with its filename (extension removed)
  map_dfr(~ read_csv(.x, show_col_types = FALSE) |> mutate(file = str_remove(.x, "\\.txt$"))) |>
  # split the single space-delimited column into separate columns using regex
  separate_wider_regex(`NameNo Team etc`, patterns = c("\\d+:\\s+", NameNo = "\\S+", "\\s+", Team = "\\d+", "\\s+", etc = "\\d+")) |>
  # get the unique teams for each file, sorted
  summarise(Team = paste0(sort(unique(Team)), collapse = ","), .by = file)
Output:
# A tibble: 3 × 2
file Team
<chr> <chr>
1 Test1 7,8
2 Test2 2,6
3 Test3 1
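If you would rather avoid the tidyverse dependency, here is a rough base R sketch of the same idea. It assumes the files are whitespace-separated exactly as shown in the question. The helper name summarise_teams is made up for illustration. When the header line has one field fewer than the data rows (as it does if the "1:" prefixes really are in the files), read.table treats the first field as row names, so Team should still come out as its own column.

```r
# Hypothetical helper: collapse the sorted unique Team values of each file into one string.
# Assumption: every file is whitespace-separated with a "NameNo Team etc" header line.
summarise_teams <- function(files) {
  teams <- vapply(files, function(f) {
    d <- read.table(f, header = TRUE)
    paste(sort(unique(d$Team)), collapse = ",")
  }, character(1))

  data.frame(
    Item = sub("\\.txt$", "", basename(files)),  # file name without the extension
    Team = unname(teams)
  )
}

# All text files in the working directory whose names start with "Test"
summarise_teams(list.files(pattern = "^Test.*\\.txt$"))
```

Unlike the pipeline above, this keeps Team numeric while sorting, which only matters once team numbers reach two digits.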