Summarizing a column across multiple files in R


I have multiple files in the following format:

> Test1.txt
            NameNo  Team  etc
1:         AS001-A.    8  773
2:         AS002-S.    7  631
3:         AS003-G.    8  970

> Test2.txt
            NameNo  Team  etc
1:         AB001-A.    2  773
2:         AB002-S.    6  631
3:         AB003-G.    6  970

> Test3.txt
            NameNo  Team  etc
1:         AR001-A.    1  773
2:         AR002-S.    1  631
3:         AR003-G.    1  970

Here is my code; I list all of the files in one go:

files <- list.files(pattern ="*.txt", full.names = TRUE) 
files
> files
Test1.txt
Test2.txt
Test3.txt
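
One detail worth checking in the call above: the `pattern` argument of `list.files` is a regular expression, not a shell glob, so a pattern like `*.txt` is not anchored to the end of the file name and may match more than intended. The sketch below uses made-up file names (they are not from the question) to compare two safer spellings: `glob2rx()` from the `utils` package, and a hand-written regex.

```r
# Hypothetical file names, used only to illustrate the matching.
candidates <- c("Test1.txt", "Test2.txt", "notes.txt.bak", "txt_readme.md")

# Convert the shell-style glob into an anchored regular expression.
rx <- glob2rx("*.txt")

# Keep only the names that really end in ".txt".
matched_glob <- grep(rx, candidates, value = TRUE)

# Equivalent hand-written regex: a literal dot, then "txt", then end of string.
matched_regex <- grep("\\.txt$", candidates, value = TRUE)

print(matched_glob)
print(matched_regex)
```

Either form can be passed straight to `list.files(pattern = ...)`.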

Then, for each file in turn, I run

table(Team)

I need to summarize the column Team across the files. The desired output looks like this:

Item      Team
Test1      7,8
Test2      2,6
Test3        1

Thanks, everyone, for your help.

r summary
1 Answer
# needs R >= 4.1 for |>, tidyr >= 1.3.0 for separate_wider_regex, dplyr >= 1.1.0 for .by
library(tidyverse)

# get all the text files starting with "Test"
list.files(pattern = "^Test.*\\.txt$") |>

  # loop through the files, reading them in, adding a column for their filename
  # (each line has no commas, so read_csv yields a single column per file)
  map_dfr(~ read_csv(.x) |> mutate(file = str_remove(.x, "\\.txt$"))) |>

  # separate the whitespace-padded single column into real columns using regex
  separate_wider_regex(`NameNo  Team  etc`, patterns = c("\\d+:\\s+", NameNo = "\\S+", "\\s+", Team = "\\d+", "\\s+", etc = "\\d+")) |>

  # get the unique teams for each file, sorted
  summarise(Team = paste0(sort(unique(Team)), collapse = ","), .by = file)

Output:

# A tibble: 3 × 2
  file  Team 
  <chr> <chr>
1 Test1 7,8  
2 Test2 2,6  
3 Test3 1
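
For comparison, here is a base-R sketch of the same idea; it is not part of the original answer. It assumes the files are whitespace-delimited and that the header has one field fewer than the data rows, in which case `read.table` uses the leading `1:`, `2:`, ... as row names. The temporary directory and the helper name `summarise_teams` are hypothetical; the sample rows are copied from the question.

```r
# Recreate the three sample files from the question in a temporary directory.
demo_dir <- tempfile("teams_")
dir.create(demo_dir)
samples <- list(
  Test1 = c("AS001-A.    8  773", "AS002-S.    7  631", "AS003-G.    8  970"),
  Test2 = c("AB001-A.    2  773", "AB002-S.    6  631", "AB003-G.    6  970"),
  Test3 = c("AR001-A.    1  773", "AR002-S.    1  631", "AR003-G.    1  970")
)
for (name in names(samples)) {
  rows <- paste0(seq_along(samples[[name]]), ":         ", samples[[name]])
  writeLines(c("            NameNo  Team  etc", rows),
             file.path(demo_dir, paste0(name, ".txt")))
}

# Hypothetical helper: read one file and collapse its sorted unique Team values.
summarise_teams <- function(path) {
  # The header has 3 fields and the data rows have 4,
  # so the first field of each row becomes the row name.
  dat <- read.table(path, header = TRUE, stringsAsFactors = FALSE)
  paste(sort(unique(dat$Team)), collapse = ",")
}

files <- list.files(demo_dir, pattern = "^Test.*\\.txt$", full.names = TRUE)

# One row per file: the file name without its extension, plus the collapsed teams.
result <- data.frame(
  Item = tools::file_path_sans_ext(basename(files)),
  Team = vapply(files, summarise_teams, character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(result)
```

Because `read.table` parses `Team` as an integer column here, the values are sorted numerically before being pasted together, whereas the regex-based approach above sorts them as strings. The two agree for single-digit teams but could differ once team numbers reach 10 or more.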