List all files in a directory that match a file mask (a.k.a. pattern or glob)

Question

I want to list all files in a directory and its subdirectories that match a file mask.

For example "M:\SOURCE\*.doc", where SOURCE might look like this:

|-- SOURCE
|   |-- Folder1
|   |   |-- File1.doc
|   |   |-- File1.txt
|   |-- File2.doc
|   |-- File3.xml

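This should return File1.doc and File2.doc.

Initially I used a DirectoryStream, because it already performs some checks on the mask/glob syntax and can filter with it. This isn't just some regex but an actual file mask, which regular users find easier to understand: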

Files.newDirectoryStream(path, mask);

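The problem is that a DirectoryStream only checks the directory you pass in directly, not its subdirectories.

Then there is the "flattening" method Files.walk, which can in fact look through all subdirectories. The problem is that it does not offer a way to "filter" by a file mask the way a DirectoryStream does: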

Files.walk(path, Integer.MAX_VALUE);

So I'm stuck, unable to combine the best of both methods here...

Answer 1

You can also use a custom FileVisitor [1], combined with a PathMatcher [2], which works perfectly with globs.

The code might look like this:

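import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// The members below are assumed to live inside one enclosing class.
public static void main(String[] args) throws IOException {
    System.out.println(getFiles(Paths.get("/tmp/SOURCE"), "*.doc"));
}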

public static List<Path> getFiles(final Path directory, final String glob) throws IOException {
    final var docFileVisitor = new GlobFileVisitor(glob);
    Files.walkFileTree(directory, docFileVisitor);

    return docFileVisitor.getMatchedFiles();
}

public static class GlobFileVisitor extends SimpleFileVisitor<Path> {

    private final PathMatcher pathMatcher;
    private List<Path> matchedFiles = new ArrayList<>();

    public GlobFileVisitor(final String glob) {
        this.pathMatcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
    }

    @Override
    public FileVisitResult visitFile(Path path, BasicFileAttributes basicFileAttributes) throws IOException {
        if (pathMatcher.matches(path.getFileName())) {
            matchedFiles.add(path);
        }
        return FileVisitResult.CONTINUE;
    }

    public List<Path> getMatchedFiles() {
        return matchedFiles;
    }
}

[1] https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/nio/file/FileVisitor.html

[2] https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/nio/file/PathMatcher.html
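
As a more compact variant of the visitor above, the JDK's Files.find can do the recursive walk and the glob test in one call. This is only a minimal sketch, not part of the original answer: the class name GlobFind and the method findByGlob are made up for illustration, and the result is sorted purely to make the output predictable.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class GlobFind {

    // Hypothetical helper: walks 'directory' recursively and keeps the regular
    // files whose file name matches the glob (for example "*.doc").
    static List<Path> findByGlob(Path directory, String glob) throws IOException {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        // try-with-resources closes the directory handles held by the stream
        try (Stream<Path> found = Files.find(directory, Integer.MAX_VALUE,
                (path, attrs) -> attrs.isRegularFile() && matcher.matches(path.getFileName()))) {
            return found.sorted().collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the directory to scan is given as the first argument
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        findByGlob(root, "*.doc").forEach(System.out::println);
    }
}
```

The attributes passed to the predicate are already read during the walk, so checking attrs.isRegularFile() avoids a separate Files.isDirectory call per entry.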

Answer 2

I think I may have solved my own problem, using the insights I received here and other questions that mention the PathMatcher object:

final PathMatcher maskMatcher = FileSystems.getDefault()
        .getPathMatcher("glob:" + mask);

final List<Path> matchedFiles = Files.walk(path)
        .collect(Collectors.toList());

final List<Path> filesToRemove = new ArrayList<>(matchedFiles.size());

// Collect every path that does not match the mask or is a directory
matchedFiles.forEach(foundPath -> {
    if (!maskMatcher.matches(foundPath.getFileName()) || Files.isDirectory(foundPath)) {
        filesToRemove.add(foundPath);
    }
});

matchedFiles.removeAll(filesToRemove);

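So basically

.getPathMatcher("glob:" + mask);

does the same thing a DirectoryStream does to filter files.

After that, all I have to do is filter the list of paths obtained with Files.walk by removing the elements that do not match my PathMatcher or that are not files.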
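
To illustrate why the code above matches against foundPath.getFileName() rather than the full path, here is a minimal sketch. The class name GlobNameDemo, the helper matchesMask and the sample paths are made up for illustration; the paths do not need to exist. In glob syntax, * does not cross directory boundaries, so a mask like *.doc only matches a single name component.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

final class GlobNameDemo {

    // Hypothetical helper: true if 'path', exactly as given, matches the glob 'mask'.
    static boolean matchesMask(String mask, Path path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + mask);
        return matcher.matches(path);
    }

    public static void main(String[] args) {
        Path nested = Paths.get("Folder1", "File1.doc"); // illustrative path only

        // The whole path contains a separator, which '*' does not cross
        System.out.println(matchesMask("*.doc", nested));                     // false
        // The last name component alone is what the mask is meant for
        System.out.println(matchesMask("*.doc", nested.getFileName()));       // true
        // A different extension does not match
        System.out.println(matchesMask("*.doc", Paths.get("File1.txt")));     // false
    }
}
```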

Answer 3

You can use the generic stream filter with Files.walk and a suitable regular expression passed to String::matches to retrieve the filtered file names:

final String SOURCE_DIR = "test";

Files.walk(Paths.get(SOURCE_DIR))
     .filter(p -> p.getFileName().toString().matches(".*\\.docx?"))
     .forEach(System.out::println);

Output:

test\level01\level11\test.doc
test\level02\test-level2.doc
test\t1.doc
test\t3.docx

Input directory structure:

│   t1.doc
│   t2.txt
│   t3.docx
│   t4.bin
│
├───level01
│   │   test.do
│   │
│   └───level11
│           test.doc
│
└───level02
        test-level2.doc
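
To check the regular expression against the names in this tree without touching the file system, a small sketch may help. The class name RegexNameDemo and the helper keepMatching are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class RegexNameDemo {

    // Same pattern as in the answer: any name ending in ".doc" or ".docx"
    static final String DOC_REGEX = ".*\\.docx?";

    // Hypothetical helper: keeps the names that the regex matches in full
    static List<String> keepMatching(List<String> names, String regex) {
        return names.stream()
                    .filter(name -> name.matches(regex)) // String.matches tests the whole string
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "t1.doc", "t2.txt", "t3.docx", "t4.bin", "test.do", "test-level2.doc");
        // prints [t1.doc, t3.docx, test-level2.doc]
        System.out.println(keepMatching(names, DOC_REGEX));
    }
}
```

Note that test.do is dropped because the pattern requires the full ".doc" before the optional "x".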

Update

A recursive solution can be implemented with newDirectoryStream, but the result needs to be converted to a Stream:

static Stream<Path> readFilesByMaskRecursively(Path start, String mask) {
        
    List<Stream<Path>> sub = new ArrayList<>();
        
    try {
        sub.add(StreamSupport.stream( // read files by mask in current dir
                Files.newDirectoryStream(start, mask).spliterator(), false));
            
        // recurse into each subdirectory and collect its stream of matches
        Files.newDirectoryStream(start, (path) -> path.toFile().isDirectory())
             .forEach(path -> sub.add(readFilesByMaskRecursively(path, mask)));
    } catch (IOException ioex) {
        ioex.printStackTrace();
    }
        
    return sub.stream().flatMap(s -> s); // convert to Stream<Path>
}

// test
readFilesByMaskRecursively(Paths.get(SOURCE_DIR), "*.doc*")
             .forEach(System.out::println);

Output:

test\t1.doc
test\t3.docx
test\level01\level11\test.doc
test\level02\test-level2.doc
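
The recursive version above never closes the directory streams it opens. A possible variant that collects into a List and closes each DirectoryStream with try-with-resources is sketched below. The names MaskLister, listByMask and collect are made up for illustration, and two choices are assumptions that differ from the code above: only regular files are kept, and symbolic links to directories are not followed.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class MaskLister {

    // Hypothetical entry point: all regular files under 'start' whose name matches 'mask'
    static List<Path> listByMask(Path start, String mask) throws IOException {
        List<Path> result = new ArrayList<>();
        collect(start, mask, result);
        return result;
    }

    private static void collect(Path dir, String mask, List<Path> out) throws IOException {
        // Entries of this directory whose names match the glob mask
        try (DirectoryStream<Path> matches = Files.newDirectoryStream(dir, mask)) {
            for (Path entry : matches) {
                if (Files.isRegularFile(entry)) {
                    out.add(entry);
                }
            }
        }
        // Recurse into real subdirectories; links are skipped to avoid cycles
        try (DirectoryStream<Path> subdirs = Files.newDirectoryStream(dir,
                entry -> Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS))) {
            for (Path sub : subdirs) {
                collect(sub, mask, out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the directory to scan is given as the first argument
        Path start = Paths.get(args.length > 0 ? args[0] : ".");
        listByMask(start, "*.doc*").forEach(System.out::println);
    }
}
```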

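Update 2

You can add the prefix **/ to the PathMatcher pattern so that it crosses directory boundaries. The Files.walk-based solution can then use a simplified filter, with no need to remove specific entries: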

String mask = "*.doc*";
PathMatcher maskMatcher = FileSystems.getDefault().getPathMatcher("glob:**/" + mask);
Files.walk(Paths.get(SOURCE_DIR))
     .filter(path -> maskMatcher.matches(path))
     .forEach(System.out::println);

Output (the same files as the recursive solution):

test\level01\level11\test.doc
test\level02\test-level2.doc
test\t1.doc
test\t3.docx
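
One detail worth checking with the **/ prefix: the pattern contains a literal separator, so a path that is just a bare file name with no parent component is not expected to match. A minimal sketch follows; the class name DoubleStarDemo and the helper matchesAnywhere are made up for illustration, and the paths do not need to exist.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

final class DoubleStarDemo {

    // Hypothetical helper: matches the full path against "**/" + mask
    static boolean matchesAnywhere(String mask, Path path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:**/" + mask);
        return matcher.matches(path);
    }

    public static void main(String[] args) {
        // Paths with at least one parent component
        System.out.println(matchesAnywhere("*.doc*", Paths.get("test", "t1.doc")));                        // true
        System.out.println(matchesAnywhere("*.doc*", Paths.get("test", "level01", "level11", "test.doc"))); // true
        System.out.println(matchesAnywhere("*.doc*", Paths.get("test", "t2.txt")));                        // false
        // A bare file name has no separator for the "/" in the pattern to match
        System.out.println(matchesAnywhere("*.doc*", Paths.get("t1.doc")));                                // false
    }
}
```

With Files.walk(Paths.get(SOURCE_DIR)) every entry below the start directory carries the start directory as a prefix, so this case does not come up in the code above. It may matter if the paths are relativized before matching.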

Answer 4

You can filter at the level of the file name string:

private static List<Path> getListFiles(Path path, String regex) throws Exception {
    return Files.walk(path)
                .filter(p -> p.getFileName().toString().matches(regex))
                .toList();
}
