Same object processed multiple times in a parallel stream


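I have the following logic, which tries to upload a list of files (SmbFile) to AWS S3.
For each file, it checks whether the file has already been uploaded and uploads it only if it has not.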

// Upload a batch of day directory files
private void uploadBatch(List<SmbFile> batch) {
    batch.parallelStream().forEach(s -> {
        try (s) {
            if (isEmpty(backedUpFileRepository.findByPath(s.getUncPath()))) {
                // File not yet uploaded
                log.info("File with path {} has not been uploaded yet, will continue to upload it...", s.getUncPath());
                // Upload logic
                ...
                ......
            } else {
                // File already uploaded
                log.info("File with path {} already uploaded, will skip uploading it.", s.getUncPath());
            }
        } catch (IOException ioe) {
            // Log an error message and skip uploading the file
            log.warn("Got IOException uploading file to S3. File name: {}", s.getUncPath());
        } catch (Exception e) {
            // Log an error message and skip uploading the file
            log.error("Got Exception uploading file to S3. File name: {}", s.getUncPath());
        }
    });
}

The problem I am facing is that the same file appears to be processed twice, by different threads, as the logs show.


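My impression was that forEach guarantees each object in the list is processed exactly once (please correct me if I am wrong).
How can I make sure that no two threads operate on the same file in the list, and that each object is processed only once?

Many thanks.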

java spring-boot multithreading java-stream thread-safety
1 Answer

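ArrayList is not thread-safe, so adding to one from a parallel stream can lose elements. If you do not need parallelism, use a sequential stream() instead of parallelStream().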
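On the question itself: a parallel forEach hands each list element to exactly one thread, so one possible cause of the duplicate log lines (a guess, since the log output is not available here) is that the batch contains the same path more than once. A minimal sketch of guarding against that, assuming plain String paths in place of SmbFile and a counter in place of the real upload logic; `ClaimOnceDemo` and `processOnce` are hypothetical names:

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ClaimOnceDemo {
    // Returns how many "uploads" ran when each path must be claimed before processing.
    static int processOnce(List<String> paths) {
        // Thread-safe set: add() returns false if the path was already claimed.
        Set<String> claimed = ConcurrentHashMap.newKeySet();
        AtomicInteger uploads = new AtomicInteger();
        paths.parallelStream().forEach(path -> {
            if (claimed.add(path)) {
                // Stand-in for the real check-and-upload logic (assumption).
                uploads.incrementAndGet();
            }
        });
        return uploads.get();
    }

    public static void main(String[] args) {
        // "a.txt" appears twice, as it might if a listing returned the same file twice.
        List<String> paths = List.of("a.txt", "b.txt", "a.txt", "c.txt");
        System.out.println(processOnce(paths)); // prints 3
    }
}
```

The set here is local to one call. If uploadBatch can run concurrently for overlapping batches, the set would have to be shared between calls, for example as a field.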

A test showing the ArrayList problem:

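    import java.util.ArrayList;
    import java.util.List;

    public class ParallelAddTest {
        public static void main(String[] args) {
            List<Integer> list = new ArrayList<>();
            for (int i = 0; i < 10000; i++) {
                list.add(i);
            }
            System.out.println("a:" + list.size());
            // Not thread-safe: concurrent add() calls race, so elements can be lost
            // (an ArrayIndexOutOfBoundsException is also possible).
            List<Integer> streamList = new ArrayList<>();
            list.parallelStream().forEach(streamList::add);
            System.out.println("b:" + streamList.size());
        }
    }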

The console output is almost always:

b < a
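If the parallel stream is worth keeping, the lost elements in the test above can be avoided by letting the stream build the result instead of adding to a shared ArrayList. A minimal sketch using Collectors.toList(); `SafeCollectDemo` and `copyInParallel` are hypothetical names:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SafeCollectDemo {
    // Copies a list through a parallel stream without mutating shared state:
    // each thread fills its own container and the collector merges them.
    static List<Integer> copyInParallel(List<Integer> source) {
        return source.parallelStream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> list = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
        List<Integer> copy = copyInParallel(list);
        System.out.println("a:" + list.size());
        System.out.println("b:" + copy.size());
    }
}
```

With this version a and b are the same size, and the copy keeps the source order. Wrapping the target list in Collections.synchronizedList would also stop the lost elements, at the cost of contention on every add.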
