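I have the following logic that tries to upload a list of files (SmbFile) to AWS S3.
For each file, it checks whether the file has already been uploaded, and uploads it only if it has not.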
// Upload a batch of day directory files
private void uploadBatch(List<SmbFile> batch) {
    batch.parallelStream().forEach(s -> {
        try (s) {
            if (isEmpty(backedUpFileRepository.findByPath(s.getUncPath()))) {
                // File not yet uploaded
                log.info("File with path {} has not been uploaded yet, will continue to upload it...", s.getUncPath());
                // Upload logic
                ...
                ......
            } else {
                // File already uploaded
                log.info("File with path {} already uploaded, will skip uploading it.", s.getUncPath());
            }
        } catch (IOException ioe) {
            // Log a warning and skip uploading the file
            log.warn("Got IOException uploading file to S3. File name: {}", s.getUncPath());
        } catch (Exception e) {
            // Log an error message and skip uploading the file
            log.error("Got Exception uploading file to S3. File name: {}", s.getUncPath());
        }
    });
}
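The problem I am facing is that the same file seems to be processed twice by different threads, as shown in the logs.
My understanding was that forEach ensures each object in the list is processed only once (please correct me if I am wrong).
How can I make sure that no two threads operate on the same file in the list, and that each object is processed only once?
Thanks a lot.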
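For what it's worth, forEach on a parallel stream hands each list element to exactly one thread, so one possible explanation for a path appearing twice is that the list holds two entries for that path, or that two batches overlap. Below is a minimal sketch of guarding against that. It assumes plain String paths in place of SmbFile, and the `claimed` set and `uploadBatch` signature are hypothetical, not taken from the code above.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ClaimOnceDemo {

    // Hypothetical helper: returns how many paths were actually "uploaded".
    // `claimed` must be a thread-safe set, e.g. ConcurrentHashMap.newKeySet().
    static int uploadBatch(List<String> paths, Set<String> claimed) {
        AtomicInteger uploads = new AtomicInteger();
        paths.parallelStream().forEach(path -> {
            // add() is atomic: it returns true only for the first thread that claims this path
            if (claimed.add(path)) {
                uploads.incrementAndGet(); // stand-in for the real upload logic
            }
        });
        return uploads.get();
    }

    public static void main(String[] args) {
        Set<String> claimed = ConcurrentHashMap.newKeySet();
        // "a" and "b" appear twice, but each is claimed only once
        System.out.println(uploadBatch(List.of("a", "b", "a", "c", "b"), claimed)); // prints 3
    }
}
```

The set only protects a single JVM. If several processes upload at once, the "already uploaded" check would need an atomic guard in the repository itself, for example a unique constraint on the path.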
ArrayList is not thread-safe. If parallelism is not needed, use stream() instead of parallelStream().
Test:
public static void main(String[] args) {
    List<Integer> list = new ArrayList<>();
    for (int i = 0; i < 10000; i++) {
        list.add(i);
    }
    System.out.println("a:" + list.size());
    List<Integer> streamList = new ArrayList<>();
    // Not thread-safe: concurrent add() calls on a plain ArrayList can lose elements
    list.parallelStream().forEach(streamList::add);
    System.out.println("b:" + streamList.size());
}
The console almost always shows b < a.
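If the parallel stream is worth keeping, one way to avoid the lost elements in the test above is to let the stream build the result with collect() rather than adding to a shared ArrayList from forEach. This is a minimal sketch, and the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SafeCollectDemo {

    // Hypothetical helper: copies a list using a parallel stream without shared mutable state.
    static List<Integer> copyInParallel(List<Integer> source) {
        // collect() gives each worker its own container and merges them afterwards,
        // so no list is written to by two threads at once
        return source.parallelStream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> list = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
        List<Integer> copy = copyInParallel(list);
        System.out.println("a:" + list.size());
        System.out.println("b:" + copy.size());
    }
}
```

Here a and b are both 10000. Wrapping the target in Collections.synchronizedList(new ArrayList<>()) would also keep every element, though forEach would then add them in no particular order.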