I have a Java program that memory-maps a file of roughly 800 MB via java.io.RandomAccessFile.
I host it on an EC2 m5.8xlarge instance (32 CPUs, 128 GB RAM) with JVM OPTS set to -Xms64g -Xmx64g.
When the service starts, I get the following error:
[thread 3606 also had an error]
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0x00007f214f3c5e73, pid=3556, tid=3637
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.6+10 (17.0.6+10) (build 17.0.6+10)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.6+10 (17.0.6+10, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x602e73] Copy::fill_to_memory_atomic(void*, unsigned long, unsigned char)+0x103
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/user/builds/current/hs_err_pid3556.log
#
# If you would like to submit a bug report, please visit:
# https://github.com/adoptium/adoptium-support/issues
#
/home/user/builds/current/start.sh: line 78: 3556 Aborted java ${JVM_OPTS} -cp 'lib/*' ${LAUNCH_CLASS} $@
The hs_err_pid3556.log file mentioned above gives me the information below. sun.misc.Unsafe.setMemory failed while setting a block of memory to 0:
Current thread (0x00007fcaf0274fd0): JavaThread "ForkJoinPool-1-worker-7" daemon [_thread_in_vm, id=3199, stack(0x00007fcb15fda000,0x00007fcb160db000)]
Stack: [0x00007fcb15fda000,0x00007fcb160db000], sp=0x00007fcb160d8ce8, free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x602e73] Copy::fill_to_memory_atomic(void*, unsigned long, unsigned char)+0x103
j jdk.internal.misc.Unsafe.setMemory0(Ljava/lang/Object;JJB)V+0 java.base@17.0.6
j jdk.internal.misc.Unsafe.setMemory(Ljava/lang/Object;JJB)V+25 java.base@17.0.6
j jdk.internal.misc.Unsafe.setMemory(JJB)V+6 java.base@17.0.6
j sun.misc.Unsafe.setMemory(JJB)V+7 jdk.unsupported@17.0.6
j example.com.buffer.MemoryMappedBuffer.set(JJB)V+58
j example.com.buffer.Buffer.zeroed()Lexample/com/buffer/Buffer;+9
j example.com.collections.BufferSupplierMapped.supplyBuffers(JJ)Lorg/apache/commons/lang3/tuple/Pair;+37
j example.com.collections.ConcurrentOffheapLongObjMap$MapImpl.<init>(Ljava/lang/String;Lexample/com/collections/OffheapMapBufferSupplier;Lexample/com/collections/OffheapValueSerDe;JJF)V+65
j example.com.collections.ConcurrentOffheapLongObjMap.<init>(Ljava/lang/String;JJLexample/com/collections/OffheapMapBufferSupplier;Lexample/com/collections/OffheapValueSerDe;F)V+64
j example.com.collections.ConcurrentOffheapLongObjMap.<init>(Ljava/lang/String;JJLexample/com/collections/OffheapMapBufferSupplier;Lexample/com/collections/OffheapValueSerDe;)V+11
j example.com.collections.OffheapMapUtil.readToMapped(Ljava/lang/String;Lexample/com/collections/OffheapValueSerDe;Ljava/lang/String;Ljava/lang/String;)Lexample/com/collections/ConcurrentOffheapLongObjMap;+99
j example.com.index.job.WritableSiteIndex.lambda$snapshotLoad$21(Lorg/apache/commons/lang3/mutable/MutableObject;Lexample/com/model/Site;Ljava/lang/String;Ljava/lang/String;)V+20
j example.com.index.job.WritableSiteIndex$$Lambda$362+0x0000000801044f58.run()V+16
j example.com.thread.AsyncTaskList.lambda$add$0(Ljava/lang/String;Lexample/com/function/ThrowingRunnable;)Ljava/lang/Void;+19
j example.com.thread.AsyncTaskList$$Lambda$223+0x0000000800e31428.call()Ljava/lang/Object;+8
j example.com.thread.AsyncTaskList.lambda$execute$1(Ljava/util/concurrent/Callable;)Ljava/lang/Boolean;+1
j example.com.thread.AsyncTaskList$$Lambda$230+0x0000000800e30400.get()Ljava/lang/Object;+4
j example.com.thread.WorkerService$$Lambda$231+0x0000000800e38000.call()Ljava/lang/Object;+4
j java.util.concurrent.ForkJoinTask$AdaptedCallable.exec()Z+5 java.base@17.0.6
j java.util.concurrent.ForkJoinTask.doExec()I+10 java.base@17.0.6
j java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Ljava/util/concurrent/ForkJoinTask;Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V+13 java.base@17.0.6
j java.util.concurrent.ForkJoinPool.scan(Ljava/util/concurrent/ForkJoinPool$WorkQueue;II)I+193 java.base@17.0.6
j java.util.concurrent.ForkJoinPool.runWorker(Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V+53 java.base@17.0.6
j java.util.concurrent.ForkJoinWorkerThread.run()V+31 java.base@17.0.6
v ~StubRoutines::call_stub
V [libjvm.so+0x822715] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x315
V [libjvm.so+0x823f0b] JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, JavaThread*)+0x1cb
V [libjvm.so+0x8eda53] thread_entry(JavaThread*, JavaThread*)+0xa3
V [libjvm.so+0xe5e974] JavaThread::thread_main_inner()+0x184
V [libjvm.so+0xe62020] Thread::call_run()+0xc0
V [libjvm.so+0xc187e1] thread_native_entry(Thread*)+0xe1
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j jdk.internal.misc.Unsafe.setMemory0(Ljava/lang/Object;JJB)V+0 java.base@17.0.6
j jdk.internal.misc.Unsafe.setMemory(Ljava/lang/Object;JJB)V+25 java.base@17.0.6
j jdk.internal.misc.Unsafe.setMemory(JJB)V+6 java.base@17.0.6
j sun.misc.Unsafe.setMemory(JJB)V+7 jdk.unsupported@17.0.6
j example.com.buffer.MemoryMappedBuffer.set(JJB)V+58
j example.com.buffer.Buffer.zeroed()Lexample/com/buffer/Buffer;+9
j example.com.collections.BufferSupplierMapped.supplyBuffers(JJ)Lorg/apache/commons/lang3/tuple/Pair;+37
j example.com.collections.ConcurrentOffheapLongObjMap$MapImpl.<init>(Ljava/lang/String;Lexample/com/collections/OffheapMapBufferSupplier;Lexample/com/collections/OffheapValueSerDe;JJF)V+65
j example.com.collections.ConcurrentOffheapLongObjMap.<init>(Ljava/lang/String;JJLexample/com/collections/OffheapMapBufferSupplier;Lexample/com/collections/OffheapValueSerDe;F)V+64
j example.com.collections.ConcurrentOffheapLongObjMap.<init>(Ljava/lang/String;JJLexample/com/collections/OffheapMapBufferSupplier;Lexample/com/collections/OffheapValueSerDe;)V+11
j example.com.collections.OffheapMapUtil.readToMapped(Ljava/lang/String;Lexample/com/collections/OffheapValueSerDe;Ljava/lang/String;Ljava/lang/String;)Lexample/com/collections/ConcurrentOffheapLongObjMap;+99
j example.com.index.job.WritableSiteIndex.lambda$snapshotLoad$21(Lorg/apache/commons/lang3/mutable/MutableObject;Lexample/com/model/Site;Ljava/lang/String;Ljava/lang/String;)V+20
j example.com.index.job.WritableSiteIndex$$Lambda$362+0x0000000801044f58.run()V+16
j example.com.thread.AsyncTaskList.lambda$add$0(Ljava/lang/String;Lexample/com/function/ThrowingRunnable;)Ljava/lang/Void;+19
j example.com.thread.AsyncTaskList$$Lambda$223+0x0000000800e31428.call()Ljava/lang/Object;+8
j example.com.thread.AsyncTaskList.lambda$execute$1(Ljava/util/concurrent/Callable;)Ljava/lang/Boolean;+1
j example.com.thread.AsyncTaskList$$Lambda$230+0x0000000800e30400.get()Ljava/lang/Object;+4
j example.com.thread.WorkerService$$Lambda$231+0x0000000800e38000.call()Ljava/lang/Object;+4
j java.util.concurrent.ForkJoinTask$AdaptedCallable.exec()Z+5 java.base@17.0.6
j java.util.concurrent.ForkJoinTask.doExec()I+10 java.base@17.0.6
j java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Ljava/util/concurrent/ForkJoinTask;Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V+13 java.base@17.0.6
j java.util.concurrent.ForkJoinPool.scan(Ljava/util/concurrent/ForkJoinPool$WorkQueue;II)I+193 java.base@17.0.6
j java.util.concurrent.ForkJoinPool.runWorker(Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V+53 java.base@17.0.6
j java.util.concurrent.ForkJoinWorkerThread.run()V+31 java.base@17.0.6
v ~StubRoutines::call_stub
siginfo: si_signo: 7 (SIGBUS), si_code: 2 (BUS_ADRERR), si_addr: 0x00007fc979848000
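Interestingly, running the same program on my laptop (Ubuntu with 64 GB of RAM) and memory-mapping the same file works without any problem. On AWS, loading a very similar but smaller file (560 MB versus 800 MB) also works. So I am fairly sure that the Java program works as intended and that the file being mapped is intact.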
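The SIGBUS (bus error) you are seeing means the process touched a memory address that the kernel could not service. In your log it is raised in Copy::fill_to_memory_atomic, called from Unsafe.setMemory0, that is, while zeros were being written into the mapped buffer. SIGBUS is often associated with alignment problems, but the siginfo line in your log reports si_code 2 (BUS_ADRERR), not an alignment fault (BUS_ADRALN). That points to the memory mapping itself: for a file-backed mapping, BUS_ADRERR typically occurs when a page cannot be backed by the file, for example because the mapping extends past the end of the file or the file system has run out of space.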
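Try experimenting with different JVM options. For example, you can try -XX:MaxDirectMemorySize to limit the amount of direct buffer memory. Also check the ulimit settings on the EC2 instance, and check how you specify the size of the memory mapping, including whether the file and the volume that holds it can actually cover that size.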
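To make the "check how you specify the mapping size" step concrete, here is a minimal sketch that validates a mapping before touching it. It is not the code from the question: the class name MappedFileCheck and the method mapChecked are hypothetical, and it uses MappedByteBuffer instead of the custom MemoryMappedBuffer and sun.misc.Unsafe shown in the stack trace. The free-space check is only a heuristic, since a sparse file can report a large length without having its blocks allocated.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper for illustration; not part of the program in the question.
class MappedFileCheck {

    /** Maps the first {@code size} bytes of {@code file} read-write after basic sanity checks. */
    static MappedByteBuffer mapChecked(File file, long size) throws IOException {
        if (size <= 0 || size > Integer.MAX_VALUE) {
            // A single MappedByteBuffer cannot address more than Integer.MAX_VALUE bytes.
            throw new IllegalArgumentException("unsupported mapping size: " + size);
        }
        File dir = file.getAbsoluteFile().getParentFile();
        // Bytes the file would have to grow by so that it covers the whole mapping.
        long growth = Math.max(0L, size - file.length());
        long usable = dir.getUsableSpace();
        if (usable < growth) {
            throw new IOException("need " + growth + " more bytes in " + dir
                    + " but only " + usable + " are usable");
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            if (raf.length() < size) {
                // Extend the file explicitly so no mapped page lies past its end.
                raf.setLength(size);
            }
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0L, size);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mapped-check", ".bin");
        file.deleteOnExit();
        MappedByteBuffer buffer = mapChecked(file, 1 << 20);
        // Touch every byte, similar in spirit to zeroing the buffer in the question.
        for (int i = 0; i < buffer.capacity(); i++) {
            buffer.put(i, (byte) 0);
        }
        System.out.println("mapped and zeroed " + buffer.capacity() + " bytes");
    }
}
```

If a check like this fails on the EC2 instance but passes on the laptop, the difference is more likely in the file system or volume than in the Java code.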
If the problem persists, consider contacting AWS Support. They may be able to provide insights specific to your AWS environment.