I have the following DSE cluster configuration:
6 nodes, each with 6 cores and 16 GB of RAM.
My application is built with PySpark and reads data from the Cassandra DB.
We run the Python Spark application with all available memory and cores, loading 320,000,000 rows from the Cassandra DB, and it fails with this error:
Lost task 97.0 in stage 299.0 (TID 14680, 11.218.78.15): java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at org.apache.spark.storage.TimeTrackingOutputStream.write(TimeTrackingOutputStream.java:58)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at net.jpountz.lz4.LZ4BlockOutputStream.flushBufferedData(LZ4BlockOutputStream.java:205)
at net.jpountz.lz4.LZ4BlockOutputStream.write(LZ4BlockOutputStream.java:158)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.spark.sql.catalyst.expressions.UnsafeRow.writeToStream(UnsafeRow.java:562)
at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2.writeValue(UnsafeRowSerializer.scala:69)
at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:185)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:150)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
at org.apache.spark.scheduler.Task.run(Task.scala:86)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Can you help me? There is about 20 GB on each node.
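For context, the read itself follows the usual Spark Cassandra connector pattern; this is only an illustration of that pattern, and the keyspace and table names are placeholders, not the ones from the actual job:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cassandra-read").getOrCreate()

# Placeholder keyspace/table names -- the real job pulls ~320,000,000 rows.
df = (
    spark.read
    .format("org.apache.spark.sql.cassandra")
    .options(keyspace="my_keyspace", table="my_table")
    .load()
)
```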
This exception is about disk space on the node. Check how much space is actually left, and review your code with an eye to how much it writes out and how much disk that consumes. The first remedy, though, is simply to free up some disk space. If you check and find there is still plenty of room, then look at the location where the Spark master uploads the executable Spark job files. This is especially likely if previously submitted jobs did not finish cleanly and their temporary files were left behind, alongside the job files, in the temporary directory created for each submission. Then you have two options: remove those leftover files by hand, or configure Spark to clean them up automatically.
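As a rough sketch of the disk-space angle, assuming the shuffle scratch space is what fills up: the spill files written by DiskBlockObjectWriter in the stack trace land under spark.local.dir (default /tmp), so pointing it at a volume with more free space can help. The path below is a placeholder:

```python
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = (
    SparkConf()
    .setAppName("cassandra-read")
    # Shuffle spill/scratch files go under spark.local.dir; /data/spark-tmp
    # is a placeholder for a mount with enough free space. In cluster mode
    # this is usually overridden by SPARK_LOCAL_DIRS set on the workers.
    .set("spark.local.dir", "/data/spark-tmp")
)

spark = SparkSession.builder.config(conf=conf).getOrCreate()
```

Automatic removal of old per-application work directories is a worker-daemon setting rather than an application one: on standalone/DSE workers it is enabled with spark.worker.cleanup.enabled=true in SPARK_WORKER_OPTS (spark-env.sh).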
This error also showed up for us when running Spark in local mode (I hit the same problem running Spark queries in local mode). If you run Spark in yarn mode instead, the error may go away.
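A minimal sketch of that switch in PySpark, assuming the session was previously built with a local[*] master and that HADOOP_CONF_DIR/YARN_CONF_DIR are set so the driver can find the YARN cluster (in practice the master is more often passed to spark-submit):

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("cassandra-read")
    # .master("local[*]")   # all shuffle spill lands on one machine's disk
    .master("yarn")         # distribute the work, and its scratch space, across the cluster
    .getOrCreate()
)
```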