Spark submit on Kubernetes: java.nio.file.NoSuchFileException


I am trying to run a Scala/Spark application on a cluster managed by Kubernetes.

  1. I built the jar file of my Scala/Spark application: scala-spark-1.0-jar-with-dependencies.jar

  2. I built my own Docker image from apache/spark:latest, adding my jar file.

     Dockerfile:

     ```
     FROM apache/spark:latest

     COPY scala-spark-1.0-jar-with-dependencies.jar /opt/spark/work-dir/
     ```

     Build:

     ```
     docker build . -t xxxx.cloud/spark/testavg-spark
     ```

  3. I pushed the Docker image to my Docker registry: xxxx.cloud/spark/testavg-spark

     ```
     docker push xxxx.cloud/spark/testavg-spark
     ```

  4. I ran the container locally to verify that it actually contains the jar file:

     ```
     docker run -it xxxx.cloud/spark/testavg-spark:latest bash
     spark@c6ae887a6c93:/opt/spark/work-dir$ ls -lrt
     total 247160
     -rw-rw-r-- 1 root root 253087621 Mar  7 08:37 scala-spark-1.0-jar-with-dependencies.jar
     ```

  5. I used the spark-submit command to launch the driver pod and run the application on the k8s cluster (see the note on local:// paths right after this list):

     ```
     spark-submit \
       --class TestAvg \
       --master k8s://https://yyyyy:6443 \
       --deploy-mode cluster \
       --name SparkTestAvg \
       --conf spark.executor.instances=3 \
       --conf spark.kubernetes.container.image.pullSecrets=pull-secret \
       --conf spark.kubernetes.container.image=xxxx.cloud/spark/testavg-spark:latest \
       --conf spark.kubernetes.authenticate=${AUTH_KEY} \
       local:///opt/spark/work-dir/scala-spark-1.0-jar-with-dependencies.jar
     ```
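For reference, a local:// URI tells Spark that the jar already exists on the container's filesystem at that path, so it has to match the image contents exactly. A quick sanity check (my addition, not part of the original post), assuming Docker is available on the submitting machine:

```
# Compare the image's work-dir contents against the path passed to spark-submit.
docker run --rm xxxx.cloud/spark/testavg-spark:latest ls -l /opt/spark/work-dir/
```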

The execution starts, and then ends with an error:

```
24/03/07 10:05:08 WARN Utils: Your hostname, xxxxxx resolves to a loopback address: 127.0.1.1; using 192.168.99.159 instead (on interface enp0s31f6)
24/03/07 10:05:08 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
24/03/07 10:05:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/03/07 10:05:08 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
24/03/07 10:05:09 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
24/03/07 10:05:10 INFO LoggingPodStatusWatcherImpl: State changed, new state:
 pod name: sparktestavg-41462f8e1828c47c-driver
 namespace: default

....

24/03/07 10:05:28 INFO LoggingPodStatusWatcherImpl: State changed, new state:
 pod name: sparktestavg-41462f8e1828c47c-driver
 namespace: default
 labels: spark-app-name -> sparktestavg, spark-app-selector -> spark-3e394a89a4b64a41ab92267b25b00d29, spark-role -> driver, spark-version -> 3.5.0
 pod uid: 493eaaf9-c319-4eb9-bdf2-c04a30fd5498
 creation time: 2024-03-07T09:05:09Z
 service account name: default
 volumes: spark-local-dir-1, spark-conf-volume-driver, kube-api-access-2xq88
 node name: xxxxxxxx-f49be65dc8d94931852164
 start time: 2024-03-07T09:05:09Z
 phase: Running
 container status:
     container name: spark-kubernetes-driver
     container image: xxxx.cloud/spark/testavg-spark:latest
     container state: terminated
     container started at: 2024-03-07T09:05:23Z
     container finished at: 2024-03-07T09:05:26Z
     exit code: 1
     termination reason: Error
```

The error is:

```
kubectl logs sparktestavg-41462f8e1828c47c-driver
Files local:///opt/spark/work-dir/scala-spark-1.0-jar-with-dependencies.jar from /opt/spark/work-dir/scala-spark-1.0-jar-with-dependencies.jar to /opt/spark/work-dir/scala-spark-1.0-jar-with-dependencies.jar
Exception in thread "main" java.nio.file.NoSuchFileException: /opt/spark/work-dir/scala-spark-1.0-jar-with-dependencies.jar
    at java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
    at java.base/sun.nio.fs.UnixCopyFile.copy(Unknown Source)
    at java.base/sun.nio.fs.UnixFileSystemProvider.copy(Unknown Source)
    at java.base/java.nio.file.Files.copy(Unknown Source)
```

Why am I getting this error? The file /opt/spark/work-dir/scala-spark-1.0-jar-with-dependencies.jar is included in my Docker image.

Thanks in advance.

apache-spark kubernetes spark-submit
1 Answer

I am having the same problem. I don't know why it happens, but I managed to find a workaround: use a different directory for the jar than the working directory.

For example, if my working directory is work-dir/, I put the jar into jar-dir/ and, from work-dir, run spark-submit xxxx local:///jar-dir/xxx.jar. Spark will then move xxx.jar into the working directory... (sketched below).
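A minimal sketch of that layout, assuming an image built from apache/spark:latest as in the question (the directory name /opt/spark/jar-dir is illustrative, not from the original answer):

```
FROM apache/spark:latest

# Keep the jar OUTSIDE /opt/spark/work-dir (the image's default working
# directory), so the driver's copy into the work dir does not collide with
# an identically named file already sitting there.
COPY scala-spark-1.0-jar-with-dependencies.jar /opt/spark/jar-dir/
```

Then point spark-submit at the new location:

```
spark-submit ... local:///opt/spark/jar-dir/scala-spark-1.0-jar-with-dependencies.jar
```

My guess (not confirmed in the thread) is that the driver copies the application jar into its working directory at startup, and when source and destination are the same path that copy step trips over itself.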

If anyone knows why this happens, please let me know as well.
