spark-3.5.0-bin-without-hadoop :: unable to start thriftserver.sh


On an RHEL-8 Linux server with

  • Hadoop 3.3.6
  • JDK 1.8, and
  • spark-3.5.0-bin-without-hadoop,

attempting to launch ./sbin/start-thriftserver.sh from the spark-3.5.0-bin-without-hadoop directory throws the following error ->

 You need to build Spark with -Phive and -Phive-thriftserver.

HADOOP_HOME is set to hadoop-3.3.6 and SPARK_HOME is set to spark-3.5.0-bin-without-hadoop.
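
For reference, in the "Hadoop free" build the external Hadoop jars are picked up through conf/spark-env.sh. A minimal sketch of that wiring (paths are placeholders) ->

    # conf/spark-env.sh -- sketch for a "without-hadoop" Spark build
    export HADOOP_HOME=/opt/hadoop-3.3.6                   # placeholder path
    export SPARK_HOME=/opt/spark-3.5.0-bin-without-hadoop  # placeholder path
    # Per the Spark "Hadoop Free" build docs, hand Spark the Hadoop classpath
    export SPARK_DIST_CLASSPATH=$("${HADOOP_HOME}/bin/hadoop" classpath)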

Please assist and advise -

  • whether there is already a step-by-step setup page to reference
  • for related components such as Hive, libthrift, and the PostgreSQL JDBC driver: which versions work with Spark 3.5.0, and is there a page mapping all of these?

Thanks!

Full message after invoking start-thriftserver.sh ->


    starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /***/spark-3.5.0-bin-without-hadoop/logs/spark-mss-magnus-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-linux-server.out
    failed to launch: nice -n 0 bash /***/spark-3.5.0-bin-without-hadoop/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server
      Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/opt/disk1/tmp
      Spark Command: /***/java-1.8.0-openjdk-1.8.0.232.b09-0.el7_7.x86_64/bin/java -cp /***/spark-3.5.0-bin-without-hadoop/conf/:/***/spark-3.5.0-bin-without-hadoop/jars/*:/***/hadoop-3.3.6/etc/hadoop/:/***/hadoop-3.3.6/share/hadoop/common/lib/*:/***/hadoop-3.3.6/share/hadoop/common/*:/***/hadoop-3.3.6/share/hadoop/hdfs/:/***/hadoop-3.3.6/share/hadoop/hdfs/lib/*:/***/hadoop-3.3.6/share/hadoop/hdfs/*:/***/hadoop-3.3.6/share/hadoop/mapreduce/*:/***/hadoop-3.3.6/share/hadoop/yarn/:/***/hadoop-3.3.6/share/hadoop/yarn/lib/*:/***/hadoop-3.3.6/share/hadoop/yarn/*:/***/hadoop-3.3.6/contrib/capacity-scheduler/*.jar -Djava.security.krb5.conf=/home/mss-magnus/krb5_sparkv3.conf -Xmx1g -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server spark-internal
      ========================================
      Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/opt/disk1/tmp
      Error: Failed to load class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.
      Failed to load main class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.
      You need to build Spark with -Phive and -Phive-thriftserver.


Tried with -v, i.e. ./sbin/start-thriftserver.sh -v, which then produces the following error ->

 ./sbin/start-thriftserver.sh -v
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /***/spark-3.5.0-bin-without-hadoop/logs/spark-mss-magnus-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-linux-server.out
failed to launch: nice -n 0 bash /***/spark-3.5.0-bin-without-hadoop/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server -v
  (spark.submit.deployMode,client)
  (spark.submit.pyFiles,)
  (spark.yarn.historyServer.address,18480)
  Classpath elements:



  Error: Failed to load class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.
  Failed to load main class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.
  You need to build Spark with -Phive and -Phive-thriftserver.
full log in /***/spark-3.5.0-bin-without-hadoop/logs/spark-mss-magnus-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-linux-server.out

When checking the full log file, it contains the following ->

Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/opt/disk1/tmp
Spark Command: /***/java-1.8.0-openjdk-1.8.0.232.b09-0.el7_7.x86_64/bin/java -cp /***/spark-3.5.0-bin-without-hadoop/conf/:/***/spark-3.5.0-bin-without-hadoop/jars/*:/***/hadoop-3.3.6/etc/hadoop/:/***/hadoop-3.3.6/share/hadoop/common/lib/*:/***/hadoop-3.3.6/share/hadoop/common/*:/***/hadoop-3.3.6/share/hadoop/hdfs/:/***/hadoop-3.3.6/share/hadoop/hdfs/lib/*:/***/hadoop-3.3.6/share/hadoop/hdfs/*:/***/hadoop-3.3.6/share/hadoop/mapreduce/*:/***/hadoop-3.3.6/share/hadoop/yarn/:/***/hadoop-3.3.6/share/hadoop/yarn/lib/*:/***/hadoop-3.3.6/share/hadoop/yarn/*:/***/hadoop-3.3.6/contrib/capacity-scheduler/*.jar -Djava.security.krb5.conf=/home/mss-magnus/krb5_sparkv3.conf -Xmx1g -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --verbose spark-internal
========================================
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/opt/disk1/tmp
Using properties file: /***/spark-3.5.0-bin-without-hadoop/conf/spark-defaults.conf
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.yarn.historyServer.address=18480
Adding default property: spark.sql.crossJoin.enabled=true
Adding default property: spark.rpc.askTimeout=360
Adding default property: spark.driver.cores=2
Adding default property: spark.master=yarn
Adding default property: spark.eventLog.dir=hdfs://mycluster/spark/event-log
Adding default property: spark.eventLog.compress=true
Adding default property: spark.executor.cores=2
Adding default property: spark.history.ui.port=18080
Parsed arguments:
  master                  yarn
  remote                  null
  deployMode              null
  executorMemory          null
  executorCores           2
  totalExecutorCores      null
  propertiesFile          /***/spark-3.5.0-bin-without-hadoop/conf/spark-defaults.conf
  driverMemory            null
  driverCores             2
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
  primaryResource         spark-internal
  name                    Thrift JDBC/ODBC Server
  childArgs               []
  jars                    null
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file /***/spark-3.5.0-bin-without-hadoop/conf/spark-defaults.conf:
  (spark.driver.cores,2)
  (spark.eventLog.compress,true)
  (spark.eventLog.dir,hdfs://mycluster/spark/event-log)
  (spark.eventLog.enabled,true)
  (spark.executor.cores,2)
  (spark.history.ui.port,18080)
  (spark.master,yarn)
  (spark.rpc.askTimeout,360)
  (spark.sql.crossJoin.enabled,true)
  (spark.yarn.historyServer.address,18480)


Main class:
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
Arguments:

Spark config:
(spark.app.name,Thrift JDBC/ODBC Server)
(spark.app.submitTime,1729772683511)
(spark.driver.cores,2)
(spark.eventLog.compress,true)
(spark.eventLog.dir,hdfs://mycluster/spark/event-log)
(spark.eventLog.enabled,true)
(spark.executor.cores,2)
(spark.history.ui.port,18080)
(spark.jars,)
(spark.master,yarn)
(spark.rpc.askTimeout,360)
(spark.sql.crossJoin.enabled,true)
(spark.submit.deployMode,client)
(spark.submit.pyFiles,)
(spark.yarn.historyServer.address,18480)
Classpath elements:



Error: Failed to load class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.
Failed to load main class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.
You need to build Spark with -Phive and -Phive-thriftserver.

apache-spark thrift spark-thriftserver
1 Answer

This error indicates that your Spark installation is missing the Hive and Hive Thrift Server components, which are essential for starting the HiveServer2 service. That service is what allows remote clients to interact with the Spark SQL engine.
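
You can confirm this from the jars bundled with the distribution. A quick check (assuming SPARK_HOME points at the extracted spark-3.5.0-bin-without-hadoop directory) will typically come back empty ->

    # Hive-enabled builds ship a spark-hive-thriftserver_*.jar; the
    # "without-hadoop" convenience binary does not, hence the
    # "Failed to load class" error above
    ls "$SPARK_HOME/jars" | grep -i hive-thriftserver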

Try building Spark with Hive and the Hive Thrift Server:

For a source build: if you build Spark from source, you need to include the Hive and Hive Thrift Server modules during the build, using the following command:

    ./build/sbt clean assembly -Phive -Phive-thriftserver
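
If rebuilding with sbt is inconvenient, the Maven equivalent (using Spark's bundled Maven wrapper, as described in the Spark build documentation) is ->

    # -DskipTests shortens the build considerably
    ./build/mvn -Phive -Phive-thriftserver -DskipTests clean package

Alternatively, the prebuilt spark-3.5.0-bin-hadoop3 distribution already includes the Hive and Thrift Server jars, at the cost of also bundling its own Hadoop client libraries.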
