Dependency conflict between Apache Spark and Spring Boot

Problem description (0 votes, 5 answers)

So I'm building a full-stack backtesting application for trading strategies, and at the moment I'm planning to use Spring Boot as the server and Apache Spark for the data processing.

I tried to set up Spring Boot and Apache Spark in the same project, but with no luck: I can't resolve the dependency conflict. I get this error:

Exception in thread "main" java.lang.NoClassDefFoundError: javax/servlet/Servlet
    at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:223)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:484)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
    at scala.Option.getOrElse(Option.scala:201)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
    at com.samurai.lab.marketdata.MarketDataApplication.main(MarketDataApplication.java:12)
Caused by: java.lang.ClassNotFoundException: javax.servlet.Servlet
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
    ... 7 more

If I add the Hadoop dependencies, I get a different error instead:

Exception in thread "main" javax.servlet.UnavailableException: Servlet class org.glassfish.jersey.servlet.ServletContainer is not a javax.servlet.Servlet
    at org.sparkproject.jetty.servlet.ServletHolder.checkServletType(ServletHolder.java:514)
    at org.sparkproject.jetty.servlet.ServletHolder.doStart(ServletHolder.java:386)
    at org.sparkproject.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
    at org.sparkproject.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:749)
    at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:510)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:310)
    at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:735)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
    at org.sparkproject.jetty.servlet.ServletHandler.initialize(ServletHandler.java:774)
    at org.sparkproject.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:379)
    at org.sparkproject.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:916)
    at org.sparkproject.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:288)
    at org.sparkproject.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
    at org.apache.spark.ui.ServerInfo.addHandler(JettyUtils.scala:491)
    at org.apache.spark.ui.SparkUI.$anonfun$attachAllHandler$2(SparkUI.scala:76)
    at org.apache.spark.ui.SparkUI.$anonfun$attachAllHandler$2$adapted(SparkUI.scala:76)
    at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
    at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
    at org.apache.spark.ui.SparkUI.$anonfun$attachAllHandler$1(SparkUI.scala:76)
    at org.apache.spark.ui.SparkUI.$anonfun$attachAllHandler$1$adapted(SparkUI.scala:74)
    at scala.Option.foreach(Option.scala:437)
    at org.apache.spark.ui.SparkUI.attachAllHandler(SparkUI.scala:74)
    at org.apache.spark.SparkContext.$anonfun$new$31(SparkContext.scala:648)
    at org.apache.spark.SparkContext.$anonfun$new$31$adapted(SparkContext.scala:648)
    at scala.Option.foreach(Option.scala:437)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:648)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
    at scala.Option.getOrElse(Option.scala:201)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
    at com.samurai.lab.marketdata.MarketDataApplication.main(MarketDataApplication.java:12)

Process finished with exit code 1

I excluded the servlet artifacts from Spark like this:

            <exclusions>
                <exclusion>
                    <groupId>javax.servlet</groupId>
                    <artifactId>javax.servlet-api</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.glassfish</groupId>
                    <artifactId>javax.servlet</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.eclipse.jetty.orbit</groupId>
                    <artifactId>javax.servlet</artifactId>
                </exclusion>
            </exclusions>

That didn't help.

Here is the clean pom.xml I'm using right now:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.0.2</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.samurai.lab</groupId>
    <artifactId>market-data</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>market-data</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>17</java.version>
        <apache-spark.version>3.3.1</apache-spark.version>
        <hadoop.version>3.3.2</hadoop.version>
    </properties>
    <dependencies>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>log4j</groupId>
                    <artifactId>*</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>*</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.logging.log4j</groupId>
                    <artifactId>*</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.13</artifactId>
            <version>${apache-spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.13</artifactId>
            <version>${apache-spark.version}</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

Any ideas on how to make this work and resolve the conflict?

Update v1

These are the dependencies that come in as part of spark-core: [screenshot of the Maven dependency list]
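For reference, the same list can be printed on the command line with the standard maven-dependency-plugin:

mvn dependency:tree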

java spring spring-boot apache-spark
5 Answers
6 votes

I had a similar problem: it looks like a jakarta version conflict between Spring Boot (v3.0.5 in my case) and Spark (3.4.0). I was able to resolve the conflict with these properties:

<properties>
    <jakarta-servlet.version>4.0.3</jakarta-servlet.version>
    <jersey.version>2.36</jersey.version>
</properties>
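Both property names are version properties managed by spring-boot-dependencies, so declaring them in your own <properties> block overrides the versions the Spring Boot parent would otherwise pin. My reading of why this works (not spelled out in the answer above): jakarta.servlet-api 4.0.x is the Jakarta EE 8 line and still ships its classes under the old javax.servlet package, and Jersey 2.x is the javax-based line, so both overrides restore the javax.servlet types that Spark's embedded Jetty expects. In the context of the question's pom this would look like:

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <java.version>17</java.version>
    <!-- Jakarta EE 8 servlet API: still uses the javax.servlet namespace -->
    <jakarta-servlet.version>4.0.3</jakarta-servlet.version>
    <!-- Jersey 2.x is the javax-based line; Jersey 3.x requires jakarta.* -->
    <jersey.version>2.36</jersey.version>
</properties>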

1 vote

If you're using IntelliJ (I can't see which version of your javax artifact is resolved), you can reload all Maven projects; before that, check the project's dependency tree to see which version of that library you actually get.


1 vote
  • Per the comment above: moving to spring-boot 2.7 worked for me.

  • Below is a complete (rough) guide to setting up Spark + Java + Spring, as of this writing. It may or may not work for you, but it worked for me; I don't have a deep enough understanding of how it actually works.

Code configuration (for reference)

Outline view: [screenshot of the project code configuration]

Maven POM

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.7.1</version>
        <relativePath /> <!-- lookup parent from repository -->
    </parent>
    <groupId>LSpark</groupId>
    <artifactId>LSpark_ch08a1</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>LSpark_ch08a1</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>17</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>org.springframework.boot</groupId>
                    <artifactId>spring-boot-starter-logging</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-log4j2</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.13</artifactId>
            <version>3.3.2</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.13</artifactId>
            <version>3.3.2</version>
            <scope>provided</scope>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.13</artifactId>
            <version>3.3.2</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>3.0.8</version> <!--must-->
        </dependency>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>janino</artifactId>
            <version>3.0.8</version> <!--must-->
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

VM arguments (run configuration)

--add-exports java.base/sun.nio.ch=ALL-UNNAMED
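If you launch through the spring-boot-maven-plugin rather than an IDE run configuration, the same flag can presumably be passed via the plugin's jvmArguments parameter (a sketch, assuming the plugin section from the pom above):

<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <configuration>
        <!-- needed on Java 17 so Spark can access sun.nio.ch.DirectBuffer -->
        <jvmArguments>--add-exports java.base/sun.nio.ch=ALL-UNNAMED</jvmArguments>
    </configuration>
</plugin>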

Sample code

package com.ex.main;

import java.util.ArrayList;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;


public class LSpark_ch08a1_Intro {

  public static void demo_Intro() {

    ArrayList<User> arr_user = new ArrayList<>();
    arr_user.add(new User(1, "AA", "NA", 1000));
    arr_user.add(new User(2, "BB", "NA", 1010));
    arr_user.add(new User(3, "CC", "NA", 1020));
    arr_user.add(new User(4, "DD", "NA", 1030));
    arr_user.add(new User(5, "EE", "NA", 1100));
    arr_user.add(new User(6, "FF", "NA", 1600));
    arr_user.add(new User(7, "GG", "NA", 1800));
    arr_user.add(new User(8, "HH", "NA", 2000));
    arr_user.add(new User(9, "II", "NA", 3000));

    //    Encoder<User> UserEncoder = Encoders.bean(User.class);

    SparkSession spark = SparkSession.builder()
                                     .appName("IntroLearn")
                                     .config("spark.master", "local")
                                     .getOrCreate();

    Dataset<Row> df = spark.createDataFrame(arr_user, User.class);
    df.show();
        
    //    df = df.select("id");
    //    df.show();

  }

  public static void main(String[] args) {
    LSpark_ch08a1_Intro.demo_Intro();

  }

}
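The User bean referenced above is not included in the answer. Judging from the constructor calls and the printed columns (only id, job and name appear in the output, so the fourth constructor argument apparently has no getter), a compatible reconstruction, not the original class, could look like:

package com.ex.main;

import java.io.Serializable;

// Sketch of the missing User bean: createDataFrame(List, Class) maps JavaBean
// getters to DataFrame columns, so only id, name and job become columns.
public class User implements Serializable {
  private final int id;
  private final String name;
  private final String job;
  private final int salary; // no getter, so it never appears in df.show()

  public User(int id, String name, String job, int salary) {
    this.id = id;
    this.name = name;
    this.job = job;
    this.salary = salary;
  }

  public int getId() { return id; }
  public String getName() { return name; }
  public String getJob() { return job; }
}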

Output

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
23/04/08 23:26:47 INFO SparkContext: Running Spark version 3.3.2
23/04/08 23:26:47 WARN Shell: Did not find winutils.exe: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
23/04/08 23:26:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/04/08 23:26:47 INFO ResourceUtils: ==============================================================
23/04/08 23:26:47 INFO ResourceUtils: No custom resources configured for spark.driver.
23/04/08 23:26:47 INFO ResourceUtils: ==============================================================
23/04/08 23:26:47 INFO SparkContext: Submitted application: IntroLearn
23/04/08 23:26:47 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/04/08 23:26:47 INFO ResourceProfile: Limiting resource is cpu
23/04/08 23:26:47 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/04/08 23:26:47 INFO SecurityManager: Changing view acls to: Zlgtx
23/04/08 23:26:47 INFO SecurityManager: Changing modify acls to: Zlgtx
23/04/08 23:26:47 INFO SecurityManager: Changing view acls groups to: 
23/04/08 23:26:47 INFO SecurityManager: Changing modify acls groups to: 
23/04/08 23:26:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(Zlgtx); groups with view permissions: Set(); users  with modify permissions: Set(Zlgtx); groups with modify permissions: Set()
23/04/08 23:26:48 INFO Utils: Successfully started service 'sparkDriver' on port 52048.
23/04/08 23:26:48 INFO SparkEnv: Registering MapOutputTracker
23/04/08 23:26:48 INFO SparkEnv: Registering BlockManagerMaster
23/04/08 23:26:48 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/04/08 23:26:48 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/04/08 23:26:48 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/04/08 23:26:48 INFO DiskBlockManager: Created local directory at C:\Users\Zlgtx\AppData\Local\Temp\blockmgr-200c582b-4f3e-49b1-b674-bbaacb62b443
23/04/08 23:26:48 INFO MemoryStore: MemoryStore started with capacity 1658.4 MiB
23/04/08 23:26:48 INFO SparkEnv: Registering OutputCommitCoordinator
23/04/08 23:26:49 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/04/08 23:26:49 INFO Executor: Starting executor ID driver on host ZLT6DHSRC
23/04/08 23:26:49 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
23/04/08 23:26:49 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52049.
23/04/08 23:26:49 INFO NettyBlockTransferService: Server created on ZLT6DHSRC:52049
23/04/08 23:26:49 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
23/04/08 23:26:49 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, ZLT6DHSRC, 52049, None)
23/04/08 23:26:49 INFO BlockManagerMasterEndpoint: Registering block manager ZLT6DHSRC:52049 with 1658.4 MiB RAM, BlockManagerId(driver, ZLT6DHSRC, 52049, None)
23/04/08 23:26:49 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, ZLT6DHSRC, 52049, None)
23/04/08 23:26:49 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, ZLT6DHSRC, 52049, None)
23/04/08 23:26:49 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir.
23/04/08 23:26:49 INFO SharedState: Warehouse path is 'file:/G:/wsp/eclipse/LSpark_ch08a1/spark-warehouse'.
23/04/08 23:26:52 INFO CodeGenerator: Code generated in 151.5239 ms
23/04/08 23:26:52 INFO CodeGenerator: Code generated in 12.3755 ms
+---+---+----+
| id|job|name|
+---+---+----+
|  1| NA|  AA|
|  2| NA|  BB|
|  3| NA|  CC|
|  4| NA|  DD|
|  5| NA|  EE|
|  6| NA|  FF|
|  7| NA|  GG|
|  8| NA|  HH|
|  9| NA|  II|
+---+---+----+

23/04/08 23:26:52 INFO SparkContext: Invoking stop() from shutdown hook
23/04/08 23:26:52 INFO SparkUI: Stopped Spark web UI at http://ZLT6DHSRC:4040
23/04/08 23:26:52 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/04/08 23:26:52 INFO MemoryStore: MemoryStore cleared
23/04/08 23:26:52 INFO BlockManager: BlockManager stopped
23/04/08 23:26:52 INFO BlockManagerMaster: BlockManagerMaster stopped
23/04/08 23:26:52 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/04/08 23:26:52 INFO SparkContext: Successfully stopped SparkContext
23/04/08 23:26:52 INFO ShutdownHookManager: Shutdown hook called
23/04/08 23:26:52 INFO ShutdownHookManager: Deleting directory C:\Users\Zlgtx\AppData\Local\Temp\spark-f1d38682-d750-4fdb-891f-7f1f8b007bc3


References

/**
Notes and references:

[setting]
.config("spark.master", "local")
https://stackoverflow.com/questions/38008330/spark-error-a-master-url-must-be-set-in-your-configuration-when-submitting-a

[not working for me]
spring-spark-example/pom.xml at master · Zhuinden/spring-spark-example
https://github.com/Zhuinden/spring-spark-example/blob/master/pom.xml
java - Use Spring together with Spark - Stack Overflow
https://stackoverflow.com/questions/30053449/use-spring-together-with-spark

[logging]
If you use the starters for assembling dependencies, you have to exclude
Logback and then include log4j 2 instead:
https://stackoverflow.com/questions/59629214/caused-by-org-apache-logging-log4j-loggingexception-log4j-slf4j-impl-cannot-be

[Java 17]
Add the JVM option "--add-exports java.base/sun.nio.ch=ALL-UNNAMED":
https://stackoverflow.com/questions/73465937/apache-spark-3-3-0-breaks-on-java-17-with-cannot-access-class-sun-nio-ch-direct

[no longer needed]
Adding javax.servlet:javax.servlet-api:3.1.0 directly, as these older
answers suggest:
https://stackoverflow.com/questions/36196086/java-lang-classnotfoundexception-javax-servlet-http-httpsessionidlistener
https://stackoverflow.com/questions/19690267/getting-java-lang-classnotfoundexception-javax-servlet-servletcontext-in-junit
https://stackoverflow.com/questions/30374316/spring-boot-java-lang-noclassdeffounderror-javax-servlet-filter

[root cause]
Spark heavily depends on Java Servlet 4.0, part of Java EE 8
(https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.13/3.3.1),
while spring-boot 3 and spring 6 deliberately do not support Java EE 8:
their developers have switched to Jakarta EE 9, which is not backward
compatible with Java EE 8 due to the namespace change.
https://stackoverflow.com/questions/75350944/dependecy-conflict-apache-spark-and-spring-boot

[janino]
org.codehaus.janino:commons-compiler and org.codehaus.janino:janino 3.0.8
are required (see the pom above), otherwise spark-sql fails with
NoClassDefFoundError for org.codehaus.commons.compiler:
https://stackoverflow.com/questions/42352091/spark-sql-fails-with-java-lang-noclassdeffounderror-org-codehaus-commons-compil

No separate Spark installation is needed; the Maven dependencies are enough.
*/
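To make the namespace point above concrete, here is a small illustrative check (my own sketch, not part of the original answer; it assumes both the javax and the jakarta servlet API jars happen to be on the classpath):

public class NamespaceCheck {
  public static void main(String[] args) throws Exception {
    // The JVM treats the two servlet APIs as completely unrelated types:
    Class<?> javaxServlet = Class.forName("javax.servlet.Servlet");
    Class<?> jakartaServlet = Class.forName("jakarta.servlet.Servlet");
    // Prints false. Spring Boot 3 manages Jersey 3.x, whose ServletContainer
    // implements jakarta.servlet.Servlet, so Spark's javax-based Jetty
    // rejects it: "is not a javax.servlet.Servlet".
    System.out.println(javaxServlet.isAssignableFrom(jakartaServlet));
  }
}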

0 votes

Add a scope to your dependency and try again:

<dependency>
    <groupId>javax.servlet</groupId>
    <artifactId>javax.servlet-api</artifactId>
    <scope>provided</scope>
</dependency>

Or use the test scope.


0 votes

Please refer to https://issues.apache.org/jira/browse/SPARK-45897. I'm using Java 17, Spring Boot 3.2.10, Spark 3.5.0 and Scala 2.13; the issue (which had been bothering me for almost 3 days) was resolved after I added the following to my Gradle file.

implementation('org.apache.spark:spark-sql_2.13:3.5.0') {
    exclude group: 'org.apache.logging.log4j', module: 'log4j-slf4j2-impl'
}
implementation('org.glassfish.jersey.containers:jersey-container-servlet-core:2.41')
implementation('jakarta.servlet:jakarta.servlet-api:4.0.4')
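For Maven users, a rough equivalent of the Gradle snippet above (my translation of the same coordinates, not taken from the original answer) would be:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.5.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j2-impl</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.glassfish.jersey.containers</groupId>
    <artifactId>jersey-container-servlet-core</artifactId>
    <version>2.41</version>
</dependency>
<dependency>
    <groupId>jakarta.servlet</groupId>
    <artifactId>jakarta.servlet-api</artifactId>
    <version>4.0.4</version>
</dependency>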