Apache Spark - unable to read data from an MS Access table into a Spark Dataset

Problem description

When I try to read .accdb data into my Spark Dataset, I get:

Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class net.ucanaccess.jdbc.UcanaccessDriver
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:398)
at java.sql/java.sql.DriverManager.isDriverAllowed(DriverManager.java:555)
at java.sql/java.sql.DriverManager.isDriverAllowed(DriverManager.java:547)
at java.sql/java.sql.DriverManager.getDriver(DriverManager.java:280)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.$anonfun$driverClass$2(JDBCOptions.scala:105)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:105)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
at business.extract.DataExtractorImpl.loadFromAccessTable(DataExtractorImpl.java:62)
at application.Orchestrator.initializeJob(Orchestrator.java:52)
at application.ETLEngine.main(ETLEngine.java:15)

Here is my code:

// DataExtractorImpl.java
public Dataset<Row> loadFromAccessTable(String url, String tableName) throws IOException, CustomValidationException {
    return ETLContext.getETLContext().getSession()
            .read()
            .format("jdbc")
            .option("url", "jdbc:ucanaccess://C:/Users/KE926ES/Documents/db/Creditcard_default.accdb")
            .option("dbtable", "CC_SOURCE_1")
            .load();
}

I have the following jars:

  • ucanaccess-5.0.0.jar
  • jackcess-3.0.1.jar
  • commons-lang3-3.10.jar
  • commons-logging-1.2.jar

I have also tried adding the following to the list of options:

.option("driver", "net.ucanaccess.jdbc.UcanaccessDriver")
Tags: apache-spark, apache-spark-dataset
1 Answer
The library is probably not available on the classpath or inside the fat jar.

Try passing the required jars to your application with spark-submit, as shown below.
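A minimal sketch of such a spark-submit invocation, assuming the jars listed in the question sit in the working directory; the application jar name (etl-engine.jar) is a placeholder, while the main class application.ETLEngine is taken from the stack trace. Adjust paths, master, and jar names to your setup:

```shell
# Pass the UCanAccess driver and its dependencies to both the driver and
# the executors via --jars, so DriverManager can find the driver class.
spark-submit \
  --class application.ETLEngine \
  --master local[*] \
  --jars ucanaccess-5.0.0.jar,jackcess-3.0.1.jar,commons-lang3-3.10.jar,commons-logging-1.2.jar \
  etl-engine.jar
```

Note also that UCanAccess depends on HSQLDB (hsqldb.jar), which does not appear in the jar list above; a missing transitive dependency like this can make the driver class fail to initialize with exactly the NoClassDefFoundError shown, so it may need to be added to --jars as well.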
