AnalysisException: Found duplicate column(s) in the data to save

Problem description · votes: 0 · answers: 1

I am trying to insert the values of a DataFrame into a SQL table on Databricks.

The problem is that there are no (obvious) duplicate columns in the DataFrame. I checked. What could be going on?

 |-- nr_cpf_cnpj: string (nullable = true)
 |-- tp_pess: string (nullable = true)
 |-- am_bacen: long (nullable = true)
 |-- cd_moda: long (nullable = true)
 |-- cd_sub_moda: long (nullable = true)
 |-- vl_bacen: decimal(29,2) (nullable = true)
 |-- clivenc: string (nullable = true)
 |-- vl_envio: decimal(28,2) (nullable = true)
 |-- nm_pess_empr: string (nullable = true)
 |-- nr_cnae_prin: long (nullable = true)


spark.sql("INSERT INTO TABLE db.tb_jul_bcn  SELECT * FROM tmpBcnView")

The DataFrame is registered as a temporary view named tmpBcnView.

Error:

AnalysisException: Found duplicate column(s) in the data to save: nr_cnae_prin
---------------------------------------------------------------------------
AnalysisException                         Traceback (most recent call last)
<command-2987275027841731> in <cell line: 1>()
----> 1 spark.sql("INSERT INTO TABLE db.tb_jul_bcn  SELECT * FROM tmpBcnView")
/databricks/spark/python/pyspark/instrumentation_utils.py in wrapper(*args, **kwargs)
     46             start = time.perf_counter()
     47             try:
---> 48                 res = func(*args, **kwargs)
     49                 logger.log_success(
     50                     module_name, class_name, function_name, time.perf_counter() - start, signature
/databricks/spark/python/pyspark/sql/session.py in sql(self, sqlQuery, **kwargs)
   1117             sqlQuery = formatter.format(sqlQuery, **kwargs)
   1118         try:
-> 1119             return DataFrame(self._jsparkSession.sql(sqlQuery), self)
   1120         finally:
   1121             if len(kwargs) > 0:
Tags: apache-spark, pyspark, apache-spark-sql, databricks
1 Answer (0 votes)

I solved it!

What actually happened was not that there was a duplicate column, but that there was an extra column in the view that does not exist in the target table. Because the error claims there are duplicates, I kept searching for duplicates. That was it!
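Not part of the original answer, but here is a minimal sketch of how such a mismatch can be diagnosed. The helper names and hard-coded column lists are illustrative; in Spark you would obtain the real lists from `spark.table("tmpBcnView").columns` and `spark.table("db.tb_jul_bcn").columns`:

```python
# Sketch: find columns present in the view but not in the target table
# (and vice versa), then build an INSERT that names columns explicitly
# instead of relying on SELECT * and positional matching.

def diff_columns(view_cols, table_cols):
    """Return (extra_in_view, missing_from_view), preserving order."""
    extra = [c for c in view_cols if c not in table_cols]
    missing = [c for c in table_cols if c not in view_cols]
    return extra, missing

def build_insert(table, view, cols):
    """Build an INSERT statement with an explicit column list."""
    col_list = ", ".join(cols)
    return f"INSERT INTO {table} ({col_list}) SELECT {col_list} FROM {view}"

# Illustrative column lists (a subset of the schema in the question).
view_cols = ["nr_cpf_cnpj", "tp_pess", "am_bacen", "vl_bacen", "nr_cnae_prin"]
table_cols = ["nr_cpf_cnpj", "tp_pess", "am_bacen", "vl_bacen"]

extra, missing = diff_columns(view_cols, table_cols)
print(extra)    # → ['nr_cnae_prin']  (the "extra" column behind the error)
print(missing)  # → []
print(build_insert("db.tb_jul_bcn", "tmpBcnView", table_cols))
```

Listing the columns explicitly also guards against the related failure mode where the view and table have the same columns but in a different order.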
