How do I handle the #DIV/0! error raised by pl.read_excel() when using the calamine/fastexcel engine in Polars?

Question

I am working with a messy Excel file and trying to read it with the pl.read_excel() method in Polars, using the fastexcel (calamine) engine. My goal is to load only 3 specific columns: "apple_column", "banana_column", and "kiwi_column".

Here is what I tried:

pl.read_excel(
    source=xlsx_file_path,
    sheet_name="name_of_the_sheet",
    columns=["apple_column", "banana_column", "kiwi_column"],
)

And also:

pl.read_excel(
    source=xlsx_file_path,
    sheet_name="name_of_the_sheet",
    read_options={
        "use_columns": ["apple_column", "banana_column", "kiwi_column"],
    },
)

Unfortunately, both approaches fail with the same error:

CalamineCellError: calamine cell error: #DIV/0!
Context:
    0: could not determine dtype for column __UNNAMED__25

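It seems that even though the columns I need (["apple_column", "banana_column", "kiwi_column"]) have nothing to do with the problematic column (__UNNAMED__25), the engine still tries to read the entire sheet and hits a #DIV/0! error in one of the unused columns.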

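Does this mean the calamine/fastexcel engine always reads the entire sheet, even when specific columns are requested? And what is the recommended workaround?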

Answer 1

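It works if you tell calamine/fastexcel to read every column as a string, as shown below, and then select the columns you need and cast them to the desired dtypes in .select(). There may be a better way than this, though.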

pl.read_excel(
    source=xlsx_file_path,
    sheet_name="name_of_the_sheet",
    read_options={
        "dtypes": "string", # Read all excel columns as strings
    },
).select(
    pl.col("apple_column"),
    pl.col("banana_column"),
    pl.col("kiwi_column"),
)
