Polars + Psycopg2: Writing a list column to PostgreSQL

Question

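I'm trying to write a table that contains a column of lists. When writing, I initially got an error because psycopg2 could not adapt the np.array data type to a PostgreSQL type.

After reading this question, I learned about adapters and tried AsIs and QuotedString, trying to mimic the string I would use in SQL.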

Here is an MWE. In SQL, the following works:

CREATE TABLE test_arrays (
    id serial PRIMARY KEY,
    foo REAL [],
    bar VARCHAR (24)
);

INSERT INTO test_arrays (foo, bar)
VALUES('{1.0, 4.0}', 'baz');

INSERT INTO test_arrays (foo, bar)
VALUES('{2.0, 42.0}', 'qux');

I tried to write the same table from Python, but the foo column ends up with a text type:

import polars as pl


def register_psycopg_adapters():
    import numpy as np
    from psycopg2.extensions import register_adapter, QuotedString

    def adapt_numpy_array(numpy_array):
        # should return e.g. '{1.0, 4.0}'
        return QuotedString("{" + ", ".join(map(str, numpy_array)) + "}")

    register_adapter(np.ndarray, adapt_numpy_array)


def test_writing_array_to_postgres():
    conn = "..." # the connection string
    register_psycopg_adapters()

    df = pl.DataFrame(
        dict(
            foo=[[1.0, 42.0], [4.0, 7.0]],
            bar=["baz", "qux"],
        )
    )

    df.write_database("test_arrays", conn, if_exists="replace")

I'd like to know how to write the foo column as an ARRAY type.

Answer 1

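By default, Polars' write_database converts the df to pandas and then uses the pandas to_sql method.

It seems you have (at least implicitly) noticed this, since you are dealing with np.arrays, which only show up because of the conversion to pandas.

In any case, the problem appears to be that you need to be explicit about your dtypes, as in this question.

write_database doesn't forward that parameter, so you need to replicate what Polars is doing yourself in order to get access to the dtypes.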

from sqlalchemy import create_engine, types

# connection is the same connection string used in the question
engine_sa = create_engine(connection)
df.to_pandas(use_pyarrow_extension_array=True).to_sql(
    name="test_arrays",
    con=engine_sa,
    if_exists="replace",
    index=False,
    # pandas calls this parameter dtype (singular)
    dtype={
        "foo": types.ARRAY(types.DOUBLE()),
        "bar": types.TEXT(),
    },
)

Another, non-default option for write_database is to use adbc, but it currently doesn't support ARRAY types, so it won't help you in this case.
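
As a side note on the adapter from the question: the array literal it is meant to produce can be built and checked without a database. The snippet below is only a minimal sketch. to_pg_array_literal is a hypothetical helper name (not part of Polars, pandas, or psycopg2), and it assumes a flat sequence of numbers. Keep in mind that the literal only controls the value being sent; the column type is presumably still decided by the dtype mapping passed to to_sql.

```python
def to_pg_array_literal(values):
    """Format a flat sequence of numbers as a PostgreSQL array literal.

    Hypothetical helper for illustration only: it assumes a
    one-dimensional sequence of numbers and maps None to NULL.
    """
    parts = ["NULL" if v is None else repr(float(v)) for v in values]
    return "{" + ",".join(parts) + "}"


# Same values as the first INSERT in the question
print(to_pg_array_literal([1.0, 4.0]))
```

Inside the question's adapter, the resulting string would still be wrapped in QuotedString, as the original code already does.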
