How do I use "case when" in DuckDB's relational API?


Say I have

data = {'id': [1, 1, 1, 2, 2, 2],
 'd': [1, 2, 3, 1, 2, 3],
 'sales': [1, 4, 2, 3, 1, 2]}

My end goal is to be able to translate

import duckdb
import polars as pl
df = pl.DataFrame(data)
duckdb.sql("""
select *, case when count(sales) over w then sum(sales) over w else null end as rolling_sales
from df
window w as (partition by id order by d rows between 1 preceding and current row)
""")
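
To make the intent of that query concrete, here is a minimal pure-Python sketch of what the window computes: for each row, the sum of `sales` over the current row and the one before it within the same `id` (ordered by `d`), or `None` when the window holds no non-null `sales`. The function name `rolling_sales` and its `preceding` parameter are made up for illustration and are not part of DuckDB.

```python
from collections import defaultdict

data = {'id': [1, 1, 1, 2, 2, 2],
        'd': [1, 2, 3, 1, 2, 3],
        'sales': [1, 4, 2, 3, 1, 2]}


def rolling_sales(data, preceding=1):
    """Hypothetical helper mirroring the window in the SQL query."""
    # "partition by id order by d": sort rows by id, then by d.
    rows = sorted(zip(data['id'], data['d'], data['sales']),
                  key=lambda r: (r[0], r[1]))
    seen = defaultdict(list)  # sales seen so far, per id
    out = []
    for id_, d, sales in rows:
        seen[id_].append(sales)
        # "rows between 1 preceding and current row"
        window = seen[id_][-(preceding + 1):]
        non_null = [s for s in window if s is not None]
        # "case when count(sales) ... then sum(sales) ... else null end"
        out.append((id_, d, sales, sum(non_null) if non_null else None))
    return out


result = rolling_sales(data)
for row in result:
    print(row)
```

The last element of each tuple plays the role of the `rolling_sales` column.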

I've gotten as far as:

rel = duckdb.table("df")
rel.sum(
    "sales",
    projected_columns="*",
    window_spec="over (partition by id order by d rows between 1 preceding and current row) as rolling_sales",
)

I think this is more readable than a giant SQL string.

But how can I get the "case when ... then" part in there? I've looked at https://duckdb.org/docs/api/python/relational_api.html and it makes no mention of "case".

python duckdb
1 Answer

I'm not sure this is the best way to do it, and I think just writing plain SQL (or using the Polars API) would be more readable, but you can do it like this:

rel = rel.count(
    "*",
    projected_columns="*",
    window_spec="over (partition by id order by d rows between 1 preceding and current row) as rolling_sale",
)
rel = rel.select(
    "* exclude(rolling_sale), case when rolling_sale then sales else null end as rolling_sale"
)
rel = rel.sum(
    "rolling_sale",
    projected_columns="* exclude(rolling_sale)",
    window_spec="over (partition by id order by d rows between 1 preceding and current row) as rolling_sale",
)
┌───────┬───────┬───────┬──────────────┐
│  id   │   d   │ sales │ rolling_sale │
│ int64 │ int64 │ int64 │    int128    │
├───────┼───────┼───────┼──────────────┤
│     1 │     1 │     1 │            1 │
│     1 │     2 │     4 │            5 │
│     1 │     3 │     2 │            6 │
│     2 │     1 │     3 │            3 │
│     2 │     2 │     1 │            4 │
│     2 │     3 │     2 │            3 │
└───────┴───────┴───────┴──────────────┘