Polars aggregation warning when using when->then

Question

Consider the following:

In [9]: df
Out[9]: 
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 1   │
│ 2   ┆ 1   │
└─────┴─────┘

In [10]: df.group_by("a").agg(pl.when(pl.col("b") == 1).then(pl.col("b")))
The predicate '[(col("b")) == (1)]' in 'when->then->otherwise' is not a valid aggregation and might produce a different number of rows than the groupby operation would. This behavior is experimental and may be subject to change
Out[10]: 
shape: (2, 2)
┌─────┬───────────┐
│ a   ┆ b         │
│ --- ┆ ---       │
│ i64 ┆ list[i64] │
╞═════╪═══════════╡
│ 2   ┆ [1]       │
│ 1   ┆ [1]       │
└─────┴───────────┘

Is there anything to worry about here? when->then has to produce a value for every row, even if that value is null.

Answer 1

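The issue inside the aggregation stems from comparing the grouped column "b" with a literal and then replacing values with the grouped column "b". The rules for how this expression is vectorized within a group context have not been formalized yet, which is why the warning is issued.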

Applying the ternary expression in with_columns first and aggregating afterwards is more explicit (and easier to reason about):

import polars as pl

df = pl.from_repr("""shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 1   │
│ 2   ┆ 1   │
└─────┴─────┘
""")

(df.with_columns(
    pl.when(pl.col("b") == 1).then(pl.col("b"))
).group_by("a").all())

shape: (2, 2)
┌─────┬───────────┐
│ a   ┆ b         │
│ --- ┆ ---       │
│ i64 ┆ list[i64] │
╞═════╪═══════════╡
│ 1   ┆ [1]       │
│ 2   ┆ [1]       │
└─────┴───────────┘
