Python Polars: unable to cast an f64 column to str and aggregate it into a list

Question

I am working with a DataFrame like this:

import polars as pl

df = pl.DataFrame({
    'order': [38681.0, 38692.0, 38680.0, 38693.0],
    'shipto': ["471433", "239269", "471433", "239269"],
    'value': [10, 20, 30, None]
})

I need to group by the 'shipto' column, sum 'value', and aggregate 'order' into a list. I have tried a few approaches but cannot get it to work.

The basic script:

df = (df
     .with_columns([
         pl.col('order').cast(pl.Utf8), 
         pl.col(pl.Float64).fill_null(0)
     ])
   .groupby('shipto')
   .agg([
        pl.col('order').apply(lambda x: str(x)).alias('order_list'),
        pl.sum('value')
    ])
    )

This returns:

shipto	order_list	value
str	str	i64
471433	shape: (2,)	40
	Series: '' [f64]
	[
	...
239269	shape: (2,)	20
	Series: '' [f64]
	[
	...

What I want in the 'order_list' column is either ([38681.0, 38680.0], [38692.0, 38693.0]) or (['38681.0', '38680.0'], ['38692.0', '38693.0']).

My guess is that the 'order' column needs to be converted from f64 values to strings (Utf8), but I cannot get that to work.

Variants of the line 'pl.col('order').cast(pl.Utf8), #.cast(pl.Float64)' that I have tried so far:

pl.col('order').cast(pl.Float64).cast(pl.Utf8),

pl.col('order').cast(pl.Int64).cast(pl.Utf8),

pl.col('order').map(lambda x: str(x)),

pl.col('order').apply(lambda x: str(int(x))),

pl
  .when(pl.col('order').is_null())
  .then(pl.lit(''))
  .otherwise(pl.col('order').cast(pl.Float64).cast(pl.Utf8)).alias('order'),

I am surely making some basic mistake, but I am still struggling with this. Any help would be appreciated.

Answer 1

If you do not specify an aggregation function inside agg, the values of each group are collected into a list.

This may be what you are looking for:

df.group_by('shipto').agg(
    'order',
    pl.sum('value')
)

# Result

shape: (2, 3)
┌────────┬────────────────────┬───────┐
│ shipto ┆ order              ┆ value │
│ ---    ┆ ---                ┆ ---   │
│ str    ┆ list[f64]          ┆ i64   │
╞════════╪════════════════════╪═══════╡
│ 471433 ┆ [38681.0, 38680.0] ┆ 40    │
│ 239269 ┆ [38692.0, 38693.0] ┆ 20    │
└────────┴────────────────────┴───────┘
