Add a total row to a Polars DataFrame (for a subset of columns)

Question

I have the following code:

import polars as pl

df = pl.DataFrame({
    'name':      ['CHECK', 'CASH', 'BAR', 'SET'],
    'category1': ['AM', 'EU', 'EU', 'AS'],
    'category2': ['CA', 'FR', 'DE', 'CX'],
    'quantity':  [100, -20, 10, -70],
    'exposure':  [11, -3, 2, 8]
})
FLT_COLS   = ['quantity', 'exposure']
OTHER_COLS = [c for c in df.columns if c not in FLT_COLS]

df_temp = df.select(pl.col(FLT_COLS)).sum()\
            .with_columns(pl.lit('TOTAL').alias(OTHER_COLS[0]))\
            .with_columns([pl.lit('').alias(c) for c in OTHER_COLS[1:]])[df.columns]

pl.concat([df, df_temp])

This gives me the output I want:

shape: (5, 5)
┌───────┬───────────┬───────────┬──────────┬──────────┐
│ name  ┆ category1 ┆ category2 ┆ quantity ┆ exposure │
│ ---   ┆ ---       ┆ ---       ┆ ---      ┆ ---      │
│ str   ┆ str       ┆ str       ┆ i64      ┆ i64      │
╞═══════╪═══════════╪═══════════╪══════════╪══════════╡
│ CHECK ┆ AM        ┆ CA        ┆ 100      ┆ 11       │
│ CASH  ┆ EU        ┆ FR        ┆ -20      ┆ -3       │
│ BAR   ┆ EU        ┆ DE        ┆ 10       ┆ 2        │
│ SET   ┆ AS        ┆ CX        ┆ -70      ┆ 8        │
│ TOTAL ┆           ┆           ┆ 20       ┆ 18       │
└───────┴───────────┴───────────┴──────────┴──────────┘

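That is, it adds a row containing the sums of a specific list of columns, `FLT_COLS`, labels the first of the other columns `TOTAL`, and puts `""` in the remaining columns that are not summed.

Is there a better way to add this row? My code feels clunky. I also don't like having to specify `[df.columns]` to reorder the columns, because that feels very inefficient.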

Answer 1

This may be clunkier overall, but it addresses two points: the label goes into the first column that is not in `FLT_COLS`, and it does not use `df.columns`.

pl.concat([
    df,
    df.select(pl.exclude(FLT_COLS)) # pick all columns that aren't FLT_COLS
      .select(
          pl.first() # this is the first column [that isn't in FLT_COLS]
          .first() # this is first row
          .str.replace(r".*","TOTAL")) # uses regex to replace the first value 
                                       # to TOTAL as workaround for dynamic alias
      .hstack( # add columns of sums
          df.select(pl.col(FLT_COLS).sum())
          )
], how='diagonal')  # 'diagonal' makes any extra columns default to None
                    # and orders the columns by the first df of the list

If you know that the first column overall is not in `FLT_COLS`, then you can do this instead:

pl.concat([
    df,
    df.select(
          pl.first().first().str.replace(r".*","TOTAL"),
          pl.col(FLT_COLS).sum()
          )
], how='diagonal')
