Is there a good way to do "zfill" in Polars?

Question

Is it proper to use `pl.Expr.map_elements` to throw the Python function `zfill` at my data? I'm not looking for a performant solution.

pl.col("column").map_elements(lambda x: str(x).zfill(5))

Is there a better way?

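As a follow-up, if you have some insight, I'd be happy to chat on Discord about what a good implementation could look like (assuming one doesn't currently exist).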

Answer 1

Edit: Polars `0.13.43` and later

In version `0.13.43` and later, Polars has a `str.zfill` expression to accomplish this. `str.zfill` will be faster than the answer below, so `str.zfill` should be preferred.

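From your question (`lambda x: str(x).zfill(5)`), I'm assuming that you are starting with a column of integers.

If so, here's an approach that adheres to pandas rather strictly: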

import polars as pl
df = pl.DataFrame({"num": [-10, -1, 0, 1, 10, 100, 1000, 10000, 100000, 1000000, None]})

z = 5
df.with_columns(
    pl.when(pl.col("num").cast(pl.String).str.len_chars() > z)
    .then(pl.col("num").cast(pl.String))
    .otherwise(pl.concat_str(pl.lit("0" * z), pl.col("num").cast(pl.String)).str.slice(-z))
    .alias("result")
)

shape: (11, 2)
┌─────────┬─────────┐
│ num     ┆ result  │
│ ---     ┆ ---     │
│ i64     ┆ str     │
╞═════════╪═════════╡
│ -10     ┆ 00-10   │
│ -1      ┆ 000-1   │
│ 0       ┆ 00000   │
│ 1       ┆ 00001   │
│ 10      ┆ 00010   │
│ …       ┆ …       │
│ 1000    ┆ 01000   │
│ 10000   ┆ 10000   │
│ 100000  ┆ 100000  │
│ 1000000 ┆ 1000000 │
│ null    ┆ null    │
└─────────┴─────────┘

Comparing the output to pandas:

df.with_columns(pl.col('num').cast(pl.String)).get_column('num').to_pandas().str.zfill(z)

0       00-10
1       000-1
2       00000
3       00001
4       00010
5       00100
6       01000
7       10000
8      100000
9     1000000
10       None
dtype: object
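For a single value, the `when`/`then`/`otherwise` expression above amounts to the logic sketched below in plain Python. The function name `zfill_by_slicing` is made up for illustration and is not part of Polars or pandas.

```python
def zfill_by_slicing(value, width):
    """Pad one value the way the when/then/otherwise expression does."""
    if value is None:
        return None  # nulls pass through unchanged
    text = str(value)
    if len(text) > width:
        return text  # longer strings are kept whole, not truncated
    # Prepend `width` zeros, then keep only the last `width` characters.
    return ("0" * width + text)[-width:]


values = [-10, -1, 0, 1, 10, 100, 1000, 10000, 100000, None]
padded = [zfill_by_slicing(v, 5) for v in values]
print(padded)
```

The length check matters because slicing alone would cut `100000` down to its last five characters. Note also that this treats a minus sign as an ordinary character, which is what produces `00-10` in the output above; Python's built-in `"-10".zfill(5)` returns `-0010` instead.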

If you are starting with strings, you can simplify the code by removing every call to `cast`.

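Edit: On a dataset with 550 million records, this took about 50 seconds on my machine. (Note: this runs single-threaded.)

Edit 2: To shave off some time, you can use the following: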

result = df.lazy().with_columns(
    pl.col('num').cast(pl.String).alias('tmp')
).with_columns(
    pl.when(pl.col("tmp").str.len_chars() > z)
    .then(pl.col("tmp"))
    .otherwise(pl.concat_str(pl.lit("0" * z), pl.col("tmp")).str.slice(-z))
    .alias("result")
).drop('tmp').collect()

But it didn't save that much time.
