Add characters to all string columns in a Polars DataFrame

Question

Given a DataFrame, I want to add extra characters to the front and back of every string.

import polars as pl
import polars.selectors as cs

df = pl.DataFrame(
    {
        "A": [1, 2, 3],
        "B": ["apple", "orange", "grape"],
        "C": ["a", "b", "c"],
    }
)

To append a string, I can do this:

df.with_columns(cs.string() + "||")

shape: (3, 3)
┌─────┬──────────┬─────┐
│ A   ┆ B        ┆ C   │
│ --- ┆ ---      ┆ --- │
│ i64 ┆ str      ┆ str │
╞═════╪══════════╪═════╡
│ 1   ┆ apple||  ┆ a|| │
│ 2   ┆ orange|| ┆ b|| │
│ 3   ┆ grape||  ┆ c|| │
└─────┴──────────┴─────┘

But when I try to prepend the same string as well, I get an error:

df.with_columns("||" + cs.string() + "||")

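ComputeError: the name 'literal' passed to `LazyFrame.with_columns` is duplicate

It's possible that multiple expressions are returning the same default column name. If this is the case, try renaming the columns with `.alias("new_name")` to avoid duplicate column names.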

Answer 1

I came up with two solutions.

Reverse, append, reverse, append

df.with_columns((cs.string().str.reverse() + "||").str.reverse() + "||")

Regex replace

df.with_columns(cs.string().str.replace("(.*)", "||${1}||"))

The regex replace approach is much faster than the reverse approach.

Both produce the desired result:

┌─────┬────────────┬───────┐
│ A   ┆ B          ┆ C     │
│ --- ┆ ---        ┆ ---   │
│ i64 ┆ str        ┆ str   │
╞═════╪════════════╪═══════╡
│ 1   ┆ ||apple||  ┆ ||a|| │
│ 2   ┆ ||orange|| ┆ ||b|| │
│ 3   ┆ ||grape||  ┆ ||c|| │
└─────┴────────────┴───────┘
