Truncating string length in Polars to stay under the Excel character limit

Question

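I can currently truncate the str values of a single column so they stay below the Excel character limit (if I don't, xlsxwriter leaves the affected cell blank), but I'd like to do this for every column where that condition is met, possibly using pl.all().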

So far, this is what I have to apply to each column whose values exceed the limit:

df.with_columns(pl.when(pl.col('col1').str.len_chars() >= 32_767)
                  .then(pl.lit('Too many to display - ') + pl.col('col1').str.slice(0, 10_000))
                  .otherwise(pl.col('col1'))
                )

Answer 1

If you use pl.all(), adding .name.keep() should work. Here is some sample data:

df = pl.from_repr("""
┌─────────┬─────────┬────────┬──────┐
│ col1    ┆ col2    ┆ col3   ┆ col4 │
│ ---     ┆ ---     ┆ ---    ┆ ---  │
│ str     ┆ str     ┆ str    ┆ i64  │
╞═════════╪═════════╪════════╪══════╡
│ aaaaaaa ┆ b       ┆ c      ┆ 1    │
│ a       ┆ bb      ┆ cccccc ┆ 2    │
│ aa      ┆ bbbbbbb ┆ cccc   ┆ 3    │
└─────────┴─────────┴────────┴──────┘
""")

You can use pl.col(pl.String) to select only the string columns (in case there are other types):

df.with_columns(
   pl.when(pl.col(pl.String).str.len_chars() >= 3)
     .then('[*] ' + pl.col(pl.String).str.slice(0, 3))
     .otherwise(pl.col(pl.String))
     .name.keep()
)

shape: (3, 4)
┌─────────┬─────────┬─────────┬──────┐
│ col1    ┆ col2    ┆ col3    ┆ col4 │
│ ---     ┆ ---     ┆ ---     ┆ ---  │
│ str     ┆ str     ┆ str     ┆ i64  │
╞═════════╪═════════╪═════════╪══════╡
│ [*] aaa ┆ b       ┆ c       ┆ 1    │
│ a       ┆ bb      ┆ [*] ccc ┆ 2    │
│ aa      ┆ [*] bbb ┆ [*] ccc ┆ 3    │
└─────────┴─────────┴─────────┴──────┘
