How to reorganize repeated answer columns in a Polars DataFrame


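I have a Polars DataFrame containing several questions and answers. The problem is that each answer is stored in its own column, so there is a lot of redundant information. I would therefore like to have just one column for the question and one for the answer.

Here is a sample of the data: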

import polars as pl

data = {
    "ID" : [1,1,1],
    "Question" : ["A","B","C"],
    "Answer A" : ["Answer A", "Answer A", "Answer A"],
    "Answer B" : ["Answer B", "Answer B", "Answer B"],
    "Answer C" : ["Answer C", "Answer C", "Answer C"]
}

df = pl.DataFrame(data)
df

My approach is to create separate filtered DataFrames and then concatenate them, but I would like a more elegant way to solve this.

My current approach:

A_df = (
    df
    .drop(["Answer B","Answer C"])
    .filter(pl.col("Question") == "A")
    .rename({"Answer A" : "Answer"})
)

B_df = (
    df
    .drop(["Answer A","Answer C"])
    .filter(pl.col("Question") == "B")
    .rename({"Answer B" : "Answer"})
)

C_df = (
    df
    .drop(["Answer A","Answer B"])
    .filter(pl.col("Question") == "C")
    .rename({"Answer C" : "Answer"})
)

df_final = pl.concat([A_df,B_df,C_df])
python dataframe python-polars
1 Answer

If the number of answer columns is limited, this can be done with a simple pl.when().then() chain:

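df2 = df.select(
    "ID",
    "Question",
    # pl.col(...) makes it explicit that each branch reads a column,
    # rather than relying on how a bare string is interpreted by then()
    pl.when(pl.col("Question") == "A")
    .then(pl.col("Answer A"))
    .when(pl.col("Question") == "B")
    .then(pl.col("Answer B"))
    .when(pl.col("Question") == "C")
    .then(pl.col("Answer C"))
    .alias("Answer"),
)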

This sets Answer to the value from the "Answer A" column when Question == "A", and so on for B and C. Rows that match none of the conditions get null.
