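I want to split a single df into multiple dfs by unique column values, collecting them in a dictionary. The code below shows how to do this with pandas. How can I do the same thing in Polars?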
import pandas as pd
#Favorite color of 10 people
df = pd.DataFrame({"Favorite_Color":["Blue","Yellow","Black","Red","Blue","Blue","Green","Red","Red","Blue"]})
#split df into many dfs by Favorite_Color using dict
dict_of_dfs={key: df.loc[value] for key, value in df.groupby(["Favorite_Color"]).groups.items()}
print(dict_of_dfs)
Polars has a DataFrame method for this purpose: partition_by. Use the as_dict keyword to get a dictionary of DataFrames.
df.partition_by("Favorite_Color", as_dict=True)
{('Blue',): shape: (4, 1)
┌────────────────┐
│ Favorite_Color │
│ --- │
│ str │
╞════════════════╡
│ Blue │
│ Blue │
│ Blue │
│ Blue │
└────────────────┘,
('Yellow',): shape: (1, 1)
┌────────────────┐
│ Favorite_Color │
│ --- │
│ str │
╞════════════════╡
│ Yellow │
└────────────────┘,
('Black',): shape: (1, 1)
┌────────────────┐
│ Favorite_Color │
│ --- │
│ str │
╞════════════════╡
│ Black │
└────────────────┘,
('Red',): shape: (3, 1)
┌────────────────┐
│ Favorite_Color │
│ --- │
│ str │
╞════════════════╡
│ Red │
│ Red │
│ Red │
└────────────────┘,
('Green',): shape: (1, 1)
┌────────────────┐
│ Favorite_Color │
│ --- │
│ str │
╞════════════════╡
│ Green │
└────────────────┘}