How to make a new pandas DataFrame whose columns are the old index_column pairs


I have two pandas DataFrames:

import pandas as pd

object_1df = pd.DataFrame([['a', 1], ['b', 2]],
                          columns=['letter', 'number'])
object_2df = pd.DataFrame([['b', 3, 'cat'], ['c', 4, 'dog']],
                          columns=['letter', 'number', 'animal'])

[image: screenshot of the two DataFrames]

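I need to build a one-row catalog entry for each DataFrame, with as many columns as the DataFrame has elements. The final result should have one row per df and contain the following columns: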

[image: example columns]

I tried this very ugly approach:

objects = [object_1df, object_2df]

catalog = pd.DataFrame()
for objectdf in objects:
    object_row = pd.DataFrame()
    for letter in objectdf['letter']:
        for column in objectdf.columns:
            object_row[f'{letter}_{column}'] = objectdf[column].loc[objectdf['letter']==letter]
    catalog = pd.concat([catalog, object_row], ignore_index=True)
display(catalog)

It outputs an undesired result:

[image: undesired result]

This result essentially only fills in the values from the first row of each df and gives NaN everywhere else. What is the correct way to do this?

python pandas dataframe
1 Answer

Answering my own question: flattening each df like this gives the desired result:

import pandas as pd

object_1df = pd.DataFrame([['a', 1], ['b', 2]],
                          columns=['letter', 'number'])
object_2df = pd.DataFrame([['b', 3, 'cat'], ['c', 4, 'dog']],
                          columns=['letter', 'number', 'animal'])
objects = [object_1df, object_2df]

rows = []
for df in objects:
    # Index by letter without mutating the original df, so the cell can be re-run
    indexed = df.set_index('letter')
    # One entry per (letter, column) pair, e.g. 'a_number'
    flattened_data = {f'{index}_{col}': indexed.loc[index, col]
                      for index in indexed.index for col in indexed.columns}
    flattened_df = pd.DataFrame([flattened_data])
    display(flattened_df)
    rows.append(flattened_df)

# Concatenate once at the end; columns missing from a row become NaN
catalog = pd.concat(rows, ignore_index=True)
display(catalog)

This yields the desired result.
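
As for why the attempt in the question filled in only the first row's values: this is most likely index alignment. Assigning a Series to a DataFrame column aligns on index labels, and an empty DataFrame takes its index from the first Series assigned to it (label 0 here). Values selected from later rows carry label 1 and up, so they have nothing to align with and come out as NaN. A minimal sketch of that behavior, with column names made up to mirror the example:

```python
import pandas as pd

row = pd.DataFrame()

# First assignment into an empty frame: the frame adopts the Series' index, [0].
row["a_number"] = pd.Series([1], index=[0])

# This Series carries label 1, which is absent from row's index,
# so alignment leaves NaN in the single existing row.
row["b_number"] = pd.Series([2], index=[1])

# Dropping the index (here via .to_numpy()) assigns by position instead.
row["b_number_fixed"] = pd.Series([2], index=[1]).to_numpy()

print(row)
```

Building a plain dict of scalars, as in the answer above, sidesteps alignment entirely because no index labels are involved.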
