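I have a DataFrame that I am trying to convert to long format. The runnable code is below, but the output is not laid out the way I want: I would like Amount and Quantity to appear one below the other.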
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'Year':[2022,2022,2023,2023,2024,2024],
    'Month':[1,12,11,12,1,1],
    'Code':[None,'John Johnson',np.nan,'John Smith','Mary Williams','ted bundy'],
    'Unit Price':[np.nan,200,None,56,75,65],
    'Quantity':[1500, 140000, 1400000, 455, 648, 759],
    'Amount':[100, 10000, 100000, 5, 48, 59],
    'Invoice':['soccer','basketball','baseball','football','baseball','ice hockey'],
    'energy':[100.,100,100,54,98,3],
    'Category':['alpha','bravo','kappa','alpha','bravo','bravo']
})
index_to_use = ['Category','Code','Invoice','Unit Price']
values_to_use = ['Amount','Quantity']
columns_to_use = ['Year','Month']
df2 = df.pivot_table(index=index_to_use,
                     values=values_to_use,
                     columns=columns_to_use)
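I want the output below, so that Quantity is not to the right of Amount but underneath it. I can then sort it so that the Amount and Quantity rows for the same (Category, Code, Invoice, Unit Price) appear one below the other: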
                                              2022
                                                12
category code invoice Unit Price
                      200        Amount      10000
                      200        Quantity   140000
The idea is that the sales Amount and Quantity for the same item (with the same Unit Price) appear in consecutive rows.
You can stack your result, moving the outermost column level (Amount/Quantity) into the row index:
out = df2.stack(level=0)
Output:
Year                                                      2022    2023   2024
Month                                                       12      12      1
Category Code          Invoice    Unit Price
alpha    John Smith    football   56.0       Amount        NaN     5.0    NaN
                                             Quantity      NaN   455.0    NaN
bravo    John Johnson  basketball 200.0      Amount    10000.0     NaN    NaN
                                             Quantity 140000.0     NaN    NaN
         Mary Williams baseball   75.0       Amount        NaN     NaN   48.0
                                             Quantity      NaN     NaN  648.0
         ted bundy     ice hockey 65.0       Amount        NaN     NaN   59.0
                                             Quantity      NaN     NaN  759.0
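For reference, here is a self-contained sketch that puts the question's data and the stack step together, then sorts the rows so that each item's Amount and Quantity sit next to each other. The level name "Measure" is only an illustrative label I made up for the new index level (pandas leaves it unnamed). Note that the two rows whose Code or Unit Price is missing do not appear, since pivot_table drops groups with missing index keys.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Year': [2022, 2022, 2023, 2023, 2024, 2024],
    'Month': [1, 12, 11, 12, 1, 1],
    'Code': [None, 'John Johnson', np.nan, 'John Smith', 'Mary Williams', 'ted bundy'],
    'Unit Price': [np.nan, 200, None, 56, 75, 65],
    'Quantity': [1500, 140000, 1400000, 455, 648, 759],
    'Amount': [100, 10000, 100000, 5, 48, 59],
    'Invoice': ['soccer', 'basketball', 'baseball', 'football', 'baseball', 'ice hockey'],
    'energy': [100., 100, 100, 54, 98, 3],
    'Category': ['alpha', 'bravo', 'kappa', 'alpha', 'bravo', 'bravo'],
})

index_to_use = ['Category', 'Code', 'Invoice', 'Unit Price']
values_to_use = ['Amount', 'Quantity']
columns_to_use = ['Year', 'Month']

# Wide table: columns are (value name, Year, Month)
df2 = df.pivot_table(index=index_to_use,
                     values=values_to_use,
                     columns=columns_to_use)

# Move the outermost column level (Amount/Quantity) into the row index
out = df2.stack(level=0)

# "Measure" is a hypothetical name for the otherwise unnamed new level
out.index.names = [*index_to_use, 'Measure']

# Sort so that Amount and Quantity of the same item are on consecutive rows
out = out.sort_index()

print(out)
```

Sorting explicitly should keep the row order the same regardless of whether the installed pandas version sorts inside stack itself.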