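Hi, I would like to know whether the following calculation can be done on a pandas DataFrame in Python. I have a DataFrame with the following columns: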
city zone b_s total
0 cardiff 1 buy 1000
1 cardiff 1 sell 500
2 cardiff 2 buy 100
3 bristol 1 buy 200
4 bristol 1 sell 100
What I need, if possible, is this: when city and zone match and there is both a buy and a sell for that pair, do a calculation. In the example above, I would like to do the calculation only on Cardiff zone 1 and Bristol zone 1 (Cardiff zone 2 has only one line). The calculation nets the two lines: if there are more sells than buys, I want only the sell line, with its total set to total sell minus total buy.
The output would be:
Cardiff | 1 | Buy | 500
Cardiff | 2 | Buy | 100
Bristol | 1 | Buy | 100
Here is what I think you are trying to achieve:
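import numpy as np
import pandas as pd
# df is the DataFrame from the question.
# Make sell totals negative so that summing a group nets buys against sells.
df.loc[df['b_s'] == 'sell', 'total'] *= -1
# Sum the signed totals for each (city, zone) pair.
df = df.groupby(['city', 'zone'], as_index=False)['total'].sum()
# Label each net position by its sign; a negative total means a net sell.
df['b_s'] = np.where(df['total'] >= 0, 'buy', 'sell')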
# city zone total b_s
# 0 bristol 1 100 buy
# 1 cardiff 1 500 buy
# 2 cardiff 2 100 buy
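For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same sign-then-sum idea. The sample frame is copied from the question; the helper name net_positions is hypothetical, and the final abs() step is an assumption on my part: it reports a net sell as a positive amount (total sell minus total buy), which is how I read the question. A net of exactly zero is labelled 'buy', following the >= 0 test above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "city": ["cardiff", "cardiff", "cardiff", "bristol", "bristol"],
    "zone": [1, 1, 2, 1, 1],
    "b_s": ["buy", "sell", "buy", "buy", "sell"],
    "total": [1000, 500, 100, 200, 100],
})


def net_positions(frame):
    """Hypothetical helper: net buys against sells per (city, zone)."""
    # Keep buy totals as they are and negate sell totals.
    signed = frame["total"].where(frame["b_s"] == "buy", -frame["total"])
    # assign() returns a new frame, so the caller's data is not modified.
    net = (
        frame.assign(total=signed)
        .groupby(["city", "zone"], as_index=False)["total"]
        .sum()
    )
    # Label by the sign of the net, then report the magnitude
    # (assumption: a net sell should be shown as a positive amount).
    net["b_s"] = net["total"].ge(0).map({True: "buy", False: "sell"})
    net["total"] = net["total"].abs()
    return net[["city", "zone", "b_s", "total"]]


result = net_positions(df)
print(result)
```

Unlike the in-place version above, this leaves the original df untouched, so it can be re-run safely. Pairs with only one line (such as Cardiff zone 2) pass through unchanged because a group of one sums to itself.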