Aggregating rows in a Pandas DataFrame based on two columns

Question (votes: -3, answers: 1)

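Hi, I would like to know whether the following calculation is possible on a pandas DataFrame in Python. I have a DataFrame with the following columns: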

      city  zone   b_s  total
0  cardiff     1   buy   1000
1  cardiff     1  sell    500
2  cardiff     2   buy    100
3  bristol     1   buy    200
4  bristol     1  sell    100

What I need, if possible, is this: when city and zone match and that pair has both a buy and a sell line, do a calculation. In the example above, the calculation would apply only to cardiff zone 1 and bristol zone 1 (cardiff zone 2 has only one line). The calculation nets the two lines against each other: if there are more sells than buys, I want to keep only the sell line, with its total set to total sell minus total buy.

The output would be:

cardiff | 1 | buy | 500

cardiff | 2 | buy | 100

bristol | 1 | buy | 100

python pandas
1 Answer
0 votes

Here is what I think you are trying to achieve:

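import numpy as np
import pandas as pd

# df is the DataFrame shown in the question.
# Make sell totals negative so that summing a group nets buys against sells.
df.loc[df['b_s'] == 'sell', 'total'] *= -1

# Sum the signed totals for each (city, zone) pair.
df = df.groupby(['city', 'zone'], as_index=False)['total'].sum()

# Label each pair by the sign of its net total.
# A net sell keeps a negative total here; use df['total'].abs() for a positive figure.
df['b_s'] = np.where(df['total'] >= 0, 'buy', 'sell')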

#       city  zone  total  b_s
# 0  bristol     1    100  buy
# 1  cardiff     1    500  buy
# 2  cardiff     2    100  buy
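
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same signed-sum idea. It rebuilds the sample data from the question, leaves the input frame unmodified, and reports the net as a positive number in either direction. The helper name `net_positions` is hypothetical and not part of pandas, and labelling a net of exactly zero as "buy" is an assumption carried over from the `>= 0` test above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "city": ["cardiff", "cardiff", "cardiff", "bristol", "bristol"],
    "zone": [1, 1, 2, 1, 1],
    "b_s": ["buy", "sell", "buy", "buy", "sell"],
    "total": [1000, 500, 100, 200, 100],
})


def net_positions(frame):
    """Hypothetical helper: net buys against sells per (city, zone) pair."""
    # Work on a copy so the caller's frame is not modified.
    signed = frame.copy()
    # Sells count as negative, buys as positive.
    signed["total"] = np.where(
        signed["b_s"] == "sell", -signed["total"], signed["total"]
    )
    # Sum the signed totals within each (city, zone) pair.
    net = signed.groupby(["city", "zone"], as_index=False)["total"].sum()
    # Label by the sign of the net, then report its magnitude.
    net["b_s"] = np.where(net["total"] >= 0, "buy", "sell")
    net["total"] = net["total"].abs()
    return net[["city", "zone", "b_s", "total"]]


result = net_positions(df)
print(result)
```

Pairs with only one line, such as cardiff zone 2, pass through unchanged because there is nothing to net against.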