Pandas: calculate the mean, leaving out the row's own value

Question

I want to calculate the mean by group, leaving out the value of the row itself.

import pandas as pd

d = {'col1': ["a", "a", "b", "a", "b", "a"], 'col2': [0, 4, 3, -5, 3, 4]}
df = pd.DataFrame(data=d)

I know how to return the mean by group:

df.groupby('col1').agg({'col2': 'mean'})

Which returns:

Out[247]: 
      col2
col1      
a     0.75
b     3.00

But what I want is the mean by group, leaving out the row's own value. For example, for the first row:

df.query('col1 == "a"')[1:4].mean()

Which returns:

Out[251]: 
col2    1.0
dtype: float64

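Edit: The expected output is a data frame in the same format as df above, with a column mean_excl_own that holds the mean of all other members of the group, excluding the row's own value.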

Answer 1

You can group by col1 and transform with 'mean', then subtract the given row's value from the mean:

df['col2'] = df.groupby('col1').col2.transform('mean').sub(df.col2)
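
Note that the expression above yields the group mean minus the row's value, which is not the same as the mean of the other rows in the group. A minimal sketch of the leave-one-out mean using the same transform approach follows. It assumes every group has at least two rows; the column name mean_excl_own comes from the question, while group_sum and group_count are illustrative names.

```python
import pandas as pd

d = {'col1': ["a", "a", "b", "a", "b", "a"], 'col2': [0, 4, 3, -5, 3, 4]}
df = pd.DataFrame(data=d)

grouped = df.groupby('col1')['col2']
group_sum = grouped.transform('sum')      # sum of the row's group, aligned to each row
group_count = grouped.transform('count')  # size of the row's group, aligned to each row

# Remove the row's own value from the group sum, then average over the remaining rows.
df['mean_excl_own'] = (group_sum - df['col2']) / (group_count - 1)
print(df)
```

Because transform returns a result aligned to the original index, no merge is needed and the rows stay in their original order.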

Answer 2

Thanks for your input. I ended up using the method linked by @VnC.

Here is how I solved it:

import pandas as pd

d = {'col1': ["a", "a", "b", "a", "b", "a"], 'col2': [0, 4, 3, -5, 3, 4]}
df = pd.DataFrame(data=d)

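# Per-group mean and row count of col2, merged back onto every row
group_summary = df.groupby('col1', as_index=False)['col2'].agg(['mean', 'count'])
df = pd.merge(df, group_summary, on='col1')

# count * mean is the group total; subtracting the row's own value leaves the sum of the others
df['other_sum'] = df['count'] * df['mean'] - df['col2']
df['result'] = df['other_sum'] / (df['count'] - 1)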

Looking at the final result:

df['result']

Which prints:

Out: 
0    1.000000
1   -0.333333
2    2.666667
3   -0.333333
4    3.000000
5    3.000000
Name: result, dtype: float64
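
As a sanity check on the arithmetic, the same numbers can be reproduced by brute force: for each row, take the other rows of its group and average them. This is a slow sketch meant for verification only, and the helper name mean_of_others is hypothetical. The check keeps the original row order of df, so the same values may appear in a different order than in the output above.

```python
import pandas as pd

d = {'col1': ["a", "a", "b", "a", "b", "a"], 'col2': [0, 4, 3, -5, 3, 4]}
df = pd.DataFrame(data=d)

def mean_of_others(group: pd.Series) -> pd.Series:
    """For each element, average all other elements of the same group."""
    # drop(idx) removes only the row's own entry before taking the mean.
    return pd.Series(
        [group.drop(idx).mean() for idx in group.index],
        index=group.index,
        dtype='float64',
    )

# Loop over the groups and stitch the per-group results back into df's row order.
parts = [mean_of_others(g) for _, g in df.groupby('col1')['col2']]
check = pd.concat(parts).reindex(df.index)
print(check)
```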

Edit: I had some trouble with the column names earlier, but I fixed it using this answer.
