Getting the size of a groupby operation in Pandas

Question

I have been performing a groupby operation on a dataframe that aggregates columns together based on the 'Name' column:

Name   As   Bs   Cs   Note
Mark   3    4    7    Good
Luke   2    1    12   Well
Mark   5    6    8    Ok
John   1    18   3    Great

So in this case, the rows with 'Mark' are aggregated over the As, Bs and Cs columns using this code:

temp_df = temp_df.groupby(['Name'], as_index=False).agg({'As': np.sum, 'Bs': np.sum,'Cs': np.sum})

One thing I need to add is a count of the number of rows that have the same value in 'Name'. That would give me output like this:

Name   As   Bs   Cs   Note    Count
Mark   8    10   15   Good    2
Luke   2    1    12   Well    1
John   1    18   3    Great   1

How do I modify the line of code above to do what I need?

Answer 1

Create the group and do the aggregation:

the_group = temp_df.groupby(['Name'], as_index=False)
temp_df = the_group.agg({'As': np.sum, 'Bs': np.sum,'Cs': np.sum})

Then compute the group sizes from the_group:

temp_df['count'] = the_group.count()['Note']

This gives:

   Name  Cs  As  Bs  count
0  John   3   1  18      1
1  Luke  12   2   1      1
2  Mark  15   8  10      2

Edit:

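As suggested in the comments, if the data contains NaN it is safer to use size(), because count() skips missing values while size() counts every row in the group: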

temp_df['count'] = the_group.size().reset_index()[0]
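In recent pandas versions, size() on a groupby created with as_index=False returns a DataFrame with a 'size' column rather than a Series, so the reset_index()[0] lookup above may fail there. A minimal sketch that avoids depending on that behavior, assuming the sample data from the question, is to group with the default index and let pandas align the sizes on 'Name':

```python
import pandas as pd

# Sample data copied from the question.
temp_df = pd.DataFrame({
    "Name": ["Mark", "Luke", "Mark", "John"],
    "As": [3, 2, 5, 1],
    "Bs": [4, 1, 6, 18],
    "Cs": [7, 12, 8, 3],
    "Note": ["Good", "Well", "Ok", "Great"],
})

# Group with the default as_index=True so every result is indexed by Name.
the_group = temp_df.groupby("Name")

# Sum the numeric columns per group.
result = the_group[["As", "Bs", "Cs"]].sum()

# size() returns a Series indexed by Name, so the assignment aligns on the index.
result["count"] = the_group.size()

# Turn Name back into an ordinary column.
result = result.reset_index()
print(result)
```

Because the assignment aligns on the 'Name' index, the counts land on the right rows regardless of the order in which the groups are sorted.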

Answer 2

Use first and size, then rename the resulting column with a dict:

temp_df = temp_df.groupby('Name', sort=False) \
                .agg({'As':np.sum,'Bs':np.sum,'Cs':np.sum,'Note':'first','Name':'size'}) \
                .rename(columns={'Name':'Count'}) \
                .reset_index() \
                .reindex(columns=temp_df.columns.tolist() + ['Count'])  # reindex_axis was removed in pandas 1.0
print(temp_df)
   Name  As  Bs  Cs   Note  Count
0  Mark   8  10  15   Good      2
1  Luke   2   1  12   Well      1
2  John   1  18   3  Great      1

Do not use count here; use only size or len.

See also: What is the difference between size and count in pandas?
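The difference matters as soon as a column contains missing values. The following sketch uses a small made-up frame (the NaN entries in 'Note' are an assumption added for illustration, not part of the question's data) to compare the two:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two of the Note values are missing.
df = pd.DataFrame({
    "Name": ["Mark", "Luke", "Mark", "John"],
    "Note": ["Good", np.nan, np.nan, "Great"],
})

grouped = df.groupby("Name")

# size() counts every row in each group, including rows with NaN.
sizes = grouped.size()

# count() counts only the non-null values of the selected column.
counts = grouped["Note"].count()

print(pd.DataFrame({"size": sizes, "count": counts}))
```

Here 'Mark' has two rows but only one non-null 'Note', so size() and count() disagree for that group, and 'Luke' gets a count of zero even though the group has one row.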

Answer 3

With modern pandas:

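# Named aggregation (pandas >= 0.25); a dict key must be an existing column, so 'count' is given as a keyword instead
temp_df.groupby('Name', as_index=False).agg(
    As=('As', 'sum'), Bs=('Bs', 'sum'), Cs=('Cs', 'sum'), count=('As', 'size'))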
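For completeness, here is a self-contained sketch that reproduces the exact output requested in the question, including the 'Note' column. It assumes pandas 0.25 or later (for named aggregation) and, as in Answer 2, keeps the first 'Note' of each group; the variable name result is chosen here for illustration:

```python
import pandas as pd

# Sample data copied from the question.
temp_df = pd.DataFrame({
    "Name": ["Mark", "Luke", "Mark", "John"],
    "As": [3, 2, 5, 1],
    "Bs": [4, 1, 6, 18],
    "Cs": [7, 12, 8, 3],
    "Note": ["Good", "Well", "Ok", "Great"],
})

result = (
    temp_df.groupby("Name", sort=False)  # sort=False keeps first-appearance order
    .agg(
        As=("As", "sum"),
        Bs=("Bs", "sum"),
        Cs=("Cs", "sum"),
        Note=("Note", "first"),   # first non-null Note in each group
        Count=("Note", "size"),   # number of rows in each group
    )
    .reset_index()
)
print(result)
```

Each keyword names an output column and maps it to a (source column, function) pair, so the output column order follows the keyword order and no separate rename step is needed.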
