Add a new row after the last row of each group in a DataFrame [duplicate]

Question

My original DataFrame looks like this:

import pandas as pd

List = [['2024-05-25', 'Group 1', 'Year 1', 23466882],
        ['2024-05-25', 'Group 1', 'Year 2', 458397284],
        ['2024-05-25', 'Group 1', 'Year 3', 2344545],
        ['2024-05-25', 'Group 2', 'Year 1', 6662345],
        ['2024-05-25', 'Group 2', 'Year 2', 46342],
        ['2024-05-25', 'Group 3', 'Year 1', 34234],
        ['2024-05-25', 'Group 3', 'Year 2', 45222]]
df = pd.DataFrame(List, columns=['Report_date', 'Product_group', 'Year', 'Sales'])

For each product group, if "Year 3" does not exist, a new row with Sales of 11000 should be added at the end of that group.

The output should look like this:

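My initial idea was to split the DataFrame into one sub-DataFrame per product group and add a new row whenever a sub-DataFrame has no Year 3 entry, but that approach does not seem optimal.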

Any comments are appreciated. Thank you in advance!

Answer 1

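If you only need to add the missing Year 3 row for each group, use pd.concat to append filtered rows: keep the first row of every group that has no Year 3 and assign it the new Year and Sales values: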

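# Groups that already have a 'Year 3' row
g = df.loc[df['Year'].eq('Year 3'), 'Product_group']

# Take the first row of each group without 'Year 3', overwrite Year and Sales,
# append those rows, then sort so each new row lands at the end of its group
out = (pd.concat([df,
                  df.loc[~df['Product_group'].isin(g)]
                    .drop_duplicates('Product_group')
                    .assign(Year='Year 3', Sales=11000)])
         .sort_values(['Product_group', 'Year'], ignore_index=True))
print(out)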
  Report_date Product_group    Year      Sales
0  2024-05-25       Group 1  Year 1   23466882
1  2024-05-25       Group 1  Year 2  458397284
2  2024-05-25       Group 1  Year 3    2344545
3  2024-05-25       Group 2  Year 1    6662345
4  2024-05-25       Group 2  Year 2      46342
5  2024-05-25       Group 2  Year 3      11000
6  2024-05-25       Group 3  Year 1      34234
7  2024-05-25       Group 3  Year 2      45222
8  2024-05-25       Group 3  Year 3      11000
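
The same idea can be wrapped in a small reusable function. The sketch below is only an illustration: the name add_missing_year and its parameters are hypothetical and not part of pandas or of the original answer. It also swaps the sort on Year for a stable sort on Product_group alone, on the assumption that rows should keep their original order within each group. Sorting on the Year strings is lexical, so a label such as 'Year 10' would sort before 'Year 2'.

```python
import pandas as pd


def add_missing_year(df, year="Year 3", sales=11000):
    """Append a (year, sales) row to every Product_group that lacks `year`.

    Hypothetical helper for illustration; column names follow the question.
    """
    # Groups that already contain the requested year.
    has_year = df.loc[df["Year"].eq(year), "Product_group"]

    # One template row per group that is missing the year. The remaining
    # columns (here Report_date) are copied from the group's first row.
    new_rows = (
        df.loc[~df["Product_group"].isin(has_year)]
        .drop_duplicates("Product_group")
        .assign(Year=year, Sales=sales)
    )

    # mergesort is stable: existing rows keep their original order inside
    # each group, and the appended row ends up last in its group.
    return pd.concat([df, new_rows]).sort_values(
        "Product_group", kind="mergesort", ignore_index=True
    )


data = [
    ["2024-05-25", "Group 1", "Year 1", 23466882],
    ["2024-05-25", "Group 1", "Year 2", 458397284],
    ["2024-05-25", "Group 1", "Year 3", 2344545],
    ["2024-05-25", "Group 2", "Year 1", 6662345],
    ["2024-05-25", "Group 2", "Year 2", 46342],
    ["2024-05-25", "Group 3", "Year 1", 34234],
    ["2024-05-25", "Group 3", "Year 2", 45222],
]
df = pd.DataFrame(data, columns=["Report_date", "Product_group", "Year", "Sales"])

out = add_missing_year(df)
print(out)
```

For the sample data this should give the same nine rows as the output above. Note that the stable sort assumes the input is already ordered the way you want inside each group.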
