Getting the proportions of one-hot encoded values while aggregating - Pandas - Powered by Discuz!

Question

I have a DataFrame like this:

    Date        Value

0   2019-03-01  0
1   2019-04-01  1
2   2019-09-01  0
3   2019-10-01  1
4   2019-12-01  0
5   2019-12-20  0
6   2019-12-20  0
7   2020-01-01  0

Now I need to group the rows by quarter and get the proportions of 1s and 0s in each quarter, so that my final output looks like this:

  Date          Value1 Value0

0 2019-03-31    0      1     
1 2019-06-30    1      0
2 2019-09-30    0      1
3 2019-12-31    0.25   0.75
4 2020-03-31    0      1

I tried the code below, but it does not seem to work.

def custom_resampler(array):
    import numpy as np

    return array/np.sum(array)

>>df.set_index('Date').resample('Q')['Value'].apply(custom_resampler)

Is there a "pandastic" (idiomatic pandas) way to get the output I want?

Answer 1

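Resample by quarter, take the value counts, and unstack. Next, rename the columns using the name attribute of the columns. Finally, divide the values in each row by that row's sum.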

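import pandas as pd

# Copy the table from the question to the clipboard first;
# columns are separated by two or more spaces.
df = pd.read_clipboard(sep=r'\s{2,}', parse_dates=['Date'], engine='python')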


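res = (df
       # quarter-end bins; on pandas 2.2+ the alias is spelled "QE"
       .resample(rule="Q",on="Date")
       .Value
       # count how many 0s and 1s fall in each quarter
       .value_counts()
       # one column per distinct value; missing combinations become 0
       .unstack("Value",fill_value=0)
      )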


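# columns 0 and 1 become "Value0" and "Value1"
res.columns = [f"{res.columns.name}{ent}" for ent in res.columns]

# turn counts into proportions by dividing each row by its total
res = res.div(res.sum(axis=1),axis=0)
res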


          Value0   Value1
Date        
2019-03-31  1.00    0.00
2019-06-30  0.00    1.00
2019-09-30  1.00    0.00
2019-12-31  0.75    0.25
2020-03-31  1.00    0.00
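
As a possible alternative (not part of the answer above): because Value only contains 0 and 1, the mean of each quarter is already the proportion of 1s, and the proportion of 0s is one minus that. The sketch below rebuilds the question's data inline instead of reading the clipboard, and it derives the quarter-end date from a quarterly period rather than from resample. The names `quarter_end`, `share_of_ones`, and `out` are made up for illustration. Unlike resample, this grouping would not emit a row for a quarter that contains no data.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-03-01", "2019-04-01", "2019-09-01", "2019-10-01",
        "2019-12-01", "2019-12-20", "2019-12-20", "2020-01-01",
    ]),
    "Value": [0, 1, 0, 1, 0, 0, 0, 0],
})

# Map every date to the last day of its calendar quarter
quarter_end = df["Date"].dt.to_period("Q").dt.end_time.dt.normalize()

# For a 0/1 column, the group mean is the share of 1s
share_of_ones = df.groupby(quarter_end)["Value"].mean()

out = pd.DataFrame({
    "Value1": share_of_ones,
    "Value0": 1 - share_of_ones,
})
print(out)
```

This shortcut only applies when the column is strictly binary; with more than two distinct values, counting and normalizing as in the answer is the more general route.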
