I have the following DataFrame in pandas:
Date tank hose quantity count set flow
01-01-2018 1 1 20 100 211 12.32
01-01-2018 1 2 20 200 111 22.32
01-01-2018 1 3 20 200 123 42.32
02-01-2018 1 1 10 100 211 12.32
02-01-2018 1 2 10 200 111 22.32
02-01-2018 1 3 10 200 123 42.32
I want to express quantity and count as percentages of their totals within each group of Date and tank. My desired DataFrame:
Date tank hose quantity count set flow perc_quant perc_count
01-01-2018 1 1 20 100 211 12.32 33.33 20
01-01-2018 1 2 20 200 111 22.32 33.33 40
01-01-2018 1 3 20 200 123 42.32 33.33 40
02-01-2018 1 1 10 100 211 12.32 25 20
02-01-2018 1 2 20 200 111 22.32 50 40
02-01-2018 1 3 10 200 123 42.32 25 40
This is what I am currently trying:
test = df.groupby(['Date','tank']).apply(lambda x:
100 * x / float(x.sum()))
Use GroupBy.transform with a lambda function, then add_prefix and join the result back to the original DataFrame:
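# Percentage of each value relative to its group's total
f = lambda x: 100 * x / float(x.sum())
# Select the columns with a list (double brackets); tuple-style ['quantity','count'] is rejected by recent pandas versions
df = df.join(df.groupby(['Date','tank'])[['quantity','count']].transform(f).add_prefix('perc_'))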
Or specify the new column names explicitly:
df[['perc_quantity','perc_count']] = (df.groupby(['Date','tank'])[['quantity','count']]
                                        .transform(f))
print(df)
Date tank hose quantity count set flow perc_quantity \
0 01-01-2018 1 1 20 100 211 12.32 33.333333
1 01-01-2018 1 2 20 200 111 22.32 33.333333
2 01-01-2018 1 3 20 200 123 42.32 33.333333
3 02-01-2018 1 1 10 100 211 12.32 33.333333
4 02-01-2018 1 2 10 200 111 22.32 33.333333
5 02-01-2018 1 3 10 200 123 42.32 33.333333
perc_count
0 20.0
1 40.0
2 40.0
3 20.0
4 40.0
5 40.0
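For reference, here is a minimal self-contained sketch of the same idea without a lambda, assuming the sample data from the question. It uses transform('sum') to broadcast each group's total back onto the original rows and then divides element-wise. The names cols, totals, perc and result are only illustrative and do not appear in the original code.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["01-01-2018"] * 3 + ["02-01-2018"] * 3,
    "tank": [1] * 6,
    "hose": [1, 2, 3, 1, 2, 3],
    "quantity": [20, 20, 20, 10, 10, 10],
    "count": [100, 200, 200, 100, 200, 200],
    "set": [211, 111, 123, 211, 111, 123],
    "flow": [12.32, 22.32, 42.32, 12.32, 22.32, 42.32],
})

# Columns to convert to percentages (illustrative name).
cols = ["quantity", "count"]

# Group totals, aligned to the original row index.
totals = df.groupby(["Date", "tank"])[cols].transform("sum")

# Share of each row in its group total, in percent.
perc = (100 * df[cols] / totals).add_prefix("perc_")

# Attach the new columns to the original frame.
result = df.join(perc)
print(result)
```

Because transform returns a frame with the same index as df, the division and the join line up row by row without any merge key.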