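I am trying to create a variable based on a contract's number of years and its scoring. This two-way table holds the two main variables, with scoring in the rows and years in the columns:
scoring | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |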
---|---|---|---|---|---|---|---|---|---|---|---|
9.3 | 0.00805% | 0.0283431% | 0.0283431% | 0.0464484% | 0.0464484% | 0.0464484% | 0.1196343% | 0.1196343% | 0.1562273% | 0.1562273% | 0.2595363% |
9.2 | 0.0102% | 0.0102% | 0.0476431% | 0.0875944% | 0.0875944% | 0.0875944% | 0.0875944% | 0.1434739% | 0.1434739% | 0.2072495% | 0.2072495% |
9.1 | 0.0383297% | 0.0835863% | 0.0835863% | 0.0835863% | 0.0835863% | 0.0835863% | 0.1810753% | 0.1810753% | 0.2259184% | 0.2259184% | 0.2259184% |
Expected result:
I need to perform the same operation on this dataframe:
import pandas as pd

df = pd.DataFrame({
    'id': [80887618, 80887619, 80887620, 80887621],
    'years': [9, 1, 5, 12],
    'scoring': [9.3, 9.2, 9.1, 9.3],
})
IIUC, you can unstack the dataframe that holds the percentages and then merge it into df:
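# Reshape the wide table into one row per (years, scoring) pair.
percentages = (
    percentages.set_index("scoring")
    .unstack()
    .reset_index(name="data")
    .rename(columns={"level_0": "years"})
)
# A left merge keeps every row of df; pairs missing from the table get NaN.
df = df.merge(percentages, on=["years", "scoring"], how="left")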
         id  years  scoring        data
0  80887618      9      9.3  0.1562273%
1  80887619      1      9.2     0.0102%
2  80887620      5      9.1  0.0835863%
3  80887621     12      9.3         NaN
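For anyone who wants to run this end to end, here is a minimal self-contained sketch. The thread never shows how `percentages` is built, so the construction below (and the names `rows`, `year_columns`, `lookup` and `result`) is an assumption: it rebuilds the table from the question with integer year columns 1 to 11. The `astype` step is an extra guard I added in case the year headers were read as strings, which would otherwise keep the merge keys from matching.

```python
import pandas as pd

# Assumption: `percentages` is rebuilt here from the table in the question,
# with integer year columns 1..11 and one row per scoring value.
year_columns = list(range(1, 12))
rows = {
    9.3: ["0.00805%", "0.0283431%", "0.0283431%", "0.0464484%", "0.0464484%",
          "0.0464484%", "0.1196343%", "0.1196343%", "0.1562273%", "0.1562273%",
          "0.2595363%"],
    9.2: ["0.0102%", "0.0102%", "0.0476431%", "0.0875944%", "0.0875944%",
          "0.0875944%", "0.0875944%", "0.1434739%", "0.1434739%", "0.2072495%",
          "0.2072495%"],
    9.1: ["0.0383297%", "0.0835863%", "0.0835863%", "0.0835863%", "0.0835863%",
          "0.0835863%", "0.1810753%", "0.1810753%", "0.2259184%", "0.2259184%",
          "0.2259184%"],
}
percentages = (
    pd.DataFrame.from_dict(rows, orient="index", columns=year_columns)
    .rename_axis("scoring")
    .reset_index()
)

df = pd.DataFrame({
    "id": [80887618, 80887619, 80887620, 80887621],
    "years": [9, 1, 5, 12],
    "scoring": [9.3, 9.2, 9.1, 9.3],
})

# Wide -> long: one row per (years, scoring) pair.
lookup = (
    percentages.set_index("scoring")
    .unstack()
    .reset_index(name="data")
    .rename(columns={"level_0": "years"})
    .astype({"years": "int64"})  # guard: year headers may have been read as strings
)

# Left merge keeps all contracts; years=12 is not in the table, so it gets NaN.
result = df.merge(lookup, on=["years", "scoring"], how="left")
print(result)
```

Passing `on=["years", "scoring"]` explicitly is a design choice rather than a requirement: without it, `merge` falls back to the columns the two frames share, which are the same two here.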