I am trying to insert values into a DataFrame based on a comparison with another DataFrame. Here is an example:
>>> import pandas as pd
>>> import numpy as np
>>> df
                       name
0  richard Finn, Tim Maltby
1          Fernando Lebrija
>>> df2
           Fullname   id
0      richard Finn  500
1        Tim Maltby  699
2  Fernando Lebrija  300
The desired output is:
>>> df
                       name       id
0  richard Finn, Tim Maltby  500,699
1          Fernando Lebrija      300
I tried using:
df['id'] = np.where((df['name']==df2['Fullname']), df2['id]', df['id'])
But it gives me the following error: `SyntaxError: invalid syntax`
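You can split the names, explode them into one row per name, then map each name to its id and group the results back by the original row: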
df['id'] = (df['name'].str.split(r',\s*')   # raw string so \s is not read as a string escape
              .explode()
              .map(df2.set_index('Fullname')['id'])
              .groupby(level=0).agg(list)
           )
Output:
                       name          id
0  richard Finn, Tim Maltby  [500, 699]
1          Fernando Lebrija       [300]
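The output above holds lists, while the question asks for comma-separated strings such as `500,699`. A minimal sketch of one way to get there with the same split/explode/map chain, assuming every name in `df` has a match in `df2` (an unmatched name would map to NaN and show up as the text `nan`). The sample frames are rebuilt from the question, and the variable name `lookup` is only illustrative:

```python
import pandas as pd

# Sample frames rebuilt from the question
df = pd.DataFrame({'name': ['richard Finn, Tim Maltby', 'Fernando Lebrija']})
df2 = pd.DataFrame({'Fullname': ['richard Finn', 'Tim Maltby', 'Fernando Lebrija'],
                    'id': [500, 699, 300]})

# Series indexed by full name, used as the lookup table (illustrative name)
lookup = df2.set_index('Fullname')['id']

df['id'] = (df['name'].str.split(r',\s*')        # one list of names per row
              .explode()                          # one name per row, original index repeated
              .map(lookup)                        # name -> id
              .astype(str)                        # ids as text so they can be joined
              .groupby(level=0).agg(','.join)     # back to one row per original index
           )
print(df)
```

Aggregating with `','.join` instead of `list` is the only change to the chain; the ids have to be cast to strings first because `str.join` does not accept integers.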
Another approach, using a list comprehension:
mapper = df2.set_index('Fullname')['id'].to_dict()
df['id'] = df['name'].apply(lambda x: ','.join([str(mapper.get(i.strip(), '')) for i in x.split(',')]))
Output:
                       name       id
0  richard Finn, Tim Maltby  500,699
1          Fernando Lebrija      300
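One thing to watch with the `''` default in `mapper.get`: a name that is missing from `df2` contributes an empty string, so a row like `Tim Maltby, Jane Doe` would come out as `699,` with a trailing comma. A small sketch that drops unmatched names instead; the third row, the name `Jane Doe`, and the helper `ids_for` are hypothetical and added only for illustration:

```python
import pandas as pd

# Frames from the question, plus one hypothetical row containing an unknown name
df = pd.DataFrame({'name': ['richard Finn, Tim Maltby',
                            'Fernando Lebrija',
                            'Tim Maltby, Jane Doe']})
df2 = pd.DataFrame({'Fullname': ['richard Finn', 'Tim Maltby', 'Fernando Lebrija'],
                    'id': [500, 699, 300]})

mapper = df2.set_index('Fullname')['id'].to_dict()

def ids_for(names):
    """Return the comma-joined ids of the known names, skipping unknown ones."""
    found = (mapper.get(n.strip()) for n in names.split(','))
    return ','.join(str(i) for i in found if i is not None)

df['id'] = df['name'].apply(ids_for)
print(df)
```

Whether to skip unknown names, keep a placeholder, or raise an error depends on the data; skipping is just one possible choice here.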