I have two DataFrames, df1 and df2, as shown below.
import pandas as pd

data1 = {'Column1': [1, 2, 3],
         'Column2': ['Account', 'Biscut', 'Super'],
         'Column3': ['Funny', 'Super', 'Nice']}
df1 = pd.DataFrame(data1)

data2 = {'ColumnName': ['Column2', 'Column3'],
         'ifExist': ['Acc', 'Sup'],
         'TarName': ['Account_name', 'Super_name']}
df2 = pd.DataFrame(data2)
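I want to add a new column, TarName, to df1. For each row of df2, the ifExist value should be partially matched against the df1 column named in ColumnName, and matching rows of df1 should receive that row's TarName.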
My expected output is:
Column1  Column2  Column3  TarName
1        Account  Funny    Account_name
2        Biscut   Super    Super_name
3        Super    Nice
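I tried the code below. It does the partial mapping, but only for a single column. With this approach I would have to build one dictionary per mapped column and apply the mapping once per column.
Is there a more dynamic way?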
df2_Column2_dict = df2[df2['ColumnName']=='Column2'].set_index(['ifExist'])['TarName'].to_dict()
pat = r'({})'.format('|'.join(df2_Column2_dict.keys()))
extracted = df1['Column2'].str.extract(pat, expand=False).dropna()
df1['TarName'] = extracted.apply(lambda x: df2_Column2_dict[x]).reindex(df1.index)
print(df1)
I hope this helps with your code. Each step is commented and explained.
import pandas as pd
# Create DataFrames
data1 = {'Column1': [1, 2, 3],
         'Column2': ['Account', 'Biscut', 'Super'],
         'Column3': ['Funny', 'Super', 'Nice']}
df1 = pd.DataFrame(data1)
data2 = {'ColumnName': ['Column2', 'Column3'],
         'ifExist': ['Acc', 'Sup'],
         'TarName': ['Account_name', 'Super_name']}
df2 = pd.DataFrame(data2)
# Initialize the new column in df1
df1['TarName'] = ''
# Iterate over each rule (row) in df2
for _, row in df2.iterrows():
    # Column in df1 to search
    column_name = row['ColumnName']
    # Partial value to search for
    if_exist = row['ifExist']
    # Value to assign if there is a match
    tar_name = row['TarName']
    # Find partial matches (case-insensitive) and assign TarName
    mask = df1[column_name].str.contains(if_exist, case=False, na=False)
    df1.loc[mask, 'TarName'] = tar_name
# Show the result
print(df1)
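If you need this in more than one place, the same loop can be wrapped in a small function. The following is only a sketch under a few assumptions of mine that the question does not state: the helper name add_tar_name and its target parameter are hypothetical, ifExist is treated as a literal substring (regex=False) rather than a regular expression, and when several rules match the same row the first rule in df2 wins instead of the last.

```python
import pandas as pd


def add_tar_name(df1, df2, target="TarName"):
    """Return a copy of df1 with `target` filled from the rules in df2.

    Hypothetical helper: each df2 row is a rule (ColumnName, ifExist, TarName).
    """
    out = df1.copy()
    out[target] = ""
    for rule in df2.itertuples(index=False):
        # regex=False treats ifExist as a literal substring, so characters
        # such as "." or "(" in a rule are not interpreted as a pattern.
        hit = out[rule.ColumnName].str.contains(
            rule.ifExist, case=False, na=False, regex=False
        )
        # Assumption: only fill rows that are still empty, so the first
        # matching rule wins rather than the last one.
        out.loc[hit & out[target].eq(""), target] = rule.TarName
    return out


df1 = pd.DataFrame({"Column1": [1, 2, 3],
                    "Column2": ["Account", "Biscut", "Super"],
                    "Column3": ["Funny", "Super", "Nice"]})
df2 = pd.DataFrame({"ColumnName": ["Column2", "Column3"],
                    "ifExist": ["Acc", "Sup"],
                    "TarName": ["Account_name", "Super_name"]})

result = add_tar_name(df1, df2)
print(result)
```

With the sample data, result['TarName'] is ['Account_name', 'Super_name', ''], and df1 itself is left unchanged because the function works on a copy. If you prefer the later rule to overwrite earlier ones, as in the loop above, drop the out[target].eq("") condition.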