I have a pandas DataFrame, DataFrame1, that contains first names.
column1: column2:
John some_value
Steve some_value
Mark some_value
I have another DataFrame, DataFrame2, that contains full names.
column1: column2:
John Smith some_value
Steve James some_value
Mark Taylor some_value
I need to do a merge equivalent to this SQL:
select
df1.column1
,df2.column2
from DataFrame1 df1
join DataFrame2 df2
on df2.column1 like '%' + df1.column1 + '%'
I would be glad for any help.
import pandas as pd
inputdataframe1 = [['John', 4],['Steve', 5],['Mark', 6]]
inputdataframe2= [['John smith', 9],['Steve James', 8],['Mark Taylor', 4]]
dataframe1 = pd.DataFrame(inputdataframe1)
dataframe2= pd.DataFrame(inputdataframe2)
merged_dataframe = pd.merge(dataframe1, dataframe2, left_on=[0],right_on=[0],how='outer')
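The output looks like this, because the names never match exactly and so the two DataFrames cannot be merged directly on that column: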
0 1_x 1_y
0 John 4.0 NaN
1 Steve 5.0 NaN
2 Mark 6.0 NaN
3 John smith NaN 9.0
4 Steve James NaN 8.0
5 Mark Taylor NaN 4.0
If you need to join the two DataFrames on a partial name match, the following code should help:
import pandas as pd
inputdataframe1 = [['John', 4],['Steve', 5],['Mark', 6]]
inputdataframe2= [['John smith', 9],['Steve James', 8],['Mark Taylor', 4]]
dataframe1 = pd.DataFrame(inputdataframe1)
dataframe2= pd.DataFrame(inputdataframe2)
# Full names from the second input, used as the new join key
dataframe2_names = [key for key, value in inputdataframe2]
# For every short name, find each full name that contains it and carry over
# the short name's value, so both frames share the full name as a key
list_like_values = [[full_name, value]
                    for name, value in inputdataframe1
                    for full_name in dataframe2_names
                    if name in full_name]
dataframe1 = pd.DataFrame(list_like_values)
merged_dataframe = pd.merge(dataframe1, dataframe2, left_on=[0],right_on=[0],how='inner')
The output will have the following form:
0 1_x 1_y
0 John smith 4 9
1 Steve James 5 8
2 Mark Taylor 6 4
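An alternative that stays closer to the SQL in the question is to cross join the two DataFrames and then keep only the rows where the short name occurs inside the full name. This is a minimal sketch, not the only way to do it. It assumes pandas 1.2 or later (for `how="cross"`), and the column names `column1`/`column2` and the `_df1`/`_df2` suffixes are chosen here for illustration. The cross join builds every pair of rows, so it is only practical for small to moderate inputs.

```python
import pandas as pd

# Sample data from the question, with illustrative column names
df1 = pd.DataFrame({"column1": ["John", "Steve", "Mark"],
                    "column2": [4, 5, 6]})
df2 = pd.DataFrame({"column1": ["John smith", "Steve James", "Mark Taylor"],
                    "column2": [9, 8, 4]})

# Cross join: every row of df1 paired with every row of df2
pairs = df1.merge(df2, how="cross", suffixes=("_df1", "_df2"))

# Keep only pairs where the short name is a substring of the full name
# (case-sensitive, like Python's `in` operator)
mask = [short in full
        for short, full in zip(pairs["column1_df1"], pairs["column1_df2"])]

# Select df1.column1 and df2.column2, as in the SQL query
result = pairs.loc[mask, ["column1_df1", "column2_df2"]].reset_index(drop=True)
print(result)
```

With the sample data, `result` has three rows pairing John, Steve and Mark with the values 9, 8 and 4 from the second DataFrame.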