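In the toy example below, I am trying to add a status column based on the result of an outer merge. The challenge is to keep the method-chaining style that is best described in Tom's blog. The commented-out line is my attempt, but it does not work.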
import pandas as pd

# Create sample data frames A and B
A = pd.DataFrame({
    'key': ['A', 'B', 'C', 'D'],
    'value': [1, 2, 3, 4]
})
B = pd.DataFrame({
    'key': ['C', 'D', 'E', 'F'],
    'value': [3, 4, 5, 6]
})

# Merge data frames A and B on the 'key' column and add an indicator column
merged = pd.merge(A, B, on='key', how='outer', indicator=True)

# Add a status column using this mapping:
# {'both': 'no change',
#  'left_only': 'added',
#  'right_only': 'removed'}
merged = (merged
    .assign(status='no change')
    #.assign(status = lambda x: x.loc[x._merge == 'left_only'], 'added')
    .drop('_merge', axis=1)
)
Something like this should be enough. In general, to assign values to a slice inside a chain, you need a conditional (map, np.where, np.select, Series.where, etc.):
(A
    .merge(B, on='key', how='outer', indicator=True)
    .assign(status=lambda f: f._merge.map({"left_only": "added",
                                           "both": "no change",
                                           "right_only": "removed"}))
)
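For comparison, here is a minimal sketch of the same chain written with np.select, one of the other conditionals mentioned above. It assumes the mapping from the question's comment (left_only is 'added', right_only is 'removed', everything else is 'no change') and also drops the _merge column, as the question's original chain did. The variable name result is only illustrative.

```python
import numpy as np
import pandas as pd

# Same sample frames as in the question
A = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': [1, 2, 3, 4]})
B = pd.DataFrame({'key': ['C', 'D', 'E', 'F'], 'value': [3, 4, 5, 6]})

result = (
    A
    .merge(B, on='key', how='outer', indicator=True)
    # np.select takes a list of boolean conditions and a matching list of
    # values; rows matching no condition get the default.
    .assign(status=lambda f: np.select(
        [f['_merge'] == 'left_only', f['_merge'] == 'right_only'],
        ['added', 'removed'],
        default='no change',
    ))
    .drop(columns='_merge')
)

print(result)
```

The lambda matters here: inside a chain, the merged frame has no name yet, so assign has to receive a callable that is handed the intermediate frame. np.select is mostly useful when the conditions are more than a plain lookup on one column; for a one-to-one lookup like this, map is shorter.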