I have the following DataFrames:
First DataFrame:
DF1:
iID data1 data2
10 blue green
11 red teal
Second DataFrame:
DF2:
iID rH repH
10 50 60
10 60 70
11 70 50
(DF2 can have 1 or 2 rows per iID)
I want my output DF to hold an array in a single cell for each of rH and repH.
The output would look like this:
Output DF:
iID data1 data2 rH repH
10 blue green [50,60] [60,70]
11 red teal [70] [50]
达蒙
df1.merge(df2.groupby('iID').agg(lambda x : x.tolist()).reset_index())
Out[144]:
iID data1 data2 rH repH
0 10 blue green [50, 60] [60, 70]
1 11 red teal [70] [50]
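For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same groupby-then-merge idea. The variable names `lists` and `out` are mine, and `how="left"` is an assumption (the one-liner above uses the default inner merge); with a left merge, any iID present in df1 but missing from df2 would get NaN instead of a list.

```python
import pandas as pd

# Rebuild the two frames shown in the question.
df1 = pd.DataFrame({"iID": [10, 11],
                    "data1": ["blue", "red"],
                    "data2": ["green", "teal"]})
df2 = pd.DataFrame({"iID": [10, 10, 11],
                    "rH": [50, 60, 70],
                    "repH": [60, 70, 50]})

# Collapse each iID group into a single row whose cells are Python lists.
lists = df2.groupby("iID").agg(lambda s: s.tolist()).reset_index()

# Merge on the shared key. Naming the key explicitly avoids relying on
# pandas inferring it from the common column names.
# how="left" is an assumption: it keeps every row of df1.
out = df1.merge(lists, on="iID", how="left")
print(out)
```

Naming the key with `on="iID"` is optional here, since iID is the only column the two frames share, but it makes the intent explicit.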
It is also worth adding the following.
join, which aligns on the index and defaults to a left join:
df1.join(df2)
or concat, which defaults to an outer join:
pd.concat([df1, df2], axis=1)
To illustrate further:
>>> df1 = pd.DataFrame({'a':range(6),
... 'b':[5,3,6,9,2,4]}, index=list('abcdef'))
>>> df2 = pd.DataFrame({'c':range(4),
... 'd':[10,20,30, 40]}, index=list('abhi'))
>>>
>>>
>>> df1
a b
a 0 5
b 1 3
c 2 6
d 3 9
e 4 2
f 5 4
>>> df2
c d
a 0 10
b 1 20
h 2 30
i 3 40
>>> df4 = df1.join(df2)
>>> df4
a b c d
a 0 5 0.0 10.0
b 1 3 1.0 20.0
c 2 6 NaN NaN
d 3 9 NaN NaN
e 4 2 NaN NaN
f 5 4 NaN NaN
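The session above only shows the left join. As a rough sketch of the contrast with concat, the snippet below reuses the same df1 and df2 and compares the index each approach keeps. The names `left`, `outer` and `inner` are mine, and the `how="inner"` call is an extra illustration that was not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({"a": range(6), "b": [5, 3, 6, 9, 2, 4]},
                   index=list("abcdef"))
df2 = pd.DataFrame({"c": range(4), "d": [10, 20, 30, 40]},
                   index=list("abhi"))

# Left join (the default for join): keeps exactly df1's index labels.
left = df1.join(df2)

# concat along columns (outer by default): keeps the union of both indexes,
# so rows 'h' and 'i' appear with NaN in columns 'a' and 'b'.
outer = pd.concat([df1, df2], axis=1)

# Inner join, for comparison: keeps only labels present in both frames.
inner = df1.join(df2, how="inner")

print(len(left), len(outer), len(inner))
```

Both calls align on the index, not on a column, so for the frames in the question (which share the iID column) you would first need to set iID as the index or use merge.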