I have a df:
import pandas as pd
string_pos = {'string': ['aabb', 'ddcc'],
              'position_1': [0, 1],
              'position_2': [3, 4]}
df = pd.DataFrame(string_pos)
print(df)
Output:
  string  position_1  position_2
0   aabb           0           3
1   ddcc           1           4
Then I wrote what I thought would add a new column containing a substring of the 'string' column:
df['short_string'] = df.string.str[df['position_1'], df['position_2']]
But it returns:
  string  position_1  position_2  short_string
0   aabb           0           3           NaN
1   ddcc           1           4           NaN
I want to get:
  string  position_1  position_2  short_string
0   aabb           0           3           aab
1   ddcc           1           4           dcc
I think you need DataFrame.apply with a lambda function to process each row:
df['short_string'] = df.apply(lambda x: x['string'][x['position_1']:x['position_2']], axis=1)
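For a self-contained version of the apply approach, here is a minimal sketch. It assumes the frame is built from the dictionary in the question with pd.DataFrame. The helper name slice_row is hypothetical and only makes explicit what the lambda receives: with axis=1, each call gets one row as a Series.

```python
import pandas as pd

# Rebuild the frame from the question (assumed construction).
df = pd.DataFrame({
    'string': ['aabb', 'ddcc'],
    'position_1': [0, 1],
    'position_2': [3, 4],
})

def slice_row(row):
    # Hypothetical helper: `row` is a Series holding one row's values.
    # int() is a precaution in case the positions arrive as floats.
    start = int(row['position_1'])
    stop = int(row['position_2'])
    return row['string'][start:stop]

# axis=1 calls slice_row once per row and collects the results in a Series.
df['short_string'] = df.apply(slice_row, axis=1)
print(df['short_string'].tolist())
```

A named function behaves the same as the lambda here; it is just easier to extend, for example if some rows may hold missing positions.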
Or use zip with a list comprehension:
zipped = zip(df['string'], df['position_1'], df['position_2'])
df['short_string'] = [a[b:c] for a, b, c in zipped]
print(df)
  string  position_1  position_2  short_string
0   aabb           0           3           aab
1   ddcc           1           4           dcc
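The two approaches can be checked against each other. This is a minimal sketch, assuming the same frame as in the question; the function name substring_column is hypothetical. The zip version iterates over plain column values instead of building a Series for every row, which tends to be cheaper on large frames, though that is worth measuring on your own data.

```python
import pandas as pd

# Rebuild the frame from the question (assumed construction).
df = pd.DataFrame({
    'string': ['aabb', 'ddcc'],
    'position_1': [0, 1],
    'position_2': [3, 4],
})

def substring_column(frame, text_col, start_col, stop_col):
    # Hypothetical helper: slice each string with its own start/stop positions.
    columns = zip(frame[text_col], frame[start_col], frame[stop_col])
    return [text[start:stop] for text, start, stop in columns]

via_zip = substring_column(df, 'string', 'position_1', 'position_2')
via_apply = df.apply(
    lambda x: x['string'][x['position_1']:x['position_2']], axis=1
).tolist()

print(via_zip)
print(via_zip == via_apply)
```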