I have a Pandas DataFrame, and I want to compare two of its columns and create a new column based on the result of the comparison. The logic is:
if df['column1'] > df['column2']:
    df['New column'] = df['column1'] + df['column2']
else:
    df['New column'] = df['column1'] + df['column2'] + 1
I'm fairly new to Pandas and Python, so I'm sure my structure is wrong. Can you point me in the right direction?
I think this should do the job. That said, you can find similar questions on Stack Overflow without opening a new one. Anyway:
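import pandas as pd
import numpy as np

# Sample frame: 10 rows of random integers from 0 to 9 in columns A and B
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=list('AB'))

def get_new_col_val(row):
    # Called once per row; row['A'] and row['B'] are scalars here
    if row['A'] > row['B']:
        return row['A'] + row['B']
    else:
        return row['A'] + row['B'] + 1

# axis=1 passes each row (not each column) to the function
df['new_col'] = df.apply(get_new_col_val, axis=1)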
This is the simplest approach, though the same thing can be done in other ways.
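One of those other ways, as a rough sketch that assumes both columns are numeric: the two branches differ only by 1, so the comparison result can be added as an integer instead of calling a function per row. The column names A and B follow the example above; the small fixed frame and the name `bump` are made up for illustration.

```python
import pandas as pd

# Illustrative data; any numeric columns A and B would do.
df = pd.DataFrame({'A': [5, 2, 3, 7], 'B': [1, 2, 9, 4]})

# The branches differ only by 1, so add 1 exactly where A > B is False.
bump = (~(df['A'] > df['B'])).astype(int)
df['new_col'] = df['A'] + df['B'] + bump

# Row-wise version from the answer above, kept only as a cross-check.
def get_new_col_val(row):
    if row['A'] > row['B']:
        return row['A'] + row['B']
    return row['A'] + row['B'] + 1

expected = df.apply(get_new_col_val, axis=1)
print(df['new_col'].equals(expected))
```

This operates on whole columns at once, which tends to matter more as the frame grows; for a handful of rows the `apply` version is just as good.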
Enjoy!!!
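There are more flexible options. If you only have a single if/else, use np.where. Older pandas versions exposed the NumPy library as pd.np, but that alias was deprecated in pandas 1.0 and removed in 2.0, so import NumPy directly.
If you have a DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [1, 3, 4, 5, 2]})
# np.where(condition, value where True, value where False)
df['where'] = np.where(df['col1'] > df['col2'], df['col1'] + df['col2'], df['col1'] + df['col2'] + 1)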
   col1  col2  where
0     1     1      3
1     2     3      6
2     3     4      8
3     4     5     10
4     5     2      7
# Not exactly, but much like
# if df['col1'] > df['col2']:
#     return df['col1'] + df['col2']
# else:
#     return df['col1'] + df['col2'] + 1
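Applied to the column names from the question, the same idea might look like the sketch below. The sample values are made up, and NumPy is imported directly rather than reached through pd.np, which newer pandas releases no longer provide.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({'column1': [4, 1, 6], 'column2': [2, 3, 6]})

# Compute the shared sum once, then pick per row between total and total + 1.
total = df['column1'] + df['column2']
df['New column'] = np.where(df['column1'] > df['column2'], total, total + 1)

print(df)
```

Note that np.where computes both candidate arrays for every row and then selects element by element, which is why the plain `if df['column1'] > df['column2']:` from the question does not work: the comparison yields a whole boolean Series, not a single True or False.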
If you have more than one condition, much like an if/elif/else ladder, go for np.select:
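import numpy as np
m1 = df['col1'] < df['col2']
m2 = df['col1'] > df['col2']
# Choices are cast to str to match the 'equal' default; recent NumPy versions may reject mixed int/str.
sums = (df['col1'] + df['col2']).astype(str)
df['select'] = np.select([m1, m2], [sums, '0'], default='equal')
# arguments: all conditions, all corresponding values, final else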
   col1  col2  where select
0     1     1      3  equal
1     2     3      6      5
2     3     4      8      7
3     4     5     10      9
4     5     2      7      0
# Which is much like
# if df['col1'] < df['col2']:
#     return df['col1'] + df['col2']
# elif df['col1'] > df['col2']:
#     return 0
# else:
#     return 'equal'
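If the new column should stay numeric, a variation along these lines may be more convenient. It is only a sketch: the column name `select_num` and the marker value -1 for the "equal" case are assumptions made for illustration, not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [1, 3, 4, 5, 2]})

# Conditions are checked in order; the first one that is True wins for each row.
conditions = [df['col1'] < df['col2'], df['col1'] > df['col2']]
choices = [df['col1'] + df['col2'], 0]

# default=-1 is an arbitrary marker for rows where neither condition holds (col1 == col2).
df['select_num'] = np.select(conditions, choices, default=-1)

print(df)
```

Keeping every choice and the default numeric means the result is an integer column, so it can still be summed or compared without converting back from strings.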