I have a pandas DataFrame and I want to use the apply function to generate two new columns from the existing data. I am getting this error:
ValueError: Wrong number of items passed 2, placement implies 1
import pandas as pd
import numpy as np
def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return [C, D]

df = pd.DataFrame(np.random.randint(0, 10, size=(2, 2)), columns=list('AB'))
df['C', 'D'] = df.apply(myfunc1, axis=1)
Starting DF:
   A  B
0  6  1
1  8  4
Desired DF:
   A  B   C   D
0  6  1  16  56
1  8  4  18  58
Based on your latest error, you can avoid it by returning the new columns as a Series:
def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return pd.Series([C, D])

df[['C', 'D']] = df.apply(myfunc1, axis=1)
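df['C','D'] is treated as one column, not two, because the comma-separated labels form a single tuple key. For two columns you need to select a slice of the DataFrame with a list of labels, so use df[['C','D']]: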
df[['C', 'D']] = df.apply(myfunc1 ,axis=1)
   A  B   C   D
0  4  6  14  54
1  5  1  15  55
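To make the difference between the two key forms concrete, here is a minimal sketch. ShowKey is a hypothetical helper (not part of pandas) that simply hands back whatever key Python passes to the square brackets, and the DataFrame uses the fixed values from the question's starting DF rather than random data, so the result should match the desired DF.

```python
import pandas as pd


class ShowKey:
    """Hypothetical helper: returns whatever key Python passes to []."""

    def __getitem__(self, key):
        return key


probe = ShowKey()
tuple_key = probe['C', 'D']    # a single key: the tuple ('C', 'D')
list_key = probe[['C', 'D']]   # a single key: the list ['C', 'D']

# Fixed values copied from the question's starting DF.
df = pd.DataFrame({'A': [6, 8], 'B': [1, 4]})


def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    # Returning a Series makes apply(axis=1) build one output column per element.
    return pd.Series([C, D])


# A list of labels on the left-hand side assigns two separate columns.
df[['C', 'D']] = df.apply(myfunc1, axis=1)
print(df)
```

pandas receives the tuple as one column label, which is why the original assignment complained about two items being placed into one slot.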
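Or you can unpack the result into two separate column assignments. Iterating over the returned DataFrame directly would yield its column labels rather than its data, so transpose it to an array first, i.e.
df['C'], df['D'] = df.apply(myfunc1, axis=1).T.values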
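A caveat when unpacking into df['C'], df['D']: iterating over a DataFrame yields its column labels, so the frame returned by apply is best converted to a transposed array before it is unpacked. A minimal sketch, again using the fixed values from the question's starting DF as an assumption instead of random data:

```python
import pandas as pd

df = pd.DataFrame({'A': [6, 8], 'B': [1, 4]})


def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return pd.Series([C, D])


result = df.apply(myfunc1, axis=1)  # DataFrame with default column labels

# Iterating a DataFrame walks its column labels, not its data.
labels = list(result)

# Transposing to a NumPy array gives one row per new column,
# so tuple unpacking hands each row to one target column.
df['C'], df['D'] = result.T.values
print(df)
```

Unpacking the DataFrame without the transpose would assign the labels themselves to the new columns.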
Add extra brackets when querying for multiple columns.
import pandas as pd
import numpy as np
def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return [C, D]

df = pd.DataFrame(np.random.randint(0,10,size=(2, 2)), columns=list('AB'))
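# In pandas 0.23+ a list return yields a Series of lists, so expand it into columns
df[['C', 'D']] = df.apply(myfunc1, axis=1, result_type='expand')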
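Whether a list-returning function works with the double-bracket assignment depends on what apply hands back. As a hedged sketch, assuming pandas 0.23 or later (where the result_type parameter exists) and using made-up values with three rows, the following compares the default result with the expanded one:

```python
import pandas as pd

# Made-up values; three rows so the row count differs from the column count.
df = pd.DataFrame({'A': [6, 8, 3], 'B': [1, 4, 7]})


def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return [C, D]


# Default: each row's list is kept as one object, giving a Series of lists.
as_lists = df.apply(myfunc1, axis=1)

# result_type='expand' turns each list into columns, giving a DataFrame.
expanded = df.apply(myfunc1, axis=1, result_type='expand')

df[['C', 'D']] = expanded
print(type(as_lists).__name__, expanded.shape)
print(df)
```

The expanded frame has one column per list element, which is the shape the two-label assignment expects.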