        A
0  31.353
1  28.945
2  17.377
I want to create a new column df["B"] that holds the values of column A rounded up to the next multiple of 5. Desired output:
        A     B
0  31.353  35.0
1  28.945  30.0
2  17.377  20.0
I tried:
def roundup5(x):
    return int(math.ceil(x / 5.0)) * 5
df["B"] = df["A"].apply(roundup5)
But I get:
TypeError: unsupported operand type(s) for /: 'str' and 'float'
I think you first need to convert the values to floats, then divide by 5, apply numpy.ceil, and multiply by 5 again. Either of these works:
df["B"] = df["A"].astype(float).div(5.0).apply(np.ceil).mul(5)
df["B"] = np.ceil(df["A"].astype(float).div(5.0)).mul(5)
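As a self-contained check of the vectorized version, here is a minimal sketch. It assumes column A holds numeric strings, which is what the TypeError in the question suggests; the sample frame is rebuilt by hand from the values shown above.

```python
import numpy as np
import pandas as pd

# Assumption: column A was read in as strings, which would explain the
# "'str' and 'float'" TypeError raised by x / 5.0 in the question.
df = pd.DataFrame({"A": ["31.353", "28.945", "17.377"]})

# Convert to float, divide by 5, round up, then scale back to a multiple of 5.
df["B"] = np.ceil(df["A"].astype(float).div(5.0)).mul(5)

print(df["B"].tolist())  # [35.0, 30.0, 20.0]
```

Note that astype(float) raises a ValueError if any entry cannot be parsed as a number, so this only applies when every value in the column is a valid numeric string.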
Loop version:
import math

def roundup5(x):
    # float(x) handles values stored as strings
    return int(math.ceil(float(x) / 5.0)) * 5.0
df["B"] = df["A"].apply(roundup5)
print(df)
        A     B
0  31.353  35.0
1  28.945  30.0
2  17.377  20.0
Timings:
#[30000 rows x 1 columns]
df = pd.concat([df] * 10000, ignore_index=True)
In [327]: %timeit df["B1"] = df["A"].apply(roundup5)
35.7 ms ± 4.54 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [328]: %timeit df["B2"] = df["A"].astype(float).div(5.0).apply(np.ceil).mul(5)
1.25 ms ± 76.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [329]: %timeit df["B3"] = np.ceil(df["A"].astype(float).div(5.0)).mul(5)
1.19 ms ± 22.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)