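I'm trying to map input text to associated numeric values, using defaultdict to handle entries that are not represented in the mapping.
Running the code below works, but when I add a default value using a lambda, I get an error.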
z
Out[216]:
0 $1,001
1 $1,001
2 $50,001
3 $15,001
4 $50,001
586 $1,001
587 $1,001
588 $1,001
589 $1,001
590 $1,001
Name: 0, Length: 591, dtype: object
amt_map = {
"$1": 500,
"$1,001": 2500,
"$15,001": 32500,
"$50,001": 75000,
"$100,001": 175000,
"$250,001": 350000,
"$500,001": 750000,
"$1,000,001": 2000000,
"$5,000,001": 10000000,
"25,000,001": 25000000
}
z.map(amt_map)
Out[220]:
0 2500.0
1 2500.0
2 75000.0
3 32500.0
4 75000.0
Adding a default with a lambda raises an error:
from collections import defaultdict
d = {
"$1": 500,
"$1,001": 2500,
"$15,001": 32500,
"$50,001": 75000,
"$100,001": 175000,
"$250,001": 350000,
"$500,001": 750000,
"$1,000,001": 2000000,
"$5,000,001": 10000000,
"25,000,001": 25000000
}
amt_map = defaultdict(lambda x: x.replace('$',''), d)
z.map(amt_map)
Traceback (most recent call last):
File "/tmp/ipykernel_470946/75548175.py", line 1, in <module>
z.map(amt_map)
File "/home/chris/anaconda3/lib/python3.9/site-packages/pandas/core/base.py", line 825, in <lambda>
mapper = lambda x: dict_with_default[x]
TypeError: <lambda>() missing 1 required positional argument: 'x'
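Searching suggests this is caused by a mismatch in the number of arguments the function takes, and that the lambda is involved, but I don't understand how or why that causes a problem here.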
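To make the argument-count point concrete, here is a minimal sketch using only the standard library. It relies on the documented behavior that defaultdict calls its default_factory with no arguments, while dict.__missing__ receives the missing key. The names ok, bad, and KeyAware are illustrative and not part of the original code.

```python
from collections import defaultdict

# default_factory is called with NO arguments when a key is missing.
ok = defaultdict(lambda: 0, {"$1": 500})
missing_value = ok["$2"]  # zero-argument lambda, so this works

# A one-argument lambda fails, because defaultdict never passes the key.
bad = defaultdict(lambda x: x.replace("$", ""), {"$1": 500})
try:
    bad["$2"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# dict.__missing__, by contrast, is handed the missing key.
class KeyAware(dict):
    def __missing__(self, key):
        return key.replace("$", "")

stripped = KeyAware({"$1": 500})["$2"]
```

This would explain the traceback above: pandas looks the key up with `dict_with_default[x]`, the key is missing, and defaultdict then calls the lambda without the `x` it expects.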
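Following @wim's suggestion, I created a subclass that defines how missing keys are handled. The code below returns what I'm looking for, but I would prefer not to need subclassing if there is a way to avoid it: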
d = {
"$1": 500,
"$1,001": 2500,
"$15,001": 32500,
"$50,001": 75000,
"$100,001": 175000,
"$250,001": 350000,
"$500,001": 750000,
"$1,000,001": 2000000,
"$5,000,001": 10000000,
"25,000,001": 25000000
}
import pandas as pd

class CleanAmt(dict):
    def __missing__(self, key):
        # Called on a failed lookup; unlike default_factory, it receives the key
        return pd.to_numeric(key.replace('$',''))

z.map(CleanAmt(d))
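One possible way to avoid subclassing, sketched here under assumptions rather than as a definitive answer, is to use a plain function for the lookup. The function name clean_amt, the shortened mapping, and the sample key "$7,500" are hypothetical. Stripping commas before converting is also an assumption about what unmapped keys look like.

```python
# Shortened, illustrative copy of the mapping from the question.
d = {"$1": 500, "$1,001": 2500, "$15,001": 32500}

def clean_amt(key):
    """Return the mapped value, or parse the key itself when it is unmapped."""
    if key in d:
        return d[key]
    # Assumption: unmapped keys are dollar strings such as "$7,500".
    return int(key.replace("$", "").replace(",", ""))

values = [clean_amt(k) for k in ["$1,001", "$15,001", "$7,500"]]
```

Because Series.map also accepts a callable, the same function could presumably be passed as z.map(clean_amt) in place of the dict subclass.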