TypeError: unhashable type: 'Series' when changing threshold ranges


I get an error when I try to regroup the employee count thresholds. Can you help me?

dfff["employee_count"].value_counts()
employee_count
1-10          56091
11-50         55892
51-100        14377
101-250       12217
251-500        6187
501-1000       4384
1001-5000      4080
unknown        3335
10000+         1941
5001-10000     1362
Name: count, dtype: int64
employee_count_unknown = dfff[(dfff['employee_count'] == 'unknown')].index
dfff.drop(employee_count_unknown, inplace=True)
dfff.loc[df['employee_count'].isin( ["251-500", "501-1000"]), 
       ["employee_count"]] = "251-1000"
dfff.loc[df['employee_count'].isin( ["1001-5000", "5001-10000", "10000+"]), 
       ["employee_count"]] = "1001+"
TypeError: unhashable type: 'Series'
python
1 Answer

Your code runs fine for me:

import pandas as pd
data = ["1-10","11-50","51-100","101-250","251-500","501-1000","1001-5000","unknown","10000+","5001-10000"]*10
df = pd.DataFrame(data=data, columns=["employee_count"])
employee_count_unknown = df[(df['employee_count'] == 'unknown')].index
df.drop(employee_count_unknown, inplace=True)

df.loc[df['employee_count'].isin( ["251-500", "501-1000"]),  ["employee_count"]] = "251-1000"
df.loc[df['employee_count'].isin( ["1001-5000", "5001-10000", "10000+"]), ["employee_count"]] = "1001+"

print(df["employee_count"].value_counts())

Output:

employee_count
1001+       30
251-1000    20
1-10        10
11-50       10
51-100      10
101-250     10
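
One difference worth noting: the question builds its masks from `df` while assigning into `dfff`, whereas the snippet above uses a single DataFrame throughout. The post does not confirm whether that mismatch causes the TypeError, but keeping the mask and the target on the same frame removes one possible source of trouble. As an alternative, here is a minimal sketch that merges the buckets with `Series.replace` and a plain dictionary, so no boolean mask is needed. The sample data and the `merged_buckets` name are assumptions made for illustration, not taken from the asker's real dataset.

```python
import pandas as pd

# Hypothetical sample data mirroring the labels shown in the question.
labels = ["1-10", "11-50", "51-100", "101-250", "251-500",
          "501-1000", "1001-5000", "unknown", "10000+", "5001-10000"]
dfff = pd.DataFrame({"employee_count": labels * 10})

# Drop the "unknown" rows with a boolean filter on the same frame.
dfff = dfff[dfff["employee_count"] != "unknown"].copy()

# Map old buckets to merged buckets; labels not listed stay unchanged.
merged_buckets = {
    "251-500": "251-1000",
    "501-1000": "251-1000",
    "1001-5000": "1001+",
    "5001-10000": "1001+",
    "10000+": "1001+",
}
dfff["employee_count"] = dfff["employee_count"].replace(merged_buckets)

counts = dfff["employee_count"].value_counts()
print(counts)
```

If the error persists on the real data, checking `type(df)` and `type(dfff)` before the assignment may show whether one of the names refers to something other than the intended DataFrame.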