I'm used to being able to do something like this:
import pandas as pd
df = pd.DataFrame( pd.Categorical(['a','b','b'],['a','b']),columns=['x'])
df.loc[:,'x'] = df['x'].replace({'a':1, 'b':2})
With newer versions of pandas, however, this emits a warning:
/tmp/ipykernel_1721527/1018712932.py:4: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '[1, 2, 2]
Categories (2, object): [1, 2]' has dtype incompatible with category, please explicitly cast to a compatible dtype first.
df.loc[:,'x'] = df['x'].replace({'a':1, 'b':2})
The shortest workaround I could come up with is:
ncol = df['x'].replace({'a':1, 'b':2}).astype('float')
df['x'] = None
df = df.astype({'x':'float'})
df.loc[:,'x'] = ncol
But that seems too long and inelegant for what looks like a very simple operation. Am I missing something obvious?
Ironically, the first part of your question was asked just a few minutes ago.
Use cat.rename_categories instead of replace:
df['x'] = df['x'].cat.rename_categories({'a':1, 'b':2})
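For reference, here is a minimal, self-contained sketch of that approach. It assumes the goal is to relabel the categories and, optionally, end up with a plain float column as in the question's workaround. The name `as_float` is made up for illustration. Note that assigning with `df['x'] = ...` replaces the whole column, whereas `df.loc[:, 'x'] = ...` tries to write into the existing categorical column, which is likely why the original code triggers the dtype warning.

```python
import pandas as pd

# Same data as in the question: a categorical column with categories 'a' and 'b'.
df = pd.DataFrame({'x': pd.Categorical(['a', 'b', 'b'], categories=['a', 'b'])})

# Relabel the categories; the column stays categorical, now with categories 1 and 2.
# Plain column assignment swaps in the new column instead of writing into the old one.
df['x'] = df['x'].cat.rename_categories({'a': 1, 'b': 2})

# Optional: if a numeric dtype is wanted rather than a categorical one,
# cast the relabelled column explicitly (illustrative variable name).
as_float = df['x'].astype('float')
```

If the column should be numeric in the DataFrame itself, `df['x'] = df['x'].astype('float')` after the rename should do it in one more line.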