Pandas: assigning a Float64 Series to an Int32 Series fails with a TypeError


I am using Pandas 1.5.1 and am seeing some odd dtype behavior with conditional setitem:

>>> b = pd.Series([1.2, 2.2, 3.2], dtype="float64")
>>> a = pd.Series([1,2,3], dtype="int32")
>>> a[a < 2] = b
>>> a
0    1.2
1    2.0
2    3.0
dtype: float64

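When both Series use numpy dtypes, the assignment succeeds and the result is upcast to float64. However, if I change the numpy dtypes to the Pandas extension dtypes pd.Float64Dtype() and pd.Int32Dtype(), with the same data, the assignment fails: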

>>> b = pd.Series([1.2, 2.2, 3.2], dtype="Float64")
>>> a = pd.Series([1,2,3], dtype="Int32")
>>> a[a < 2] = b

TypeError: Cannot cast array data from dtype('O') to dtype('int32') according to the rule 'safe'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\series.py", line 1162, in __setitem__
    self._where(~key, value, inplace=True)
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\generic.py", line 9733, in _where
    new_data = self._mgr.putmask(mask=cond, new=other, align=align)
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\internals\managers.py", line 407, in putmask
    return self.apply(
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\internals\managers.py", line 347, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\internals\blocks.py", line 1517, in putmask
    values._putmask(mask, new)
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\arrays\base.py", line 1523, in _putmask
    self[mask] = val
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\arrays\masked.py", line 237, in __setitem__
    value, mask = self._coerce_to_array(value, dtype=self.dtype)
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\arrays\numeric.py", line 258, in _coerce_to_array
    values, mask, _, _ = _coerce_to_data_and_mask(
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\arrays\numeric.py", line 214, in _coerce_to_data_and_mask
    values = dtype_cls._safe_cast(values, dtype, copy=False)
  File "C:\ca2_ps_env_3810\lib\site-packages\pandas\core\arrays\integer.py", line 57, in _safe_cast
    raise TypeError(
TypeError: cannot safely cast non-equivalent object to int32

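Is this error expected for this code? I don't understand why Float64Dtype() data is treated as object dtype when Pandas extension types are used, nor why the same data fails to assign once extension types are involved. If I use concat instead of setitem, it succeeds.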
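
For reference, here is a minimal sketch of one possible way around the failing assignment. It assumes that upcasting the target to Float64 before the conditional replacement is acceptable. It uses `Series.astype` and `Series.mask` rather than setitem, and the names `a_float` and `result` are only illustrative. I have not checked how this behaves across Pandas versions.

```python
import pandas as pd

b = pd.Series([1.2, 2.2, 3.2], dtype="Float64")
a = pd.Series([1, 2, 3], dtype="Int32")

# Assumption: converting the Int32 target to Float64 up front is acceptable,
# so no float-to-integer cast is needed during the replacement.
a_float = a.astype("Float64")

# Series.mask replaces values where the condition is True
# with the index-aligned values from b; other values are kept.
result = a_float.mask(a_float < 2, b)

print(result)
```

This mirrors what the numpy-dtype case does implicitly (the result becomes a float Series), but makes the upcast explicit instead of relying on setitem to do it.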

python pandas dataframe numpy series