DataFrame loc assignment: strange error message with copy_on_write


I am using DataFrames with pandas 2.2.3.

The following code raises a shape-mismatch error:

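import pandas as pd

pd.options.mode.copy_on_write = True
dftest = pd.DataFrame({"A": [1, 4, 1, 5], "B": [2, 5, 2, 6], "C": [3, 6, 1, 7]})
df = dftest[["B", "C"]]
# print(df)
bads = df.T.mean() > 4  # boolean mask: rows whose mean exceeds 4
# print(bads)

tmp = df.loc[bads, :]
print(tmp)
df.loc[bads, :] = tmp  # raises ValueError with copy_on_write = True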

So assigning the selected values back to the very same region is not possible. Maybe I am missing something, but this looks like a bug to me?

With copy_on_write=False I get a warning, and with copy_on_write="warn" the code runs without any warning!

This is the output of the code:

        B  C
     1  5  6
     3  6  7
     
     ---------------------------------------------------------------------------
     ValueError                                Traceback (most recent call last)
     File ~/jupyter/venv/lib/python3.12/site-packages/pandas/core/internals/blocks.py:1429, in Block.setitem(self, indexer, value, using_cow)
        1428 try:
     -> 1429     values[indexer] = casted
        1430 except (TypeError, ValueError) as err:
     
     ValueError: shape mismatch: value array of shape (2,2) could not be broadcast to indexing result of shape (2,)
     
     The above exception was the direct cause of the following exception:
     
     ValueError                                Traceback (most recent call last)
     Cell In[379], line 10
           8 tmp=df.loc[bads,:]
           9 print(tmp)
     ---> 10 df.loc[bads,:]=tmp
          
     
     File ~/jupyter/venv/lib/python3.12/site-packages/pandas/core/indexing.py:911, in _LocationIndexer.__setitem__(self, key, value)
         908 self._has_valid_setitem_indexer(key)
         910 iloc = self if self.name == "iloc" else self.obj.iloc
     --> 911 iloc._setitem_with_indexer(indexer, value, self.name)
     
     File ~/jupyter/venv/lib/python3.12/site-packages/pandas/core/indexing.py:1944, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
        1942     self._setitem_with_indexer_split_path(indexer, value, name)
        1943 else:
     -> 1944     self._setitem_single_block(indexer, value, name)
     
     File ~/jupyter/venv/lib/python3.12/site-packages/pandas/core/indexing.py:2218, in _iLocIndexer._setitem_single_block(self, indexer, value, name)
        2215 self.obj._check_is_chained_assignment_possible()
        2217 # actually do the set
     -> 2218 self.obj._mgr = self.obj._mgr.setitem(indexer=indexer, value=value)
        2219 self.obj._maybe_update_cacher(clear=True, inplace=True)
     
     File ~/jupyter/venv/lib/python3.12/site-packages/pandas/core/internals/managers.py:409, in BaseBlockManager.setitem(self, indexer, value, warn)
         405         self._iset_split_block(  # type: ignore[attr-defined]
         406             0, blk_loc, values
         407         )
         408         # first block equals values
     --> 409         self.blocks[0].setitem((indexer[0], np.arange(len(blk_loc))), value)
         410         return self
         411 # No need to split if we either set all columns or on a single block
         412 # manager
     
     File ~/jupyter/venv/lib/python3.12/site-packages/pandas/core/internals/blocks.py:1432, in Block.setitem(self, indexer, value, using_cow)
        1430     except (TypeError, ValueError) as err:
        1431         if is_list_like(casted):
     -> 1432             raise ValueError(
        1433                 "setting an array element with a sequence."
        1434             ) from err
        1435         raise
        1436 return self
     
     ValueError: setting an array element with a sequence.
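
The first ValueError in the traceback (value of shape (2,2) against an indexing result of shape (2,)) can be reproduced with plain NumPy. The sketch below is only an illustration of that shape mismatch, based on the `(indexer[0], np.arange(len(blk_loc)))` index tuple visible at managers.py:409; the array `values` and the names `rows`, `cols`, `paired`, and `block` are assumptions made for the example, not pandas internals.

```python
import numpy as np

# Stand-in for the B/C data from the question (4 rows, 2 columns).
values = np.array([[2, 3], [5, 6], [2, 1], [6, 7]])
rows = np.array([False, True, False, True])  # same mask as `bads`
cols = np.arange(2)                          # positions of both columns

# Two index arrays are paired element-wise, not combined as a grid:
# this picks (row 1, col 0) and (row 3, col 1) only.
paired = values[rows, cols]
print(paired.shape)  # (2,)

# np.ix_ builds the outer product of the two indexers instead.
block = values[np.ix_(rows, cols)]
print(block.shape)   # (2, 2)

# Assigning a (2, 2) value to the paired selection fails like the traceback.
try:
    values[rows, cols] = block
    mismatch = False
except ValueError as err:
    mismatch = True
    print(err)
```

Selecting with `np.ix_` (or indexing rows and columns in two steps) yields the 2x2 region that the assignment in the question was meant to address.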
python-3.x pandas dataframe
1 Answer

This is indeed a bug in copy_on_write mode. It has been confirmed here:

https://github.com/pandas-dev/pandas/issues/60309
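
Until a fix is available, one possible workaround is to take an explicit copy of the column subset before assigning into it. This is a sketch under an assumption: the traceback goes through the branch of `BaseBlockManager.setitem` that handles data still shared with another object, and a `.copy()` should leave `df` with its own data so that branch is not taken. This workaround is not taken from the linked issue, so please verify it on your pandas version.

```python
import pandas as pd

pd.options.mode.copy_on_write = True

dftest = pd.DataFrame({"A": [1, 4, 1, 5], "B": [2, 5, 2, 6], "C": [3, 6, 1, 7]})

# Assumption: the explicit copy drops the shared reference to dftest's data.
df = dftest[["B", "C"]].copy()

# Row-wise mean; equivalent to df.T.mean() in the question.
bads = df.mean(axis=1) > 4

tmp = df.loc[bads, :]
df.loc[bads, :] = tmp  # assigning the same rows back should now succeed
print(df)
```

The cost is one extra copy of the selected columns, which is usually negligible for small frames.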
