ValueError: Wrong number of items passed 9, placement implies 1

Question

I am trying to compute the cosine similarity between two columns of a DataFrame. The code snippet is as follows:

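import numpy as np
from sklearn.metrics.pairwise import cosine_similarity  # assumed source of cosine_similarity

def cal_cosine_similarity(row):
    # Compare the two vector columns of a single row.
    vec1 = np.array(row['sup_vec'])
    vec2 = np.array(row['vector'])
    return cosine_similarity([vec1], [vec2])[0][0]

cross_join_df['cos_sim'] = cross_join_df.apply(cal_cosine_similarity,axis = 1)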

This works fine most of the time, but sometimes I get an error like the following:

Traceback (most recent call last):
 File "/usr/local/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2898, in get_loc
   return self._engine.get_loc(casted_key)
 File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
 File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
 File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
 File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'cos_sim'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
 File "/usr/local/lib/python3.8/site-packages/pandas/core/generic.py", line 3576, in _set_item
   loc = self._info_axis.get_loc(key)
 File "/usr/local/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2900, in get_loc
   raise KeyError(key) from err
KeyError: 'cos_sim'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
 File "/opt/prism/src/main.py", line 79, in <module>
   res = job.run()
 File "/opt/prism/src/jobs/v2/SparkJob.py", line 45, in run
   self.start()
 File "/opt/prism/src/jobs/v2/SparkJob.py", line 71, in start
   raise e
 File "/opt/prism/src/jobs/v2/SparkJob.py", line 68, in start
   self.execute(self.input_data, 1)
 File "/opt/prism/src/jobs/v2/DprmMappingInference.py", line 289, in execute
   cross_join_df['cos_sim'] = cross_join_df.apply(cal_cosine_similarity,axis = 1)
 File "/usr/local/lib/python3.8/site-packages/pandas/core/frame.py", line 3044, in __setitem__
   self._set_item(key, value)
 File "/usr/local/lib/python3.8/site-packages/pandas/core/frame.py", line 3121, in _set_item
   NDFrame._set_item(self, key, value)
 File "/usr/local/lib/python3.8/site-packages/pandas/core/generic.py", line 3579, in _set_item
   self._mgr.insert(len(self._info_axis), key, value)
 File "/usr/local/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 1198, in insert
   block = make_block(values=value, ndim=self.ndim, placement=slice(loc, loc + 1))
 File "/usr/local/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 2744, in make_block
   return klass(values, ndim=ndim, placement=placement)
 File "/usr/local/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 2400, in __init__
   super().__init__(values, ndim=ndim, placement=placement)
 File "/usr/local/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 130, in __init__
   raise ValueError(
ValueError: Wrong number of items passed 9, placement implies 1

I can't track down this error. Is it caused by something in the cosine similarity function?

python dataframe machine-learning recommendation-engine cosine-similarity
1 Answer

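You said the code fails only some of the time. One possible reason is that cross_join_df somehow has no 'cos_sim' key at the point of assignment. Keep in mind, though, that the traceback goes through pandas' DataFrame.__setitem__, and a pandas DataFrame normally creates a new column when you assign to a new key, so the KeyError shown there is handled internally and the final ValueError is what stops the job. I'm not sure exactly how cross_join_df is built, but in general you can use the following function to check whether a cos_sim entry exists: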

import pandas as pd

def check_have_entry(obj, key):
    # Return True if `key` is a dict key or a DataFrame column label.
    if isinstance(obj, dict):
        return key in obj
    elif isinstance(obj, pd.DataFrame):
        return key in obj.columns  # columns, not the row index
    return False

However, if cross_join_df is some other kind of object, the function above will not help: it simply returns False.
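One plausible explanation, which the question does not confirm, is that cross_join_df is occasionally empty. In the pandas version shown in the traceback, apply with axis = 1 on a frame with no rows can return a copy of the whole frame instead of a Series, and assigning a 9-column frame to the single 'cos_sim' column would produce exactly this ValueError. The sketch below sidesteps apply and always builds a one-dimensional Series. The helper names cosine and add_cos_sim and the demo data are hypothetical, and cosine is a plain NumPy stand-in, not the cosine_similarity function used in the question.

```python
import numpy as np
import pandas as pd


def cosine(vec1, vec2):
    """Cosine similarity of two 1-D vectors (plain NumPy stand-in)."""
    a = np.asarray(vec1, dtype=float)
    b = np.asarray(vec2, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Assumption: a zero-norm vector gets similarity 0.0.
    return float(a @ b / denom) if denom else 0.0


def add_cos_sim(df):
    """Return a copy of df with a 'cos_sim' column, even if df is empty."""
    sims = [cosine(a, b) for a, b in zip(df["sup_vec"], df["vector"])]
    out = df.copy()
    # An explicit Series keeps the result one-dimensional for zero rows.
    out["cos_sim"] = pd.Series(sims, index=df.index, dtype=float)
    return out


# Hypothetical data: one parallel pair, one orthogonal pair.
demo = pd.DataFrame({
    "sup_vec": [[1.0, 0.0], [1.0, 0.0]],
    "vector": [[2.0, 0.0], [0.0, 3.0]],
})
scored = add_cos_sim(demo)
empty_scored = add_cos_sim(demo.iloc[0:0])
```

If the empty-frame case is indeed the trigger, checking cross_join_df.empty before the assignment and logging it would confirm the diagnosis.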
