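I'm working on a chatbot project in which I store a dataset in Cosmos DB through the MongoDB API and use it from Python to check the similarity between the user query embedding and the chunk embeddings. I get the following error: OperationFailure: Similarity index was not found for a vector similarity search query., full error: {'ok': 0.0, 'errmsg': 'Similarity index was not found for a vector similarity search query.', 'code': 2, 'codeName': 'BadValue'}
Here is the code I am using: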
# simple function to assist with vector search
def vector_search(query, num_result=5):
    query_embedding = get_embedding(query)
    print(len(query_embedding))
    pipeline = [
        {
            '$search': {
                'cosmosSearch': {
                    'vector': query_embedding,
                    'path': '/text_embedding',
                    'k': num_result,
                },
                'returnStoredSource': True
            }
        },
        {'$project': {'similarityScore': {'$meta': 'searchScore'},
                      'document': '$$ROOT'}}
    ]
    results = collection.aggregate(pipeline)
    return list(results)
And this is the code I use to create the new index:
# Here we are using the 1st method.
db.command({
    'createIndexes': 'de_iceing_test',
    'indexes': [
        {
            'name': 'text_embedding',
            'key': {
                'contentVector': 1,
            },
            'cosmosSearchOptions': {
                'kind': 'vector-ivf',
                'numLists': 5,        # number of lists to use in the IVF index
                'similarity': 'COS',  # cosine similarity metric
                'dimensions': 1536    # dimensionality of the embedding vectors
            }
        }
    ]
})
Can you help me resolve this error?
Please help me get rid of the error mentioned above, shown again below:
OperationFailure: Similarity index was not found for a vector similarity search query., full error: {'ok': 0.0, 'errmsg': 'Similarity index was not found for a vector similarity search query.', 'code': 2, 'codeName': 'BadValue'}
According to the official example provided by Microsoft, the value of the index key must be set to cosmosSearch:
{
    "createIndexes": "<collection_name>",
    "indexes": [
        {
            "name": "<index_name>",
            "key": {
                "<path_to_property>": "cosmosSearch"
            },
            "cosmosSearchOptions": {
                "kind": "vector-hnsw",
                "m": <integer_value>,
                "efConstruction": <integer_value>,
                "similarity": "<string_value>",
                "dimensions": <integer_value>
            }
        }
    ]
}
So for your scenario, the index definition should look like this:
db.command({
    'createIndexes': 'de_iceing_test',
    'indexes': [
        {
            'name': 'text_embedding',
            'key': {
                'contentVector': "cosmosSearch",  # here
            },
            'cosmosSearchOptions': {
                'kind': 'vector-ivf',
                'numLists': 5,        # number of lists to use in the IVF index
                'similarity': 'COS',  # cosine similarity metric
                'dimensions': 1536    # dimensionality of the embedding vectors
            }
        }
    ]
})
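One more thing that may be worth checking: in the query, 'path' is set to '/text_embedding', which is the index name, whereas Microsoft's examples pass the path of the property that holds the vector. Below is a minimal sketch that builds both the index command and the search pipeline from a single field name so the two cannot drift apart. It assumes the embeddings are stored in a field called contentVector (as the index key suggests); the helper names build_index_command and build_search_pipeline are made up for illustration and are not part of pymongo.

```python
from typing import Any, Dict, List

# Assumption: the document field that stores the embedding vector.
VECTOR_FIELD = "contentVector"


def build_index_command(collection_name: str,
                        num_lists: int = 5,
                        dimensions: int = 1536) -> Dict[str, Any]:
    """Build a createIndexes command for an IVF vector index."""
    return {
        "createIndexes": collection_name,
        "indexes": [
            {
                "name": "text_embedding",
                # The key value must be the string "cosmosSearch", not 1.
                "key": {VECTOR_FIELD: "cosmosSearch"},
                "cosmosSearchOptions": {
                    "kind": "vector-ivf",
                    "numLists": num_lists,     # number of IVF lists
                    "similarity": "COS",       # cosine similarity
                    "dimensions": dimensions,  # embedding length
                },
            }
        ],
    }


def build_search_pipeline(query_embedding: List[float],
                          k: int = 5) -> List[Dict[str, Any]]:
    """Build the aggregation pipeline for a vector similarity search."""
    return [
        {
            "$search": {
                "cosmosSearch": {
                    "vector": query_embedding,
                    # Same field as the index key, not the index name.
                    "path": VECTOR_FIELD,
                    "k": k,
                },
                "returnStoredSource": True,
            }
        },
        {
            "$project": {
                "similarityScore": {"$meta": "searchScore"},
                "document": "$$ROOT",
            }
        },
    ]
```

With the db and collection objects from the question, these would be used as db.command(build_index_command('de_iceing_test')) and collection.aggregate(build_search_pipeline(get_embedding(query))).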