Elasticsearch - dense vector search is extremely slow

Question Votes: 0 Answers: 1

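I am using Elasticsearch 8 to index e-commerce retail data so that I can find similar or matching products. I generate dense vectors from text and images with OpenAI CLIP, but queries are very slow.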

This is how I define the dense vector fields in the Elasticsearch mapping:

    'image_vector' => [
        'type' => 'dense_vector',
        'dims' => 512,
        'index' => true,
        'similarity' => 'cosine',
        'index_options' => [
            'type' => 'hnsw',
            'm' => 16,
            'ef_construction' => 100,
        ],
    ],

    'text_vector' => [
        'type' => 'dense_vector',
        'dims' => 512,
        'index' => true,
        'similarity' => 'cosine',
        'index_options' => [
            'type' => 'hnsw',
            'm' => 16,
            'ef_construction' => 100,
        ],
    ],

This is how I build the should clauses:

                        $should[] = [
                            'script_score' => [
                                'query' => [
                                    'bool' => [
                                        'filter' => [
                                            // Only consider documents with image_vector
                                            ['exists' => ['field' => 'image_vector']]
                                        ]
                                    ]
                                ],
                                'script' => [
                                    'source' => "
                                        double similarity = doc['image_vector'].size() > 0 ? cosineSimilarity(params.query_vector, 'image_vector') : 0;
                                        return similarity > 0.5 ? (similarity + 1.0) * 1000.0 : 0;
                                    ",
                                    'params' => ['query_vector' => $vector['vector'][0]]
                                ]
                            ]
                        ];

                        // Add script_score for text_vector similarity
                        $should[] = [
                            'script_score' => [
                                'query' => [
                                    'bool' => [
                                        'filter' => [
                                            // Only consider documents with text_vector
                                            ['exists' => ['field' => 'text_vector']]
                                        ]
                                    ]
                                ],
                                'script' => [
                                    'source' => "
                                        double similarity = doc['text_vector'].size() > 0 ? cosineSimilarity(params.query_vector, 'text_vector') : 0;
                                        return similarity > 0.5 ? (similarity + 1.0) * 500.0 : 0;
                                    ",
                                    'params' => ['query_vector' => $vector['vector_text'][0]]
                                ]
                            ]
                        ];


                }
elasticsearch elastic-stack
1 Answer
0 votes

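Elasticsearch supports two methods for kNN search:

  1. Approximate kNN, using the knn search option or the knn query
  2. Exact, brute-force kNN, using a script_score query with a vector function

See https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html#knn-methods

Exact search is too expensive here: script_score computes the similarity for every document that matches the filter and does not use the HNSW index. Use approximate kNN search instead.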
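
To make the cost of the exact method concrete, here is a minimal sketch of brute-force kNN in plain Python. It is not what Elasticsearch runs internally; the function names and the in-memory `documents` dictionary are assumptions for illustration. The point is that every stored vector is scored once per query, so the work grows with the number of matching documents times the vector dimension.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_knn(query_vector, documents, k):
    """Score every document against the query and return the k best (doc_id, score) pairs.

    `documents` is a hypothetical in-memory mapping of doc_id -> vector.
    One similarity computation is performed per document, with no index to skip any.
    """
    scored = (
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in documents.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])
```

An approximate kNN search avoids this full scan by walking the HNSW graph and scoring only a bounded set of candidates per shard.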

Here is an example of a kNN search.

POST image-index/_search
{
  "knn": {
    "field": "image-vector",
    "query_vector": [-5, 9, -12],
    "k": 10,
    "num_candidates": 100
  },
  "fields": [ "title", "file-type" ]
}
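
Applied to the mapping in the question, the same idea might look like the sketch below, which only builds the request body as a Python dictionary. It assumes the field names image_vector and text_vector from the question, and that the Elasticsearch version in use accepts a list of knn clauses together with the per-clause similarity and boost options (these arrived in later 8.x releases, so check the documentation for your version). The boost values 2000 and 1000 are an assumption meant to roughly reproduce the original weights, given that the _score for cosine similarity is (1 + cosine) / 2. The function name build_knn_search_body is hypothetical.

```python
def build_knn_search_body(image_query_vector, text_query_vector, k=10, num_candidates=100):
    """Build a search body with one approximate kNN clause per vector field.

    Field names come from the question's mapping; thresholds and boosts are
    illustrative assumptions, not values taken from the Elasticsearch docs.
    """
    if num_candidates < k:
        raise ValueError("num_candidates must be greater than or equal to k")

    return {
        "knn": [
            {
                "field": "image_vector",
                "query_vector": list(image_query_vector),
                "k": k,
                "num_candidates": num_candidates,
                # Minimum raw cosine similarity, mirroring `similarity > 0.5` in the script.
                "similarity": 0.5,
                # (1 + cosine) / 2 * 2000 == (cosine + 1) * 1000, the original image weight.
                "boost": 2000.0,
            },
            {
                "field": "text_vector",
                "query_vector": list(text_query_vector),
                "k": k,
                "num_candidates": num_candidates,
                "similarity": 0.5,
                # (1 + cosine) / 2 * 1000 == (cosine + 1) * 500, the original text weight.
                "boost": 1000.0,
            },
        ],
        "size": k,
    }
```

The returned dictionary can be sent as the body of a `_search` request. Documents that match both clauses have their scores added together, which is similar to how the two should clauses combined in the original query. A larger num_candidates generally improves recall at the cost of speed.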

For image search, see the following article: https://www.elastic.co/search-labs/blog/implement-image-similarity-search-elastic
