Connection error when initializing BM25Assembler in DB-GPT


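I'm trying to set up a simple example in DB-GPT that uses Elasticsearch as the vector store backend. This is part of the knowledge base initialization, where BM25Assembler is used for document retrieval and ranking.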

I'm running DB-GPT with Ollama and Elasticsearch, all deployed with Docker. Everything looks fine on that side.

However, I get a connection error when setting up the knowledge base in DB-GPT with BM25Assembler. The error occurs during assembler initialization, after running:

python examples/rag/bm25_retriever_example.py

Traceback (most recent call last):
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/connection.py", line 507, in getresponse
    httplib_response = super().getresponse()
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/elastic_transport/_node/_http_urllib3.py", line 167, in perform_request
    response = self.pool.urlopen(
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/util/retry.py", line 449, in increment
    raise reraise(type(error), error, _stacktrace)
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/util/util.py", line 38, in reraise
    raise value.with_traceback(tb)
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/urllib3/connection.py", line 507, in getresponse
    httplib_response = super().getresponse()
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/media/manhdt4/sda1/db-gpt/DB-GPT/examples/rag/bm25_retriever_example.py", line 50, in <module>
    asyncio.run(main())
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/media/manhdt4/sda1/db-gpt/DB-GPT/examples/rag/bm25_retriever_example.py", line 37, in main
    assembler = BM25Assembler.load_from_knowledge(
  File "/media/manhdt4/sda1/db-gpt/DB-GPT/dbgpt/rag/assembler/bm25.py", line 144, in load_from_knowledge
    return cls(
  File "/media/manhdt4/sda1/db-gpt/DB-GPT/dbgpt/rag/assembler/bm25.py", line 110, in __init__
    if not self._es_client.indices.exists(index=self._index_name):
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/elasticsearch/_sync/client/utils.py", line 446, in wrapped
    return api(*args, **kwargs)
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/elasticsearch/_sync/client/indices.py", line 1227, in exists
    return self.perform_request(  # type: ignore[return-value]
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/elasticsearch/_sync/client/_base.py", line 423, in perform_request
    return self._client.perform_request(
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/elasticsearch/_sync/client/_base.py", line 271, in perform_request
    response = self._perform_request(
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/elasticsearch/_sync/client/_base.py", line 316, in _perform_request
    meta, resp_body = self.transport.perform_request(
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/elastic_transport/_transport.py", line 342, in perform_request
    resp = node.perform_request(
  File "/media/manhdt4/sda1/miniconda3/envs/dbgpt/lib/python3.10/site-packages/elastic_transport/_node/_http_urllib3.py", line 202, in perform_request
    raise err from e
elastic_transport.ConnectionError: Connection error caused by: ConnectionError(Connection error caused by: ProtocolError(('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))))

This is the code from the simple example that is related to the error:

    # create bm25 assembler
    assembler = BM25Assembler.load_from_knowledge(
        knowledge=knowledge,
        es_config=es_config,
        chunk_parameters=chunk_parameters,
    )

Configuration:

def _create_es_config():
    """Create vector connector."""
    return ElasticsearchVectorConfig(
        name="bm25_es_dbgpt",
        uri="localhost",
        port="9200",
        user="elastic",
        password="changeme",
    )

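What I tried

  1. Verified that the Elasticsearch container is running properly
  2. Checked the Elasticsearch Docker logs
  3. Verified that Ollama is running correctly
  4. Checked the connection to ELK with the snippet below; it raised no errors: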
from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'], basic_auth=('elastic', 'changeme'))

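What causes this connection error in the DB-GPT context? I just want to run a simple example.

Sorry, I couldn't tag this question with db-gpt because that tag doesn't exist.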

python docker elasticsearch large-language-model ollama
1 Answer

The cause was that I had built ELK with Docker and enabled SSL/TLS:

   environment:
     - node.name=es01
     - cluster.name=${CLUSTER_NAME}
     - discovery.type=single-node
     - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
     - bootstrap.memory_lock=true
     - xpack.security.enabled=true
     - xpack.security.http.ssl.enabled=true
     - xpack.security.http.ssl.key=certs/es01/es01.key
     - xpack.security.http.ssl.certificate=certs/es01/es01.crt
     - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
     - xpack.security.transport.ssl.enabled=true
     - xpack.security.transport.ssl.key=certs/es01/es01.key
     - xpack.security.transport.ssl.certificate=certs/es01/es01.crt
     - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
     - xpack.security.transport.ssl.verification_mode=certificate
     - xpack.license.self_generated.type=${LICENSE}
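
With xpack.security.http.ssl.enabled=true, port 9200 accepts only TLS. Elasticsearch typically closes the connection when it receives a plain http:// request there, which is consistent with the RemoteDisconnected error in the traceback. Constructing Elasticsearch(...) also does not send a request by itself, so a check that only builds the client would not reveal the mismatch. Below is a minimal standard-library sketch for probing whether a port answers a TLS handshake. The function name speaks_tls and the default timeout are illustrative assumptions, not part of DB-GPT or the Elasticsearch client. Certificate verification is deliberately disabled because the probe tests only the protocol, not trust.

```python
import socket
import ssl


def speaks_tls(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if host:port completes a TLS handshake, False otherwise.

    Hypothetical diagnostic helper. It raises OSError if nothing is
    listening at all, so "no TLS" and "no server" stay distinguishable.
    """
    context = ssl.create_default_context()
    # Probe only: skip certificate checks (check_hostname must be disabled first).
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    # A failure here (e.g. connection refused) propagates to the caller.
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        try:
            with context.wrap_socket(raw_sock, server_hostname=host):
                return True
        except (ssl.SSLError, OSError):
            # Handshake failed: the peer is not speaking TLS on this port,
            # or it dropped the connection before the handshake finished.
            return False
```

For the setup above, speaks_tls("localhost", 9200) returning True would suggest that the client URL needs to start with https://.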

DB-GPT's BM25Assembler does not support connecting to ELK when SSL is enabled. Here are the modifications I made to connect successfully:

  1. In bm25_retriever_example.py, pass ca_certs to ElasticsearchVectorConfig:
def _create_es_config():
    """Create vector connector."""
    return ElasticsearchVectorConfig(
        name="bm25_es_dbgpt",
        uri="127.0.0.1",
        port="9200",
        user="elastic",
        password="changeme",
        ca_certs="/path/to/cert/ca.crt",
    )
  2. The __init__ method of the BM25Assembler(BaseAssembler) class. The code is taken from the library; the lines marked with comments are my changes:
...
...
        self._es_config = es_config
        self._es_url = es_config.uri
        self._es_port = es_config.port
        self._es_username = es_config.user
        self._es_password = es_config.password
        self._index_name = es_config.name
        self._k1 = k1
        self._b = b
        self._ca_certs = es_config.ca_certs # my changes
        if self._es_username and self._es_password and self._ca_certs: # my changes
            self._es_client = Elasticsearch( # my changes
                hosts=f"https://{self._es_url}:{self._es_port}", # my changes
                basic_auth=(self._es_username, self._es_password), # my changes
                verify_certs=True, # my changes
                ca_certs=self._ca_certs, # my changes
            )
        elif self._es_username and self._es_password and not self._ca_certs: # my changes
            self._es_client = Elasticsearch(
                hosts=[f"http://{self._es_url}:{self._es_port}"],
                basic_auth=(self._es_username, self._es_password),
            )          
        else:
            self._es_client = Elasticsearch(
                hosts=[f"http://{self._es_url}:{self._es_port}"],
            )
...
...
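
The three branches above could also be collapsed into a single helper that builds the keyword arguments for the client. This is only a sketch: build_es_client_kwargs is a hypothetical name, not part of DB-GPT. It also differs from the patch above in one respect: when ca_certs is set it switches to https even if no username and password are given.

```python
from typing import Any, Dict, Optional


def build_es_client_kwargs(
    uri: str,
    port: str,
    user: Optional[str] = None,
    password: Optional[str] = None,
    ca_certs: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for elasticsearch.Elasticsearch (hypothetical helper)."""
    # A CA bundle implies the node serves HTTPS; otherwise fall back to HTTP.
    scheme = "https" if ca_certs else "http"
    kwargs: Dict[str, Any] = {"hosts": [f"{scheme}://{uri}:{port}"]}

    # Add basic auth only when both credentials are present.
    if user and password:
        kwargs["basic_auth"] = (user, password)

    # Verify the server certificate against the given CA file.
    if ca_certs:
        kwargs["verify_certs"] = True
        kwargs["ca_certs"] = ca_certs

    return kwargs


# Usage (requires the elasticsearch package):
#   from elasticsearch import Elasticsearch
#   client = Elasticsearch(**build_es_client_kwargs(
#       "127.0.0.1", "9200", "elastic", "changeme", "/path/to/cert/ca.crt"))
```

Keeping the scheme and TLS options in one place makes it harder for the URL and the certificate settings to drift apart.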