Recreating ChromaDB embeddings: sqlite3.OperationalError: unable to open database file


I am developing a RAG application that provides a custom GPT bot service, and I store the file URLs that the GPT uses to answer user queries.


I store the embeddings for each bot_id separately. Each bot's embeddings live in their own folder and are retrieved according to the bot_id in use.


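When a user changes the file URL, I delete that bot's existing ChromaDB folder and recreate the embeddings from the new file URL. While the embeddings are being recreated, the following error appears: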

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/client.py", line 438, in _validate_tenant_database
    self._admin_client.get_tenant(name=tenant)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/client.py", line 486, in get_tenant
    return self._server.get_tenant(name=name)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/segment.py", line 140, in get_tenant
    return self._sysdb.get_tenant(name=name)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/mixins/sysdb.py", line 125, in get_tenant
    with self.tx() as cur:
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/impl/sqlite.py", line 131, in tx
    return TxWrapper(self._conn_pool, stack=self._tx_stack)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/impl/sqlite.py", line 31, in __init__
    self._conn = conn_pool.connect()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/impl/sqlite_pool.py", line 141, in connect
    new_connection = Connection(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/impl/sqlite_pool.py", line 20, in __init__
    self._conn = sqlite3.connect(
sqlite3.OperationalError: unable to open database file

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 1463, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 872, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 870, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 855, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/home/ubuntu/chatbot/main.py", line 460, in qa
    message = storeEmbeddings(embedding_model, raw_text, bot_id)
  File "/home/ubuntu/chatbot/embeddings.py", line 12, in storeEmbeddings
    db = Chroma.from_documents(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 778, in from_documents
    return cls.from_texts(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 714, in from_texts
    chroma_collection = cls(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 120, in __init__
    self._client = chromadb.Client(_client_settings)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/__init__.py", line 274, in Client
    return ClientCreator(tenant=tenant, database=database, settings=settings)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/client.py", line 144, in __init__
    self._validate_tenant_database(tenant=tenant, database=database)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/client.py", line 447, in _validate_tenant_database
    raise ValueError(
ValueError: Could not connect to tenant default_tenant. Are you sure it exists?

Even though the folder was deleted successfully, it still seems to be trying to access that bot's old ChromaDB. I delete the folder as follows:

import shutil
shutil.rmtree("Embeddings/1001")
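The lowest frame of the traceback is a plain sqlite3.connect call, so the same error can be reproduced without Chroma. The sketch below uses only the standard library and assumes (this is not confirmed by the question) that something in the running process still points at a database path inside the removed folder. The file name chroma.sqlite3 and the helper can_open are illustrative.

```python
import os
import shutil
import sqlite3
import tempfile


def can_open(db_path):
    """Return True if SQLite can open (or create) a database at db_path."""
    try:
        conn = sqlite3.connect(db_path)
        conn.execute("SELECT 1")
        conn.close()
        return True
    except sqlite3.OperationalError:
        # Raised with "unable to open database file" when the parent
        # directory of db_path does not exist.
        return False


# Illustrative layout: <base>/1001/chroma.sqlite3
base = tempfile.mkdtemp()
bot_dir = os.path.join(base, "1001")
os.makedirs(bot_dir)
db_path = os.path.join(bot_dir, "chroma.sqlite3")

opened_before = can_open(db_path)     # directory exists
shutil.rmtree(bot_dir)
opened_after = can_open(db_path)      # directory removed
os.makedirs(bot_dir)
opened_recreated = can_open(db_path)  # directory recreated

shutil.rmtree(base)
```

If this is what is happening, any new connection to the old path would fail until the directory exists again or the process drops whatever state still refers to it, which would be consistent with a restart clearing the problem.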

The function that creates and stores the embeddings:

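import os

from chromadb.config import Settings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

def storeEmbeddings(embedding, text, bot_id, embedding_folder):
    try:
        # Split the raw text into overlapping chunks before embedding.
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
        texts = text_splitter.create_documents([text])

        # Each bot gets its own persist directory, e.g. Embeddings/1001.
        db = Chroma.from_documents(
            texts,
            embedding,
            persist_directory=os.path.join(embedding_folder, str(bot_id)),
            client_settings=Settings(anonymized_telemetry=False, is_persistent=True),
        )

        return sucessMessage  # its definition is not shown in this snippet
    except Exception as e:
        return str(e)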

The strangest part is that if I stop and restart the Python application at this point, it does recreate the embeddings for that bot.

What is the best way to delete existing ChromaDB embeddings and create new ones for the new documents?

python sqlite langchain chromadb
1 Answer

I ran into the same error:

ValueError: Could not connect to tenant default_tenant. Are you sure it exists?

To work around it, I installed an older version of Chroma, specifically chromadb==0.4.9, which resolved the problem for me.
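A possible alternative to downgrading, offered only as a sketch: instead of removing the folder on disk, drop the bot's collection through the client and recreate it, so the database file is never deleted while the process is running. The helper name reset_collection and the collection name are hypothetical, and I have not verified this against the asker's setup. It relies only on the client methods list_collections, delete_collection and get_or_create_collection.

```python
def reset_collection(client, name):
    """Drop the named collection if it exists, then return a fresh empty one.

    `client` is assumed to be a chromadb client (or any object exposing the
    same three methods used below).
    """
    existing = set()
    for item in client.list_collections():
        # Depending on the chromadb version, items are Collection objects
        # or plain collection names.
        existing.add(item if isinstance(item, str) else item.name)

    if name in existing:
        client.delete_collection(name)

    return client.get_or_create_collection(name)
```

With a real client this might be called as reset_collection(chromadb.PersistentClient(path="Embeddings/1001"), "bot_1001"), and the same client could then be handed to LangChain's Chroma wrapper rather than letting it build its own from persist_directory. Whether that fully avoids the error here is an assumption.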
