I am running the following code on a GPU, launched with:

streamlit run replicate_lama2.py
import os
import traceback
import sys
import streamlit as st
os.environ["REPLICATE_API_TOKEN"] = "my_key"
from llama_index.llms import Replicate
llama2_7b_chat = "meta/llama-2-7b-chat:8e6975e5ed6174911a6ff3d60540dfd4844201974602551e10e9e87ab143d81e"
llm = Replicate(
    model=llama2_7b_chat,
    temperature=0.01,
    additional_kwargs={"top_p": 1, "max_new_tokens": 300},
)
from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index import ServiceContext
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model=embed_model
)
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)
#index = VectorStoreIndex.from_documents(documents)
# Get the query engine
query_engine = index.as_query_engine(streaming=True)
# Create a Streamlit web app
#st.title("LLM Query Interface")
query = st.text_input("Enter your query:")
submit_button = st.button("Submit")
if submit_button:
    # Query the engine with the defined query
    response = query_engine.query(query)
    st.write("### Query Result:")
    st.write(response)
My "data" directory, which sits in the same directory as the replicate_lama2.py script, contains the files I want to run queries against. When I run the program, the chat opens in a web browser (Firefox) and I can ask a question (one that can definitely be answered from the documents in the data directory), but instead of any answer I get the output shown in the attachment, which says "no documents available". How do I fix this?
The object type is:

class llama_index.core.base.response.schema.StreamingResponse(
    response_gen: Generator[str, None, None],
    source_nodes: List[NodeWithScore] = [],
    metadata: Optional[Dict[str, Any]] = None,
    response_txt: Optional[str] = None,
)

So st.write(response) renders the StreamingResponse object itself rather than the answer text. Pull the text out of the object instead, e.g. st.write(response.response) on a non-streaming Response; note that the StreamingResponse signature above exposes no response attribute, so with streaming=True you need to drain response_gen first, as sketched below.
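A minimal sketch of the submit handler under that fix, keeping the rest of the script unchanged (the variable name answer_text is mine); it drains response_gen, which per the signature above is a plain generator of str, and displays the accumulated text:

if submit_button:
    # streaming=True makes query() return a StreamingResponse,
    # whose text arrives token by token through response_gen.
    response = query_engine.query(query)
    st.write("### Query Result:")
    # Drain the generator into a single string, then display it.
    answer_text = "".join(response.response_gen)
    st.write(answer_text)

On newer Streamlit versions (1.31+, if I remember correctly), st.write_stream(response.response_gen) should render the tokens incrementally instead of waiting for the full answer.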