How do I send a streaming response to a FastAPI endpoint with LlamaIndex?

Question

I need to send a streaming response from my FastAPI endpoint using LlamaIndex. Here is the code I have written so far:

@bot_router.post("/bot/pdf_convo")
async def pdf_convo(query: QuestionInput):
    chat_engine = cache["chat_engine"]
    user_question = query.content
    streaming_response = chat_engine.stream_chat(user_question)
    for token in streaming_response.response_gen:
        print(token, end="")

I would appreciate any guidance on how to implement a streaming response correctly with LlamaIndex. Thanks!

Answer 1

Reference

FastAPI - StreamingResponse

Solution

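To use the StreamingResponse class, you need to create an async generator or a normal generator/iterator and pass it to the StreamingResponse object. In your case, you want to pass streaming_response.response_gen into a function that returns a generator. For example, this is how I did it: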

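from fastapi.responses import StreamingResponse

# bot_router, QuestionInput, and cache are defined elsewhere in the application.

async def response_streamer(response):
    # Re-yield each token so StreamingResponse can send it as it arrives.
    for token in response:
        yield f"{token}"

@bot_router.post("/bot/pdf_convo")
async def pdf_convo(query: QuestionInput):
    chat_engine = cache["chat_engine"]
    user_question = query.content
    streaming_response = chat_engine.stream_chat(user_question)
    # Pass the token generator, not the response object, to StreamingResponse.
    return StreamingResponse(response_streamer(streaming_response.response_gen))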
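
One caveat with the code above: response_streamer is declared async but loops over its input with a plain for loop. If response_gen is an ordinary blocking generator (an assumption, though the plain for loop suggests it), each wait for the next token holds up the event loop. Below is a minimal sketch of one way around that, using only the standard library. The names iterate_without_blocking and fake_response_gen are hypothetical; the latter merely stands in for streaming_response.response_gen.

```python
import asyncio
from typing import AsyncIterator, Iterable, Iterator

_DONE = object()  # sentinel returned by next() once the iterator is exhausted


async def iterate_without_blocking(tokens: Iterable[str]) -> AsyncIterator[str]:
    """Yield items from a blocking iterable without stalling the event loop."""
    iterator = iter(tokens)
    while True:
        # next(iterator, _DONE) runs in a worker thread, so a slow token
        # does not block other coroutines on the same event loop.
        token = await asyncio.to_thread(next, iterator, _DONE)
        if token is _DONE:
            break
        yield token


def fake_response_gen() -> Iterator[str]:
    """Hypothetical stand-in for streaming_response.response_gen."""
    for token in ["Hello", ", ", "world"]:
        yield token


async def collect() -> list:
    # Drain the async generator into a list, as a streaming client would.
    return [token async for token in iterate_without_blocking(fake_response_gen())]


if __name__ == "__main__":
    print(asyncio.run(collect()))  # prints ['Hello', ', ', 'world']
```

In the endpoint this would be used as StreamingResponse(iterate_without_blocking(streaming_response.response_gen)). A simpler alternative is to make response_streamer a plain def generator, since StreamingResponse iterates synchronous iterables in a thread pool on its own. Note that asyncio.to_thread requires Python 3.9 or later.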
