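I'm building a simple event-guide chatbot with the langchain CSV agent or the pandas dataframe agent, and both have the same problem.
To get a good response, you have to ask a very specific question using the exact wording from the dataset. That is a problem for casual chat, because users don't really know the exact wording the dataset contains.
I haven't found a way to steer the pandas query toward "similar" words, or toward rows that merely "contain" a keyword from the prompt...
Example of a dataset row containing the Vegas helicopter tour info: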
Las Vegas Strip Helicopter tour,Self-guided,3500 Las Vegas blvd,Fun in Vegas,March-24,7:00 AM,9:00 AM,
Creating the agent:
df = pd.read_csv("csv-files/csv.csv")

agent = create_pandas_dataframe_agent(
    local_llm,
    df,
    verbose=True,
    # agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # only zero-shot is available; otherwise must use OpenAI functions
)

Asking a general question about an event related to helicopter tours:
res = agent.invoke("Is there any event related to helicopter tours?")
print(res)

Response:
> Entering new AgentExecutor chain...
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Thought: I need to check if there is any event in the dataframe with the word 'helicopter' in the Event Name column.
Action: python_repl_ast
Action Input: df[df['Event Name'].str.contains('helicopter')]
Empty DataFrame
Columns: [Event Name, Host, Location, Event Category, Date, Time start, Time end, Unnamed: 7]
Index: []
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
There is no event related to helicopter tours in the dataframe.
Final Answer: No, there is no event related to helicopter tours in the dataframe.
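^^^^^^^^^^ So this is incorrect: the str.contains query does search for the word "helicopter", and the dataset has a matching row, yet the result comes back empty..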
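One likely cause, sketched here for reference: `Series.str.contains` is case-sensitive by default, and the sample row spells the word "Helicopter" with a capital H, so a search for lowercase 'helicopter' finds nothing. The snippet below rebuilds a one-row frame from the sample row (column names taken from the agent output; the frame itself is only an illustration, not the real CSV) and compares the default call with `case=False`.

```python
import pandas as pd

# Illustrative one-row frame rebuilt from the sample row in this post.
df = pd.DataFrame(
    {
        "Event Name": ["Las Vegas Strip Helicopter tour"],
        "Host": ["Self-guided"],
        "Location": ["3500 Las Vegas blvd"],
        "Event Category": ["Fun in Vegas"],
    }
)

# Default behaviour is case-sensitive: 'helicopter' does not match 'Helicopter'.
exact = df[df["Event Name"].str.contains("helicopter")]

# case=False ignores letter case; na=False treats missing values as non-matches.
relaxed = df[df["Event Name"].str.contains("helicopter", case=False, na=False)]

print(len(exact), len(relaxed))  # prints: 0 1
```

So the query the agent generated is the same one shown in the log; it only needs `case=False` (or lowercasing both sides) to hit the row.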
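If I ask it a question with the exact wording, or about one of the other columns, it does return a response, but the prompt has to be planned very carefully... then it can find the actual event..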
> Entering new AgentExecutor chain...
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Thought: I need to find the event that has the location '3500 Las Vegas blvd'.
Action: python_repl_ast
Action Input: df[df['Location'] == '3500 Las Vegas blvd'].head()
                         Event Name         Host             Location  Event Category      Date  Time start  Time end  Unnamed: 7
37  Las Vegas Strip Helicopter tour  Self-guided  3500 Las Vegas blvd    Fun in Vegas  March-24     7:00 AM   9:00 AM         NaN
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
I now know the final answer
Final Answer: The event 'Las Vegas Strip Helicopter tour' is located at 3500 Las Vegas blvd.
> Finished chain.
{'input': 'Which event is located at 3500 Las Vegas blvd?', 'output': "The event 'Las Vegas Strip Helicopter tour' is located at 3500 Las Vegas blvd."}
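The problem is that this requires exact wording. Does anyone know how to better instruct the CSV / pandas dataframe agent to run similar-match or "contains" style queries?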
Thanks!!
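For anyone landing here, a minimal sketch of the kind of "contains or similar" lookup described above, done in plain pandas plus the standard-library `difflib`. The helper name `search_events`, the 0.8 cutoff, and the rule of skipping words of three letters or fewer are all assumptions made for illustration; nothing here is part of langchain, and how to wire such a helper into the agent is left open.

```python
import difflib

import pandas as pd


def search_events(df: pd.DataFrame, query: str, cutoff: float = 0.8) -> pd.DataFrame:
    """Return rows where any query word appears in, or closely resembles, a word in any column."""
    # Lowercase the query and drop very short filler words (hypothetical rule).
    query_words = [w for w in query.lower().split() if len(w) > 3]

    def row_matches(row: pd.Series) -> bool:
        # Flatten the whole row into lowercase words so every column is searched.
        row_words = " ".join(str(v) for v in row.values).lower().split()
        for q in query_words:
            # Case-insensitive "contains" check.
            if any(q in w for w in row_words):
                return True
            # Approximate match, tolerant of plurals and small typos.
            if difflib.get_close_matches(q, row_words, n=1, cutoff=cutoff):
                return True
        return False

    mask = df.apply(row_matches, axis=1)
    return df[mask]


# Illustrative data: the first row is the sample row from this post, the second is made up.
events = pd.DataFrame(
    {
        "Event Name": ["Las Vegas Strip Helicopter tour", "Downtown Food Walk"],
        "Host": ["Self-guided", "Guided"],
        "Location": ["3500 Las Vegas blvd", "100 Main st"],
        "Event Category": ["Fun in Vegas", "Food"],
    }
)

hits = search_events(events, "helicopter tours")
typo_hits = search_events(events, "helicoptr")
print(hits["Event Name"].tolist())
print(typo_hits["Event Name"].tolist())
```

Both calls return only the helicopter row: the first through the case-insensitive substring check, the second through the `difflib` similarity check. Row-wise `apply` is slow on large frames, so this is only meant for small event lists like this one.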
Did you find a way? I'm running into a similar problem with this agent.