I'm currently using Astro Airflow to insert documents into a vector database. The problem is that whenever I try to load instructor-xl, the task always fails:
Downloading (…)7f436/tokenizer.json: 100%|##########| 2.42M/2.42M [00:00<00:00, 3.82MB/s]
[2023-08-09, 02:30:12 UTC] {logging_mixin.py:149} WARNING -
Downloading (…)okenizer_config.json: 0%| | 0.00/2.40k [00:00<?, ?B/s]
[2023-08-09, 02:30:12 UTC] {logging_mixin.py:149} WARNING -
Downloading (…)okenizer_config.json: 100%|##########| 2.40k/2.40k [00:00<00:00, 7.34MB/s]
[2023-08-09, 02:30:13 UTC] {logging_mixin.py:149} WARNING -
Downloading (…)f57f436/modules.json: 0%| | 0.00/461 [00:00<?, ?B/s]
[2023-08-09, 02:30:13 UTC] {logging_mixin.py:149} WARNING -
Downloading (…)f57f436/modules.json: 100%|##########| 461/461 [00:00<00:00, 1.71MB/s]
[2023-08-09, 02:30:13 UTC] {logging_mixin.py:149} INFO - load INSTRUCTOR_Transformer
[2023-08-09, 02:30:27 UTC] {local_task_job_runner.py:225} INFO - Task exited with return code Negsignal.SIGKILL
[2023-08-09, 02:30:28 UTC] {taskinstance.py:2653} INFO - 0 downstream tasks scheduled from follow-on schedule check
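I'd like to work around this by hosting instructor-xl as a service on Compute Engine, but I'm not sure how to use/call the embeddings once it is running there.
Here is an example of how I already use an LLM as a service; I can call it with langchain.llms.HuggingFaceTextGenInference: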
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFaceTextGenInference

# db is an existing vector store, created elsewhere
retriever = db.as_retriever()
llm = HuggingFaceTextGenInference(
    inference_server_url="http://localhost:8081/",
    max_new_tokens=1024,
    top_k=5,
    top_p=0.8,
    typical_p=0.95,
    temperature=0.5,
    repetition_penalty=1,
)
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
I can't find an equivalent way to do this for embeddings in langchain.
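One possible approach, as a minimal sketch rather than a confirmed LangChain feature: wrap the remote model in a small class that exposes the two methods LangChain expects from an embeddings object, `embed_documents` and `embed_query`, and forwards the texts over HTTP. Everything about the server side here is an assumption: the `/embed` path, the JSON request shape (`inputs`, `instruction`), the response format (a JSON list of vectors), and the class name `RemoteInstructorEmbeddings` are all hypothetical and would have to match whatever service actually hosts instructor-xl. The two instruction strings are only example prompts.

```python
import json
from typing import List
from urllib import request

try:
    # LangChain's abstract embeddings interface, if langchain is installed.
    from langchain.embeddings.base import Embeddings
except ImportError:
    # Fall back to a plain base class so the sketch still runs on its own.
    Embeddings = object


class RemoteInstructorEmbeddings(Embeddings):
    """Hypothetical client for an instructor model served over HTTP.

    Assumes the server accepts POST <server_url>/embed with a JSON body
    {"inputs": [...], "instruction": "..."} and replies with a JSON list
    of embedding vectors, one per input text.
    """

    def __init__(
        self,
        server_url: str,
        embed_instruction: str = "Represent the document for retrieval: ",
        query_instruction: str = "Represent the question for retrieving supporting documents: ",
        timeout: float = 60.0,
    ):
        self.server_url = server_url.rstrip("/")
        self.embed_instruction = embed_instruction
        self.query_instruction = query_instruction
        self.timeout = timeout

    def _post(self, texts: List[str], instruction: str) -> List[List[float]]:
        # Send the texts to the (assumed) /embed endpoint and parse the reply.
        payload = json.dumps({"inputs": texts, "instruction": instruction}).encode("utf-8")
        req = request.Request(
            self.server_url + "/embed",
            data=payload,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with request.urlopen(req, timeout=self.timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One vector per document.
        return self._post(list(texts), self.embed_instruction)

    def embed_query(self, text: str) -> List[float]:
        # A single vector for the query text.
        return self._post([text], self.query_instruction)[0]
```

An instance of this class could then be passed wherever a LangChain vector store takes an embeddings object, so the Airflow worker only makes HTTP calls and never loads the model into its own memory.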
The Airflow task was failing because of the large size of instructor-xl. Switching to instructor-large resolved it.