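When using OpenAI's Structured Outputs, should the examples in a one-shot or few-shot prompt follow the JSON format of the structured output before parsing?
For example, consider the following code, which uses a one-shot prompt: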
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        # One-shot example: input text followed by the expected JSON output
        {"role": "system", "content": "Alice and Bob are going to a science fair on Friday."},
        {"role": "system", "content": '{"name":"Science Fair","date":"Friday","participants":["Alice","Bob"]}'},
        # The actual input to extract from
        {"role": "user", "content": "Charlie is going to a concert on Saturday night."},
    ],
    response_format=CalendarEvent,
)

# Parsed result as a CalendarEvent instance
event = completion.choices[0].message.parsed
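In this example, the prompt "Alice and Bob are going to a science fair on Friday." is accompanied by the corresponding output:
{"name":"Science Fair","date":"Friday","participants":["Alice","Bob"]}
My question is: do these examples need to be written in this JSON format, or can they be presented in some other format? What is the best practice?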
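When using Structured Outputs, you usually don't need to provide one-shot or few-shot examples. By passing the Pydantic model as response_format=CalendarEvent, you have already included the model's information in the final prompt.
You can still improve the results by adding descriptions to the model and its fields, since these are also included in the final prompt.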
from pydantic import BaseModel, Field


class CalendarEvent(BaseModel):
    """Calendar event description"""

    name: str = Field(..., description="The name or title of the event")
    date: str = Field(..., description="The date of the event as a weekday")
    participants: list[str] = Field(..., description="A list of participants' names")
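To see what the docstring and field descriptions contribute, one option is to inspect the JSON Schema that Pydantic generates from the model. The following is a minimal sketch, assuming Pydantic v2 (where `model_json_schema()` is available); it only shows the schema produced locally and does not demonstrate how the API presents that schema to the language model.

```python
from pydantic import BaseModel, Field


class CalendarEvent(BaseModel):
    """Calendar event description"""

    name: str = Field(..., description="The name or title of the event")
    date: str = Field(..., description="The date of the event as a weekday")
    participants: list[str] = Field(..., description="A list of participants' names")


# Pydantic v2: build the JSON Schema for the model.
schema = CalendarEvent.model_json_schema()

# The class docstring becomes the top-level description,
# and each Field description is attached to its property.
print(schema["description"])
for field_name, field_schema in schema["properties"].items():
    print(f"{field_name}: {field_schema['description']}")
```

If a description is missing from this output, it is not part of the generated schema.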
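If you do want to use one-shot or few-shot prompting, the JSON format above is appropriate, but consider improving your system message with labels such as the following: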
messages=[
    {"role": "system", "content": "Extract the event information."},
    # The example lines stay flush-left so no indentation ends up inside the string.
    {"role": "system", "content": """Examples:
User: Alice and Bob are going to a science fair on Friday.
Response: {"name":"Science Fair","date":"Friday","participants":["Alice","Bob"]}"""},
    {"role": "user", "content": "Charlie is going to a concert on Saturday night."},
]
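Another layout sometimes used for few-shot prompting is to pass each example as a user/assistant message pair rather than embedding it in a system message. The sketch below assumes that layout and is not taken from the answer above; `build_messages` is a hypothetical helper, not part of the OpenAI SDK. It serializes each example output with `json.dumps`, so the example has the same JSON shape as the expected response.

```python
import json


def build_messages(instruction, examples, query):
    """Return a chat message list with few-shot examples as user/assistant pairs.

    `examples` is a list of (input_text, output_dict) tuples. This helper is
    hypothetical and only illustrates one possible message layout.
    """
    messages = [{"role": "system", "content": instruction}]
    for input_text, output in examples:
        messages.append({"role": "user", "content": input_text})
        # Serialize the example output as JSON, matching the expected response shape.
        messages.append({"role": "assistant", "content": json.dumps(output)})
    messages.append({"role": "user", "content": query})
    return messages


messages = build_messages(
    "Extract the event information.",
    [
        (
            "Alice and Bob are going to a science fair on Friday.",
            {"name": "Science Fair", "date": "Friday", "participants": ["Alice", "Bob"]},
        )
    ],
    "Charlie is going to a concert on Saturday night.",
)
```

The resulting list could then be passed as `messages=messages` to the same `parse` call, with `response_format=CalendarEvent` unchanged.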