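Is there a way to send an image file to a Bedrock Agent through the InvokeAgent API, either as base64 or as a link to an S3 bucket?
I noticed that the request syntax includes this part: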
response = client.invoke_agent(
    agentAliasId='string',
    agentId='string',
    enableTrace=True|False,
    endSession=True|False,
    inputText='string',
    memoryId='string',
    sessionId='string',
    sessionState={
        'files': [
            {
                'name': 'string',
                'source': {
                    'byteContent': {
                        'data': b'bytes',
                        'mediaType': 'string'
                    },
                    's3Location': {
                        'uri': 'string'
                    },
                    'sourceType': 'S3'|'BYTE_CONTENT'
                },
                'useCase': 'CODE_INTERPRETER'|'CHAT'
            },
        ],
Unfortunately, I could not find any documentation or examples showing how to fill in these fields.
I tried passing a base64-encoded image:
'files': [
    {
        'name': 'uploaded_picture',
        'source': {
            'byteContent': {
                'data': b'{base64}',
                'mediaType': 'image/jpeg'
            },
            'sourceType': 'BYTE_CONTENT'
        },
        'useCase': 'CHAT'
    },
],
But Bedrock returned this error:
EventStreamError: An error occurred (validationException) when calling the InvokeAgent operation: The overridden prompt that you provided is incorrectly formatted. Check the format for errors, such as invalid JSON, and retry your request.
The same error also appears in the Bedrock UI.
Here is the complete Python code:
import base64
import uuid

import boto3

# Initialize the boto3 client
agents_runtime_client = boto3.client('bedrock-agent-runtime', region_name='sa-east-1')

def invoke_agent(agent_id, agent_alias_id, session_id, prompt, image_path=None):
    image_data = None
    if image_path:
        with open(image_path, "rb") as image_file:
            image_data = base64.b64encode(image_file.read()).decode('utf-8')

    # Note: The execution time depends on the foundation model, complexity of the agent,
    # and the length of the prompt. In some cases, it can take up to a minute or more to
    # generate a response.
    response = agents_runtime_client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id,
        inputText=prompt,
        sessionState={
            'files': [
                {
                    'name': 'picture',
                    'source': {
                        'byteContent': {
                            'data': image_data,
                            'mediaType': 'image/jpeg'
                        },
                        'sourceType': 'BYTE_CONTENT'
                    },
                    'useCase': 'CHAT'
                }
            ]
        }
    )

    # Concatenate the streamed response chunks into a single string
    completion = ""
    for event in response.get("completion"):
        chunk = event["chunk"]
        completion = completion + chunk["bytes"].decode()
    return completion

# Define the chat parameters
agent_alias_id = "XXXXX"
agent_id = "XXXXX"
session_id = str(uuid.uuid4())
input_text = "Hello"
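I tried reading the documentation and using the parameters it lists.
I expected to be able to send an image for the agent to interpret.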
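It seems that, because this is a fairly new API, the error message is a generic one. To pass a file, read it in binary mode and pass the raw bytes, rather than encoding it with Base64 and decoding the result to a string; that solves the problem. Also make sure that the MIME type (mediaType) of the file you pass is correct. For example: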
Change:
with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode('utf-8')
To:
with open(image_path, "rb") as image_file:
    image_data = image_file.read()
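
To make the two points above concrete (raw bytes and a correct mediaType), here is a minimal sketch of how one entry of sessionState['files'] could be assembled. The helper names build_byte_file_entry and build_s3_file_entry are hypothetical and not part of boto3; the dictionary keys are taken from the request syntax quoted in the question. Guessing the media type from the file extension with the standard mimetypes module is an assumption of this sketch, and I have not verified which media types or use cases the agent accepts for S3 sources.

```python
import mimetypes
from pathlib import Path


def build_byte_file_entry(path, name=None, use_case="CHAT"):
    """Build one sessionState['files'] entry from a local file.

    Hypothetical helper: it only assembles the dictionary, it does not call AWS.
    """
    file_path = Path(path)
    # Assumption: the media type can be inferred from the file extension.
    media_type, _ = mimetypes.guess_type(file_path.name)
    if media_type is None:
        raise ValueError(f"Cannot determine media type for {file_path.name!r}")
    return {
        "name": name or file_path.name,
        "source": {
            "byteContent": {
                # Raw bytes read in binary mode, not a base64 string
                "data": file_path.read_bytes(),
                "mediaType": media_type,
            },
            "sourceType": "BYTE_CONTENT",
        },
        "useCase": use_case,
    }


def build_s3_file_entry(uri, name, use_case="CHAT"):
    """Build one sessionState['files'] entry that points at an S3 object.

    Hypothetical helper mirroring the 's3Location' branch of the request syntax.
    """
    if not uri.startswith("s3://"):
        raise ValueError(f"Expected an s3:// URI, got {uri!r}")
    return {
        "name": name,
        "source": {
            "s3Location": {"uri": uri},
            "sourceType": "S3",
        },
        "useCase": use_case,
    }
```

The resulting dictionary would then be passed as sessionState={'files': [entry]} in the invoke_agent call from the question, in place of the hand-written 'files' list.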