InvokeAgent API - Sending an image to an AWS Bedrock agent


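Is there a way to send an image file to a Bedrock agent through the InvokeAgent API, either as base64 or as a link to an S3 location?

I noticed that the request syntax includes this part: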

response = client.invoke_agent(
    agentAliasId='string',
    agentId='string',
    enableTrace=True|False,
    endSession=True|False,
    inputText='string',
    memoryId='string',
    sessionId='string',
    sessionState={
        'files': [
            {
                'name': 'string',
                'source': {
                    'byteContent': {
                        'data': b'bytes',
                        'mediaType': 'string'
                    },
                    's3Location': {
                        'uri': 'string'
                    },
                    'sourceType': 'S3'|'BYTE_CONTENT'
                },
                'useCase': 'CODE_INTERPRETER'|'CHAT'
            },
        ],

Unfortunately, I could not find any documentation or examples showing how to fill in these fields.

I tried passing a base64-encoded image:

 'files': [
            {
                'name': 'uploaded_picture',
                'source': {
                    'byteContent': {
                        'data': b'{base64}',
                        'mediaType': 'image/jpeg'
                    },

                    'sourceType': 'BYTE_CONTENT'
                },
                'useCase': 'CHAT'
            },
        ],

But Bedrock returned this error:

EventStreamError: An error occurred (validationException) when calling the InvokeAgent operation: The overridden prompt that you provided is incorrectly formatted. Check the format for errors, such as invalid JSON, and retry your request.

The same error also occurs in the Bedrock UI.


Here is the complete Python code:

import base64
import uuid

import boto3

# Initialize the boto3 client
agents_runtime_client = boto3.client('bedrock-agent-runtime', region_name='sa-east-1')


def invoke_agent(agent_id, agent_alias_id, session_id, prompt, image_path=None):
    # Read the image and Base64-encode it into a string
    image_data = None
    if image_path:
        with open(image_path, "rb") as image_file:
            image_data = base64.b64encode(image_file.read()).decode('utf-8')

    # Note: The execution time depends on the foundation model, complexity of the agent,
    # and the length of the prompt. In some cases, it can take up to a minute or more to
    # generate a response.
    response = agents_runtime_client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id,
        inputText=prompt,
        sessionState={
            'files': [
                {
                    'name': 'picture',
                    'source': {
                        'byteContent': {
                            'data': image_data,
                            'mediaType': 'image/jpeg'
                        },
                        'sourceType': 'BYTE_CONTENT'
                    },
                    'useCase': 'CHAT'
                }
            ]
        }
    )

    # Concatenate the streamed chunks into the final completion text
    completion = ""
    for event in response.get("completion"):
        chunk = event["chunk"]
        completion = completion + chunk["bytes"].decode()

    return completion


# Define the chat parameters
agent_alias_id = "XXXXX"
agent_id = "XXXXX"
session_id = str(uuid.uuid4())
input_text = "Hello"

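I tried reading the documentation and using the parameters it provides.

I expected to be able to send an image for the agent to interpret.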

python file-upload large-language-model amazon-bedrock
1 Answer

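Because this is a fairly new API, the error message appears to be a generic one. To pass a file, read it in binary mode and pass the raw bytes directly, instead of encoding it with Base64 and decoding the result into a string; that resolves the error. Also make sure the MIME type (mediaType) matches the file you are passing.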
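As a small illustration of why the Base64 step gets in the way, assuming a JPEG file: the raw bytes begin with the JPEG signature, while the Base64 text is a different sequence of characters that no longer looks like an image. The three sample bytes below are only a stand-in for a real file.

```python
import base64

# Stand-in for the first bytes of a real JPEG file (the JPEG signature).
raw = bytes([0xFF, 0xD8, 0xFF])

# What the question's code ends up passing as 'data': Base64 text, not image bytes.
encoded_text = base64.b64encode(raw).decode("utf-8")

print(raw[:2] == b"\xff\xd8")                           # True: signature intact
print(encoded_text)                                     # /9j/
print(encoded_text.encode("utf-8")[:2] == b"\xff\xd8")  # False: signature is gone
```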

Change:

with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode('utf-8')

To:

with open(image_path, "rb") as image_file:
    image_data = image_file.read()
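
Putting the change together, a minimal sketch of the corrected call might look like the following. The helper names `build_file_entry` and `invoke_agent_with_image` are hypothetical, the media type is guessed from the file name with the standard `mimetypes` module (an assumption, not something the API requires), and `client` is assumed to be one created with `boto3.client('bedrock-agent-runtime')`. The request shape is the one from the syntax quoted in the question.

```python
import mimetypes
from pathlib import Path


def build_file_entry(image_path, use_case="CHAT"):
    """Build one 'files' entry holding the raw bytes of a local file (hypothetical helper)."""
    path = Path(image_path)
    # Guess the MIME type from the file extension; refuse to guess blindly.
    media_type, _ = mimetypes.guess_type(path.name)
    if media_type is None:
        raise ValueError(f"Cannot determine media type for {path.name}")
    return {
        "name": path.name,
        "source": {
            "byteContent": {
                "data": path.read_bytes(),  # raw bytes, no Base64 step
                "mediaType": media_type,
            },
            "sourceType": "BYTE_CONTENT",
        },
        "useCase": use_case,
    }


def invoke_agent_with_image(client, agent_id, agent_alias_id, session_id, prompt, image_path):
    """Send a prompt plus one file and return the concatenated text chunks."""
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id,
        inputText=prompt,
        sessionState={"files": [build_file_entry(image_path)]},
    )
    completion = ""
    for event in response.get("completion", []):
        # Skip events that carry no text chunk.
        if "chunk" in event:
            completion += event["chunk"]["bytes"].decode()
    return completion
```

Passing the client in as an argument keeps the helper easy to try out with a stub before calling the real service; the agent IDs, prompt and file path are whatever your own setup uses.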