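I'm trying to use ConverseCommand to upload an image to AWS Bedrock as part of a conversation (with @aws-sdk/client-bedrock-runtime v3.686). According to the docs, it needs the bytes as a Uint8Array, which I believe is the same as a Buffer. However, I've tried many permutations and combinations of .buffer and new TextEncoder().encode(imageBuffer), and the model always replies: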
I'm sorry, but you haven't actually shared an image with me yet. Could you please upload an image and I'll take a look at it?
It feels like this should be trivial, but I can't find any working code using @aws-sdk/client-bedrock-runtime that includes an image or document as part of a message. Any ideas?
const fs = require("fs");
const {
  BedrockRuntimeClient,
  ConverseCommand,
} = require("@aws-sdk/client-bedrock-runtime");

const modelId = "anthropic.claude-3-haiku-20240307-v1:0";
let conversation = [];

const askQuestion = async () => {
  const client = new BedrockRuntimeClient({
    region: "us-east-1",
    credentials: {
      accessKeyId: process.env.AWS_ACCESS_KEY_ID,
      secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
    },
  });
  let content = {};
  const imageBuffer = fs.readFileSync("./butterfly.png");
  content.image = {
    format: "png",
    source: {
      bytes: imageBuffer.buffer,
    },
  };
  content.text = "What's in this image?";
  conversation.push({
    role: "user",
    content: [content],
  });
  try {
    const response = await client.send(
      new ConverseCommand({
        modelId,
        messages: conversation,
      })
    );
    conversation.push(response.output.message);
    console.log(response.output.message.content[0].text);
    return response.output.message.content[0].text;
  } catch (err) {
    console.log(`ERROR: Can't invoke '${modelId}'. Reason: ${err}`);
  }
};

askQuestion();
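The ContentBlock is processed as text because both the text and image fields are set on the same block.
According to the type definition, an image content block must not set the text property: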
export interface ImageMember {
  text?: never;
  image: ImageBlock;
  document?: never;
  toolUse?: never;
  toolResult?: never;
  guardContent?: never;
  $unknown?: never;
}
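Since plain JavaScript does not enforce this type, a small runtime check can catch the mistake before the request is sent. The following is only a sketch: `findOverloadedBlocks` and `MEMBER_KEYS` are hypothetical names, not part of the SDK, and the list of member names is copied from the interface above.

```javascript
// Hypothetical helper (not part of the AWS SDK): reports content blocks that
// do not set exactly one union member, e.g. blocks with both `text` and `image`.
const MEMBER_KEYS = ["text", "image", "document", "toolUse", "toolResult", "guardContent"];

function findOverloadedBlocks(messages) {
  const problems = [];
  messages.forEach((message, messageIndex) => {
    message.content.forEach((block, blockIndex) => {
      // Collect every known member that is actually set on this block.
      const setKeys = MEMBER_KEYS.filter((key) => block[key] !== undefined);
      if (setKeys.length !== 1) {
        problems.push({ messageIndex, blockIndex, setKeys });
      }
    });
  });
  return problems;
}
```

Calling `findOverloadedBlocks(conversation)` on the conversation from the question should flag the single block that carries both `text` and `image`; an empty array means every block sets exactly one member.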
Create a separate content block for the question about the image:
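const content1 = { text: "What's in this image?" };
const content2 = {
  image: {
    format: "png",
    source: {
      // A Node.js Buffer is a subclass of Uint8Array, so pass it directly;
      // imageBuffer.buffer is the underlying ArrayBuffer, not a Uint8Array.
      bytes: imageBuffer,
    },
  },
};
conversation.push({
  role: "user",
  content: [content1, content2],
});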
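The same idea can be wrapped in a small helper that always emits separate text and image blocks and rejects anything that is not a Uint8Array. This is a minimal sketch: `buildImageQuestion` is a hypothetical name, not an SDK function, and it assumes the image bytes come from something like `fs.readFileSync`, which returns a Buffer (a Uint8Array subclass).

```javascript
// Hypothetical helper: builds a user message with one text block and one
// image block, so the two members never end up on the same ContentBlock.
function buildImageQuestion(question, imageBytes, format = "png") {
  // The Converse API expects the raw bytes as a Uint8Array; a Node.js Buffer
  // qualifies, but a bare ArrayBuffer (e.g. someBuffer.buffer) does not.
  if (!(imageBytes instanceof Uint8Array)) {
    throw new TypeError("imageBytes must be a Uint8Array (a Node.js Buffer qualifies)");
  }
  return {
    role: "user",
    content: [
      { text: question },
      { image: { format, source: { bytes: imageBytes } } },
    ],
  };
}

// Possible usage with the conversation array from the question:
// conversation.push(
//   buildImageQuestion("What's in this image?", fs.readFileSync("./butterfly.png"))
// );
```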