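I am developing my first Azure Static Web App, and I already have a working version of it built with Next.js.
The concept of the app is simple: it transcribes the user's (voice) audio input and returns the text to the user (browser). I am still working on expanding the idea.
With an Azure Static Web App, I can test it online from my phone when I am away from home.
Unfortunately, I am having trouble figuring out how to implement this in the Azure Functions of an Azure Static Web App. I think I will also need Blob Storage.
What I really want to know is how to call the Whisper model from an Azure Function, and which parameters I need.
When I implemented this in the Next.js app, I had to specify the path to the audio file. For that reason I think I need Blob Storage for the Azure Function, but a file's URL is not the same thing as a file path.
I would appreciate it if someone could explain how to solve this, because I cannot find what I need in the documentation. If anyone can point me to documentation or share an idea to build on, I would be very grateful.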
Sources I have read:
Using the Whisper model in a Next.js app - https://learn.microsoft.com/en-us/azure/ai-services/openai/whisper-quickstart?tabs=command-line%2Cpython-new%2Ckeyless%2Cjavascript-keyless%2Ctypescript-keyless&pivots=programming-language-javascript
Sources I have read while trying to work this out for Azure Functions:
OpenAI text completion - https://learn.microsoft.com/en-us/azure/azure-functions/functions-add-openai-text-completion?pivots=programming-language-javascript
Blob Storage - https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-upload-javascript?tabs=javascript
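First, the user uploads an audio file through the Static Web App (built with Next.js). The audio file is sent to an Azure Function for processing. The function stores the file temporarily in Azure Blob Storage, passes it to the Whisper model, and returns the transcription to the browser.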
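On the browser side, one possible way to send the recording is as raw bytes in a POST request. The sketch below is only an illustration under stated assumptions: it assumes the function is written to read the raw request body (rather than a multipart form), the `/api/transcribe` route is taken from the console log further down, and the `name` query parameter and both function names are hypothetical.

```javascript
// Build the arguments for fetch() that upload a recording as raw bytes.
// Assumptions: the function is exposed at /api/transcribe and reads a
// hypothetical "name" query parameter for the file name.
function buildUploadRequest(audioBytes, fileName) {
  const url = "/api/transcribe?name=" + encodeURIComponent(fileName);
  const init = {
    method: "POST",
    headers: { "Content-Type": "application/octet-stream" },
    body: audioBytes, // a Blob, ArrayBuffer, or typed array
  };
  return { url, init };
}

// Send the recording and return the transcription text from the response.
async function transcribe(audioBytes, fileName) {
  const { url, init } = buildUploadRequest(audioBytes, fileName);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error("Transcription failed: " + response.status);
  }
  return response.text();
}
```

In a Next.js component, `transcribe(recordedBlob, "recording.webm")` could be called once the recording has finished, with the returned text placed into component state.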
Install the libraries needed for Blob Storage and OpenAI in the Azure Function project:
npm install @azure/storage-blob openai
Function code:
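const { BlobServiceClient } = require("@azure/storage-blob");
// openai v4 or later: the older Configuration/OpenAIApi classes are no longer exported.
const { OpenAI, toFile } = require("openai");

module.exports = async function (context, req) {
  const connectionString = process.env.AZURE_STORAGE_CONNECTION_STRING;
  const containerName = "audio-files";

  // Assumption: the client sends the raw audio bytes as the request body with
  // Content-Type: application/octet-stream, so req.body arrives as a Buffer.
  // The request object has no req.files; a multipart form would need a parser.
  if (!Buffer.isBuffer(req.body) || req.body.length === 0) {
    context.res = {
      status: 400,
      body: "No audio file uploaded.",
    };
    return;
  }
  const audioBuffer = req.body;
  // Hypothetical "name" query parameter carrying the original file name.
  const fileName = (req.query && req.query.name) || `audio-${Date.now()}.mp3`;

  try {
    // Keep a copy of the upload in Blob Storage (not required for transcription itself)
    const blobServiceClient = BlobServiceClient.fromConnectionString(connectionString);
    const containerClient = blobServiceClient.getContainerClient(containerName);
    await containerClient.createIfNotExists();
    const blockBlobClient = containerClient.getBlockBlobClient(fileName);
    await blockBlobClient.uploadData(audioBuffer);

    // Whisper receives the file content, not a path or a blob URL.
    // For an Azure OpenAI resource, the SDK's AzureOpenAI client (endpoint,
    // API key, API version, deployment name) takes the place of OpenAI here.
    const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
    const transcription = await openai.audio.transcriptions.create({
      file: await toFile(audioBuffer, fileName),
      model: "whisper-1", // Specify Whisper model
    });
    context.res = {
      status: 200,
      body: transcription.text,
    };
  } catch (error) {
    context.log.error("Transcription failed:", error.message);
    context.res = {
      status: 500,
      body: error.message,
    };
  }
};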
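The check at the top of the function only confirms that something was uploaded. A slightly stricter check could look like the sketch below. It assumes the 25 MB limit and the file formats listed in the Whisper documentation; treating 25 MB as 25 × 1024 × 1024 bytes is my own assumption, and `validateAudio` is a hypothetical helper name.

```javascript
// Limits assumed from the Whisper documentation (25 MB, common audio formats).
const MAX_AUDIO_BYTES = 25 * 1024 * 1024;
const SUPPORTED_EXTENSIONS = ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"];

// Return null when the upload looks acceptable, or an error message otherwise.
function validateAudio(fileName, byteLength) {
  const dot = fileName.lastIndexOf(".");
  const extension = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.includes(extension)) {
    return "Unsupported audio format: " + (extension || "(none)");
  }
  if (byteLength === 0) {
    return "The audio file is empty.";
  }
  if (byteLength > MAX_AUDIO_BYTES) {
    return "The audio file exceeds 25 MB.";
  }
  return null;
}
```

A handler could call `validateAudio(name, size)` before uploading anything and answer with status 400 and the returned message whenever the result is not `null`.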
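Environment variables:
AZURE_STORAGE_CONNECTION_STRING: the connection string for the Azure Blob Storage account.
OPENAI_API_KEY: the API key for Azure OpenAI.
Model: use "whisper-1" as the model name. With an Azure OpenAI resource, the model is addressed by the name of your Whisper deployment.
File path: the transcription API takes the audio content itself, uploaded as part of the request, rather than a local path or a URL. A blob URL (even a SAS URL) cannot be passed as the file, so audio kept in Blob Storage has to be read into memory or a stream first.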
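To make the parameters concrete, here is a rough sketch of the request that an Azure OpenAI Whisper deployment expects, written against the REST endpoint with the `fetch`, `FormData`, and `Blob` globals available in Node 18 and later. The setting names (`endpoint`, `deployment`, `apiKey`, `apiVersion`) and both function names are hypothetical; `2024-02-01` is only an example API version, so check which versions your resource supports.

```javascript
// Build the fetch() arguments for an Azure OpenAI Whisper transcription call.
// Assumed settings (hypothetical names): the resource endpoint, such as
// https://<resource>.openai.azure.com, the name of the Whisper deployment,
// an API key, and an API version.
function buildTranscriptionRequest(settings, audioBuffer, fileName) {
  const { endpoint, deployment, apiKey, apiVersion } = settings;
  const url =
    endpoint.replace(/\/+$/, "") +
    "/openai/deployments/" + encodeURIComponent(deployment) +
    "/audio/transcriptions?api-version=" + encodeURIComponent(apiVersion);

  // The audio travels as the "file" part of a multipart form: the service
  // receives the bytes themselves, never a path or a blob URL.
  const form = new FormData();
  form.append("file", new Blob([audioBuffer]), fileName);

  return {
    url,
    init: { method: "POST", headers: { "api-key": apiKey }, body: form },
  };
}

// Send the request and return the transcribed text.
async function transcribeWithAzureOpenAI(settings, audioBuffer, fileName) {
  const { url, init } = buildTranscriptionRequest(settings, audioBuffer, fileName);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error("Transcription request failed: " + response.status);
  }
  const result = await response.json();
  return result.text;
}
```

The deployment name in the URL plays the role that the model name plays with the OpenAI SDK, and the file name mainly tells the service which audio format to expect.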
Console log:
[INFO] 2025-01-16T10:45:00Z: HTTP request received. Method: POST, URL: /api/transcribe
[INFO] 2025-01-16T10:45:00Z: Validating uploaded file...
[INFO] 2025-01-16T10:45:00Z: File validation successful. File name: user-audio-12345.mp3, Size: 1.5 MB
[INFO] 2025-01-16T10:45:01Z: Establishing connection to Azure Blob Storage...
[INFO] 2025-01-16T10:45:02Z: Blob Storage connection established. Uploading file to container: audio-files
[INFO] 2025-01-16T10:45:03Z: File uploaded successfully. Blob URL: https://xxxxxxxxxxxxxxxx.blob.core.windows.net/audio-files/user-audio-12345.mp3
[INFO] 2025-01-16T10:45:03Z: Initiating transcription with Whisper model...
[INFO] 2025-01-16T10:45:05Z: Whisper model API called. Input parameters:
- File URL: https://xxxxxxxxxxxxxxxxxx.blob.core.windows.net/audio-files/user-audio-12345.mp3
- Model: whisper-1
[INFO] 2025-01-16T10:45:07Z: Transcription completed. Result: "Hello, this is a test audio for transcription."
[INFO] 2025-01-16T10:45:07Z: Sending transcription result back to client. Status: 200
---
[INFO] 2025-01-16T10:46:00Z: Authorization request initiated for SPA app.
[INFO] 2025-01-16T10:46:01Z: Redirecting user to Azure login.
[INFO] 2025-01-16T10:46:05Z: Authorization code received: "0.ABcdefGHIjklMNOPqRsTUVWXYZ..."
[INFO] 2025-01-16T10:46:06Z: Exchanging authorization code for access token...
[INFO] 2025-01-16T10:46:07Z: Access token acquired successfully. Token: "eyJhbGciOiJSUzI1NiIsIn..."
[INFO] 2025-01-16T10:46:10Z: Initiating OBO token request for Microsoft Graph API.
[INFO] 2025-01-16T10:46:12Z: OBO token acquired successfully. Token: "eyJhbGciOiJSUzI1NiIsIn..."
[INFO] 2025-01-16T10:46:12Z: Making Microsoft Graph API call to fetch user info.
[INFO] 2025-01-16T10:46:14Z: API call successful. User info retrieved:
- Display Name: Suresh surya
- Email: [email protected]