I'm trying to use Cloud Speech-to-Text v2 in a NodeJS application. My code works fine with v1:
import { v1 } from '@google-cloud/speech';
const { SpeechClient } = v1;
let speechClient: v1.SpeechClient | undefined;
speechClient = new SpeechClient(); // GOOGLE_APPLICATION_CREDENTIALS is set to ./google-credentials.json, which exists, is valid and has the roles 'Cloud Speech Client', 'Cloud Speech-to-Text Service Agent' and 'Cloud Translation API User'
const googleCloudSpeechConfig = {
  config: {
    encoding: "LINEAR16",
    sampleRateHertz: 16000,
    languageCode: "en-US"
  },
  interimResults: true,
};

recognizeStream = speechClient
  .streamingRecognize(googleCloudSpeechConfig)
  .on('error', (err) => { ... })
  .on('data', async (data) => { ... });
However, when I switch to v2 using

import { v2 } from '@google-cloud/speech';
const { SpeechClient } = v2;

let speechClient: v2.SpeechClient | undefined;
I get the error:
Error: 3 INVALID_ARGUMENT: Invalid resource field value in the request.
    at callErrorFromStatus (/Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/@grpc/grpc-js/src/call.ts:82:17)
    at Object.onReceiveStatus (/Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/@grpc/grpc-js/src/client.ts:705:51)
    at Object.onReceiveStatus (/Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/@grpc/grpc-js/src/client-interceptors.ts:419:48)
    at /Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/@grpc/grpc-js/src/resolving-call.ts:132:24
    at processTicksAndRejections (node:internal/process/task_queues:77:11)
for call at
    at ServiceClientImpl.makeBidiStreamRequest (/Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/@grpc/grpc-js/src/client.ts:689:42)
    at ServiceClientImpl.<anonymous> (/Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/@grpc/grpc-js/src/make-client.ts:189:15)
    at /Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/@google-cloud/speech/build/src/v2/speech_client.js:317:29
    at /Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/google-gax/build/src/streamingCalls/streamingApiCaller.js:46:28
    at /Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/google-gax/build/src/normalCalls/timeout.js:44:16
    at StreamProxy.setStream (/Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/google-gax/build/src/streamingCalls/streaming.js:144:24)
    at StreamingApiCaller.call (/Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/google-gax/build/src/streamingCalls/streamingApiCaller.js:54:16)
    at /Users/dominik/Documents/Git Projects/event_subtitles/server/node_modules/google-gax/build/src/createApiCall.js:84:30
    at processTicksAndRejections (node:internal/process/task_queues:95:5) {
  code: 3,
  details: 'Invalid resource field value in the request.',
  metadata: Metadata {
    internalRepr: Map(2) {
      'google.rpc.errorinfo-bin' => [Array],
      'grpc-status-details-bin' => [Array]
    },
    options: {}
  },
  statusDetails: [
    ErrorInfo {
      metadata: [Object],
      reason: 'RESOURCE_PROJECT_INVALID',
      domain: 'googleapis.com'
    }
  ],
  reason: 'RESOURCE_PROJECT_INVALID',
  domain: 'googleapis.com',
  errorInfoMetadata: {
    service: 'speech.googleapis.com',
    method: 'google.cloud.speech.v2.Speech.StreamingRecognize'
  }
}
How come?

I also noticed that the types look strange:

v2.SpeechClient.streamingRecognize

expects the type
streamingConfig?: google.cloud.speech.v1.IStreamingRecognitionConfig | google.cloud.speech.v1p1beta1.IStreamingRecognitionConfig | undefined
which looks odd.

Also, the documentation at

https://cloud.google.com/speech-to-text/v2/docs/streaming-recognize

has Python samples, but none for NodeJS.

What am I doing wrong?

I know this is a duplicate of How to set up streaming recognize in Node.js with Google Cloud Speech To Text V2?, but there is no working answer there. There is this one: https://github.com/GoogleCloudPlatform/nodejs-docs-samples/issues/3578 but it doesn't work for me; I get Error: 3 INVALID_ARGUMENT, as another user there also reported.
I finally got it working. I followed the types provided by the TypeScript definitions to work out the required nested config structure. If you write plain JS, just omit the type annotations.

Three things to notice first:

- In v2 the streaming method is _streamingRecognize() (note the leading underscore), and it takes no arguments.
- The first write on the stream must be the streaming config request, and it must be written exactly once.
- Every subsequent write must be an audio chunk of the form { audio: data }.

Here is my working code:
const recognitionConfig: google.cloud.speech.v2.IRecognitionConfig = {
  autoDecodingConfig: {}, // decoding config is a proto oneof: effectively only one of autoDecodingConfig / explicitDecodingConfig is used
  explicitDecodingConfig: {
    encoding: event.encoding,
    sampleRateHertz: event.sampleRateHertz,
    audioChannelCount: 1,
  },
  languageCodes: [event.languageCode],
  model: 'long' // the 'video' model does not exist in v2
}

const streamingRecognitionConfig: google.cloud.speech.v2.IStreamingRecognitionConfig = {
  config: recognitionConfig,
  streamingFeatures: {
    interimResults: true,
  }
}

const streamingRecognizeRequest: google.cloud.speech.v2.IStreamingRecognizeRequest = {
  recognizer: `projects/${GOOGLE_PROJECT_ID}/locations/global/recognizers/_`,
  streamingConfig: streamingRecognitionConfig,
};

recognizeStream = speechClient
  ._streamingRecognize()
  .on('error', (err) => {
    console.error(err);
  })
  .on('data', async (data) => { ... });

recognizeStream.write(streamingRecognizeRequest); // Do this once and only once
When sending audio chunks, always send
recognizeStream.write({ audio: data }); // where data is your audio chunk
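If you feed the stream from a buffer rather than a live source, keep each write reasonably small. As a minimal sketch (the 3200-byte chunk size is my assumption, roughly 100 ms of 16 kHz 16-bit mono PCM):

```typescript
// Split a raw PCM buffer into fixed-size chunks so each write() carries
// a small, regular slice of audio.
function chunkAudio(buffer: Buffer, chunkSize = 3200): Buffer[] {
  const chunks: Buffer[] = [];
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    // subarray() creates a view, no copy; the last chunk may be shorter
    chunks.push(buffer.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// Usage against the stream from above (not executed here):
//   for (const chunk of chunkAudio(pcmBuffer)) {
//     recognizeStream.write({ audio: chunk });
//   }
```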
Note GOOGLE_PROJECT_ID, where you put your project ID. You can find it in the Google Cloud Console.
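If you prefer not to hardcode it, the generated clients can also resolve the project ID at runtime via await speechClient.getProjectId() (if I remember the google-gax API correctly). Either way, a small helper keeps the resource name in one place; this is just a sketch mirroring the string used above:

```typescript
// Build the v2 recognizer resource name. "_" is the implicit default
// recognizer; location "global" matches the working code above.
function recognizerName(
  projectId: string,
  location = 'global',
  recognizerId = '_'
): string {
  return `projects/${projectId}/locations/${location}/recognizers/${recognizerId}`;
}

// e.g. recognizerName('my-project')
// → 'projects/my-project/locations/global/recognizers/_'
```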
Now about the recognizer URL: the call failed for me whenever I used any other region. I suspect you have to create a recognizer in that region first before you can do that. See the issue linked above for more.