Questions tagged speech-recognition

Speech recognition (SR) is an interdisciplinary subfield of computational linguistics that combines knowledge and research from linguistics, computer science, and electrical engineering to develop methods and technologies that enable computers and computerized devices (such as those classified as smart technologies and robotics) to recognize and translate spoken language.

Unable to install via Homebrew

In file included from src/pyaudio/device_api.c:1: In file included from src/pyaudio/device_api.h:7: /Library/Frameworks/Python.framework/Versions/3.13/include/python3.13/Python.h:19:10: ...

1 answer, 0 votes

Google speech recognition API

https://www.google.com/speech-api/v2/recognize?...

5 answers, 0 votes
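As a minimal sketch of how a request to the (unofficial, key-gated) speech-api/v2 endpoint above is typically assembled: the query carries `output`, `lang`, and `key` parameters, and the audio itself goes in the POST body as FLAC with a matching sample rate. The `YOUR_API_KEY` value is a placeholder, not a working key.

```python
# Build the recognize URL for the endpoint quoted above. Only URL
# construction is shown here; sending audio requires a valid API key and
# a FLAC-encoded POST body (Content-Type: audio/x-flac; rate=16000).
from urllib.parse import urlencode

BASE = "https://www.google.com/speech-api/v2/recognize"

def build_recognize_url(lang: str, key: str, output: str = "json") -> str:
    # Query parameters commonly used with this endpoint.
    return BASE + "?" + urlencode({"output": output, "lang": lang, "key": key})

url = build_recognize_url("en-us", "YOUR_API_KEY")
print(url)
```

This keeps the request construction separate from the network call, so the parameters can be checked before any audio is uploaded.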



Automatically open search result links from Google or YouTube using Python

But how do I open the links of these search results?

2 answers, 0 votes
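A hedged sketch of the "open the result links" step: given HTML from a search results page (fetched elsewhere), collect the `<a href>` targets with the stdlib `HTMLParser` and hand them to the `webbrowser` module. The `sample_html` string below is a stand-in for illustration, not real Google markup, and the actual `webbrowser.open_new_tab` call is left commented so the sketch runs headlessly.

```python
# Collect absolute links from an HTML fragment, then (optionally) open them.
import webbrowser
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Keep only absolute http(s) hrefs from anchor tags.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.startswith("http"):
                    self.links.append(value)

sample_html = '<a href="https://example.com/result1">r1</a><a href="https://example.com/result2">r2</a>'
parser = LinkCollector()
parser.feed(sample_html)

for link in parser.links[:3]:  # first few results only
    pass  # webbrowser.open_new_tab(link)  # uncomment to actually open a browser tab

print(parser.links)
```

For real search pages, a dedicated scraping library is usually more robust than hand-rolled parsing, but the open-the-link step stays the same.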

Android webkitSpeechRecognition isFinal variable does not show the correct value

var recognition = new webkitSpeechRecognition();
recognition.onresult = function (e) {
    // This is called every time the Speech Recognition hears you speak.
    // You may say "How's it going today"; the recognition will try to
    // interpret what you're saying while you're speaking. For example, while
    // you're speaking it may go: "house", "how's it going", "how's it going today".
    // As it interprets, it returns an object that contains properties, one of
    // which is "e.results[i].isFinal", where "i" indexes the returned results.
    // In this case the result with a transcript of "house" would have an
    // "e.results[i].isFinal" value of false, whereas the result with a transcript
    // of "how's it going today" would have an "e.results[i].isFinal" value of
    // true, because that is the FINAL interpretation of this particular transcript.
    // HOWEVER, the problem I'm having is that on a mobile device,
    // "e.results[i].isFinal" always has a value of true, even when it's not the
    // final interpretation. It works correctly on desktop, however. Both use Chrome.
    if (e.results[e.results.length - 1].isFinal) {
        var finalTranscript = '';
        for (var i = 0; i < e.results.length; i++) {
            finalTranscript += e.results[i][0].transcript;
        }
        console.log(finalTranscript);
        document.getElementById('output').innerHTML = finalTranscript;
    }
}

1 answer, 0 votes




Using the speech API

OS X Mavericks now includes speech dictation, which is very useful. I'm trying to use its command capability to create my own digital life assistant, but I can't find out how to use the recognition...

1 answer, 0 votes

Android SpeechRecognizer with EXTRA_AUDIO_SOURCE still listens to the microphone instead of reading from the file

private var recorder: MediaRecorder? = null
private var recognizer: SpeechRecognizer? = null
private val mediaFormat = MediaRecorder.OutputFormat.MPEG_4
private val audioEncoding = MediaRecorder.AudioEncoder.DEFAULT
private var currentRecordingFile: String = "recording_0.3gp"
private var recordingParcel: ParcelFileDescriptor? = null

// [ {"text": "speech to text result", "file": "path to clip recording"}, "time": "datetime" ]
private var translations = mutableStateListOf<Map<String, String>>()

private fun startTalking() {
    startRecording()
}

private fun stopTalking() {
    stopRecording()
    startRecognizing()
}

private fun startRecording() {
    val num = translations.count()
    currentRecordingFile = "$externalCacheDir/recording_$num.3gp"
    recorder = MediaRecorder(this).apply {
        setAudioSource(MediaRecorder.AudioSource.MIC)
        setOutputFormat(mediaFormat)
        setAudioEncoder(audioEncoding)
        setAudioChannels(1)
        setAudioSamplingRate(16000)
        setAudioEncodingBitRate(64000)
        setOutputFile(currentRecordingFile)
        try {
            prepare()
        } catch (e: IOException) {
            Log.e("startRecording", e.toString())
        }
        start()
    }
}

private fun stopRecording() {
    recorder?.apply {
        stop()
        release()
    }
    recorder = null
}

private fun startRecognizing() {
    val file = File(currentRecordingFile)
    recordingParcel = ParcelFileDescriptor.open(file, ParcelFileDescriptor.MODE_READ_ONLY)
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "in-ID")
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_PREFERENCE, "in-ID")
    intent.putExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE, recordingParcel)
    intent.putExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE_ENCODING, audioEncoding)
    intent.putExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE_CHANNEL_COUNT, 1)
    intent.putExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE_SAMPLING_RATE, 16000)
    try {
        recognizer = SpeechRecognizer.createSpeechRecognizer(this)
        recognizer?.setRecognitionListener(this)
        recognizer?.startListening(intent)
    } catch (e: Exception) {
        Log.e("SpeechRecognizer", e.message.toString())
    }
}

private fun stopRecognizing() {
    recordingParcel?.close()
    recognizer?.stopListening()
    recognizer?.destroy()
    recognizer = null
}

override fun onError(error: Int) {
    Log.e("Speech onError", error.toString())
    stopRecognizing()
}

override fun onResults(results: Bundle) {
    val words: ArrayList<String>? = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
    if (words != null) {
        val sentence = words.joinToString(separator = " ")
        val translation = mapOf("text" to sentence, "file" to currentRecordingFile)
        translations.add(translation)
        Log.e("CURR RESULT", sentence)
    }
    stopRecognizing()
}

0 answers, 0 votes

Workaround for Web Speech Recognition on mobile devices not supporting continuous listening

const recognition = new SpeechRecognition();
recognition.continuous = true;
recognition.lang = 'en-US';
recognition.onresult = (event) => {...}
recognition.start();

0 answers, 0 votes



Continuous speech recognition with Azure Speech Service

I'm fairly new to Azure Speech Service. I'm using a Twilio/Plivo service to connect a phone number with Azure STT, and then process the transcription further. My problem is that when I speak, it...

1 answer, 0 votes
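With continuous recognition, each utterance arrives in a separate "recognized" callback, so the full transcript has to be accumulated by the caller. A minimal sketch of that accumulation, with the Azure SDK wiring (assumed names from the azure-cognitiveservices-speech package) left as comments rather than exercised:

```python
# Accumulate per-utterance results from a continuous-recognition session.
class TranscriptCollector:
    def __init__(self):
        self.segments = []

    def on_recognized(self, text: str):
        if text:  # skip empty / no-match results
            self.segments.append(text)

    def full_text(self) -> str:
        return " ".join(self.segments)

# Assumed wiring (not run here):
# recognizer.recognized.connect(lambda evt: collector.on_recognized(evt.result.text))
# recognizer.start_continuous_recognition()

collector = TranscriptCollector()
for seg in ["hello there", "", "how are you"]:
    collector.on_recognized(seg)
print(collector.full_text())
```

Keeping the aggregation in a plain class means the interesting logic can be tested without a live speech session.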

Saving the microphone audio input while using Azure speech-to-text

I'm currently using Azure speech-to-text in my project. It recognizes speech input directly from the microphone (which is what I want) and saves the text output, but I'm also interested in saving...

3 answers, 0 votes
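One way to keep a copy of the microphone input is to "tee" the same PCM chunks into a local WAV file that are fed to the recognizer. The sketch below shows only the saving half, using synthetic silence as stand-in chunks; the Azure push-stream call (commented) is the assumed integration point and is not exercised here.

```python
# Write PCM chunks to a WAV file while (conceptually) also feeding them
# to a recognizer's push stream.
import wave

SAMPLE_RATE = 16000   # must match what the recognizer is configured for
SAMPLE_WIDTH = 2      # 16-bit PCM
CHANNELS = 1

def save_chunks(path, chunks):
    with wave.open(path, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(SAMPLE_RATE)
        for chunk in chunks:
            wf.writeframes(chunk)
            # push_stream.write(chunk)  # the same bytes would go to Azure here

# Synthetic stand-in for microphone chunks (0.1 s of silence each).
chunks = [b"\x00\x00" * (SAMPLE_RATE // 10) for _ in range(3)]
save_chunks("mic_copy.wav", chunks)
```

Because both consumers see identical bytes, the saved file is an exact record of what the recognizer heard.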

WebSocket connection issue with Docker Compose and React JS

I'm having a problem with a local deployment. When running a Kaldi server and a React JS frontend with Docker Compose, the problem is with the WebSocket connection. When Kaldi...

1 answer, 0 votes

How to add hotword detection in a Python AI

I'm trying to make a Python AI using the speech_recognition module, and I want to add a hotword detection feature to the AI, so I tried to implement it with the speech recognition module, but it doesn't work...

1 answer, 0 votes
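Since the speech_recognition module hands back a transcript string, a simple form of hotword detection is just a normalized text match on that transcript. The hotword names below (`jarvis`) are placeholders; the recognizer loop is shown as comments (assumed usage of speech_recognition's documented API) and only the matching logic runs here.

```python
# Check whether a recognized transcript contains one of the hotwords.
import re

HOTWORDS = {"jarvis", "hey jarvis"}

def heard_hotword(transcript: str) -> bool:
    # Lowercase and strip punctuation so "Jarvis!" still matches.
    words = re.sub(r"[^a-z ]", "", transcript.lower())
    return any(hw in words for hw in HOTWORDS)

# Typical loop (assumed wiring, not run here):
# import speech_recognition as sr
# r = sr.Recognizer()
# with sr.Microphone() as source:
#     audio = r.listen(source)
# if heard_hotword(r.recognize_google(audio)):
#     ...  # wake up the assistant

print(heard_hotword("Hey Jarvis, what's the weather?"))
```

Matching on the transcript is easy but adds full-recognition latency; dedicated wake-word engines listen for the keyword directly in audio, which is the usual choice when responsiveness matters.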

Vosk speech-to-text stops working when I disconnect the external microphone

In a Tauri JS app, I record audio from JS and process it, then send the data through a Rust handler to a Python subprocess. In the Python script I use Vosk to convert speech to text in real time...

1 answer, 0 votes
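When the input device disappears mid-capture, the audio read typically raises, and a common workaround is to reopen the stream against the current default device and keep feeding the recognizer. The sketch below shows that retry shape with injected `open_stream`/`read_chunk`/`feed` callables so the device handling stays testable; in a real setup `open_stream` would wrap pyaudio or sounddevice and `feed` would call Vosk's `KaldiRecognizer.AcceptWaveform` (both are assumptions, not exercised here).

```python
# Generic capture loop that reopens the audio stream when a read fails.
def run_capture(open_stream, read_chunk, feed, max_reopens=3):
    stream = open_stream()
    reopens = 0
    while True:
        try:
            chunk = read_chunk(stream)
        except OSError:
            # Device unplugged: reopen against the new default input.
            reopens += 1
            if reopens > max_reopens:
                raise
            stream = open_stream()
            continue
        if chunk is None:  # end of capture
            return reopens
        feed(chunk)

# Demo with fakes: one simulated unplug mid-stream.
events = [b"aa", OSError("device gone"), b"bb", None]
def fake_read(_stream):
    item = events.pop(0)
    if isinstance(item, OSError):
        raise item
    return item

fed = []
reopens = run_capture(lambda: object(), fake_read, fed.append)
print(reopens, fed)
```

Capping the reopen count keeps a permanently missing device from turning into an infinite loop.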

How to make an AI respond to its name being called in Node.js

I'm experimenting with a virtual assistant and want to know how to make it respond to its name being called. Currently I have a button that activates listen() so the program starts listening...

1 answer, 0 votes

© www.soinside.com 2019 - 2025. All rights reserved.