Comments (4)
Hello,
I had to provide pipe:0 input to ffmpeg to ingest bytes on the fly and provide metadata about the bytes beforehand to make it work.
Thanks!
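For readers who land here with the same problem, the fix described above might look roughly like the sketch below. It is not the poster's actual command. The input format (`f32le`, 16 kHz, mono) is an assumption and must match whatever the client really sends, and `build_ffmpeg_cmd` and `start_ffmpeg` are hypothetical helper names. The key point is that headerless PCM carries no metadata, so the format, sample rate, and channel count have to be given as input options before `-i pipe:0`.

```python
import subprocess


def build_ffmpeg_cmd(in_fmt="f32le", in_rate=16000, in_channels=1):
    """Build an ffmpeg command that reads headerless PCM from stdin.

    The default input format is an assumption; set it to what the
    sender actually produces.
    """
    return [
        "ffmpeg",
        "-loglevel", "error",
        # Input options: these must appear before -i, because raw PCM
        # has no header that ffmpeg could probe.
        "-f", in_fmt,
        "-ar", str(in_rate),
        "-ac", str(in_channels),
        "-i", "pipe:0",          # read from stdin
        # Output options: 16 kHz mono signed 16-bit PCM on stdout.
        "-f", "s16le",
        "-acodec", "pcm_s16le",
        "-ac", "1",
        "-ar", "16000",
        "pipe:1",                # write to stdout
    ]


def start_ffmpeg(cmd):
    """Start ffmpeg with pipes attached to stdin and stdout."""
    return subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
```

Each chunk received from the websocket would then be written to the process's stdin rather than passed on the command line.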
from sherpa-onnx.
I am getting raw audio bytes
Everything in the computer is represented in bytes.
Since you did not describe any metadata about the bytes, it is not possible to tell you what to do with them:
the bytes could represent anything.
from sherpa-onnx.
These bytes come from an audio stream. I am sending them to ffmpeg to be decoded, and the decoded audio is then read by sherpa-onnx. The audio stream comes from online-websocket-client-microphone.py, which sends the microphone bytes to the websocket I gave above:
recognizer = create_recognizer(args)
byte = await websocket.recv()
ffmpeg_cmd = [
    "ffmpeg",
    "-i",
    byte,
    "-f",
    "s16le",
    "-acodec",
    "pcm_s16le",
    "-ac",
    "1",
    "-ar",
    "16000",
    "-",
]
However, ffmpeg does not write any output to stdout, and the code gets stuck at the first read:
data = process.stdout.read(frames_per_read * 2)
if not data:
    break
Do we need to send chunks of a specific size as input to ffmpeg? Currently it receives chunks of 3200 bytes.
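Two things may be contributing to the hang. This is a guess from the snippet, not a confirmed diagnosis. First, `-i` expects a file name or URL such as `pipe:0`, not the audio data itself. Second, a blocking `stdout.read()` can wait forever if nothing is being written to the child's stdin at the same time. The sketch below shows one common pattern: a helper thread feeds stdin while the main thread reads stdout. `pump` and `ECHO_CMD` are hypothetical names, and `ECHO_CMD` is only a stand-in child process that copies stdin to stdout, so that the example runs without ffmpeg installed.

```python
import subprocess
import sys
import threading


def pump(cmd, chunks, read_size=3200):
    """Feed chunks to the child's stdin on a helper thread and collect its stdout."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        for chunk in chunks:
            proc.stdin.write(chunk)
        # Closing stdin signals EOF, so the child can flush and exit.
        proc.stdin.close()

    writer = threading.Thread(target=feed, daemon=True)
    writer.start()

    out = bytearray()
    while True:
        # Blocks until read_size bytes are available or the child closes stdout.
        data = proc.stdout.read(read_size)
        if not data:
            break
        out += data

    writer.join()
    proc.stdout.close()
    proc.wait()
    return bytes(out)


# Stand-in for ffmpeg: a child process that copies stdin to stdout unchanged.
ECHO_CMD = [
    sys.executable,
    "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]
```

In a real streaming server the output would be handed to the recognizer as it arrives instead of being collected into one buffer. The chunk size itself (3200 bytes here) should not matter to ffmpeg as long as the input format is declared.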
from sherpa-onnx.
I suggest that you save the output of ffmpeg to a file and check the file.
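One way to act on this suggestion, sketched under the assumption that ffmpeg is producing 16 kHz mono `s16le` output as in the command above: wrap the raw bytes in a WAV header with Python's standard `wave` module, so the file can be opened in any audio player and its duration compared with what was recorded. The helper names `save_pcm_as_wav` and `wav_duration_seconds` are hypothetical.

```python
import wave


def save_pcm_as_wav(pcm, path, sample_rate=16000, channels=1):
    """Write raw signed 16-bit little-endian PCM bytes to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()
```

If the saved file is empty, or plays back as noise or at the wrong speed, the input format passed to ffmpeg most likely does not match the bytes being sent.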
from sherpa-onnx.