I have a stereo voice file. Both of the channels can be separated into left and right ch

Stereo Voice File Diarization about whisper-diarization HOT 4 CLOSED

mahmoudashraf97 commented on August 11, 2024

Stereo Voice File Diarization

from whisper-diarization.

Comments (4)

KevinGeLe commented on August 11, 2024

I might look into it and open a pull request. I'm thinking of at least adding ffmpeg, so that the input is automatically converted to the correct format, and adding a parser option that either merges channel 2 with channel 1 or splits them into their own outputs.

If you want to edit the code now, you just have to add a way to convert the audio file between the input parser and vocal_target.

I think that's what you want?
@Talhazeb
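
A minimal sketch of the conversion step described above, assuming ffmpeg is installed and on the PATH. The thread does not say what the "correct format" is, so the 16 kHz mono target and the function name `ffmpeg_mono_cmd` are assumptions made for illustration, not part of whisper-diarization.

```python
import subprocess


def ffmpeg_mono_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command that downmixes `src` to a mono file `dst`.

    The 16 kHz default is an assumption; adjust it to whatever the
    pipeline actually expects.
    """
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", src,               # input file (mono or stereo)
        "-ac", "1",              # downmix to a single audio channel
        "-ar", str(sample_rate), # resample to the target rate
        dst,
    ]


def convert_to_mono(src, dst, sample_rate=16000):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_mono_cmd(src, dst, sample_rate), check=True)
```

Calling `convert_to_mono("call.wav", "call_mono.wav")` before the audio reaches vocal_target would be one way to place the step where the comment suggests.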

MahmoudAshraf97 commented on August 11, 2024

Hi @Talhazeb, do you mean that each speaker is on a separate channel? If so, I don't think diarization is needed, since the speakers are already separated.

@KevinGeLe, feel free to contribute to the repo if you see any changes that need to be made.
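
If each speaker really is on their own channel, one way to act on this is to split the stereo file into two mono files and transcribe each separately. The sketch below uses only the Python standard library and assumes an uncompressed PCM WAV input; the function name `split_stereo_wav` is hypothetical and not part of whisper-diarization.

```python
import wave


def split_stereo_wav(src_path, left_path, right_path):
    """Write the left and right channels of a stereo PCM WAV to two mono WAVs."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a stereo (2-channel) WAV file")
        width = src.getsampwidth()  # bytes per sample, per channel
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Stereo frames are interleaved: left sample, right sample, left, right, ...
    frame_size = 2 * width
    left = bytearray()
    right = bytearray()
    for i in range(0, len(frames), frame_size):
        left += frames[i:i + width]
        right += frames[i + width:i + frame_size]

    # Each output keeps the original sample width and rate, with one channel.
    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(width)
            dst.setframerate(rate)
            dst.writeframes(bytes(data))
```

Each resulting mono file then contains a single speaker, so the transcript of the left file can be labeled as one speaker and the right file as the other, with segment timestamps used to interleave them.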

Talhazeb commented on August 11, 2024

@KevinGeLe Thanks for letting me know.
@MahmoudAshraf97 How should I proceed then? I am working with a set of files, some of which are mono and some stereo. For mono, the repo works flawlessly. For stereo, what approach should I take if I want to run the files through the same process and still get diarized output, meaning a sequential pyannote-style RTTM file together with the transcription? And if I don't pass them through diarization, what approach should I take instead?

MahmoudAshraf97 commented on August 11, 2024

You can pass both mono and stereo files with no problems. If you need the RTTM files, comment out the last line in the code, which deletes the temp_outputs folder, and you'll find them inside nemo/pred_rttms.
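
Once the RTTM files are kept, they can be read with a few lines of Python. This is a minimal sketch that assumes the standard RTTM layout, where each `SPEAKER` line carries the start time in the fourth field, the duration in the fifth, and the speaker label in the eighth. The names `Segment` and `parse_rttm` are hypothetical helpers, not part of the repo.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    speaker: str   # speaker label as written in the RTTM file


def parse_rttm(text: str) -> List[Segment]:
    """Parse RTTM text into speaker segments sorted by start time."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any record type other than SPEAKER.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        start = float(fields[3])
        duration = float(fields[4])
        segments.append(Segment(start, start + duration, fields[7]))
    return sorted(segments, key=lambda s: s.start)
```

To use it, read the contents of a file under nemo/pred_rttms (the exact file name is not given in the thread) and pass the text to `parse_rttm`.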