I am experimenting a bit with lhotse integration in asteroid here: <a href="https:

Source Separation Integration: sum(sources + background_noise) != mixture with mels.

lhotse-speech commented on September 7, 2024

Source Separation Integration: sum(sources + background_noise) != mixture with mels.

from lhotse.

Comments (5)

danpovey commented on September 7, 2024

Can you please show an example? Wonder how often it's that different; which mel bin is most different; etc.
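The statistics asked for here could be gathered along these lines. This is only a sketch: `summarize_difference` and its `threshold` parameter are hypothetical names, the arrays are synthetic stand-ins rather than features from the thread, and it assumes both inputs are log-mel matrices of shape (frames, mel bins).

```python
import numpy as np


def summarize_difference(loaded, on_the_fly, threshold=1.0):
    """Compare two (num_frames, num_mel_bins) log-mel matrices.

    Hypothetical helper: reports how large, how frequent, and how
    one-sided the mismatch is, and which mel bin is worst on average.
    """
    diff = loaded - on_the_fly
    abs_diff = np.abs(diff)
    per_bin_mean = abs_diff.mean(axis=0)  # average mismatch per mel bin
    return {
        "mean_signed_diff": float(diff.mean()),  # near 0 means no consistent bias
        "max_abs_diff": float(abs_diff.max()),
        "fraction_above_threshold": float((abs_diff > threshold).mean()),
        "worst_mel_bin": int(per_bin_mean.argmax()),
    }


# Synthetic stand-in data (NOT the features discussed in this issue).
rng = np.random.default_rng(0)
on_the_fly = rng.normal(size=(100, 40))
loaded = on_the_fly.copy()
loaded[:, 7] += 2.0  # inject a known discrepancy into mel bin 7

stats = summarize_difference(loaded, on_the_fly)
print(stats)
```

A mean signed difference close to zero would indicate that the mismatch is not consistently in one direction.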

popcornell commented on September 7, 2024

I don't know how useful this is, but here are some plots for now (I can also compute some statistics on the distribution of the differences). There is a difference of over 3.9 for one bin, which is very strange.

Loaded mixture feats:

On the fly np.log(np.sum(np.exp(c_sources), 0) + np.exp(c_noise)):

Abs difference between the two:
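For reference, the on-the-fly expression above can also be written with `np.logaddexp`, which avoids overflow in `np.exp` for large feature values. This is a sketch under assumptions: `mix_log_feats` is a hypothetical helper, the features are taken to be natural-log mel energies (consistent with the `np.log`/`np.exp` expression above), `c_sources` is assumed to have shape (sources, frames, bins), and the data below is random, not the data from the plots.

```python
import numpy as np


def mix_log_feats(log_sources, log_noise):
    """Sum energies in the linear domain and return the log of the sum.

    log_sources: (num_sources, num_frames, num_bins) natural-log energies
    log_noise:   (num_frames, num_bins) natural-log energies
    """
    stacked = np.concatenate([log_sources, log_noise[None]], axis=0)
    # logaddexp.reduce computes log(sum(exp(x))) along the axis stably.
    return np.logaddexp.reduce(stacked, axis=0)


# Random stand-in features (NOT the features from this issue).
rng = np.random.default_rng(0)
c_sources = rng.normal(size=(2, 50, 40))
c_noise = rng.normal(size=(50, 40))

naive = np.log(np.sum(np.exp(c_sources), 0) + np.exp(c_noise))
stable = mix_log_feats(c_sources, c_noise)
print(np.allclose(naive, stable))
```

Both forms assume that energies add exactly, which only holds when the signals do not interfere.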

danpovey commented on September 7, 2024

That looks fine to me, as long as the difference isn't too consistent one way or the other. Sometimes the signal will be exactly in or out of phase and you won't get the exact energy you expect. It's not a problem.
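The phase effect described above can be illustrated with two pure tones. This is a toy sketch with made-up signals (a 440 Hz sine at an assumed 16 kHz sample rate), not lhotse code: the energy of a sum equals the sum of the energies only when the cross term between the two signals vanishes.

```python
import numpy as np

sample_rate = 16000  # assumed value, for illustration only
t = np.arange(sample_rate) / sample_rate
a = np.sin(2 * np.pi * 440 * t)


def energy(x):
    """Total energy of a real-valued signal."""
    return float(np.sum(x ** 2))


ratios = {}
for name, phase in [("in phase", 0.0),
                    ("quadrature", np.pi / 2),
                    ("out of phase", np.pi)]:
    b = np.sin(2 * np.pi * 440 * t + phase)
    # Ratio of the actual mixture energy to the "expected" additive energy.
    ratios[name] = energy(a + b) / (energy(a) + energy(b))
    print(f"{name}: mixture energy / sum of energies = {ratios[name]:.3f}")
```

The ratio is close to 2 when the tones are in phase, close to 1 in quadrature, and close to 0 when they are out of phase, so per-bin deviations in both directions are expected for real mixtures.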

On Sun, Jun 28, 2020 at 10:13 PM Samuele Cornell ***@***.***> wrote:

> I don't know how much useful it can be, but here are some plots for now ( I can compute also some stats on the difference distribution). There is a difference of over 3.9 for one bin and it is very strange.
>
> Loaded mixture feats: [image: c_mix] <https://user-images.githubusercontent.com/18726713/85949807-27d94600-b959-11ea-971d-392f7b6d1c8f.png>
>
> On the fly np.log(np.sum(np.exp(c_sources), 0) + np.exp(c_noise)): [image: onthefly] <https://user-images.githubusercontent.com/18726713/85949824-3e7f9d00-b959-11ea-99f4-8ceaac90a5cb.png>
>
> Abs difference between the two: [image: difference] <https://user-images.githubusercontent.com/18726713/85949846-61aa4c80-b959-11ea-9e5e-bf01b2deb646.png>

popcornell commented on September 7, 2024

Thank you very much.
I'll try to train two systems for separation in the feature domain (when I have some spare GPUs): one with on-the-fly mixing as above and one without, and see what happens. In the past I have always mixed the features on the fly and had decent results.

My main concern is that this is somewhat like using "noisy labels" for separation.
And because the separation is done on mels (not log-mels), those differences can be even more substantial, and it could be difficult for the DNN to learn a mask for each speaker with that amount of "noise" in the oracle targets.
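To put the mel versus log-mel point in numbers: a difference in the log domain becomes a multiplicative factor in the linear mel domain. The arithmetic below assumes the features are natural-log energies (consistent with the `np.log`/`np.exp` expression earlier in the thread); the variable names are illustrative.

```python
import math

# Log-domain differences to convert; 3.9 is the largest one reported above.
log_diffs = [0.1, 1.0, 3.9]

ratios = {}
for d in log_diffs:
    # A log-domain difference of d is a factor of exp(d) in linear energy.
    ratios[d] = math.exp(d)
    print(f"log difference {d:>4}: linear energy ratio {ratios[d]:.1f}x")
```

Under that assumption, the 3.9 outlier corresponds to a factor of roughly 49 in linear mel energy for that bin.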

pzelasko commented on September 7, 2024

I'm closing this as it seems stale. If there are any new developments, be sure to let us know!
