
sound-separation's Issues

Missing train.py file?

Hey,

Thanks for the resources!

In the MixIT training recipes, a train file should be added with the implementations of mixit, signal_transformer, consistency, etc., so that those imports are resolvable and the results can be reproduced.

Could you please share it?

Thanks!
Manu

Separate tarball for evaluation?

If I'm not mistaken, if we run the pipeline with data augmentation, we currently end up without evaluation data.
This means that, to evaluate the system trained with augmented data, we need to additionally download the dry or reverberated mixtures, depending on the task.

While downloading the tarball with train/valid/eval doesn't take up that much space, it might make sense to have separate tarballs for the dry and reverberated evaluation datasets.

DataLossError

I am not able to load the pretrained model using the inference.py file; execution stops with a DataLossError. How can I solve this issue?

Problem with reverb

Hi there,

There is an alignment problem in reverberate_and_mix: applying the reverb shifts the audio in time, which breaks the alignment with the labels.

I'm putting this issue here since multiple people have asked to use this work: https://hal.inria.fr/hal-02891700, and that work should be updated once the reverb is fixed.
The best way to update https://github.com/turpaultn/dcase20_task4 is to update the sound-separation repo first, and then I will pull the latest version.

I want to use your model on the iPhone

When converting it with tf_coreml, I get errors like: NotImplementedError: Unsupported Ops of type: PadV2, GatherV2, RFFT, ComplexAbs, IRFFT.

Do you have a way to convert the model to Core ML format, or can you give me some guidelines?

I'd appreciate it.

Help needed from linpoly

Hi linpoly, I need your help. Can you give me your contact ID or some other way to contact you?

No Checkpoint found in Model

After downloading the birdsong separation model using the gsutil command, I only get a meta, index, and data file. There is no checkpoint file, which causes checkpoint_path = tf.train.latest_checkpoint('bird_mixit_model_checkpoints/output_sources4/') to return None.

gsutil command:

gsutil -m cp -r \
  "gs://gresearch/sound_separation/bird_mixit_model_checkpoints" .

files downloaded:

bird_mixit_model_checkpoints
    LICENSE
    README
    output_sources8
        model.ckpt-2178900.index
        inference.meta
        model.ckpt-2178900.data-00000-of-00001
    output_sources4
        model.ckpt-3223090.index
        model.ckpt-3223090.data-00000-of-00001
        inference.meta
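
A possible workaround (a sketch on my part, assuming the checkpoint files are exactly the ones listed above; the step numbers come from that listing): tf.train.latest_checkpoint() relies on a "checkpoint" state file that isn't shipped, so you can either pass the model.ckpt-* prefix directly or write that small file yourself:

import tensorflow as tf

ckpt_dir = 'bird_mixit_model_checkpoints/output_sources4'
ckpt_prefix = ckpt_dir + '/model.ckpt-3223090'

# Option 1: skip latest_checkpoint() and use the prefix directly.
reader = tf.train.load_checkpoint(ckpt_prefix)  # works without a "checkpoint" file
print(len(reader.get_variable_to_shape_map()), 'variables found')

# Option 2: write the missing "checkpoint" state file so latest_checkpoint() works.
with open(ckpt_dir + '/checkpoint', 'w') as f:
    f.write('model_checkpoint_path: "model.ckpt-3223090"\n')
print(tf.train.latest_checkpoint(ckpt_dir))  # should now return the prefix above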

Question about the number of masks

I have already studied the sound-separation papers, like "Universal Sound Separation" and "Conv-TasNet", but I have a question about the masks.
How do you decide on the number of masks?
Is it always four in this competition?
If a mixture wav only has two sources, do only two of the final masks have values while the other two are zero?
I don't have any idea about it.
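
As far as I understand, the FUSS baseline separates into a fixed number of output channels (four), and for mixtures with fewer sources the extra outputs are expected to come out near-silent rather than be removed. A minimal illustration (the shapes, threshold, and function name below are my own and purely illustrative, not part of this repo) of flagging which outputs are active:

import numpy as np

def active_sources(separated, threshold_db=-40.0):
    """separated: array of shape [num_outputs, num_samples]."""
    mix_power = np.mean(np.sum(separated, axis=0) ** 2) + 1e-12
    out_power = np.mean(separated ** 2, axis=1) + 1e-12
    rel_db = 10.0 * np.log10(out_power / mix_power)
    return rel_db > threshold_db  # True for outputs that carry real energy

# Example: a 4-output model where the last two outputs are essentially silent.
est = np.random.randn(4, 16000) * np.array([1.0, 0.5, 1e-4, 1e-4])[:, None]
print(active_sources(est))  # -> [ True  True False False]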

train_sed+ss_baseline

In dcase2020_desed_fuss_baseline, when I try to run ./make_baseline_file_lists.sh, I don't know the correct value of "DESED_ROOT_DIR" in setup.sh. Could you clarify what DESED_ROOT_DIR should point to? Thank you.


jams annotation order vs filenames?

Hey, I'm confused about whether the order of the FUSS jams annotations relates to the filenames of the separated sources (e.g., background0, foreground0, foreground1, etc.).

I downloaded the dry ssdata from Zenodo. For some examples, it seems like the order of the annotations in the JAMS file (ordered by time, I believe) is different from the ordering of the foreground sounds in the filenames. For example:

>>> m = jams.load("./ssdata/train/example13537.jams")
>>> m["annotations"][0].data
SortedKeyList([Observation(time=0.0, duration=10.0, value={'label': 'sound', 'source_file': '/data/DCASE2020/fsd_data/train/sound/155571.wav', 'source_time': 1.3854725120886693, 'event_time': 0, 'event_duration': 10.0, 'snr': 0, 'role': 'background', 'pitch_shift': None, 'time_stretch': None}, confidence=1.0), 
Observation(time=0.6711880000000008, duration=9.328812, value={'label': 'sound', 'source_file': '/data/DCASE2020/fsd_data/train/sound/372821.wav', 'source_time': 0.0, 'event_time': 0.6711880000000008, 'event_duration': 9.328812, 'snr': -0.3880649923842725, 'role': 'foreground', 'pitch_shift': None, 'time_stretch': None}, confidence=1.0), 
Observation(time=1.6686094265599583, duration=4.748812, value={'label': 'sound', 'source_file': '/data/DCASE2020/fsd_data/train/sound/375026.wav', 'source_time': 0.0, 'event_time': 1.6686094265599583, 'event_duration': 4.748812, 'snr': 23.017906447229286, 'role': 'foreground', 'pitch_shift': None, 'time_stretch': None}, confidence=1.0), 
Observation(time=3.5576066287429065, duration=1.59025, value={'label': 'sound', 'source_file': '/data/DCASE2020/fsd_data/train/sound/349792.wav', 'source_time': 0.0, 'event_time': 3.5576066287429065, 'event_duration': 1.59025, 'snr': 12.441491562340214, 'role': 'foreground', 'pitch_shift': None, 'time_stretch': None}, confidence=1.0)], key=<bound method Annotation._key of <class 'jams.core.Annotation'>>)

But looking at the separated sources, it looks like foreground1 is the sound that begins at 3.5 s, whereas foreground2 begins at 1.66 s (I expected the opposite).

I'm wondering if I'm missing how to order the jams annotations so that they consistently match up with the indices in the filenames. Thanks so much!
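
In case it helps, here is a hedged workaround sketch (not the official mapping; the exampleN_sources/foregroundN_sound.wav layout and the silence threshold are assumptions on my part): instead of relying on annotation order, match each separated-source wav to a JAMS observation by comparing where the wav becomes non-silent against each observation's event_time:

import jams
import numpy as np
import soundfile as sf

def onset_seconds(path, threshold=1e-3):
    """Return the time (in seconds) of the first sample above the threshold."""
    audio, sr = sf.read(path)
    idx = np.flatnonzero(np.abs(audio) > threshold)
    return idx[0] / sr if idx.size else None

m = jams.load("./ssdata/train/example13537.jams")
fg_obs = [o for o in m["annotations"][0].data if o.value["role"] == "foreground"]

for n in range(len(fg_obs)):
    # Assumed file layout; adjust to wherever the per-example sources live.
    wav = f"./ssdata/train/example13537_sources/foreground{n}_sound.wav"
    onset = onset_seconds(wav)
    # Pick the observation whose event_time is closest to this wav's onset.
    best = min(fg_obs, key=lambda o: abs(o.time - onset))
    print(wav, "->", best.value["source_file"], "event_time:", best.time)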
