Comments (4)

sw005320 commented on May 27, 2024

Thanks for raising the issue.
@Emrys365, can you answer it for me?

from espnet.

Emrys365 commented on May 27, 2024

@AntoineBlanot Could you paste the content of run.sh and the model config file (.yaml) you used?

AntoineBlanot commented on May 27, 2024

Sure! Here they are:
run.sh

#!/usr/bin/env bash
# Set bash to 'debug' mode, it will exit on :
# -e 'error', -u 'undefined variable', -o ... 'error in pipeline', -x 'print commands',
set -e
set -u
set -o pipefail

sample_rate=16k # 8k or 16k
min_or_max=min  # "min" or "max". This is to determine how the mixtures are generated in local/data.sh.


train_set="train"
valid_set="dev"
test_sets="test "

CUDA_VISIBLE_DEVICES=0,1 ./enh.sh \
    --is_tse_task true \
    --train_set "${train_set}" \
    --valid_set "${valid_set}" \
    --test_sets "${test_sets}" \
    --fs "${sample_rate}" \
    --ref_num 2 \
    --local_data_opts "--sample_rate ${sample_rate} --min_or_max ${min_or_max}" \
    --lang en \
    --ngpu 2 \
    --enh_config ./conf/train.yaml \
    "$@"

train.yaml

optim: adam
max_epoch: 100
batch_type: folded
batch_size: 16
iterator_type: chunk
chunk_length: 48000
# exclude keys "enroll_ref", "enroll_ref1", "enroll_ref2", ...
# from the length consistency check in ChunkIterFactory
chunk_excluded_key_prefixes:
- "enroll_ref"
num_workers: 4
optim_conf:
    lr: 1.0e-03
    eps: 1.0e-08
    weight_decay: 0
unused_parameters: true
patience: 20
accum_grad: 1
grad_clip: 5.0
val_scheduler_criterion:
- valid
- loss
best_model_criterion:
-   - valid
    - snr
    - max
-   - valid
    - loss
    - min
keep_nbest_models: 1
scheduler: reducelronplateau
scheduler_conf:
   mode: min
   factor: 0.7
   patience: 3

model_conf:
    num_spk: 2
    share_encoder: true

train_spk2enroll: data/train-100/spk2enroll.json
enroll_segment: 48000
load_spk_embedding: false
load_all_speakers: false

encoder: conv
encoder_conf:
    channel: 256
    kernel_size: 32
    stride: 16
decoder: conv
decoder_conf:
    channel: 256
    kernel_size: 32
    stride: 16
extractor: td_speakerbeam
extractor_conf:
    layer: 8
    stack: 4
    bottleneck_dim: 256
    hidden_dim: 512
    skip_dim: 256
    kernel: 3
    causal: False
    norm_type: gLN
    pre_nonlinear: prelu
    nonlinear: relu
    # enrollment related
    i_adapt_layer: 7
    adapt_layer_type: mul
    adapt_enroll_dim: 256
    use_spk_emb: false

# A list for criterions
# The overall loss in the multi-task learning will be:
# loss = weight_1 * loss_1 + ... + weight_N * loss_N
# The default `weight` for each sub-loss is 1.0
criterions:
  # The first criterion
  - name: snr
    conf:
      eps: 1.0e-7
    wrapper: fixed_order
    wrapper_conf:
      weight: 1.0

Emrys365 commented on May 27, 2024

Thank you! I think the error is caused by the default value of the argument load_all_speakers (=false), which TSEPreprocessor also uses. With that value, it only prepares one reference signal (corresponding to one of the speakers in each mixture sample) as the target, while your setup asks for two speakers (--ref_num 2 in run.sh and num_spk: 2 in train.yaml).

To avoid this error, set load_all_speakers to true in train.yaml.
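
For anyone hitting the same problem, here is a minimal sketch of a sanity check for this mismatch. It is not part of ESPnet: check_tse_config is a hypothetical helper, and the rule it applies (num_spk greater than 1 needs load_all_speakers set to true) is only inferred from the explanation in this thread. It takes an already parsed config as a plain dict, for example the result of loading train.yaml with a YAML parser.

```python
from typing import List


def check_tse_config(config: dict) -> List[str]:
    """Return warnings for a parsed train.yaml-style dict.

    Hypothetical helper, not an ESPnet API. The rule below is an
    assumption based on the explanation given in this thread.
    """
    warnings = []
    model_conf = config.get("model_conf") or {}
    num_spk = model_conf.get("num_spk", 1)
    # The thread states that load_all_speakers defaults to false.
    load_all = config.get("load_all_speakers", False)
    if num_spk > 1 and not load_all:
        warnings.append(
            f"model_conf.num_spk={num_spk} but load_all_speakers is false: "
            "only one reference signal is prepared per mixture"
        )
    return warnings


# Values copied from the train.yaml posted above.
posted = {
    "model_conf": {"num_spk": 2, "share_encoder": True},
    "load_all_speakers": False,
}
# Same config with the suggested fix applied.
fixed = dict(posted, load_all_speakers=True)

print(check_tse_config(posted))  # a list with one warning
print(check_tse_config(fixed))   # []
```

With the posted values the check reports one warning, and it reports none once load_all_speakers is true.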

Sorry about this mistake, I will also make a PR to update the related files.
