Describe the bug There is a mismatch in the number of speech refe

TSE with Librimix: mismatch in number of speakers about espnet HOT 4 OPEN

AntoineBlanot commented on May 27, 2024

TSE with Librimix: mismatch in number of speakers


Comments (4)

sw005320 commented on May 27, 2024

Thanks for raising the issue.
@Emrys365, can you answer it for me?


Emrys365 commented on May 27, 2024

@AntoineBlanot Could you paste the content of run.sh and the model config file (.yaml) you used?


AntoineBlanot commented on May 27, 2024

> @AntoineBlanot Could you paste the content of run.sh and the model config file (.yaml) you used?

Sure! Here they are:
run.sh

#!/usr/bin/env bash
# Set bash to 'debug' mode, it will exit on :
# -e 'error', -u 'undefined variable', -o ... 'error in pipeline', -x 'print commands',
set -e
set -u
set -o pipefail

sample_rate=16k # 8k or 16k
min_or_max=min  # "min" or "max". This is to determine how the mixtures are generated in local/data.sh.


train_set="train"
valid_set="dev"
test_sets="test "

CUDA_VISIBLE_DEVICES=0,1 ./enh.sh \
    --is_tse_task true \
    --train_set "${train_set}" \
    --valid_set "${valid_set}" \
    --test_sets "${test_sets}" \
    --fs "${sample_rate}" \
    --ref_num 2 \
    --local_data_opts "--sample_rate ${sample_rate} --min_or_max ${min_or_max}" \
    --lang en \
    --ngpu 2 \
    --enh_config ./conf/train.yaml \
    "$@"

train.yaml

optim: adam
max_epoch: 100
batch_type: folded
batch_size: 16
iterator_type: chunk
chunk_length: 48000
# exclude keys "enroll_ref", "enroll_ref1", "enroll_ref2", ...
# from the length consistency check in ChunkIterFactory
chunk_excluded_key_prefixes:
- "enroll_ref"
num_workers: 4
optim_conf:
    lr: 1.0e-03
    eps: 1.0e-08
    weight_decay: 0
unused_parameters: true
patience: 20
accum_grad: 1
grad_clip: 5.0
val_scheduler_criterion:
- valid
- loss
best_model_criterion:
-   - valid
    - snr
    - max
-   - valid
    - loss
    - min
keep_nbest_models: 1
scheduler: reducelronplateau
scheduler_conf:
   mode: min
   factor: 0.7
   patience: 3

model_conf:
    num_spk: 2
    share_encoder: true

train_spk2enroll: data/train-100/spk2enroll.json
enroll_segment: 48000
load_spk_embedding: false
load_all_speakers: false

encoder: conv
encoder_conf:
    channel: 256
    kernel_size: 32
    stride: 16
decoder: conv
decoder_conf:
    channel: 256
    kernel_size: 32
    stride: 16
extractor: td_speakerbeam
extractor_conf:
    layer: 8
    stack: 4
    bottleneck_dim: 256
    hidden_dim: 512
    skip_dim: 256
    kernel: 3
    causal: False
    norm_type: gLN
    pre_nonlinear: prelu
    nonlinear: relu
    # enrollment related
    i_adapt_layer: 7
    adapt_layer_type: mul
    adapt_enroll_dim: 256
    use_spk_emb: false

# A list for criterions
# The overall loss in the multi-task learning will be:
# loss = weight_1 * loss_1 + ... + weight_N * loss_N
# The default `weight` for each sub-loss is 1.0
criterions:
  # The first criterion
  - name: snr
    conf:
      eps: 1.0e-7
    wrapper: fixed_order
    wrapper_conf:
      weight: 1.0


Emrys365 commented on May 27, 2024

Thank you! I think the error is caused by load_all_speakers being false, which is both the value in your train.yaml and the default in TSEPreprocessor. With this setting, only one reference signal (corresponding to one of the speakers in each mixture sample) is prepared as the target, while your setup (--ref_num 2, num_spk: 2) asks for two.

To avoid this error, you could set load_all_speakers to true in train.yaml.

Sorry about this mistake, I will also make a PR to update the related files.
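
For reference, a minimal sketch of the suggested change, assuming the rest of train.yaml stays exactly as posted above. Only the last line differs; the comment on it paraphrases the explanation in this thread rather than the ESPnet documentation.

```yaml
# train.yaml (excerpt): only load_all_speakers changes
model_conf:
    num_spk: 2
    share_encoder: true

train_spk2enroll: data/train-100/spk2enroll.json
enroll_segment: 48000
load_spk_embedding: false
# was false; per the explanation above, false prepares only one
# reference signal per mixture, which does not match num_spk: 2
load_all_speakers: true
```

The --ref_num 2 option in run.sh is left unchanged in this sketch.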

