glam-imperial / emotionalconversionstargan

This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition".

Python 99.80% Shell 0.20%

generative-adversarial-network stargan stargan-vc data-augmentation emotion-recognition speech-synthesis deep-learning deep-neural-networks icassp-2020 icassp

emotionalconversionstargan's Issues

Detailed instructions to train

Hi, could you please write some detailed instructions on:

1. Required datasets
2. Pre-processing data for training
3. How to train

Thanks so much.

Is there a pretrained model we can download?

Thank you!

Does "<path/to/model_checkpoint.ckpt>" mean "./checkpoints/model_step2/30000" when performing conversion?

I ran this code and planned to convert some samples, but the converted audio was noisy. I didn't change the code, so when performing conversion, does "<path/to/model_checkpoint.ckpt>" mean "./checkpoints/model_step2/30000"?

Which directory contains Ses01F_impro01.txt?

Session1\dialog\transcriptions
Session1\dialog\MOCAP_rotated
Session1\dialog\MOCAP_head
Session1\dialog\MOCAP_hand
Session1\dialog\EmoEvaluation

Which of these directories contains Ses01F_impro01.txt?

UnboundLocalError: local variable 'dimensions_dis' referenced before assignment

Traceback (most recent call last):
  File "/disk2/zhh/EmotionalConversionStarGAN-master/run_preprocessing.py", line 214, in <module>
    run_preprocessing(args)
  File "/disk2/zhh/EmotionalConversionStarGAN-master/run_preprocessing.py", line 199, in run_preprocessing
    generate_world_features(audio_filenames, data_dir)
  File "/disk2/zhh/EmotionalConversionStarGAN-master/run_preprocessing.py", line 87, in generate_world_features
    wav, labels = get_wav_and_labels(f, data_dir)
  File "/disk2/zhh/EmotionalConversionStarGAN-master/utils/data_preprocessing_utils.py", line 111, in get_wav_and_labels
    labels = concatenate_labels(category, speaker, dimensions, dimensions_dis)
UnboundLocalError: local variable 'dimensions_dis' referenced before assignment

How can I solve this problem?
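For readers hitting the same traceback: an UnboundLocalError generally means a local name is only assigned inside a branch that never ran, and is then read afterwards. The sketch below reproduces that pattern and shows one defensive fix. The function names and the tab-separated line format are hypothetical and are not taken from `data_preprocessing_utils.py`. Whether the repository's parser fails for this reason (for example, because no label line matched the utterance under the given data directory) is an assumption, not something the issue confirms.

```python
def find_category_unsafe(lines, utterance_id):
    """Hypothetical parser that only assigns `category` when a line matches."""
    for line in lines:
        if utterance_id in line:
            category = line.split("\t")[-1].strip()
    # If no line matched, `category` was never assigned and this raises
    # UnboundLocalError, the same class of error as in the traceback.
    return category


def find_category_safe(lines, utterance_id):
    """Same lookup, but the name always exists and a miss is reported clearly."""
    category = None  # initialise before the loop
    for line in lines:
        if utterance_id in line:
            category = line.split("\t")[-1].strip()
            break
    if category is None:
        raise ValueError(f"no label line found for {utterance_id!r}")
    return category
```

With the safe variant, a missing label surfaces as a ValueError naming the utterance, which makes it easier to tell whether the data directory or the label file is at fault.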

Learned speech pronunciation problems

Hello.

I ran the code on the same dataset, but the converted speech has pronunciation problems.

Unlike the example audio you uploaded, the pronunciation in my output is garbled and the voice sounds muffled, as if underwater.

Please let me know if you know why this problem occurred.

Thank you.

num_emos should be 4 not 3

For anyone trying to reproduce the results, edit the config files so that num_emos is 4, not 3. There are four emotions: angry, happy, sad, and neutral. Since the neutral emotion is assigned the label 3, if num_emos is 3, you will end up excluding the neutral emotion during training.
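The effect described above can be sketched as follows, assuming the training code keeps only samples whose emotion label is below `num_emos`. That filtering rule is an assumption made for illustration. Only "neutral = 3" comes from the issue; the indices given to the other three emotions and the helper `emotions_kept` are hypothetical.

```python
# Label 3 for neutral is stated in the issue; the other indices are assumed.
EMOTION_LABELS = {"angry": 0, "happy": 1, "sad": 2, "neutral": 3}


def emotions_kept(num_emos):
    """Return the emotions whose label falls inside range(num_emos)."""
    return [name for name, label in EMOTION_LABELS.items() if label < num_emos]


print(emotions_kept(3))  # neutral is dropped
print(emotions_kept(4))  # all four emotions are kept
```

Under that assumption, setting num_emos to 3 silently drops every neutral sample, which matches the behaviour the issue reports.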
