eeskimez / emotalkingface

The code for the paper "Speech Driven Talking Face Generation from a Single Image and an Emotion Condition"

License: MIT License

Python 100.00%

emotalkingface's People

Contributors

eeskimez, joecodecreations, yzyouzhang


emotalkingface's Issues

About the training recipe

Thank you very much for open-sourcing your code and model. I tried training with your code and it works fine. I just want to know how many epochs you trained for when pre-training the emotion discriminator and pre-training the generator. Thanks.

How to preserve identity better?

When I use an image of my own, the result is heavily distorted and does not look like the original face. Is there any parameter to improve identity preservation?

Error in Data preparation

When I run the following code: python .\data_prep\prepare_data.py -i \25_fps_video_folder\ -o \output_folder --mode 1 --nw 1
I ran into this problem:
TypeError: Can't parse 'center'. Sequence item with index 0 has a wrong type
Can you please suggest how to handle it?
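A likely cause, though not confirmed by the maintainers: newer OpenCV builds refuse to parse point arguments such as center when the coordinates are numpy scalars rather than plain Python numbers. A minimal sketch of the cast that usually fixes it, assuming the failing call is something like cv2.getRotationMatrix2D inside prepare_data.py:

    import cv2
    import numpy as np

    # Hypothetical repro: if `center` holds numpy scalars (e.g. float32),
    # newer OpenCV versions raise "Can't parse 'center'. Sequence item
    # with index 0 has a wrong type" instead of accepting the point.
    center = np.array([128.0, 128.0], dtype=np.float32)  # e.g. from landmarks
    # M = cv2.getRotationMatrix2D(center, 15, 1.0)       # can raise the TypeError
    M = cv2.getRotationMatrix2D((float(center[0]), float(center[1])), 15, 1.0)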

About personal test wav results

Thanks for your great work! When I test the pretrained model with the original wav extracted from the .flv files, it returns good results. However, when I use my own wav, the image is sometimes blurry and deformed. Can you give me some suggestions to solve this? Thanks in advance.
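One preprocessing step worth trying (an assumption, not an official fix): the model was presumably trained on audio at one fixed sample rate, so resampling your own wav to that rate, in mono, before inference can reduce artifacts. The target rate below is a placeholder; check the repo's data-preparation code for the actual value.

    import librosa
    import soundfile as sf

    TARGET_SR = 8000  # placeholder -- verify against data_prep in the repo
    # librosa resamples to TARGET_SR and downmixes to mono on load
    y, _ = librosa.load("my_speech.wav", sr=TARGET_SR, mono=True)
    sf.write("my_speech_resampled.wav", y, TARGET_SR)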

Error when pre-training the emotion discriminator

Traceback (most recent call last):
  File "train.py", line 204, in <module>
    train()
  File "train.py", line 130, in train
    train_loader = torch.utils.data.DataLoader(trainDset,
  File "C:\Users\INHA\anaconda3\envs\face\lib\site-packages\torch\utils\data\dataloader.py", line 268, in __init__
    sampler = RandomSampler(dataset, generator=generator)
  File "C:\Users\INHA\anaconda3\envs\face\lib\site-packages\torch\utils\data\sampler.py", line 102, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
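For reference, num_samples=0 means the Dataset reported zero items, which usually points to an empty or wrong preprocessed-data path rather than a PyTorch bug. A self-contained sketch that reproduces the failure and adds a clearer guard (the TensorDataset is a stand-in for the repo's dataset class):

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    train_dset = TensorDataset(torch.empty(0, 3))  # zero samples -> same error
    if len(train_dset) == 0:
        raise RuntimeError("Dataset is empty: check that prepare_data.py ran "
                           "and that the dataset path passed to train.py is correct.")
    # shuffle=True builds the RandomSampler that raises for empty datasets
    train_loader = DataLoader(train_dset, batch_size=16, shuffle=True)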

How to train the GAN for face movements without caring about emotions

Hi everyone, is there any way to train this model on lip movement alone, without caring about emotions? I'm trying to use my own dataset, but it does not have emotion labels. I was wondering whether there is a way to transfer the lip movement from a video onto an image without labeling the emotion of every video, since the videos are taken from YouTube and are in different languages.

How to continue training

Hi, my PC crashed while training the model. Is there any way to continue training from the last save?
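If the training script's saved checkpoints include optimizer state, the generic PyTorch pattern below applies (a sketch, not necessarily the repo's actual checkpoint format; if the saved files hold only model weights, reload those and restart with a fresh optimizer):

    import torch

    def save_checkpoint(path, model, optimizer, epoch):
        # bundle everything needed to resume mid-run
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "epoch": epoch}, path)

    def resume(path, model, optimizer):
        ckpt = torch.load(path, map_location="cpu")
        model.load_state_dict(ckpt["model"])
        optimizer.load_state_dict(ckpt["optimizer"])
        return ckpt["epoch"] + 1  # epoch to continue from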

Recreating training from scratch

Hi, I am trying to retrain everything from scratch.

Could I request further details on the training hyperparameters used for all three stages (emotion discriminator pre-training, generator pre-training, and joint training)? Or perhaps the final expected values for the losses plotted on TensorBoard?

As mentioned in another issue, using the pretrained model works fine. However, training the model from scratch (as instructed on GitHub) produces different results; at best I have only been able to get the mouth to open. I noticed some discrepancies between the paper and the source code (e.g., the discriminators' learning rates are stated to all be 1e-4 but default to lower values in train.py, and frames per sample is stated as 32 in the paper but defaults to 25 in train.py). I have also cut training shorter than the default 1000 epochs, since the 100k iterations stated in the paper seem to correspond to only about 100 epochs, if I understand correctly.

Apologies for the nitpicking. I'm trying to recreate the training and am at a loss; your response would greatly help me cut down the time spent experimenting.

Training gives same frame in whole video

Hi,
We have been trying to run the code. The pretrained model you provided works perfectly fine; however, when we train the model from scratch, the generated output is always the same frame for the whole video, with the audio running in the background and no changes in facial or lip features.
What could be the potential reasons for this? Did you notice anything similar during your own training?

The following are the changes we made to the code:

  1. We train on only 2 emotions, happy and sad; all other emotions are removed when creating the dataset. We also selected only a subset of the dataset for these 2 emotions (around 500 videos).
  2. We pre-trained the discriminator and generator for 5 epochs and performed joint training for 7 epochs.

Is this due to incorrect dataset preparation, the absence of other emotions (such as a neutral face), incomplete training (too few epochs), or some other reason?

Thanks in advance

Alignment and cropping

"We aligned the ground-truth, baseline, and proposed videos into a template image and cropped them into the same size using similarity transformation."

How do you do it?
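The quoted step is a standard landmark-based similarity alignment; here is a sketch of one way to implement it (my assumption, not the authors' code), using OpenCV's 4-DoF partial affine estimate:

    import cv2
    import numpy as np

    def align_to_template(frame, landmarks, template, size=(128, 128)):
        # landmarks/template: (N, 2) arrays of corresponding facial points
        # estimateAffinePartial2D fits rotation + uniform scale + translation,
        # i.e. a similarity transform, mapping landmarks onto the template
        M, _ = cv2.estimateAffinePartial2D(landmarks.astype(np.float32),
                                           template.astype(np.float32))
        return cv2.warpAffine(frame, M, size)  # aligned and cropped frame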

Pretrain emotion disc

What value should the classification loss go down to while pre-training the emotion discriminator?

Inference error

Hi. I get the following error when I use the sample image img01.png and the speech file speech01.wav.

python generate.py -im ./data/image_samples/img01.png -is ./data/speech_samples/speech01.wav -m ./model/ -o ./results/

Resulting message: Image file is not valid...

Could you guess what I am missing?
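A quick sanity check before digging deeper (assumption on my part: the message fires when the image cannot be read, or when no face is found in it):

    import os
    import cv2

    path = "./data/image_samples/img01.png"
    print(os.path.exists(path))            # False -> wrong working directory
    img = cv2.imread(path)
    print(img is not None and img.shape)   # None -> unreadable or corrupt file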
