
evp's People

Contributors

jixinya


evp's Issues

3DMM model

Hi,
Awesome work!

I would like to know more about the 3DMM part of the work. I tried to generate the parameters with a 3DMM model, but the results were not accurate. Which 3DMM model did you use?

M003 results error

Running the script test_target.sh for M003 with its pretrained texture model produces a sequence of frames (shown below) in which the textures of the ears, neck, and portions of the hair disappear. The output for the other test subject, M030, does not contain such an anomaly. Could you explain what might cause this issue and how to rectify it?
[Attached frames: fake_B_000002, fake_B_000005, fake_B_000010, fake_B_000034]

question about training

I ran into some problems when running python train/disentanglement/dtw/MFCC_dtw.py.

  1. Could you explain what lines 132-140 do?
    for j in range(13):
        # load the pickled list of MFCC arrays for index j
        f = open(filepath + str(j) + '/' + str(i) + '.pkl', 'rb')
        mfcc = pickle.load(f)
        f.close()
        # dump each element as its own pickle, numbered
        # consecutively across j via the running offset n
        for k in range(len(mfcc)):
            mfcc_k = mfcc[k]
            with open(os.path.join(con_path, str(k + n) + '.pkl'), 'wb') as f:
                pickle.dump(mfcc_k, f)
        n += len(mfcc)
  2. Is the dimension of mfcc_k really (13,)? An array of that shape cannot be fed into the network (it causes a dimension mismatch).

Looking forward to your reply.
Thanks.

can not download pre-trained model

It shows:

Sorry, you can't view or download this file at this time.

Too many users have viewed or downloaded this file recently. Please try accessing the file again later. If the file you are trying to access is particularly large or is shared by many people, it may take up to 24 hours before you can view or download it. If you still can't access the file after 24 hours, contact your domain administrator.

Sometimes it is a 503 error instead.

About the landmark MEAN and PCA

Thank you for the great work.
Could you please share the details of how one specific person's landmark mean and PCA are calculated?
Take one person's video data in MEAD, for example: there are multiple viewpoints, emotions, and intensities. Which videos are used to extract the landmarks and produce the mean and PCA? The front view with intensity 3? The front view with intensity 2? Some other combination, or even all of the videos?

I am trying to pre-process the MEAD dataset for some face reenactment work, and I would appreciate it a lot if you could share the pre-processing details.
Looking forward to the reply!

About Dynamic Timing Warping preprocess mfcc features

Given two audio feature arrays of shape [N1, 28, 12] and [N2, 28, 12], could you please show demo code for aligning them with DTW?
When training the disentanglement module, many audio features are used. Did you align all of them to the minimum length?
I am really curious about the details of audio feature preprocessing.
Thank you for your great work and looking forward to your reply!
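For illustration, a minimal classic-DTW sketch in numpy (not the authors' code) that warps one feature sequence onto another's time axis; the [N, 28, 12] shapes follow the question, and frames are compared by Euclidean distance on the flattened 28x12 features:

```python
import numpy as np

def dtw_align(a, b):
    """Warp sequence b onto the time axis of sequence a with classic DTW.

    a: [N1, 28, 12], b: [N2, 28, 12]; returns b resampled to length N1.
    """
    n1, n2 = len(a), len(b)
    fa, fb = a.reshape(n1, -1), b.reshape(n2, -1)
    # pairwise Euclidean cost between frames
    cost = np.linalg.norm(fa[:, None, :] - fb[None, :, :], axis=-1)
    # accumulated cost with the standard match/insert/delete recursion
    acc = np.full((n1 + 1, n2 + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    # backtrack the optimal warping path
    i, j, path = n1, n2, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    # for each frame of a, keep the last matched frame of b
    idx = np.zeros(n1, dtype=int)
    for pi, pj in path:
        idx[pi] = pj
    return b[idx]

a = np.random.rand(30, 28, 12)
b = np.random.rand(25, 28, 12)
aligned = dtw_align(a, b)
print(aligned.shape)  # (30, 28, 12)
```

Whether EVP aligns everything to a reference length this way, or to the minimum length as the question suggests, is exactly what remains to be confirmed.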

Ground Truth Validation Landmarks

Could you provide the ground-truth landmarks used for validating the models? Specifically, I'm looking for the test landmarks of the M030 identity.

The meaning of the output

Hi and thanks for sharing the great work!

I followed your instructions and successfully ran the test scripts to generate output, but I am confused about the meaning of the lm2video outputs 'M003_01_3_output_01/', 'M003_02_3_output_01'...

Could you please briefly explain what is the driving audio for synthesizing the results and where can we find the corresponding lip-sync video?

Looking forward to your response. Thanks.

The implementation of "Cross-Reconstructed Emotion Disentanglement" module

Hi @jixinya,
Thanks for your excellent work!
I am curious about the implementation of Cross-Reconstructed Emotion Disentanglement. In the paper, you say, "Given four audio samples Xi,m, Xj,n, Xj,m, Xi,n" for disentangling. However, the implementation in this project is a little different: you sample 2 emotions and 3 contents, set X11, X21, X12, X23 as inputs, and use X11, X11, X12, X12 as the four targets when calculating the cross-reconstruction and self-reconstruction losses (as below):

return {"input11": mfcc11, "target11": target11,
        "target21": target11, "target22": target12,
        "input12": mfcc12, "target12": target12,
        "label1": label1, "label2": label2,
        "input21": mfcc21, "input32": mfcc32}

Could you please explain this? Hope for your response.

Pre-trained models of Test subjects

Thanks for releasing the code and pre-trained models. Could you please also release the pre-trained models and preprocessed data for all the other MEAD test subjects used in EVP?

About the DECODER used in the Cross-reconstruction Emotion Disentanglement Module

Thank you for the great work; the disentanglement of content and emotion features is really novel.
When I reproduce this module, I get stuck on the decoder structure. Could you show me demo code?
Say we have the concatenation of content and emotion features, of shape [batch_size, N, content_dim + emotion_dim]; how do we convert it to MFCC features of shape [batch_size, N, 28, 12]?
Looking forward to your reply and thank you in advance!
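This is not the paper's decoder, but a minimal numpy sketch of the shape bookkeeping: a per-frame MLP (with made-up hidden size and random placeholder weights) maps [batch, N, content_dim + emotion_dim] to [batch, N, 28*12], which is then reshaped to [batch, N, 28, 12]:

```python
import numpy as np

# Hypothetical dimensions; EVP's real decoder architecture is not shown here.
batch, n_frames = 4, 30
content_dim, emotion_dim = 128, 16
d_in, d_out = content_dim + emotion_dim, 28 * 12

rng = np.random.default_rng(0)
# one-hidden-layer MLP applied independently to each frame
w1, b1 = rng.standard_normal((d_in, 256)) * 0.02, np.zeros(256)
w2, b2 = rng.standard_normal((256, d_out)) * 0.02, np.zeros(d_out)

features = rng.standard_normal((batch, n_frames, d_in))
hidden = np.maximum(features @ w1 + b1, 0.0)       # ReLU
mfcc_flat = hidden @ w2 + b2                       # [batch, N, 336]
mfcc = mfcc_flat.reshape(batch, n_frames, 28, 12)  # [batch, N, 28, 12]
print(mfcc.shape)  # (4, 30, 28, 12)
```

A recurrent or convolutional decoder would change the per-frame mapping but not this final reshape step.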

The effectiveness on "Cross-Reconstructed Emotion Disentanglement" module

To ensure that audio emotion and speech content are disentangled, you design a Cross-Reconstructed Emotion Disentanglement module in the paper. In my opinion, the emotion encoder and content encoder should be frozen once the disentanglement training is finished. But I found that the two pretrained models you provide for two different subjects have completely different weights in the emotion encoder and content encoder. So I guess that you fine-tune these two encoders together with the other parts when training the audio2lm module - but how can you guarantee the disentanglement once you fine-tune these two encoders?

3DMM parameters

Hi,

I am trying to test with my own dataset but am having trouble generating the 3DMM parameters.
Could you explain the procedure you used to obtain them?

Thanks,

Training data of M030 into train.zip

Hello, and thank you so much for sharing the code. I was trying to run your training code but could not figure out where to obtain the training data you mention under "audio2lm/data/M030/audio/".

No such file or directory: base_weight.pickle

Enjoyed the paper - thanks! Looking forward to seeing the training code :) -
A few things:

  1. In test.py, line 208, add_audio(config['video_dir'], opt.in_file) throws an error - AttributeError: 'Namespace' object has no attribute 'in_file' (the code runs if this line is commented out).
  2. Line 3 in M003_test.yaml, video_dir: audio2lm/result/M003.mp4, throws an error - OpenCV: FFMPEG: tag 0x44495658/'XVID' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)' - which can be fixed by changing it to video_dir: audio2lm/result/M003.avi.
  3. In section 2, running python lm2video/lm2map.py results in FileNotFoundError: [Errno 2] No such file or directory: './base_weight.pickle' - not sure how this file is generated or obtained.
    Any help much appreciated - thank you!

Generating video from one image and audio

Hi, and thank you for the great work and for sharing the test and training code! I'm wondering: is it possible to generate a video with only one image and an audio clip as input?

Confusion about audio2lm/test.py

audio2lm/test.py#L113
example_landmark = example_landmark - mean.expand_as(example_landmark)

Both example_landmark and mean are loaded from config['mean'], so won't this subtraction output all zeros?

eyes blink

Can the eyes blink in the result video?

Confusion about the testing result

Thank you for the great work. When I reproduce the results in testing step 2, I run into some confusion. The images provided in
"test/data/M003/3DMM/3DMM/M003_01_3_output_01/image/" show the target person looking very angry. However, the images generated in "/results/M003/test_latest/M003_01_3_output_01" do not look as angry, and the mouth shape is not consistent with the provided images.
[Attached frame: fake_B_000058]
I assume the two folders correspond to each other, given their consistent naming, but the result puzzles me. I may have misunderstood something - could you help clear up my confusion?

Generate the data for training error

I am running python landmark/code/preprocess.py, but the path in the traceback below does not exist. How should this be handled?

Traceback (most recent call last):
File "/home/user/PycharmProjects/EVP/EVP/train/landmark/code/preprocess.py", line 145, in
a = np.load(path)
File "/home/user/anaconda3/envs/evp/lib/python3.6/site-packages/numpy/lib/npyio.py", line 416, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/PycharmProjects/EVP/EVP/train/landmark/dataset_M030/landmark/M030_fear_3_026/5.npy'
