
stanford-tml / edge


Official PyTorch Implementation of EDGE (CVPR 2023)

Home Page: https://edge-dance.github.io

License: MIT License

Python 95.17% Shell 0.67% Jupyter Notebook 4.16%
animation dance-generation diffusion-models pytorch

edge's People

Contributors

jtseng20, rodrigo-castellon


edge's Issues

How to synchronize with sound

Thanks for sharing this interesting code.

I tried test.py right away. With roughly 150 seconds of input audio, I got about 30 seconds of output (*.mp4). Is it possible to synchronize the input audio with the output so I can check the dance against the sound?

[attached video: test_2RicaUqd9Hg.mp4]
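
One way to check the result against the music is to cut the matching segment out of the input audio and mux it onto the rendered video. The sketch below shells out to ffmpeg; the file names and the assumption that the clip starts at 0 s and runs 30 s are placeholders, since you need the segment test.py actually sliced for generation.

# mux the matching 30 s audio segment onto the rendered dance video
import subprocess

subprocess.run([
    "ffmpeg", "-y",
    "-i", "test_2RicaUqd9Hg.mp4",   # rendered dance video
    "-ss", "0", "-t", "30",         # assumed offset and length of the matching segment
    "-i", "input_song.wav",         # placeholder input audio path
    "-map", "0:v", "-map", "1:a",
    "-c:v", "copy", "-shortest",
    "out_with_audio.mp4",
], check=True)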

NameError: name 'FbxAnimCurve' is not defined

Thanks for providing this interesting code.

I tried the SMPL-to-FBX conversion and got the error in the title.

The execution environment is as follows:

Ubuntu 20.04
Python 3.10
PyTorch 1.13.1

The FBX SDK was installed according to the following procedure:

FBX SDK Download (https://www.autodesk.com/developer-network/platform-technologies/fbx-sdk-2020-3)

mkdir -p /fbx-sdk/install
tar -zxvf /tmp/fbx202032_fbxpythonsdk_linux.tar.gz -C /fbx-sdk
/fbx-sdk/fbx202032_fbxpythonsdk_linux /fbx-sdk/install 

If there is a way to fix this, please let me know.
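
Since this NameError usually means the FBX Python bindings never imported cleanly, a quick sanity check from the same interpreter may help (a sketch; note the prebuilt SDK ships bindings only for specific Python versions, so a Python 3.10 environment needs a matching 3.10 build):

# verify the FBX Python SDK is importable from this exact interpreter
import fbx                    # ImportError here -> SDK not installed for this env
from fbx import FbxAnimCurve  # ImportError here -> bindings too old or wrong build

print(fbx.FbxManager.Create())  # confirms the native library actually loads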

What is the input for evaluation?

When I run eval_pfc.py, the result changes on every run, but the paper reports 1.5363 (or 1.6545). I wonder whether my input is wrong. I am using the 20 .pkl files generated by the model from the AIST++ test set.

Clarification on Reproducing PFC Score from EDGE Paper

Dear EDGE Authors,

I am fascinated by your paper and wish to gain a better understanding of your methodology, particularly concerning the process to replicate the evaluation metrics detailed in your publication.

In your work, it's mentioned that for automatic evaluations such as PFC, Beat Alignment, Dist_k, and Dist_g, 5-second clips were obtained from each model using slices from the test music set with a 2.5-second stride. However, the process of deriving these 5-second clips from the AIST++ dataset remains unclear to me. As far as I understand, the test set comprises 20 musical pieces ranging from 8 to 15 seconds each.
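
For reference, this is how I read the slicing described in the paper (a sketch assuming 5-second windows with a 2.5-second stride; the file name is a placeholder):

# slice one test song into 5 s windows with a 2.5 s stride
import librosa
import soundfile as sf

wav, sr = librosa.load("test_song.wav", sr=None)
win, stride = int(5 * sr), int(2.5 * sr)
for i, start in enumerate(range(0, len(wav) - win + 1, stride)):
    sf.write(f"test_song_slice{i}.wav", wav[start:start + win], sr)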

I have attempted to replicate the PFC metrics using the following approaches:

  1. I used the raw input music from the AIST++ test set, selecting the initial 5 seconds, resulting in a PFC of 1.6836428132824115.
  2. I made use of pre-split slices in the test set, incorporating 186 samples, each 5 seconds long, which led to a PFC of 1.2385500567535723.
  3. Using the original implementation in test.py with the default output length (which I interpret as selecting random slices for motion generation), two generation runs yielded PFCs of 1.5676957425076252 and 1.7647031391283114, respectively.

I am trying to replicate the PFC score (1.5363) reported in your paper, and I would greatly appreciate your guidance in this matter. Please let me know if there are any misconceptions in my understanding.

Looking forward to your kind assistance.

Demo is not working

Hi, thanks for the extensive research on this! I was trying to see how the demo works in real time, but I realize the site is down: https://edge-sandbox.com/. Will you be fixing this? If anyone else has been able to replicate this in real time, please share. Thanks!

Loading custom music (ValueError: empty range for randrange())

Hello, thanks for your amazing work!

I tried to test the model on my own music (.wav), following the README, with the command python test.py --music_dir custom_music/. However, it raises a ValueError. Can you tell me how to resolve this issue?

Thanks!

Computing features for input music
Slicing custom_music/gasoline.wav
Traceback (most recent call last):
  File "test.py", line 128, in <module>
    test(opt)
  File "test.py", line 81, in test
    rand_idx = random.randint(0, len(file_list) - sample_size)
  File "/opt/conda/envs/edge/lib/python3.8/random.py", line 248, in randint
    return self.randrange(a, b+1)
  File "/opt/conda/envs/edge/lib/python3.8/random.py", line 226, in randrange
    raise ValueError("empty range for randrange() (%d, %d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0, -1, -1)
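
For what it's worth, the traceback shows len(file_list) - sample_size going negative: the custom song produced fewer slices than the number of samples test.py tries to draw. A hedged local workaround (clamping the local variable from the traceback; not the authors' fix):

# in test.py, just before the failing line: never request more slices than exist
sample_size = min(sample_size, len(file_list))
rand_idx = random.randint(0, len(file_list) - sample_size)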

Data download

Hi! I'm trying to download and preprocess the dataset.
But on GitHub you said it will take ~24 hrs and ~50 GB to precompute all the Jukebox features for the dataset.
If I'm only allowed 720 minutes (12 hours) at a time on the GPU server, how can I continue preprocessing in the next 12-hour session without starting over from the beginning?
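
One common pattern, sketched below, makes the precompute step resumable by skipping songs whose feature file already exists, so each 12-hour session picks up where the last one stopped. Here extract_jukebox_features is a hypothetical stand-in for whatever extraction call the preprocessing script actually makes, and the directory names are placeholders.

# resume-friendly feature extraction: skip songs already processed in an earlier session
from pathlib import Path
import numpy as np

wav_dir, feat_dir = Path("wavs"), Path("jukebox_feats")   # placeholder directories
feat_dir.mkdir(exist_ok=True)
for wav in sorted(wav_dir.glob("*.wav")):
    out = feat_dir / (wav.stem + ".npy")
    if out.exists():                       # computed in a previous session; skip
        continue
    feats = extract_jukebox_features(wav)  # hypothetical extraction call
    np.save(out, feats)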

Variable length animations

I'm loving this so far and it has produced some hilarious dances, but (I apologize for being a rookie) what method could I use to batch-process songs of varying lengths and render them at 30 or 60 fps? I tried editing FbxReadWriter using GPT-4, but my attempts didn't produce the results I wanted. Thanks!
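
A simple way to batch over songs without touching the model code is to drive test.py once per song; the sketch below uses only the --music_dir flag that appears elsewhere in these issues, and the input folder name is a placeholder. Changing the render rate to 30 or 60 fps would still require edits inside the FBX writer itself.

# batch-driver sketch: stage each song in its own temporary music dir and run test.py
import shutil
import subprocess
import tempfile
from pathlib import Path

for song in sorted(Path("songs").glob("*.wav")):
    with tempfile.TemporaryDirectory() as tmp:
        shutil.copy(song, tmp)
        subprocess.run(["python", "test.py", "--music_dir", tmp], check=True)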

How to compute the PFC of GT .pkls in ./data/test/motions?

Hi, @rodrigo-castellon @jtseng20
Thank you for your great work! I wonder how to compute the PFC of the GT .pkls in ./data/test/motions. When I run ./eval/eval_pfc.py with motion_dir set to ./data/test/motions (GT), I get "./data/test/motions has a mean of nan". Then I found code like

info = pickle.load(open(pkl, "rb"))
joint3d = info["full_pose"]

but info doesn't have "full_pose"; instead its contents look like this:
[screenshot of the pickle contents]

So I just wonder what should I do to figure it out?
Best wishes!
Andy
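
For anyone hitting the same NaN: a first step is to look at what the GT pickles actually contain (a sketch; the file name is a placeholder). AIST++ ground-truth pickles typically store SMPL parameters rather than precomputed joint positions, so a "full_pose" joint array would first have to be produced by running SMPL forward kinematics on those parameters.

# inspect a ground-truth pickle before adapting eval_pfc.py
import pickle

with open("data/test/motions/example.pkl", "rb") as f:   # placeholder path
    info = pickle.load(f)
print(sorted(info.keys()))  # expect SMPL parameters, not a "full_pose" joint array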

Why are the feet moving all the time?

Amazing work!
I found that in the results the dancer's feet keep moving regardless of what the music is doing; it looks like skating.
I wonder whether the final global skeleton translation is correct?

About the geometric loss and data normalizer

Hello authors, thank you for the very impressive work.

I noticed with interest that the code uses a normalizer to preprocess the pose vectors before training the model. Then, in the FK forward pass that computes the FK loss and the foot loss, I saw that the lines that unnormalize the data (L482-483) were commented out. My intuition is that after normalization the rotations may no longer be valid rotations, so the FK forward pass may not work properly, and I thought the pose should be unnormalized to fix this. I'm just not sure whether that holds for the 6D rotation representation; maybe I'm wrong. Could you explain the reasoning? Thank you!

The reference code:

b, s, c = model_out.shape
# unnormalize
# model_out = self.normalizer.unnormalize(model_out)
# target = self.normalizer.unnormalize(target)
# X, Q
model_x = model_out[:, :, :3]
model_q = ax_from_6v(model_out[:, :, 3:].reshape(b, s, -1, 6))
target_x = target[:, :, :3]
target_q = ax_from_6v(target[:, :, 3:].reshape(b, s, -1, 6))

Easy installation

Hi. I'm trying to install this library and it's super hard: it involves installing pytorch3d and much more.
Is there a Docker image we could create to maintain this project more easily?
Or at least a working installation guideline for every OS?

It's a real obstacle for developers trying to learn from and see what you have done.

Normalization of COM acc

Thanks for sharing very nice work!
I can't understand the partial normalization applied only to the COM term in Equation 10 of the paper. If the implausibility is attributed to the foot velocities, the COM acceleration in Equation 8 should just act as an indicator (0 or 1). If the implausibility isn't attributed only to the foot velocities, there may be a reasonable interpretation. I'm confused by the PFC formulation, especially the partial normalization. Can you explain it in detail?

Length of motion used for calculating evaluation metrics

Your paper mentions: "We compute these metrics on 5-second dance clips produced by each approach". However, the codebase generates 30-second clips by default. Can you please clarify how you computed the metrics when comparing other methods to yours?

About Dance Editing

Thanks for sharing such great work!
I saw in the paper that EDGE is capable of motion editing. I'd like to know how to use it in the demo, given the first and last pose of a motion.
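
For context, the editing the paper describes follows the usual diffusion-inpainting recipe: at every denoising step, the constrained frames are overwritten with a noised copy of the constraint. Below is a generic sketch of that masking step, not EDGE's exact API; the shapes are illustrative.

# generic masked-denoising step for pinning the first and last pose
import torch

def apply_constraint(x_t, noised_constraint, mask):
    # x_t: (B, T, D) current noisy sample; mask: (B, T, 1) True where the pose is pinned
    return torch.where(mask, noised_constraint, x_t)

B, T, D = 1, 150, 151
mask = torch.zeros(B, T, 1, dtype=torch.bool)
mask[:, 0], mask[:, -1] = True, True   # constrain the first and last frame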

How to get smpl_offset

Hello,

I want to create a new motion dataset, but the number of keypoints in my dataset differs from yours, so I need to adjust the visualization accordingly. I have already found some related issues in vis.py.

Could you please explain in detail the steps to determine the smpl_offset parameter in vis.py under various conditions? Additionally, some example code or more detailed explanations in the documentation would be immensely helpful for understanding and using this parameter.
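
While waiting for the authors: a common way to derive per-joint offsets is from rest-pose joint positions plus a parent index per joint, both of which come from your own skeleton definition. This is a general sketch, not necessarily how vis.py's smpl_offset was produced.

# derive bone offsets as each joint's rest position relative to its parent
import numpy as np

def bone_offsets(rest_joints, parents):
    # rest_joints: (J, 3) rest-pose positions; parents[i] = parent of joint i, -1 for root
    offsets = np.zeros_like(rest_joints)
    for j, p in enumerate(parents):
        offsets[j] = rest_joints[j] if p < 0 else rest_joints[j] - rest_joints[p]
    return offsets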

About PFC implementation and eval result

Thanks for the amazing work, authors.
I am trying to reproduce the result reported in the paper and have two questions.

  1. What does the constant 10000 mean here in the PFC implementation? I can't map it to Equation (10) in the paper.

    out = np.mean(scores) * 10000

  2. I find it hard to reproduce the PFC result of 1.5363 from the paper. Do you only use the AIST++ test set to compute PFC? And what does "Generate ~1k samples" in the README mean? There are only 20 sequences in the test set, which yield 186 pieces after slicing.

Has anyone successfully reproduced the results? 🤔

Why use 2 time tokens?

self.to_time_tokens = nn.Sequential(
    nn.Linear(latent_dim * 4, latent_dim * 2),  # 2 time tokens
    Rearrange("b (r d) -> b r d", r=2),
)

In L278-L281 of model/model.py, what is the purpose of making 2 time tokens instead of just 1?

Mixamo bone retarget

Hi, I have a problem with the Mixamo bone retargeting. How do you match the bones?
[screenshot]

SMPL-to-FBX convert error

Thanks for sharing this work!
I tried to convert the predicted motion to FBX but got this error:

Traceback (most recent call last):
  File "SMPL-to-FBX/Convert.py", line 45, in <module>
    fbxReadWrite.addAnimation(pkl_name, smpl_params)
  File "/EDGE/SMPL-to-FBX/FbxReadWriter.py", line 93, in addAnimation
    lCurve = node.LclRotation.GetCurve(lAnimLayer, "X", True)
AttributeError: 'NoneType' object has no attribute 'LclRotation'

Any help would be appreciated !
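
The None node usually means a bone-name lookup against the target FBX failed, so a useful first check is to dump the node tree of ybot.fbx and compare it with the joint names the converter expects. A sketch against the FBX Python SDK:

# list the node names in the target FBX to compare against the SMPL joint names
import fbx

manager = fbx.FbxManager.Create()
importer = fbx.FbxImporter.Create(manager, "")
importer.Initialize("ybot.fbx", -1, manager.GetIOSettings())
scene = fbx.FbxScene.Create(manager, "scene")
importer.Import(scene)
importer.Destroy()

def walk(node, depth=0):
    print("  " * depth + node.GetName())
    for i in range(node.GetChildCount()):
        walk(node.GetChild(i), depth + 1)

walk(scene.GetRootNode())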

What is the best batch size without considering GPU memory?

Truly a masterpiece! And thank you for your willingness to share your work. I have a few questions about batch size.
The paper mentions using 4 A100 GPUs and a batch size of 512. Does the model train better at this batch size? If I want to use a different batch size, is there a recommended dataset split ratio? And what batch size was the checkpoint.pt provided on Google Drive trained with?

Could you please provide the learning curve?

Could you provide the curve of the loss over time during training, so that we can use it as a reference to confirm that training is heading in the right direction when reproducing the results?

The test dataset is too small

I'm curious: the total length of the test dataset is not enough for the batch size of 512 in the paper, so how did you train it?
Is there any other data that needs to be added to the test dataset?

Loading pretrained model not working

I've tried to load the pretrained model for further training.
I've checked that torch.load() works, and I've set the learning rate small, so the results should not look very different from the original model,
but the results do not seem to come from the pretrained weights.

The modified code, in train.py:
model = EDGE(opt.feature_type, learning_rate=0.000002, checkpoint_path="checkpoint.pt")

Discrepancy between code and paper about the L_simple loss

Hello,

First of all, thanks for your nice work.

I am trying to understand the training losses, and there is something I don't understand.
In the paper (equation 2, page 3), it is written that you are using the loss L_simple (L2 between x_gt and x_generated).

But when I looked at the code, it seems that this loss is multiplied by the "p2_loss_weight" term:

loss = loss * extract(self.p2_loss_weight, t, loss.shape)

which is defined here:
https://github.com/Stanford-TML/EDGE/blob/main/model/diffusion.py#L127-L131
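
For context, this weighting matches the p2 ("perception prioritized") scheme of Choi et al. (CVPR 2022), which lucidrains-style diffusion code (the style this repo appears to build on) derives from the noise schedule roughly as below; the schedule and the k, gamma values here are illustrative, not the repo's exact settings.

# p2 weighting sketch: down-weight the easy low-noise timesteps via the per-step SNR
import torch

betas = torch.linspace(1e-4, 0.02, 1000)        # illustrative linear beta schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
snr = alphas_cumprod / (1.0 - alphas_cumprod)   # signal-to-noise ratio per timestep
k, gamma = 1.0, 0.5                             # p2 hyperparameters
p2_loss_weight = (k + snr) ** -gamma            # high SNR (small t) -> small weight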

Can you explain why and how the loss is scaled depending on the time step? I did not find this information in the paper.

Thanks for your help!

--
Mathis

A problem where a function is not defined

I encountered a problem during the final generation of the FBX file: NameError: name 'FbxAnimCurve' is not defined. How can I resolve this issue? Thank you for your answer.

Disk quota exceeded

[screenshot of the error]

Hi! Thanks for sharing the impressive work.
Could I ask how to solve this? I am processing the dataset on a single A100 on a cluster, but it reports that the disk quota is exceeded, even though that is not true.

Thank you in advance.

Segmentation fault (core dumped)

Hello, I'm trying to visualize the .pkl output from running test.py with the --save_motions option, using the following command:
python SMPL-to-FBX/Convert.py --input_dir SMPL-to-FBX/smpl_samples/ --output_dir SMPL-to-FBX/fbx_out

But it fails with Segmentation fault (core dumped) and never works. I have already installed the FBX Python bindings.

Here is my directory hierarchy.

(edge) root@37faa67a6076:/app/SMPL-to-FBX# tree
.
|-- Convert.py
|-- FbxFormatConverter.exe
|-- FbxReadWriter.py
|-- SmplObject.py
|-- __pycache__
|   |-- FbxReadWriter.cpython-38.pyc
|   `-- SmplObject.cpython-38.pyc
|-- fbx_out
|-- smpl_samples
|   `-- test_gasoline_demo.pkl
`-- ybot.fbx

Thank you!
