jpthu17 / graphmotion

[NeurIPS 2023] Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs

License: Apache License 2.0

Python 99.75% Shell 0.25%
aigc diffusion-models graph-networks motion-generation neurips-2023

graphmotion's Introduction

【NeurIPS'2023 🔥】 Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs

Conference Paper

We propose hierarchical semantic graphs for fine-grained control over motion generation. Specifically, we disentangle motion descriptions into hierarchical semantic graphs with three levels: motions, actions, and specifics. This global-to-local structure facilitates a comprehensive understanding of motion descriptions and fine-grained control of motion generation. Correspondingly, to leverage the coarse-to-fine topology of hierarchical semantic graphs, we decompose the text-to-motion diffusion process into three semantic levels, which correspond to capturing the overall motion, local actions, and action specifics.
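For intuition, the sketch below shows how a caption (the same one used in the dataset example later in this README) might be factorized into the three levels. It is a hypothetical, simplified illustration in plain Python; the actual graph construction is implemented in prepare/role_graph.py and may differ.

# Hypothetical hierarchical semantic graph for the caption
# "a person slowly walked forward". The three node levels follow the paper:
# motion (global) -> actions (verbs) -> specifics (arguments and modifiers).
graph = {
    "motion": "a person slowly walked forward",        # level 1: overall motion
    "actions": [
        {
            "verb": "walked",                           # level 2: local action
            "specifics": [                              # level 3: action specifics
                {"role": "ARG0", "text": "a person"},
                {"role": "ARGM-MNR", "text": "slowly"},
                {"role": "ARGM-DIR", "text": "forward"},
            ],
        }
    ],
}

# Each level conditions one stage of the coarse-to-fine diffusion process:
# the motion node guides the overall motion, action nodes guide local actions,
# and specific nodes guide the fine-grained details.
for action in graph["actions"]:
    print(action["verb"], [s["text"] for s in action["specifics"]])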

📣 Updates

  • [2023/11/16]: We fixed a data-loading bug that caused performance degradation.
  • [2023/10/07]: We release the code. Note that this may not be the final version; we may still update it later.

📕 Architecture

We factorize motion descriptions into hierarchical semantic graphs with three levels: motions, actions, and specifics. Correspondingly, we decompose the text-to-motion diffusion process into three semantic levels, which correspond to capturing the overall motion, local actions, and action specifics.

😍 Visualization

Qualitative comparison

Video.mp4

Refining motion results

For more fine-grained control, our method can continuously refine the generated motion by modifying the edge weights and nodes of the hierarchical semantic graph.
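As a rough illustration, refinement amounts to editing the graph and sampling again. The snippet below is a hypothetical sketch reusing the toy graph from the earlier example; refine and generate are placeholder names, not the repository's API.

# Edge weights between the action node "walked" and its specific nodes.
edge_weights = {("walked", "slowly"): 1.0, ("walked", "forward"): 1.0}

def refine(weights, edge, new_weight):
    """Return a copy of the edge weights with one edge adjusted."""
    updated = dict(weights)
    updated[edge] = new_weight
    return updated

# Emphasize the manner modifier so the generated walk becomes noticeably slower,
# then regenerate the motion from the edited graph.
edge_weights = refine(edge_weights, ("walked", "slowly"), 1.5)
# motion = generate(graph, edge_weights)  # placeholder call, not the real API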

🚩 Results

Comparisons on the HumanML3D dataset

Comparisons on the KIT dataset

🚀 Quick Start

Datasets

Datasets     Google Cloud    Baidu Yun    Peking University Yun
HumanML3D    Download        TODO         Download
KIT          Download        TODO         Download

Model Zoo

Checkpoint   Google Cloud    Baidu Yun    Peking University Yun
HumanML3D    Download        TODO         TODO

1. Conda environment

conda create python=3.9 --name GraphMotion
conda activate GraphMotion

Install the packages in requirements.txt and install PyTorch 1.12.1

pip install -r requirements.txt

We test our code on Python 3.9.12 and PyTorch 1.12.1.
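As an optional sanity check, you can confirm that the interpreter and PyTorch versions match the tested setup:

import sys
import torch

# The code is tested with Python 3.9.12 and PyTorch 1.12.1.
print(sys.version.split()[0])      # expect 3.9.x
print(torch.__version__)           # expect 1.12.1
print(torch.cuda.is_available())   # True if a CUDA-capable GPU is visible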

2. Dependencies

Run the scripts to download the dependency materials:

bash prepare/download_smpl_model.sh
bash prepare/prepare_clip.sh

For Text to Motion Evaluation

bash prepare/download_t2m_evaluators.sh

3. Pre-trained model

Run the script to download the pre-trained model:

bash prepare/download_pretrained_models.sh

4. Evaluate the model

Please first set TEST.CHECKPOINT in configs/config_humanml3d.yaml to the path of the trained model checkpoint.

Then, run the following command:

python -m test --cfg configs/config_humanml3d.yaml --cfg_assets configs/assets.yaml

💻 Train your own models

1.1 Prepare the datasets

For convenience, you can directly download the datasets we have processed. For more details on text-to-motion dataset setup, please refer to HumanML3D.

Datasets     Google Cloud    Baidu Yun    Peking University Yun
HumanML3D    Download        TODO         Download
KIT          Download        TODO         Download

1.2 Prepare the Semantic Role Parsing (Optional)

Please refer to "prepare/role_graph.py".

We have provided semantic role-parsing results (see "datasets/humanml3d/new_test_data.json"); a short sketch of how these annotations can be read back into graph nodes and edges follows the example below.

Semantic Role Parsing Example
        {
            "caption": "a person slowly walked forward",
            "tokens": [
                "a/DET",
                "person/NOUN",
                "slowly/ADV",
                "walk/VERB",
                "forward/ADV"
            ],
            "V": {
                "0": {
                    "role": "V",
                    "spans": [
                        3
                    ],
                    "words": [
                        "walked"
                    ]
                }
            },
            "entities": {
                "0": {
                    "role": "ARG0",
                    "spans": [
                        0,
                        1
                    ],
                    "words": [
                        "a",
                        "person"
                    ]
                },
                "1": {
                    "role": "ARGM-MNR",
                    "spans": [
                        2
                    ],
                    "words": [
                        "slowly"
                    ]
                },
                "2": {
                    "role": "ARGM-DIR",
                    "spans": [
                        4
                    ],
                    "words": [
                        "forward"
                    ]
                }
            },
            "relations": [
                [
                    0,
                    0,
                    "ARG0"
                ],
                [
                    0,
                    1,
                    "ARGM-MNR"
                ],
                [
                    0,
                    2,
                    "ARGM-DIR"
                ]
            ]
        }
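For reference, here is a minimal sketch of how one of these entries could be read back into the three node levels plus edges. The per-entry fields follow the example above; the top-level layout of new_test_data.json and the helper name build_graph are assumptions, not the repository's loader.

import json

# Load the provided semantic role-parsing results.
with open("datasets/humanml3d/new_test_data.json") as f:
    data = json.load(f)
entries = list(data.values()) if isinstance(data, dict) else data

def build_graph(entry):
    """Turn one parsed caption into the three-level node sets plus edges."""
    motion = entry["caption"]                                                    # level 1
    actions = {k: " ".join(v["words"]) for k, v in entry["V"].items()}           # level 2
    specifics = {k: " ".join(v["words"]) for k, v in entry["entities"].items()}  # level 3
    # "relations" holds (verb_id, entity_id, role) triples linking levels 2 and 3.
    edges = [(actions[str(v)], specifics[str(e)], role)
             for v, e, role in entry["relations"]]
    return motion, actions, specifics, edges

motion, actions, specifics, edges = build_graph(entries[0])
print(motion)  # e.g. "a person slowly walked forward"
print(edges)   # e.g. [("walked", "a person", "ARG0"), ...]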

2.1. Ready to train VAE model

Please first check the parameters in configs/config_vae_humanml3d_motion.yaml (and the corresponding action and specific configs), e.g., NAME and DEBUG.

Then, run the following command:

python -m train --cfg configs/config_vae_humanml3d_motion.yaml --cfg_assets configs/assets.yaml --batch_size 64 --nodebug
python -m train --cfg configs/config_vae_humanml3d_action.yaml --cfg_assets configs/assets.yaml --batch_size 64 --nodebug
python -m train --cfg configs/config_vae_humanml3d_specific.yaml --cfg_assets configs/assets.yaml --batch_size 64 --nodebug

2.2. Ready to train GraphMotion model

Please update the parameters in configs/config_humanml3d.yaml, e.g., NAME, DEBUG, and PRETRAINED_VAE (set it to the latest VAE checkpoint path from the previous step).

Then, run the following command:

python -m train --cfg configs/config_humanml3d.yaml --cfg_assets configs/assets.yaml --batch_size 128 --nodebug

3. Evaluate the model

Please first set TEST.CHECKPOINT in configs/config_humanml3d.yaml to the path of the trained model checkpoint.

Then, run the following command:

python -m test --cfg configs/config_humanml3d.yaml --cfg_assets configs/assets.yaml

▶️ Demo

TODO

📌 Citation

If you find this paper useful, please consider starring 🌟 this repo and citing 📑 our paper:

@inproceedings{jin2023act,
  title={Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs},
  author={Peng Jin and Yang Wu and Yanbo Fan and Zhongqian Sun and Yang Wei and Li Yuan},
  booktitle={NeurIPS},
  year={2023}
}

🎗️ Acknowledgments

Our code is based on MLD, TEMOS, ACTOR, HumanML3D and joints2smpl. We sincerely appreciate their contributions.

graphmotion's People

Contributors

eltociear, jpthu17

graphmotion's Issues

git lfs installation

Hi.

While running the training code, I hit an error.
It was caused by the missing git-lfs package when downloading the CLIP repo.
The script runs git lfs install, but that does not work when Git LFS is not installed.
Instead, I used apt install git-lfs, following the official Git LFS instructions.

Evaluate the model

Thank you for sharing your work. When I run the command "python -m test --cfg configs/config_humanml3d.yaml --cfg_assets configs/assets.yaml", I get the following error:
Global seed set to 1234
Traceback (most recent call last):
File "/root/miniconda3/envs/GraphMotion/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/root/miniconda3/envs/GraphMotion/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/root/autodl-tmp/GraphMotion/test.py", line 149, in
main()
File "/root/autodl-tmp/GraphMotion/test.py", line 62, in main
datasets = get_datasets(cfg, logger=logger, phase="test")[0]
File "/root/autodl-tmp/GraphMotion/GraphMotion/data/get_data.py", line 114, in get_datasets
dataset = dataset_module_map[dataset_name.lower()](
File "/root/autodl-tmp/GraphMotion/GraphMotion/data/HumanML3D.py", line 36, in init
self._sample_set = self.get_sample_set(overrides=sample_overrides)
File "/root/autodl-tmp/GraphMotion/GraphMotion/data/base.py", line 38, in get_sample_set
return self.Dataset(split_file=split_file, **sample_params)
File "/root/autodl-tmp/GraphMotion/GraphMotion/data/humanml/data/dataset.py", line 376, in init
name_list, length_list = zip(*sorted(zip(new_name_list, length_list), key=lambda x: x[1]))
ValueError: not enough values to unpack (expected 2, got 0)
How should I solve it?

Question about demo

Hello, thank you for your wonderful work and selfless sharing.
I have a small question. When I tried to run the current demo file, I found that GraphMotion's forward function seems incomplete. The denoiser does not seem to work very well without the [hidden_states] argument being passed to the _diffusion_reverse() function, and I cannot find a program that automatically converts the text into a parsed dictionary (like the one in the dataset containing the word-relationship dictionary). Can you help me?

Looking forward to a complete demo.

Hello, I am very interested in your work! I noticed that the demo in the code is incomplete and cannot be run directly, so I would really appreciate it if you could provide a complete demo soon.

KIT Dataset Training Setting

Hello,

I have a question about the training settings for the KIT dataset.
In the configs folder, there is only a yaml file for humanml3d and none for KIT.
Therefore, I copied the humanml3d file to create a KIT config file and trained the VAE.
I changed the dataset name in the config file from humanml3d to KIT and trained for 30,000 epochs as mentioned in the supplementary material.
However, the training results show lower performance than Table E in the supplementary material for the motion, action, and specific VAEs. For example, the results for the motion-level VAE are as follows:

R_TOP_1 3.410e-01 R_TOP_2 5.625e-01 R_TOP_3 6.970e-01
gt_R_TOP_1 4.307e-01 gt_R_TOP_2 6.590e-01 gt_R_TOP_3 7.840e-01
FID 3.099e+00

Is there any different setting required when training on the KIT dataset compared to HumanML3D? Or are there any other points to be aware of?

I look forward to your response.
Thank you.
