tr3e / intergen

[IJCV 2024] InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions

Home Page: https://tr3e.github.io/intergen-page/

Python 99.73% Shell 0.27%

motion-generation text-to-3d interaction-modeling

intergen's Issues

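Numbering of daily motions

Is the daily-motion category reflected in the dataset's numbering, or are daily motions and professional motions completely shuffled in the numbering? Thanks for your reply!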

Question About the shape of npy file in motion_processed?

After loading one npy file from motions_processed, I noticed that its shape is T*492.
I am curious about the concrete meaning of 492: what are its components? From the code, I understand it contains global positions (22x3) and 6D rotation representations (21x6). What about the rest?

About converting "motions" to "motions_processed".

Hey,

Great job on this! I noticed that the time steps in "1.pkl" in the "motions" folder don't quite match up with "1.npy" in "motions_processed". Any idea what's causing the difference?

Mind sharing the script you used for the conversion from "motions" to "motions_processed"? It'd be super helpful for getting a handle on the "motions_processed" dataset format.

Thanks a bunch!

Global_mean.npy and global_std.npy calculations

Hello,

I'm attempting to calculate the global_mean and global_std for the whole dataset myself so I can then find the values for a subset of the motion.

I accumulated the 262 processed features across all frames for motion1 and motion1_swap, then for motion2 and motion2_swap relative to motion1. When I compute the mean and standard deviation, I get different results from those supplied in the data directory.

Can I inquire as to how exactly you are computing the average and std for the dataset? Thank you for your work and assistance thus far!
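
For comparison, here is a minimal sketch of one common way to compute such statistics: pool every frame of every clip and take the per-feature mean and population standard deviation. This is an assumption about the procedure, not the authors' confirmed method; the function name `feature_mean_std` is hypothetical. Differences from the supplied files could come from choices such as per-clip versus pooled averaging, the degrees-of-freedom setting, or which clips are included.

```python
import numpy as np


def feature_mean_std(clips):
    """Per-feature mean and std over every frame of every clip.

    `clips` is an iterable of (T_i, D) arrays. Frames are pooled across
    clips, so longer clips contribute more than shorter ones.
    """
    count = 0
    total = None
    total_sq = None
    for clip in clips:
        clip = np.asarray(clip, dtype=np.float64)
        if total is None:
            total = np.zeros(clip.shape[1])
            total_sq = np.zeros(clip.shape[1])
        count += clip.shape[0]
        total += clip.sum(axis=0)
        total_sq += (clip ** 2).sum(axis=0)
    mean = total / count
    # Population variance (ddof=0), matching np.std's default.
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    return mean, np.sqrt(var)


# Synthetic stand-in for the 262-dim processed features.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(t, 262)) for t in (30, 45, 60)]
mean, std = feature_mean_std(clips)
```

The streaming form avoids holding the whole dataset in memory; running it on a subset only requires passing a shorter list of clips.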

Discussion

Request for InterGen Experiment Source Code

Hi,

I hope this finds you well. I've been studying your InterGen paper and was impressed by the use of the T2M framework for evaluating the InterHuman dataset. I'm working on a related project and am very interested in replicating your experiment.

Would it be possible for you to share the source code used for these experiments? Access to the code would be incredibly helpful for my work.

Thank you for considering my request.

Best

[train.py] Missing Required Data Files: 'ignore_list.txt' and 'train.txt'

Description

Hello,

I am trying to run the train.py script provided in your repository. However, I encountered errors related to missing files in the ./data/interhuman_processed/ directory. Specifically, the following files are missing:

ignore_list.txt
train.txt

The absence of these files causes the script to fail with the following errors:

[Errno 2] No such file or directory: './data/interhuman_processed/ignore_list.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/train.txt'

Additionally, I observed the following warning related to the checkpoint directory:

/home/blinkdrive/miniconda3/envs/intergen/lib/python3.8/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:613: UserWarning: Checkpoint directory ./checkpoints/IG-S-8/model exists and is not empty.
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")

Steps to Reproduce

Clone the repository.
Ensure all dependencies are installed.
Run the train.py script.

Expected Behavior

The script should run without any missing file errors.

Actual Behavior

The script fails due to missing ignore_list.txt and train.txt files.

Additional Information

To resolve this issue, could you please provide the missing ignore_list.txt and train.txt files? Alternatively, if these files need to be generated, could you provide the necessary instructions or scripts to create them?

Temporary Workaround

As a temporary workaround, I created empty placeholder files for ignore_list.txt and train.txt. However, I am not sure what data these files should contain, which might affect the training process.
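
A small pre-flight check can at least make the failure explicit before training starts. The sketch below only reports whether each file is present and non-empty; it does not know what the files should contain. The helper name `check_data_files` is hypothetical, and flagging empty files is an assumption motivated by the placeholder workaround above, since an empty `train.txt` would presumably give the loader nothing to read.

```python
from pathlib import Path


def check_data_files(data_root, required=("ignore_list.txt", "train.txt")):
    """Map each required file name to "ok", "empty", or "missing"."""
    status = {}
    for name in required:
        path = Path(data_root) / name
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status


# Directory taken from the error messages above.
status = check_data_files("./data/interhuman_processed")
for name, state in status.items():
    print(f"{name}: {state}")
```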

Environment Details

OS: Ubuntu 24.04 LTS
Python version: Python 3.8.19
PyTorch version: 1.13.0+cu117
Other installed packages (pip freeze):
absl-py==2.1.0
aiohttp==3.9.5
aiosignal==1.3.1
anyio==4.4.0
arrow==1.3.0
async-timeout==4.0.3
attrs==23.2.0
beautifulsoup4==4.12.3
blessed==1.20.0
boto3==1.34.136
botocore==1.34.136
cachetools==5.3.3
certifi==2024.6.2
charset-normalizer==3.3.2
click==8.1.7
clip @ git+https://github.com/openai/CLIP.git@dcba3cb2e2827b402d2701e7e1c7d9fed8a20ef1
croniter==1.3.15
cycler==0.12.1
dateutils==0.6.12
deepdiff==7.0.1
editor==1.6.6
exceptiongroup==1.2.1
fastapi==0.88.0
filelock==3.15.4
frozenlist==1.4.1
fsspec==2023.12.2
ftfy==6.2.0
gdown==5.2.0
google-api-core==2.19.1
google-api-python-client==2.135.0
google-auth==2.30.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.0.0
googleapis-common-protos==1.63.2
grpcio==1.64.1
h11==0.14.0
httplib2==0.22.0
idna==3.7
importlib_metadata==8.0.0
inquirer==3.3.0
itsdangerous==2.2.0
Jinja2==3.1.4
jmespath==1.0.1
kiwisolver==1.4.5
lightning==1.9.1
lightning-cloud==0.5.70
lightning-utilities==0.11.3.post0
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.2.0
mdurl==0.1.2
multidict==6.0.5
numpy==1.24.4
oauthlib==3.2.2
ordered-set==4.1.0
packaging==24.1
pillow==10.3.0
proto-plus==1.24.0
protobuf==5.27.2
psutil==6.0.0
pyasn1==0.6.0
pyasn1_modules==0.4.0
pydantic==1.10.17
Pygments==2.18.0
PyJWT==2.8.0
pyparsing==3.1.2
PySocks==1.7.1
python-dateutil==2.9.0.post0
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
readchar==4.1.0
regex==2024.5.15
requests==2.32.3
requests-oauthlib==2.0.0
rich==13.7.1
rsa==4.9
runs==1.2.2
s3transfer==0.10.2
scipy==1.10.1
six==1.16.0
sniffio==1.3.1
soupsieve==2.5
starlette==0.22.0
starsessions==1.3.0
tensorboard==2.14.0
tensorboard-data-server==0.7.2
torch==1.13.0+cu117
torchaudio==0.13.0+cu117
torchmetrics==1.4.0.post0
torchvision==0.14.0+cu117
tqdm==4.66.4
traitlets==5.14.3
types-python-dateutil==2.9.0.20240316
typing_extensions==4.12.2
uritemplate==4.1.1
urllib3==1.26.19
uvicorn==0.30.1
wcwidth==0.2.13
websocket-client==1.8.0
websockets==11.0.3
Werkzeug==3.0.3
xmod==1.8.1
yacs==0.1.8
yarl==1.9.4
zipp==3.19.2

pip install -r requirements.txt failed

Collecting scipy~=1.4.1 (from -r requirements.txt (line 5))
Using cached scipy-1.4.1.tar.gz (24.6 MB)
Installing build dependencies ... error
error: subprocess-exited-with-error
.......
File "/private/var/folders/pc/sbd6mtzs5g57c9z_vqdnj2z40000gn/T/pip-install-1gioli0t/numpy_e9ee180d7c8849eda824a6ce3f216369/numpy/distutils/command/build_src.py", line 142, in run
self.build_sources()
File "/private/var/folders/pc/sbd6mtzs5g57c9z_vqdnj2z40000gn/T/pip-install-1gioli0t/numpy_e9ee180d7c8849eda824a6ce3f216369/numpy/distutils/command/build_src.py", line 153, in build_sources
self.build_library_sources(*libname_info)
File "/private/var/folders/pc/sbd6mtzs5g57c9z_vqdnj2z40000gn/T/pip-install-1gioli0t/numpy_e9ee180d7c8849eda824a6ce3f216369/numpy/distutils/command/build_src.py", line 286, in build_library_sources
sources = self.generate_sources(sources, (lib_name, build_info))
File "/private/var/folders/pc/sbd6mtzs5g57c9z_vqdnj2z40000gn/T/pip-install-1gioli0t/numpy_e9ee180d7c8849eda824a6ce3f216369/numpy/distutils/command/build_src.py", line 369, in generate_sources
source = func(extension, build_dir)
File "numpy/core/setup.py", line 669, in get_mathlib_info
raise RuntimeError("Broken toolchain: cannot link a simple C program")
RuntimeError: Broken toolchain: cannot link a simple C program
[end of output]

    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for numpy

How can I fix this? Please help.

Weird visualization from 6D rotation representation.

I'm trying to visualize the human motions with the given 6D rotation representations in "motions_processed" folder. However, with the following code, the visualization result is quite weird.

Could you provide some guidance?

    import matplotlib.pyplot as plt
    import numpy as np
    import torch
    import tqdm
    import trimesh
    from human_body_prior.body_model.body_model import BodyModel
    from pytorch3d import transforms
    from common.quaternion import *

    imw, imh = 800, 800
    mv = MeshViewer(width=imw, height=imh, use_offscreen=False)

    bm_fname = '/smplh/neutral/model.npz'
    bm = BodyModel(bm_fname=bm_fname, num_betas=10)

    data1 = np.load("./data/motions_processed/person1/6035.npy")

    rot_6d = data1[..., 62 * 3:62 * 3 + 21 * 6]

    rot_matrix = cont6d_to_matrix(torch.from_numpy(rot_6d.reshape(-1, 6)))
    rot_axis_angle = transforms.matrix_to_axis_angle(rot_matrix).view(-1, 63)

    body_pose_beta = bm(pose_body=rot_axis_angle)
    for fId in tqdm.tqdm(range(body_pose_beta.v.shape[0])):
        body_mesh = trimesh.Trimesh(vertices=convert(body_pose_beta.v[fId]), faces=bm.f, vertex_colors=np.tile(colors['grey'], (6890, 1)))
        mv.viewer.render_lock.acquire()
        mv.set_static_meshes([body_mesh])
        mv.viewer.render_lock.release()
        plt.pause(1)
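
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis, is the 6D convention. Some converters treat the six numbers as the first two columns of the rotation matrix and others as the first two rows; decoding with the wrong one yields the transpose (the inverse rotation). The NumPy sketch below implements both layouts with Gram-Schmidt so the two results can be compared; `rot6d_to_matrix` and its `layout` argument are hypothetical names.

```python
import numpy as np


def rot6d_to_matrix(rot6d, layout="columns"):
    """Convert (..., 6) continuous 6D rotations to (..., 3, 3) matrices.

    layout="columns": the six numbers are the first two columns of R.
    layout="rows":    the six numbers are the first two rows of R.
    """
    rot6d = np.asarray(rot6d, dtype=np.float64)
    a1, a2 = rot6d[..., :3], rot6d[..., 3:]
    # Gram-Schmidt: normalise a1, remove its component from a2, normalise.
    b1 = a1 / np.linalg.norm(a1, axis=-1, keepdims=True)
    a2 = a2 - np.sum(b1 * a2, axis=-1, keepdims=True) * b1
    b2 = a2 / np.linalg.norm(a2, axis=-1, keepdims=True)
    # The third basis vector completes a right-handed frame.
    b3 = np.cross(b1, b2)
    axis = -1 if layout == "columns" else -2
    return np.stack([b1, b2, b3], axis=axis)


# A 90-degree rotation about z, encoded by its first two columns.
six = np.array([0.0, 1.0, 0.0, -1.0, 0.0, 0.0])
as_columns = rot6d_to_matrix(six, layout="columns")
as_rows = rot6d_to_matrix(six, layout="rows")
```

If the mesh looks plausible with one layout and distorted with the other, the convention is the likely culprit; if both look wrong, the cause is probably elsewhere (for example a missing root orientation).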

Missing 3 joints from SMPL

Hi,

Thanks for your work. Can I ask why the dimension of "pose_body" in the .pkl file is 63, which means 21 joints, rather than 24 joints in the SMPL model? I couldn't find the missing 3 joints or any skeleton hierarchy you used from the paper and the released codes. Thanks!
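
A possible reading, stated here as an assumption about the .pkl layout rather than a confirmed answer: in SMPL-H style parameters the root rotation is stored separately (as `root_orient`), `pose_body` covers the 21 body joints, and the two hand joints at the end of the 24-joint SMPL chain are not part of the body pose. Under that assumption, a 72-dim SMPL pose can be assembled as sketched below; `to_smpl_pose` is a hypothetical helper.

```python
import numpy as np


def to_smpl_pose(root_orient, pose_body):
    """Build (T, 72) SMPL axis-angle poses from SMPL-H style body parameters.

    root_orient: (T, 3)  global root rotation (joint 0).
    pose_body:   (T, 63) rotations of the 21 body joints (joints 1..21).
    The two remaining SMPL joints (22 and 23) get zero rotation.
    """
    root_orient = np.asarray(root_orient, dtype=np.float32)
    pose_body = np.asarray(pose_body, dtype=np.float32)
    if root_orient.shape[1] != 3 or pose_body.shape[1] != 63:
        raise ValueError("expected shapes (T, 3) and (T, 63)")
    hands = np.zeros((pose_body.shape[0], 6), dtype=np.float32)
    return np.concatenate([root_orient, pose_body, hands], axis=1)


# Synthetic example: 5 frames of zero pose.
pose = to_smpl_pose(np.zeros((5, 3)), np.zeros((5, 63)))
```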

Code for training the evaluation model

Thanks for your great work!
Would you mind sharing the code to train the evaluation model InterCLIP? I'm investigating its performance and your help would be appreciated.

Question about the SMPL-H skeleton name of npy file in motion_processed?

After loading one npy file from motions_processed, I noticed that its shape is num_frame*492.
I am curious about the concrete meaning of 492: what are its components?
I see the author explains in other issues that it contains SMPL-H joint positions and rotations (6D representation, no root and no fingertips), totaling 62x3+51x6. Does that mean the positions include fingertip joints (22+20+20=62)? If so, what is the order of the joints?
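
Taking the 62x3+51x6 description at face value, a frame can be split as in the sketch below. The assumptions are that positions come first, that rotations follow, and that body joints precede hand joints within each group (the slicing `62 * 3 : 62 * 3 + 21 * 6` used in other issues on this page suggests the same). The joint order inside the hand blocks is not stated anywhere here, so it is left untouched. `split_motion` is a hypothetical helper name.

```python
import numpy as np

N_POS_JOINTS = 62  # joints with a 3D position
N_ROT_JOINTS = 51  # joints with a 6D rotation
N_BODY_POS = 22    # body joints assumed to come first among positions
N_BODY_ROT = 21    # body joints assumed to come first among rotations


def split_motion(frames):
    """Split a (T, 492) array into position and 6D-rotation blocks."""
    frames = np.asarray(frames)
    if frames.ndim != 2 or frames.shape[1] != N_POS_JOINTS * 3 + N_ROT_JOINTS * 6:
        raise ValueError("expected an array of shape (T, 492)")
    t = frames.shape[0]
    pos_end = N_POS_JOINTS * 3
    positions = frames[:, :pos_end].reshape(t, N_POS_JOINTS, 3)
    rotations = frames[:, pos_end:].reshape(t, N_ROT_JOINTS, 6)
    return {
        "positions": positions,
        "rotations_6d": rotations,
        "body_positions": positions[:, :N_BODY_POS],
        "body_rotations_6d": rotations[:, :N_BODY_ROT],
    }


# Synthetic stand-in for np.load(".../person1/1.npy").
parts = split_motion(np.zeros((10, 492), dtype=np.float32))
```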

Resulting R_precision accuracy is always higher than the ground truth

Hello, I've tried many methods, but the resulting R_precision accuracy is always higher than the ground truth (GT). Is this situation normal, and is there a solution? Thank you very much.
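
For reference, R-precision only measures how often a text retrieves its own paired motion under the evaluator's embedding, so nothing in the definition bounds a generated set by the ground-truth score. The sketch below is a generic NumPy version of the metric, assuming Euclidean distance and that row i of each array forms a matched pair; it is not the repository's evaluator, and `r_precision` is a hypothetical name.

```python
import numpy as np


def r_precision(text_emb, motion_emb, top_k=3):
    """Top-1..top-k retrieval accuracy for row-aligned (N, D) embeddings.

    Entry k-1 of the result is the fraction of texts whose paired motion
    is among the k nearest motions.
    """
    text_emb = np.asarray(text_emb, dtype=np.float64)
    motion_emb = np.asarray(motion_emb, dtype=np.float64)
    # dist[i, j] = ||text_i - motion_j||
    diff = text_emb[:, None, :] - motion_emb[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Rank motions for each text from nearest to farthest.
    order = np.argsort(dist, axis=1)
    n = dist.shape[0]
    hits = order[:, :top_k] == np.arange(n)[:, None]
    # A hit at rank r counts for every k >= r.
    return np.cumsum(hits, axis=1).astype(bool).mean(axis=0)


# Perfectly aligned embeddings retrieve every pair at rank 1.
scores = r_precision(np.eye(4), np.eye(4), top_k=3)
```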

Pre-processing script and train-val-test for fair comparison

Hi! Thanks a lot for the nice work as the dataset looks great.
In the paper, it is mentioned that some preprocessing is done (mirroring all motions as well as augmenting the descriptions). Would it be possible to share this script so everyone can work with similar datasets? Same for the train-val-test split.

Error occurs when running infer.py

Thanks a lot for your great work. I'd like to know why I encountered an error when I try to run infer.py. Here is the error info I encountered:

Traceback (most recent call last):
File "tools/infer.py", line 126, in <module>
litmodel.generate_one_sample(text, name+"_"+str(i))
File "tools/infer.py", line 65, in generate_one_sample
self.plot_t2m([motion_output[0], motion_output[1]],
File "tools/infer.py", line 49, in plot_t2m
plot_3d_motion(result_path, paramUtil.t2m_kinematic_chain, mp_joint, title=caption, fps=30)
File "/home/ubuntu/xiyan/InterGen/tools/../utils/plot_script.py", line 129, in plot_3d_motion
ani.save(save_path, fps=fps)
File "/home/ubuntu/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 1090, in save
anim._init_draw() # Clear the initial frame
File "/home/ubuntu/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 1748, in _init_draw
self._draw_frame(frame_data)
File "/home/ubuntu/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 1767, in _draw_frame
self._drawn_artists = self._func(framedata, *self._args)
File "/home/ubuntu/xiyan/InterGen/tools/../utils/plot_script.py", line 104, in update
ax.lines = []
AttributeError: can't set attribute
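
The AttributeError suggests that the installed Matplotlib exposes `ax.lines` as a read-only property, which newer releases do. A minimal sketch of a replacement for the `ax.lines = []` assignment is to remove each artist individually; the helper name `clear_artists` is hypothetical, and clearing `ax.collections` as well is an assumption about what the plotting code needs.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def clear_artists(ax):
    """Remove all lines and collections without assigning to ax.lines."""
    # Copy to a list first: removing an artist mutates the container.
    for artist in list(ax.lines) + list(ax.collections):
        artist.remove()


fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
ax.plot([0, 1], [1, 0])
clear_artists(ax)
```

Calling `clear_artists(ax)` inside the animation's update function keeps the behaviour of resetting the frame while working on both older and newer Matplotlib versions.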

Error occurs when running train.py

When I run train.py, the following error occurs:
UserWarning: DataLoader returned 0 length. Please make sure this was your intention.
rank_zero_warn(
/data1/fyy/anaconda3/envs/intergen/lib/python3.8/site-packages/lightning/pytorch/utilities/data.py:110: UserWarning: Total length of CombinedLoader across ranks is zero. Please make sure this was your intention.
rank_zero_warn( rank_zero_warn(
Training: 0it [00:00, ?it/s]Trainer.fit stopped: No training batches.
Training: 0it [00:00, ?it/s]

Because:
[Errno 2] No such file or directory: './data/interhuman_processed/ignore_list.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/ignore_list.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/ignore_list.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/train.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/train.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/train.txt'

How can I get the interhuman_processed?

The specific composition of 492 columns in each npy file

Hi,
Thanks for your excellent work! Could you explain how the 492 columns in each npy file under the "motions_processed" folder are composed? What exactly do these 492 values represent, and in what order? Thank you very much!

Issue with 3D Skeleton Visualization: Unnatural Representation

Hello, thanks for sharing your excellent work and dataset.

While visualizing 3D skeletons from the provided dataset, I've encountered some issues. Specifically, the plotted skeletons look unnatural and do not seem to stand on the same plane. The root orientation also changes quite abruptly, and the camera of the scene appears not to be fixed. Here are the plotted picture and some of my code; I use the SMPL model to visualize the skeleton:

Pictures:

Codes:

for person_name in ['person1', 'person2']:
    global_orient = motions_data[person_name]['root_orient']
    body_pose = motions_data[person_name]['pose_body']
    betas = motions_data[person_name]['betas']
    transl = motions_data[person_name]['trans']

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    global_orient_tensor = torch.from_numpy(global_orient).float().to(device)
    body_pose_tensor = torch.from_numpy(body_pose).float().to(device)
    betas_tensor = torch.from_numpy(betas).float().to(device)
    transl_tensor = torch.from_numpy(transl).float().to(device)

    rotation_matrix = euler_angles_to_matrix(
        body_pose_tensor.view(body_pose_tensor.shape[0], -1, 3), 'XYZ')
    global_orient_mat = euler_angles_to_matrix(global_orient_tensor, 'XYZ').view(global_orient_tensor.shape[0], -1, 3, 3)
    dtype = body_pose_tensor.dtype

    add_rotation_matrix = torch.eye(3, device=device, dtype=dtype).view(
                    1, 1, 3, 3).expand(rotation_matrix.shape[0], 2, -1, -1)
    processd_rot_mat = torch.cat([rotation_matrix, add_rotation_matrix], dim=1)
    betas_tensor_input = betas_tensor.expand(global_orient_mat.shape[0], -1)

    smpl_model = SMPL().eval().to(device)
    out = smpl_model(
            global_orient=global_orient_mat,
            body_pose=processd_rot_mat,
            betas=betas_tensor_input,
            transl=transl_tensor)
    show_idx = 0
    
    if person_name == 'person1':
        plot_skeleton_onePic(
            out['smpl'][show_idx].cpu().detach().numpy(),
            ax,
            'r',
            'Skeleton 1',
            humanact12_kinematic_chain
                            )
    else:
        plot_skeleton_onePic(
            out['smpl'][show_idx].cpu().detach().numpy(),
            ax,
            'b',
            'Skeleton 2',
            humanact12_kinematic_chain
                            )
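
One possible cause, stated as an assumption about the .pkl contents: SMPL pose parameters are conventionally axis-angle vectors, and if `root_orient` and `pose_body` follow that convention, decoding them with `euler_angles_to_matrix` would produce distorted poses. The NumPy sketch below shows the axis-angle conversion (Rodrigues' formula) for comparison; PyTorch3D's `axis_angle_to_matrix` performs the same conversion on tensors.

```python
import numpy as np


def axis_angle_to_matrix(rotvec):
    """Rodrigues' formula: (..., 3) axis-angle vectors -> (..., 3, 3) matrices."""
    rotvec = np.asarray(rotvec, dtype=np.float64)
    angle = np.linalg.norm(rotvec, axis=-1, keepdims=True)
    # Avoid division by zero; a zero vector maps to the identity.
    axis = rotvec / np.where(angle > 1e-12, angle, 1.0)
    x, y, z = axis[..., 0], axis[..., 1], axis[..., 2]
    zero = np.zeros_like(x)
    # Skew-symmetric cross-product matrix K of the unit axis.
    K = np.stack([
        np.stack([zero, -z, y], axis=-1),
        np.stack([z, zero, -x], axis=-1),
        np.stack([-y, x, zero], axis=-1),
    ], axis=-2)
    s = np.sin(angle)[..., None]
    c = np.cos(angle)[..., None]
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)


# A quarter turn about the z axis.
R = axis_angle_to_matrix(np.array([0.0, 0.0, np.pi / 2]))
```

For small single-axis rotations the Euler and axis-angle results coincide, which is why a mix-up can look almost right on some frames and badly wrong on others.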

Normalising the data.

Hello! thank you for your excellent work!
Can you share the normalization function you used to calculate your mean and std matrices?

Thank you!

NPY Processed Motion Representation

Hello! First, let me thank you for your paper and released code/dataset!

I was wondering if you could share the precise format of the '.npy' files in motions_processed.

I understand from preprocess.py you append the joint positions and rotations.

I'm curious where joint velocities and foot-ground features are located within the file format, as well as other pertinent info.

Thanks!

Recovered Mesh from Rotation in processed_motion

Hi, I tried to visualize the human mesh from the rotations in an npy file from motions_processed.
This is what I got:

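    import joblib
    import numpy as np
    import torch
    # Assumes the PyTorch3D converters, whose names match the calls below.
    from pytorch3d.transforms import matrix_to_axis_angle, rotation_6d_to_matrix

    motions = np.load("InterHuman/motions_processed/person1/1.npy")
    motions = torch.from_numpy(motions)
    # Body-joint 6D rotations: 21 joints x 6 values after the 62 x 3 positions.
    motions = motions[:, 62 * 3:62 * 3 + 21 * 6].reshape(-1, 21, 6)
    motions = rotation_6d_to_matrix(motions)
    motions = matrix_to_axis_angle(motions)
    # Pad to 24 joints: zero root rotation first, two zero joints last.
    zero = torch.zeros(motions.shape[0], 1, 3)
    motions = torch.cat([zero, motions, zero, zero], dim=1)
    # Keep only the first frame as a flat 72-dim pose vector.
    motions = np.array(motions)[0].reshape(-1)
    joblib.dump(motions, 'scratch/temp/intergen_obj/pro_test.pt')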

I get rotation like this, and input the 72-dim vector as pose to the SMPL model.
Do you have any advice to fix it?

Missing data in motion_processed

Hi, thanks for sharing your great work and your code!

When I run the training scripts, interhuman.py reports that many '.npy' files are missing from the person2 folder. I also find many files named xxxx(1).npy in both the person1 and person2 folders, where xxxx is a number. These look like errors from the pre-processing script that converts the original '.pkl' files to '.npy' files. Could you fix this issue, or provide the script for converting the data in "motions" to the format in "motions_processed"?

Looking for your reply! Thanks.

Cannot reproduce evaluation results correctly.

Thank you very much for your work and the new dataset!
I am trying to reproduce your results with your given model checkpoints/intergen.ckpt and for some reason don't get the same results.

========== MM Distance Summary ==========
---> [ground truth] Mean: 3.7849 CInterval: 0.0012
---> [InterGen] Mean: 3.7978 CInterval: 0.0011
========== R_precision Summary ==========
---> [ground truth](top 1) Mean: 0.2491 CInt: 0.0048;(top 2) Mean: 0.3828 CInt: 0.0064;(top 3) Mean: 0.4780 CInt: 0.0076;
---> [InterGen](top 1) Mean: 0.2748 CInt: 0.0042;(top 2) Mean: 0.4065 CInt: 0.0052;(top 3) Mean: 0.4881 CInt: 0.0050;
========== FID Summary ==========
---> [ground truth] Mean: 0.9993 CInterval: 0.0211
---> [InterGen] Mean: 7.1862 CInterval: 0.1244

These are the results I get. In the FID summary, the ground truth's mean is 0.9993 and InterGen's mean is 7.1862, while the paper reports 0.273 for the ground truth and 5.918 for InterGen.

Thank you!

intergen dataset

Thanks for sharing your awesome work.
I was wondering how I can get access to the videos of the boxing sequences from just 4 different viewpoints. I want to create the multi-person animation of the boxing portion of your dataset with my own pipeline, which works with 3 or 4 views, and compare it with your GT results.

Person-to-person generation.

Thank you for sharing your excellent code and data!
I am interested in reproducing the person-to-person generation part mentioned in your paper. It was mentioned that fine-tuning was used for this.
Could you share more detailed fine-tuning settings? (how much longer it was trained, how the existing text prompt was handled, etc.)

Error occurs when running infer.py （2）

When I run infer.py, the following error occurs:
Traceback (most recent call last):
File "tools/infer.py", line 128, in <module>
litmodel.generate_one_sample(text, name+"_"+str(i))
File "tools/infer.py", line 66, in generate_one_sample
self.plot_t2m([motion_output[0], motion_output[1]],
File "tools/infer.py", line 50, in plot_t2m
plot_3d_motion(result_path, paramUtil.t2m_kinematic_chain, mp_joint, title=caption, fps=30)
File "/data1/fyy/InterGen/tools/../utils/plot_script.py", line 130, in plot_3d_motion
ani.save(save_path, fps=fps)
File "/data1/fyy/anaconda3/envs/intergen/lib/python3.8/site-packages/matplotlib/animation.py", line 1102, in save
alt_writer = next(writers, None)
TypeError: 'MovieWriterRegistry' object is not an iterator
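
In the Matplotlib version shown in the traceback, the failing line appears to be reached only when the requested movie writer is not available, which usually means the ffmpeg executable is not installed or not on PATH. Besides installing ffmpeg, one workaround is to pick a writer explicitly. The sketch below is a hypothetical helper (`pick_animation_writer` is not part of the repository) that falls back to Matplotlib's Pillow-based GIF writer.

```python
import shutil


def pick_animation_writer(preferred="ffmpeg", fallback="pillow"):
    """Return the name of a Matplotlib animation writer that should be usable.

    "ffmpeg" needs the ffmpeg executable on PATH; "pillow" only needs the
    Pillow package and writes GIF files.
    """
    if shutil.which(preferred) is not None:
        return preferred
    return fallback


writer = pick_animation_writer()
# Hypothetical use at the call site shown in the traceback:
# ani.save(save_path, fps=fps, writer=writer)
```

When the fallback is chosen, the output path should end in `.gif`, since the Pillow writer does not produce MP4 files.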

Metric emb scale

InterGen/utils/metrics.py

Line 68 in 8a90d5d

activations = activations * emb_scale

Hello, may I ask why the activations are multiplied by an emb_scale of 6 when calculating the FID and diversity metrics?
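
Whatever the motivation, the effect on FID is easy to quantify: multiplying all activations by a constant s multiplies the Fréchet distance by s squared, so a scale of 6 inflates FID by a factor of 36 without changing any ranking between methods. The sketch below checks this numerically on synthetic data; `frechet_distance` is a generic NumPy implementation written for this illustration, not the code in `utils/metrics.py`.

```python
import numpy as np


def frechet_distance(a, b):
    """Fréchet distance between Gaussians fitted to two (N, D) activation sets."""
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    cov_a = np.cov(a, rowvar=False)
    cov_b = np.cov(b, rowvar=False)
    # Tr(sqrt(cov_a @ cov_b)) equals the sum of square roots of the product's
    # eigenvalues, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
fake = rng.normal(loc=0.3, size=(500, 8))
emb_scale = 6.0
base = frechet_distance(real, fake)
scaled = frechet_distance(real * emb_scale, fake * emb_scale)
ratio = scaled / base  # close to emb_scale ** 2
```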

Large differences in experimental results when BATCH_SIZE = 16 and EPOCH=500

Thanks for sharing your great work!
I trained the model myself following your README, but set BATCH_SIZE = 16 and EPOCH = 500 due to limited computing resources. With this setting, my trained model performs much worse than the evaluation results presented in the paper. Is it essential to use exactly the same training settings to reach performance similar to the paper's model? Also, could you kindly release a checkpoint trained exclusively on the training set? That would be really helpful for me!
Thanks for your time and patience!

Dataset Download Link Fails!!!

Hi, thank you for this wonderful work! However, when I try to get the dataset through the link provided in README.md, the page does not open correctly. Below is the screenshot:

Hope to get the valuable data asap to contribute to this interesting field!!!

Reproducing MDM results

Hi!
I am trying to reproduce the MDM results on InterHuman that you report in your work, for example an FID of 9.167 using MDM.
Can you share that code as well? I cannot reach the same evaluation metrics when training MDM on InterHuman.

Thank you!

WRONG 6d rotation params of motion_processed about joint 20(left_hand) and 21(right_hand)

Hello,

I'm attempting to perform FK from the processed motion data, which contains pos (22*3), vel (22*3), and rot (21*6). I extracted the rotation data and found that the rotations of the last 2 joints (which refer to the hands) are something like [1, 0, 0, 1, 0, 0], which is not a valid 6D representation.
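
A 6D rotation is only decodable when its two 3-vectors span a plane, and the identity rotation is [1, 0, 0, 0, 1, 0]. The sketch below flags entries whose two halves are parallel or zero, as in [1, 0, 0, 1, 0, 0]. Replacing such entries with the identity is a workaround assumed here for illustration, not something the dataset authors have confirmed; both function names are hypothetical.

```python
import numpy as np

IDENTITY_6D = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])


def degenerate_6d_mask(rot6d, tol=1e-6):
    """True where the two 3-vectors of a 6D rotation are (nearly) parallel or zero.

    Such entries span no plane, so Gram-Schmidt cannot recover a rotation.
    """
    rot6d = np.asarray(rot6d, dtype=np.float64)
    a1, a2 = rot6d[..., :3], rot6d[..., 3:]
    return np.linalg.norm(np.cross(a1, a2), axis=-1) < tol


def replace_degenerate(rot6d):
    """Return a copy with degenerate entries replaced by the identity rotation."""
    out = np.array(rot6d, dtype=np.float64, copy=True)
    out[degenerate_6d_mask(out)] = IDENTITY_6D
    return out


# Two joints: the value reported in this issue, and a valid identity.
joints = np.array([[1.0, 0.0, 0.0, 1.0, 0.0, 0.0], IDENTITY_6D])
mask = degenerate_6d_mask(joints)
fixed = replace_degenerate(joints)
```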

Does motions_processed (npy files) contain the canonical representation?

Could you describe the format of the npy data in motions_processed? And do the motions_processed npy files contain the canonical representation? Thank you for answering!

How to process SMPL to 6D representation

Hope you are doing well!

I see your dataset is SMPL but it is processed to joint positions and rotations (6D representation).
Now, I want to try running the model with another dataset to compare.
Can you provide more detail about processing data?

Thank you so much

latest evaluator.py

Sorry for that, there are some typos in evaluator.py. We have already fixed that.
please make sure your code is up to date.

Originally posted by @tr3e in #10 (comment)

Sorry, but I cannot find any version of evaluator.py updated after Nov 2, 2023. Has the fix been pushed?

World Frame Determination

Hi there, great work.
I'd like to ask how the world frame is determined. If two synthesized motion representations actually describe the same interaction but the world frames are chosen differently(which makes the global joint positions different), will the evaluation results vary greatly?

Question for input data

Thank you for open source data.
Each frame has 492 dimensions. Could you explain how to parse the joint rotations and the root position from the data?

How to visualize the generated results with mesh?

Thanks for your provided code.
I'm wondering how to visualize the generated motion as a mesh (e.g., SMPL). I see that you did this for Fig. 7 in your paper. Could you provide some instructions or the relevant code?

Thanks
