tr3e / intergen
[IJCV 2024] InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions
Home Page: https://tr3e.github.io/intergen-page/
Is the daily motion category reflected in the dataset's numbering, or are daily motions and professional motions completely shuffled in the numbering? Thanks for your reply!
After I load one npy file in motions_processed, I notice its shape is T*492.
I am curious about the concrete meaning of 492: what are its components? From the code, I understand it contains the global position for 22x3 and the 6D rotation representation for 21x3. What about the rest?
Hey,
Great job on this! I noticed that the time steps in "1.pkl" in the "motions" folder don't quite match up with "1.npy" in "motions_processed". Any idea what's causing the difference?
Mind sharing the script you used for the conversion from "motions" to "motions_processed"? It'd be super helpful for getting a handle on the "motions_processed" dataset format.
Thanks a bunch!
Hello,
I'm attempting to calculate the global_mean and global_std for the whole dataset myself so I can then find the values for a subset of the motion.
I accumulated the 262 processed features across all frames for motion1 and motion1_swap, then for motion2 and motion2_swap relative to motion1. When computing the mean and std, I get different results than those supplied in the data directory.
Can I inquire as to how exactly you are computing the average and std for the dataset? Thank you for your work and assistance thus far!
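For readers attempting the same computation, here is a minimal sketch of accumulating a per-feature mean and standard deviation over many clips without loading them all at once. It is not the authors' normalization code: the function name `running_mean_std` is hypothetical, and it assumes every clip is a (frames, features) array and that the population standard deviation (ddof=0) is wanted. A mismatch with the supplied statistics could still come from a different set of clips or a different std convention.

```python
import numpy as np


def running_mean_std(arrays):
    """Per-feature mean and population std over a stream of (T_i, D) arrays.

    Hypothetical helper, not taken from the InterGen repository.
    """
    count = 0
    total = None
    total_sq = None
    for arr in arrays:
        arr = np.asarray(arr, dtype=np.float64)
        if total is None:
            total = np.zeros(arr.shape[1])
            total_sq = np.zeros(arr.shape[1])
        count += arr.shape[0]            # number of frames seen so far
        total += arr.sum(axis=0)         # running sum per feature
        total_sq += (arr ** 2).sum(axis=0)
    mean = total / count
    # Clip tiny negative values caused by floating-point rounding.
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    return mean, np.sqrt(var)
```

Passing a generator that yields one clip at a time keeps memory use flat; the result matches concatenating all frames and calling `mean(axis=0)` and `std(axis=0)`.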
Hi,
I hope this finds you well. I've been studying your InterGen paper and was impressed by the use of the T2M framework for evaluating the InterHuman dataset. I'm working on a related project and am very interested in replicating your experiment.
Would it be possible for you to share the source code used for these experiments? Access to the code would be incredibly helpful for my work.
Thank you for considering my request.
Best
Missing ignore_list.txt and train.txt
Hello,
I am trying to run the train.py script provided in your repository. However, I encountered errors related to missing files in the ./data/interhuman_processed/ directory. Specifically, the following files are missing:
ignore_list.txt
train.txt
The absence of these files causes the script to fail with the following errors:
[Errno 2] No such file or directory: './data/interhuman_processed/ignore_list.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/train.txt'
Additionally, I observed the following warning related to the checkpoint directory:
/home/blinkdrive/miniconda3/envs/intergen/lib/python3.8/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:613: UserWarning: Checkpoint directory ./checkpoints/IG-S-8/model exists and is not empty.
rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
Running the train.py script should complete without any missing-file errors.
Instead, the script fails because ignore_list.txt and train.txt are missing.
To resolve this issue, could you please provide the missing ignore_list.txt and train.txt files? Alternatively, if these files need to be generated, could you provide the necessary instructions or scripts to create them?
As a temporary workaround, I created empty placeholder files for ignore_list.txt and train.txt. However, I am not sure what data these files should contain, which might affect the training process.
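A small pre-flight check can make this failure easier to spot before training starts. The sketch below only tests whether the two files named in the error messages exist and whether train.txt is empty; it does not know what the files should contain. The helper name `check_split_files` is hypothetical, and treating an empty train.txt as a problem is an assumption based on the reports in this thread.

```python
from pathlib import Path

# File names taken from the error messages quoted in this thread.
REQUIRED_FILES = ("ignore_list.txt", "train.txt")


def check_split_files(data_root, required=REQUIRED_FILES):
    """Return {file name: problem} for files that are missing or empty.

    Hypothetical helper; an empty ignore list is treated as acceptable,
    while an empty train.txt is flagged because it yields no samples.
    """
    problems = {}
    for name in required:
        path = Path(data_root) / name
        if not path.is_file():
            problems[name] = "missing"
        elif name == "train.txt" and path.stat().st_size == 0:
            problems[name] = "empty"
    return problems
```

Calling `check_split_files("./data/interhuman_processed")` before launching train.py returns an empty dict when both files are present and train.txt has content.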
Collecting scipy~=1.4.1 (from -r requirements.txt (line 5))
Using cached scipy-1.4.1.tar.gz (24.6 MB)
Installing build dependencies ... error
error: subprocess-exited-with-error
.......
File "/private/var/folders/pc/sbd6mtzs5g57c9z_vqdnj2z40000gn/T/pip-install-1gioli0t/numpy_e9ee180d7c8849eda824a6ce3f216369/numpy/distutils/command/build_src.py", line 142, in run
self.build_sources()
File "/private/var/folders/pc/sbd6mtzs5g57c9z_vqdnj2z40000gn/T/pip-install-1gioli0t/numpy_e9ee180d7c8849eda824a6ce3f216369/numpy/distutils/command/build_src.py", line 153, in build_sources
self.build_library_sources(*libname_info)
File "/private/var/folders/pc/sbd6mtzs5g57c9z_vqdnj2z40000gn/T/pip-install-1gioli0t/numpy_e9ee180d7c8849eda824a6ce3f216369/numpy/distutils/command/build_src.py", line 286, in build_library_sources
sources = self.generate_sources(sources, (lib_name, build_info))
File "/private/var/folders/pc/sbd6mtzs5g57c9z_vqdnj2z40000gn/T/pip-install-1gioli0t/numpy_e9ee180d7c8849eda824a6ce3f216369/numpy/distutils/command/build_src.py", line 369, in generate_sources
source = func(extension, build_dir)
File "numpy/core/setup.py", line 669, in get_mathlib_info
raise RuntimeError("Broken toolchain: cannot link a simple C program")
RuntimeError: Broken toolchain: cannot link a simple C program
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for numpy
How can I fix this? Please help.
I'm trying to visualize the human motions with the given 6D rotation representations in "motions_processed" folder. However, with the following code, the visualization result is quite weird.
Could you provide some guidance?
import numpy as np
import torch
import tqdm
import trimesh
import matplotlib.pyplot as plt
from human_body_prior.body_model.body_model import BodyModel
from pytorch3d import transforms
from common.quaternion import *

# MeshViewer, colors, and convert are used below, but their imports were not included in the post.
imw, imh = 800, 800
mv = MeshViewer(width=imw, height=imh, use_offscreen=False)
bm_fname = '/smplh/neutral/model.npz'
bm = BodyModel(bm_fname=bm_fname, num_betas=10)
data1 = np.load("./data/motions_processed/person1/6035.npy")
# Take the 21 body-joint 6D rotations that follow the 62*3 position values.
rot_6d = data1[..., 62 * 3:62 * 3 + 21 * 6]
rot_matrix = cont6d_to_matrix(torch.from_numpy(rot_6d.reshape(-1, 6)))
rot_axis_angle = transforms.matrix_to_axis_angle(rot_matrix).view(-1, 63)
body_pose_beta = bm(pose_body=rot_axis_angle)
# Render one mesh per frame.
for fId in tqdm.tqdm(range(body_pose_beta.v.shape[0])):
    body_mesh = trimesh.Trimesh(vertices=convert(body_pose_beta.v[fId]), faces=bm.f, vertex_colors=np.tile(colors['grey'], (6890, 1)))
    mv.viewer.render_lock.acquire()
    mv.set_static_meshes([body_mesh])
    mv.viewer.render_lock.release()
    plt.pause(1)
Hi,
Thanks for your work. Can I ask why the dimension of "pose_body" in the .pkl file is 63, which means 21 joints, rather than 24 joints in the SMPL model? I couldn't find the missing 3 joints or any skeleton hierarchy you used from the paper and the released codes. Thanks!
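One possible reading of the 63 values is the SMPL-H style split, where the root orientation is stored separately (other posts in this thread read a root_orient key from the same files) and pose_body covers 21 body joints, leaving out SMPL's two hand joints. Under that assumption, which the thread does not confirm, a 72-value SMPL pose can be assembled as sketched below. The function name `to_smpl_pose` is hypothetical, and setting the two hand joints to zero rotation is a placeholder choice.

```python
import numpy as np


def to_smpl_pose(root_orient, pose_body):
    """Assemble (T, 72) SMPL axis-angle poses from root and 21-joint body pose.

    Assumes the layout [root (3) | 21 body joints (63) | 2 hand joints (6)],
    with the hand joints filled with zeros. Hypothetical helper.
    """
    root_orient = np.asarray(root_orient, dtype=np.float32).reshape(-1, 3)
    pose_body = np.asarray(pose_body, dtype=np.float32).reshape(-1, 63)
    if root_orient.shape[0] != pose_body.shape[0]:
        raise ValueError("root_orient and pose_body must have the same frame count")
    hands = np.zeros((pose_body.shape[0], 6), dtype=np.float32)
    return np.concatenate([root_orient, pose_body, hands], axis=1)
```

The count works out as 3 + 63 + 6 = 72, which is 24 joints times 3 axis-angle components.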
Thanks for your great work!
Would you mind sharing the code to train the evaluation model, InterCLIP? I'm investigating its performance, and your help would be appreciated.
After I load one npy file in motions_processed, I notice its shape is num_frame*492.
I am curious about the concrete meaning of 492: what are its components?
I see the author explains in other issues that it contains SMPL-H joint positions and rotations (6D representation, no root and finger tops), totaling 62x3+51x6. Does this mean there is a finger tops parameter (22+20+20=62)? If so, what is the order of the joints?
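Taking the 62x3+51x6 description quoted above at face value, the split of one clip can be sketched as follows. The function name `split_492` is hypothetical, and the joint order inside each block is not established anywhere in this thread, so the sketch only separates the two blocks.

```python
import numpy as np

N_POS_JOINTS = 62  # joints with 3D positions, per the description in this thread
N_ROT_JOINTS = 51  # joints with 6D rotations, per the description in this thread


def split_492(motion):
    """Split a (T, 492) array into (T, 62, 3) positions and (T, 51, 6) rotations.

    Assumes positions come first, followed by rotations. Hypothetical helper.
    """
    motion = np.asarray(motion)
    n_pos = N_POS_JOINTS * 3
    expected = n_pos + N_ROT_JOINTS * 6
    if motion.ndim != 2 or motion.shape[1] != expected:
        raise ValueError(f"expected shape (T, {expected}), got {motion.shape}")
    frames = motion.shape[0]
    positions = motion[:, :n_pos].reshape(frames, N_POS_JOINTS, 3)
    rotations = motion[:, n_pos:].reshape(frames, N_ROT_JOINTS, 6)
    return positions, rotations
```

The arithmetic is 62 * 3 = 186 position values plus 51 * 6 = 306 rotation values, giving 492 per frame.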
Hi! Thanks a lot for the nice work as the dataset looks great.
In the paper, it is mentioned that some preprocessing is done (mirroring all motions as well as augmenting the descriptions). Would it be possible to share this script so everyone can work with similar datasets? Same for the train-val-test split.
Thanks a lot for your great work. I'd like to know why I encountered an error when I try to run infer.py. Here is the error info I encountered:
Traceback (most recent call last):
File "tools/infer.py", line 126, in <module>
litmodel.generate_one_sample(text, name+"_"+str(i))
File "tools/infer.py", line 65, in generate_one_sample
self.plot_t2m([motion_output[0], motion_output[1]],
File "tools/infer.py", line 49, in plot_t2m
plot_3d_motion(result_path, paramUtil.t2m_kinematic_chain, mp_joint, title=caption, fps=30)
File "/home/ubuntu/xiyan/InterGen/tools/../utils/plot_script.py", line 129, in plot_3d_motion
ani.save(save_path, fps=fps)
File "/home/ubuntu/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 1090, in save
anim._init_draw() # Clear the initial frame
File "/home/ubuntu/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 1748, in _init_draw
self._draw_frame(frame_data)
File "/home/ubuntu/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 1767, in _draw_frame
self._drawn_artists = self._func(framedata, *self._args)
File "/home/ubuntu/xiyan/InterGen/tools/../utils/plot_script.py", line 104, in update
ax.lines = []
AttributeError: can't set attribute
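The traceback ends at the assignment `ax.lines = []` in plot_script.py. In newer Matplotlib releases the `lines` and `collections` containers of an Axes can no longer be reassigned, which would explain the AttributeError. A common workaround is to remove each artist instead. The sketch below is a generic helper with a hypothetical name (`clear_artists`); it relies only on the `remove()` method that Matplotlib artists provide and is not a patch from the repository.

```python
def clear_artists(artists):
    """Remove every artist in a container such as ax.lines or ax.collections.

    A copy of the container is iterated because remove() changes it in place.
    """
    for artist in list(artists):
        artist.remove()


# Possible use inside the animation update function, replacing the
# direct assignments to ax.lines and ax.collections:
#     clear_artists(ax.lines)
#     clear_artists(ax.collections)
```

Pinning Matplotlib to the version the repository was developed with would be another option, if that version is known.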
When I run train.py, the following error occurs:
UserWarning: DataLoader returned 0 length. Please make sure this was your intention.
rank_zero_warn(
/data1/fyy/anaconda3/envs/intergen/lib/python3.8/site-packages/lightning/pytorch/utilities/data.py:110: UserWarning: Total length of CombinedLoader across ranks is zero. Please make sure this was your intention.
rank_zero_warn( rank_zero_warn(
Training: 0it [00:00, ?it/s]Trainer.fit stopped: No training batches.
Training: 0it [00:00, ?it/s]
Because of these errors:
[Errno 2] No such file or directory: './data/interhuman_processed/ignore_list.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/ignore_list.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/ignore_list.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/train.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/train.txt'
[Errno 2] No such file or directory: './data/interhuman_processed/train.txt'
How can I get the interhuman_processed directory?
Hi,
Thanks for your work. Your work is excellent! Can I ask how the 492 columns in each npy file under the "motions_processed" folder are composed? What exactly do these 492 values represent, in order? Thank you very much!
Hello, thanks for sharing your excellent work and dataset.
While visualizing 3D skeletons using the provided dataset, I've encountered some issues. Specifically, the plotted skeletons look unnatural and do not seem to be on the same plane. The root orientation also changes quite abruptly, and it appears that the camera of the scene is not fixed. Here is the plotted picture and some of my code; I use the SMPL model to visualize the skeleton:
for person_name in ['person1', 'person2']:
    global_orient = motions_data[person_name]['root_orient']
    body_pose = motions_data[person_name]['pose_body']
    betas = motions_data[person_name]['betas']
    transl = motions_data[person_name]['trans']
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    global_orient_tensor = torch.from_numpy(global_orient).float().to(device)
    body_pose_tensor = torch.from_numpy(body_pose).float().to(device)
    betas_tensor = torch.from_numpy(betas).float().to(device)
    transl_tensor = torch.from_numpy(transl).float().to(device)
    rotation_matrix = euler_angles_to_matrix(
        body_pose_tensor.view(body_pose_tensor.shape[0], -1, 3), 'XYZ')
    global_orient_mat = euler_angles_to_matrix(global_orient_tensor, 'XYZ').view(global_orient_tensor.shape[0], -1, 3, 3)
    dtype = body_pose_tensor.dtype
    add_rotation_matrix = torch.eye(3, device=device, dtype=dtype).view(
        1, 1, 3, 3).expand(rotation_matrix.shape[0], 2, -1, -1)
    processd_rot_mat = torch.cat([rotation_matrix, add_rotation_matrix], dim=1)
    betas_tensor_input = betas_tensor.expand(global_orient_mat.shape[0], -1)
    smpl_model = SMPL().eval().to(device)
    out = smpl_model(
        global_orient=global_orient_mat,
        body_pose=processd_rot_mat,
        betas=betas_tensor_input,
        transl=transl_tensor)
    show_idx = 0
    if person_name == 'person1':
        plot_skeleton_onePic(
            out['smpl'][show_idx].cpu().detach().numpy(),
            ax,
            'r',
            'Skeleton 1',
            humanact12_kinematic_chain
        )
    else:
        plot_skeleton_onePic(
            out['smpl'][show_idx].cpu().detach().numpy(),
            ax,
            'b',
            'Skeleton 2',
            humanact12_kinematic_chain
        )
Hello! thank you for your excellent work!
Can you share the normalization function with which you calculated your mean and std matrices?
Thank you!
Hello! First, let me thank you for your paper and released code/dataset!
I was wondering if you could share the precise format of the '.npy' files in motions_processed.
I understand from preprocess.py you append the joint positions and rotations.
I'm curious where joint velocities and foot-ground features are located within the file format, as well as other pertinent info.
Thanks!
Hi, I tried to visualize human mesh according to rotation in npy file of processed_motions.
This is what I got:
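import joblib
import numpy as np
import torch
from pytorch3d.transforms import matrix_to_axis_angle, rotation_6d_to_matrix

motions = np.load("InterHuman/motions_processed/person1/1.npy")
motions = torch.from_numpy(motions)
# Take the 21 body-joint 6D rotations that follow the 62*3 position values.
motions = motions[:, 62 * 3:62 * 3 + 21 * 6].reshape(-1, 21, 6)
motions = rotation_6d_to_matrix(motions)
motions = matrix_to_axis_angle(motions)
# Pad with a zero root rotation and two zero hand rotations to reach 24 joints.
zero = torch.zeros(motions.shape[0], 1, 3)
motions = torch.cat([zero, motions, zero, zero], dim=1)
# Keep only the first frame as a flat 72-dim pose vector.
motions = np.array(motions)[0].reshape(-1)
joblib.dump(motions, 'scratch/temp/intergen_obj/pro_test.pt')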
I get rotation like this, and input the 72-dim vector as pose to the SMPL model.
Do you have any advice to fix it?
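One thing worth checking in conversions like this is the 6D convention. Implementations of the 6D representation agree on the Gram-Schmidt step but differ on whether the resulting basis vectors are stacked as rows or as columns of the rotation matrix, so mixing an encoder and a decoder from different libraries can silently transpose every rotation. The NumPy sketch below makes that choice explicit. The function name `rot6d_to_matrix` is hypothetical, and which convention the dataset uses is an assumption to verify against the repository's own conversion code.

```python
import numpy as np


def rot6d_to_matrix(d6, columns=True):
    """Convert (..., 6) 6D rotations to (..., 3, 3) matrices via Gram-Schmidt.

    columns=True stacks the orthonormal basis vectors as matrix columns,
    columns=False stacks them as rows (the transposed rotation).
    """
    d6 = np.asarray(d6, dtype=np.float64)
    a1, a2 = d6[..., :3], d6[..., 3:6]
    b1 = a1 / np.linalg.norm(a1, axis=-1, keepdims=True)
    # Remove the component of a2 along b1, then normalize.
    a2_orth = a2 - (b1 * a2).sum(axis=-1, keepdims=True) * b1
    b2 = a2_orth / np.linalg.norm(a2_orth, axis=-1, keepdims=True)
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=-1 if columns else -2)
```

If a visualization looks like every joint rotates the wrong way, trying the other value of `columns` is a quick test of whether the convention is the cause.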
Hi, thanks for sharing your great work and your code!
When I run the training scripts, interhuman.py reports that many '.npy' files are missing from the person2 folder. I also found many files named xxxx(1).npy in both the person1 and person2 folders, where xxxx is a number. These look like errors from the pre-processing scripts that convert the original '.pkl' files to '.npy' files. Could you fix this issue, or could you provide the scripts for converting the data in "motions" to the format in "motions_processed"?
Looking for your reply! Thanks.
Thank you very much for your work and the new dataset!
I am trying to reproduce your results with your given model checkpoints/intergen.ckpt and for some reason don't get the same results.
========== MM Distance Summary ==========
---> [ground truth] Mean: 3.7849 CInterval: 0.0012
---> [InterGen] Mean: 3.7978 CInterval: 0.0011
========== R_precision Summary ==========
---> [ground truth](top 1) Mean: 0.2491 CInt: 0.0048;(top 2) Mean: 0.3828 CInt: 0.0064;(top 3) Mean: 0.4780 CInt: 0.0076;
---> [InterGen](top 1) Mean: 0.2748 CInt: 0.0042;(top 2) Mean: 0.4065 CInt: 0.0052;(top 3) Mean: 0.4881 CInt: 0.0050;
========== FID Summary ==========
---> [ground truth] Mean: 0.9993 CInterval: 0.0211
---> [InterGen] Mean: 7.1862 CInterval: 0.1244
These are the results I get. Looking at the FID summary, the ground truth's mean is 0.9993 and InterGen's mean is 7.1862, while in your paper the results are 0.273 for the ground truth and 5.918 for InterGen.
Thank you!
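For context when comparing FID numbers, the metric is the Fréchet distance between two Gaussians fitted to feature embeddings: the squared distance between the means plus Tr(S_a + S_b - 2 (S_a S_b)^(1/2)). The sketch below implements that textbook formula with NumPy. It is not the repository's evaluator, the function name `frechet_distance` is hypothetical, and it does not explain the gap reported above. It does illustrate that the value depends on the scale of the embeddings, since multiplying both feature sets by s multiplies the distance by s squared.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two (N, D) feature sets."""
    a = np.asarray(feats_a, dtype=np.float64)
    b = np.asarray(feats_b, dtype=np.float64)
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    cov_a = np.cov(a, rowvar=False)
    cov_b = np.cov(b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)
```

Because of the scale dependence, two runs are only comparable when they use the same feature extractor weights, the same embedding scaling, and the same evaluation split.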
Thanks for sharing your awesome work.
I was wondering how I can get access to the videos of the boxing sequences from just 4 different viewpoints. I want to create the multi-person animation of the boxing portion of your dataset with my own pipeline, which works with 3 or 4 views, and compare it with your GT results.
Thank you for sharing your excellent code and data!
I am interested in reproducing the person-to-person generation part mentioned in your paper. It was mentioned that fine-tuning was used for this.
Could you share more detailed fine-tuning settings? (how much longer it was trained, how the existing text prompt was handled, etc.)
When I run infer.py, the following error occurs:
Traceback (most recent call last):
File "tools/infer.py", line 128, in <module>
litmodel.generate_one_sample(text, name+"_"+str(i))
File "tools/infer.py", line 66, in generate_one_sample
self.plot_t2m([motion_output[0], motion_output[1]],
File "tools/infer.py", line 50, in plot_t2m
plot_3d_motion(result_path, paramUtil.t2m_kinematic_chain, mp_joint, title=caption, fps=30)
File "/data1/fyy/InterGen/tools/../utils/plot_script.py", line 130, in plot_3d_motion
ani.save(save_path, fps=fps)
File "/data1/fyy/anaconda3/envs/intergen/lib/python3.8/site-packages/matplotlib/animation.py", line 1102, in save
alt_writer = next(writers, None)
TypeError: 'MovieWriterRegistry' object is not an iterator
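The failing line, `alt_writer = next(writers, None)`, sits in Matplotlib's fallback path for choosing a movie writer, so this error appears to be triggered when the default writer (typically ffmpeg) is not available. Installing ffmpeg, or passing an explicit writer to `ani.save`, may avoid that path. The sketch below picks a writer name explicitly. The function name `choose_writer` is hypothetical, and `is_available` is any callable that reports whether a writer exists, for example Matplotlib's `animation.writers.is_available`.

```python
from pathlib import Path


def choose_writer(save_path, is_available):
    """Return (writer_name, output_path) for the first usable writer.

    Prefers ffmpeg and keeps the path unchanged; otherwise falls back to
    the Pillow writer and switches the file suffix to .gif.
    """
    path = Path(save_path)
    if is_available("ffmpeg"):
        return "ffmpeg", path
    if is_available("pillow"):
        return "pillow", path.with_suffix(".gif")
    raise RuntimeError("no supported Matplotlib movie writer is available")


# Possible use in plot_3d_motion, in place of ani.save(save_path, fps=fps):
#     from matplotlib import animation
#     writer, out = choose_writer(save_path, animation.writers.is_available)
#     ani.save(str(out), writer=writer, fps=fps)
```

The GIF fallback changes the output format, so any code that later expects an .mp4 file would need to handle the new suffix.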
Line 68 in 8a90d5d
Hello, may I ask why you multiply by this emb scale of 6 when calculating the FID and diversity metrics?
Thanks for sharing your great work!
I have trained the model myself following your README, but set BATCH_SIZE = 16 and EPOCH = 500 due to a lack of computing resources. With this setting, my trained model performs much worse than the evaluation results presented in the paper. I am wondering whether the exact same training settings are essential for the model to reach performance similar to the paper's. Also, could you kindly release the checkpoint that was trained exclusively on the training set? I think that would be really helpful for me!
Thanks for your time and patience!
Hi!
I am trying to reproduce the MDM results on InterHuman that you shared in your work, for example the FID of 9.167 obtained with MDM.
Could you share that code as well? I cannot reach the same evaluation metrics when training MDM on InterHuman.
Thank you!
Hello,
I'm attempting to perform FK from the processed motion data, which contains pos(22*3), vel(22*3), and rot(21*6). I have extracted the rotation data, yet I found that the rotations of the last 2 joints (which refer to the hands) are something like [1, 0, 0, 1, 0, 0], which would be an incorrect 6D representation.
Could you give the format of the npy data in motions_processed? And do the motions_processed npy files contain the canonical representation? Thank you for answering!
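Based only on the layout described in this post (positions for 22 joints, velocities for 22 joints, 6D rotations for 21 joints) and the 262 features mentioned in an earlier issue, one frame could be unpacked as sketched below. The block order is an assumption taken from the order in which the post lists them. The three blocks account for 66 + 66 + 126 = 258 values, and the meaning of the remaining 4 is not stated in this thread; they might be the foot-ground features another issue asks about. The function name `split_262` is hypothetical.

```python
import numpy as np


def split_262(features):
    """Split (..., 262) features into positions, velocities, rotations, rest.

    Assumed order: 22x3 positions, 22x3 velocities, 21x6 rotations, then
    4 trailing values whose meaning is not confirmed in this thread.
    """
    features = np.asarray(features)
    if features.shape[-1] != 262:
        raise ValueError(f"expected 262 features, got {features.shape[-1]}")
    lead = features.shape[:-1]
    pos = features[..., :66].reshape(*lead, 22, 3)
    vel = features[..., 66:132].reshape(*lead, 22, 3)
    rot = features[..., 132:258].reshape(*lead, 21, 6)
    rest = features[..., 258:]
    return pos, vel, rot, rest
```

A quick sanity check on real data is whether the two 3-vectors in each 6D rotation are roughly unit length and orthogonal; rows that fail it, like the [1, 0, 0, 1, 0, 0] example above, suggest the slice boundaries or the joint count are off.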
Hope you are doing well!
I see your dataset is SMPL but it is processed to joint positions and rotations (6D representation).
Now, I want to try running the model with another dataset to compare.
Can you provide more detail about processing data?
Thank you so much
Sorry about that; there were some typos in evaluator.py, which we have already fixed.
Please make sure your code is up to date.
Sorry, but I cannot find the latest evaluator.py after Nov 2, 2023..?
Hi there, great work.
I'd like to ask how the world frame is determined. If two synthesized motion representations actually describe the same interaction but the world frames are chosen differently (which makes the global joint positions different), will the evaluation results vary greatly?
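One common way to remove this dependence is to canonicalize each two-person clip before comparing it: move the first person's root to the origin on the ground plane and rotate about the vertical axis so that person faces a fixed direction in the first frame. The sketch below illustrates the idea and is not the repository's preprocessing. It assumes a Y-up world, joint positions shaped (frames, 2, joints, 3), and SMPL-style indices for the root and hips, all of which are hypothetical choices here.

```python
import numpy as np

# Hypothetical joint indices, assuming SMPL ordering.
ROOT, L_HIP, R_HIP = 0, 1, 2


def yaw_matrix(angle):
    """Rotation by `angle` radians about the vertical (Y) axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def canonicalize_pair(joints):
    """Express a (T, 2, J, 3) clip in the first-frame frame of person 1.

    Person 1's root is moved to the origin on the ground plane and the
    clip is rotated so that person 1 initially faces +Z.
    """
    joints = np.asarray(joints, dtype=np.float64)
    origin = joints[0, 0, ROOT].copy()
    origin[1] = 0.0                      # keep heights unchanged
    centered = joints - origin
    across = centered[0, 0, R_HIP] - centered[0, 0, L_HIP]
    forward = np.cross([0.0, 1.0, 0.0], across)
    theta = np.arctan2(forward[0], forward[2])  # yaw of the facing direction
    return centered @ yaw_matrix(-theta).T
```

With this step, two clips that differ only by a ground-plane translation and a rotation about the vertical axis map to the same canonical coordinates, so any metric computed afterwards sees identical inputs.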
Thank you for open source data.
Each frame has 492 dimensions. Could you explain how to parse the joint rotations and the root position from the data?