GART: Gaussian Articulated Template Models

Home Page: https://www.cis.upenn.edu/~leijh/projects/gart/

License: MIT License

Jupyter Notebook 8.24% Shell 0.55% Python 77.29% CMake 0.18% C++ 2.72% Cuda 10.90% C 0.12%

gart's Introduction

GART: Gaussian Articulated Template Models

Project Page; arXiv; YouTube video; Bilibili video

  • 2023.Nov.27: Code pre-release.

teaser

Install

Our code is tested on Ubuntu 20.04 with CUDA 11.8; you may have to adapt the following install script to your system. To install, run:

bash install.sh

Prepare Data

Template Models and Poses

SMPL for human body

Download SMPL v1.1 (SMPL_python_v.1.1.0.zip) from the official SMPL website, then move and rename SMPL_python_v.1.1.0/smpl/models/*.pkl to PROJECT_ROOT/data/smpl_model so that you get:

PROJECT_ROOT/data/smpl_model
    ├── SMPL_FEMALE.pkl
    ├── SMPL_MALE.pkl
    ├── SMPL_NEUTRAL.pkl

# to check the version of SMPL, here is the checksum of female pkl we are using
cksum SMPL_FEMALE.pkl
3668678829 247530000 SMPL_FEMALE.pkl
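
Optionally, you can sanity-check that the models load (a sketch assuming the smplx package set up by the install step; not part of the official instructions):

# Optional sanity check: load the neutral SMPL model from the folder above.
import torch
import smplx

model = smplx.SMPL("data/smpl_model", gender="neutral")
out = model(betas=torch.zeros(1, 10))   # mean shape, rest pose
print(out.vertices.shape)               # expect torch.Size([1, 6890, 3])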
AMASS (generation only)

Download the SMPL-X N package of the BMLrub subset from AMASS, unzip it, and put it into PROJECT_ROOT/data/amass/BMLrub.

For the generation application, you only need to download the SMPL models and AMASS poses. You can go to the generation section directly and skip the remaining data-download steps.

D-SMAL for dog body

If you want to work with dogs, please download smal_data from BITE and place the folder at lib_gart/smal/smal_data so you get:

lib_gart/smal/smal_data
    ├── mean_dog_bone_lengths.txt
    ├── my_smpl_data_SMBLD_v3.pkl
    ├── my_smpl_SMBLD_nbj_v3.pkl
    ├── new_dog_models
    │   ├── 39dogsnorm_newv3_dog_dog_0.obj
    ...
    │   └── X_scaled_39dogsnorm_newv3_dog.npy
    └── symmetry_inds.json

Real Videos

ZJU-MoCap

We use the data from Instant-nvr. Note that the poses from Instant-nvr differ from the original ZJU-MoCap; please follow the instructions in Instant-nvr to download their data: download ZJU-MoCap and their SMPL model smpl-meta.tar.gz. Link the unzipped data so you have:

PROJECT_ROOT/data/
    ├── smpl-meta
    │   ...
    │   └── SMPL_NEUTRAL.pkl
    └── zju_mocap
        ├── my_377
        ├── ...
        └── my_394
PeopleSnapshot

We use the data from InstantAvatar, including their pre-processed poses (already included when you clone this repo). First download the data from PeopleSnapshot. Then run DATA_ROOT=PATH_TO_UNZIPPED_PEOPLESNAPSHOT bash utils/preprocess_people_snapshot.sh to prepare the data.

Dog

Download our preprocessed data (pose estimated via BITE):

cd PROJECT_ROOT/data
gdown 1mPSnyLClyTwITPFrWDE70YOmJMXdnAZm
unzip dog_data_official.zip
UBC-Fashion

Download our preprocessed data (6 videos with pose estimated via ReFit):

cd PROJECT_ROOT/data
gdown 18byTvRqOqRyWOQ3V7lFOSV4EOHlKcdHJ
unzip ubc_release.zip

All the released data and logs can be downloaded from Google Drive.

Fit Real Videos

We provide examples for the four datasets in example.ipynb.

All fitting and testing scripts can be found under script.

For example, to fit all PeopleSnapshot videos, run:

bash script/fit_people_30s.sh

You may find other useful scripts under script. The running time of each configuration was measured on a laptop with an RTX 3080 GPU (16 GB VRAM) and an Intel i9-11950H processor with 4x16 GB RAM. We observe that, for some unknown reason, training is slower on our A40 cluster.

Text2GART

Please see text2gart.ipynb for an example.

Acknowledgement

Our code is based on several interesting and helpful projects:

TODO

  • clean the code
  • add bibtex

gart's Issues

About dL_depth and dL_mean

Hi Jiahui

Thanks for releasing your code!

I am trying to add depth supervision on top of your lib_render and noticed that you have modified the differentiable_renderer in lib_render.

dL_dmean

It seems that the gradients of means3D do not include the depth and alpha terms in backpropagation. I would like to know if this is intentional on your part (i.e., does it make the results better?). Based on my understanding, the gradients from dL_ddepths and dL_dalphas should also be included in the gradient of means3D.
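
To make the concern concrete, here is a toy autograd sketch (plain PyTorch, not the repo's CUDA rasterizer; all values are illustrative): when a loss term depends on a Gaussian's depth, the chain rule sends dL_ddepths back into means3D, which is exactly the term in question.

# Toy illustration: a depth loss yields a non-zero gradient on the 3D mean
# via the chain rule (this is autograd, not the hand-written CUDA backward).
import torch

mean3D = torch.randn(3, requires_grad=True)
R = torch.eye(3)                      # placeholder world-to-camera rotation
t = torch.tensor([0.0, 0.0, 2.0])     # placeholder camera translation

depth = (R @ mean3D + t)[2]           # per-Gaussian depth in the camera frame
loss = (depth - 1.5) ** 2             # toy depth-supervision term

loss.backward()
print(mean3D.grad)                    # non-zero: dL/ddepth flows into mean3D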

Results on UBC data is noisy

Hi, thanks for this excellent work.
The results on people_snapshot_public are quite good, but when I run on UBC data, the results are quite noisy, as shown below:
cano-pose
May I ask what the problem is? Thanks

Meaning of (A dot A0_inv) of SMPLTemplate

First and foremost, I'd like to express my appreciation for the outstanding work. I have a query regarding a specific part of the implementation.

In the forward function of SMPLTemplate, there's a line where the transformation matrix A is calculated as follows:
A = torch.einsum("bnij, njk->bnik", A, self.A0_inv)
I'm seeking clarification on the conceptual meaning behind the operation A dot A0_inv. As I understand it, A0 represents the relative transformation from the SMPL default pose joints (J) to the DaPose joints (J_dapose). Consequently, inv(A0) should denote the transformation from J_dapose back to J.

If my understanding is correct, then A in this context represents the transformation from J_dapose to the joints in the theta pose (J_theta). Could you please confirm if my interpretation is accurate or provide further explanation of A dot A0_inv if necessary?

Thank you in advance for your assistance!

The relevant code is attached here for convenience.

init_smpl_output = self._template_layer(
    betas=init_beta[None],
    body_pose=can_pose[None, 1:],
    global_orient=can_pose[None, 0],
    return_full_pose=True,
)
# A0: per-joint transforms from the rest pose to the canonical (DaPose) frame
J_canonical, A0 = init_smpl_output.J, init_smpl_output.A
A0_inv = torch.inverse(A0)

def forward(self, theta=None, xyz_canonical=None):
    # skinning: A maps the rest pose to the theta pose
    _, A = batch_rigid_transform(
        axis_angle_to_matrix(theta),
        self.J_canonical.expand(nB, -1, -1),
        self._template_layer.parents,
    )
    # compose with A0^{-1}: the result maps canonical (DaPose) points to theta
    A = torch.einsum("bnij, njk->bnik", A, self.A0_inv)
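
If it helps, here is a toy check of that composition (illustrative rigid transforms, not real SMPL output), matching your reading: with A0 mapping the rest pose to the DaPose canonical frame and A mapping the rest pose to the theta pose, A @ inverse(A0) takes canonical points directly to the theta pose.

# Toy verification of the composition A @ A0^{-1} (illustrative transforms)
import torch

A0 = torch.eye(4); A0[:3, 3] = torch.tensor([0.1, 0.0, 0.0])  # rest -> DaPose
A = torch.eye(4); A[:3, 3] = torch.tensor([0.0, 0.2, 0.0])    # rest -> theta

x_rest = torch.tensor([0.3, 0.4, 0.5, 1.0])  # homogeneous rest-pose point
x_canonical = A0 @ x_rest                    # the point in the DaPose frame
x_theta = A @ x_rest                         # the same point in the theta pose

# A @ A0^{-1} maps canonical points straight to the theta pose
assert torch.allclose((A @ torch.inverse(A0)) @ x_canonical, x_theta)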

VRAM usage and mesh generation

Hi, thanks for developing a fascinating algorithm!

I have just attempted running it on an Nvidia RTX 4070 with 12 GB VRAM, and get a CUDA out of memory error when running bash script/fit_people_30s.sh (it attempts to allocate ~13GB). I now wonder if there are any simple ways of reducing the VRAM usage during the fitting of the model — for example, which parameters in profiles/people/people_30s.yaml have the largest effect on memory?

Additionally, what would be the preferred way of generating a textured mesh from the learned avatar?

Thanks,
Filip

Meaning of A (rot) and A (t)

First and foremost, I want to express my admiration for the incredible work!
I am a bit confused: why do we use the top-left 3 × 3 block and the right 3 × 1 block rather than other settings?
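
For context, a minimal sketch of the standard reading (toy values, not the repo's tensors): A is a 4 × 4 homogeneous rigid transform, so its top-left 3 × 3 block is the rotation applied to points and its right 3 × 1 column is the translation; the bottom row is always [0, 0, 0, 1] and carries no information.

# A homogeneous rigid transform acts on a point x as R @ x + t
import torch

A = torch.eye(4)
A[:3, 3] = torch.tensor([0.1, 0.2, 0.3])  # toy translation

R, t = A[:3, :3], A[:3, 3]                # rotation block, translation column
x = torch.randn(3)
x_h = torch.cat([x, torch.ones(1)])       # homogeneous coordinates

assert torch.allclose((A @ x_h)[:3], R @ x + t)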

Exported .ply doesn't look like the rendered images

I used the save_gauspl_ply function in lib_gart/model_utils.py to export the Gaussians for the training view, but the result doesn't look the same as the exported renders, and I'm trying to figure out why.

Is it a bug in the save function or something you do differently in the renderer?

Here's an example using the lab scene from the Neumann dataset with InstantAvatar format

first-frame

It's not the best looking, but you can still see the facial details in the gif; they seem to get lost somehow in the .ply export. Here's the .ply visualized in antimatter's web viewer:
neumann_lab_exported_ply

When using custom data, the result is bad

How can I get cameras.npz and train_pose.npz? When I use data processed by InstantAvatar, the result after ./scripts/fit.sh is bad, like this. Can you give me some suggestions?
test

Rendering from different views

If I want to render around the object in a circular path, how can I achieve that? Looking forward to your reply, thank you!
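
One common approach (a sketch under assumptions: the function and names below are illustrative, not the repo's API, and it presumes the renderer accepts arbitrary world-to-camera extrinsics in place of the test cameras): sample camera positions on a circle around the subject and build a look-at extrinsic for each.

# Hypothetical helper: world-to-camera extrinsics on a circular orbit
# (OpenCV convention: x right, y down, z forward).
import numpy as np

def orbit_extrinsics(n_views=60, radius=3.0, height=0.0, center=np.zeros(3)):
    T_cw_list = []
    for ang in np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False):
        eye = center + np.array([radius * np.cos(ang), height, radius * np.sin(ang)])
        forward = center - eye
        forward /= np.linalg.norm(forward)
        right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
        right /= np.linalg.norm(right)
        down = np.cross(forward, right)
        R_wc = np.stack([right, down, forward], axis=1)  # camera axes in world
        T_cw = np.eye(4)
        T_cw[:3, :3] = R_wc.T                            # world-to-camera rotation
        T_cw[:3, 3] = -R_wc.T @ eye                      # world-to-camera translation
        T_cw_list.append(T_cw)
    return T_cw_list

Each returned T_cw can then be passed to the renderer as the camera pose for one frame of the orbit.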

Code release of Banana

I recently read your paper titled "Banana" and found it extremely insightful. I'm interested in experimenting with the techniques you've described. Could you please let me know when you plan to release the code?

AttributeError: 'SMPLOutput' object has no attribute 'J'

When I run this program, the following error occurs.
s8maolab | INFO | Dec-13-16:37:57 | Optimization with ./profiles/zju/zju_30s.yaml [solver.py:1389]
WARNING: You are using a SMPL model, with only 10 shape coefficients.
Using predefined pose: da_pose
python-BaseException
Traceback (most recent call last):
  File "/.pycharm_helpers/pydev/pydevd.py", line 1500, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents + "\n", file, 'exec'), glob, loc)
  File "/GART/solver.py", line 1390, in <module>
    _, optimized_seq = solver.run(data_provider)
  File "/GART/solver.py", line 786, in run
    ) = self._get_model_optimizer(betas=init_beta, add_bones_total_t=total_t)
  File "/GART/solver.py", line 181, in _get_model_optimizer
    template = get_template(
  File "/GART/lib_gart/templates.py", line 29, in get_template
    template = SMPLTemplate(
  File "/GART/lib_gart/templates.py", line 69, in __init__
    J_canonical, A0 = init_smpl_output.J, init_smpl_output.A
AttributeError: 'SMPLOutput' object has no attribute 'J'

smpl data correction

I noticed that in zju_mocap.py you correct the SMPL data like this.

        # Use camera extrinsics to rotate the SMPL to each camera coordinate frame!

        # the R,t is used like this, stored in cam
        # i.e. the T stored in cam is actually p_c = T_cw @ p_w
        # def get_rays(H, W, K, R, T):
        #     # calculate the camera origin
        #     rays_o = -np.dot(R.T, T).ravel()
        #     # calculate the world coodinates of pixels
        #     i, j = np.meshgrid(
        #         np.arange(W, dtype=np.float32), np.arange(H, dtype=np.float32), indexing="xy"
        #     )
        #     xy1 = np.stack([i, j, np.ones_like(i)], axis=2)
        #     pixel_camera = np.dot(xy1, np.linalg.inv(K).T)
        #     pixel_world = np.dot(pixel_camera - T.ravel(), R)
        #     # calculate the ray direction
        #     rays_d = pixel_world - rays_o[None, None]
        #     rays_d = rays_d / np.linalg.norm(rays_d, axis=2, keepdims=True)
        #     rays_o = np.broadcast_to(rays_o, rays_d.shape)
        #     return rays_o, rays_d

        # ! the cams R is stored in very low precision; use SVD to project it back to SO(3)
        for cid in range(num_cams):
            _R = self.cams["R"][cid]
            u, s, vh = np.linalg.svd(_R)
            new_R = u @ vh
            self.cams["R"][cid] = new_R

        # this is copied
        smpl_layer = SMPLLayer(osp.join(osp.dirname(__file__), "../data/smpl-meta/SMPL_NEUTRAL.pkl"))

        # * Load smpl to camera frame
        self.smpl_theta_list, self.smpl_trans_list, smpl_beta_list = [], [], []
        self.meta = []
        for img_fn in self.ims:
            cam_ind = int(img_fn.split("/")[-2])
            frame_idx = int(img_fn.split("/")[-1].split(".")[0])
            self.meta.append({"cam_ind": cam_ind, "frame_idx": frame_idx})
            smpl_fn = osp.join(root, "smpl_params", f"{frame_idx}.npy")
            smpl_data = np.load(smpl_fn, allow_pickle=True).item()
            T_cw = np.eye(4)
            T_cw[:3, :3], T_cw[:3, 3] = (
                np.array(self.cams["R"][cam_ind]),
                np.array(self.cams["T"][cam_ind]).squeeze(-1) / 1000.0,
            )

            smpl_theta = smpl_data["poses"].reshape((24, 3))
            assert np.allclose(smpl_theta[0], 0)
            smpl_rot, smpl_trans = smpl_data["Rh"][0], smpl_data["Th"]
            smpl_R = axangle2mat(
                smpl_rot / (np.linalg.norm(smpl_rot) + 1e-6), np.linalg.norm(smpl_rot)
            )

            T_wh = np.eye(4)
            T_wh[:3, :3], T_wh[:3, 3] = smpl_R.copy(), smpl_trans.squeeze(0).copy()

            T_ch = T_cw.astype(np.float64) @ T_wh.astype(np.float64)

            smpl_global_rot_d, smpl_global_rot_a = mat2axangle(T_ch[:3, :3])
            smpl_global_rot = smpl_global_rot_d * smpl_global_rot_a
            smpl_trans = T_ch[:3, 3]  # 3
            smpl_theta[0] = smpl_global_rot
            beta = smpl_data["shapes"][0][:10]

            # ! Because SMPL global rot is rot around joint-0, have to correct this in the global translation!!
            _pose = axis_angle_to_matrix(torch.from_numpy(smpl_theta)[None])
            so = smpl_layer(
                torch.from_numpy(beta)[None],
                body_pose=_pose[:, 1:],
            )
            j0 = (so.joints[0, 0]).numpy()
            t_correction = (_pose[0, 0].numpy() - np.eye(3)) @ j0
            smpl_trans = smpl_trans + t_correction

            self.smpl_theta_list.append(smpl_theta)
            smpl_beta_list.append(beta)
            self.smpl_trans_list.append(smpl_trans)

Does this correction have any reference? Since the camera pose is used during the correction, one frame has different SMPL data in different views. It feels quite strange...
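
For reference, my reading of the algebra behind t_correction (an interpretation, not confirmed by the authors): the ZJU Rh/Th convention applies the global rotation about the origin, p = R x + t, while SMPL's global_orient rotates about joint 0, p = R (x - j_0) + j_0 + t'. Equating the two gives

$$R(x - j_0) + j_0 + t' = R x + t \;\Longrightarrow\; t' = t + (R - I)\,j_0,$$

which is exactly smpl_trans + (_pose[0, 0] - eye(3)) @ j0 in the code above. Since T_ch folds in the camera extrinsics, the per-camera SMPL parameters describe the same body, just expressed in each camera's frame.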

training condition for own dataset

Hello,
I tried the NeuMan dataset using the UBC-Fashion training configuration and the PeopleSnapshot training configuration, but the results were not so good.
Should we fine-tune the hyperparameters ourselves?
I'd like to know how to optimize this, or whether this is an expected result.
da-pose

May I know how to get the LPIPS* score?

Hi, thanks for the excellent work. In Table 3 of your paper, you use "LPIPS*" (with a star) as a metric instead of the normal "LPIPS" (which is usually zero point something). May I know how to calculate the "LPIPS*" score? Thanks
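Not an official answer, but several related avatar papers report LPIPS scaled by 1000 under a starred name; a sketch assuming that convention (using the standard lpips package):

# Assumption: LPIPS* = LPIPS x 1000 (a common convention in related work;
# not confirmed by the GART authors).
import lpips
import torch

loss_fn = lpips.LPIPS(net="alex")            # standard LPIPS backbone
img0 = torch.rand(1, 3, 256, 256) * 2 - 1    # LPIPS expects inputs in [-1, 1]
img1 = torch.rand(1, 3, 256, 256) * 2 - 1

lpips_star = loss_fn(img0, img1).item() * 1000.0
print(f"LPIPS*: {lpips_star:.2f}")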

Custom data

Congratulations on this excellent work!
I wonder how to run this work on my own data. For example, after capturing a monocular video, how do I run your method? How should I process the data for training?
Many thanks!
