CHORUS [ICCV 2023, Oral]

Open In Colab   Project Page  

CHORUS: Learning Canonicalized 3D Human-Object Spatial Relations from Unbounded Synthesized Images,
Sookwan Han, Hanbyul Joo
International Conference on Computer Vision (ICCV), 2023


teaser.gif

News

Sep 2023

Initial code release, including code for training and evaluation.

Installation

To set up the necessary environments for running CHORUS, please refer to the instructions provided here.

Demo

demo.png

A demo Colab notebook is coming soon! (ETA: October 2023)

Training

overview.gif

NOTE: The current version only supports COCO & LVIS categories.

Dataset Generation

CHORUS is trained on a generated dataset of human-object interaction images. Here, we provide an example of running the entire dataset generation pipeline for the surfboard category. Before generating the image dataset, activate the appropriate environment with the following command:

conda activate chorus_gen

Prompt Generation

CHORUS first generates multiple human-object interaction (HOI) prompts for the given category (surfboard) using ChatGPT. You can find example prompts for the surfboard category under the prompts/demo directory. If you wish to generate prompts for other categories or create your own, follow the steps outlined below.

  1. The OpenAI API relies on API keys for authentication, so you need your own API key to generate prompts. If you don't have one already, please refer to this link.

  2. After successfully configuring your API keys, execute the following command:

    python scripts/generation/generate_prompts.py --categories 'surfboard'

    to generate plausible HOI prompts for the specified surfboard category. By default, the results will be saved under the prompts/demo directory.

Please note that the OpenAI API does not support random seeding, as mentioned here; hence, the results of prompt generation are not reproducible. For reference, we provide the exact prompts used in our paper under the prompts/chorus directory.
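
For intuition, the snippet below sketches the kind of ChatGPT query such a script makes. It is only a sketch: it assumes the legacy (pre-1.0) openai Python package and an OPENAI_API_KEY environment variable, and the model choice, messages, and helper name are illustrative rather than the repo's actual code.

    import os

    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]  # authenticate with your own key

    def generate_hoi_prompts(category, n=10):
        """Ask ChatGPT for short human-object interaction (HOI) prompts."""
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system",
                 "content": "You write short captions of a person interacting with an object."},
                {"role": "user",
                 "content": f"List {n} diverse captions of a person using a {category}, one per line."},
            ],
        )
        text = response["choices"][0]["message"]["content"]
        # keep non-empty lines, stripping any bullets or numbering the model added
        return [line.strip("-*0123456789. ").strip() for line in text.splitlines() if line.strip()]

    for prompt in generate_hoi_prompts("surfboard"):
        print(prompt)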


Image Generation

To generate the dataset of images from HOI prompts for the surfboard category, execute the following command:

sh scripts/demo_surfboard_gen.sh $NUM_BATCH_PER_AUGPROMPT $BATCH_SIZE   # Default: 20, 6

Please note the following details (an illustrative sketch of the generation loop follows these notes):

  • The generation process typically takes around 9-10 hours on a single RTX 3090 GPU.
  • To reduce the generation time (i.e., the number of generated images), lower the $NUM_BATCH_PER_AUGPROMPT argument (default: 20).
  • If you encounter a CUDA out-of-memory (OOM) error, reduce the batch size by adjusting the $BATCH_SIZE argument (default: 6).
  • To resume the generation process, simply rerun the command; the program automatically skips existing samples.
  • The generated images are saved under the results/images directory by default.
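
As a rough illustration of what the script drives, here is a minimal batched text-to-image loop with resume-by-skipping, assuming a Stable Diffusion backend via the Hugging Face diffusers library. The model ID, prompt, batch count, and output layout are assumptions for illustration; the actual pipeline is the one invoked by scripts/demo_surfboard_gen.sh.

    import os

    import torch
    from diffusers import StableDiffusionPipeline

    # model ID is an assumption; any Stable Diffusion checkpoint works the same way
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    prompt = "a person riding a surfboard on a wave"   # one augmented HOI prompt
    batch_size = 6                                     # lower this on CUDA OOM
    out_dir = "results/images/surfboard"
    os.makedirs(out_dir, exist_ok=True)

    for batch_idx in range(20):                        # cf. $NUM_BATCH_PER_AUGPROMPT
        paths = [os.path.join(out_dir, f"{batch_idx:03d}_{i}.png") for i in range(batch_size)]
        if all(os.path.exists(p) for p in paths):
            continue                                   # resume: skip completed batches
        images = pipe([prompt] * batch_size).images    # one diffusion batch
        for path, image in zip(paths, images):
            image.save(path)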

Aggregation (Learning)

Once dataset generation is complete, CHORUS aggregates information across the generated images for 3D HOI reasoning. Before running the aggregation pipeline, activate the appropriate environment with the following command:

conda activate chorus_aggr

With the generated dataset in place, you can execute the complete aggregation pipeline for the surfboard category by running the following command:

sh scripts/demo_surfboard_aggr.sh

Please note the following details (a simplified sketch of the aggregation idea follows these notes):

  • To resume the aggregation process, simply rerun the command; the program automatically skips existing samples.
  • After the command completes successfully, you can find the visualizations under the results_demo directory!
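
To convey the core idea only (this is not the repo's implementation), the sketch below accumulates object evidence from many images into a voxel grid expressed in a canonical, human-centric frame. The grid resolution, extent, and function names are all hypothetical.

    import numpy as np

    GRID = 64      # voxels per axis (hypothetical resolution)
    EXTENT = 2.0   # grid spans [-EXTENT, EXTENT] metres around the canonical human
    occupancy = np.zeros((GRID, GRID, GRID))

    def accumulate(points_world, world_to_canonical):
        """Transform object evidence into the canonical human frame and vote into the grid."""
        homo = np.concatenate([points_world, np.ones((len(points_world), 1))], axis=1)
        canonical = (world_to_canonical @ homo.T).T[:, :3]          # 4x4 rigid transform
        idx = np.floor((canonical + EXTENT) / (2 * EXTENT) * GRID).astype(int)
        inside = np.all((idx >= 0) & (idx < GRID), axis=1)          # drop out-of-grid points
        np.add.at(occupancy, tuple(idx[inside].T), 1.0)             # accumulate votes

    # after accumulating over all images, normalize into a spatial distribution:
    # distribution = occupancy / occupancy.sum()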

Evaluation

pap.png

Preparing the Test Dataset

For quantitative evaluation, we use an extended version of the COCO-EFT dataset as our test set. To set up the test dataset, please follow the steps below.

  1. Download the COCO dataset and the COCO-EFT dataset by running the following command:

    sh scripts/download_coco_eft.sh

    By default, the COCO dataset will be downloaded to the imports/COCO directory, and the COCO-EFT dataset will be downloaded to the imports/eft directory.

  2. After downloading the datasets, preprocess and extend the dataset by executing the following command:

    python scripts/evaluation/extend_eft.py

    This script prepares the dataset for evaluation; a conceptual sketch of the pairing it performs follows below.
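
Conceptually, the extension pairs each EFT pseudo-ground-truth human fit with the object annotations from the same COCO image. The sketch below shows this idea using pycocotools; the file paths and JSON field names are hypothetical, so consult scripts/evaluation/extend_eft.py for the actual logic.

    import json
    from collections import defaultdict

    from pycocotools.coco import COCO

    # hypothetical paths; the download script places the data under imports/
    coco = COCO("imports/COCO/annotations/instances_train2017.json")
    with open("imports/eft/eft_fits.json") as f:
        eft_fits = json.load(f)   # one pseudo-ground-truth SMPL fit per person

    pairs = defaultdict(list)
    for fit in eft_fits:
        image_id = fit["image_id"]                      # assumed field name
        for ann in coco.loadAnns(coco.getAnnIds(imgIds=image_id)):
            category = coco.loadCats(ann["category_id"])[0]["name"]
            if category != "person":
                pairs[category].append((fit, ann))      # human fit + object annotation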


Reproduce Results

To replicate the results in the paper, you have two options:

Option 1: Download Pretrained Results

You can download the pretrained results using the following script:

sh scripts/download_pretrained_chorus.sh

This option is recommended if you have limited storage space or want to quickly access the pretrained results.

Option 2: Full Reproduction

Note: This process requires at least 5TB of storage to save the generated results.

To fully reproduce the results, follow the steps below.

  1. Activate the chorus_gen environment:

    conda activate chorus_gen
  2. Generate the dataset for all categories used in quantitative evaluation by running:

    sh scripts/run_quant_gen.sh

    Please note that this step may require significant storage space.

  3. Activate the chorus_aggr environment:

    conda activate chorus_aggr
  4. Aggregate the information from images by running:

    sh scripts/run_quant_aggr.sh    

    Please note that this step may require significant storage space.


Evaluate PAP (Projective Average Precision)

We perform quantitative evaluations for COCO categories using the proposed Projective Average Precision (PAP) metrics. To compute PAP for the reproduced results, run the following command:

python scripts/evaluation/evaluate_pap.py --aggr_setting_names 'quant:full'

This command will calculate PAP for each category based on the reproduced results. To report the mean PAP (mPAP) averaged over all categories, execute the following command:

python scripts/evaluation/report_pap.py
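
PAP scores projected predictions the same way standard average precision scores ranked detections. For reference, here is a generic average-precision routine over (score, label) pairs; it sketches only the final AP computation, not the projective matching that produces the scores and labels.

    import numpy as np

    def average_precision(scores, labels):
        """AP = area under the precision-recall curve for ranked binary predictions."""
        order = np.argsort(-scores)                     # rank by descending confidence
        labels = np.asarray(labels, dtype=float)[order]
        tp = np.cumsum(labels)                          # true positives at each rank
        precision = tp / np.arange(1, len(labels) + 1)
        recall = tp / max(labels.sum(), 1.0)
        ap, prev_recall = 0.0, 0.0
        for p, r in zip(precision, recall):             # integrate precision over recall
            ap += p * (r - prev_recall)
            prev_recall = r
        return ap

    # toy example: three ranked predictions, two of which are correct
    print(average_precision(np.array([0.9, 0.8, 0.3]), np.array([1, 0, 1])))  # ~0.83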

Citation

If you find our work helpful or use our code, please consider citing:

@inproceedings{han2023chorus,
  title = {Learning Canonicalized 3D Human-Object Spatial Relations from Unbounded Synthesized Images},
  author = {Han, Sookwan and Joo, Hanbyul},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year = {2023}
}

Acknowledgements

  1. Our codebase builds heavily on several open-source projects. Thanks for open-sourcing!

  2. We thank Byungjun Kim for valuable insights & comments!

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. However, please note that our code depends on other libraries (e.g., SMPL), which each have their own respective licenses that must also be followed.
