Diffusion Handles is a training-free method that enables 3D-aware image edits using a pre-trained Diffusion Model.

Home Page: https://diffusionhandles.github.io/

DiffusionHandles

[Project Page][ArXiv]

[Figure: Teaser]

This is the official implementation of the paper
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
by Karran Pandey, Paul Guerrero, Matheus Gadelha, Yannick Hold-Geoffroy, Karan Singh, Niloy J. Mitra
published at CVPR 2024.

Examples

[Figures: Example 1, Example 2, Example 3, Example 4]

Approach

[Figures: Example, Pipeline Overview, Edit]

  1. The input image is first reconstructed with a depth-to-image diffusion model. Intermediate activations are recorded.

  2. Depth is estimated using a monocular depth estimator, and the intermediate activations from the previous step are lifted to the 3D depth surface.

  3. A user-supplied 3D transform is applied to the depth surface and the lifted activations.

  4. The 3D-transformed depth and activations are used to guide the diffusion model to generate an edited image.
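The geometric core of steps 2 and 3 can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in the diffhandles package: it assumes a pinhole camera with a single focal length in pixels, moves every pixel rather than only a selected object, and uses nearest-pixel splatting. The names unproject and warp_activations are hypothetical.

```python
import numpy as np

def unproject(depth, focal):
    """Lift each pixel of an (H, W) depth map to a 3D point (assumed pinhole camera)."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = (xs - (w - 1) / 2.0) / focal * depth
    y = (ys - (h - 1) / 2.0) / focal * depth
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def warp_activations(acts, depth, transform, focal):
    """Move (H, W, C) activations with their 3D points and splat them back to 2D.

    transform is a 4x4 homogeneous matrix (the user-supplied 3D edit).
    Returns the warped activations and a mask of pixels that received a value.
    """
    h, w, c = acts.shape
    pts = unproject(depth, focal)
    pts_h = np.concatenate([pts, np.ones((pts.shape[0], 1))], axis=1)
    moved = (pts_h @ transform.T)[:, :3]

    # Project the transformed points back to pixel coordinates.
    z = moved[:, 2]
    safe_z = np.where(z > 1e-6, z, 1.0)  # avoid dividing by zero or negative depth
    u = np.rint(moved[:, 0] / safe_z * focal + (w - 1) / 2.0).astype(int)
    v = np.rint(moved[:, 1] / safe_z * focal + (h - 1) / 2.0).astype(int)
    keep = (z > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Write far points first so that nearer points overwrite them (a simple z-buffer).
    order = np.argsort(-z[keep], kind="stable")
    u, v = u[keep][order], v[keep][order]
    feats = acts.reshape(-1, c)[keep][order]

    warped = np.zeros_like(acts)
    covered = np.zeros((h, w), dtype=bool)
    warped[v, u] = feats
    covered[v, u] = True
    return warped, covered
```

Pixels where the returned mask is False are disoccluded by the edit and have no lifted activation; in the pipeline described above, the diffusion model generates content for such regions during the guided generation in step 4.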

Installation

Create a Conda environment:

conda create -n diffusionhandles python=3.9
conda activate diffusionhandles

CUDA & PyTorch Installation

If PyTorch and a compatible CUDA runtime are not installed on your system, install PyTorch with conda to make sure you have a CUDA version that works with PyTorch:

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

If a suitable CUDA dev environment including nvcc is not installed on your system, install a CUDA dev environment that matches the CUDA runtime version:

conda install cuda-libraries-dev=12.1 cuda-nvcc=12.1 cuda-nvtx=12.1 cuda-cupti=12.1 -c nvidia
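To decide which of the two conda commands above you need, a short check like the following can help. It is a sketch, not part of the repository; the function name check_environment is hypothetical, and it only reports whether PyTorch is importable, whether PyTorch sees a CUDA device, and whether nvcc is on the PATH.

```python
import importlib.util
import shutil

def check_environment():
    """Report whether PyTorch, a CUDA device, and nvcc are visible from this environment."""
    report = {
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
        "torch_cuda_version": None,
        "nvcc_on_path": shutil.which("nvcc") is not None,
    }
    if report["torch_installed"]:
        try:
            import torch
            report["cuda_available"] = bool(torch.cuda.is_available())
            # CUDA version PyTorch was built against (None for CPU-only builds).
            report["torch_cuda_version"] = torch.version.cuda
        except Exception:
            # A broken PyTorch install is treated the same as a missing one.
            report["torch_installed"] = False
    return report

for name, value in check_environment().items():
    print(f"{name}: {value}")
```

If torch_installed or cuda_available is False, the first command applies; if nvcc_on_path is False, the second one does.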

Clone the Diffusion Handles repository:

git clone https://github.com/adobe-research/DiffusionHandles.git
cd DiffusionHandles

Next, install Diffusion Handles as an editable package. Different sets of package dependencies are provided, depending on what you need:

pip install -e . # Only basic packages required for the 'diffhandles' directory.
pip install -e ".[test]" # Basic + packages required for the 'test' directory.
pip install -e ".[webapp]" # Basic + packages required for the 'webapp' directory.

Run Test Scripts

The following will run through the test set and put results in the results subdirectory:

cd test
python test_diffusion_handles.py

Run Web App

Start the full Diffusion Handles Pipeline Web App in tmux, where netpath is the base network path from the root of the server (for example /demo for a server at https://my_server.com/demo):

sudo apt install tmux
tmux
cd webapp
source start_webapps_in_tmux.sh <netpath>

With netpath set to /demo as in the example above, the main web app should then be reachable at https://my_server.com/demo/dh.

The demo consists of multiple services, all of which the script starts in separate tmux tabs. The main service runs in the diffhandles_pipeline tab and requires the other services to be running, so the script starts it a few seconds after the others. This delay is sometimes too short for the other services to finish starting, in which case the main service fails. To check, go to the diffhandles_pipeline tab in tmux and see whether the service is running. If it is not, make sure the services in all other tabs are running, then repeat the last command that was run in that tab.
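Instead of checking each tab by hand, you can poll the ports of the other services before restarting the main one. The sketch below is not part of the repository and assumes the services listen on TCP ports on localhost; the port numbers in the usage comment are placeholders, so take the real ones from start_webapps_in_tmux.sh.

```python
import socket
import time

def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not accepting connections yet: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)

def all_services_ready(ports, host="127.0.0.1", timeout=60.0):
    """Check every port in turn; each port gets its own timeout."""
    return all(wait_for_port(host, port, timeout=timeout) for port in ports)

# Hypothetical usage; replace the ports with those configured in the start script:
# if all_services_ready([6010, 6011, 6012]):
#     print("Dependencies are up; the main service can be started.")
```

Note that an open port only shows that a process is listening, not that a model has finished loading, so a failure of the main service is still possible shortly after the ports open.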

Check start_webapps_in_tmux.sh to adjust configuration details like the distribution of ports and GPUs among services.

Citation

@article{pandey2024diffusionhandles,
  title={Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D},
  author={Pandey, Karran and Guerrero, Paul and Gadelha, Matheus and Hold-Geoffroy, Yannick and Singh, Karan and Mitra, Niloy J.},
  journal={CVPR},
  year={2024}
}
