Comments (7)
Did some further research:
- If I have the cloth mask, cloth image, human image, parsed human image, and human pose, how can I concatenate these into a single input for the diffusion model, have it generate an output, and then compare that against the expected output?
- Ideally, I could just concatenate the cloth image + human image and check the output against the expected one.
Open to thoughts/ways of doing this.
from diffusers.
Hi @anton-l,
Just wanted to circle back to this. I'm not sure how I could concatenate the 2 images and pass that + the expected output through the diffusion model. Curious if you might have any ideas for how to approach this?
cc: @patrickvonplaten, @patil-suraj
Hi @krrishdholakia! By setting in_channels and out_channels in the UNet configuration you can adapt it to concatenated inputs and outputs, e.g. in_channels=6 for two concatenated input images.
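To make the channel count concrete, here is a minimal sketch of the concatenation step, assuming both images are already float tensors of the same size in (batch, channels, height, width) layout. The tensor names and sizes are placeholders, not something from the thread.

```python
import torch

# Placeholder batches of RGB images: (batch, channels, height, width).
cloth = torch.rand(4, 3, 64, 64)
person = torch.rand(4, 3, 64, 64)

# Stack along the channel dimension: 3 + 3 = 6 channels,
# which is what in_channels=6 would expect.
model_input = torch.cat([cloth, person], dim=1)
print(model_input.shape)  # torch.Size([4, 6, 64, 64])
```

Extra conditioning such as a 1-channel cloth mask could presumably be appended the same way, with in_channels raised to match.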
@anton-l How would you calculate the loss at the intermediate steps for this, since you want it to generate a target image (i.e. the person wearing the clothing) that is different from the concatenated inputs (clothing item + source person image)? For reference, the current training step is:
    # Predict the noise residual
    noise_pred = model(noisy_images, timesteps)["sample"]
    loss = F.mse_loss(noise_pred, noise)
    accelerator.backward(loss)
hey @anton-l just wanted to follow up on this
cc: @patil-suraj @patrickvonplaten
@krrishdholakia the idea would be to feed in the concatenated clothing + person images (6 channels) and have 6 channels as output as well, since the number of channels needs to match to compute the residuals. The first (or last) 3 channels of the output would then be your predicted clothed person, and the other 3 channels can be discarded (not used in the loss calculation). This is similar to how super-resolution is done with diffusion models.
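One possible reading of this, as a rough sketch rather than a tested recipe: keep 6 channels in and out, but slice the prediction so only the first 3 channels enter the loss. `TinyDenoiser` is a hypothetical stand-in for a UNet configured with in_channels=6 and out_channels=6, and the random tensors are placeholders; the thread does not specify how the 6-channel noisy input should be built.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


class TinyDenoiser(nn.Module):
    """Hypothetical stand-in for a UNet with in_channels=6, out_channels=6."""

    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(6, 6, kernel_size=3, padding=1)

    def forward(self, sample, timesteps):
        # A real diffusion UNet conditions on timesteps; this toy ignores them.
        return self.net(sample)


model = TinyDenoiser()

# Placeholder tensors (assumption): a 6-channel noisy input and matching noise.
noisy_images = torch.randn(2, 6, 32, 32)
noise = torch.randn(2, 6, 32, 32)
timesteps = torch.randint(0, 1000, (2,))

noise_pred = model(noisy_images, timesteps)  # shape (2, 6, 32, 32)

# Only the first 3 channels (the predicted clothed person) enter the loss;
# the remaining 3 output channels are discarded.
loss = F.mse_loss(noise_pred[:, :3], noise[:, :3])
loss.backward()
```

With this slicing, the weights that produce the discarded output channels receive no gradient from the loss, so those channels are effectively unconstrained.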
Hey @krrishdholakia not quite what you're looking for, but we now have an in-painting example with stable diffusion here https://github.com/huggingface/diffusers/tree/main/examples/inference#in-painting-using-stable-diffusion