humansensinglab / iti-gen
[ICCV 2023 Oral, Best Paper Finalist] ITI-GEN: Inclusive Text-to-Image Generation
Home Page: https://czhang0528.github.io/iti-gen
License: Other
Will the prompt embedding model used for the generation results in the paper (basis_final_embed_19.pt) be open-sourced later? We tried to reproduce it on our end, and the images generated when varying the Age attribute are of very poor quality. Is this related to the choice of reference images?
Hi,
Would it be possible to include the performance metrics code used in the original paper?
When n_samples is set to a value greater than 1, generation errors out.
To reproduce:
python generation.py \
--config='models/sd/configs/stable-diffusion/v1-inference.yaml' \
--ckpt='models/sd/models/ldm/stable-diffusion-v1/model.ckpt' \
--attr-list='Male' \
--outdir='./ckpts/a_headshot_of_a_person_Male/original_prompt_embedding/sample_results' \
--prompt-path='./ckpts/a_headshot_of_a_person_Male/original_prompt_embedding/basis_final_embed_19.pt' \
--n_iter=5 \
--n_rows=5 \
--n_samples=4 \
--gpu 0
Full error trace:
Global seed set to 42
Warning: Got 1 conditionings but batch-size is 4
Data shape for DDIM sampling is (4, 4, 64, 64), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 0%| | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
File "generation.py", line 371, in <module>
main()
File "generation.py", line 321, in main
samples_ddim, tmp = sampler.sample(S=opt.ddim_steps,
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/models/diffusion/ddim.py", line 96, in sample
samples, intermediates = self.ddim_sampling(conditioning, size,
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/models/diffusion/ddim.py", line 149, in ddim_sampling
outs = self.p_sample_ddim(img, cond, ts, index=index, use_original_steps=ddim_use_original_steps,
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/models/diffusion/ddim.py", line 177, in p_sample_ddim
e_t_uncond, e_t = self.model.apply_model(x_in, t_in, c_in).chunk(2)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/models/diffusion/ddpm.py", line 987, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/models/diffusion/ddpm.py", line 1410, in forward
out = self.diffusion_model(x, t, context=cc)
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/modules/diffusionmodules/openaimodel.py", line 732, in forward
h = module(h, emb, context)
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/modules/diffusionmodules/openaimodel.py", line 85, in forward
x = layer(x, context)
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/modules/attention.py", line 258, in forward
x = block(x, context=context)
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/modules/attention.py", line 209, in forward
return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/modules/diffusionmodules/util.py", line 114, in checkpoint
return CheckpointFunction.apply(func, len(inputs), *args)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/modules/diffusionmodules/util.py", line 127, in forward
output_tensors = ctx.run_function(*ctx.input_tensors)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/modules/attention.py", line 213, in _forward
x = self.attn2(self.norm2(x), context=context) + x
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/data2/user/blackbox_bias_mitigation/blackbox-codebase/ITI-GEN/models/sd/ldm/modules/attention.py", line 180, in forward
sim = einsum('b i d, b j d -> b i j', q, k) * self.scale
File "/data2/user/miniconda3/envs/iti-gen/lib/python3.8/site-packages/torch/functional.py", line 330, in einsum
return _VF.einsum(equation, operands) # type: ignore[attr-defined]
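The crash is consistent with a batch-size mismatch in cross-attention: the "Got 1 conditionings but batch-size is 4" warning suggests a single prompt embedding is paired with a batch of 4 latents, so once the attention heads are folded into the batch dimension the einsum operands disagree on `b`. Below is a minimal, shape-level sketch of the failure and of a `repeat`-based workaround; the exact shapes and the idea of repeating the ITI-GEN embedding before calling `sampler.sample` are assumptions about generation.py's internals, not a confirmed patch.

```python
import torch

# Assumed shapes from SD v1 cross-attention at 64x64 latents with 8 heads:
# after heads are folded into the batch dim, 4 samples give b = 4 * 8 = 32,
# while a single conditioning gives b = 1 * 8 = 8.
q = torch.randn(32, 4096, 40)  # queries from 4 latent samples
k = torch.randn(8, 77, 40)     # keys from a single prompt embedding

try:
    torch.einsum('b i d, b j d -> b i j', q, k)
except RuntimeError as e:
    print(type(e).__name__)    # einsum cannot reconcile b = 32 with b = 8

# Repeating the conditioning along the batch dimension (presumably what
# generation.py would need to do with the loaded prompt embedding when
# n_samples > 1) makes the operands agree:
k = k.repeat(4, 1, 1)          # (32, 77, 40)
sim = torch.einsum('b i d, b j d -> b i j', q, k)
print(sim.shape)               # torch.Size([32, 4096, 77])
```

Until the script handles this itself, keeping `--n_samples=1` and raising `--n_iter` is a safe way to get the same number of images.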
Hey,
First of all, congratulations on the paper; we found it extremely interesting. We were trying to replicate the results on our end and ran into some image-quality issues.
We trained the tokens as follows:
python train_iti_gen.py \
--prompt='a headshot of a person' \
--attr-list='Male,Skin_tone,Age' \
--epochs=30 \
--save-ckpt-per-epochs=10 \
--device=0
Then, we generated the images by running:
python generation.py \
--config='models/sd/configs/stable-diffusion/v1-inference.yaml' \
--ckpt='models/sd/models/ldm/stable-diffusion-v1/model.ckpt' \
--plms \
--attr-list='Male,Skin_tone,Age' \
--outdir='./results/a_headshot_of_a_person_Male_Skin_tone_Age/' \
--prompt-path='./ckpts/a_headshot_of_a_person_Male_Skin_tone_Age/original_prompt_embedding/basis_final_embed_19.pt' \
--n_iter=5 \
--n_rows=5 \
--n_samples=1 \
--gpu=0
We can see some samples below.
The images look alright, but there is noticeably "something off" around the eyes and nose. We were therefore wondering whether you modified any of the generator hyperparameters (guidance scale, resolution, sampler, number of denoising steps, ...).
Thanks in advance, and congratulations again on the paper!