google-research / maskgit
Official Jax Implementation of MaskGIT
License: Apache License 2.0
With MaskGIT, if the input image size varies across iterations, does it fail because the bidirectional transformer needs the input image size to be fixed?
As far as I know, an autoregressive model has no such limit on length or size.
Is that right?
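For context on the question above, here is a minimal sketch of why the image size determines the sequence length. It assumes the tokenizer downsamples each side by a factor of 16 (so a 256x256 image becomes a 16x16 token grid) and that the transformer uses a learned positional-embedding table of fixed length, ignoring any extra conditioning token. `num_tokens` and `fits_position_table` are hypothetical helpers, not functions from this repository.

```python
def num_tokens(height, width, downsample=16):
    """Number of visual tokens for an image, assuming a fixed downsampling factor."""
    if height % downsample or width % downsample:
        raise ValueError("image sides must be multiples of the downsampling factor")
    return (height // downsample) * (width // downsample)


def fits_position_table(height, width, table_len, downsample=16):
    """True if the token count matches a positional table with table_len entries."""
    return num_tokens(height, width, downsample) == table_len


# Assumption: a model trained on 256x256 inputs carries a table sized for them.
table_len = num_tokens(256, 256)
print(table_len, fits_position_table(256, 256, table_len))
print(num_tokens(512, 512), fits_position_table(512, 512, table_len))
```

Under these assumptions a different image size yields a different token count, so the same table would not line up without retraining or resizing the embeddings.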
Hi, cool work!
I know that the license of the source code is Apache 2.0.
I am aware that Google does not own the rights to the dataset (ImageNet), but it would be cool to have an express license on the pretrained models as well, as far as Google and the repo owners' rights are concerned (although I am not arguing pretrained models are copyrightable).
If the intention is for the Apache 2.0 license to apply to the pretrained models (which are hosted elsewhere) as well, then I suggest adding a "License" section to the README file clarifying this.
Hello!
Thanks for providing part of the code and the pretrained models.
I am wondering if you guys are planning on releasing the training code.
Thanks again!
Hi there! First off, thanks so much for publishing this code!
This issue may just amount to my GPU not having enough memory, but I thought I'd share it since the Colab mentions that pmap should work with V100s. I am running it on Colab Pro+ with the High-RAM runtime shape, but when I get to this line:
results = p_generate_256_samples(pmap_input_tokens, sample_rngs)
I get the following error (I've tried a few times):
---------------------------------------------------------------------------
UnfilteredStackTrace Traceback (most recent call last)
<ipython-input-9-f2eb36cfcc0b> in <module>()
17 sample_rngs = jax.random.split(sample_rng, jax.local_device_count())
---> 18 results = p_generate_256_samples(pmap_input_tokens, sample_rngs)
19
10 frames
UnfilteredStackTrace: RuntimeError: UNKNOWN: CUDNN_STATUS_NOT_SUPPORTED
in external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc(4839): 'status'
The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.
--------------------
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
<ipython-input-9-f2eb36cfcc0b> in <module>()
16 elif run_mode == 'pmap':
17 sample_rngs = jax.random.split(sample_rng, jax.local_device_count())
---> 18 results = p_generate_256_samples(pmap_input_tokens, sample_rngs)
19
20 # flatten the pmap results
RuntimeError: UNKNOWN: CUDNN_STATUS_NOT_SUPPORTED
in external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_dnn.cc(4839): 'status'
It may also be notable that the function runs for a long time (20+ seconds) before throwing that error despite the fact that pmap should make it fast. Since I get an actual out-of-memory crash here on other GPUs with less RAM, I figured there may be a chance that this is a different issue.
Thx again for sharing the code :)
Hello. Thanks for the awesome project !
I want to reproduce the checkpoint of the Stage 1 model, which has rFID < 2.5, since that rFID is much lower than the original VQ-GAN's (rFID=4.7).
Could you let me know the detailed differences in implementation and training from the original VQGAN?
I think it is important for a fair comparison with other works and for reproducibility.
Thank you for the wonderful work; I am very interested in it. When do you plan to release the training code?
Dear authors,
Congratulations on your acceptance to CVPR; this is fantastic work.
Do you have a rough expectation for the release date of the training code?
Thanks
In the paper, it seems that there is also functionality for image inpainting. Is there any possibility of publishing it?
I saw at https://github.com/dome272/MaskGIT-pytorch that the functionality is still under development. Is there any update? Thank you!
Hey,
I was trying to run the untouched notebook in an environment with jax-0.3.13 and jaxlib-0.3.10 on an Ubuntu 18.04 machine with CUDA 11.7 and cuDNN 8.2, but I get the error
TypeError: take_along_axis indices must be of integer type, got float32
when running
elif run_mode == 'pmap':
  sample_rngs = jax.random.split(sample_rng, jax.local_device_count())
  results = p_generate_256_samples(pmap_input_tokens, sample_rngs)
Any help?
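The error message says the index array passed to `take_along_axis` has a float dtype. Below is a minimal sketch of the usual workaround, an explicit integer cast, shown with NumPy's `np.take_along_axis`, whose call shape `jnp.take_along_axis` mirrors. Where the float indices originate in the notebook is not stated above, so `take_cutoff` and the sample arrays are hypothetical.

```python
import numpy as np


def take_cutoff(sorted_scores, mask_len):
    """Pick one value per row at position mask_len (hypothetical helper)."""
    # Index arrays must have an integer dtype, so a float result
    # (for example from floor()) is cast explicitly before indexing.
    idx = np.asarray(mask_len).astype(np.int32)
    return np.take_along_axis(sorted_scores, idx, axis=-1)


scores = np.sort(np.array([[0.3, 0.1, 0.9, 0.5]]), axis=-1)
mask_len = np.floor(np.array([[2.0]]))  # float dtype, as in the reported error
cutoff = take_cutoff(scores, mask_len)
print(cutoff)
```

If the cause is the same in the notebook, the equivalent change would be an `.astype(jnp.int32)` on whatever array is passed as the indices argument.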
Hi! Thank you for your great work! May I ask when you are planning to release the full training code for this project?
Hello, thank you for sharing this code, it is very useful!
I am wondering if there is a bug in the ResBlock in vqgan_tokenizer. It currently looks like this:
if input_dim != self.filters:
  if self.use_conv_shortcut:
    residual = self.conv_fn(
        self.filters, kernel_size=(3, 3), use_bias=False)(
            x)
  else:
    residual = self.conv_fn(
        self.filters, kernel_size=(1, 1), use_bias=False)(
            x)
return x + residual
But this uses the same x coming from the previous convolutions, so should it be like this instead?
if input_dim != self.filters:
  if self.use_conv_shortcut:
    residual = self.conv_fn(
        self.filters, kernel_size=(3, 3), use_bias=False)(
            residual)
  else:
    residual = self.conv_fn(
        self.filters, kernel_size=(1, 1), use_bias=False)(
            residual)
return x + residual
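The proposal above can be sketched in isolation. This is a simplified NumPy stand-in, not the repository's Flax module: `conv1x1`, `res_block`, and the weight arrays are hypothetical, and the normalization and 3x3 convolutions are replaced by a ReLU and a single matmul. It only illustrates the shortcut projecting the block input rather than the output of the preceding convolutions.

```python
import numpy as np


def conv1x1(x, w):
    """1x1 convolution over the channel axis of an NHWC array (a plain matmul)."""
    return x @ w


def res_block(x, w_main, w_shortcut):
    """Residual block sketch in which the shortcut projects the block input."""
    residual = x                  # keep the untouched input
    out = np.maximum(x, 0.0)      # stand-in for norm + activation
    out = conv1x1(out, w_main)    # stand-in for the main-branch convolutions
    if x.shape[-1] != w_main.shape[-1]:
        # Channel counts differ, so project the input, not the conv output.
        residual = conv1x1(residual, w_shortcut)
    return out + residual


rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4, 4, 8))      # 8 input channels
w_main = np.zeros((8, 16))             # zero main branch isolates the shortcut
w_shortcut = rng.normal(size=(8, 16))
y = res_block(x, w_main, w_shortcut)
print(y.shape)
```

With the main branch zeroed out, the output reduces to the projected input, which is a quick way to check which tensor the shortcut actually consumes.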
Hi, many thanks for your code.
I'm using the Colab notebook. Is it possible to get the result images at full size?
Once this line is executed
visualize_images(composite_images, title=f'outputs')
I get images, but at a very small size. How can I get them at 512?
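One possible cause of small images is the figure size chosen by the plotting helper rather than the resolution of the generated arrays. Here is a minimal sketch, assuming `visualize_images` draws with matplotlib (not confirmed above); `native_figsize` is a hypothetical helper that computes a figure size giving roughly one screen pixel per image pixel, ignoring margins.

```python
def native_figsize(width_px, height_px, dpi=100):
    """Figure size in inches that maps image pixels to screen pixels at the given dpi."""
    return (width_px / dpi, height_px / dpi)


# A row of four 512x512 outputs laid out side by side (hypothetical layout).
figsize = native_figsize(4 * 512, 512, dpi=100)
print(figsize)
```

The result could be passed as `plt.figure(figsize=figsize, dpi=100)` before calling `plt.imshow`, or the image array could be written to disk to inspect it at its native resolution.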
Many thanks