justin-tan / high-fidelity-generative-compression

PyTorch implementation of High-Fidelity Generative Image Compression + routines for neural image compression

License: Apache License 2.0

Python 83.31% C++ 2.31% Jupyter Notebook 14.38%
image-compression generative-adversarial-networks entropy-coding pytorch computer-vision machine-learning

high-fidelity-generative-compression's Introduction

high-fidelity-generative-compression

PyTorch implementation of the paper "High-Fidelity Generative Image Compression" by Mentzer et al. This repo also provides general utilities for lossless compression that interface with PyTorch. For the official (TensorFlow) code release, see the TensorFlow compression repo.

About

This repository defines a model for learnable image compression based on the paper "High-Fidelity Generative Image Compression" (HiFiC) by Mentzer et al. The model can compress images of arbitrary spatial dimension and resolution by up to two orders of magnitude while maintaining perceptually similar reconstructions. Its outputs tend to be more visually pleasing than those of standard image codecs operating at higher bitrates.

This repository also includes a partial port of the TensorFlow Compression library, which provides general tools for neural image compression in PyTorch.


You can play with a demonstration of the model in Colab, where you can upload and compress your own images.

Example

[Image pair: Original | HiFiC reconstruction]
Original: 6.01 bpp (2100 kB) | HiFiC: 0.160 bpp (56 kB). Ratio: 37.5.

The image shown is an out-of-sample instance from the CLIC2020 dataset. The HiFiC image is a reconstruction produced by one of the learned models provided below.

Note that the learned model was not adapted in any way for evaluation on this image. More sample outputs from this model can be found at the end of the README and in EXAMPLES.md.
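As a quick sanity check on these figures: bits-per-pixel is 8 × (compressed size in bytes) / (number of pixels), so the quoted ratio can be recovered from either the bpp or the file-size numbers.

# Sanity check on the quoted figures (plain arithmetic, no assumptions)
original_bpp, hific_bpp = 6.01, 0.160
original_kb, hific_kb = 2100, 56
print(original_bpp / hific_bpp)  # ~37.6
print(original_kb / hific_kb)    # 37.5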

Note

The generator is trained to achieve realistic, rather than exact, reconstruction. It may synthesize certain portions of a given image to remove artifacts associated with lossy compression. In principle, therefore, images which are compressed and decoded may be arbitrarily different from the input. This precludes usage in sensitive applications. An important caveat from the authors is reproduced here:

"Therefore, we emphasize that our method is not suitable for sensitive image contents, such as, e.g., storing medical images, or important documents."

Usage

  • Install PyTorch nightly and dependencies from https://pytorch.org/. Then install the other requirements:
pip install -r requirements.txt
  • Clone this repository and cd into it:
git clone https://github.com/Justin-Tan/high-fidelity-generative-compression.git
cd high-fidelity-generative-compression

To check that your setup is working, run python3 -m src.model from the repository root. Usage instructions can be found in the user's guide.

Training

  • Download a large (> 100,000 images) dataset of diverse color images. We found that using one or two training divisions of the OpenImages dataset produced satisfactory results on arbitrary images. Add the dataset path under the DatasetPaths class in default_config.py (see the sketch after this list).

  • For best results, as described in the paper, train an initial base model using the rate-distortion loss only, together with the hyperprior model, e.g. to target low bitrates:

# Train initial autoencoding model
python3 train.py --model_type compression --regime low --n_steps 1e6
  • Then use the checkpoint of the trained base model to 'warmstart' the GAN architecture. Please see the user's guide for more detailed instructions.
# Train using full generator-discriminator loss
python3 train.py --model_type compression_gan --regime low --n_steps 1e6 --warmstart -ckpt path/to/base/checkpoint
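Registering a dataset path might look like the following minimal sketch. The OPENIMAGES attribute name is an assumption for illustration; match it to whatever key your version of the data loader actually reads from default_config.py.

# default_config.py (sketch; attribute names are assumptions)
class DatasetPaths(object):
    OPENIMAGES = '/data/openimages'   # directory containing training images
    # Add your own dataset here, e.g.
    # MY_DATASET = '/data/my_images'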

Compression

  • compress.py will compress generic images using a specified model. It performs a forward pass through the model to obtain the quantized latent representation, which is losslessly compressed using a vectorized ANS entropy coder and saved to disk in binary format. As the model architecture is fully convolutional, this works with images of arbitrary size/resolution (subject to memory constraints).
python3 compress.py -i path/to/image/dir -ckpt path/to/trained/model --reconstruct

The compressed format can be transmitted and decoded using the routines in compress.py. The Colab demo illustrates the decoding process.
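Programmatically, the round trip looks roughly like the sketch below. model.compress and model.decompress appear in src/model.py, but the loading and image-reading helpers here are hypothetical placeholders; defer to compress.py for the exact interface.

import torch

# Minimal round-trip sketch (helper functions are hypothetical)
model = load_trained_model('path/to/trained/model')  # hypothetical loader
model.eval()
with torch.no_grad():
    x = load_image_tensor('input.png')       # hypothetical: [1, 3, H, W] float tensor
    compression_output = model.compress(x)   # quantized latents, ANS-coded bitstream
    # ... store or transmit the compressed representation ...
    x_hat = model.decompress(compression_output)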

Pretrained Models

  • Pretrained model weights trained on the OpenImages dataset can be found below (~2 GB). The examples at the end of this README were produced using the HIFIC-med model. The same models are also hosted in the following Zenodo repository: https://zenodo.org/record/4026003.
Target bitrate (bpp), weights, and training instructions:

  • 0.14 bpp: HIFIC-low
    python3 train.py --model_type compression_gan --regime low --warmstart -ckpt path/to/trained/model -nrb 9 -norm
  • 0.30 bpp: HIFIC-med
    python3 train.py --model_type compression_gan --regime med --warmstart -ckpt path/to/trained/model --likelihood_type logistic
  • 0.45 bpp: HIFIC-high
    python3 train.py --model_type compression_gan --regime high --warmstart -ckpt path/to/trained/model -nrb 9 -norm

Examples

The samples below are taken from the CLIC2020 dataset, external to the training set. The bitrate is reported in bits-per-pixel (bpp). The reconstructions are produced using the HIFIC-med model above (target bitrate 0.3 bpp). It's interesting to try to guess which image is the original before reading the caption below each pair (images are saved as PNG for viewing; best viewed widescreen).

For more examples see EXAMPLES.md. For even more examples see this shared folder (images within generated using the HIFIC-low model).

[Each example below shows a pair of images, A and B.]

Image 1. Original: A (11.8 bpp) | HiFiC: B (0.269 bpp). Ratio: 43.8
Image 2. Original: A (14.6 bpp) | HiFiC: B (0.330 bpp). Ratio: 44.2
Image 3. Original: A (12.3 bpp) | HiFiC: B (0.209 bpp). Ratio: 58.9
Image 4. Original: B (19.9 bpp) | HiFiC: A (0.565 bpp). Ratio: 35.2

The last two pairs show interesting failure modes: small figures in the distance are almost entirely removed (top of the central rock in the penultimate image), and the required model bitrate increases significantly when the image is dominated by high-frequency components.

Authors

  • Grace Han
  • Justin Tan

Acknowledgements

Contributing

All content in this repository is licensed under the Apache-2.0 license. Please open an issue if you encounter unexpected behaviour, or have corrections/suggestions to contribute.

Citation

This is a PyTorch port of the original implementation. Please cite the original paper if you use their work.

@article{mentzer2020high,
  title={High-Fidelity Generative Image Compression},
  author={Mentzer, Fabian and Toderici, George and Tschannen, Michael and Agustsson, Eirikur},
  journal={arXiv preprint arXiv:2006.09965},
  year={2020}
}

high-fidelity-generative-compression's People

Contributors

dependabot[bot], justin-tan


high-fidelity-generative-compression's Issues

bit stream of the .hfc file

Dear author,
After an image is compressed into a .hfc file, how do we get its specific bit information? In other words, how can the .hfc file be converted into a bit stream for transmission?

about the model saving

When I use the code to train on my own dataset, the saved model seems unusable, like this: image_compression_2021_03_24_00_04_epoch1_idx1107_2021_03_24_00. However, the save path looks right (image_compression_2021_03_24_00_04_epoch1_idx1107_2021_03_24_00.pt). I am looking forward to your help. Thanks.

Is there a possibility of good-quality compression on images containing text?

Hi, I tried this project in Google Colab and tested it on a .jpg image from one of my physics ebooks. I used the HIFIC-low model, as the Colab notebook mentioned it gives the best compression ratio. After decompression I got a .png image with all the text inside unrecognizable. Does this mean the program only does well on content that does not contain text?
Or is there a possibility of it doing well on these types of images? Would training it on a custom dataset (built specifically for compressing this type of image) help?
If so, how should the dataset be structured?

Is "theoretical bpp" calculated by Hyperprior model?

Hello Justin,
Thank you for your outstanding reference code; I have learned a lot from it. I obtained high-quality reconstructed images, but I have a question about the "theoretical bpp": is it calculated by the hyperprior model?
Also, is the hyperprior model accurate? I see a big gap between the real bpp and the "theoretical bpp".

Thank you!

Sincerely,
Yifei

TypeError: can't pickle _thread.RLock objects

python compress.py -i data/originals/ -ckpt model/hific_med.pt --reconstruct
and then, later:


12:58:16 INFO - compress_and_decompress: Starting compression...
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
File "compress.py", line 240, in
main()
File "compress.py", line 237, in main
compress_and_decompress(args)
File "compress.py", line 143, in compress_and_decompress
for idx, (data, bpp, filenames) in enumerate(tqdm(eval_loader), 0):
File "C:\Users\DELL\Anaconda3\envs\torch\lib\site-packages\tqdm\std.py", line 1171, in iter
for obj in iterable:
File "C:\Users\DELL\Anaconda3\envs\torch\lib\site-packages\torch\utils\data\dataloader.py", line 279, in iter
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\DELL\Anaconda3\envs\torch\lib\site-packages\torch\utils\data\dataloader.py", line 719, in init
w.start()
File "C:\Users\DELL\Anaconda3\envs\torch\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\DELL\Anaconda3\envs\torch\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\DELL\Anaconda3\envs\torch\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\DELL\Anaconda3\envs\torch\lib\multiprocessing\popen_spawn_win32.py", line 65, in init
reduction.dump(process_obj, to_child)
File "C:\Users\DELL\Anaconda3\envs\torch\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects

Please help me. What am I doing wrong?

Issues with pop() function in ans.py

Hey Justin,

thanks for the repo, great work! When I run your decoder, it works just fine for a while, but every once in a while I get the following message:

image_compression_container_decoder  | [ERROR] [1696691601.057191]: bad callback: <bound method GANDecoder.decoder_callback of <__main__.GANDecoder object at 0x7fafa6c79dc0>>
image_compression_container_decoder  | Traceback (most recent call last):
image_compression_container_decoder  |   File "/opt/ros/noetic/lib/python3/dist-packages/rospy/topics.py", line 750, in _invoke_callback
image_compression_container_decoder  |     cb(msg)
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/demo_decoder_image_gan.py", line 34, in decoder_callback
image_compression_container_decoder  |     decoded_img = self.decode(in_encoded_img)
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/demo_decoder_image_gan.py", line 55, in decode
image_compression_container_decoder  |     output_decoder = decompress(self.gan, in_encoded_img)
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/gan_compression.py", line 22, in decompress
image_compression_container_decoder  |     compressed_output = model.decompress(data)
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/high_fidelity_generative_compression/gan_compression/src/model.py", line 327, in decompress
image_compression_container_decoder  |     latents_decoded = self.Hyperprior.decompress_forward(compression_output, device=utils.get_device())
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/high_fidelity_generative_compression/gan_compression/src/hyperprior.py", line 269, in decompress_forward
image_compression_container_decoder  |     latents_decoded, _ = self.prior_entropy_model.decompress(latents_encoded, means=latent_means,
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/high_fidelity_generative_compression/gan_compression/src/compression/prior_model.py", line 240, in decompress
image_compression_container_decoder  |     decoded = compression_utils.ans_decompress(encoded, indices, cdf, cdf_length, cdf_offset,
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/high_fidelity_generative_compression/gan_compression/src/compression/compression_utils.py", line 184, in ans_decompress
image_compression_container_decoder  |     decoded = entropy_coding.vec_ans_index_decoder(
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/high_fidelity_generative_compression/gan_compression/src/compression/entropy_coding.py", line 620, in vec_ans_index_decoder
image_compression_container_decoder  |     message, value = symbol_pop(message)
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/high_fidelity_generative_compression/gan_compression/src/compression/entropy_coding.py", line 40, in pop
image_compression_container_decoder  |     return pop_fun(start, freq), symbol
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/high_fidelity_generative_compression/gan_compression/src/compression/ans.py", line 85, in pop
image_compression_container_decoder  |     tail, new_head = stack_slice(tail_, n)
image_compression_container_decoder  |   File "/home/carpc/catkin_ws/src/generative_models/scripts/high_fidelity_generative_compression/gan_compression/src/compression/ans.py", line 35, in stack_slice
image_compression_container_decoder  |     arr, stack = stack
image_compression_container_decoder  | ValueError: not enough values to unpack (expected 2, got 0)
image_compression_container_decoder  

My setup is a bit complicated to reproduce, as I have embedded the code in a ROS node as part of a larger stack. The decoder mostly continues working after this error, but I have also seen it crash. Do you have any suggestions on how to fix it? Would it be sufficient to simply check that tail_ is not empty, so that the slice function cannot fail? As the error occurs so deep inside the code, I find it hard to identify the root cause.

How to use a custom dataset?

I've changed default_config.py to point to a custom folder with images:
folder/path
|----/image001.jpg
|----/image002.jpg
...

But it returned:
ValueError: num_samples should be a positive integer value, but got num_samples=0

Doesn't work for single channel images

Great repo! Thanks for implementing this.

I'm trying to apply this to grayscale imagery with shape [1, L, L]. I noticed a couple of spots where the number of input channels is assumed to be 3:

  1. In train.py you call create_model before you set args.image_dim.
    https://github.com/markveillette/high-fidelity-generative-compression/blob/052a34c4760a85a89e42b55846972c913f206ba1/train.py#L283

  2. In the generator, a 3 is hard coded in the output layer
    https://github.com/markveillette/high-fidelity-generative-compression/blob/052a34c4760a85a89e42b55846972c913f206ba1/src/network/generator.py#L141

How to reproduce TensorFlow results?

Hi!

Thanks for your implementation of HiFiC!
I'm trying to understand why the results of this PyTorch implementation differ so much from the TensorFlow implementation. Are there any differences between the implementations? How can I reproduce the TensorFlow results?
There are some examples below.

Thanks in advance for your reply

Original

kodim01 (1)

(low) Paper

kodim01

(low) TensorFlow

kodim01-0_258bpp

(low) PyTorch (your pre-trained GAN model)

kodim01_RECON_0_205bpp

(low) PyTorch (what I get when training the PyTorch model with default settings and batch_size=12 for the GAN)

kodi01_RECON_0_214bpp

What is the minimum input size for this model?

I am trying to use a 64x64 pixel image as the input of the model. Then I get the following error:

/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
   2115             magic_arg_s = self.var_expand(line, stack_depth)
   2116             with self.builtin_trap:
-> 2117                 result = fn(magic_arg_s, cell)
   2118             return result
   2119 

<decorator-gen-60> in time(self, line, cell, local_ns)

/usr/local/lib/python3.6/dist-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    186     # but it's overkill for just that one bit of state.
    187     def magic_deco(arg):
--> 188         call = lambda f, *a, **k: f(*a, **k)
    189 
    190         if callable(arg):

/usr/local/lib/python3.6/dist-packages/IPython/core/magics/execution.py in time(self, line, cell, local_ns)
   1191         else:
   1192             st = clock2()
-> 1193             exec(code, glob, local_ns)
   1194             end = clock2()
   1195             out = None

<timed exec> in <module>()

/content/high-fidelity-generative-compression/compress.py in compress_and_save(model, args, data_loader, output_dir)
     76 
     77             # Perform entropy coding
---> 78             compressed_output = model.compress(data)
     79 
     80             out_path = os.path.join(output_dir, f"{filenames[0]}_compressed.hfc")

/content/high-fidelity-generative-compression/src/model.py in compress(self, x, silent)
    290             y = utils.pad_factor(y, y.size()[2:], factor)
    291 
--> 292         compression_output = self.Hyperprior.compress_forward(y, spatial_shape)
    293         attained_hbpp = 32 * len(compression_output.hyperlatents_encoded) / np.prod(spatial_shape)
    294         attained_lbpp = 32 * len(compression_output.latents_encoded) / np.prod(spatial_shape)

/content/high-fidelity-generative-compression/src/hyperprior.py in compress_forward(self, latents, spatial_shape, **kwargs)
    196 
    197         # Obtain hyperlatents from hyperencoder
--> 198         hyperlatents = self.analysis_net(latents)
    199         hyperlatent_spatial_shape = hyperlatents.size()[2:]
    200         batch_shape = latents.size(0)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/content/high-fidelity-generative-compression/src/network/hyper.py in forward(self, x)
     59         x = self.activation(self.conv1(x))
     60         x = self.activation(self.conv2(x))
---> 61         x = self.conv3(x)
     62 
     63         return x

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in forward(self, input)
    417 
    418     def forward(self, input: Tensor) -> Tensor:
--> 419         return self._conv_forward(input, self.weight)
    420 
    421 class Conv3d(_ConvNd):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
    410     def _conv_forward(self, input, weight):
    411         if self.padding_mode != 'zeros':
--> 412             return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
    413                             weight, self.bias, self.stride,
    414                             _pair(0), self.dilation, self.groups)

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in _pad(input, pad, mode, value)
   3567             assert len(pad) == 4, '4D tensors expect 4 values for padding'
   3568             if mode == 'reflect':
-> 3569                 return torch._C._nn.reflection_pad2d(input, pad)
   3570             elif mode == 'replicate':
   3571                 return torch._C._nn.replication_pad2d(input, pad)

RuntimeError: Padding size should be less than the corresponding input dimension, but got: padding (2, 2) at dimension 3 of input [1, 320, 2, 2]

So I am wondering what is the minimum input size for this model. Thanks in advance : )

Error in loading optimizers state dict: optimizers['amort'].load_state_dict(checkpoint['compression_optimizer_state_dict'])

Hi,

I am trying to finetune the pre-trained model hific_low.pt, but I am getting an error at the following line in the load_model() function in the utils.py module:

optimizers['amort'].load_state_dict(checkpoint['compression_optimizer_state_dict'])

Here are more details about the error:

Traceback (most recent call last):
  File "train.py", line 283, in <module>
    model_type=args.model_type, current_args_d=dictify(args), strict=False, prediction=False)
  File "D:\Research\Image Compression\HiFiC\high-fidelity-generative-compression-master\src\helpers\utils.py", line 261, in load_model
    optimizers['amort'].load_state_dict(checkpoint['compression_optimizer_state_dict'])
  File "C:\Users\Ahmed Fawzy Gad\AppData\Roaming\Python\Python37\site-packages\torch\optim\optimizer.py", line 123, in load_state_dict
    raise ValueError("loaded state dict contains a parameter group "
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group

Please help me to solve it.

Thank you.

Unable to use pretrained model

When I try to use the pretrained model, I get an error that says "ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group". Any idea how I can fix this? Thanks.

Why not use torch.uint8?

Hello,
Since the range of an image is 0-255, why does the code use
data = data.to(device, dtype=torch.float)
and not data = data.to(device, dtype=torch.uint8)?

For image compression, uint8 is more efficient than float, right?
Thank you.

Training from scratch using OpenImages failed

Hi Justin,
Thanks for the great repo. I ran into the following strange problem; I roughly went through all the issue feedback but did not find an answer. Details below.
1. I want to reproduce HIFIC-low with OpenImages, but training from scratch (warmup + GAN) failed: the bpp is much higher than expected, around 0.3, while your pretrained model works fine on the same file at only 0.078 bpp.
2. For OpenImages, my training set contains 100,000 images (the first 100,000 images of the original train_0 sub-zip), while validation contains 41,620 images (the full original validation set).
3. I see that you trained for only 200K steps (much smaller than the 1M in the paper), while I modified the epoch number so that I train 500K steps for each stage.
4. Apart from the differences mentioned above, no config was changed.
5. The tensorboard output below seems to show a large gap in bpp between test and validation data, and the training bpp fluctuates over a not-too-small range. I also tried a larger batch size, e.g. 16, but that failed as well.
I hope it's not a burden for you to give some suggestions and insights.
Thanks!
[tensorboard screenshot]

Pretrained model

Dearest Maintainer,

The pretrained model appears to be gone. Has it moved? If so, what is the new download link? If not, I can cut a PR to remove the link.

Thanks
Dictated but not reviewed,
Becker

Question about target rate

First, I want to thank you for this wonderful code, which has been very useful for my research. I have a question about the target_rate variable that controls the bpp: I did not find target_rate used anywhere else in the code (maybe I just missed it). Which parts change between the pretrained models for the different regimes, and how is target_rate used? Also, if I want to keep increasing the bpp, do I only need to change the number of latent_channels (or perhaps the quantization precision in hyperprior.py)? Any suggestion about the changes needed to increase the bpp would be helpful.

Thanks a lot for your time.

Any plan to optimize the decoding time?

Great work! However, the decoding time is about 12s for a 1080p image. The bottleneck lies in calling Hyperprior.decompress_forward, especially in prior_entropy_model.decompress (~10s).

Is there any plan to optimize the decoding time?

Total steps of training

Thank you for the repository!
Regarding the total number of training steps: in the original paper, both parts (autoencoder/hyperprior and GAN training) are trained for 1 million steps. However, in this implementation it is trained for 10 epochs (if I am not wrong). Could you please explain this?

How to train at lower bitrates (< 0.14)?

Hi, thank you for your contributions. Following https://github.com/Justin-Tan/high-fidelity-generative-compression/issues/12, I trained the model on my own dataset (very small, containing 10,000 images of size 256x256). I changed the values in the target_rate_map dict in default_config.py to [0.05, 0.07, 0.1]. However, the model shows a higher bpp than the pretrained low model (0.14). In particular, the model trained with target rate 0.05 performs poorly: the q_bpp at test time is 0.325, while at train time it is 0.078. Could you give me some suggestions?

compression problem

The .hfc file I get by compressing a large image is 21 KiB. But when I cut the picture into many small pieces and compress each of them with HiFiC, the total size of the resulting .hfc files exceeds 190 KiB.

Runtime Error

RuntimeError: CUDA out of memory. Tried to allocate 210.00 MiB (GPU 0; 10.91 GiB total capacity; 9.59 GiB already allocated; 85.56 MiB free; 9.81 GiB reserved in total by PyTorch)

I decreased the parameters in default_config.py in order of priority, but it doesn't work.
I set:
batch_size = 2; latent_channels = 44;
n_residual_blocks = 3; crop_size = 64;
image_dims = (3,64,64); latent_dims = (latent_channels,8,8)

My GPU is GeForce GTX 1080P

Large overhead for vectorized ans encoding

Hi,

Thank you for this wonderful repo! I have a question about vectorized ans encoding.

The situation is that my images are 256x256. When I use ANS encoding, the file size is about 1 kB, but after switching to vectorized ANS encoding it comes to 6 kB. I think that's a large overhead.

Do you have any idea about that? Or, are there any relevant parameters I need to finetune? Thank you again!

'train.py --model_type compression_gan --regime low --n_steps 1e6 --warmstart --ckpt path/to/base/checkpoint' did not work

Train using full generator-discriminator loss

python3 train.py --model_type compression_gan --regime low --n_steps 1e6 --warmstart --ckpt path/to/base/checkpoint
this command did not work for me; after I changed it to 'python3 train.py --model_type compression_gan --regime low --n_steps 1e6 --warmstart -ckpt path/to/base/checkpoint' (--ckpt to -ckpt), it worked.
Maybe this can help somebody.

About the model choice

Firstly, thank you for offering this good code project. When I test the pre-trained models on my own dataset, I find that hific_low.pt compresses images at a level of 0.1441 bpp, hific_med.pt at 0.2865 bpp, and hific_hi.pt at 0.3041 bpp. hific_med and hific_hi offer nearly the same compression rate, though the image quality differs.

Also, when I train the model on my own dataset, could you share some experience on how to choose between models trained for different numbers of epochs?
