fab-jul / l3c-pytorch

PyTorch Implementation of the CVPR'19 Paper "Practical Full Resolution Learned Lossless Image Compression"

License: GNU General Public License v3.0

Python 92.63% C++ 4.72% Cuda 1.70% Shell 0.94%

l3c-pytorch's Issues

3D Case Implementation

Hi,

I'm currently working on geometric point cloud compression, and I'm really impressed by your approach. I have two questions:

What do you think of the approach of Ballé et al.?
Do you have any recommendations for such an implementation? I was thinking about a PointNet-like structure.

Thanks a lot,
Have a nice weekend!

Thank you very much for sharing the code. It is very helpful. I have a question about your evaluation datasets. Is it possible that you can share both DIV2K and RAISE-1k datasets like you share Open Images for evaluation?

Thank you for your answer.

questions about AC code

Dear Dr. Mentzer,
I have some questions about "def encode_uniform(self, dmll, S, fout):" and "encode_scale(self, scale, dmll, out, img, fout):"

These two functions encode the image into out.l3c. Does the encoded message in out.l3c include "S, fout" and "scale, fout"?
If images are encoded with arithmetic coding (AC), why is the content of out.l3c not a bitstream such as "01110010......."?

PyTorch 1.2+ Not Supported

Will add here what I find. For now:

"Comparison operations (lt (<), le (<=), gt (>), ge (>=), eq (==), ne, (!=) ) return dtype has changed from torch.uint8 to torch.bool": This breaks logistic_mixture.py.
"sum(Tensor) (python built-in) does not upcast dtype like torch.sum" → Check for this in code.
"Tensorboard is no Longer Considered Experimental" → should use the built-in TensorBoard
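
A minimal sketch of the first two items, assuming PyTorch 1.2 or newer. The tensors are made up for illustration and this is not the code in logistic_mixture.py.

```python
import torch

x = torch.tensor([-1.0, 0.5, 2.0])

# From PyTorch 1.2 on, comparison operators return torch.bool, not torch.uint8.
mask = x > 0
assert mask.dtype == torch.bool

# Code that did arithmetic on the old uint8 masks can cast explicitly.
mask_f = mask.to(x.dtype)
clipped = x * mask_f            # zero out the non-positive entries

# torch.sum upcasts small integer dtypes to int64, so this count of 300
# ones stored as uint8 does not wrap around.
flags = torch.ones(300, dtype=torch.uint8)
total = torch.sum(flags)
print(mask.dtype, clipped.tolist(), total.dtype, int(total))
```

Casting the mask once, right after the comparison, keeps the rest of the arithmetic unchanged between PyTorch versions.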

AttributeError: module 'torchac.torchac' has no attribute 'CUDA_SUPPORTED'

Hello,

I have been trying to compress the image using this command:

python3 l3c.py --device cpu log_dir 0306_0001 enc ../../sample_data/img.png out.l3c

Output:

*** AC_NEEDS_CROP_DIM = 3000000
Using /root/.cache/torch_extensions as PyTorch extensions root...
Emitting ninja build file /root/.cache/torch_extensions/torchac_backend/build.ninja...
Building extension module torchac_backend...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module torchac_backend...
Traceback (most recent call last):
  File "l3c.py", line 129, in <module>
    main()
  File "l3c.py", line 111, in main
    parse_device_flag(flags.device)
  File "l3c.py", line 46, in parse_device_flag
    print(f'Status: '
AttributeError: module 'torchac.torchac' has no attribute 'CUDA_SUPPORTED'

I have installed torchac using pip from https://github.com/fab-jul/torchac

Requirement already satisfied: torchac in /usr/local/lib/python3.7/dist-packages (0.9.3)

Thank you for your help in advance.

AssertionError: Expected a base dir for every component, got ['0524_0001'] and ['configs/ms', 'configs/dl']

Thanks for sharing this great repo!

I have cloned the repo and downloaded the pre-trained model. The pre-trained file was unzipped, and saved in 'root/src/log_dir/0524_0001/ckpts/ckpt_0000684250.pt'

When I ran the following command,

python test.py python test.py log_dir 0524_0001 ~/Data/kodak

I got the following error:

'AssertionError: Expected a base dir for every component, got ['0524_0001'] and ['configs/ms', 'configs/dl']'

I'm not sure whether this problem is specific to my setup. Do you have any suggestions?

Question about the computation of cdf in the code

Hi, I'm curious about the computation of the CDF in your code. There are L quantization levels (default 25 in your code), so why does the CDF have length L+1 for each quantized value?

For the uniform distribution, you seem to prepend zeros along the last dimension:

# here cdf's shape is [N, H, W, L]
cdf = torch.cumsum(pr, -1)
cdf = cdf.mul_(2**precision)
cdf = cdf.round()
# here cdf's shape changes to [N, H, W, L+1]
cdf = torch.cat((torch.zeros((N, H, W, 1), dtype=cdf.dtype, device=cdf.device),
                     cdf), dim=-1)

For the mixture of logistics, you seem to define new quantization levels:

# Lp = L+1
# here new quantization levels are defined
self.targets = torch.linspace(dmll.x_min - dmll.bin_width / 2,
                                      dmll.x_max + dmll.bin_width / 2,
                                      dmll.L + 1, dtype=torch.float32, device=l.device)
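
One way to read the L+1 entries: an arithmetic coder needs a lower and an upper bound for every symbol, and with a leading zero symbol i gets the interval [cdf[i], cdf[i+1]). The following is a plain-Python sketch of that idea with a made-up PMF over L = 4 symbols; it is not the torchac code.

```python
from itertools import accumulate

# Toy PMF over L = 4 symbols (illustrative values only).
pmf = [0.1, 0.2, 0.3, 0.4]
precision = 16

# Cumulative sums give the upper bound of each symbol's interval ...
upper = [round(c * 2 ** precision) for c in accumulate(pmf)]
# ... and prepending 0 supplies the lower bound of the first symbol.
cdf = [0] + upper          # length L + 1

def interval(symbol):
    """Integer range [low, high) assigned to `symbol`."""
    return cdf[symbol], cdf[symbol + 1]

for s in range(len(pmf)):
    print(s, interval(s))
```

With only L entries, either the first lower bound or the last upper bound would have to be special-cased inside the coder.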

Error of torchac.

Hello Fabian,

When I try to install torchac, I get the following error:

(pytorch) [wency@localhost torchac]$ COMPILE_CUDA=no python setup.py
Compiling, cuda_support=False
/home/wency/anaconda3/envs/pytorch/lib/python3.7/distutils/dist.py:274: UserWarning: Unknown distribution option: 'extra_compile_args'
warnings.warn(msg)
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help

error: no commands supplied

My environment is gcc 4.8.5, nvcc 8.0, python 3.7, pytorch 1.0.

question about header

In the RGBHeader I found
edsr.MeanShift(0, (0., 0., 0.), (128., 128., 128.))
and the pixel values of the images from the dataloader are between 0 and 255. What is the purpose of using it if it only maps the values to the range 0-2?
And how should I set the mean and std if I use a dataset with 12-bit images?
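
The arithmetic behind the question can be checked in a few lines. This sketch assumes the layer computes (x - mean) / std per channel, which is how the question reads it; the helper `normalize` is hypothetical and the 12-bit value of 2048 is only the analogous scaling, not a setting confirmed by the repo.

```python
def normalize(x, mean, std):
    # Assumed form of the normalization: (x - mean) / std.
    return (x - mean) / std

# 8-bit input with mean 0 and std 128 lands in [0, 2).
lo8, hi8 = normalize(0, 0.0, 128.0), normalize(255, 0.0, 128.0)
print(lo8, hi8)

# For 12-bit data (0..4095), a std of 2048 gives the same output range.
hi12 = normalize(4095, 0.0, 2048.0)
print(hi12)
```

Whether other parts of the model (for example the number of output bins) also need to change for 12-bit data is not covered by this sketch.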

Errors when compressing an image

I used the command below:
python l3c.py --device=cpu /home/tt/workspace/L3C-PyTorch/logs 0306_0001 enc /home/longmao/workspace/L3C-PyTorch/2.png out1.l3c
*** AC_NEEDS_CROP_DIM = 3000000
Status: torchac-backend-gpu available: False // torchac-backend-cpu available: True // CUDA available: True
*** Using torchac-backend-cpu; did set CUDA_AVAILABLE=False and DEVICE=cpu
Testing 0306_0001 at -1 ---
Restoring /home/tt/workspace/L3C-PyTorch/logs/0306_0001 cr oi/ckpts/ckpt_0001002750.pt... (strict=True)
*** WARN: Will discard 4th (alpha) channel.
Killed

GPU encoding/decoding entails errors

After fixing the cpu version on my setup, the GPU encoding/decoding is still not error-free.

I could verify that the test.py script works.

The only major difference between the scripts is the MultiscaleTester l3c argument,
so I assume this affects the results.

Edit: It is not the l3c argument. I built a standalone compression script based on test.py
and successively removed the parts not needed for compression.

The loss value remained around 5

When I trained on the CLIC dataset, the loss stayed around 5. Is it necessary to readjust parameters such as the learning rate?

Question about CDF function and AC code

Hi,
I don't understand how the program encodes symbols with arithmetic coding (AC) when it is given only the CDF and no PMF. Can you give me an example of the encoding?

Thank you
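
A toy illustration of why the CDF alone is enough: the PMF is the difference of neighbouring CDF entries, and the coder only ever needs those two entries per symbol. This is a textbook-style sketch with exact fractions and a made-up three-symbol CDF, not the integer range coder used in torchac.

```python
from fractions import Fraction

# CDF with L + 1 entries for L = 3 symbols; the PMF is its first difference.
cdf = [Fraction(0), Fraction(1, 2), Fraction(3, 4), Fraction(1)]
pmf = [hi - lo for lo, hi in zip(cdf, cdf[1:])]

def encode(symbols):
    """Narrow [low, low + width) once per symbol using only CDF entries."""
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        low += width * cdf[s]
        width *= cdf[s + 1] - cdf[s]
    return low, width

def decode(value, num_symbols):
    """Invert encode() by locating `value` in the CDF at every step."""
    out = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(num_symbols):
        target = (value - low) / width
        s = max(i for i in range(len(pmf)) if cdf[i] <= target)
        out.append(s)
        low += width * cdf[s]
        width *= pmf[s]
    return out

low, width = encode([0, 2, 1])
print(low, width, decode(low, 3))
```

The final width equals the product of the symbol probabilities (here 1/32), so about -log2(width) = 5 bits are enough to name a number inside the interval.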

torchac: how to find <EOM> to separate bytes strings (channels) ?

I've managed to use your torchac implementation in a setting like Ballé's hyperprior. For that, I create a CDF table of all possible values for each channel (normalized with _renorm_cast_cdf_ to create cdf_tbl), shift up and round the hyperprior to be encoded (z_int), and encode one channel at a time, each with its own cdf_tbl. The decoded bytes string matches z_int and the total number of bytes is as low as expected.

This creates one bytes string per channel. Do you know of a straightforward way to find the end of message (EOM) of these bytes, so that I can simply store one long series of bytes without explicitly encoding the number of bytes per channel? It seems that the decoder is not affected by additional bytes, so in theory I could brute-force it: split the bytes at the point where the decoder output no longer changes when more bytes are appended, with no need to store additional EOM markers. I am reading up on how the range encoder works and on your C++ code, and in the meantime I'm posting this in case you have a simpler answer.
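
For comparison, the explicit alternative is a length prefix per channel; the read_num_bytes_encoded call that appears in a traceback elsewhere on this page suggests the repo itself stores a byte count. A minimal sketch, assuming a 4-byte little-endian length (the function names and the byte order are illustrative, not taken from the repo):

```python
import io
import struct

def write_chunks(fout, chunks):
    """Write each bytes string preceded by its length as a 4-byte unsigned int."""
    for chunk in chunks:
        fout.write(struct.pack('<I', len(chunk)))
        fout.write(chunk)

def read_chunks(fin):
    """Read back the list of bytes strings written by write_chunks."""
    chunks = []
    while True:
        header = fin.read(4)
        if not header:          # clean end of stream
            return chunks
        (num_bytes,) = struct.unpack('<I', header)
        chunks.append(fin.read(num_bytes))

buf = io.BytesIO()
write_chunks(buf, [b'\x01\x02', b'', b'\xff' * 5])
buf.seek(0)
restored = read_chunks(buf)
print(restored)
```

The cost is a fixed 4 bytes per channel, which may or may not matter depending on how many channels are stored.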

Proper GPU/CPU support for torchac

At the moment, torchac.py has the following import logic:

import sys

try:
    # TODO: somehow this succeeds even if only _cpu is installed
    import torchac_backend_gpu as torchac_backend
    CUDA_SUPPORTED = True
except ImportError:
    CUDA_SUPPORTED = False
    # Try importing the cpu version
    try:
        import torchac_backend_cpu as torchac_backend
    except ImportError:
        print('*** ERROR: torchac_backend not found. Please see README.')
        sys.exit(1)

However, this has a bug AND is also not a nice design.

The bug is: If you compile torchac with GPU support, then uninstall it with pip, then install it with CPU support without doing a clean build, the torchac_backend_gpu library will still be installed. This means that the first import will succeed, even though pip list shows torchac_backend_cpu.

In general, the design is not nice, because it should not be the import that decides CUDA support, but the usage. In encode_logistic_mixture, either torchac_backend_gpu or torchac_backend_cpu should be selected based on whether the tensors are on the CPU or the GPU.
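
A sketch of the usage-based selection described above. The helper `select_backend` and the stand-in objects are hypothetical; the only real API relied on is that a torch tensor exposes an `is_cuda` attribute.

```python
from types import SimpleNamespace

def select_backend(tensor, cpu_backend, gpu_backend=None):
    """Pick a backend from where the tensor lives, not from which import worked."""
    if getattr(tensor, 'is_cuda', False):
        if gpu_backend is None:
            raise RuntimeError('tensor is on the GPU but no GPU backend is installed')
        return gpu_backend
    return cpu_backend

# Stand-ins for torchac_backend_cpu / torchac_backend_gpu and for tensors.
cpu_backend = SimpleNamespace(name='cpu')
gpu_backend = SimpleNamespace(name='gpu')
cpu_tensor = SimpleNamespace(is_cuda=False)
gpu_tensor = SimpleNamespace(is_cuda=True)

print(select_backend(cpu_tensor, cpu_backend, gpu_backend).name)
print(select_backend(gpu_tensor, cpu_backend, gpu_backend).name)
```

With this shape, a stale GPU build left on disk is harmless as long as the tensors are on the CPU.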

How to modify this code to enable compression of 16-bit images

I am running into a segmentation fault when I try to compress a 16-bit image. The problem is in the arithmetic coding. Any help is appreciated.

CDF function and model p

The figure in the paper shows z(s) being encoded with the model p(z(s) | f(s+1)). The encoding details, however, say that each channel c of z(s) is encoded with C(z(s) | f(s+1)). Why?
What is the relationship between the CDF and the model p?

Updated Training Set and New Results -- L3C v3

Previously, our preprocessing script saved all training images and validation images as JPGs with a high quality factor of Q=95, downscaled by a factor 0.75. It turns out that the resulting images have a specific enough distribution that the neural network picks up on it, and the images are also easier to compress for the non-learned codecs.

For correctness, we have thus re-created the training and validation sets. The new preprocessing script is available in the repo. The important differences are:

All images are saved as PNGs.
We do not rescale validation sets in any way, and instead divide the images into crops such that everything fits into memory. Note that this is a bias against our method, since more context can only help. We only crop images too big to fit into our GPU (TITAN X Pascal). Please see the updated README.
For the training set, we use a random downscaling factor, instead of fixed 0.75x: this provides a wider variety of downscaling artefacts.
Additionally, we use the Lanczos filter, as we found that Bicubic also introduces specific artefacts.

This causes all results to shift. However, as before, we still outperform WebP, JPEG-2000, and PNG, i.e. the ordering of the methods according to bpp remains unchanged.

We evaluated our model on 500 images randomly selected from the Open Images validation set and preprocessed like the training data. To compare, please download the Open Images evaluation set here.

Updated ArXiv

Available here https://arxiv.org/abs/1811.12817v3.

New Results

Group                   Method      Open Images  DIV2K        RAISE-1k
Ours                    L3C v3      2.990597132  3.093768752  2.386501087
Learned Baselines       RGB Shared  4.313588005  4.429001861  3.779201962
Learned Baselines       RGB         3.297824781  3.418117799  2.572320659
Non-Learned Approaches  PNG         4.004512908  4.234729262  3.556403138
Non-Learned Approaches  JPEG2000    3.054759549  3.126744435  2.46459739
Non-Learned Approaches  WebP        3.047477818  3.176081706  2.461481317
Non-Learned Approaches  FLIF        2.866778476  2.910950783  2.084036243

Status

Merged into master.

l3c.py decoder does not unpad

Should somehow write to the bitstream that we padded when encoding.

See L388 in multiscale_tester.py
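
One possible shape for this, sketched in plain Python: store the original height and width in a small header and crop after decoding. The header layout (two little-endian uint32 values), the padding multiple, and all function names are assumptions for illustration, not the format l3c.py uses.

```python
import io
import struct

def pad_amount(size, multiple):
    """Number of rows or columns to add so that size becomes a multiple."""
    return (-size) % multiple

def write_size_header(fout, height, width):
    # Hypothetical header: original height and width as two uint32 values.
    fout.write(struct.pack('<II', height, width))

def read_size_header(fin):
    return struct.unpack('<II', fin.read(8))

def unpad(rows, height, width):
    """Crop a padded image, given as a list of rows, back to height x width."""
    return [row[:width] for row in rows[:height]]

# Encoder side: a 5x7 image padded to a multiple of 8 in each dimension.
h, w, multiple = 5, 7, 8
padded = [[0] * (w + pad_amount(w, multiple)) for _ in range(h + pad_amount(h, multiple))]
buf = io.BytesIO()
write_size_header(buf, h, w)

# Decoder side: read the original size and crop the decoded, padded image.
buf.seek(0)
orig_h, orig_w = read_size_header(buf)
restored = unpad(padded, orig_h, orig_w)
print(len(restored), len(restored[0]))
```

Storing the original size rather than the pad amounts keeps the decoder independent of the padding multiple.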

weird problem when encoding

Dear Dr. Mentzer,

I modified your code into a stereo compression version. When I encode one of the two images, all stages after stage 0 look fine, but at stage 0 I only get an '@', and the two images are compressed with only 3 bits. The image has a size of 3x375x1242. You can see this in the figure below, where I added a breakpoint.

However, when I crop the image to 3x300x800, the problem disappears.

Then I tried compressing a single image of size 3x375x1242 with your L3C to see whether the same problem occurs, but it does not.

I don't know where the problem is. Do you have any idea?

torchac installed but still error torchac backend not found

python -c "import torchac" gives no error

but

python l3c.py ../../L3C 0524_0001 enc ../figs out.l3c

gives error torchac_backend not found.

gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

What is the positional argument log_dir for?

I keep getting this. I have an empty folder as log_dir, I am using the main model with ID 0524_0001, and the images to be tested are in the folder highres.

Error while decompressing

I am decoding with:

python l3c.py logdir 0306_0001 dec out.l3c decoded.png

The following exception appears:

ValueError: buffer is smaller than requested size

(full trace below)

I installed torchac for CPU.
For encoding I used:

python l3c.py logdir 0306_0001 enc imgdir/val_oi_500_r/0a7f13330a5d0023.png  out.l3c

Full trace:

Traceback (most recent call last):
  File "l3c.py", line 129, in <module>
    main()
  File "l3c.py", line 123, in main
    tester.decode(flags.img_p, flags.out_p_png)
  File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/test/multiscale_tester.py", line 405, in decode
    decoded = self.bc.decode(pin)
  File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 153, in decode
    bn_prev = self.decode_scale(dmll, l, fin)
  File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 264, in decode_scale
    bn, _ = self.code_with_cdf(l, (1, C, H, W), decoder, dmll)
  File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 291, in code_with_cdf
    decoded_bn[:, c, ...], extra_info_c = bn_coder(c, C_cond_cur)
  File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 254, in decoder
    num_bytes = read_num_bytes_encoded(fin)
  File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 352, in read_num_bytes_encoded
    return int(read_bytes(fin, [np.uint32])[0])
  File "/home/yonathan/anaconda3/envs/l3c_env/lib/python3.7/site-packages/fjcommon/functools_ext.py", line 32, in composed
    return f1(f2(*args_c, **kwargs_c))
  File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 375, in read_bytes
    yield np.frombuffer(f.read(num_bytes_to_read), t, count=1)
ValueError: buffer is smaller than requested size
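
np.frombuffer with count=1 raises this message when f.read returned fewer bytes than the dtype needs, so the decoder has most likely hit the end of out.l3c earlier than expected. Why that happens here (a truncated file, or an encoder and decoder that disagree on the layout) is not clear from the trace. A sketch of a checked read that fails with a clearer message, assuming a little-endian uint32 and using hypothetical helper names:

```python
import io
import struct

def read_exact(f, num_bytes):
    """Read exactly num_bytes or raise with a message that names the problem."""
    data = f.read(num_bytes)
    if len(data) != num_bytes:
        raise EOFError(f'expected {num_bytes} bytes, got {len(data)}: '
                       'file is truncated or the reader is out of sync')
    return data

def read_uint32(f):
    # Byte order is an assumption made for this sketch.
    return struct.unpack('<I', read_exact(f, 4))[0]

good = io.BytesIO(struct.pack('<I', 7))
value = read_uint32(good)
print(value)

try:
    read_uint32(io.BytesIO(b'\x01'))
    message = None
except EOFError as e:
    message = str(e)
print(message)
```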

Running time

Hi, thank you for your work. I ran l3c.py to measure the running time, and it takes about 6 s to encode or decode a 512x512 RGB image. Although running time varies across machines, 6 s seems too slow to be correct.

My machine should be fast enough: an RTX 3090, an Intel i9 CPU, and 32 GB of memory.
I compiled torchac with gcc-5.5, CPU only.
I use the 0306_0001 model.

I use the following commands.
Encode to out.l3c
python l3c.py /path/to/logdir 0306_0001 enc /path/to/img out.l3c
Decode from out.l3c, save to decoded.png
python l3c.py /path/to/logdir 0306_0001 dec out.l3c decoded.png

Could you give me some advice about the slow running time? Thank you in advance.

python .\setup.py install fails

Hi,
I'm currently trying to install torchac on my pc (windows 10) and I'm encountering this problem (for cuda_flag = auto, force and no).
I would really appreciate some help!

What is torchac.obj and where does it come from?

My setup is:
windows 10
python 3.7
torch 1.10.2+cu113
CUDA nvcc 11.3

PS F:\PhD\Compression\ANFIC\main_code\torchac> python .\setup.py install
Compiling, cuda_support=True
C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\lib\distutils\dist.py:274: UserWarning: Unknown distribution option: 'extra_compile_args'
  warnings.warn(msg)
running install
running bdist_egg
running egg_info
creating torchac_backend_gpu.egg-info
writing torchac_backend_gpu.egg-info\PKG-INFO
writing dependency_links to torchac_backend_gpu.egg-info\dependency_links.txt
writing top-level names to torchac_backend_gpu.egg-info\top_level.txt
writing manifest file 'torchac_backend_gpu.egg-info\SOURCES.txt'
reading manifest file 'torchac_backend_gpu.egg-info\SOURCES.txt'
writing manifest file 'torchac_backend_gpu.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_ext
C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\cpp_extension.py:316: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified
  warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'torchac_backend_gpu' extension
creating F:\PhD\Compression\ANFIC\main_code\torchac\build
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend
Emitting ninja build file F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.10.2.git.kitware.jobserver-1
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\lib.win-amd64-3.7
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\lib "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib/x64" /LIBPATH:C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\libs /LIBPATH:C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\PCbuild\amd64 "/LIBPATH:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.20348.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\\lib\10.0.20348.0\\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib cudart.lib c10_cuda.lib torch_cuda_cu.lib torch_cuda_cpp.lib /EXPORT:PyInit_torchac_backend_gpu F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend\torchac.obj F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend\torchac_kernel.obj /OUT:build\lib.win-amd64-3.7\torchac_backend_gpu.cp37-win_amd64.pyd /IMPLIB:F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend\torchac_backend_gpu.cp37-win_amd64.lib
LINK : fatal error LNK1181: cannot open input file 'F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend\torchac.obj'
error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.31.31103\\bin\\HostX86\\x64\\link.exe' failed with exit status 1181

Not able to learn the latent representations

Thank you for the excellent work.
In your paper, you stated that "by stopping gradients from propagating through the targets of our loss, we get significantly worse performance – in fact, the optimizer does not manage to pull down the cross-entropy of any of the learned representations z(s) significantly".
What do you mean by stopping gradients? Do you have a method to force the gradients to propagate?
I built my own model. The network is able to learn the RGB scale but has high losses on the lower scales.
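
In PyTorch terms, "stopping gradients through the targets" usually means calling detach() on the target tensor before it enters the loss. The sketch below shows the effect on toy tensors; the squared error only stands in for the cross-entropy loss of the paper, and the tensor names are illustrative.

```python
import torch

z = torch.tensor([0.3, 1.7], requires_grad=True)      # stands in for a learned representation
pred = torch.tensor([0.0, 2.0], requires_grad=True)   # stands in for the predictor output

# Without detach(): gradients flow into both the prediction and the target.
loss = ((pred - z) ** 2).sum()
loss.backward()
grad_with_flow = z.grad.clone()

# With detach(): z is cut out of the graph and receives no gradient.
z.grad = None
pred.grad = None
loss_stopped = ((pred - z.detach()) ** 2).sum()
loss_stopped.backward()
print(grad_with_flow, z.grad, pred.grad)
```

Letting gradients propagate therefore needs no special mechanism: it is the default whenever the target is used in the loss without detach().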

questions about the probability prediction part in the paper

Hi, thanks a lot for your work. I have two questions about the probability prediction part of your method:

1) L3C predicts a different conditional probability distribution p(x|f) for each image x and uses p to perform entropy coding. Am I right?
2) If I understand 1) correctly, the more accurate the estimated distribution p is, the shorter the code length after entropy coding. So how about counting the occurrences of each pixel value in an input image x and normalizing the counts to get a probability distribution p' of x over pixel values? We could then also use p' for entropy coding. What is the advantage of L3C over such a method?

Thanks in advance for your reply!
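
A small experiment may help frame question 2). The histogram p' is a single distribution for the whole image, so it cannot use the neighbourhood of a pixel, and it would also have to be sent to the decoder. The sketch below uses a toy 1D "image" and plain Python; it illustrates the role of context in general and is not L3C.

```python
import math
from collections import Counter

def histogram_bits_per_symbol(values):
    """Empirical zeroth-order entropy: the best rate for one fixed p' per image."""
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())

# A smooth ramp: every value 0..255 appears once, so the histogram is flat
# and p' needs 8 bits per pixel.
ramp = list(range(256))
flat_bits = histogram_bits_per_symbol(ramp)

# Given its left neighbour, each pixel is fully predictable: the residuals
# are all identical and cost 0 bits under their own histogram.
residuals = [b - a for a, b in zip(ramp, ramp[1:])]
context_bits = histogram_bits_per_symbol(residuals)
print(flat_bits, context_bits)
```

A model that predicts a separate distribution for every pixel from context can, in cases like this, reach rates far below what any single per-image histogram allows.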

No compression happened

I tested two of the pre-trained models on a directory of JPEG images of 50-70 KB each. The output images in the sample folder were 600-800 KB, so basically no compression happened. Am I doing something wrong?
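
Since JPEG is lossy and the model is lossless, comparing file sizes against the JPEG input is not like-for-like; the usual yardstick is the uncompressed pixel data. The sketch below computes bits per sub-pixel for a hypothetical 768x512 RGB image; the dimensions and the 700 KB and 60 KB sizes are placeholders chosen to resemble the numbers above, not measurements.

```python
def bits_per_subpixel(num_bytes, height, width, channels=3):
    """File size expressed in bits per colour value of the image."""
    return num_bytes * 8 / (height * width * channels)

# Hypothetical image size; the real dimensions are not given in the issue.
h, w = 512, 768
raw_bytes = h * w * 3                      # uncompressed 8-bit RGB

raw_rate = bits_per_subpixel(raw_bytes, h, w)
lossless_rate = bits_per_subpixel(700_000, h, w)   # a lossless file of about 700 KB
jpeg_rate = bits_per_subpixel(60_000, h, w)        # a lossy JPEG of about 60 KB
print(raw_rate, lossless_rate, jpeg_rate)
```

Under these assumed dimensions the lossless file is well below the 8 bits per sub-pixel of the raw data, even though it is much larger than the lossy JPEG.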

Size of compressed file

Hi there,

First of all, thank you for sharing the code!
After downloading the main L3C model, I tried compressing an example image with:

python l3c.py ./models/ 0306_0001 enc ./data/original/006239.png 006239.l3c

I was surprised to see that the compressed file is still quite large (882 KB):

Size of original image: 942 KB
Size of compressed file (006239.l3c): 882 KB
Size of reconstructed file: 874 KB (the decoded image appears to be lossless, as expected)

My question now would be: Is this compression rate normal?

Thanks a lot in advance!

Pip list:
awscli 1.19.83 botocore 1.20.83 certifi 2020.12.5 cffi 1.14.5 colorama 0.4.3 cycler 0.10.0 decorator 4.4.2 docutils 0.15.2 fasteners 0.14.1 fjcommon 0.2.10 imageio 2.9.0 jmespath 0.10.0 kiwisolver 1.3.1 matplotlib 2.2.2 mkl-fft 1.3.0 mkl-random 1.2.1 mkl-service 2.3.0 monotonic 1.6 networkx 2.5.1 numpy 1.20.2 olefile 0.46 Pillow 8.2.0 pip 21.1.1 protobuf 3.17.1 pyasn1 0.4.8 pycparser 2.20 pyparsing 2.4.7 python-dateutil 2.8.1 pytz 2021.1 PyWavelets 1.1.1 PyYAML 5.4.1 rsa 4.7.2 s3transfer 0.4.2 scikit-image 0.18.1 scipy 1.1.0 setuptools 52.0.0.post20210125 six 1.15.0 tensorboardX 1.2 tifffile 2021.4.8 torch 1.1.0 torchac-backend-cpu 1.0.0 torchvision 0.3.0 urllib3 1.26.5 wheel 0.36.2

run train.py error

Dear Dr. Mentzer,
I can't run "python train.py -h", but "python test.py -h" works well.
"python train.py -h" fails with:

  File "train.py", line 125, in <module>
    main(sys.argv[1:])
  File "train.py", line 91, in main
    flags = p.parse_args(args)
ValueError: unsupported format character 'O' (0x4f) at index 38

I also can't run "python train.py configs/ms/cr.cf configs/dl/oi.cf log_dir -p upsampling=deconv" to train on the dataset, but "python train.py configs/ms/cr.cf configs/dl/oi.cf log_dir" works.
Could you please help?
Thank you.
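
A common cause of this exact message with argparse is a help string containing a literal '%': argparse applies %-formatting to help strings when it renders -h, so a '%' followed by an unsupported character raises ValueError. Whether that is what happens in train.py is an assumption; the parser and the help text below are made up to reproduce the message.

```python
import argparse

def build_parser(help_text):
    p = argparse.ArgumentParser(prog='train.py')
    p.add_argument('--flag', help=help_text)
    return p

# A literal '%' followed by ' O' is parsed as a format specifier and fails.
try:
    build_parser('keep 50% Of the data').format_help()
    failed = False
except ValueError as e:
    failed = True
    print('ValueError:', e)

# Escaping it as '%%' renders a single '%' in the help output.
help_text = build_parser('keep 50%% Of the data').format_help()
print(help_text)
```

This would also explain why the error only shows up with -h: the help strings are formatted only when the help text is rendered.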
