fab-jul / l3c-pytorch
PyTorch Implementation of the CVPR'19 Paper "Practical Full Resolution Learned Lossless Image Compression"
License: GNU General Public License v3.0
Hi,
I'm currently working on Geometric Point Cloud Compression, and I'm really impressed by your approach. I have two questions I would like to ask you:
Thanks a lot,
Have a nice weekend!
Hello,
Thank you very much for sharing the code. It is very helpful. I have a question about your evaluation datasets. Is it possible that you can share both DIV2K and RAISE-1k datasets like you share Open Images for evaluation?
Thank you for your answer.
Dear Dr. Mentzer,
I have some questions about "def encode_uniform(self, dmll, S, fout):" and "encode_scale(self, scale, dmll, out, img, fout):"
Will add here what I find. For now:
logistic_mixture.py
Hello,
I have been trying to compress the image using this command:
python3 l3c.py --device cpu log_dir 0306_0001 enc ../../sample_data/img.png out.l3c
Output:
*** AC_NEEDS_CROP_DIM = 3000000
Using /root/.cache/torch_extensions as PyTorch extensions root...
Emitting ninja build file /root/.cache/torch_extensions/torchac_backend/build.ninja...
Building extension module torchac_backend...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module torchac_backend...
Traceback (most recent call last):
File "l3c.py", line 129, in <module>
main()
File "l3c.py", line 111, in main
parse_device_flag(flags.device)
File "l3c.py", line 46, in parse_device_flag
print(f'Status: '
AttributeError: module 'torchac.torchac' has no attribute 'CUDA_SUPPORTED'
I have installed torchac using pip from https://github.com/fab-jul/torchac:
Requirement already satisfied: torchac in /usr/local/lib/python3.7/dist-packages (0.9.3)
Thank you for your help in advance.
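One way to narrow down an error like this is to check which torchac files Python actually picks up, since a pip-installed torchac and the copy bundled with this repository may differ. The following is a minimal diagnostic sketch using only the standard library; the helper name `module_origin` is hypothetical and not part of L3C or torchac.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Module names taken from the logs and import code quoted on this page.
for name in ('torchac', 'torchac_backend_cpu', 'torchac_backend_gpu'):
    print(name, '->', module_origin(name))
```

If `torchac` resolves to a site-packages path rather than the repository's source tree, the attribute lookup is likely hitting a different module than l3c.py expects.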
Thanks for sharing this great repo!
I have cloned the repo and downloaded the pre-trained model. The pre-trained file was unzipped, and saved in 'root/src/log_dir/0524_0001/ckpts/ckpt_0000684250.pt'
When I ran the following command,
python test.py log_dir 0524_0001 ~/Data/kodak
I got the following error:
'AssertionError: Expected a base dir for every component, got ['0524_0001'] and ['configs/ms', 'configs/dl']'
I'm not sure whether this is specific to my setup. Do you have any suggestions?
Hi, I'm curious about how the CDF is computed in your code. There are L quantization levels (default 25 in your code), so why does the CDF have length L+1 for each quantized value?
For the uniform distribution, you seem to prepend zeros along the newly added dimension, like:
# here cdf's shape is [N, H, W, L]
cdf = torch.cumsum(pr, -1)
cdf = cdf.mul_(2**precision)
cdf = cdf.round()
# here cdf's shape changes to [N, H, W, L+1]
cdf = torch.cat((torch.zeros((N, H, W, 1), dtype=cdf.dtype, device=cdf.device),
cdf), dim=-1)
and for the mixture of logistics, you seem to define new quantization levels, like:
# Lp = L+1
# here new quantization levels are defined
self.targets = torch.linspace(dmll.x_min - dmll.bin_width / 2,
dmll.x_max + dmll.bin_width / 2,
dmll.L + 1, dtype=torch.float32, device=l.device)
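A small illustration of why L symbols need L+1 CDF entries: an arithmetic coder assigns symbol i the interval [cdf[i], cdf[i+1]), so it needs L+1 boundaries, starting at 0. This is a plain-Python sketch of that idea, not the repository's tensor code; the function names are made up for the example.

```python
from itertools import accumulate


def pmf_to_cdf(pmf, precision=16):
    """Turn an L-entry PMF into L+1 integer interval boundaries, starting at 0."""
    scale = 2 ** precision
    return [0] + [round(c * scale) for c in accumulate(pmf)]


def symbol_interval(cdf, symbol):
    """Symbol i owns the half-open interval [cdf[i], cdf[i+1])."""
    return cdf[symbol], cdf[symbol + 1]


pmf = [0.5, 0.25, 0.25]              # L = 3 symbols
cdf = pmf_to_cdf(pmf, precision=4)   # L + 1 = 4 boundaries
print(cdf)                           # [0, 8, 12, 16]
print(symbol_interval(cdf, 2))       # (12, 16)
```

Without the leading zero, the first symbol would have no lower boundary to read.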
Hello, Fabian:
When I try to install torchac, I get the following error:
(pytorch) [wency@localhost torchac]$ COMPILE_CUDA=no python setup.py
Compiling, cuda_support=False
/home/wency/anaconda3/envs/pytorch/lib/python3.7/distutils/dist.py:274: UserWarning: Unknown distribution option: 'extra_compile_args'
warnings.warn(msg)
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help
error: no commands supplied
My environment is gcc 4.8.5, nvcc 8.0, python 3.7, pytorch 1.0.
In the RGBHeader I found
edsr.MeanShift(0, (0., 0., 0.), (128., 128., 128.))
while the pixel values of the images coming from the dataloader are between 0 and 255. What is the purpose of this layer, given that it only rescales the values to the range 0-2?
And how should I set the mean and std if I use a dataset whose images are 12-bit?
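To make the question concrete, here is a minimal sketch of the rescaling it describes, assuming the layer reduces to (pixel - mean) / std with mean 0 and std 128. The 12-bit value std=2048 is only a hypothetical choice that keeps the same output range; it is not confirmed by the author or the repository.

```python
def normalize(pixel, mean=0.0, std=128.0):
    """Affine normalization (pixel - mean) / std for a single value."""
    return (pixel - mean) / std


# 8-bit input with the values from the question: range [0, 255] maps to [0, ~1.99].
print(normalize(0), normalize(255))

# 12-bit input with a HYPOTHETICAL std of 2048: range [0, 4095] maps to [0, ~2.0].
print(normalize(0, std=2048.0), normalize(4095, std=2048.0))
```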
I used the command below:
python l3c.py --device=cpu /home/tt/workspace/L3C-PyTorch/logs 0306_0001 enc /home/longmao/workspace/L3C-PyTorch/2.png out1.l3c
*** AC_NEEDS_CROP_DIM = 3000000
Status: torchac-backend-gpu available: False // torchac-backend-cpu available: True // CUDA available: True
*** Using torchac-backend-cpu; did set CUDA_AVAILABLE=False and DEVICE=cpu
Testing 0306_0001 at -1 ---
Restoring /home/tt/workspace/L3C-PyTorch/logs/0306_0001 cr oi/ckpts/ckpt_0001002750.pt... (strict=True)
*** WARN: Will discard 4th (alpha) channel.
Killed
After fixing the CPU version on my setup, GPU encoding/decoding is still not error-free.
I could verify that the test.py script works.
The only major difference between the scripts is the MultiscaleTester l3c argument, so I assume this affects the results.
Edit: It is not the l3c argument. I am building a standalone compression script based on test.py, successively removing parts not needed for compression.
When I trained on the CLIC dataset, the loss stayed at 5. Is it necessary to readjust parameters such as the learning rate?
Hi
I don't understand how the program encodes symbols with arithmetic coding (AC) using only the CDF, without a PMF. Can you give me an example of the encoding?
Thank you
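A toy example may help here. The CDF already contains the PMF, since the probability of symbol s is cdf[s+1] - cdf[s], so a coder only needs the boundaries. The sketch below is an exact-arithmetic teaching version using Python fractions, not the range coder in torchac; it assumes every encoded symbol has non-zero probability, and all names are made up.

```python
from bisect import bisect_right
from fractions import Fraction


def encode(symbols, cdf):
    """Narrow [low, low + width) once per symbol; return a number inside the final interval."""
    total = cdf[-1]
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        low += width * Fraction(cdf[s], total)           # move to the symbol's lower boundary
        width *= Fraction(cdf[s + 1] - cdf[s], total)    # shrink by the symbol's probability
    return low + width / 2


def decode(code, cdf, num_symbols):
    """Invert encode(): find which symbol interval contains the code, then rescale."""
    total = cdf[-1]
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(num_symbols):
        target = (code - low) / width * total    # position of the code in [0, total)
        s = bisect_right(cdf, target) - 1        # largest s with cdf[s] <= target
        out.append(s)
        low += width * Fraction(cdf[s], total)
        width *= Fraction(cdf[s + 1] - cdf[s], total)
    return out


cdf = [0, 8, 12, 16]          # three symbols with probabilities 1/2, 1/4, 1/4
message = [2, 0, 1, 1, 0]
code = encode(message, cdf)
print(decode(code, cdf, len(message)) == message)   # True
```

A practical coder does the same narrowing with fixed-precision integers and emits bytes as the interval shrinks.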
I've managed to successfully use your torchac implementation in a setting like Balle's hyperprior. For that, I am creating a CDF table of all possible values for each channel (which is normalized with _renorm_cast_cdf_ to create cdf_tbl), shifting up and rounding the hyperprior to be encoded (z_int), and encoding one channel at a time (with its own cdf_tbl). The decoded bytes string matches z_int and the total number of bytes is as low as expected.
This creates a bytes string per channel. Do you know of a straightforward way to find the End-Of-Message of these bytes s.t. I can simply store a long series of bytes (without explicitly encoding the number of bytes per channel)? It seems that the decoder is not affected by additional bytes (so I could in theory brute-force it s.t. I separate the bytes when the output of the decoder no longer changes after adding additional bytes; i.e., no need to store additional EOMs). I am reading up on how the range encoder works and on your C++ code, and in the meantime I am posting this in case you have a simpler answer.
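For comparison, the simplest alternative to finding an End-Of-Message marker is to store the lengths explicitly: prefix each channel's bytes with a 4-byte length. A traceback elsewhere on this page shows the repository reading a uint32 byte count before each chunk (read_num_bytes_encoded), which looks like the same idea. The sketch below is a standalone illustration with hypothetical function names and an assumed little-endian layout, not the repository's file format.

```python
import io
import struct


def write_chunks(fout, chunks):
    """Write each bytes object preceded by its length as a little-endian uint32."""
    for chunk in chunks:
        fout.write(struct.pack('<I', len(chunk)))
        fout.write(chunk)


def read_chunks(fin):
    """Read length-prefixed chunks until the stream is exhausted."""
    chunks = []
    while True:
        header = fin.read(4)
        if not header:
            return chunks
        (num_bytes,) = struct.unpack('<I', header)
        chunks.append(fin.read(num_bytes))


per_channel = [b'\x01\x02\x03', b'', b'\xff' * 5]   # stand-ins for encoded channels
buffer = io.BytesIO()
write_chunks(buffer, per_channel)
buffer.seek(0)
print(read_chunks(buffer) == per_channel)           # True
```

The cost is 4 bytes per channel; a smaller fixed or variable-length integer would reduce that if the chunks are known to be short.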
At the moment, torchac.py has the following import logic:
import sys

try:
    # TODO: somehow this succeeds even if only _cpu is installed
    import torchac_backend_gpu as torchac_backend
    CUDA_SUPPORTED = True
except ImportError:
    CUDA_SUPPORTED = False
    # Try importing the cpu version
    try:
        import torchac_backend_cpu as torchac_backend
    except ImportError:
        print('*** ERROR: torchac_backend not found. Please see README.')
        sys.exit(1)
However, this has a bug and is also not a nice design.
The bug: if you compile torchac with GPU support, uninstall it with pip, and then install it with CPU support without doing a clean build, the torchac_backend_gpu library will still be installed. This means that the first import succeeds even though pip list shows torchac_backend_cpu.
More generally, the design is not nice because CUDA support should be decided by the usage, not by the import. In encode_logistic_mixture, either torchac_backend_gpu or torchac_backend_cpu should be selected based on whether the tensors are on the CPU or the GPU.
I am running into a segmentation fault when I try to compress a 16-bit image. The problem is in the arithmetic coding. Any help is appreciated.
Previously, our preprocessing script saved all training images and validation images as JPGs with a high quality factor of Q=95, downscaled by a factor 0.75. It turns out that the resulting images have a specific enough distribution that the neural network picks up on it, and the images are also easier to compress for the non-learned codecs.
For correctness, we have thus re-created the training and validation sets. The new preprocessing script is available in the repo. The important differences are:
This causes all results to shift. However, as before, we still outperform WebP, JPEG-2000, and PNG, i.e., the ordering of the methods according to bpp remains unchanged.
We evaluated our model on 500 images randomly selected from the Open Images validation set, and preprocessed like the training data. To compare, please download Open Images evaluation set here.
Available here https://arxiv.org/abs/1811.12817v3.
Group | Method | Open Images [bpp] | DIV2K [bpp] | RAISE-1k [bpp]
---|---|---|---|---
Ours | L3C v3 | 2.990597132 | 3.093768752 | 2.386501087
Learned Baselines | RGB Shared | 4.313588005 | 4.429001861 | 3.779201962
Learned Baselines | RGB | 3.297824781 | 3.418117799 | 2.572320659
Non-Learned Approaches | PNG | 4.004512908 | 4.234729262 | 3.556403138
Non-Learned Approaches | JPEG2000 | 3.054759549 | 3.126744435 | 2.46459739
Non-Learned Approaches | WebP | 3.047477818 | 3.176081706 | 2.461481317
Non-Learned Approaches | FLIF | 2.866778476 | 2.910950783 | 2.084036243
Merged into master.
Should somehow write to the bitstream that we padded when encoding.
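One possible way to do this, sketched under assumptions: store the original height and width in a small header so the decoder can crop the padding away. The header layout (two little-endian uint32 values), the function names, and the padding multiple of 8 are all hypothetical and do not describe the repository's bitstream.

```python
import io
import struct


def pad_amount(size, multiple):
    """Number of rows or columns to add so that size becomes divisible by multiple."""
    return (-size) % multiple


def write_shape(fout, height, width):
    """Store the unpadded shape so the decoder knows where to crop."""
    fout.write(struct.pack('<II', height, width))


def read_shape(fin):
    return struct.unpack('<II', fin.read(8))


height, width = 375, 1242     # image size mentioned in another issue on this page
multiple = 8                  # ASSUMED padding multiple
padded = (height + pad_amount(height, multiple), width + pad_amount(width, multiple))
print(padded)                 # (376, 1248)

stream = io.BytesIO()
write_shape(stream, height, width)
stream.seek(0)
print(read_shape(stream))     # (375, 1242)
```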
Dear Dr. Mentzer,
I modified your code into a stereo compression version. When I try to encode one of the two images, the stages after 0 are all fine, but at stage 0 I only get an '@', and the two images are compressed with 3 bits. The image has a size of 3x375x1242. You can see this in the figure below, where I added a breakpoint.
However, when I crop the image to a size of 3x300x800, the problem disappears.
Then I tried your original L3C on a single image of size 3x375x1242 to see whether the same problem occurs, but it does not.
I don't know where the problem is. Do you have any idea?
python -c "import torchac" gives no error
but
python l3c.py ../../L3C 0524_0001 enc ../figs out.l3c
gives error torchac_backend not found.
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Decoding using:
python l3c.py logdir 0306_0001 dec out.l3c decoded.png
The following exception appears:
ValueError: buffer is smaller than requested size
(full trace below)
The file was encoded using:
python l3c.py logdir 0306_0001 enc imgdir/val_oi_500_r/0a7f13330a5d0023.png out.l3c
Full trace:
Traceback (most recent call last):
File "l3c.py", line 129, in <module>
main()
File "l3c.py", line 123, in main
tester.decode(flags.img_p, flags.out_p_png)
File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/test/multiscale_tester.py", line 405, in decode
decoded = self.bc.decode(pin)
File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 153, in decode
bn_prev = self.decode_scale(dmll, l, fin)
File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 264, in decode_scale
bn, _ = self.code_with_cdf(l, (1, C, H, W), decoder, dmll)
File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 291, in code_with_cdf
decoded_bn[:, c, ...], extra_info_c = bn_coder(c, C_cond_cur)
File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 254, in decoder
num_bytes = read_num_bytes_encoded(fin)
File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 352, in read_num_bytes_encoded
return int(read_bytes(fin, [np.uint32])[0])
File "/home/yonathan/anaconda3/envs/l3c_env/lib/python3.7/site-packages/fjcommon/functools_ext.py", line 32, in composed
return f1(f2(*args_c, **kwargs_c))
File "/home/yonathan/Documents/GitHub/L3C-PyTorch/src/bitcoding/bitcoding.py", line 375, in read_bytes
yield np.frombuffer(f.read(num_bytes_to_read), t, count=1)
ValueError: buffer is smaller than requested size
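np.frombuffer raises this ValueError when it is handed fewer bytes than the requested count needs, so the trace suggests (without proving it) that f.read returned less than 4 bytes, for example because the file ended early. A read helper that checks the length makes this kind of failure easier to diagnose. The sketch below uses the standard library instead of NumPy; the function names and the little-endian layout are assumptions.

```python
import io
import struct


def read_exact(f, num_bytes):
    """Read exactly num_bytes or raise a descriptive EOFError."""
    data = f.read(num_bytes)
    if len(data) != num_bytes:
        raise EOFError(f'expected {num_bytes} bytes, got {len(data)}: file truncated?')
    return data


def read_uint32(f):
    """Read one little-endian uint32 (byte order assumed for this sketch)."""
    return struct.unpack('<I', read_exact(f, 4))[0]


print(read_uint32(io.BytesIO(struct.pack('<I', 1234))))   # 1234

try:
    read_uint32(io.BytesIO(b'\x01\x02'))                  # only 2 of 4 bytes available
except EOFError as e:
    print('short read:', e)
```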
Hi, thank you for your work. I ran l3c.py to measure the running time and found that it takes about 6s to encode/decode a 512x512 RGB image. Although running time varies across machines, 6s seems too slow to be correct.
My machine should be fast enough, with an RTX 3090, an Intel i9 CPU, and 32 GB of memory.
I compiled torchac with gcc-5.5, CPU only.
I use the 0306_0001 model.
I use the following command lines.
Encode to out.l3c
python l3c.py /path/to/logdir 0306_0001 enc /path/to/img out.l3c
Decode from out.l3c, save to decoded.png
python l3c.py /path/to/logdir 0306_0001 dec out.l3c decoded.png
Could you give me some advice about the slow running time? Thank you in advance.
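When timing from the command line, each invocation also pays for interpreter startup, loading the torchac extension, and restoring the checkpoint (the logs on this page show those steps), so the 6s total may not reflect the coding time alone. A hedged sketch of how the phases could be timed separately inside a script follows; the `timed` helper is hypothetical and the two workloads are placeholders, not L3C calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


results = {}
with timed('setup', results):
    sum(range(100_000))    # placeholder for model loading
with timed('encode', results):
    sum(range(10_000))     # placeholder for the actual encoding call

for label, seconds in results.items():
    print(f'{label}: {seconds:.4f}s')
```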
Hi,
I'm currently trying to install torchac on my PC (Windows 10) and I'm encountering this problem (for cuda_flag = auto, force and no).
I would really appreciate some help!
What is torchac.obj and where does it come from?
My setup is :
windows 10
python 3.7
torch 1.10.2+cu113
CUDA nvcc 11.3
PS F:\PhD\Compression\ANFIC\main_code\torchac> python .\setup.py install
Compiling, cuda_support=True
C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\lib\distutils\dist.py:274: UserWarning: Unknown distribution option: 'extra_compile_args'
warnings.warn(msg)
running install
running bdist_egg
running egg_info
creating torchac_backend_gpu.egg-info
writing torchac_backend_gpu.egg-info\PKG-INFO
writing dependency_links to torchac_backend_gpu.egg-info\dependency_links.txt
writing top-level names to torchac_backend_gpu.egg-info\top_level.txt
writing manifest file 'torchac_backend_gpu.egg-info\SOURCES.txt'
reading manifest file 'torchac_backend_gpu.egg-info\SOURCES.txt'
writing manifest file 'torchac_backend_gpu.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_ext
C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\cpp_extension.py:316: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'torchac_backend_gpu' extension
creating F:\PhD\Compression\ANFIC\main_code\torchac\build
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend
Emitting ninja build file F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.10.2.git.kitware.jobserver-1
creating F:\PhD\Compression\ANFIC\main_code\torchac\build\lib.win-amd64-3.7
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\lib "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib/x64" /LIBPATH:C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\libs /LIBPATH:C:\Users\Jeremy\AppData\Local\Programs\Python\Python37\PCbuild\amd64 "/LIBPATH:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.20348.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\\lib\10.0.20348.0\\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib cudart.lib c10_cuda.lib torch_cuda_cu.lib torch_cuda_cpp.lib /EXPORT:PyInit_torchac_backend_gpu F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend\torchac.obj F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend\torchac_kernel.obj /OUT:build\lib.win-amd64-3.7\torchac_backend_gpu.cp37-win_amd64.pyd /IMPLIB:F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend\torchac_backend_gpu.cp37-win_amd64.lib
LINK : fatal error LNK1181: cannot open input file 'F:\PhD\Compression\ANFIC\main_code\torchac\build\temp.win-amd64-3.7\Release\PhD\Compression\ANFIC\main_code\torchac\torchac_backend\torchac.obj'
error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.31.31103\\bin\\HostX86\\x64\\link.exe' failed with exit status 1181
Thank you for the excellent work.
In your paper, you stated that "by stopping gradients from propagating through the targets of our loss, we get significantly worse performance – in fact, the optimizer does not manage to pull down the cross-entropy of any of the learned representations z(s) significantly".
What do you mean by stopping gradients? Do you have a method to force the gradients to propagate?
I built my own model; the network is able to learn the RGB scale but has high losses on the lower scales.
Hi, thanks a lot for your work. I have two questions about the probability prediction part of your method:
1) L3C predicts a different conditional probability distribution p(x|f) for each image x and uses p for entropy coding, am I right?
2) If I understand 1) correctly, the more accurate the estimated distribution p is, the shorter the code length after entropy coding. So how about counting the occurrences of each pixel value in an input image x and normalizing the counts to get a probability distribution p' of x over pixel values? We could then also use p' for entropy coding. What is the advantage of L3C over such a method?
Thanks in advance for your reply!
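To make the comparison in 2) concrete: coding with a per-image histogram p' costs about the zeroth-order empirical entropy per pixel value, no matter how predictable each value is from its neighbours, and the histogram itself must also be transmitted. The sketch below computes that entropy for a toy signal; it is an illustration with a made-up function name, not part of L3C.

```python
from collections import Counter
from math import log2


def empirical_entropy_bits(symbols):
    """Zeroth-order entropy in bits per symbol of the normalized histogram p'."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * log2(c / n) for c in counts.values())


# A perfectly regular pattern: every value is determined by the previous one.
stripes = [0, 255] * 8
print(empirical_entropy_bits(stripes))   # 1.0
```

The histogram sees two equally likely values and spends 1 bit on each, whereas a conditional model that takes context into account could predict every value after the first almost for free.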
I tested two of the pre-trained models on a directory of JPEG images of size 50-70KB. The output images in the sample folder were 600-800KB, so basically no compression. Am I doing something wrong?
Hi there,
first of all thank you for sharing the code!
After I downloaded the Main Model L3C I tried out the compression of an example image with:
python l3c.py ./models/ 0306_0001 enc ./data/original/006239.png 006239.l3c
I was surprised to see that the compressed file is still quite large (882KB):
Size original image: 942 KB
Size compressed file(006239.l3c): 882 KB
Size of reconstructed file: 874 KB (the decoded .l3c file appears to be lossless, as expected)
My question now would be: Is this compression rate normal?
Thanks a lot in advance!
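For reference, the numbers in this report can be put on a common scale. The original here is a PNG, so the ratio below compares two lossless codecs rather than compressed against raw pixels. Bits per pixel, the unit used in the results table, additionally needs the image dimensions, which are not given in the report. The helper names are made up for this sketch.

```python
def compression_ratio(compressed_size, original_size):
    """Compressed size as a fraction of the original size (same unit for both)."""
    return compressed_size / original_size


def bits_per_pixel(num_bytes, height, width):
    """File size in bits divided by the number of pixels."""
    return 8 * num_bytes / (height * width)


ratio = compression_ratio(882, 942)   # sizes in KB from the report
print(f'{ratio:.3f}')                 # 0.936
```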
Pip list:
awscli 1.19.83
botocore 1.20.83
certifi 2020.12.5
cffi 1.14.5
colorama 0.4.3
cycler 0.10.0
decorator 4.4.2
docutils 0.15.2
fasteners 0.14.1
fjcommon 0.2.10
imageio 2.9.0
jmespath 0.10.0
kiwisolver 1.3.1
matplotlib 2.2.2
mkl-fft 1.3.0
mkl-random 1.2.1
mkl-service 2.3.0
monotonic 1.6
networkx 2.5.1
numpy 1.20.2
olefile 0.46
Pillow 8.2.0
pip 21.1.1
protobuf 3.17.1
pyasn1 0.4.8
pycparser 2.20
pyparsing 2.4.7
python-dateutil 2.8.1
pytz 2021.1
PyWavelets 1.1.1
PyYAML 5.4.1
rsa 4.7.2
s3transfer 0.4.2
scikit-image 0.18.1
scipy 1.1.0
setuptools 52.0.0.post20210125
six 1.15.0
tensorboardX 1.2
tifffile 2021.4.8
torch 1.1.0
torchac-backend-cpu 1.0.0
torchvision 0.3.0
urllib3 1.26.5
wheel 0.36.2
Dear Dr. Mentzer,
I can't run "python train.py -h", but "python test.py -h" works well.
"python train.py -h" fails with:
File "train.py", line 125, in
main(sys.argv[1:])
File "train.py", line 91, in main
flags = p.parse_args(args)
ValueError: unsupported format character 'O' (0x4f) at index 38
I also can't run "python train.py configs/ms/cr.cf configs/dl/oi.cf log_dir -p upsampling=deconv" to train on the dataset, but "python train.py configs/ms/cr.cf configs/dl/oi.cf log_dir" works.
Please answer my question.
Thank you.
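One common cause of this exact message, assuming train.py builds its parser with argparse, is a literal % in a help string: argparse applies %-formatting to help texts when rendering -h, so a bare % followed by a letter such as O is read as a format specifier. Whether this is what happens in train.py is not confirmed here. The sketch below reproduces the behaviour with a made-up parser and option name.

```python
import argparse


def help_renders(help_text):
    """Return True if argparse can render a help page containing help_text."""
    try:
        parser = argparse.ArgumentParser(prog='demo')
        parser.add_argument('--ratio', help=help_text)   # hypothetical option
        parser.format_help()
        return True
    except ValueError:
        return False


print(help_renders('keep 50%Of the images'))     # False: '%O' is not a valid specifier
print(help_renders('keep 50%% of the images'))   # True: '%%' renders as a literal '%'
```

If that is the cause, escaping the percent sign as %% in the offending help string should let -h work again.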