mingsun-tse / assl

[NeurIPS'21 Spotlight] Aligned Structured Sparsity Learning for Efficient Image Super-Resolution (PyTorch)

Python 86.31% Shell 13.69%
pruning super-resolution neural-network-pruning model-compression filter-pruning

assl's Introduction

Hi there 👋

I am a Ph.D. candidate at the SMILE Lab of Northeastern University (Boston, USA). Before that, I spent seven wonderful years at Zhejiang University (Hangzhou, China), where I earned my B.E. and M.S. degrees.

I am interested in a variety of topics in computer vision and machine learning. My research centers on efficient deep learning (a.k.a. model compression), spanning from the common image classification task (GReg, Awesome-PaI, TPP) to neural style transfer (Collaborative-Distillation), single image super-resolution (ASSL, SRP), and 3D novel view synthesis (R2L, MobileR2L).

I do my best to make my research easily reproducible.

🔥 NEWS: [NeurIPS'23] We are excited to present SnapFusion, a super-efficient mobile diffusion model that can do text-to-image generation in less than 2s 🚀 on mobile devices! [Arxiv] [Webpage]
🔥 NEWS: [CVPR'23] Check out our new blazing-fast 🚀 neural rendering model for mobile devices: MobileR2L (the lightweight version of R2L), which can render 1008x756 images at 56fps on iPhone 13 [Arxiv] [Code]
🔥 NEWS: [ICLR'23] Check out the very first trainability-preserving filter pruning method: TPP [Arxiv] [Code]
🔥 NEWS: Check out our preprint that deciphers the confusing benchmark situation in neural network (filter) pruning: [Arxiv] [Code]
✨ NEWS: Check out our investigation of what makes a "good" data augmentation in knowledge distillation, in NeurIPS 2022: [Webpage] [Code]
✨ NEWS: Check out our Efficient NeRF project via distillation, in ECCV 2022: [R2L]

Github stats

assl's People

Contributors

mingsun-tse, yulunzhang


assl's Issues

Questions on Data Preparation

Hello and thanks for your amazing work!
When I tried to reproduce the paper results, I ran into some trouble binarizing the DF2K data:

data/DF2K/bin/DF2K_train_LR_bicubic/X4/3548x4.pt does not exist. Now making binary...
Direct pt file without name or image
data/DF2K/bin/DF2K_train_LR_bicubic/X4/3549x4.pt does not exist. Now making binary...
Direct pt file without name or image
data/DF2K/bin/DF2K_train_LR_bicubic/X4/3550x4.pt does not exist. Now making binary...
Direct pt file without name or image
data/DF2K/bin/DF2K_train_HR/3551.pt does not exist. Now making binary...
Traceback (most recent call last):
...
FileNotFoundError: No such file: '/home/nfs_data/shixiangsheng/projects/ModelCompression/Prune/ASSL/src/data/DF2K/DF2K_train_HR/3551.png'
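
For context, the "Now making binary" step in EDSR-style data loaders typically just decodes each PNG once and caches it as a pickled .pt file so later epochs skip image decoding, which means the FileNotFoundError above indicates the source PNG the binarizer was asked to cache does not exist. A rough sketch of that caching step (not necessarily this repo's exact code):

import os
import pickle
import imageio

def make_binary(png_path, pt_path):
    # Decode the PNG once and pickle the raw array as a .pt cache file.
    os.makedirs(os.path.dirname(pt_path), exist_ok=True)
    img = imageio.imread(png_path)   # raises FileNotFoundError if the PNG is missing
    with open(pt_path, 'wb') as f:
        pickle.dump(img, f)

make_binary('data/DF2K/DF2K_train_HR/3551.png',
            'data/DF2K/bin/DF2K_train_HR/3551.pt')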

I created dirs like this:

data
|__DF2K
   |__DF2K_train_HR
   |__DF2K_train_LR_bicubic

I put '0001.png' - '0900.png' from ./data/DIV2K/DIV2K_train_HR and '000001.png' - '002650.png' (renamed to '0901.png' - '3550.png') from ./data/Flickr2K/Flickr2K_HR into ./DF2K/DF2K_train_HR. As for the downsampled images, I created folders named 'X2', 'X3', 'X4' under ./DF2K/DF2K_train_LR_bicubic and copied the corresponding images from DIV2K_train_LR_bicubic and Flickr2K_LR_bicubic (renamed to '0001x_.png' - '3550x_.png').
The first and second stages of binarization (binarizing the HR images and the X4 LR images) seemed fine, but then the above error emerged. This is kind of weird, since there are 900 + 2650 training images in total, and I have no idea why it went back to binarizing the HR images after binarizing the X4 LR images.
I'm new to SR and have tried to look up the DF2K data preparation in other SR repos, but in vain. I wonder how you actually got the DF2K images binarized. Thanks for your help in advance XD
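
For reference, the renaming described above can be scripted. Below is a minimal sketch; the paths and the 0901-3550 index offset are assumptions taken from this issue, not an official preparation script:

import shutil
from pathlib import Path

div2k_hr = Path("data/DIV2K/DIV2K_train_HR")      # 0001.png - 0900.png
flickr_hr = Path("data/Flickr2K/Flickr2K_HR")     # 000001.png - 002650.png
df2k_hr = Path("data/DF2K/DF2K_train_HR")
df2k_hr.mkdir(parents=True, exist_ok=True)

# DIV2K images keep their original indices 0001-0900.
for img in sorted(div2k_hr.glob("*.png")):
    shutil.copy(img, df2k_hr / img.name)

# Flickr2K indices are shifted by 900, so 000001.png - 002650.png become 0901.png - 3550.png.
for img in sorted(flickr_hr.glob("*.png")):
    shutil.copy(img, df2k_hr / f"{int(img.stem) + 900:04d}.png")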

Attempt to train my own model

Hello, dear developer. I am very interested in your project and want to use your code to train a super-resolution model of my own. Can I directly replace the images in the DF2K dataset while keeping the same file names? Or what else do I need to modify?

Questions about implementation detail

Hello, I have some questions about the implementation details.

The HR-LR data pairs are generated with the down-sampling code provided in BasicSR. The training data is DF2K (900 DIV2K + 2650 Flickr2K), and the test data is Set5.
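
As a reference for the data-generation step above, the bicubic LR images can be produced roughly as follows. This is only an illustrative sketch using Pillow's bicubic kernel, which differs slightly from the MATLAB-style bicubic used in BasicSR, and the paths are assumptions:

from pathlib import Path
from PIL import Image

def make_lr(hr_dir, lr_dir, scale):
    # Downsample every HR image bicubically to create its LR counterpart.
    lr_dir = Path(lr_dir)
    lr_dir.mkdir(parents=True, exist_ok=True)
    for hr_path in sorted(Path(hr_dir).glob("*.png")):
        hr = Image.open(hr_path)
        w, h = hr.size
        # Crop so the size is divisible by the scale, then downsample.
        hr = hr.crop((0, 0, w - w % scale, h - h % scale))
        lr = hr.resize((hr.width // scale, hr.height // scale), Image.BICUBIC)
        lr.save(lr_dir / f"{hr_path.stem}x{scale}.png")

for s in (2, 3, 4):
    make_lr("data/DF2K/DF2K_train_HR", f"data/DF2K/DF2K_train_LR_bicubic/X{s}", s)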

I ran this command to prune the EDSR_16_256 model to EDSR_16_48. Compared to the officially provided command, only the pruning ratio and the save path are modified.

Prune from 256 to 48, pr=0.8125, x2, ASSL

python main.py --model LEDSR --scale 2 --patch_size 96 --ext sep --dir_data /home/notebook/data/group_cpfs/wurongyuan/data/data \
--data_train DF2K --data_test DF2K --data_range 1-3550/3551-3555 --chop --save_results --n_resblocks 16 --n_feats 256 \
--method ASSL --wn --stage_pr [0-1000:0.8125] --skip_layers *mean*,*tail* \
--same_pruned_wg_layers model.head.0,model.body.16,*body.2 --reg_upper_limit 0.5 --reg_granularity_prune 0.0001 \
--update_reg_interval 20 --stabilize_reg_interval 43150 --pre_train pretrained_models/LEDSR_F256R16BIX2_DF2K_M311.pt \
--same_pruned_wg_criterion reg --save main/SR/LEDSR_F256R16BIX2_DF2K_ASSL_0.8125_RGP0.0001_RUL0.5_Pretrain_06011101
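
(For reference, the pruning ratio matches the channel counts: pr = 1 - 48/256 = 0.8125, so 256 x (1 - 0.8125) = 48 filters per pruned layer are kept.)
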
Results
model_just_finished_prune ---> 33.739dB
fine-tuning after one epoch ---> 37.781dB
fine-tuning after 756 epoch ---> 37.940dB

The result (37.940dB) I obtained with the officially provided code still has a certain gap from the result in the paper (38.12dB). I may have overlooked some details.

I also compared against the L1-norm method provided in the code.
Prune from 256 to 48, pr=0.8125, x2, L1

python main.py --model LEDSR --scale 2 --patch_size 96 --ext sep --dir_data /home/notebook/data/group_cpfs/wurongyuan/data/data \
--data_train DF2K --data_test DF2K --data_range 1-3550/3551-3555 --chop --save_results --n_resblocks 16 --n_feats 256 \
--method L1 --wn --stage_pr [0-1000:0.8125] --skip_layers *mean*,*tail* \
--same_pruned_wg_layers model.head.0,model.body.16,*body.2 --reg_upper_limit 0.5 --reg_granularity_prune 0.0001 \
--update_reg_interval 20 --stabilize_reg_interval 43150 --pre_train pretrained_models/LEDSR_F256R16BIX2_DF2K_M311.pt \
--same_pruned_wg_criterion reg --save main/SR/LEDSR_F256R16BIX2_DF2K_L1_0.8125_06011101

Results

model_just_finished_prune ---> 13.427dB
fine-tuning after one epoch ---> 33.202dB
fine-tuning after 756 epoch ---> 37.933dB

The difference between the results of the L1-norm method and those of ASSL seems negligible at this pruning ratio (256 -> 48).
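
For readers unfamiliar with the baseline, the L1-norm criterion simply ranks filters by the L1 norm of their weights and keeps the largest ones. A rough sketch of that criterion (illustrative only, not the repo's exact implementation):

import torch

def l1_keep_indices(conv_weight, n_keep):
    # conv_weight: (out_channels, in_channels, kH, kW)
    scores = conv_weight.abs().sum(dim=(1, 2, 3))   # L1 norm of each output filter
    return torch.topk(scores, n_keep).indices       # keep the filters with the largest norms

w = torch.randn(256, 256, 3, 3)                     # e.g. one EDSR body conv
kept = l1_keep_indices(w, 48)                       # 256 -> 48, i.e. pr = 0.8125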

Is there something I missed? Looking forward to your reply! >-<

Which dataset do the 3551 to 3555 images come from?

Hi, thanks for your excellent work. Which dataset do the images 3551 to 3555, used for validation during training, come from? DIV2K (900) + Flickr2K (2650) = DF2K (3550), so do you use Set5 for validation during training?

What is the real Mult-Adds?

Hi, we directly ran your test scripts with the provided final models on the benchmarks (Set5, Set14, B100, Urban100, Manga109) and got the x2 results below:

x2 Set5 Set14 B100 Urban100 Manga109
PSNR 38.04 33.71 32.22 32.24 38.87

But these are clearly below the results reported in the paper; the x2 benchmark results reported in the paper are:

x2 Set5 Set14 B100 Urban100 Manga109
PSNR 38.12 33.77 32.27 32.41 39.12

Only when using the --self-ensemble parameter at test time can we get results comparable to the paper.
The results of the provided final models tested with --self-ensemble are below:

x2 Set5 Set14 B100 Urban100 Manga109
PSNR 38.12 33.77 32.26 32.40 39.09

So, were the results in the paper obtained with the --self-ensemble parameter?
If --self-ensemble is used, the model needs 8 forward passes per image for the ensemble, and the Mult-Adds of R16F48 should be 159.1G x 8 = 1272.8G.
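
For context, geometric self-ensemble (the --self-ensemble option in EDSR-style codebases) averages the model's predictions over the 8 flip/transpose variants of each input, which is why the per-image compute scales by 8. A rough sketch of the idea (not necessarily this repo's exact implementation):

import torch

def forward_x8(lr, model):
    # Each op is its own inverse: horizontal flip, vertical flip, transpose.
    def apply(x, ops):
        for op in ops:
            if op == 'h':
                x = x.flip(-1)             # flip along width
            elif op == 'v':
                x = x.flip(-2)             # flip along height
            elif op == 't':
                x = x.transpose(-2, -1)    # swap height and width
        return x

    all_ops = [(), ('h',), ('v',), ('h', 'v'),
               ('t',), ('t', 'h'), ('t', 'v'), ('t', 'h', 'v')]

    outputs = []
    for ops in all_ops:                        # 8 forward passes per image
        sr = model(apply(lr, ops))
        sr = apply(sr, tuple(reversed(ops)))   # undo the transforms on the output
        outputs.append(sr)
    return torch.stack(outputs).mean(dim=0)    # average the 8 aligned predictions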
