
Comments (12)

mkaglins commented on May 28, 2024

Branch with experiments code: https://github.com/mkaglins/nncf_pytorch/tree/mkaglins/RL_experiments


vshampor commented on May 28, 2024

@mkaglins what is the status of this issue?


mkaglins commented on May 28, 2024

LeGR was re-implemented in NNCF on this branch: https://github.com/mkaglins/nncf_pytorch/tree/mkaglins/legr_impl

The baseline results for MobileNet v2 on CIFAR-100 from the paper were reproduced successfully.

Global ranking coefficients were trained with the settings from the paper and the LeGR GitHub repo (https://github.com/cmu-enyac/LeGR), with the following results:

| Model | Original acc | PR 80% | PR 87% | PR 90% |
|---|---|---|---|---|
| mobilenet_v2, Top1@acc | 73.47% | 73.64% | 72.26% | 71.2% |

To reproduce these results, the following changes/settings were made in NNCF:

  • MobileNetV2 model architecture and pretrained weights from https://github.com/cmu-enyac/LeGR
  • Dataset (CIFAR-100) normalization parameters and train/validation transformations from https://github.com/cmu-enyac/LeGR
  • Pruning quota = 0.1 (the algorithm may prune at most 90% of every layer)
  • Pruning of the last convolution allowed (not possible with the current NNCF settings)
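The settings above can be collected into a single config sketch. The key names below are illustrative assumptions made for readability, not NNCF's actual config schema:

```python
# Illustrative sketch of the reproduction settings listed above.
# NOTE: these key names are assumptions, not NNCF's real schema.
legr_repro_settings = {
    "model": "mobilenet_v2",        # architecture + weights from cmu-enyac/LeGR
    "dataset": "CIFAR-100",         # normalization and transforms from cmu-enyac/LeGR
    "compression": {
        "algorithm": "filter_pruning",
        "params": {
            "pruning_quota": 0.1,   # keep at least 10% of filters, i.e. prune at most 90% per layer
            "prune_last_conv": True # allow pruning the final convolution
        },
    },
}
```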


AlexKoff88 commented on May 28, 2024

@mkaglins, do we have results for the geomean method to compare with what you got on CIFAR?


mkaglins commented on May 28, 2024

No, this model and its weights come from the LeGR GitHub repo, but I will run such an experiment for comparison.


mkaglins commented on May 28, 2024

The Filter Pruning algorithm (with the geomean magnitude method) at the same target FLOPs pruning rate showed significantly worse results than LeGR:

  • with pruning rate = 80%, top1@acc = 68.6%
  • with pruning rate = 90%, top1@acc = 65%

The experiment was conducted with the same MobileNetV2 pretrained weights, dataset parameters, and fine-tuning scheme as in the LeGR case.
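For reference, the geomean criterion used here ranks each filter by its distance to the other filters in the same layer (the FPGM-style idea): filters close to the layer's geometric median are considered redundant. A minimal NumPy sketch of that criterion, not NNCF's actual implementation:

```python
import numpy as np

def geomean_importance(weights: np.ndarray) -> np.ndarray:
    """Per-filter importance as the sum of L2 distances to every other
    filter in the same layer (FPGM-style criterion). Filters that sit
    close to the others -- near the layer's geometric median -- get low
    importance and are pruned first."""
    flat = weights.reshape(weights.shape[0], -1)             # (n_filters, fan_in)
    dists = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=-1)
    return dists.sum(axis=1)                                 # (n_filters,)

# Toy layer: 8 output filters of shape (3, 3, 3)
w = np.random.default_rng(0).normal(size=(8, 3, 3, 3))
importance = geomean_importance(w)
to_prune = np.argsort(importance)[:2]   # the 2 most redundant filters
```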


lzrvch commented on May 28, 2024

@mkaglins So the results are specific to these particular MobileNetV2 weights? What about the ones from, say, torchvision? Would there be such a significant gap between the LeGR and geomean+uniform results?


mkaglins commented on May 28, 2024

@vanyalzr Such experiments comparing LeGR with the current Filter Pruning algorithm on different models are planned and in progress.


mkaglins commented on May 28, 2024

Further experiments are planned.

Experiments comparing LeGR with the current FP algorithm:

  • LeGR vs. the current FP algorithm on ImageNet, on models for which pruned baselines already exist (resnet18, resnet34, resnet50, googlenet, unet); the resnet18 experiments are currently in progress
  • LeGR vs. the current FP algorithm on CIFAR-100 (resnet18, resnet50, inceptionv3, mobilenetv2)

Also, some experiments on potential LeGR improvements:

  1. LeGR with fewer generations (200 instead of 400)
  2. LeGR with progressive fine-tuning (exponential scheduler)
  3. LeGR with batch-norm adaptation instead of short training to estimate the pruned model's accuracy
  4. LeGR with the configuration found by geomean pruning added to the evolution algorithm's search space
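Item 2 above (progressive fine-tuning with an exponential scheduler) amounts to ramping the pruning rate up to the target over a fixed number of pruning steps. A sketch of one such schedule; the decay constant `k` and the normalization are assumptions, not NNCF's exact scheduler formula:

```python
import math

def exponential_pruning_schedule(target_rate: float, num_steps: int, k: float = 5.0):
    """Pruning rate after each of num_steps progressive-pruning steps,
    rising exponentially toward target_rate (reached exactly at the last
    step). Sketch only: k and the normalization are assumed, not NNCF's
    actual formula."""
    return [
        target_rate * (1.0 - math.exp(-k * t / num_steps)) / (1.0 - math.exp(-k))
        for t in range(1, num_steps + 1)
    ]

# e.g. ramp to a 0.8 FLOPs pruning rate over 15 steps
schedule = exponential_pruning_schedule(0.8, 15)
```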


mkaglins commented on May 28, 2024

Experiment results summary:

ImageNet:

ResNet-18, FLOPs pruning rate = 30%

| Algorithm | Original model | Filter pruning + geomean | LeGR |
|---|---|---|---|
| top1@acc | 69.64 | 68.72 | 69.43 |

LeGR shows significantly better results than geomean filter pruning on ResNet-18/ImageNet.

CIFAR-100:

Algorithm descriptions:

  • Filter pruning with geomean – the current Filter Pruning algorithm with the geometric median as the filter importance function.
  • LeGR – the LeGR algorithm trained at the larger pruning rate (0.8); the trained ranking coefficients were then used to prune and fine-tune the model at different pruning rates. Three trials were run to test the algorithm's stability.
  • LeGR, 200 generations – the same LeGR algorithm, but with 200 generations of the evolution algorithm instead of the default 400.
  • LeGR, progressive – the same LeGR algorithm, but with progressive fine-tuning (exponential scheduler and 15 pruning steps).
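The core idea of LeGR referenced in these descriptions is a learned affine transform of per-layer filter norms: the evolution algorithm searches per-layer pairs (alpha, kappa) so that all filters in the network can be compared under a single global threshold. A self-contained sketch of that global ranking step; the names and structure are illustrative, not NNCF's API:

```python
def legr_rank(layer_norms, alphas, kappas, flops_per_filter, flops_budget):
    """LeGR-style global ranking sketch: each layer's raw filter norms are
    affinely transformed by learned per-layer coefficients (alpha, kappa),
    then all filters compete globally until the FLOPs budget is spent.
    Illustrative only, not NNCF's implementation."""
    scored = []
    for l, norms in enumerate(layer_norms):
        for f, norm in enumerate(norms):
            scored.append((alphas[l] * norm + kappas[l], l, f, flops_per_filter[l]))
    scored.sort()                                  # least important filters first
    pruned, removed_flops = [], 0.0
    for _score, l, f, cost in scored:
        if removed_flops >= flops_budget:
            break
        pruned.append((l, f))                      # (layer index, filter index)
        removed_flops += cost
    return pruned

# Toy example: two layers, identity coefficients found by the evolution search
pruned_filters = legr_rank(
    layer_norms=[[1.0, 0.2, 0.5], [0.3, 0.9]],  # per-filter norms in each layer
    alphas=[1.0, 1.0], kappas=[0.0, 0.0],        # learned per-layer transforms
    flops_per_filter=[1.0, 2.0],                 # FLOPs freed by removing one filter
    flops_budget=3.0,                            # total FLOPs to remove
)
```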
Resnet-18 (CIFAR-100) results:

Original acc = 75.51

LeGR:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| Filter pruning + geomean | 74.97 | 74.29 | 68.69 |
| LeGR | 74.12 | 73.77 | 68.25 |
| LeGR | 74.69 | 74.07 | 68.52 |
| LeGR | 74.83 | 73.56 | 72.17 |
| MEAN(LeGR) | 74.55 | 73.80 | 69.65 |
| STD(LeGR) | 0.38 | 0.26 | 2.19 |

LeGR, 200 generations:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| LeGR, 200 generations | 74.60 | 72.95 | 71.81 |
| LeGR, 200 generations | 74.33 | 73.13 | 69.43 |
| LeGR, 200 generations | 74.48 | 73.37 | 68.55 |
| MEAN(LeGR, 200 generations) | 74.47 | 73.15 | 69.93 |
| STD(LeGR, 200 generations) | 0.14 | 0.21 | 1.69 |

LeGR with progressive fine-tuning:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| LeGR, progressive | 74.30 | 74.20 | 72.94 |
| LeGR, progressive | 75.27 | 74.47 | 73.14 |
| LeGR, progressive | 74.58 | 74.01 | 73.38 |
| MEAN(LeGR, progressive) | 74.72 | 74.23 | 73.15 |
| STD(LeGR, progressive) | 0.50 | 0.23 | 0.22 |

Resnet-50 (CIFAR-100) results:

Original acc = 75.1%

LeGR:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| Filter pruning + geomean | 75.05 | 75.05 | 74.53 |
| LeGR | 75.46 | 75.53 | 75.11 |
| LeGR | 75.94 | 75.17 | 74.62 |
| LeGR | 75.58 | 75.33 | 75.05 |
| MEAN(LeGR) | 75.66 | 75.34 | 74.93 |
| STD(LeGR) | 0.25 | 0.18 | 0.27 |

LeGR, 200 generations:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| LeGR, 200 generations | 75.75 | 75.28 | 74.98 |
| LeGR, 200 generations | 75.47 | 75.39 | 74.82 |
| LeGR, 200 generations | 75.66 | 75.47 | 74.49 |
| MEAN(LeGR, 200 generations) | 75.63 | 75.38 | 74.76 |
| STD(LeGR, 200 generations) | 0.14 | 0.10 | 0.25 |

LeGR with progressive fine-tuning:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| LeGR, progressive | 75.02 | 75.56 | 74.51 |
| LeGR, progressive | 75.22 | 74.92 | 75.05 |
| LeGR, progressive | 75.33 | 75.17 | 74.71 |
| MEAN(LeGR, progressive) | 75.19 | 75.22 | 74.76 |
| STD(LeGR, progressive) | 0.16 | 0.32 | 0.27 |

Inception_v3 (CIFAR-100) results:

Original acc = 77.7%

LeGR:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| Filter pruning + geomean | 78.17 | 77.68 | 75.51 |
| LeGR | 78.10 | 76.63 | 74.90 |
| LeGR | 77.80 | 78.00 | 73.59 |
| MEAN(LeGR) | 77.95 | 77.32 | 74.25 |
| STD(LeGR) | 0.21 | 0.97 | 0.93 |

LeGR, 200 generations:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| LeGR, 200 generations | 78.02 | 77.60 | 75.59 |
| LeGR, 200 generations | 78.01 | 76.86 | 75.73 |
| LeGR, 200 generations | 77.88 | 77.24 | 75.72 |
| MEAN(LeGR, 200 generations) | 77.97 | 77.23 | 75.68 |
| STD(LeGR, 200 generations) | 0.08 | 0.37 | 0.08 |

LeGR with progressive fine-tuning:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| LeGR, progressive | 78.05 | 77.79 | 77.25 |
| LeGR, progressive | 78.13 | 78.13 | 77.07 |
| LeGR, progressive | 78.09 | 78.07 | 77.80 |
| MEAN(LeGR, progressive) | 78.09 | 78.00 | 77.37 |
| STD(LeGR, progressive) | 0.04 | 0.18 | 0.38 |

MobilenetV2 (CIFAR-100) results:

Original acc = 65.65%

LeGR:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| Filter pruning + geomean | 62.68 | 55.46 | 45.78 |
| LeGR | 66.02 | 63.62 | 56.70 |
| LeGR | 65.52 | 62.51 | 54.29 |
| LeGR | 65.32 | 63.54 | 55.79 |
| MEAN(LeGR) | 65.62 | 63.22 | 55.59 |
| STD(LeGR) | 0.36 | 0.62 | 1.22 |

LeGR, 200 generations:

| Algorithm \ FLOPs PR | 0.4 | 0.6 | 0.8 |
|---|---|---|---|
| LeGR, 200 generations | 65.64 | 63.40 | 54.36 |
| LeGR, 200 generations | 65.41 | 62.43 | 54.90 |
| LeGR, 200 generations | 65.39 | 63.48 | 52.75 |
| MEAN(LeGR, 200 generations) | 65.48 | 63.10 | 54.00 |
| STD(LeGR, 200 generations) | 0.14 | 0.58 | 1.12 |


mkaglins commented on May 28, 2024

Results summary:

LeGR vs. Filter Pruning:

  • There is no definitive outcome: on resnet18 LeGR is better at the 0.8 pruning rate and worse at the others; on resnet-50 LeGR is significantly better at all pruning rates; on inception_v3 LeGR is unstable and on average worse than the original Filter Pruning at all pruning rates.
  • Large variance of results across trials.

LeGR vs. LeGR with 200 generations of the evolution algorithm:

  • LeGR with 200 generations shows better (or comparable) results on average on all models.
  • The variance of the final accuracy is significantly lower with 200 generations.

LeGR vs. LeGR with progressive fine-tuning:

  • Progressive fine-tuning shows better results than plain LeGR on average (much better at the largest pruning rate, 0.8).
  • Progressive fine-tuning shows much lower variance of final results.


vshampor commented on May 28, 2024

LeGR was merged into the code base in #501.

