htqin / IR-Net
[CVPR 2020] This project is the PyTorch implementation of our accepted CVPR 2020 paper: Forward and Backward Information Retention for Accurate Binary Neural Networks.
Hello, when running the ResNet20-1w1a code with the CIFAR-10 dataset, it reported:
model.modules.layer1[i].conv[i].k = k
AttributeError: 'function' object has no attribute 'layer1'
at IR-Net/CIFAR-10/ResNet20/1w1a/trainer.py, line 147.
I have no idea why this problem occurs.
Thanks for your reply.
Hi, author team,
After reading your paper and code, I wonder how to train the network prototypes.
The training code is important for reproduction; please release it later. Thanks a lot. 👍
As you propose in the paper, EDE and Libra-PB both benefit network performance, so I am curious how the training is done.
A problem appears when I run "python main.py" in VGG-Small:
RuntimeError: arguments are located on different GPUs at /opt/conda/conda-bld/pytorch_1533672544752/work/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu:314
How can I deal with it?
If it is written this way, the gradients returned for k and t are both None, so they will not be updated at all, right?
Hello, thank you for your excellent work!
I still have a question: does achieving the high inference speed of a BNN require deployment on a high-performance inference framework such as dabnn? Otherwise, will the inference speed differ little from the original full-precision model?
Can you give details of the training steps?
Hi everyone,
I noticed that in the CIFAR-10 folder, IR-Net has a trainer.py file for each experiment setting. But in the ImageNet folder there is no such trainer.py, only a .yaml file.
Does this mean we need to write a trainer.py ourselves based on the hyperparameters in the yaml file for the ImageNet experiments? Or is there a way to train on ImageNet using these yaml files directly?
Thank you!
Hello! When I run your code with multiple GPUs, I hit this problem:
RuntimeError: binary_op(): expected both inputs to be on same device, but input a is on cuda:3 and input b is on cuda:0
When using a single GPU it is fine. I checked the tensors and found that k and t are stored on GPU 0, which causes the problem.
I don't know if you have any idea how to solve it.
Thanks!
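One likely cause of this cross-device error is that `k` and `t` are stored as plain tensor attributes, which `nn.DataParallel` does not replicate to each GPU the way it replicates parameters. A minimal sketch of a possible fix, registering them as buffers so they move with the module (this is an assumption about the repo's `IRConv2d`, not its actual code):

```python
import torch
import torch.nn as nn

# Hypothetical IRConv2d sketch: k and t registered as buffers instead of
# bare tensor attributes. Buffers are replicated by nn.DataParallel and
# moved by .to()/.cuda(), which avoids "input a is on cuda:3 ..." errors.
class IRConv2d(nn.Conv2d):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.register_buffer("k", torch.tensor([10.0]))
        self.register_buffer("t", torch.tensor([0.1]))

conv = IRConv2d(3, 16, 3)
# Buffers travel with the module's state, unlike plain attributes.
assert "k" in dict(conv.named_buffers())
assert "t" in dict(conv.named_buffers())
```

Registered buffers also end up in `state_dict()`, so checkpoints save and restore `k`/`t` automatically.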
Hello, sorry for the similar question.
You mentioned that you used the Bi-Real Net structure for a fair comparison.
Is it common to use the Bi-Real Net structure for quantization, even for 1-bit weights with 32-bit activations?
I mean, did the other networks in your paper, such as BWN and HWGQ, use the same structure without binarizing the downsample layers?
Hello,
Thank you very much for sharing your work.
When I trained ResNet18 on CIFAR-10, it gave 86% accuracy on the validation data.
I used the same hyperparameters as https://github.com/kuangliu/pytorch-cifar, which you point to in the paper.
Nice work!!! Thank you for sharing the code. When I checked the code, I found that the downsample layers for ResNet on ImageNet are not binarized to 1-bit values. Is that correct? On CIFAR-10, is the downsample layer quantized?
This is a huge gap.
Network | Bit-width (W/A) | Accuracy (%)
--- | --- | ---
ResNet-18 | 1/1 | 91.5
ResNet-20 | 1/1 | 86.5
@htqin Thank you for your quick response! I intended to ask this in this issue, but it was closed and this seems to be a new question, so I open a new issue for this question. Thank you a lot!
I notice in the yaml:

```yaml
augmentation:
  input_size: 224
  test_resize: 256
  colorjitter: [0.2, 0.2, 0.2, 0.1]
```
Thank you! Your efforts and the spirit of open source will benefit the community a lot!
```python
grad_input = k * t * (1 - torch.pow(torch.tanh(input * t), 2)) * grad_output
```

Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
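For context, that gradient line is the backward pass of the paper's EDE approximated-sign estimator. A minimal self-contained sketch of such a custom autograd function (the class name and treating `k`/`t` as fixed schedule values are assumptions for illustration):

```python
import torch

# Sketch of an EDE-style straight-through estimator: forward binarizes with
# sign(x); backward approximates the gradient with k*t*(1 - tanh(t*x)^2).
class EDEBinarize(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, k, t):
        ctx.save_for_backward(x, k, t)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        x, k, t = ctx.saved_tensors
        grad_input = k * t * (1 - torch.pow(torch.tanh(x * t), 2)) * grad_output
        # k and t are treated as non-learnable schedule values here,
        # so no gradient flows back to them (they receive None).
        return grad_input, None, None

x = torch.randn(4, requires_grad=True)
y = EDEBinarize.apply(x, torch.tensor(1.0), torch.tensor(1.0))
y.sum().backward()  # x.grad is populated via the surrogate gradient
```

In multi-GPU runs, `x`, `k`, and `t` must sit on the same device when that backward line executes, which is exactly what the error above is complaining about.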
Hello, everyone,
Thank @htqin 's great work!
I noticed the training setting of VGG-Small uses cosine decay with 300 max epochs, while the reference cited in the paper uses step decay with 400 max epochs.
By running main.py, I can only get 87.80%, while the number reported in the paper is 90.40%. However, the full-precision version I reproduced with this training setting reaches 91.79%, while the paper reports 91.70%. In other words, I can reproduce the reported full-precision baseline for VGG-Small using the training setting in this repo, but not the reported binarized IR-Net result. This confuses me; I was wondering whether there is a mismatch in the training setting of VGG-Small@CIFAR-10. I am also looking forward to discussing the training settings you used when reproducing VGG-Small@CIFAR-10!
Thank you!
Best regards,
Can you please provide a training file (trainer.py) for the ImageNet dataset? Thanks.
I have tried VGG-Small on CIFAR-10 and can get results similar to those mentioned in your paper. Your work is really useful.
However, I have trouble deploying your model to dabnn when converting the ONNX model to the format dabnn supports.
So I wonder if you could open-source the actual code used to deploy to a real ARM device,
especially the model-conversion part.
Thank you.
Hi,
I read your paper, where you wrote that you binarize all convolutional and FC layers except the first and last ones.
However, I see from your code that the downsample convolutions also aren't quantized (they are not replaced with IRConv2d, e.g. https://github.com/htqin/IR-Net/blob/master/CIFAR-10/ResNet18/resnet.py#L28). What am I missing?
Thank you
Hello, thank you for your work!
I got a model of size 81.3 MB when I used the resnet.py file in your "IR-Net-master\ImageNet\ResNet34\1w1a\models" directory.
This is the same size as my model using the full-precision ResNet34 network.
How do I compress my model?
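A likely explanation: the checkpoint stores the latent weights as float32, so saving it gives the full-precision size. Realizing the ~32x reduction requires packing the sign bits yourself (or using a deployment framework such as dabnn). A minimal sketch of such packing, assuming NumPy (the helper name is hypothetical):

```python
import numpy as np

def pack_binary_weights(w: np.ndarray) -> np.ndarray:
    """Pack the sign bits of a float weight tensor into a uint8 array."""
    bits = (w.reshape(-1) >= 0).astype(np.uint8)  # sign -> {0, 1}
    return np.packbits(bits)                      # 8 signs per byte

# A float32 conv weight tensor: 64*64*3*3 values * 4 bytes each.
w = np.random.randn(64, 64, 3, 3).astype(np.float32)
packed = pack_binary_weights(w)
print(w.nbytes // packed.nbytes)  # → 32
```

The scaling factors and the non-binarized first/last layers stay in float, so the end-to-end model shrinks somewhat less than 32x.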
Hi,
I'm following your brilliant work and thanks for your sharing.
However, I found that the parameters 'k' and 't' in ir_1w1a.py seem to be fixed, which is not the same as in your paper.
Does this difference matter?