Comments (24)

dinghuanghao commented on May 2, 2024

For example, in a BN layer, even if requires_grad is False, the internal statistics (running_mean, running_var) will still be updated. You must also set layer.eval(). You can add this and try again.
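
A minimal sketch of what that could look like (the helper function here is an illustration, not part of tinynn):

import torch.nn as nn

def freeze_module(module: nn.Module):
    # Stop gradient updates for the weights/biases.
    for p in module.parameters():
        p.requires_grad = False
    # Also stop running_mean/running_var updates, which BatchNorm performs
    # in forward() regardless of requires_grad.
    for m in module.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()

Note that model.train() switches every submodule back to training mode, so this would need to be re-applied after each call to model.train().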

liamsun2019 commented on May 2, 2024

Not yet. I can try that later.

liamsun2019 commented on May 2, 2024

Actually, I tried the approach you just mentioned, but the situation remained unchanged, i.e. the weights/biases are different after dequantization. Is there an alternative way to achieve that? Thanks.

dinghuanghao commented on May 2, 2024

After calling our quantizer, code for a quantized model will be generated. You need to move the operators that do not need to be quantized outside of QuantStub() and DeQuantStub().

dinghuanghao commented on May 2, 2024

-----------before-----------

def forward(self, input_1):
    fake_quant_0 = self.fake_quant_0(input_1)        # quantize
    model_0_0 = self.model_0_0(fake_quant_0)         # qat
    model_0_1 = self.model_0_1(model_0_0)            # qat
    model_0_2 = self.model_0_2(model_0_1)            # qat
    model_1_0 = self.model_1_0(model_0_2)            # qat
    fake_dequant_0 = self.fake_dequant_0(model_1_0)  # dequantize

-----------after-----------

def forward(self, input_1):
    model_0_0 = self.model_0_0(input_1)              # float32
    fake_quant_0 = self.fake_quant_0(model_0_0)      # quantize
    model_0_1 = self.model_0_1(fake_quant_0)         # qat
    model_0_2 = self.model_0_2(model_0_1)            # qat
    model_1_0 = self.model_1_0(model_0_2)            # qat
    fake_dequant_0 = self.fake_dequant_0(model_1_0)  # dequantize

liamsun2019 commented on May 2, 2024

Thanks for your suggestions. Based on your sample, my understanding is that, except for the layers that need to be fine-tuned, the other layers remain non-QAT, i.e. full-precision/float32. The generated model will then be a hybrid QAT one, right?

liamsun2019 commented on May 2, 2024

I modified the description file of the quantized model in the way you suggested. Training goes normally, but the final conversion to TFLite fails with the following errors:

converter.convert()

File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+858a132143b7f4c6f3e1bfc95084a29e37ea8ef7-py3.6.egg/tinynn/converter/base.py", line 260, in convert
self.init_jit_graph()
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+858a132143b7f4c6f3e1bfc95084a29e37ea8ef7-py3.6.egg/tinynn/converter/base.py", line 79, in init_jit_graph
script = torch.jit.trace(self.model, self.dummy_input)
File "/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py", line 742, in trace
_module_class,
File "/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py", line 940, in trace_module
_force_outplace,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 725, in _call_impl
result = self._slow_forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 709, in _slow_forward
result = self.forward(*input, **kwargs)
File "out/movenet_qat.py", line 182, in forward
backbone_body_0_1 = self.backbone_body_0_1(backbone_body_0_0)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 725, in _call_impl
result = self._slow_forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 709, in _slow_forward
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/quantized/modules/conv.py", line 332, in forward
input, self._packed_params, self.scale, self.zero_point)
RuntimeError: Could not run 'quantized::conv2d.new' with arguments from the 'CPU' backend. 'quantized::conv2d.new' is only available for these backends: [QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, Tracer, Autocast, Batched, VmapMode].
movenet_qat.zip

dinghuanghao commented on May 2, 2024

When you manually modify the generated model, you need to set rewrite_graph to False when initializing the quantizer:

quantizer = QATQuantizer(model, dummy_input, config={'rewrite_graph': False}, work_dir='out')

This will use your manually modified mixed precision model instead of the fully quantized one (we will automate this check in the future).
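
A rough sketch of how that might fit together after the manual edit (the generated file/class names and the input shape below are placeholders, not the exact ones in your project):

import torch
from tinynn.graph.quantization.quantizer import QATQuantizer
from out.movenet_qat import QMoveNet  # placeholder: the class in the generated file you edited

model = QMoveNet()
dummy_input = torch.randn(1, 3, 224, 224)  # placeholder shape; use your model's real input size

# rewrite_graph=False keeps the hand-edited graph instead of regenerating it
quantizer = QATQuantizer(model, dummy_input, config={'rewrite_graph': False}, work_dir='out')
qat_model = quantizer.quantize()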

dinghuanghao commented on May 2, 2024

Yes, operators outside the range of FakeQuant and FakeDequant will be float32, and a mixed precision model will be generated after conversion.

liamsun2019 commented on May 2, 2024

I set rewrite_graph=False after my modifications, and the following error comes up during conversion:
File "main.py", line 227, in main_quant
converter.convert()
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+858a132143b7f4c6f3e1bfc95084a29e37ea8ef7-py3.6.egg/tinynn/converter/base.py", line 260, in convert
self.init_jit_graph()
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+858a132143b7f4c6f3e1bfc95084a29e37ea8ef7-py3.6.egg/tinynn/converter/base.py", line 79, in init_jit_graph
script = torch.jit.trace(self.model, self.dummy_input)
File "/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py", line 742, in trace
_module_class,
File "/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py", line 940, in trace_module
_force_outplace,
RuntimeError: Encountering a dict at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for list, use a tuple instead. for dict, use a NamedTuple instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.

The original model (before being quantized) does return a dictionary, but the above error did not appear before my modifications to the quantized model description. I have no idea what the root cause is.
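
If the dict output itself is not essential during conversion, one workaround (a sketch; this wrapper is not part of tinynn) is to wrap the model so the traced output is a plain tuple with a fixed key order:

import torch.nn as nn

class TupleOutputWrapper(nn.Module):
    # Converts a dict-returning model into a tuple-returning one so that
    # torch.jit.trace sees a constant container structure.
    def __init__(self, model: nn.Module, keys):
        super().__init__()
        self.model = model
        self.keys = list(keys)

    def forward(self, x):
        out = self.model(x)
        return tuple(out[k] for k in self.keys)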

dinghuanghao commented on May 2, 2024

Tomorrow I will upload a demo of mixed precision quantization :)

dinghuanghao commented on May 2, 2024

The mixed precision QAT example has been added; please update to the latest tinynn.

liamsun2019 commented on May 2, 2024

Yes, I just updated and will try it out.

liamsun2019 commented on May 2, 2024

I modified my code accordingly, but the same error still exists:
File "main.py", line 227, in main_quant
converter.convert()
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+a60ddb1432bfd81eebd2d8d232e3a67bb6d2c434-py3.6.egg/tinynn/converter/base.py", line 260, in convert
self.init_jit_graph()
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+a60ddb1432bfd81eebd2d8d232e3a67bb6d2c434-py3.6.egg/tinynn/converter/base.py", line 79, in init_jit_graph
script = torch.jit.trace(self.model, self.dummy_input)
File "/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py", line 742, in trace
_module_class,
File "/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py", line 940, in trace_module
_force_outplace,
RuntimeError: Encountering a dict at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for list, use a tuple instead. for dict, use a NamedTuple instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.

code snippet:
fake_quant_0 = self.fake_quant_0(backbone_fpn_layer_blocks_0_conv_2)
#hm_hp_0 = self.hm_hp_0(backbone_fpn_layer_blocks_0_conv_2)
hm_hp_0 = self.hm_hp_0(fake_quant_0)
hm_hp_1 = self.hm_hp_1(hm_hp_0)
hm_hp_2 = self.hm_hp_2(hm_hp_1)
hm_hp_3 = self.hm_hp_3(hm_hp_2)
fake_dequant_0 = self.fake_dequant_0(hm_hp_3)

movenet_qat.zip

dinghuanghao commented on May 2, 2024

Please give me the original generated model file.

liamsun2019 commented on May 2, 2024

FYR
test.zip

dinghuanghao commented on May 2, 2024

Can you run the example correctly?

liamsun2019 commented on May 2, 2024

As for the example, I have not tested it yet since I don't have the CIFAR-10 dataset at the moment (I can download it anyway). I don't see any fatal issues in my attached .py script either, so I'm still unsure what the possible reason for the conversion error is.

dinghuanghao commented on May 2, 2024

If you don't have the CIFAR-10 dataset, just use dummy_input to run inference once.
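
For example, a single forward pass with the dummy input (assuming qat_model and dummy_input are the same objects later handed to the converter) should be enough:

import torch

with torch.no_grad():
    qat_model(dummy_input)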

liamsun2019 commented on May 2, 2024

I ran the mixed QAT example after some minor modifications (path, batch size, etc.). Training goes well and the conversion is successful. Could you help point out any possibly buggy code in my attached file? I guess the big difference is that my model returns an output containing both QAT and non-QAT tensors, while the model in your example returns only the QAT output.

dinghuanghao commented on May 2, 2024

ok

dinghuanghao commented on May 2, 2024

This is a bug caused by the graph tracer; we will fix it soon.

liamsun2019 commented on May 2, 2024

Thanks for the quick feedback; looking forward to your fix.

peterjc123 commented on May 2, 2024

@liamsun2019 A fix has been uploaded to remedy that. BTW, I updated the model description script for you, see movenet_qat.py.zip. For arithmetic operations that happen in the non-quantized part of the computation graph, it is important to rewrite them back from the torch.nn.quantized.FloatFunctional().foo variants to the torch.foo ones.
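
For illustration, such a rewrite could look roughly like this (the tensor and module names are hypothetical):

-----------before-----------

    add_0 = self.float_functional_simple_0.add(hm_hp_2, hm_hp_0)  # FloatFunctional add (quantized-style)

-----------after-----------

    add_0 = torch.add(hm_hp_2, hm_hp_0)  # plain torch op for the float32 part of the graph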
