
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.

License: MIT License

Topics: pytorch, deep-learning, model-compression, pruning, model-converter, quantization-aware-training, deep-neural-networks, post-training-quantization


TinyNeuralNetwork

简体中文 (Simplified Chinese)

TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework with features such as neural architecture search, pruning, quantization, and model conversion. It has been used for deployment on devices such as the Tmall Genie, Haier TVs, Youku video, and face-recognition check-in machines, equipping over 10 million IoT devices with AI capability.

Installation

Python >= 3.6, PyTorch >= 1.4 (PyTorch >= 1.6 if quantization-aware training is involved)

# Install the TinyNeuralNetwork framework
git clone https://github.com/alibaba/TinyNeuralNetwork.git
cd TinyNeuralNetwork
python setup.py install

# Alternatively, you may try the one-liner
pip install git+https://github.com/alibaba/TinyNeuralNetwork.git

Or you can build the image with Docker:

sudo docker build -t tinynn:pytorch1.9.0-cuda11.1 .
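To sanity-check the installation, the main entry points should be importable (a minimal check; the import paths below are the ones used throughout the examples and issues further down this page):

import tinynn
from tinynn.converter import TFLiteConverter
from tinynn.graph.quantization.quantizer import QATQuantizer
from tinynn.graph.tracer import model_tracer

print(tinynn.__file__, TFLiteConverter, QATQuantizer, model_tracer)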

Contributing

We appreciate your help in improving our framework. More details are listed here.

Basic modules

  • Computational graph capture: The Graph Tracer in TinyNeuralNetwork captures the connectivity of PyTorch operators, which automates pruning and model quantization. It also supports generating model description files (e.g. models.py) that are equivalent to the original PyTorch models.
  • Dependency resolving: Modifying one operator often breaks the subgraph it belongs to, i.e. it no longer matches the operators that depend on it. The Graph Modifier in TinyNeuralNetwork handles these mismatches automatically, both within and between subgraphs, to automate computational graph modification.
  • Pruner: OneShot (L1, L2, FPGM), ADMM, NetAdapt, Gradual, End2End and other pruning algorithms have been implemented and will be open-sourced gradually.
  • Quantization-aware training: TinyNeuralNetwork uses PyTorch's QAT as the backend (simulated bfloat16 training is also supported) and improves its usability by automating operator fusion and computational graph quantization, which the official workflow otherwise leaves to the user as a significant manual effort.
  • Model conversion: TinyNeuralNetwork supports converting both floating-point and quantized PyTorch models to TFLite models for end-to-end deployment (an end-to-end usage sketch follows this list).
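To make the workflow above concrete, here is a minimal end-to-end sketch adapted from the usage that appears in the issues further down this page; the toy model, the work_dir='out' path and the output file name are illustrative, and the examples directory covers the full training loop:

import torch
import torch.nn as nn

from tinynn.converter import TFLiteConverter
from tinynn.graph.quantization.quantizer import QATQuantizer
from tinynn.graph.tracer import model_tracer

with model_tracer():
    # any traceable model works the same way; this toy CNN is just for illustration
    model = nn.Sequential(
        nn.Conv2d(3, 8, 3, stride=2, padding=1),
        nn.BatchNorm2d(8),
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(8, 10),
    )
    model.eval()

    dummy_input = torch.rand((1, 3, 224, 224))

    # rewrites the model into a QAT-ready version and fuses operators automatically
    quantizer = QATQuantizer(model, dummy_input, work_dir='out')
    qat_model = quantizer.quantize()

# ... run quantization-aware training on qat_model here ...

with torch.no_grad():
    qat_model.eval()
    qat_model.cpu()

    # turn the QAT model into an actually quantized model using the quantized kernels
    qat_model = torch.quantization.convert(qat_model)

    # when converting quantized models, the quantization backend must be QNNPACK
    torch.backends.quantized.engine = 'qnnpack'

    converter = TFLiteConverter(qat_model, dummy_input, tflite_path='out/model_qat.tflite')
    converter.convert()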

Project architecture

  • examples: Provides examples of each module
  • models: Provides pre-trained models for a quick start
  • tests: Unit tests
  • tinynn: Code for model compression
    • graph: Foundation for computational graph capture, resolving, quantization, code generation, mask management, etc.
    • prune: Pruning algorithms (a usage sketch follows this list)
    • converter: Model converter
    • util: Utility classes
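A minimal pruning sketch for the prune module listed above. The import path and the config keys (sparsity, metrics) are assumptions based on the pruner messages quoted in the issues below, so double-check them against examples/pruner:

import torch
import torch.nn as nn

from tinynn.prune import OneShotChannelPruner  # import path assumed

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(16, 32, 3, padding=1),
    nn.ReLU(inplace=True),
)
dummy_input = torch.rand((1, 3, 32, 32))

# config keys are assumptions: remove half of the channels, ranked by L2 norm
pruner = OneShotChannelPruner(model, dummy_input, {"sparsity": 0.5, "metrics": "l2_norm"})
pruner.prune()  # the model is modified in place; fine-tune it afterwards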

RoadMap

  • Nov. 2021: A new pruner with adaptive sparsity
  • Dec. 2021: Model compression for Transformers

Citation

If you find this project useful in your research, please consider citing:

@misc{tinynn,
    title={TinyNeuralNetwork: An efficient deep learning model compression framework},
    author={Ding, Huanghao and Pu, Jiachen and Hu, Conggang},
    howpublished = {\url{https://github.com/alibaba/TinyNeuralNetwork}},
    year={2021}
}

Frequently Asked Questions

Because of the high complexity and frequent updates of PyTorch, we cannot ensure that all cases are covered through automated testing. If you encounter problems, you can check out the FAQ, or join the Q&A group on DingTalk via the QR code below.

(DingTalk Q&A group QR code)

tinyneuralnetwork's People

Contributors: alibaba-oss, dinghuanghao, jixiege, juelianqvq, peterjc123, steven0129, unbinilium, vincent-2017, www516717402, xindongzhang, yangrice, zhaoxin111, zhiqwang, zk1998


tinyneuralnetwork's Issues

Unsupported ops: aten::remainder, aten::where

Thanks for your excellent work. When converting the model from PyTorch to TFLite, the following operations are not supported:
ERROR (TinyNeuralNetwork.tinynn.converter.base) Unsupported ops: aten::remainder, aten::where
Hope you can support these ops later. Thanks.
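Until these ops are supported, one possible workaround is to rewrite them in the model using ops the converter already handles. A hedged sketch of equivalent formulations (verify the corner cases, e.g. negative divisors, against your own model):

import torch

a = torch.tensor([5.0, -3.0])
b = torch.tensor([4.0, 4.0])
mask = a > 0

# a % b rewritten without aten::remainder (floor-mod semantics)
rem = a - torch.floor(a / b) * b

# torch.where(mask, a, b) rewritten without aten::where
sel = mask.to(a.dtype) * a + (1 - mask.to(a.dtype)) * b

print(rem)  # tensor([1., 1.])
print(sel)  # tensor([5., 4.])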

Converting a quantization-aware PyTorch model to TFLite

When calling converter.convert(), the following error occurs:

File "/home/shanshaojie/project/TinyNeuralNetwork/tinynn/converter/base.py", line 356, in convert
self.init_jit_graph()
File "/home/shanshaojie/project/TinyNeuralNetwork/tinynn/converter/base.py", line 143, in init_jit_graph
script = torch.jit.trace(self.model, self.dummy_input)
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 742, in trace
_module_class,
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 928, in trace_module
module = make_module(mod, _module_class, _compilation_unit)
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 560, in make_module
return _module_class(mod, _compilation_unit=_compilation_unit)
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 1040, in init
submodule, TracedModule, _compilation_unit=None
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 560, in make_module
return _module_class(mod, _compilation_unit=_compilation_unit)
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 1040, in init
submodule, TracedModule, _compilation_unit=None
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 560, in make_module
return _module_class(mod, _compilation_unit=_compilation_unit)
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 1040, in init
submodule, TracedModule, _compilation_unit=None
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/jit/_trace.py", line 549, in make_module
elif torch._jit_internal.module_has_exports(mod):
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/_jit_internal.py", line 505, in module_has_exports
item = getattr(mod, name)
File "/home/shanshaojie/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 779, in getattr
type(self).name, name))
torch.nn.modules.module.ModuleAttributeError: 'ConvModule' object has no attribute 'norm'
How can I debug this problem? Thanks.

graph tracer

Why not use the torch.fx module available in the latest PyTorch versions? It would avoid a lot of manual work.

Can an odd feature-map size cause errors?

Hello, I use the model_convert function to convert a .pth model from PyTorch to .tflite for external devices. I see different errors in two models: model 1 has several errors in both absolute and relative diff, while model 2 only shows absolute diff errors.

I suspect this is caused either by the Resize ops or by the odd feature-map sizes: model 1 has a 15x43 feature map, model 2 has a 15x20 feature map.

1. Model 1: the first large error shows up after the first ResizeNearest's transpose.
(screenshots: 0_nooptim_fastdepth_resizeNearest, 2022-06-15 09-35-31)

2. Model 2: the first large error shows up before the first ResizeBilinear's transpose.
(screenshots: 0_nooptim_mmpose_resizeBilinear)

RuntimeError appears when performing QAT training

Hi Author, there's a runtime error when I do QAT training, which reports:

File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

From the message it's clear that the input and the weights have different tensor types (CUDA vs CPU). Most likely some part of the model was not moved to CUDA, but I have reviewed the related code and cannot find the root cause yet.

  1. The original model (non-qat) trains normally

  2. The code slice is as follows:

    。。。。。。。。。。。。。。。。。。。。。。。。
    device = get_device()
    model.to(device=device)
    。。。。。。。。。。。。。。。。。。。。。。。。。。
    print("Start preparing the model for quantization")
    quantizer = QATQuantizer(model, dummy_input, work_dir='out')
    qat_model = quantizer.quantize()
    print("Start quantization-aware training ===>", device)
    qat_model.to(device=device)

    context = DLContext()
    context.device = device
    context.val_loader = torch.utils.data.DataLoader(
        Dataset(opt, 'val'),
        batch_size=1,
        shuffle=False,
        num_workers=1,
        pin_memory=True,
    )

    context.train_loader = torch.utils.data.DataLoader(
        Dataset(opt, 'train'),
        batch_size=opt.batch_size,
        shuffle=True,
        num_workers=opt.num_workers,
        pin_memory=True,
        drop_last=True,
    )
    context.max_epoch = 50

    train(qat_model, context, trainer.train_one_epoch, trainer.validate, qat=True)

As you can see, the original model is moved to CUDA before being converted to the quantized model, and the quantized model is moved to CUDA again before calling train(). I'm confused about what may be causing this issue. Any suggestions are appreciated. BTW, you guys are doing an excellent job!
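One generic way to narrow down such a mismatch (plain PyTorch, nothing TinyNN-specific) is to list everything that is still on the CPU right before the training loop starts:

# print any parameter or buffer of the QAT model that did not make it onto the GPU
for name, p in qat_model.named_parameters():
    if p.device.type != 'cuda':
        print('parameter still on CPU:', name)
for name, b in qat_model.named_buffers():
    if b.device.type != 'cuda':
        print('buffer still on CPU:', name)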

Converter for pytorch models that have multiple inputs

Hi, thanks for your excellent work. I've successfully converted my PyTorch model with a single input to a TFLite model.
However, the converter does not seem to support PyTorch models with multiple inputs yet. Is there any plan for this?

assertion fails during conversion to tflite for u8 QAT

An assertion fails during conversion to TFLite for u8 QAT, with the following error:

converter.convert()
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+a60ddb1432bfd81eebd2d8d232e3a67bb6d2c434-py3.6.egg/tinynn/converter/base.py", line 264, in convert
self.init_operations()
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+a60ddb1432bfd81eebd2d8d232e3a67bb6d2c434-py3.6.egg/tinynn/converter/base.py", line 231, in init_operations
converter.parse(node, attrs, args, self.common_graph)
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+a60ddb1432bfd81eebd2d8d232e3a67bb6d2c434-py3.6.egg/tinynn/converter/operators/torch/aten.py", line 479, in parse
self.elementwise_unary(tfl.QuantizeOperator, graph_converter)
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+a60ddb1432bfd81eebd2d8d232e3a67bb6d2c434-py3.6.egg/tinynn/converter/operators/torch/base.py", line 252, in elementwise_unary
outputs = self.to_tfl_tensors(self.output_names, self.output_tensors)
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+a60ddb1432bfd81eebd2d8d232e3a67bb6d2c434-py3.6.egg/tinynn/converter/operators/torch/base.py", line 155, in to_tfl_tensors
t = tfl.Tensor(t, n, has_buffer=non_existent_as_buffer, asymmetric=self.asymmetric)
File "/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+a60ddb1432bfd81eebd2d8d232e3a67bb6d2c434-py3.6.egg/tinynn/converter/operators/tflite/base.py", line 180, in init
assert tensor.q_zero_point() == 128, "As for symmetric quantization, "
AssertionError: As for symmetric quantization, the zero point of the u8 tensors should be 128. This could happen if you didn't train the model after QAT preparation.

Actually, I remember u8 QAT working in an older tinynn version. My quantizer code is written like this:
quantizer = QATQuantizer(model, dummy_input, work_dir='out-', config={'backend': "qnnpack", 'force_overwrite': True, 'asymmetric': True, 'per_tensor': True, 'rewrite_graph': True})

Quantization in qnnpack failed to create QNNPACK Average Pooling operator

Hi developers,
Thank you for your great work. I want to use QATQuantizer to quantize my model, but in the converter an error appears:

RuntimeError: [enforce fail at q_avgpool.cpp:369] createStatus == pytorch_qnnp_status_success. failed to create QNNPACK Average Pooling operator

I'm using PyTorch v1.10. Is this a QNNPACK issue? When I switch to fbgemm it works. Thank you!

Below is the whole call stack:

     18     converter = TFLiteConverter(qat_model, dummy_input, tflite_path='tflite_model/qat_model.tflite')
---> 19     converter.convert()

/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+6bdac3b8f7d9fd53d7ef4c07920414f46d6e2e62-py3.6.egg/tinynn/converter/base.py in convert(self)
    332         """
    333         self.init_input_transpose()
--> 334         self.init_jit_graph()
    335         self.init_lowered_module()
    336         self.init_common_graph()

/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+6bdac3b8f7d9fd53d7ef4c07920414f46d6e2e62-py3.6.egg/tinynn/converter/base.py in init_jit_graph(self)
    120 
    121             with torch.no_grad():
--> 122                 script = torch.jit.trace(self.model, self.dummy_input)
    123 
    124                 # Remove reference to original model to save memory

/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py in trace(func, example_inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
    748             strict,
    749             _force_outplace,
--> 750             _module_class,
    751         )
    752 

/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py in trace_module(mod, inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
    963                 strict,
    964                 _force_outplace,
--> 965                 argument_names,
    966             )
    967             check_trace_method = module._c._get_method(method_name)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
   1088                 recording_scopes = False
   1089         try:
-> 1090             result = self.forward(*input, **kwargs)
   1091         finally:
   1092             if recording_scopes:

/home/okt/aidea-semantic/semantic-segmentation/out/ddrnet_qat.py in forward(self, input_1)
    376         spp_process3_1 = self.spp_process3_1(spp_process3_0)
    377         spp_process3_2 = self.spp_process3_2(spp_process3_1)
--> 378         spp_scale4_0 = self.spp_scale4_0(add_18)
    379         spp_scale4_1 = self.spp_scale4_1(spp_scale4_0)
    380         spp_scale4_2 = self.spp_scale4_2(spp_scale4_1)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
   1088                 recording_scopes = False
   1089         try:
-> 1090             result = self.forward(*input, **kwargs)
   1091         finally:
   1092             if recording_scopes:

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/pooling.py in forward(self, input)
    615     def forward(self, input: Tensor) -> Tensor:
    616         return F.avg_pool2d(input, self.kernel_size, self.stride,
--> 617                             self.padding, self.ceil_mode, self.count_include_pad, self.divisor_override)
    618 
    619 

RuntimeError: [enforce fail at q_avgpool.cpp:369] createStatus == pytorch_qnnp_status_success. failed to create QNNPACK Average Pooling operator

Add extra output for inference

Hi Author,

I need to add some extra output tensors that are used only for inference. These tensors are not referenced during training, only for inference after the conversion to TFLite. My naive intention is to put as many operations as possible into forward, so as to relieve the post-processing load that would otherwise have to be implemented in C/C++ code.

For instance, some matrix ops such as reshape/sigmoid/multiply better be done by GPU/NPU instead of with CPU.

I added some logic in forward to implement this requirement; training goes well, but the conversion to TFLite fails with the following error message:
File "/usr/local/lib/python3.6/dist-packages/torch/nn/quantized/modules/functional_modules.py", line 160, in mul
r = ops.quantized.mul(x, y, scale=self.scale, zero_point=self.zero_point)
RuntimeError: Mul operands should have same data type.

Is there any feasible way or workaround for this scenario?
The script is attached. Thanks.
movenet_qat.zip
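One possible workaround, sketched here only as an idea rather than the project's recommended approach: keep the inference-only post-processing in floating point by routing its inputs through a DeQuantStub, so quantized ops such as the quantized mul above never see mixed dtypes. The module and the ops inside it are hypothetical:

import torch
import torch.nn as nn
from torch.quantization import DeQuantStub

class PostProcess(nn.Module):
    """Hypothetical inference-only head that runs in float after dequantization."""

    def __init__(self):
        super().__init__()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.dequant(x)               # leave the quantized domain
        x = x.reshape(x.shape[0], -1)
        return torch.sigmoid(x) * 4.0     # plain float ops, no quantized mul involved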

For an int8 QAT model, the dequantized weights/bias are different from the original model

Hi, sorry to bother you. This issue may not be directly related to tinynn. The scenario on my side is as follows:

  1. I perform int8 QAT based on an official float32 model.
  2. To fine-tune only some layers, I set requires_grad to False for all the other layers.
  3. The training goes normally and the conversion to tflite is also done.
  4. Use tf2onnx to dequantize the generated tflite and check the weights/bias. The weights/bias of the frozen layers are pretty different compared to the original model.

My goal is for the frozen layers to keep the same weights/bias as the original model after dequantization. I'm not sure whether that's feasible. As you know, the official float32 model performs well in most cases and I just want to fine-tune a few layers. Meanwhile, considering the inference speed on some edge devices, I have to perform QAT, since post-training quantization performs much worse.

Thanks for your time.

Consider MNN as a deployment backend

Mobile deployment currently goes through TFLite, which requires implementing a Torch-to-TFLite conversion. We suggest adding MNN as a backend: the MNN backend outperforms TFLite, and MNN already provides a TorchScript -> MNN model conversion tool.

oneshotpruner fails when possible channel padding exists

Hi author,

For a certain model, the oneshot pruner fails when there's channel padding, with the following error message:

ERROR (tinynn.graph.modifier) All node's sparsity in one subgraph must be the same

Please let me know if you need the model description file. Thanks.

Please support Softmax for QAT

I execute a Python file like the one below:

import torch
from torch import nn
from tinynn.converter import TFLiteConverter
from tinynn.graph.quantization.quantizer import QATQuantizer
from tinynn.graph.tracer import model_tracer
from tinynn.util.train_util import DLContext, get_device, train

class DummyNet(nn.Module):
    def __init__(self, num_classes=4):
        super(DummyNet, self).__init__()
        
        self.input_channel = 1
        self.base_channel = 4

        def conv_bn(inp, oup, stride):
            return nn.Sequential(
                nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
                nn.BatchNorm2d(oup),
                nn.ReLU(inplace=True)
            )

        def conv_dw(inp, oup, stride):
            return nn.Sequential(
                nn.Conv2d(inp, inp, 3, stride, 1, groups=inp, bias=False),
                nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
                nn.ReLU(inplace=True)
            )

        self.model = nn.Sequential(
            conv_bn(self.input_channel, self.base_channel, 2), 
            conv_dw(self.base_channel,  self.base_channel * 2, 1),
            conv_dw(self.base_channel * 2, self.base_channel * 4, 2),
            conv_dw(self.base_channel * 4, self.base_channel * 8, 2),
            conv_dw(self.base_channel * 8, self.base_channel * 16, 2),
            conv_dw(self.base_channel * 16, self.base_channel * 16, 1),
            conv_dw(self.base_channel * 16, self.base_channel * 16, 1),
            conv_dw(self.base_channel * 16, self.base_channel * 32, 2),
            conv_dw(self.base_channel * 32, self.base_channel * 32, 1),
            nn.AvgPool2d(kernel_size=(3, 6)),
            nn.Flatten(),
            nn.Linear(self.base_channel * 32, num_classes)
        )

        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        x = self.model(x)
        x = self.softmax(x)

        return x

    def predict(self, x):
        x = self.forward(x)
        x = torch.argmax(x, dim=1)
        
        return x

if __name__ == '__main__':
    with model_tracer():
        model = DummyNet()
        model.eval()

        dummy_input = torch.rand((1, 1, 135, 240))
        quantizer = QATQuantizer(model, dummy_input, work_dir='out')
        qat_model = quantizer.quantize()

        device = get_device()
        qat_model.to(device=device)

        with torch.no_grad():
            qat_model.eval()
            qat_model.cpu()
            qat_model = torch.quantization.convert(qat_model)
            torch.backends.quantized.engine = 'qnnpack'
            converter = TFLiteConverter(qat_model, dummy_input, tflite_path='out/dummy_qat.tflite')
            converter.convert()

And then I got the error below:

  File "/root/miniconda3/lib/python3.7/site-packages/torch/jit/_trace.py", line 744, in trace
    _module_class,
  File "/root/miniconda3/lib/python3.7/site-packages/torch/jit/_trace.py", line 959, in trace_module
    argument_names,
  File "/root/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1039, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "out/dummynet_qat.py", line 96, in forward
    softmax = self.softmax(model_11)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1039, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/nn/modules/activation.py", line 1256, in forward
    return F.softmax(input, self.dim, _stacklevel=5)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1679, in softmax
    ret = input.softmax(dim)
NotImplementedError: Could not run 'aten::_softmax' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_softmax' is only available for these backends: [CPU, CUDA, MkldnnCPU, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

CPU: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/build/aten/src/ATen/RegisterCPU.cpp:16286 [kernel]
CUDA: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/build/aten/src/ATen/RegisterCUDA.cpp:20674 [kernel]
MkldnnCPU: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/build/aten/src/ATen/RegisterMkldnnCPU.cpp:563 [kernel]
BackendSelect: fallthrough registered at /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
ADInplaceOrView: fallthrough registered at /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/core/VariableFallbackKernel.cpp:60 [backend fallback]
AutogradOther: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradCPU: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradCUDA: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradXLA: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
UNKNOWN_TENSOR_TYPE_ID: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradMLC: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradHPU: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradNestedTensor: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradPrivateUse1: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradPrivateUse2: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
AutogradPrivateUse3: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/VariableType_0.cpp:9848 [autograd kernel]
Tracer: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/torch/csrc/autograd/generated/TraceType_0.cpp:9750 [kernel]
Autocast: fallthrough registered at /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/autocast_mode.cpp:255 [backend fallback]
Batched: registered at /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/BatchingRegistrations.cpp:1019 [backend fallback]
VmapMode: fallthrough registered at /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]

It looks like Softmax is not implemented for quantized inference. If a quantized Softmax is not supported, a floating-point Softmax would be fine for me.
Could you implement it?

Many Thanks

Cannot trace any node

Sorry to bother you.

I am just wondering why the tracer cannot trace any of the internal nodes.

I use the torchvision model shown below.
(screenshot: model definition)

But I get this error.
(screenshot: error message)

Do you have any solution to deal with this situation?

Thanks in advance!

What's the default quantization type?

Hi, I notice that the default value of the 'asymmetric' parameter is True, which indicates that quantization should be done as uint8. But the following code makes me a little confused:

if not self.asymmetric:
    sym_fq = torch_q.FakeQuantize.with_args(
        observer=torch_q.MovingAverageMinMaxObserver, quant_min=0, quant_max=255,
        dtype=torch.quint8, qscheme=torch.per_tensor_symmetric, reduce_range=False)
    qconfig = torch_q.QConfig(sym_fq, qconfig.weight)

Here, even when users set 'asymmetric' to False, the code still forces uint8. Any suggestions? Thanks a lot.
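For reference, a symmetric scheme on a quint8 tensor simply pins the zero point to the middle of the unsigned range (128), which is also what the converter asserts in the u8 QAT issue above. A small standalone illustration:

import torch

x = torch.randn(8)
scale = x.abs().max().item() / 127

# symmetric quantization expressed over the unsigned uint8 range: zero point fixed at 128
q = torch.quantize_per_tensor(x, scale=scale, zero_point=128, dtype=torch.quint8)
print(q.q_scale(), q.q_zero_point())  # the zero point stays at 128 regardless of the data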

TypeError raised when creating a QAT model

Hi, a TypeError happens when I generate the QAT model from a .pth model:

TypeError: __init__() got an unexpected keyword argument 'value'

I checked the generated .py file rewritten by tinynn:
torch.nn.ZeroPad2d(padding=(0, 1, 0, 1), value=0.0)

There is indeed code like the line above. ZeroPad2d is not supposed to take a 'value' argument. Any suggestions? Thanks a lot, and I look forward to your help.

RuntimeError raised when converting a QAT model to TFLite

Hi, my QAT training goes well based on your sample code, but the conversion to TFLite fails with the following error message:

RuntimeError: Quantized copy only works with contiguous Tensors

Based on the stack trace, the error comes from:
converter.convert()

My code is almost the same as the sample code:

print("Start converting the model to TFLite")
with torch.no_grad():
qat_model.eval()
qat_model.cpu()

# The step below converts the model to an actual quantized model, which uses the quantized kernels.
qat_model = torch.quantization.convert(qat_model)                                                      

# When converting quantized models to TFLite, please ensure the quantization backend is QNNPACK.
torch.backends.quantized.engine = 'qnnpack'

# The code section below is used to convert the model to the TFLite format
converter = TFLiteConverter(qat_model, dummy_input, tflite_path='out/qat_model.tflite')
converer.convert()

By the way, I am curious how the trained QAT model affects the conversion. It looks like the conversion could be done even without training. I'm not familiar with that and would appreciate your help. Thanks.

outputs are different between a QAT tflite and corresponding de-quantized onnx model

I got a QAT int8 per-channel tflite model. To check the accuracy, I compare the inference results between it and the de-quantized onnx model.

  1. python3.6 -m tf2onnx.convert --opset 11 --tflite test.tflite --output temp.onnx --dequantize
    python3.6 -m onnxsim temp.onnx test.onnx #test.onnx then holds float32 weights/bias
  2. Run test.onnx and test.tflite respectively, compare the inference results based on the same input. There are big differences between the results. The onnx model has much better inference results.

I'm not entirely clear about the inference flow of a QAT TFLite model, and I'm not sure whether this situation is normal or not. It is supposed to produce results similar to those of the ONNX model.

The ONNX and TFLite models are attached for reference.
test.zip
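For what it's worth, a rough comparison script along the lines of step 2, assuming a single NCHW input and that the converted TFLite model expects NHWC (the input shape below is made up):

import numpy as np
import onnxruntime as ort
import tensorflow as tf

x = np.random.rand(1, 3, 256, 192).astype(np.float32)  # hypothetical input shape

sess = ort.InferenceSession('test.onnx')
onnx_out = sess.run(None, {sess.get_inputs()[0].name: x})[0]

interp = tf.lite.Interpreter(model_path='test.tflite')
interp.allocate_tensors()
inp = interp.get_input_details()[0]
out = interp.get_output_details()[0]
interp.set_tensor(inp['index'], x.transpose(0, 2, 3, 1))  # NCHW -> NHWC
interp.invoke()
tfl_out = interp.get_tensor(out['index']).transpose(0, 3, 1, 2)  # NHWC -> NCHW

print('max abs diff:', np.abs(onnx_out - tfl_out).max())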

Bug in the graph generator

The original model:

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_parameter("x", nn.Parameter(torch.ones(2, 3)))

    def forward(self, data):
        out = [data]
        for i in range(2):
            y1 = torch.cat([out[-1], self.x])
            y2 = torch.cat([out[-1], self.x])
            y = y1 + y2
            out.append(y)
        return out
The generated model description:

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()

        self.register_parameter("tensor_1", torch.nn.Parameter(torch.empty((2, 3), dtype=torch.float32)))
        self.register_parameter("tensor_2", torch.nn.Parameter(torch.empty((2, 3), dtype=torch.float32)))
        self.register_parameter("tensor_3", torch.nn.Parameter(torch.empty((2, 3), dtype=torch.float32)))
        self.register_parameter("tensor_4", torch.nn.Parameter(torch.empty((2, 3), dtype=torch.float32)))

    def forward(self, input_1):
        cat_1 = torch.cat([input_1, self.tensor_1])
        cat_2 = torch.cat([input_1, self.tensor_2])
        add_1 = cat_1.__add__(cat_2)
        cat_3 = torch.cat([add_1, self.tensor_3])
        cat_4 = torch.cat([add_1, self.tensor_4])
        add_2 = cat_3.__add__(cat_4)
        return input_1, add_1, add_2

Not able to use convert.py to convert a PyTorch MobileNet model to an int8-quantized TFLite model on Colab

Please see the Colab below that I am using to convert MobileNet v2 from PyTorch to TFLite (int8):
https://colab.research.google.com/drive/1eW-I0RDzB3L6Zbz364t5lkI4fxgvpGbI#scrollTo=5YtQg5Ga2wmq

I'm getting the errors below:
Traceback (most recent call last):
File "./examples/converter/convert.py", line 9, in
from examples.models.cifar10.mobilenet import DEFAULT_STATE_DICT, Mobilenet
ModuleNotFoundError: No module named 'examples.models'
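That particular ModuleNotFoundError is usually just a Python path issue when the example script is launched from outside the repository root; a generic workaround (the clone path below is hypothetical and depends on your Colab setup):

import sys

sys.path.insert(0, '/content/TinyNeuralNetwork')  # adjust to wherever the repo was cloned

from examples.models.cifar10.mobilenet import DEFAULT_STATE_DICT, Mobilenet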

[converter] TFLite schema update

According to user feedback, the currently supported ops are not sufficient for model deployment. This requires a major update of the existing TFLite schema. Currently, we are using the schema of TFLite 2.3.0.

After quickly skimming the current TF operator list (as of TFLite 2.8.0), if we update to the latest schema, we will gain support for builtin ops including:

  • GELU
  • CONV_3D
  • CONV_3D_TRANSPOSE
  • RANDOM_STANDARD_NORMAL
  • RANDOM_UNIFORM
  • CUMSUM
  • BROADCAST_TO
  • and more

We may also add support for the following custom kernels if users choose to build TFLite from source.

  • GRU
  • AVG_POOL3D
  • MAX_POOL3D
  • CTC
  • ATAN2
  • SIGN
  • and more

torch.fx

Could TinyNeuralNetwork be combined with PyTorch's new torch.fx feature to provide new approaches for quantization and pruning?

converter.convert raises an error: AssertionError: Model is in training model

Hello, when I use the tool to convert a model, I get the error "AssertionError: Model is in training model". The details are as follows:

2022-04-08 13:56:40.738033: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Making model...
Load the model from ../float/SESR_S/model/model_best_rep.pt
Traceback (most recent call last):
  File "convert_and_compare.py", line 170, in <module>
    main_worker()
  File "convert_and_compare.py", line 72, in main_worker
    converter.convert()
  File "anaconda3/envs/torchenv/lib/python3.7/site-packages/TinyNeuralNetwork-0.1.0.20220407153809-py3.7.egg/tinynn/converter/base.py", line 357, in convert
    self.init_lowered_module()
  File "anaconda3/envs/torchenv/lib/python3.7/site-packages/TinyNeuralNetwork-0.1.0.20220407153809-py3.7.egg/tinynn/converter/base.py", line 191, in init_lowered_module
    ), 'Model is in training model'
AssertionError: Model is in training model

What is the cause of this, and how should I solve it?

Converting LiteHRNet pytorch model to TFLite, outputs don't match

Hi, this is really great work, thanks!

I am able to convert the LiteHRNet model to TFLite without running into any issues. However, the outputs don't match up.

Here is the output from sending an all-ones input through the network. The output has shape [1,17,96,72].
I am only showing output[0,0,0] from both PyTorch and TFLite:

pytorch array([6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 1.8188522e-04, 1.7515068e-04, 1.9644469e-04, 1.6027213e-04, 1.9049855e-04, 1.5419864e-04, 1.2460010e-04, 9.0751186e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05], dtype=float32)
tflite array([6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 1.1580180e-04, 2.3429818e-04, 3.9018277e-04, 7.7823577e-03, 1.8948119e-02, 2.8559987e-02, 3.3612434e-02, 2.5932681e-02, 1.2074142e-02, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05, 6.4367385e-05], dtype=float32)

When I convert to TFLite via the ONNX route, the outputs do match. So my guess is that some of the transposes/reshapes for NHWC are not happening correctly, but I am not sure.
I'm looking for some insight into the best way to debug this.

Models:
LiteHRNet pt trace
LiteHRNet tiny tflite

Unsupported ops: aten::copy_

Thanks for your excellent work. When converting the model from PyTorch to TFLite, the copy op is not supported. The error follows:
ERROR (tinynn.converter.base) Unsupported ops: aten::copy_
Traceback (most recent call last):
File "export_tf.py", line 25, in
converter.convert()
File "/home/demo/miniconda3/lib/python3.8/site-packages/TinyNeuralNetwork-0.1.0-py3.8.egg/tinynn/converter/base.py", line 269, in convert
raise Exception("Cannot continue due to fatal error")
Exception: Cannot continue due to fatal error

sample netadapt pruner fails

I tried examples/pruner/netadapt/netadapt_prune.py and got the following errors:

INFO (tinynn.prune.netadapt_pruner) Global Target/Initial FLOPS: 437178624/582904832
INFO (tinynn.prune.netadapt_pruner) Start iteration 1
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
obj = _ForkingPickler.dumps(obj)
File "/usr/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'OneShotChannelPruner.init..'
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
obj = _ForkingPickler.dumps(obj)
File "/usr/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'OneShotChannelPruner.init..'
INFO (tinynn.prune.netadapt_pruner) Init pool process with cuda id 0

The process then hangs after displaying the above messages.

unrecognized tensor type NoneType

The conversion code I am using is as follows:

import os
import torch
CURRENT_PATH = os.path.abspath(os.path.dirname(__file__))
print(CURRENT_PATH)
from tinynn.converter import TFLiteConverter

model=torch.jit.load("netG.pt")
model.eval()
print(type(model))
dummy_input = torch.rand((1, 3, 512, 512))
output_path = os.path.join(CURRENT_PATH, 'out', 'mbv1_512.tflite')
converter = TFLiteConverter(model, dummy_input, output_path)
converter.convert()

But the result I get is:

Traceback (most recent call last):
  File "convert_torch_2tf.py", line 14, in <module>
    converter.convert()
  File "/AN/envs/TinyNeuralNetwork/lib/python3.6/site-packages/TinyNeuralNetwork-0.1.0.20220509160335+35e27d3e883f4b7829a994ed2563a438d1c90efd-py3.6.egg/tinynn/converter/base.py", line 374, in convert
    self.init_operations()
  File "/AN/envs/TinyNeuralNetwork/lib/python3.6/site-packages/TinyNeuralNetwork-0.1.0.20220509160335+35e27d3e883f4b7829a994ed2563a438d1c90efd-py3.6.egg/tinynn/converter/base.py", line 339, in init_operations
    converter.parse(node, attrs, args, self.common_graph)
  File "/AN/envs/TinyNeuralNetwork/lib/python3.6/site-packages/TinyNeuralNetwork-0.1.0.20220509160335+35e27d3e883f4b7829a994ed2563a438d1c90efd-py3.6.egg/tinynn/converter/operators/torch/aten.py", line 198, in parse
    inputs = [self.find_or_create_input(i, graph_converter) for i in range(5)]
  File "/AN/envs/TinyNeuralNetwork/lib/python3.6/site-packages/TinyNeuralNetwork-0.1.0.20220509160335+35e27d3e883f4b7829a994ed2563a438d1c90efd-py3.6.egg/tinynn/converter/operators/torch/aten.py", line 198, in <listcomp>
    inputs = [self.find_or_create_input(i, graph_converter) for i in range(5)]
  File "//AN/envs/TinyNeuralNetwork/lib/python3.6/site-packages/TinyNeuralNetwork-0.1.0.20220509160335+35e27d3e883f4b7829a994ed2563a438d1c90efd-py3.6.egg/tinynn/converter/operators/torch/base.py", line 176, in find_or_create_input
    return tfl.Tensor(tensor, name, has_buffer=True, asymmetric=self.asymmetric, q_type=self.q_type)
  File "/AN/envs/TinyNeuralNetwork/lib/python3.6/site-packages/TinyNeuralNetwork-0.1.0.20220509160335+35e27d3e883f4b7829a994ed2563a438d1c90efd-py3.6.egg/tinynn/converter/operators/tflite/base.py", line 247, in __init__
    assert False, f"unrecognized tensor type {type(tensor).__name__}"
AssertionError: unrecognized tensor type NoneType

I added a print statement and found that my tensor is None, which is why the error is raised. What could be the cause of this?

Conversion to tflite fails for a specific network

I applied QAT to a model and the final conversion to tflite fails with the following error messages:

torch.jit._trace.TracingCheckError: Tracing failed sanity checks!
ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code.
Node:
%data_1 : Tensor = prim::Constantvalue= # out-2022-01-17-08-27-39/detector_qat.py:259:0
Source Location:
out-2022-01-17-08-27-39/detector_qat.py(259): forward
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py(709): _slow_forward
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py(725): _call_impl
/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py(940): trace_module
/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py(742): trace
/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+e35ef92faba445830bb6156c916b1e838801c07e-py3.6.egg/tinynn/converter/base.py(97): init_jit_graph
/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+e35ef92faba445830bb6156c916b1e838801c07e-py3.6.egg/tinynn/converter/base.py(281): convert
main.py(133): main
main.py(193):
Comparison exception: promoteTypes with quantized numbers is not handled yet; figure out what the correct rules should be, offending types: QUInt8 Float

Please refer to the attachment. Thanks.
test.zip

"quantized::instance_norm" Support

Using TFLiteConverter to convert a quantized model containing InstanceNorm2d like below:

converter = TFLiteConverter(model, dummy_input, tflite_path=args.output)
converter.convert()

And the error occurred as below:

ERROR (tinynn.converter.base) Unsupported ops: quantized::instance_norm
Traceback (most recent call last):
  File "export/export_model_to_tflite.py", line 131, in <module>
    main()
  File "export/export_model_to_tflite.py", line 117, in main
    converter.convert()
  File "/root/miniconda3/lib/python3.7/site-packages/tinynn/converter/base.py", line 269, in convert
    raise Exception("Cannot continue due to fatal error")
Exception: Cannot continue due to fatal error

Looks like the op "quantized::instance_norm" is unsupported. Could you add support for it?
Many thanks

How to perform per-channel QAT?

Hi, I checked the QAT model generated by tinynn and it seems that QAT is done with a per-tensor scheme. I am very concerned about accuracy loss, so I wonder if there's any way to do per-channel QAT. Thanks.

transformer compression roadmap

Hi, just wondering whether vision transformers will be supported. Many vision transformers are worth supporting, such as ViT, Swin Transformer, or end-to-end object detectors like DETR.

Is support planned for the future?

Error from `avg_pool3d` & `Conv3D ` in 3D_CNN

Great repo! However, when I use convert.py to convert a 3D-CNN classification model trained in PyTorch into a .tflite model, errors occur:

Problem 1: x = F.avg_pool3d(x, x.data.size()[-3:]) is not supported

ERROR (tinynn.converter.base) Unsupported ops: aten::avg_pool3d

Problem 2: I squeezed out the temporal dimension, but an error still occurs:

Traceback (most recent call last):
  File "/home/leovin/Dynamic Gesture Detection/3D-CNNs/TinyNeuralNetwork/examples/converter/convert_for_3DCNNs.py", line 50, in <module>
    main_worker()
  File "/home/leovin/Dynamic Gesture Detection/3D-CNNs/TinyNeuralNetwork/examples/converter/convert_for_3DCNNs.py", line 46, in main_worker
    converter.convert()
  File "/home/leovin/anaconda3/envs/dgr/lib/python3.9/site-packages/TinyNeuralNetwork-0.1.20220310142810-py3.9.egg/tinynn/converter/base.py", line 347, in convert
    optimizer.optimize()
  File "/home/leovin/anaconda3/envs/dgr/lib/python3.9/site-packages/TinyNeuralNetwork-0.1.20220310142810-py3.9.egg/tinynn/converter/operators/optimize.py", line 1377, in optimize
    self.transform_graph()
  File "/home/leovin/anaconda3/envs/dgr/lib/python3.9/site-packages/TinyNeuralNetwork-0.1.20220310142810-py3.9.egg/tinynn/converter/operators/optimize.py", line 210, in transform_graph
    op.transform(self.graph, mapping)
  File "/home/leovin/anaconda3/envs/dgr/lib/python3.9/site-packages/TinyNeuralNetwork-0.1.20220310142810-py3.9.egg/tinynn/converter/operators/tflite/transformable.py", line 235, in transform
    assert False, "Only Conv[Transpose]1d/2d is supported"
AssertionError: Only Conv[Transpose]1d/2d is supported

Process finished with exit code 1

Does "Only Conv[Transpose]1d/2d is supported" mean that Conv3d is not supported?

Problem 3: If Conv3d is supported, what should I do?

Looking forward to your reply!

TF 1.13.2 inference Error : Didn't find op for builtin opcode

Hi developers,
Thank you for your great work. For some reason my TFLite inference environment is limited to TensorFlow 1.13.2. When I convert my model to TFLite and run inference with the resulting file, there is a problem:

ValueError: Didn't find op for builtin opcode 'CONV_2D' version '2' Registration failed.

But everything is fine in TF 1.15.5 and the output is perfect. How do I modify the converter code so the output fits TF version 1.13.2? Thank you!

The recent version generates incorrect import operations

I updated to the recent version, which supports per-channel QAT. It seems that some incorrect import statements are generated in the model description script, e.g.:

import self.float_functional_simple_0
import self.float_functional_simple_1
import self.float_functional_simple_10
import self.float_functional_simple_11
import self.float_functional_simple_12
import self.float_functional_simple_2
import self.float_functional_simple_3
import self.float_functional_simple_4
import self.float_functional_simple_5
import self.float_functional_simple_6
import self.float_functional_simple_7
import self.float_functional_simple_8
import self.float_functional_simple_9

Training then fails with the following error:
import self.float_functional_simple_0
ModuleNotFoundError: No module named 'self'

OneShotChannelPruner get error when reshape / view to 3 (or higher) dimension

Hi, I ran into a problem when using OneShotChannelPruner() to prune my model. I checked the MobileNet example; it works fine when reshape or view is used like this:

x = x.reshape(-1,2049) or
view_1 = model_14.view(shape_1[0], -1)

However, when I reshape to higher dimensions, like x = x.reshape(-1, batch_size, 1024), I get the error:
AttributeError: 'TraceNode' object has no attribute 'modifier'
Did I do something wrong? How can I solve this problem? Thanks.
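For reference, a minimal hypothetical repro along the lines described above (layer sizes and shapes are made up):

import torch
import torch.nn as nn

class ReshapeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.fc = nn.Linear(16 * 32 * 32, 1024)

    def forward(self, x):
        x = self.conv(x)
        x = x.reshape(x.shape[0], -1)
        x = self.fc(x)
        # reshaping to three (or more) dimensions is what reportedly triggers
        # "'TraceNode' object has no attribute 'modifier'" in the pruner
        return x.reshape(-1, 2, 512)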

Converting LSTM Error: AssertionError: Some output nodes are missing: ['34', '35']

Hi, I ran into an error when trying your LSTM conversion.
(screenshot of the error)

The code is as follows:

import io
import os
import sys

import torch

sys.path.append('../../')

from tinynn.converter import TFLiteConverter

CURRENT_PATH = os.path.abspath(os.path.dirname(__file__))


def main_worker():
    input_size = 80
    seq_length = 10
    hidden_size = 512
    with torch.no_grad():
        model = torch.nn.LSTM(input_size=input_size, hidden_size=hidden_size, bidirectional=False, bias=False, batch_first=True).cuda()
    model.cpu()
    model.eval()

    dummy_input = torch.rand((1, 10, input_size))

    output_path = os.path.join(CURRENT_PATH, 'out', 'lstm_80.tflite')

    # When converting quantized models, please ensure the quantization backend is set.
    torch.backends.quantized.engine = 'qnnpack'

    # The code section below is used to convert the model to the TFLite format
    # If you want perform dynamic quantization on the float models,
    # you may pass the following arguments.
    #   `asymmetric=True, quantize_target_type='int8', hybrid_quantization_from_float=True, hybrid_per_channel=False`
    converter = TFLiteConverter(model, dummy_input, output_path)
    converter.convert()


if __name__ == '__main__':
    main_worker()

Is this approach correct? If not, is there other reference code for converting LSTM models? Thanks.

QAT LSTM with hidden state (h0, c0) will get error

Hi,

I want to pass a hidden state (h0, c0) to my LSTM model and run QAT. The model runs normally without the quantizer, but I get the error below when I apply QATQuantizer:

ERROR (tinynn.graph.tracer) inputs: ['input_1', 'input_2', 'input_3']
ERROR (tinynn.graph.tracer) forwards: []
ERROR (tinynn.graph.tracer) outputs: []
ERROR (tinynn.graph.tracer) constants: []

The attachment is an example file, thanks.
torch_lstm_test.zip

Loading model weights in quantizer.quantize() reports Missing key(s), Unexpected key(s), and size mismatch

main_worker(args)

File "post_train_quantization.py", line 78, in main_worker
ptq_model = quantizer.quantize()
File "/home/yaoxinghua/miniconda3/envs/yolo/lib/python3.7/site-packages/TinyNeuralNetwork-0.1.0.20220512170349+d0053782b9ca90a8554d211660f82c7da1e36962-py3.7.egg/tinynn/graph/quantization/quantizer.py", line 194, in quantize
rewritten_model.load_state_dict(torch.load(model_weights_path))
File "/home/yaoxinghua/miniconda3/envs/yolo/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Stereo_qat:
Missing key(s) in state_dict: "stereo_stem_4_0.weight", "stereo_stem_4_0.bias", "stereo_stem_4_0_1.weight", "stereo_stem_4_0_1.bias", "stereo_cost_agg_conv_agg_1_1_activate.weight", "stereo_cost_agg_conv_agg_1_1_activate.bias", "stereo_cost_agg_conv_agg_1_1_activate.running_mean", "stereo_cost_agg_conv_agg_1_1_activate.running_var".
Unexpected key(s) in state_dict: "stereo_feature_block0_0_0_conv_dw.weight", "stereo_feature_block0_0.0.conv_dw.weight", "stereo_feature_block0_0.0.bn1.weight", "stereo_feature_block0_0.0.bn1.bias", "stereo_feature_block0_0.0.bn1.running_mean", "stereo_feature_block0_0.0.bn1.running_var", "stereo_feature_block0_0.0.bn1.num_batches_tracked", "stereo_feature_block0_0.0.conv_pw.weight", "stereo_feature_block0_0.0.bn2.weight", "stereo_feature_block0_0.0.bn2.bias", "stereo_feature_block0_0.0.bn2.running_mean", "stereo_feature_block0_0.0.bn2.running_var", "stereo_feature_block0_0.0.bn2.num_batches_tracked", "stereo_feature_block0_0_0_conv_dw_1.weight", "stereo_feature_block0_0_1.0.conv_dw.weight", "stereo_feature_block0_0_1.0.bn1.weight", "stereo_feature_block0_0_1.0.bn1.bias", "stereo_feature_block0_0_1.0.bn1.running_mean", "stereo_feature_block0_0_1.0.bn1.running_var", "stereo_feature_block0_0_1.0.bn1.num_batches_tracked", "stereo_feature_block0_0_1.0.conv_pw.weight", "stereo_feature_block0_0_1.0.bn2.weight", "stereo_feature_block0_0_1.0.bn2.bias", "stereo_feature_block0_0_1.0.bn2.running_mean", "stereo_feature_block0_0_1.0.bn2.running_var", "stereo_feature_block0_0_1.0.bn2.num_batches_tracked", "stereo_stem_4_0.conv.weight", "stereo_stem_4_0.bn.weight", "stereo_stem_4_0.bn.bias", "stereo_stem_4_0.bn.running_mean", "stereo_stem_4_0.bn.running_var", "stereo_stem_4_0.bn.num_batches_tracked", "stereo_stem_4_0_1.conv.weight", "stereo_stem_4_0_1.bn.weight", "stereo_stem_4_0_1.bn.bias", "stereo_stem_4_0_1.bn.running_mean", "stereo_stem_4_0_1.bn.running_var", "stereo_stem_4_0_1.bn.num_batches_tracked".
size mismatch for stereo_cost_agg_conv_skip_1_bn.weight: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([960]).
size mismatch for stereo_cost_agg_conv_skip_1_bn.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([960]).
size mismatch for stereo_cost_agg_conv_skip_1_bn.running_mean: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([960]).
size mismatch for stereo_cost_agg_conv_skip_1_bn.running_var: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([960]).

Symmetric int8, but -128 appears in a convolution weight when using 16x8 quantization

Run the code below:

if __name__ == '__main__':

    with model_tracer():
        
        model = QfMobileNet()
        model.eval()
        model.softmax = nn.LogSoftmax(dim=1)
        
        dummy_input = torch.rand((1, 1, 135, 240))
        quantizer = QATQuantizer(
            model, dummy_input, work_dir=output_dir, config={
                'backend': "qnnpack", 
                'force_overwrite': True, 
                'asymmetric': False, 
                'per_tensor': False, 
                'rewrite_graph': True
            }
        )
        qat_model = quantizer.quantize()
        
        device = get_device()
        qat_model.to(device=device)
        
        context = DLContext()
        context.device = device
        context.train_loader = training_loader
        context.val_loader = testing_loader
        context.max_epoch = 100
        context.criterion = nn.NLLLoss()
        context.optimizer = torch.optim.Adam(qat_model.parameters(), lr=0.001)
        if isinstance(context.criterion, nn.Module):
            context.criterion = context.criterion.to(device=context.device)

        qat_model(torch.randn(1, 1, 135, 240).cuda())

        with torch.no_grad():
            qat_model.eval()
            qat_model.cpu()
            qat_model.softmax = nn.Softmax(dim=1)
            qat_model = torch.quantization.convert(qat_model)
            torch.backends.quantized.engine = 'qnnpack'
            converter = TFLiteConverter(qat_model, dummy_input, tflite_path=f'{output_dir}/qat.tflite', quantize_target_type='int16', strict_symmetric_check=True)
            converter.convert()

Then I got qat.tflite, but -128 appears in the convolution weight whose input name is "input.4_te_transform_1_te_transform_2", even though asymmetric is set to False (the range should be [-127, 127]).

(screenshot of the weight tensor in the TFLite model)

qat.tflite is attached, compressed as qat.zip.
