Comments (8)
Seems to be related to #117.
Also, it looks like the NNCF path used per-channel activation quantization, while the POT path used per-tensor quantization. This might be a sign of HW config file misalignment between NNCF and POT (our fault), or of mismatched configurations between POT and NNCF if you did not use a specific HW configuration. As a workaround, try forcing per-tensor quantization in NNCF via the config file parameter "per_channel": false and see if it improves the situation.
from nncf.
Try checking out #124 to see if it fixes the FakeQuantize order in NNCF.
Hi, thank you for the fast reply :-)
I used "hw_config_type": "cpu" in the NNCF config. Did you mean that I should directly add "per_channel": true to the NNCF config? Thank you.
@summer110669 - sorry, made a mistake there - should read "per_channel": false, of course. I edited the comment so as not to mislead future readers. Add the "per_channel": false line at the same level in the .json structure as the "algorithm": "quantization" key-value pair.
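For illustration, the placement being described would put the key directly alongside "algorithm" in the compression section (a sketch with hypothetical surrounding keys; note that the schema validation output later in this thread ends up rejecting "per_channel" at this level):

```json
"compression": {
    "algorithm": "quantization",
    "per_channel": false
}
```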
I added "per_channel": false to the config as you said, and here's the error:
ERROR:nncf:Invalid NNCF config supplied!
Traceback (most recent call last):
File "train.py", line 247, in <module>
main(args)
File "train.py", line 160, in main
nncf_config = NNCFConfig.from_json(args.nncf_config)
File "/usr/local/lib/python3.6/dist-packages/nncf/config.py", line 45, in from_json
NNCFConfig.validate(loaded_json)
File "/usr/local/lib/python3.6/dist-packages/nncf/config.py", line 81, in validate
validate_single_compression_algo_schema(compression_section)
File "/usr/local/lib/python3.6/dist-packages/nncf/config_schema.py", line 595, in validate_single_compression_algo_schema
raise type(e)("For algorithm: '{}'\n".format(algo_name) + str(e)).with_traceback(sys.exc_info()[2])
File "/usr/local/lib/python3.6/dist-packages/nncf/config_schema.py", line 592, in validate_single_compression_algo_schema
jsonschema.validate(single_compression_algo_dict, schema=REF_VS_ALGO_SCHEMA[algo_name])
File "/usr/local/lib/python3.6/dist-packages/jsonschema/validators.py", line 934, in validate
raise error
jsonschema.exceptions.ValidationError: For algorithm: 'quantization'
Additional properties are not allowed ('per_channel' was unexpected)
Failed validating 'additionalProperties' in schema:
{'additionalProperties': False,
'properties': {'activations': {'additionalProperties': False,
'description': 'Constraints to be '
'applied to model '
'activations '
'quantization only. '
'Overrides higher-level '
'settings.',
'properties': {'bits': {'description': 'Bitwidth '
'to '
'quantize '
'to.',
'type': 'number'},
'ignored_scopes': {'description': 'A '
'list '
'of '
'model '
'control '
'flow '
'graph '
'node '
'scopes '
'to '
'be '
'ignored '
'for '
'this '
'operation '
'- '
'functions '
'as '
'a '
"'blacklist'. "
'Optional.',
'oneOf': [{'items': {'type': 'string'},
'type': 'array'},
{'type': 'string'}]},
'mode': {'description': 'Mode '
'of '
'quantization',
'type': 'string'},
'per_channel': {'description': 'Whether '
'to '
'quantize '
'inputs '
'per '
'channel '
'(i.e. '
'per '
'0-th '
'dimension '
'for '
'weight '
'quantization, '
'and '
'per '
'1-st '
'dimension '
'for '
'activation '
'quantization)',
'type': 'boolean'},
'signed': {'description': 'Whether '
'to '
'use '
'signed '
'or '
'unsigned '
'input/output '
'values '
'for '
'quantization. '
'If '
'specified '
'as '
'unsigned '
'and '
'the '
'input '
'values '
'during '
'initialization '
'have '
'differing '
'signs, '
'will '
'reset '
'to '
'performing '
'signed '
'quantization '
'instead.',
'type': 'boolean'},
'target_scopes': {'description': 'A '
'list '
'of '
'model '
'control '
'flow '
'graph '
'node '
'scopes '
'to '
'be '
'considered '
'for '
'this '
'operation '
'- '
'functions '
'as '
'a '
"'whitelist'. "
'Optional.',
'oneOf': [{'items': {'type': 'string'},
'type': 'array'},
{'type': 'string'}]}},
'type': 'object'},
'algorithm': {'const': 'quantization'},
'export_to_onnx_standard_ops': {'description': 'Determines '
'how '
'should '
'the '
'additional '
'quantization '
'operations '
'be '
'exported '
'into '
'the '
'ONNX '
'format. '
'Set '
'this '
'to '
'false '
'for '
'export '
'to '
'OpenVINO-supported '
'FakeQuantize '
'ONNX, '
'or to '
'true '
'for '
'export '
'to '
'ONNX '
'standard '
'QuantizeLinear-DequantizeLinear '
'node '
'pairs '
'(8-bit '
'quantization '
'only '
'in the '
'latter '
'case). '
'Default: '
'false',
'type': 'boolean'},
'ignored_scopes': {'description': 'A list of model '
'control flow graph '
'node scopes to be '
'ignored for this '
'operation - '
'functions as a '
"'blacklist'. "
'Optional.',
'items': {'type': 'string'},
'type': ['array', 'string']},
'initializer': {'additionalProperties': False,
'properties': {'batchnorm_adaptation': {'additionalProperties': False,
'properties': {'num_bn_adaptation_steps': {'description': 'Number '
'of '
'batches '
'from '
'the '
'training '
'dataset '
'to '
'use '
'for '
'model '
'inference '
'during '
'the '
'BatchNorm '
'statistics '
'adaptation '
'procedure '
'for '
'the '
'compressed '
'model',
'type': 'number'},
'num_bn_forget_steps': {'description': 'Number '
'of '
'batches '
'from '
'the '
'training '
'dataset '
'to '
'use '
'for '
'model '
'inference '
'during '
'the '
'BatchNorm '
'statistics '
'adaptation '
'in '
'the '
'initial '
'statistics '
'forgetting '
'step',
'type': 'number'}},
'type': 'object'},
'precision': {'additionalProperties': False,
'properties': {'bits': {'description': 'A '
'list '
'of '
'bitwidth '
'to '
'choose '
'from '
'when '
'performing '
'precision '
'initialization.',
'examples': [[4,
8]],
'items': {'type': 'number'},
'type': 'array'},
'bitwidth_per_scope': {'description': 'Manual '
'settings '
'for '
'the '
'quantizer '
'bitwidths. '
'Scopes '
'are '
'used '
'to '
'identify '
'the '
'quantizers.',
'items': {'description': 'A '
'tuple '
'of '
'a '
'bitwidth '
'and '
'a '
'scope '
'of '
'the '
'quantizer '
'to '
'assign '
'the '
'bitwidth '
'to.',
'items': [{'type': 'number'},
{'type': 'string'}],
'type': 'array'},
'type': 'array'},
'iter_number': {'description': 'Maximum '
'number '
'of '
'iterations '
'of '
'Hutchinson '
'algorithm '
'to '
'Estimate '
'Hessian '
'trace, '
'200 '
'by '
'default',
'type': 'number'},
'num_data_points': {'description': 'Number '
'of '
'data '
'points '
'to '
'iteratively '
'estimate '
'Hessian '
'trace, '
'200 '
'by '
'default.',
'type': 'number'},
'tolerance': {'description': 'Minimum '
'relative '
'tolerance '
'for '
'stopping '
'the '
'Hutchinson '
'algorithm. '
"It's "
'calculated '
'between '
'mean '
'average '
'trace '
'from '
'previous '
'iteration '
'and '
'current '
'one. '
'1e-5 '
'by '
'defaultbitwidth_per_scope',
'type': 'number'},
'type': {'description': 'Type '
'of '
'precision '
'initialization.',
'type': 'string'}},
'type': 'object'},
'range': {'additionalProperties': False,
'properties': {'max_percentile': {'description': 'For '
"'percentile' "
'type '
'- '
'specify '
'the '
'percentile '
'of '
'input '
'value '
'histograms '
'to '
'be '
'set '
'as '
'the '
'initial '
'value '
'for '
'maximum '
'quantizer '
'input',
'type': 'number'},
'min_percentile': {'description': 'For '
"'percentile' "
'type '
'- '
'specify '
'the '
'percentile '
'of '
'input '
'value '
'histograms '
'to '
'be '
'set '
'as '
'the '
'initial '
'value '
'for '
'minimum '
'quantizer '
'input',
'type': 'number'},
'num_init_steps': {'description': 'Number '
'of '
'batches '
'from '
'the '
'training '
'dataset '
'to '
'consume '
'as '
'sample '
'model '
'inputs '
'for '
'purposes '
'of '
'setting '
'initial '
'minimum '
'and '
'maximum '
'quantization '
'ranges',
'type': 'number'},
'type': {'description': 'Type '
'of '
'the '
'initializer '
'- '
'determines '
'which '
'statistics '
'gathered '
'during '
'initialization '
'will '
'be '
'used '
'to '
'initialize '
'the '
'quantization '
'ranges',
'type': 'string'}},
'type': 'object'}},
'type': 'object'},
'params': {'additionalProperties': False,
'properties': {'activations_quant_start_epoch': {'description': 'Epoch '
'to '
'start '
'binarizing '
'activations',
'type': 'number'},
'base_lr': {'description': 'Initial '
'value '
'of '
'learning '
'rate',
'type': 'number'},
'base_wd': {'description': 'Initial '
'value '
'of '
'weight '
'decay',
'type': 'number'},
'batch_multiplier': {'description': 'Gradients '
'will '
'be '
'accumulated '
'for '
'this '
'number '
'of '
'batches '
'before '
'doing '
'a '
"'backward' "
'call. '
'Increasing '
'this '
'may '
'improve '
'training '
'quality, '
'since '
'binarized '
'networks '
'exhibit '
'noisy '
'gradients '
'requiring '
'larger '
'batch '
'sizes '
'than '
'could '
'be '
'accomodated '
'by '
'GPUs',
'type': 'number'},
'disable_wd_start_epoch': {'description': 'Epoch '
'to '
'disable '
'weight '
'decay '
'in '
'the '
'optimizer',
'type': 'number'},
'lr_poly_drop_duration_epochs': {'description': 'Duration, '
'in '
'epochs, '
'of '
'the '
'learning '
'rate '
'dropping '
'process.',
'type': 'number'},
'lr_poly_drop_start_epoch': {'description': 'Epoch '
'to '
'start '
'dropping '
'the '
'learning '
'rate',
'type': 'number'},
'weights_quant_start_epoch': {'description': 'Epoch '
'to '
'start '
'binarizing '
'weights',
'type': 'number'}},
'type': 'object'},
'quantizable_subgraph_patterns': {'description': 'Each '
'sub-list '
'in '
'this '
'list '
'will '
'correspond '
'to a '
'sequence '
'of '
'operations '
'in '
'the '
'model '
'control '
'flow '
'graph '
'that '
'will '
'have '
'a '
'quantizer '
'appended '
'at '
'the '
'end '
'of '
'the '
'sequence',
'examples': [['cat',
'batch_norm'],
'h_swish'],
'items': {'items': {'type': 'string'},
'type': ['array',
'string']},
'type': 'array'},
'quantize_inputs': {'default': True,
'description': 'Whether the model '
'inputs should be '
'immediately '
'quantized prior to '
'any other model '
'operations.',
'type': 'boolean'},
'quantize_outputs': {'default': False,
'description': 'Whether the model '
'outputs should be '
'additionally '
'quantized.',
'type': 'boolean'},
'scope_overrides': {'description': 'This option is '
'used to specify '
'overriding '
'quantization '
'constraints for '
'specific '
'scope,e.g. in case '
'you need to '
'quantize a single '
'operation '
'differently than '
'the rest of the '
'model.',
'patternProperties': {'.*': {'additionalProperties': False,
'properties': {'bits': {'description': 'Bitwidth '
'to '
'quantize '
'to.',
'type': 'number'},
'mode': {'description': 'Mode '
'of '
'quantization',
'type': 'string'},
'per_channel': {'description': 'Whether '
'to '
'quantize '
'inputs '
'per '
'channel '
'(i.e. '
'per '
'0-th '
'dimension '
'for '
'weight '
'quantization, '
'and '
'per '
'1-st '
'dimension '
'for '
'activation '
'quantization)',
'type': 'boolean'},
'signed': {'description': 'Whether '
'to '
'use '
'signed '
'or '
'unsigned '
'input/output '
'values '
'for '
'quantization. '
'If '
'specified '
'as '
'unsigned '
'and '
'the '
'input '
'values '
'during '
'initialization '
'have '
'differing '
'signs, '
'will '
'reset '
'to '
'performing '
'signed '
'quantization '
'instead.',
'type': 'boolean'}},
'type': 'object'}},
'type': 'object'},
'target_scopes': {'description': 'A list of model '
'control flow graph '
'node scopes to be '
'considered for this '
'operation - '
'functions as a '
"'whitelist'. "
'Optional.',
'items': {'type': 'string'},
'type': ['array', 'string']},
'weights': {'additionalProperties': False,
'description': 'Constraints to be applied '
'to model weights '
'quantization only. '
'Overrides higher-level '
'settings.',
'properties': {'bits': {'description': 'Bitwidth '
'to '
'quantize '
'to.',
'type': 'number'},
'ignored_scopes': {'description': 'A '
'list '
'of '
'model '
'control '
'flow '
'graph '
'node '
'scopes '
'to '
'be '
'ignored '
'for '
'this '
'operation '
'- '
'functions '
'as '
'a '
"'blacklist'. "
'Optional.',
'oneOf': [{'items': {'type': 'string'},
'type': 'array'},
{'type': 'string'}]},
'mode': {'description': 'Mode '
'of '
'quantization',
'type': 'string'},
'per_channel': {'description': 'Whether '
'to '
'quantize '
'inputs '
'per '
'channel '
'(i.e. '
'per '
'0-th '
'dimension '
'for '
'weight '
'quantization, '
'and '
'per '
'1-st '
'dimension '
'for '
'activation '
'quantization)',
'type': 'boolean'},
'signed': {'description': 'Whether '
'to '
'use '
'signed '
'or '
'unsigned '
'input/output '
'values '
'for '
'quantization. '
'If '
'specified '
'as '
'unsigned '
'and '
'the '
'input '
'values '
'during '
'initialization '
'have '
'differing '
'signs, '
'will '
'reset '
'to '
'performing '
'signed '
'quantization '
'instead.',
'type': 'boolean'},
'target_scopes': {'description': 'A '
'list '
'of '
'model '
'control '
'flow '
'graph '
'node '
'scopes '
'to '
'be '
'considered '
'for '
'this '
'operation '
'- '
'functions '
'as '
'a '
"'whitelist'. "
'Optional.',
'oneOf': [{'items': {'type': 'string'},
'type': 'array'},
{'type': 'string'}]}},
'type': 'object'}},
'required': ['algorithm'],
'type': 'object'}
On instance:
{'algorithm': 'quantization',
'initializer': {'range': {'num_init_steps': 50}},
'per_channel': False}
I also tried 'per_channel': False in the config, but it still fails. The error is as follows:
Traceback (most recent call last):
File "train.py", line 247, in <module>
main(args)
File "train.py", line 160, in main
nncf_config = NNCFConfig.from_json(args.nncf_config)
File "/usr/local/lib/python3.6/dist-packages/nncf/config.py", line 44, in from_json
loaded_json = json.load(f)
File "/usr/local/lib/python3.6/dist-packages/jstyleson.py", line 127, in load
return loads(fp.read(), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/jstyleson.py", line 123, in loads
return json.loads(dispose(text), **kwargs)
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 10 column 18 (char 215)
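(As a side note, an inference from the error rather than something confirmed in the thread: an "Expecting value" JSONDecodeError at a specific line and column usually means a non-JSON literal ended up in the file. JSON requires double quotes and lowercase booleans, i.e. the line must read:

```json
"per_channel": false
```

rather than the Python-style 'per_channel': False.)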
Try "activations": { "per_channel": false } at the same JSON level instead.
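Putting the pieces from this thread together, a minimal sketch of the resulting config might look like the following ("sample_size" and the top-level layout are hypothetical; "num_init_steps": 50 is taken from the config shown in the validation error above):

```json
{
    "input_info": {
        "sample_size": [1, 3, 224, 224]
    },
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            "range": {"num_init_steps": 50}
        },
        "activations": {
            "per_channel": false
        }
    }
}
```

Per the schema dumped above, "per_channel" is only accepted inside the "activations" and "weights" sub-sections of the quantization algorithm, not at its top level.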
@summer110669 have you had any luck with this yet?
@summer110669 Have you been able to improve nncf inference time?