Comments (8)

vshampor commented on May 29, 2024

Seems to be related to #117.

Also, it looks like the NNCF path used per-channel activation quantization, while the POT path used per-tensor quantization. This might be a sign of HW config file misalignment between NNCF and POT (our fault), or of mismatching configurations between POT and NNCF if you did not use a specific HW configuration. As a workaround, try forcing per-tensor quantization in NNCF via the config file parameter ("per_channel": false) and see if it improves the situation.

from nncf.

vshampor commented on May 29, 2024

Try checking out #124 to see if it fixes the FakeQuantize order in NNCF.

seanxxia commented on May 29, 2024

Seems to be related to #117.

Also, it looks like the NNCF path used per-channel activation quantization, while the POT path used per-tensor quantization. This might be a sign of HW config file misalignment between NNCF and POT (our fault), or mismatching configurations between POT and NNCF if you did not use specific HW configuration. Try forcing per-tensor in NNCF via config file parameters ("per_channel": true) as a workaround in NNCF and see if it improves the situation.

Hi, thank you for the fast reply :-)
I used ("hw_config_type": "cpu") in the NNCF config. Did you mean that I should add ("per_channel": true) directly to the NNCF config? Thank you.

vshampor commented on May 29, 2024

@summer110669 - sorry, made a mistake there - should read "per_channel": false, of course. I edited the comment so as not to mislead future readers. Add the "per_channel": false line at the same level in the .json structure as the "algorithm": "quantization" key-value pair.

seanxxia commented on May 29, 2024

I added the "per_channel": false in config as you said, and here's the error:
ERROR:nncf:Invalid NNCF config supplied!
Traceback (most recent call last):
File "train.py", line 247, in <module>
main(args)
File "train.py", line 160, in main
nncf_config = NNCFConfig.from_json(args.nncf_config)
File "/usr/local/lib/python3.6/dist-packages/nncf/config.py", line 45, in from_json
NNCFConfig.validate(loaded_json)
File "/usr/local/lib/python3.6/dist-packages/nncf/config.py", line 81, in validate
validate_single_compression_algo_schema(compression_section)
File "/usr/local/lib/python3.6/dist-packages/nncf/config_schema.py", line 595, in validate_single_compression_algo_schema
raise type(e)("For algorithm: '{}'\n".format(algo_name) + str(e)).with_traceback(sys.exc_info()[2])
File "/usr/local/lib/python3.6/dist-packages/nncf/config_schema.py", line 592, in validate_single_compression_algo_schema
jsonschema.validate(single_compression_algo_dict, schema=REF_VS_ALGO_SCHEMA[algo_name])
File "/usr/local/lib/python3.6/dist-packages/jsonschema/validators.py", line 934, in validate
raise error
jsonschema.exceptions.ValidationError: For algorithm: 'quantization'
Additional properties are not allowed ('per_channel' was unexpected)

Failed validating 'additionalProperties' in schema:
{'additionalProperties': False,
'properties': {'activations': {'additionalProperties': False,
'description': 'Constraints to be '
'applied to model '
'activations '
'quantization only. '
'Overrides higher-level '
'settings.',
'properties': {'bits': {'description': 'Bitwidth '
'to '
'quantize '
'to.',
'type': 'number'},
'ignored_scopes': {'description': 'A '
'list '
'of '
'model '
'control '
'flow '
'graph '
'node '
'scopes '
'to '
'be '
'ignored '
'for '
'this '
'operation '
'- '
'functions '
'as '
'a '
"'blacklist'. "
'Optional.',
'oneOf': [{'items': {'type': 'string'},
'type': 'array'},
{'type': 'string'}]},
'mode': {'description': 'Mode '
'of '
'quantization',
'type': 'string'},
'per_channel': {'description': 'Whether '
'to '
'quantize '
'inputs '
'per '
'channel '
'(i.e. '
'per '
'0-th '
'dimension '
'for '
'weight '
'quantization, '
'and '
'per '
'1-st '
'dimension '
'for '
'activation '
'quantization)',
'type': 'boolean'},
'signed': {'description': 'Whether '
'to '
'use '
'signed '
'or '
'unsigned '
'input/output '
'values '
'for '
'quantization. '
'If '
'specified '
'as '
'unsigned '
'and '
'the '
'input '
'values '
'during '
'initialization '
'have '
'differing '
'signs, '
'will '
'reset '
'to '
'performing '
'signed '
'quantization '
'instead.',
'type': 'boolean'},
'target_scopes': {'description': 'A '
'list '
'of '
'model '
'control '
'flow '
'graph '
'node '
'scopes '
'to '
'be '
'considered '
'for '
'this '
'operation '
'- '
'functions '
'as '
'a '
"'whitelist'. "
'Optional.',
'oneOf': [{'items': {'type': 'string'},
'type': 'array'},
{'type': 'string'}]}},
'type': 'object'},
'algorithm': {'const': 'quantization'},
'export_to_onnx_standard_ops': {'description': 'Determines '
'how '
'should '
'the '
'additional '
'quantization '
'operations '
'be '
'exported '
'into '
'the '
'ONNX '
'format. '
'Set '
'this '
'to '
'false '
'for '
'export '
'to '
'OpenVINO-supported '
'FakeQuantize '
'ONNX, '
'or to '
'true '
'for '
'export '
'to '
'ONNX '
'standard '
'QuantizeLinear-DequantizeLinear '
'node '
'pairs '
'(8-bit '
'quantization '
'only '
'in the '
'latter '
'case). '
'Default: '
'false',
'type': 'boolean'},
'ignored_scopes': {'description': 'A list of model '
'control flow graph '
'node scopes to be '
'ignored for this '
'operation - '
'functions as a '
"'blacklist'. "
'Optional.',
'items': {'type': 'string'},
'type': ['array', 'string']},
'initializer': {'additionalProperties': False,
'properties': {'batchnorm_adaptation': {'additionalProperties': False,
'properties': {'num_bn_adaptation_steps': {'description': 'Number '
'of '
'batches '
'from '
'the '
'training '
'dataset '
'to '
'use '
'for '
'model '
'inference '
'during '
'the '
'BatchNorm '
'statistics '
'adaptation '
'procedure '
'for '
'the '
'compressed '
'model',
'type': 'number'},
'num_bn_forget_steps': {'description': 'Number '
'of '
'batches '
'from '
'the '
'training '
'dataset '
'to '
'use '
'for '
'model '
'inference '
'during '
'the '
'BatchNorm '
'statistics '
'adaptation '
'in '
'the '
'initial '
'statistics '
'forgetting '
'step',
'type': 'number'}},
'type': 'object'},
'precision': {'additionalProperties': False,
'properties': {'bits': {'description': 'A '
'list '
'of '
'bitwidth '
'to '
'choose '
'from '
'when '
'performing '
'precision '
'initialization.',
'examples': [[4,
8]],
'items': {'type': 'number'},
'type': 'array'},
'bitwidth_per_scope': {'description': 'Manual '
'settings '
'for '
'the '
'quantizer '
'bitwidths. '
'Scopes '
'are '
'used '
'to '
'identify '
'the '
'quantizers.',
'items': {'description': 'A '
'tuple '
'of '
'a '
'bitwidth '
'and '
'a '
'scope '
'of '
'the '
'quantizer '
'to '
'assign '
'the '
'bitwidth '
'to.',
'items': [{'type': 'number'},
{'type': 'string'}],
'type': 'array'},
'type': 'array'},
'iter_number': {'description': 'Maximum '
'number '
'of '
'iterations '
'of '
'Hutchinson '
'algorithm '
'to '
'Estimate '
'Hessian '
'trace, '
'200 '
'by '
'default',
'type': 'number'},
'num_data_points': {'description': 'Number '
'of '
'data '
'points '
'to '
'iteratively '
'estimate '
'Hessian '
'trace, '
'200 '
'by '
'default.',
'type': 'number'},
'tolerance': {'description': 'Minimum '
'relative '
'tolerance '
'for '
'stopping '
'the '
'Hutchinson '
'algorithm. '
"It's "
'calculated '
'between '
'mean '
'average '
'trace '
'from '
'previous '
'iteration '
'and '
'current '
'one. '
'1e-5 '
'by '
'defaultbitwidth_per_scope',
'type': 'number'},
'type': {'description': 'Type '
'of '
'precision '
'initialization.',
'type': 'string'}},
'type': 'object'},
'range': {'additionalProperties': False,
'properties': {'max_percentile': {'description': 'For '
"'percentile' "
'type '
'- '
'specify '
'the '
'percentile '
'of '
'input '
'value '
'histograms '
'to '
'be '
'set '
'as '
'the '
'initial '
'value '
'for '
'maximum '
'quantizer '
'input',
'type': 'number'},
'min_percentile': {'description': 'For '
"'percentile' "
'type '
'- '
'specify '
'the '
'percentile '
'of '
'input '
'value '
'histograms '
'to '
'be '
'set '
'as '
'the '
'initial '
'value '
'for '
'minimum '
'quantizer '
'input',
'type': 'number'},
'num_init_steps': {'description': 'Number '
'of '
'batches '
'from '
'the '
'training '
'dataset '
'to '
'consume '
'as '
'sample '
'model '
'inputs '
'for '
'purposes '
'of '
'setting '
'initial '
'minimum '
'and '
'maximum '
'quantization '
'ranges',
'type': 'number'},
'type': {'description': 'Type '
'of '
'the '
'initializer '
'- '
'determines '
'which '
'statistics '
'gathered '
'during '
'initialization '
'will '
'be '
'used '
'to '
'initialize '
'the '
'quantization '
'ranges',
'type': 'string'}},
'type': 'object'}},
'type': 'object'},
'params': {'additionalProperties': False,
'properties': {'activations_quant_start_epoch': {'description': 'Epoch '
'to '
'start '
'binarizing '
'activations',
'type': 'number'},
'base_lr': {'description': 'Initial '
'value '
'of '
'learning '
'rate',
'type': 'number'},
'base_wd': {'description': 'Initial '
'value '
'of '
'weight '
'decay',
'type': 'number'},
'batch_multiplier': {'description': 'Gradients '
'will '
'be '
'accumulated '
'for '
'this '
'number '
'of '
'batches '
'before '
'doing '
'a '
"'backward' "
'call. '
'Increasing '
'this '
'may '
'improve '
'training '
'quality, '
'since '
'binarized '
'networks '
'exhibit '
'noisy '
'gradients '
'requiring '
'larger '
'batch '
'sizes '
'than '
'could '
'be '
'accomodated '
'by '
'GPUs',
'type': 'number'},
'disable_wd_start_epoch': {'description': 'Epoch '
'to '
'disable '
'weight '
'decay '
'in '
'the '
'optimizer',
'type': 'number'},
'lr_poly_drop_duration_epochs': {'description': 'Duration, '
'in '
'epochs, '
'of '
'the '
'learning '
'rate '
'dropping '
'process.',
'type': 'number'},
'lr_poly_drop_start_epoch': {'description': 'Epoch '
'to '
'start '
'dropping '
'the '
'learning '
'rate',
'type': 'number'},
'weights_quant_start_epoch': {'description': 'Epoch '
'to '
'start '
'binarizing '
'weights',
'type': 'number'}},
'type': 'object'},
'quantizable_subgraph_patterns': {'description': 'Each '
'sub-list '
'in '
'this '
'list '
'will '
'correspond '
'to a '
'sequence '
'of '
'operations '
'in '
'the '
'model '
'control '
'flow '
'graph '
'that '
'will '
'have '
'a '
'quantizer '
'appended '
'at '
'the '
'end '
'of '
'the '
'sequence',
'examples': [['cat',
'batch_norm'],
'h_swish'],
'items': {'items': {'type': 'string'},
'type': ['array',
'string']},
'type': 'array'},
'quantize_inputs': {'default': True,
'description': 'Whether the model '
'inputs should be '
'immediately '
'quantized prior to '
'any other model '
'operations.',
'type': 'boolean'},
'quantize_outputs': {'default': False,
'description': 'Whether the model '
'outputs should be '
'additionally '
'quantized.',
'type': 'boolean'},
'scope_overrides': {'description': 'This option is '
'used to specify '
'overriding '
'quantization '
'constraints for '
'specific '
'scope,e.g. in case '
'you need to '
'quantize a single '
'operation '
'differently than '
'the rest of the '
'model.',
'patternProperties': {'.*': {'additionalProperties': False,
'properties': {'bits': {'description': 'Bitwidth '
'to '
'quantize '
'to.',
'type': 'number'},
'mode': {'description': 'Mode '
'of '
'quantization',
'type': 'string'},
'per_channel': {'description': 'Whether '
'to '
'quantize '
'inputs '
'per '
'channel '
'(i.e. '
'per '
'0-th '
'dimension '
'for '
'weight '
'quantization, '
'and '
'per '
'1-st '
'dimension '
'for '
'activation '
'quantization)',
'type': 'boolean'},
'signed': {'description': 'Whether '
'to '
'use '
'signed '
'or '
'unsigned '
'input/output '
'values '
'for '
'quantization. '
'If '
'specified '
'as '
'unsigned '
'and '
'the '
'input '
'values '
'during '
'initialization '
'have '
'differing '
'signs, '
'will '
'reset '
'to '
'performing '
'signed '
'quantization '
'instead.',
'type': 'boolean'}},
'type': 'object'}},
'type': 'object'},
'target_scopes': {'description': 'A list of model '
'control flow graph '
'node scopes to be '
'considered for this '
'operation - '
'functions as a '
"'whitelist'. "
'Optional.',
'items': {'type': 'string'},
'type': ['array', 'string']},
'weights': {'additionalProperties': False,
'description': 'Constraints to be applied '
'to model weights '
'quantization only. '
'Overrides higher-level '
'settings.',
'properties': {'bits': {'description': 'Bitwidth '
'to '
'quantize '
'to.',
'type': 'number'},
'ignored_scopes': {'description': 'A '
'list '
'of '
'model '
'control '
'flow '
'graph '
'node '
'scopes '
'to '
'be '
'ignored '
'for '
'this '
'operation '
'- '
'functions '
'as '
'a '
"'blacklist'. "
'Optional.',
'oneOf': [{'items': {'type': 'string'},
'type': 'array'},
{'type': 'string'}]},
'mode': {'description': 'Mode '
'of '
'quantization',
'type': 'string'},
'per_channel': {'description': 'Whether '
'to '
'quantize '
'inputs '
'per '
'channel '
'(i.e. '
'per '
'0-th '
'dimension '
'for '
'weight '
'quantization, '
'and '
'per '
'1-st '
'dimension '
'for '
'activation '
'quantization)',
'type': 'boolean'},
'signed': {'description': 'Whether '
'to '
'use '
'signed '
'or '
'unsigned '
'input/output '
'values '
'for '
'quantization. '
'If '
'specified '
'as '
'unsigned '
'and '
'the '
'input '
'values '
'during '
'initialization '
'have '
'differing '
'signs, '
'will '
'reset '
'to '
'performing '
'signed '
'quantization '
'instead.',
'type': 'boolean'},
'target_scopes': {'description': 'A '
'list '
'of '
'model '
'control '
'flow '
'graph '
'node '
'scopes '
'to '
'be '
'considered '
'for '
'this '
'operation '
'- '
'functions '
'as '
'a '
"'whitelist'. "
'type': 'array'},
{'type': 'string'}]}},
'type': 'object'}},
'required': ['algorithm'],
'type': 'object'}

On instance:
{'algorithm': 'quantization',
'initializer': {'range': {'num_init_steps': 50}},
'per_channel': False}
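
The rejection above follows from 'additionalProperties': False at the top level of the quantization schema: only the keys listed under 'properties' are accepted there, and per_channel appears only inside 'activations', 'weights', and 'scope_overrides'. Below is a minimal standard-library sketch of that check. The function name unexpected_keys is hypothetical, and this is not how NNCF itself validates (the traceback shows it going through jsonschema); it only mirrors the top-level rule.

```python
# Top-level keys accepted by the 'quantization' schema, copied from the
# 'properties' section of the validation error above.
ALLOWED_TOP_LEVEL_KEYS = {
    "activations", "algorithm", "export_to_onnx_standard_ops",
    "ignored_scopes", "initializer", "params",
    "quantizable_subgraph_patterns", "quantize_inputs", "quantize_outputs",
    "scope_overrides", "target_scopes", "weights",
}


def unexpected_keys(section):
    """Return top-level keys that 'additionalProperties': False would reject.

    Hypothetical helper for illustration; it checks the top level only.
    """
    return sorted(set(section) - ALLOWED_TOP_LEVEL_KEYS)


# The instance from the error message above.
rejected = {
    "algorithm": "quantization",
    "initializer": {"range": {"num_init_steps": 50}},
    "per_channel": False,
}
print(unexpected_keys(rejected))
```

Nesting the same flag under "activations" leaves no unexpected top-level key, which is consistent with where the schema dump places per_channel.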


I also tried 'per_channel': False in the config, but it still fails. The error is as follows:
Traceback (most recent call last):
File "train.py", line 247, in <module>
main(args)
File "train.py", line 160, in main
nncf_config = NNCFConfig.from_json(args.nncf_config)
File "/usr/local/lib/python3.6/dist-packages/nncf/config.py", line 44, in from_json
loaded_json = json.load(f)
File "/usr/local/lib/python3.6/dist-packages/jstyleson.py", line 127, in load
return loads(fp.read(), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/jstyleson.py", line 123, in loads
return json.loads(dispose(text), **kwargs)
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 10 column 18 (char 215)
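
This second failure looks like a JSON syntax problem rather than an NNCF one: JSON spells booleans as lowercase true/false, so a Python-style False in the file cannot be parsed. Below is a small sketch using the standard json module, which is what jstyleson hands the text to in the traceback. The config strings and the helper name try_parse are illustrative assumptions, not the user's actual file.

```python
import json

# Illustrative config fragments; only the spelling of the boolean differs.
python_style = '{"algorithm": "quantization", "per_channel": False}'
json_style = '{"algorithm": "quantization", "per_channel": false}'


def try_parse(text):
    """Return the parsed object, or the decoder's error message on failure."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        return "JSONDecodeError: " + err.msg


print(try_parse(python_style))  # fails: 'False' is not a JSON literal
print(try_parse(json_style))    # parses; the value becomes Python's False
```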

vshampor commented on May 29, 2024

Then try "activations": { "per_channel": false } at the same JSON level.
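
One way to avoid hand-editing mistakes is to generate the section from Python. The sketch below takes the instance shown in the validation error, moves per_channel under "activations" as suggested here, and serializes it. It covers only this compression section (a full NNCF config would have further keys around it), and the thread does not confirm whether this setting resolves the original mismatch with POT.

```python
import json

compression_section = {
    "algorithm": "quantization",
    "initializer": {"range": {"num_init_steps": 50}},
    # per_channel sits inside "activations", not at the top level.
    "activations": {"per_channel": False},
}

# json.dumps converts Python's False to the JSON literal false.
text = json.dumps(compression_section, indent=4)
print(text)
```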

vshampor commented on May 29, 2024

@summer110669 have you had any luck with this yet?

rosspleban commented on May 29, 2024

@summer110669 Have you been able to improve nncf inference time?
