Comments (9)

vshampor commented on May 28, 2024

@sroot0, no problem - in the future you may also post the solution here so that others may use it if they stumble upon this thread 😉

from nncf.

vshampor commented on May 28, 2024

@sroot0, could you resolve the issue on your own? This kind of error is most likely to happen if you did not install NNCF according to the instructions.

sroot0 commented on May 28, 2024

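@sroot0, could you resolve the issue on your own? This kind of error is most likely to happen if you did not install NNCF according to the instructions.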

I solved this problem. I wanted to withdraw the issue, but I didn't succeed. I'm sorry to bother you.

sroot0 commented on May 28, 2024

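@sroot0, could you resolve the issue on your own? This kind of error is most likely to happen if you did not install NNCF according to the instructions.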

@sroot0, no problem - in the future you may also post the solution here so that others may use it if they stumble upon this thread 😉

This is how I solved it: "add the repository root folder to the PYTHONPATH environment variable". I had already installed the package, but I still needed to add it manually. Now there is a new problem:
python main.py -m train --config configs/quantization/test.json --data /home/sroot/work/nncf_pytorch-master/examples/classification/cifar10 --log-dir=../../results/quantizat/ion/resnet18/ --cpu-only
batch_size : 64
checkpoint_save_dir : ../../results/quantizat/ion/resnet18/resnet18_imagenet_int8/2020-08-27__19-36-57
compression : {'algorithm': 'quantization', 'initializer': {'range': {'num_init_steps': 10}}}
config : configs/quantization/test.json
cpu_only : True
current_gpu : None
dataset : None
dataset_dir : /home/sroot/work/nncf_pytorch-master/examples/classification/cifar10
device : cpu
dist_backend : nccl
dist_url : tcp://127.0.0.1:8899
distributed : False
epochs : 2
execution_mode : cpu_only
gpu_id : None
hw_config_type : None
imshow_batch : False
input_info : {'sample_size': [1, 3, 32, 32]}
intermediate_checkpoints_path: ../../results/quantizat/ion/resnet18/resnet18_imagenet_int8/2020-08-27__19-36-57/intermediate_checkpoints
log_dir : ../../results/quantizat/ion/resnet18/resnet18_imagenet_int8/2020-08-27__19-36-57
metrics_dump : None
mode : train
model : resnet18
multiprocessing_distributed: False
name : resnet18_imagenet_int8
nncf_config : {'model': 'resnet18', 'pretrained': True, 'input_info': {'sample_size': [1, 3, 32, 32]}, 'num_classes': 10, 'batch_size': 64, 'epochs': 2, 'optimizer': {'type': 'Adam', 'base_lr': 1e-05, 'schedule_type': 'multistep', 'steps': [5]}, 'compression': {'algorithm': 'quantization', 'initializer': {'range': {'num_init_steps': 10}}}, 'log_dir': '../../results/quantizat/ion/resnet18/resnet18_imagenet_int8/2020-08-27__19-36-57'}
num_classes : 10
optimizer : {'type': 'Adam', 'base_lr': 1e-05, 'schedule_type': 'multistep', 'steps': [5]}
pretrained : True
print_freq : 10
print_step : False
rank : 0
resuming_checkpoint_path : None
save_freq : 5
seed : None
start_epoch : 0
tb : <tensorboardX.writer.SummaryWriter object at 0x7f895bdbcf10>
test_every_n_epochs : 1
to_onnx : None
weights : None
workers : 4
world_size : 1
Loading model: resnet18
Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /home/sroot/.cache/torch/hub/checkpoints/resnet18-5c106cde.pth
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 44.7M/44.7M [00:26<00:00, 1.79MB/s]

Traceback (most recent call last):
  File "main.py", line 539, in <module>
    main(sys.argv[1:])
  File "main.py", line 97, in main
    start_worker(main_worker, config)
  File "/home/sroot/work/nncf_pytorch-master/examples/common/execution.py", line 95, in start_worker
    main_worker(current_gpu=None, config=config)
  File "main.py", line 141, in main_worker
    model = load_model(model_name,
  File "/home/sroot/work/nncf_pytorch-master/examples/common/model_loader.py", line 36, in load_model
    loaded_model = safe_thread_call(load_model_fn)
  File "/home/sroot/work/nncf_pytorch-master/nncf/utils.py", line 282, in safe_thread_call
    result = main_call_fn()
  File "/home/sroot/anaconda3/envs/NNCF/lib/python3.8/site-packages/torchvision/models/resnet.py", line 240, in resnet18
    return _resnet('resnet18', BasicBlock, [2, 2, 2, 2], pretrained, progress,
  File "/home/sroot/anaconda3/envs/NNCF/lib/python3.8/site-packages/torchvision/models/resnet.py", line 228, in _resnet
    model.load_state_dict(state_dict)
  File "/home/sroot/anaconda3/envs/NNCF/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1044, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ResNet:
    size mismatch for fc.weight: copying a param with shape torch.Size([1000, 512]) from checkpoint, the shape in current model is torch.Size([10, 512]).
    size mismatch for fc.bias: copying a param with shape torch.Size([1000]) from checkpoint, the shape in current model is torch.Size([10]).

vshampor commented on May 28, 2024

@sroot0, you are trying to load an ImageNet-pretrained checkpoint (1000 classes) into a CIFAR-10 model (10 classes), which is why loading fails at the last fully connected layer. If you want to fine-tune for CIFAR-10 starting from ImageNet weights, download the .pth manually, set "pretrained": false in the NNCF config .json file, and pass an additional --weights *path_to_resnet18_imagenet_pth* to the main.py launch command line.
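To make the mismatch concrete, here is a minimal sketch that compares parameter shapes by name and separates the entries that can be copied from those that cannot. It is only an illustration, not a description of how NNCF's --weights option works internally. The function name `split_by_shape` and the plain dictionaries of shapes are hypothetical stand-ins for real state dicts; the fc shapes come from the error message above, and conv1.weight is the standard first ResNet-18 layer, added only to show a matching entry.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def split_by_shape(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> Tuple[List[str], List[str]]:
    """Return (loadable, skipped) parameter names.

    A checkpoint entry is loadable only if the model has a parameter
    with the same name and exactly the same shape.
    """
    loadable: List[str] = []
    skipped: List[str] = []
    for name, shape in checkpoint_shapes.items():
        if model_shapes.get(name) == shape:
            loadable.append(name)
        else:
            skipped.append(name)
    return loadable, skipped


# Hypothetical stand-ins for the two state dicts (name -> shape).
imagenet_ckpt = {
    "conv1.weight": (64, 3, 7, 7),
    "fc.weight": (1000, 512),  # 1000 ImageNet classes
    "fc.bias": (1000,),
}
cifar10_model = {
    "conv1.weight": (64, 3, 7, 7),
    "fc.weight": (10, 512),  # 10 CIFAR-10 classes
    "fc.bias": (10,),
}

loadable, skipped = split_by_shape(imagenet_ckpt, cifar10_model)
print("load:", loadable)
print("skip:", skipped)
```

With real PyTorch tensors the shapes would come from `tensor.shape`, and the filtered dictionary could be passed to `model.load_state_dict(filtered, strict=False)`. Note that `strict=False` on its own only tolerates missing or unexpected keys; a shape mismatch such as the one in fc.weight still raises, so the mismatched entries have to be dropped first.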

sroot0 commented on May 28, 2024

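@sroot0, you are trying to load an ImageNet pre-trained checkpoint (1000 classes) into a CIFAR10 model (10 classes), that's why you are getting troubles with the last fully connected layer. Download the .pth manually, set "pretrained": False in the NNCF config .json file and pass an additional --weights *path_to_resnet18_imagenet_pth* to the main.py launch command line, if you want to fine-tune for CIFAR10 starting with ImageNet weights.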

Thank you very much.

sroot0 commented on May 28, 2024

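@sroot0, you are trying to load an ImageNet pre-trained checkpoint (1000 classes) into a CIFAR10 model (10 classes), that's why you are getting troubles with the last fully connected layer. Download the .pth manually, set "pretrained": False in the NNCF config .json file and pass an additional --weights *path_to_resnet18_imagenet_pth* to the main.py launch command line, if you want to fine-tune for CIFAR10 starting with ImageNet weights.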

Sorry, I have been thinking about this for two days, but I still have not solved the problem. Why is the ImageNet checkpoint being loaded? How can the model load a CIFAR-10 checkpoint? Or is it not possible to train from scratch without loading a checkpoint file?

vshampor commented on May 28, 2024

The ImageNet checkpoint is being loaded because this is what is available pretrained in Torchvision. I'm fairly sure that a CIFAR-10 pretrained checkpoint is not available there. ImageNet-pretrained feature extractors have a better chance of being fine-tuned to CIFAR-10 and keeping the accuracy on the target dataset than vice versa. It is also possible to train from scratch without loading a checkpoint file: remove the "pretrained": true entry from the NNCF .json config file, and omit any --weights or --resume parameters from the command line.
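As a rough illustration of the config change for training from scratch, the sketch below removes the "pretrained" entry from a config dictionary and prints the result as JSON. The values are copied from the nncf_config dump in the log earlier in the thread (the run-specific log_dir is left out); the helper name `without_pretrained` is hypothetical and not part of NNCF.

```python
import json
from copy import deepcopy


def without_pretrained(config: dict) -> dict:
    """Return a copy of the config with the "pretrained" entry removed."""
    cfg = deepcopy(config)
    cfg.pop("pretrained", None)  # no error if the key is already absent
    return cfg


# Values taken from the nncf_config dump printed in the log above.
config = {
    "model": "resnet18",
    "pretrained": True,
    "input_info": {"sample_size": [1, 3, 32, 32]},
    "num_classes": 10,
    "batch_size": 64,
    "epochs": 2,
    "optimizer": {
        "type": "Adam",
        "base_lr": 1e-05,
        "schedule_type": "multistep",
        "steps": [5],
    },
    "compression": {
        "algorithm": "quantization",
        "initializer": {"range": {"num_init_steps": 10}},
    },
}

scratch_config = without_pretrained(config)
print(json.dumps(scratch_config, indent=4))
```

The printed JSON could then be saved as the config file passed via --config, with no --weights or --resume arguments on the command line.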

sroot0 commented on May 28, 2024

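The ImageNet checkpoint is being loaded because this is what is available pretrained in Torchvision. I'm fairly sure that the CIFAR-10 pretrained checkpoint is not available there. ImageNet-pretrained feature extractors have a better chance to be fine-tuned to CIFAR-10 and keep the accuracy on the target dataset than vice versa. It is also possible to train from scratch without loading the checkpoint file - remove the "pretrained": true entry from the NNCF .json config file, and omit any --weights or --resume parameters from the command line.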

Thank you very much. I'm sorry to trouble you so much. I'll give it a try.
