Comments (6)
Greetings, @sroot0!
Can you provide the exact command line you used and the console error logs?
from nncf.
I want to use the provided classification example to train on the CIFAR10 dataset. After downloading the dataset, I manually separated the val and test image sets, but whenever I run main.py from the image classification example, I get an error. Here are the command and the error logs:
python main.py -m train --config configs/quantization/test.json --data /home/sroot/下载/nncf_pytorch-master/examples/classification/cifar10 --log-dir=../../results/quantization/resnet18/ --cpu-only
batch_size : 64
checkpoint_save_dir : ../../results/quantization/resnet18/resnet18_CIFAR10_int8/2020-08-26__21-22-53
compression : {'algorithm': 'quantization', 'initializer': {'range': {'num_init_steps': 10}}}
config : configs/quantization/test.json
cpu_only : True
current_gpu : None
dataset : CIFAR10
dataset_dir : /home/sroot/下载/nncf_pytorch-master/examples/classification/cifar10
device : cpu
dist_backend : nccl
dist_url : tcp://127.0.0.1:8899
distributed : False
epochs : 2
execution_mode : cpu_only
gpu_id : None
hw_config_type : None
imshow_batch : False
input_info : {'sample_size': [1, 3, 32, 32]}
intermediate_checkpoints_path: ../../results/quantization/resnet18/resnet18_CIFAR10_int8/2020-08-26__21-22-53/intermediate_checkpoints
log_dir : ../../results/quantization/resnet18/resnet18_CIFAR10_int8/2020-08-26__21-22-53
metrics_dump : None
mode : train
model : resnet18
multiprocessing_distributed: False
name : resnet18_CIFAR10_int8
nncf_config : {'model': 'resnet18', 'pretrained': True, 'input_info': {'sample_size': [1, 3, 32, 32]}, 'num_classes': 10, 'batch_size': 64, 'epochs': 2, 'optimizer': {'type': 'Adam', 'base_lr': 1e-05, 'schedule_type': 'multistep', 'steps': [5]}, 'compression': {'algorithm': 'quantization', 'initializer': {'range': {'num_init_steps': 10}}}, 'dataset': 'CIFAR10'}
num_classes : 10
optimizer : {'type': 'Adam', 'base_lr': 1e-05, 'schedule_type': 'multistep', 'steps': [5]}
pretrained : True
print_freq : 10
print_step : False
rank : 0
resuming_checkpoint_path : None
save_freq : 5
seed : None
start_epoch : 0
tb : <tensorboardX.writer.SummaryWriter object at 0x7fef4aecd640>
test_every_n_epochs : 1
to_onnx : None
weights : None
workers : 4
world_size : 1
Loading model: resnet18
Traceback (most recent call last):
File "main.py", line 530, in <module>
main(sys.argv[1:])
File "main.py", line 95, in main
start_worker(main_worker, config)
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/nncf-1.3.1-py3.8.egg/examples/common/execution.py", line 95, in start_worker
main_worker(current_gpu=None, config=config)
File "main.py", line 140, in main_worker
model = load_model(model_name,
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/nncf-1.3.1-py3.8.egg/examples/common/model_loader.py", line 36, in load_model
loaded_model = safe_thread_call(load_model_fn)
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/nncf-1.3.1-py3.8.egg/nncf/utils.py", line 281, in safe_thread_call
result = main_call_fn()
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torchvision/models/resnet.py", line 240, in resnet18
return _resnet('resnet18', BasicBlock, [2, 2, 2, 2], pretrained, progress,
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torchvision/models/resnet.py", line 226, in _resnet
state_dict = load_state_dict_from_url(model_urls[arch],
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/hub.py", line 509, in load_state_dict_from_url
return torch.load(cached_file, map_location=map_location)
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/serialization.py", line 593, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/serialization.py", line 763, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: unpickling stack underflow
@vshampor, thanks for your reply, and sorry to trouble you.
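For context on the last line of the traceback: `unpickling stack underflow` comes from Python's own `pickle` module, which `torch.load` calls in the frames above, and it generally means the bytes being read are not a well-formed pickle. The following is a minimal sketch using only the standard library, not NNCF or PyTorch code; the helper name `unpickle_error` is hypothetical, and the exact exception type and message can differ between Python implementations.

```python
import pickle


def unpickle_error(data):
    """Return None if `data` unpickles cleanly, else a short description of the failure."""
    try:
        pickle.loads(data)
    except (pickle.UnpicklingError, EOFError, IndexError,
            AttributeError, ImportError, ValueError, TypeError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# A well-formed pickle loads without error.
valid = pickle.dumps({"weights": [1.0, 2.0]})
print(unpickle_error(valid))

# A truncated byte stream fails to load.
print(unpickle_error(valid[:10]))

# A lone STOP opcode asks for a result from an empty stack, which also fails.
print(unpickle_error(b"."))
```

If the cached checkpoint file is incomplete or contains something other than a checkpoint, a failure of this kind would be expected. The thread does not confirm that this was the cause here.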
Greetings, @sroot0!
Can you provide the exact command line you used and the console error logs?
thank you
Doesn't look like any error we've encountered yet. Also from your logs it seems that the problem is with loading the pretrained ResNet, rather than with the dataset.
Try using only Latin alphanumerics in the paths to your repository, dataset and virtual environment folders (e.g. no Chinese characters such as 下载), maybe?
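One way to test the hypothesis that the pretrained checkpoint itself is the problem is to check the downloaded file's integrity. A hedged sketch follows, assuming the cached file keeps the `name-<hash>.ext` naming convention that `torch.hub.load_state_dict_from_url` documents for its `check_hash` option, where `<hash>` is the first eight or more hex digits of the file's SHA256. The function names are hypothetical and the cache location is an assumption that depends on the PyTorch version.

```python
import hashlib
import re
from pathlib import Path


def hash_prefix_from_name(path):
    """Return the hex hash prefix embedded in a name like 'model-1a2b3c4d.pth', or None."""
    match = re.search(r"-([a-f0-9]{8,})\.", Path(path).name)
    return match.group(1) if match else None


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_looks_intact(path):
    """True if the digest matches the name, False if not, None if the name has no hash."""
    expected = hash_prefix_from_name(path)
    if expected is None:
        return None
    return sha256_of_file(path).startswith(expected)


# Usage (path is an assumption; look for the .pth file in your torch cache directory):
# print(checkpoint_looks_intact("/path/to/torch/cache/<checkpoint>.pth"))
```

A `False` result would suggest an incomplete or corrupted download; deleting the cached file so that it is fetched again is a reasonable next step in that case.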
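It doesn't look like any error we've encountered yet. Also, from your logs, the problem seems to be with loading the pretrained ResNet rather than with the dataset.
Maybe try using only Latin alphanumerics in the paths to your repository, dataset and virtual environment folders (e.g. no Chinese characters such as 下载)?
As for the earlier dataset problem, I later changed the default console path and that error went away, but then the current problem appeared. I just tried changing the path and it still reports the same error, so sad!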
python main.py -m train --config configs/quantization/test.json --data /home/sroot/work/nncf_pytorch-master/examples/classification/cifar10 --log-dir=../../results/quantizat/ion/resnet18/ --cpu-only
Traceback (most recent call last):
File "main.py", line 530, in <module>
main(sys.argv[1:])
File "main.py", line 95, in main
start_worker(main_worker, config)
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/nncf-1.3.1-py3.8.egg/examples/common/execution.py", line 95, in start_worker
main_worker(current_gpu=None, config=config)
File "main.py", line 140, in main_worker
model = load_model(model_name,
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/nncf-1.3.1-py3.8.egg/examples/common/model_loader.py", line 36, in load_model
loaded_model = safe_thread_call(load_model_fn)
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/nncf-1.3.1-py3.8.egg/nncf/utils.py", line 281, in safe_thread_call
result = main_call_fn()
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torchvision/models/resnet.py", line 240, in resnet18
return _resnet('resnet18', BasicBlock, [2, 2, 2, 2], pretrained, progress,
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torchvision/models/resnet.py", line 226, in _resnet
state_dict = load_state_dict_from_url(model_urls[arch],
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/hub.py", line 509, in load_state_dict_from_url
return torch.load(cached_file, map_location=map_location)
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/serialization.py", line 593, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/sroot/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/serialization.py", line 763, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: unpickling stack underflow
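It doesn't look like any error we've encountered yet. Also, from your logs, the problem seems to be with loading the pretrained ResNet rather than with the dataset.
Maybe try using only Latin alphanumerics in the paths to your repository, dataset and virtual environment folders (e.g. no Chinese characters such as 下载)?
Regarding the previous question, thank you very much for your help. I reinstalled it today, and the problem no longer occurs.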