I ran the command 'python -m train --algorithm ERM --dataset ODIR --task Retinopathy --attr age --data_dir /data/user3/datasets/ODIR-5K/odir5k/ODIR-5K --store_name exp1', but it crashes with a "core dumped" error. The backtrace is as follows:
*** Error in `python': free(): invalid pointer: 0x00000000006dda70 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7340f)[0x7f437252a40f]
/lib64/libc.so.6(+0x78c7e)[0x7f437252fc7e]
/lib64/libc.so.6(+0x79957)[0x7f4372530957]
/lib64/ld-linux-x86-64.so.2(_dl_deallocate_tls+0x39)[0x7f437319b589]
/lib64/libpthread.so.0(+0x7237)[0x7f4372f73237]
/lib64/libpthread.so.0(+0x734f)[0x7f4372f7334f]
/lib64/libpthread.so.0(pthread_join+0xdb)[0x7f4372f7569b]
/data/user3/miniconda3/envs/fair/lib/python3.7/site-packages/scipy/special/../../scipy.libs/libopenblasp-r0-085ca80a.3.9.so(blas_thread_shutdown_+0xca)[0x7f4157a32a6a]
/lib64/libc.so.6(__libc_fork+0x52)[0x7f437256d8c2]
......
Environment:
Python: 3.7.16
PyTorch: 1.13.0+cu117
Torchvision: 0.14.0+cu117
CUDA: 11.7
CUDNN: 8500
NumPy: 1.19.5
PIL: 9.5.0
Args:
algorithm: ERM
attr: age
aug: basic2
checkpoint_freq: None
data_dir: /data/user3/datasets/ODIR-5K/odir5k/ODIR-5K
dataset: ['ODIR']
debug: False
es_metric: min_group:accuracy
es_patience: 5
es_strategy: metric
group_def: group
hparams: None
hparams_seed: 0
image_arch: densenet_sup_in1k
log_all: False
log_online: False
output_dir: output
resume:
seed: 0
skip_model_save: False
skip_ood_eval: False
stage1_folder: None
steps: None
store_name: exp1
stratified_erm_subset: None
task: Retinopathy
use_es: False
HParams:
attr: age
attr_balanced: False
batch_size: 64
data_augmentation: basic2
group_balanced: False
group_def: group
image_arch: densenet_sup_in1k
last_layer_dropout: 0.0
lr: 0.001
nonlinear_classifier: False
optimizer: adam
pretrained: True
resnet18: False
task: Retinopathy
weight_decay: 0.0001
cuda
Dataset:
[train] 4524
[val] 1044
I have never encountered this kind of problem in PyTorch before. Moreover, I found that if I use wandb, the crash is triggered earlier, before the HParams section is printed. Can anyone help me?
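
The backtrace passes through blas_thread_shutdown_ in the OpenBLAS library bundled with SciPy, called from __libc_fork, so the crash seems to occur while OpenBLAS tears down its worker threads during a fork. One thing that might be worth trying is restricting OpenBLAS to a single thread before any numerical library is imported. The sketch below is an untested assumption and not a confirmed fix; the helper name limit_blas_threads is hypothetical, and only the OPENBLAS_NUM_THREADS environment variable is a real OpenBLAS setting.

```python
import os


def limit_blas_threads(n: int = 1) -> str:
    """Hypothetical helper: cap the OpenBLAS thread pool via its environment variable.

    OpenBLAS reads OPENBLAS_NUM_THREADS when the library is loaded, so this
    has to run before NumPy, SciPy, or PyTorch is imported.
    """
    os.environ["OPENBLAS_NUM_THREADS"] = str(n)
    return os.environ["OPENBLAS_NUM_THREADS"]


# Assumption: this is placed at the very top of the entry script,
# ahead of every other import.
setting = limit_blas_threads(1)
print("OPENBLAS_NUM_THREADS =", setting)
```

The same setting can be applied from the shell by prefixing the command with OPENBLAS_NUM_THREADS=1. If the crash disappears, that would point at the OpenBLAS fork handler rather than at PyTorch itself.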