
Core dumped Problem #1

Open
nightrain-vampire opened this issue Jul 5, 2024 · 0 comments

Comments

@nightrain-vampire

I ran the command 'python -m train --algorithm ERM --dataset ODIR --task Retinopathy --attr age --data_dir /data/user3/datasets/ODIR-5K/odir5k/ODIR-5K --store_name exp1', but it crashes with a "core dumped" error. The backtrace is as follows:

*** Error in `python': free(): invalid pointer: 0x00000000006dda70 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7340f)[0x7f437252a40f]
/lib64/libc.so.6(+0x78c7e)[0x7f437252fc7e]
/lib64/libc.so.6(+0x79957)[0x7f4372530957]
/lib64/ld-linux-x86-64.so.2(_dl_deallocate_tls+0x39)[0x7f437319b589]
/lib64/libpthread.so.0(+0x7237)[0x7f4372f73237]
/lib64/libpthread.so.0(+0x734f)[0x7f4372f7334f]
/lib64/libpthread.so.0(pthread_join+0xdb)[0x7f4372f7569b]
/data/user3/miniconda3/envs/fair/lib/python3.7/site-packages/scipy/special/../../scipy.libs/libopenblasp-r0-085ca80a.3.9.so(blas_thread_shutdown_+0xca)[0x7f4157a32a6a]
/lib64/libc.so.6(__libc_fork+0x52)[0x7f437256d8c2]
......

The other outputs are:

Environment:
        Python: 3.7.16
        PyTorch: 1.13.0+cu117
        Torchvision: 0.14.0+cu117
        CUDA: 11.7
        CUDNN: 8500
        NumPy: 1.19.5
        PIL: 9.5.0
Args:
        algorithm: ERM
        attr: age
        aug: basic2
        checkpoint_freq: None
        data_dir: /data/user3/datasets/ODIR-5K/odir5k/ODIR-5K
        dataset: ['ODIR']
        debug: False
        es_metric: min_group:accuracy
        es_patience: 5
        es_strategy: metric
        group_def: group
        hparams: None
        hparams_seed: 0
        image_arch: densenet_sup_in1k
        log_all: False
        log_online: False
        output_dir: output
        resume: 
        seed: 0
        skip_model_save: False
        skip_ood_eval: False
        stage1_folder: None
        steps: None
        store_name: exp1
        stratified_erm_subset: None
        task: Retinopathy
        use_es: False
HParams:
        attr: age
        attr_balanced: False
        batch_size: 64
        data_augmentation: basic2
        group_balanced: False
        group_def: group
        image_arch: densenet_sup_in1k
        last_layer_dropout: 0.0
        lr: 0.001
        nonlinear_classifier: False
        optimizer: adam
        pretrained: True
        resnet18: False
        task: Retinopathy
        weight_decay: 0.0001
cuda
Dataset:
        [train] 4524
        [val]   1044

I have never encountered this problem in PyTorch before. Moreover, I found that if I use wandb, the crash is triggered even earlier, before HParams is printed. Can anyone help me?
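Note: the backtrace ends in OpenBLAS's blas_thread_shutdown_ inside __libc_fork, which is a known symptom of OpenBLAS's thread pool interacting badly with forked worker processes (e.g. DataLoader workers). A commonly suggested workaround, sketched below under the assumption that this crash is that same OpenBLAS/fork issue (not verified for this repo), is to cap the BLAS thread pools to one thread before numpy/scipy are first imported:

```python
import os

# Hedged workaround: limit BLAS/OpenMP thread pools to a single thread
# *before* the first import of numpy/scipy/torch, so fork() in worker
# processes never has to tear down OpenBLAS's pthread pool.
# setdefault() keeps any value you already exported in the shell.
for var in ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ.setdefault(var, "1")

# ...then import numpy / scipy / torch and start training as usual.
```

Equivalently, exporting the same variables in the shell (e.g. `OPENBLAS_NUM_THREADS=1 python -m train ...`) before launching training has the same effect.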
