
Core dumped Problem #1

Open
nightrain-vampire opened this issue Jul 5, 2024 · 0 comments

Comments

@nightrain-vampire

I ran the command 'python -m train --algorithm ERM --dataset ODIR --task Retinopathy --attr age --data_dir /data/user3/datasets/ODIR-5K/odir5k/ODIR-5K --store_name exp1', but it crashes with a "core dumped" error. The backtrace is as follows:

*** Error in `python': free(): invalid pointer: 0x00000000006dda70 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7340f)[0x7f437252a40f]
/lib64/libc.so.6(+0x78c7e)[0x7f437252fc7e]
/lib64/libc.so.6(+0x79957)[0x7f4372530957]
/lib64/ld-linux-x86-64.so.2(_dl_deallocate_tls+0x39)[0x7f437319b589]
/lib64/libpthread.so.0(+0x7237)[0x7f4372f73237]
/lib64/libpthread.so.0(+0x734f)[0x7f4372f7334f]
/lib64/libpthread.so.0(pthread_join+0xdb)[0x7f4372f7569b]
/data/user3/miniconda3/envs/fair/lib/python3.7/site-packages/scipy/special/../../scipy.libs/libopenblasp-r0-085ca80a.3.9.so(blas_thread_shutdown_+0xca)[0x7f4157a32a6a]
/lib64/libc.so.6(__libc_fork+0x52)[0x7f437256d8c2]
......

The other outputs are:

Environment:
        Python: 3.7.16
        PyTorch: 1.13.0+cu117
        Torchvision: 0.14.0+cu117
        CUDA: 11.7
        CUDNN: 8500
        NumPy: 1.19.5
        PIL: 9.5.0
Args:
        algorithm: ERM
        attr: age
        aug: basic2
        checkpoint_freq: None
        data_dir: /data/user3/datasets/ODIR-5K/odir5k/ODIR-5K
        dataset: ['ODIR']
        debug: False
        es_metric: min_group:accuracy
        es_patience: 5
        es_strategy: metric
        group_def: group
        hparams: None
        hparams_seed: 0
        image_arch: densenet_sup_in1k
        log_all: False
        log_online: False
        output_dir: output
        resume: 
        seed: 0
        skip_model_save: False
        skip_ood_eval: False
        stage1_folder: None
        steps: None
        store_name: exp1
        stratified_erm_subset: None
        task: Retinopathy
        use_es: False
HParams:
        attr: age
        attr_balanced: False
        batch_size: 64
        data_augmentation: basic2
        group_balanced: False
        group_def: group
        image_arch: densenet_sup_in1k
        last_layer_dropout: 0.0
        lr: 0.001
        nonlinear_classifier: False
        optimizer: adam
        pretrained: True
        resnet18: False
        task: Retinopathy
        weight_decay: 0.0001
cuda
Dataset:
        [train] 4524
        [val]   1044

I have never encountered this problem in PyTorch before. Moreover, I found that if I use wandb, the crash is triggered even earlier, before HParams is printed. Can anyone help me?
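Note: the backtrace ends in OpenBLAS's blas_thread_shutdown_ inside __libc_fork, which is a known symptom of OpenBLAS's thread pool interacting badly with forked worker processes (e.g. DataLoader workers). A commonly suggested workaround, sketched below under the assumption that this crash is that same OpenBLAS/fork issue (not verified for this repo), is to cap the BLAS thread pools to one thread before numpy/scipy are first imported:

```python
import os

# Hedged workaround: limit BLAS/OpenMP thread pools to a single thread
# *before* the first import of numpy/scipy/torch, so fork() in worker
# processes never has to tear down OpenBLAS's pthread pool.
# setdefault() keeps any value you already exported in the shell.
for var in ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ.setdefault(var, "1")

# ...then import numpy / scipy / torch and start training as usual.
```

Equivalently, exporting the same variables in the shell (e.g. `OPENBLAS_NUM_THREADS=1 python -m train ...`) before launching training has the same effect.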
