IOErrors when running multiple processes at once. #108

alexanderwhatley · 2017-07-07T01:43:18Z

I'm running ~200 jobs that use NetMHCpan at once on a supercomputer cluster. Some of these jobs are throwing this error message:

sh: fork: retry: Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable
Traceback (most recent call last):
  File "NetMHCpan_trials_all.py", line 120, in <module>
    epitope_predictions = get_epitope_predictions(HLA_alleles, vcf_file)
  File "NetMHCpan_trials_all.py", line 39, in get_epitope_predictions
    original_epitope_predictions = predictor.predict_subsequences(original_sequences).to_dataframe()
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_predictor.py", line 128, in predict_subsequences
    binding_predictions = self.predict_peptides(sorted(peptide_set))
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_commandline_predictor.py", line 309, in predict_peptides
    temp_dir_list=dirs)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_commandline_predictor.py", line 256, in _run_commands_and_collect_predictions
    process_limit=self.process_limit)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 141, in run_multiple_commands_redirect_stdout
    add_to_queue(p)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 126, in add_to_queue
    process.start()
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 51, in start
    self.process = Popen(self.args, stdout=stdout, stderr=stderr)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/subprocess.py", line 1260, in _execute_child
    restore_signals, start_new_session, preexec_fn)
BlockingIOError: [Errno 11] Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable

Do you have any advice on how to deal with this situation? Would placing a mutex on the file be the right thing to do, or should I perhaps retry the prediction line if it fails due to the resource being temporarily unavailable? Thanks for your help.
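
For concreteness, the retry option could look roughly like the sketch below. It is only an illustration, not something mhctools provides: `retry_on_eagain` and its parameters are hypothetical names, and it assumes that re-running the whole prediction call after a failed fork is safe. It retries only when the error is EAGAIN (errno 11, as in the traceback above) and waits with exponential backoff plus random jitter so that the ~200 jobs do not all retry at the same moment.

```python
import errno
import random
import time


def retry_on_eagain(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with backoff if it raises BlockingIOError with EAGAIN.

    Hypothetical helper for illustration; not part of mhctools.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except BlockingIOError as exc:
            # Re-raise anything that is not EAGAIN, and give up after the last attempt.
            if exc.errno != errno.EAGAIN or attempt == max_attempts:
                raise
            # Exponential backoff (1x, 2x, 4x, ... base_delay) plus random jitter,
            # so that many jobs on the same node do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

In my script this would wrap the failing line, e.g. `retry_on_eagain(lambda: predictor.predict_subsequences(original_sequences).to_dataframe())`. Retrying only treats the symptom, though: if the node is hitting a per-user process limit, the retries may keep failing until other jobs finish.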
