Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: error handling in autoreporting tool #70

Closed
wants to merge 4 commits into from

Conversation

Lipastomies
Copy link
Collaborator

This PR is a draft for how handling could be.

The basic idea is that users do not want to see stack traces, so they aren't printed when the script errors out. Instead, human-readable explanation is printed to stdout where it is possible (aka where the correct variables are available), and the stack trace is printed to a file (and/or stderr?). A return code of 1 will be sent in case of any error, and 0 in case of no errors.

I'd be happy to hear if this sounds sensible, and if there are other possible ways that might be more suitable.

also aims to fix #68

@Lipastomies Lipastomies requested review from juhis and Fedja January 31, 2020 15:42
except KeyError:
cols=list(df.columns)
columns_not_in_cols=[a for a in columns.values() if a not in cols]
print("A KeyError happened while parsing the input file {}. It is likely that the columns given to the script do not match with actual input file columns.".format(fname))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can explicitly check that and give the exact column not in the data.

Scripts/main.py Outdated
except Exception as e:
print("Exception occurred. Check the above errors. The tool will now abort.")
fname=die()
print("Traceback printed in file {}".format(fname))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be good to start using actual logger package (https://docs.python.org/3/library/logging.html) ... it would be easier to control what level of logging goes where. Also keeping in mind usage of this in pipelining... now we would miss the stacktrace when running via WDL or if we add that to WDL output, we error out as the file might not exist. logging package would allow different configuration options depending on running mode.

@Fedja
Copy link
Contributor

Fedja commented Jan 31, 2020

make that annotation file required in argparse if it's expected.....

autoreporting/Scripts/main.py flagship/meta/finngen_R4_T2D_meta.gz --ld-api online --gnomad-genome-path "../autoreport_resources/gnomad.genomes.r2.1.sites.liftover.b38.finngen.r2pos.af.ac.an.tsv.gz" --gnomad-exome-path "../autoreport_resources/gnomad.exomes.r2.1.sites.liftover.b38.finngen.r2pos.af.ac.an.tsv.gz" --functional-path "../autoreport_resources/fin_enriched_genomes_select_columns.txt.gz" --column-labels "#CHR" "POS" "REF" "ALT" "ESTONIA_pval" "ESTONIA_beta" "FINNGEN_maf_controls" --sign-treshold 5.0e-8 --ld-r2 0.4 --alt-sign-treshold 1e-4  --ignore-region '6:23000000-38000000' --efo-codes EFO_0001360
input file: flagship/meta/finngen_R4_T2D_meta.gz
filter & group SNPs
Annotate SNPs
Traceback (most recent call last):
  File "autoreporting/Scripts/main.py", line 125, in <module>
    main(args)
  File "autoreporting/Scripts/main.py", line 53, in main
    functional_path=args.functional_path, prefix=args.prefix, columns=columns)
  File "/mnt/disks/r4/mitja/autoreporting/Scripts/annotate.py", line 81, in annotate
    raise FileNotFoundError("Tabix index for file {} not found. Make sure that the file is properly indexed.".format(finngen_path))
FileNotFoundError: Tabix index for file  not found. Make sure that the file is properly indexed.

@Lipastomies
Copy link
Collaborator Author

Here's a try at using the logging crate in the repository. @Fedja do you think something like this could work? The basic idea is that everything is reported in pretty much the same way as before, except that it's done through a logger. In the beginning of the script, the logging level is set, which means that e.g. when using the widdle, we can log everything, and when run normally, the logging can be less distracting.

There's quite a lot of merge conflicts but those are fairly straightforward to clear, mainly due to code changing in similar places.

@Lipastomies Lipastomies requested a review from Fedja February 25, 2020 13:23
@Lipastomies Lipastomies deleted the handle_wrong_column_values_error branch April 19, 2021 10:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

error if special characters in column names. replace illegal characters or warning or error out early
2 participants