The current tool has a large number of flags that can be set, ranging from column names to grouping parameters, file inputs, etc.
For example, setting the column names is quite fiddly, especially as column names differ between inputs: summstats, FG annotations, gnomAD annotations and GWAS Catalog all have their own column names, and some differ in format, e.g. chromosome being "chrXX" instead of "XX", or chromosome 23 being coded as either X or 23. Currently this information is hardcoded in the scripts, which makes it hard to read and change and leaves it scattered across the codebase.
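To make the format differences concrete, here is a minimal sketch of the kind of chromosome normalisation each input currently needs. The function name `normalize_chrom` and the `x_code` parameter are hypothetical and not taken from the existing scripts.

```python
def normalize_chrom(value, x_code="23"):
    """Strip an optional "chr" prefix and code chromosome X/23 consistently.

    x_code is the representation the rest of the pipeline expects
    (assumed here to be "23"; pass "X" for the other convention).
    """
    chrom = str(value).strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    if chrom.upper() in ("X", "23"):
        chrom = x_code
    return chrom
```

With something like this, a per-input setting could say whether the file uses a "chr" prefix and how it codes X, instead of each script handling it separately.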
Would it make sense to create a simple configuration file from which these parameters could alternatively be set? This could be as simple as a JSON file with predefined fields.
@Fedja what do you think? It would add some complexity, but most of it could be implemented prior to the actual analysis scripts, so they would not have to be modified much.
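A rough sketch of what I mean by handling this before the analysis scripts: load defaults, overlay an optional JSON file, then overlay any flags given on the command line. All field names and default values below (`summstat_columns`, `grouping`, `window_kb`, etc.) are made up for illustration and are not the tool's actual flags.

```python
import copy
import json

# Hypothetical defaults; the real field names and values would come from the current flags.
DEFAULT_CONFIG = {
    "summstat_columns": {"chrom": "#chrom", "pos": "pos", "ref": "ref", "alt": "alt", "pval": "pval"},
    "grouping": {"method": "simple", "window_kb": 250},
}


def merge_config(defaults, overrides):
    """Return a deep copy of defaults with overrides applied, recursing into nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def load_config(path=None, cli_overrides=None):
    """Precedence: command-line overrides > JSON file > built-in defaults."""
    config = copy.deepcopy(DEFAULT_CONFIG)
    if path is not None:
        with open(path) as handle:
            config = merge_config(config, json.load(handle))
    if cli_overrides:
        config = merge_config(config, cli_overrides)
    return config
```

The analysis scripts would then receive one resolved `config` dict, so they would not need to know whether a value came from a default, the file, or a flag.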
Also, the default values are currently hardcoded in the scripts. With a default configuration file, they would be more easily accessible to people not developing this tool (i.e. not me). This could also open up other possibilities for making the tool's calculations more general. For example, when calculating enrichment for Finns vs. different groups of populations, the AF/AC/AN column names could be defined outside the script. Then the script would not need to be modified if we change the input data layout or want to calculate the values differently. This more ambitious goal would of course add more work.
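As a sketch of the more general version, the enrichment calculation could take its column names from the configuration instead of having them written into the script. The column names (`AF_fin`, `AC_nfe`, ...) and the formula (target AF divided by pooled AC/AN of the reference populations) are assumptions for illustration only; the current script may calculate this differently.

```python
# Hypothetical configuration fragment; column names are placeholders.
ENRICHMENT_COLUMNS = {
    "target_af": "AF_fin",
    "ref_ac": ["AC_nfe", "AC_afr"],
    "ref_an": ["AN_nfe", "AN_afr"],
}


def enrichment(row, columns):
    """Target AF divided by the pooled AF (sum of AC / sum of AN) of the reference populations.

    Returns None when the pooled reference AF is zero or undefined.
    """
    ref_ac = sum(float(row[name]) for name in columns["ref_ac"])
    ref_an = sum(float(row[name]) for name in columns["ref_an"])
    if ref_an == 0 or ref_ac == 0:
        return None
    return float(row[columns["target_af"]]) / (ref_ac / ref_an)
```

Changing the reference populations would then mean editing the two lists in the configuration, not the script.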
The negatives that would come with this change would be:
Additional complexity
More bugs
Lots of work. This brings the risk of delaying other, more important work.