Commit

Merge pull request #149 from timothymillar/f/128-simplify-input-parameters

Simplify input parameters
timothymillar authored Nov 8, 2022
2 parents d767873 + d8e0983 commit 55cbba1
Showing 9 changed files with 259 additions and 367 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/python-package.yml
@@ -15,7 +15,7 @@ jobs:
 runs-on: ubuntu-latest
 strategy:
 matrix:
-python-version: [3.7, 3.8, 3.9]
+python-version: ["3.8", "3.9", "3.10"]

 steps:
 - uses: actions/checkout@v2
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,14 @@

 ## Unreleased

+New Features:
+- Combine `--bam`, `--bam-list` and `--sample-bam` arguments #128
+- Combine `--ploidy` and `--sample-ploidy` arguments #128
+- Combine `--inbreeding` and `--sample-inbreeding` arguments #128
+- Combine `--mcmc-temperatures` and `--sample-mcmc-temperatures` arguments #128
+
+
+
 ## Beta v0.7.0

 New Features:
@@ -18,6 +26,8 @@ VCF Changes:
 - Added `NOA` filter to indicate loci where no alleles were observed (e.g., masked reference only)
 - Added `AF0` filter to indicate invalid prior allele frequencies in which all frequencies were zero

+
+
 ## Beta v0.6.0

 New Features:
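
Note on the combined `--ploidy` and `--inbreeding` arguments listed in the changelog entry above: each now accepts either a single value for all samples or a tab-separated file of samples and values. The Python sketch below shows one way such an argument could be resolved. It is not taken from this commit; the function names `parse_sample_values` and `inbreeding_coefficient`, and the choice to tell the two forms apart by checking whether the argument is an existing file, are assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def inbreeding_coefficient(text: str) -> float:
    """Convert text to a float and check that it lies within [0, 1]."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"Inbreeding coefficient {value} is outside [0, 1]")
    return value


def parse_sample_values(
    argument: str, samples: List[str], convert: Callable[[str], T]
) -> Dict[str, T]:
    """Map each sample to a value given a scalar or a sample/value file.

    Hypothetical helper: the existing-file check is an assumption,
    not the behaviour confirmed by the commit.
    """
    path = Path(argument)
    if not path.is_file():
        # Form (1): a single value shared by all samples.
        value = convert(argument)
        return {sample: value for sample in samples}
    # Form (2): one "sample<TAB>value" pair per line.
    values = {}
    for line in path.read_text().splitlines():
        if not line.strip():
            continue  # ignore blank lines
        sample, raw = line.strip().split("\t")
        values[sample] = convert(raw)
    # The help text asks for a list of all samples, so report any omissions.
    missing = [sample for sample in samples if sample not in values]
    if missing:
        raise ValueError(f"No value specified for samples: {missing}")
    return {sample: values[sample] for sample in samples}
```

Under these assumptions, `parse_sample_values("4", samples, int)` would assign a ploidy of 4 to every sample, while passing a file path would read one ploidy per sample.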
95 changes: 39 additions & 56 deletions cli-assemble-help.txt
@@ -1,11 +1,7 @@
 usage: MCMC haplotype assembly [-h] [--region REGION] [--region-id REGION_ID]
 [--targets TARGETS] [--variants VARIANTS]
-[--reference REFERENCE] [--bam [BAM ...]]
-[--bam-list BAM_LIST] [--sample-bam SAMPLE_BAM]
-[--ploidy PLOIDY]
-[--sample-ploidy SAMPLE_PLOIDY]
-[--inbreeding INBREEDING]
-[--sample-inbreeding SAMPLE_INBREEDING]
+[--reference REFERENCE] [--bam BAM [BAM ...]]
+[--ploidy PLOIDY] [--inbreeding INBREEDING]
 [--sample-pool SAMPLE_POOL]
 [--base-error-rate BASE_ERROR_RATE]
 [--use-base-phred-scores]
@@ -24,7 +20,6 @@ usage: MCMC haplotype assembly [-h] [--region REGION] [--region-id REGION_ID]
 [--mcmc-dosage-step-probability MCMC_DOSAGE_STEP_PROBABILITY]
 [--mcmc-partial-dosage-step-probability MCMC_PARTIAL_DOSAGE_STEP_PROBABILITY]
 [--mcmc-temperatures [MCMC_TEMPERATURES ...]]
-[--sample-mcmc-temperatures SAMPLE_MCMC_TEMPERATURES]
 [--haplotype-posterior-threshold HAPLOTYPE_POSTERIOR_THRESHOLD]

 optional arguments:
@@ -49,42 +44,34 @@ optional arguments:
 within this file.
 --reference REFERENCE
 Indexed fasta file containing the reference genome.
---bam [BAM ...] A list of 0 or more bam files. All samples found
-within the listed bam files will be genotypes unless
-the --sample-list parameter is used.
---bam-list BAM_LIST A file containing a list of bam file paths (one per
-line). This can optionally be used in place of or
-combined with the --bam parameter.
---sample-bam SAMPLE_BAM
-A file containing a list of samples with bam file
-paths. Each line of the file should be a sample
-identifier followed by a tab and then a bam file path.
-This can optionally be used in place the --bam and
---bam-list parameters. This is faster than using those
-parameters when running many small jobs. An error will
-be thrown if a sample is not found within its
-specified bam file.
---ploidy PLOIDY Default ploidy for all samples (default = 2). This
-value is used for all samples which are not specified
-using the --sample-ploidy parameter
---sample-ploidy SAMPLE_PLOIDY
-A file containing a list of samples with a ploidy
-value used to indicate where their ploidy differs from
-the default value. Each line should contain a sample
-identifier followed by a tab and then an integer
-ploidy value.
+--bam BAM [BAM ...] Bam file(s) to use in analysis. This may be (1) a list
+of one or more bam filepaths, (2) a plain-text file
+containing a single bam filepath on each line, (3) a
+plain-text file containing a sample identifier and its
+corresponding bam filepath on each line separated by a
+tab. If options (1) or (2) are used then all samples
+within each bam will be used within the analysis. If
+option (3) is used then only the specified sample will
+be extracted from each bam file and an error will be
+raised if a sample is not found within its specified
+bam file.
+--ploidy PLOIDY Specify sample ploidy (default = 2). This may be (1) a
+single integer used to specify the ploidy of all
+samples or (2) a file containing a list of all samples
+and their ploidy. If option (2) is used then each line
+of the plaintext file must contain a single sample
+identifier and the ploidy of that sample separated by
+a tab.
 --inbreeding INBREEDING
-Default inbreeding coefficient for all samples
-(default = 0.0). This value is used for all samples
-which are not specified using the --sample-inbreeding
-parameter.
---sample-inbreeding SAMPLE_INBREEDING
-A file containing a list of samples with an inbreeding
-coefficient used to indicate where their expected
-inbreeding coefficient default value. Each line should
-contain a sample identifier followed by a tab and then
-a inbreeding coefficient value within the interval [0,
-1].
+Specify expected sample inbreeding coefficient
+(default = 0.0). This may be (1) a single floating
+point value in the interval [0, 1] used to specify the
+inbreeding coefficient of all samples or (2) a file
+containing a list of all samples and their inbreeding
+coefficient. If option (2) is used then each line of
+the plaintext file must contain a single sample
+identifier and the inbreeding coefficient of that
+sample separated by a tab.
 --sample-pool SAMPLE_POOL
 A name used to pool all sample reads into a single
 sample. WARNING: this is an experimental feature.
@@ -163,20 +150,16 @@ optional arguments:
 sub-step during each step of the MCMC. (default =
 0.5).
 --mcmc-temperatures [MCMC_TEMPERATURES ...]
-A list of inverse-temperatures to use for parallel
-tempered chains. These values must be between 0 and 1
-and will automatically be sorted in ascending order.
-The cold chain value of 1.0 will be added
-automatically if it is not specified.
---sample-mcmc-temperatures SAMPLE_MCMC_TEMPERATURES
-A file containing a list of samples with mcmc
-(inverse) temperatures. Each line of the file should
-start with a sample identifier followed by tab
-seperated numeric values between 0 and 1. The number
-of temperatures specified may vary between samples.
-Samples not listed in this file will use the default
-values specified with the --mcmc-temperatures
-argument.
+Specify inverse-temperatures to use for parallel
+tempered chains (default = 1.0 i.e., no tempering).
+This may be either (1) a list of floating point values
+or (2) a file containing a list of samples with mcmc
+inverse-temperatures. If option (2) is used then the
+file must contain a single sample per line followed by
+a list of tab separated inverse temperatures. The
+number of inverse-temperatures may differ between
+samples and any samples not included in the list will
+default to not using tempering.
 --haplotype-posterior-threshold HAPLOTYPE_POSTERIOR_THRESHOLD
 Posterior probability required for a haplotype to be
 included in the output VCF as an alternative allele.
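
Note on the combined `--mcmc-temperatures` argument in the help text above: it accepts either a list of inverse-temperatures or a file with one sample per line followed by tab-separated inverse-temperatures, and samples missing from the file fall back to no tempering. The following Python sketch illustrates that behaviour. The function name `parse_temperatures` and the existing-file check used to distinguish the two forms are assumptions for illustration, not code from this commit.

```python
from pathlib import Path
from typing import Dict, List


def parse_temperatures(arguments: List[str], samples: List[str]) -> Dict[str, List[float]]:
    """Resolve per-sample inverse-temperatures from CLI values or a file.

    Hypothetical helper: how the two forms are distinguished is an assumption.
    """
    no_tempering = [1.0]  # documented default: a single cold chain
    if len(arguments) == 1 and Path(arguments[0]).is_file():
        # Form (2): "sample<TAB>temp<TAB>temp..." with a variable number
        # of temperatures per sample.
        temperatures = {sample: list(no_tempering) for sample in samples}
        for line in Path(arguments[0]).read_text().splitlines():
            if not line.strip():
                continue  # ignore blank lines
            sample, *values = line.strip().split("\t")
            if sample in temperatures:
                temperatures[sample] = [float(v) for v in values]
        return temperatures
    # Form (1): the same list of temperatures for every sample.
    values = [float(v) for v in arguments] or list(no_tempering)
    return {sample: list(values) for sample in samples}
```

In this sketch, a sample that does not appear in the file keeps the `[1.0]` default, which matches the documented fallback of not using tempering.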
67 changes: 28 additions & 39 deletions cli-call-exact-help.txt
@@ -2,11 +2,8 @@ usage: Exact haplotype calling [-h] [--haplotypes HAPLOTYPES]
 [--haplotype-frequencies HAPLOTYPE_FREQUENCIES]
 [--haplotype-frequencies-prior]
 [--skip-rare-haplotypes SKIP_RARE_HAPLOTYPES]
-[--bam [BAM ...]] [--bam-list BAM_LIST]
-[--sample-bam SAMPLE_BAM] [--ploidy PLOIDY]
-[--sample-ploidy SAMPLE_PLOIDY]
+[--bam BAM [BAM ...]] [--ploidy PLOIDY]
 [--inbreeding INBREEDING]
-[--sample-inbreeding SAMPLE_INBREEDING]
 [--sample-pool SAMPLE_POOL]
 [--base-error-rate BASE_ERROR_RATE]
 [--use-base-phred-scores]
@@ -37,42 +34,34 @@ optional arguments:
 if their frequency within that file is less than the
 specified value. This requires that the --haplotype-
 frequencies parameter is also specified.
---bam [BAM ...] A list of 0 or more bam files. All samples found
-within the listed bam files will be genotypes unless
-the --sample-list parameter is used.
---bam-list BAM_LIST A file containing a list of bam file paths (one per
-line). This can optionally be used in place of or
-combined with the --bam parameter.
---sample-bam SAMPLE_BAM
-A file containing a list of samples with bam file
-paths. Each line of the file should be a sample
-identifier followed by a tab and then a bam file path.
-This can optionally be used in place the --bam and
---bam-list parameters. This is faster than using those
-parameters when running many small jobs. An error will
-be thrown if a sample is not found within its
-specified bam file.
---ploidy PLOIDY Default ploidy for all samples (default = 2). This
-value is used for all samples which are not specified
-using the --sample-ploidy parameter
---sample-ploidy SAMPLE_PLOIDY
-A file containing a list of samples with a ploidy
-value used to indicate where their ploidy differs from
-the default value. Each line should contain a sample
-identifier followed by a tab and then an integer
-ploidy value.
+--bam BAM [BAM ...] Bam file(s) to use in analysis. This may be (1) a list
+of one or more bam filepaths, (2) a plain-text file
+containing a single bam filepath on each line, (3) a
+plain-text file containing a sample identifier and its
+corresponding bam filepath on each line separated by a
+tab. If options (1) or (2) are used then all samples
+within each bam will be used within the analysis. If
+option (3) is used then only the specified sample will
+be extracted from each bam file and an error will be
+raised if a sample is not found within its specified
+bam file.
+--ploidy PLOIDY Specify sample ploidy (default = 2). This may be (1) a
+single integer used to specify the ploidy of all
+samples or (2) a file containing a list of all samples
+and their ploidy. If option (2) is used then each line
+of the plaintext file must contain a single sample
+identifier and the ploidy of that sample separated by
+a tab.
 --inbreeding INBREEDING
-Default inbreeding coefficient for all samples
-(default = 0.0). This value is used for all samples
-which are not specified using the --sample-inbreeding
-parameter.
---sample-inbreeding SAMPLE_INBREEDING
-A file containing a list of samples with an inbreeding
-coefficient used to indicate where their expected
-inbreeding coefficient default value. Each line should
-contain a sample identifier followed by a tab and then
-a inbreeding coefficient value within the interval [0,
-1].
+Specify expected sample inbreeding coefficient
+(default = 0.0). This may be (1) a single floating
+point value in the interval [0, 1] used to specify the
+inbreeding coefficient of all samples or (2) a file
+containing a list of all samples and their inbreeding
+coefficient. If option (2) is used then each line of
+the plaintext file must contain a single sample
+identifier and the inbreeding coefficient of that
+sample separated by a tab.
 --sample-pool SAMPLE_POOL
 A name used to pool all sample reads into a single
 sample. WARNING: this is an experimental feature.
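
Note on the combined `--bam` argument in the help text above: it accepts (1) bam paths directly, (2) a plain-text file of bam paths, or (3) a plain-text file of sample and bam path pairs. The Python sketch below shows one way the three forms could be told apart. It is not taken from this commit. It assumes that a single argument which does not begin with the gzip magic bytes is a plain-text list (BAM files are BGZF compressed, which is gzip compatible), and the names `parse_bam_argument` and `is_compressed` are hypothetical.

```python
from pathlib import Path
from typing import List, Optional, Tuple

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip/BGZF file, including BAM


def is_compressed(path: str) -> bool:
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def parse_bam_argument(arguments: List[str]) -> List[Tuple[Optional[str], str]]:
    """Return (sample, bam_path) pairs from the combined argument.

    A sample of None means "use every sample found in that bam".
    Hypothetical helper: the detection rule is an assumption.
    """
    if len(arguments) == 1 and not is_compressed(arguments[0]):
        pairs: List[Tuple[Optional[str], str]] = []
        for line in Path(arguments[0]).read_text().splitlines():
            if not line.strip():
                continue  # ignore blank lines
            fields = line.strip().split("\t")
            if len(fields) == 1:
                # Form (2): one bam path per line.
                pairs.append((None, fields[0]))
            elif len(fields) == 2:
                # Form (3): sample identifier, tab, bam path.
                pairs.append((fields[0], fields[1]))
            else:
                raise ValueError(f"Expected 1 or 2 fields but found: {line!r}")
        return pairs
    # Form (1): bam paths given directly on the command line.
    return [(None, path) for path in arguments]
```

With form (3) the caller would still need to check that each named sample is present in its bam file, since the help text states that an error is raised otherwise.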