Automated doc build for refs/heads/main
PopSim-bot committed Dec 13, 2024
1 parent bb0c5c6 commit 789ad8a
Showing 15 changed files with 137 additions and 1 deletion.
Binary file modified main/_images/sec_catalog_anapla_models_mallardblackduck_2l19.png
Binary file modified main/_images/sec_catalog_homsap_models_americanadmixture_4b11.png
Binary file modified main/_images/sec_catalog_homsap_models_outofafrica_2t12.png
Binary file modified main/_images/sec_catalog_homsap_models_outofafrica_3g09.png
Binary file modified main/_images/sec_catalog_homsap_models_outofafrica_4j17.png
Binary file modified main/_images/sec_catalog_pantro_models_bonoboghost_4k19.png
Binary file modified main/_images/sec_catalog_ponabe_models_twospecies_2l11.png
73 changes: 73 additions & 0 deletions main/_sources/tutorial.rst.txt
@@ -1439,6 +1439,79 @@ To make the example quick, we've only simulated the first 100Kb;
a more realistic example would apply it to the exons, available
as a :ref:`annotation <sec_catalog_phosin_annotations>`.

.. _sec_tute_recapitation:

Tips, tricks, and gotchas
=========================

Here are a few things about the whole process that might be useful to know.
They may save you some time,
or let you do new things!

.. _sec_tute_missing_data:

Missing data and coordinates
----------------------------

Suppose, as above, that we've simulated just a portion of a chromosome,
using the ``left`` and ``right`` arguments to ``species.get_contig()``:

.. code-block:: python

    import stdpopsim

    species = stdpopsim.get_species("HomSap")
    model = species.get_demographic_model("Africa_1T12")
    # Simulate only the 10Mb-20Mb portion of chromosome 22.
    contig = species.get_contig(
        "chr22", left=10e6, right=20e6, mutation_rate=model.mutation_rate
    )
    # 100 diploid individuals, i.e. 200 sampled genomes
    samples = {"AFR": 100}
    engine = stdpopsim.get_engine("msprime")
    ts = engine.simulate(model, contig, samples)
    print(
        f"Sequence length: {ts.sequence_length}\n"
        f" First variant: {ts.sites_position[0]}\n"
        f" Last variant: {ts.sites_position[-1]}\n"
    )
    # Sequence length: 50818468.0
    # First variant: 10000142.0
    # Last variant: 19999926.0

We would like the output to preserve the coordinate system,
so that all variants we'd see in a VCF file (for instance) lie between
10Mb and 20Mb. (If you're just getting a VCF, there's no need to read
the rest of this!) However, for the tree sequence to
retain the same coordinates, it must start at position 0
and end at the full length of human chromosome 22.
So, the rest of the tree sequence contains "missing data",
which is encoded as, basically, a big "tree" in which no one is
related to anyone else on those segments (in other words,
before 10Mb and after 20Mb).
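
As a rough illustration of the coordinate bookkeeping described above, the following sketch works out which stretches of the full-length sequence carry no genealogy. It is plain Python rather than stdpopsim or tskit API: the helper name ``missing_intervals`` is hypothetical, and the numbers are simply those from the chr22 example.

```python
def missing_intervals(sequence_length, left, right):
    """Return the half-open intervals [start, end) lying outside [left, right).

    Hypothetical helper for illustration only; not part of stdpopsim or tskit.
    """
    if not 0 <= left < right <= sequence_length:
        raise ValueError("need 0 <= left < right <= sequence_length")
    intervals = []
    if left > 0:
        intervals.append((0.0, float(left)))
    if right < sequence_length:
        intervals.append((float(right), float(sequence_length)))
    return intervals


# Values from the chr22 example: the simulated region is 10Mb to 20Mb.
chr22_length = 50818468
flanks = missing_intervals(chr22_length, 10e6, 20e6)
missing_fraction = sum(end - start for start, end in flanks) / chr22_length
print(flanks)
print(f"{missing_fraction:.1%} of the sequence is missing data")
```

Under these assumptions, roughly four fifths of the tree sequence's coordinate range is flanking missing data, which is why statistics averaged over the whole sequence can look surprising.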

This can lead to surprising things.
For instance, the first tree (the tree describing relationships
at position 0 along the sequence) has 200 roots:

.. code-block:: python

    t = ts.first()
    t.num_roots
    # 200

Of course, that's just one root per sample: in other words,
there are actually no trees on this portion of the genome.
If we instead iterate over all the trees using the ``root_threshold=2`` argument
to :meth:`tskit.TreeSequence.trees`, so that a node only counts as a root
if it has at least two samples below it, then we'll correctly see
that in fact all trees have fully coalesced (as they should have,
because as discussed above, we have recapitated them):

.. code-block:: python

    max([t.num_roots for t in ts.trees(root_threshold=2)])
    # 1

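
To make the effect of ``root_threshold`` concrete, here is a toy model in plain Python. It is not tskit itself, and ``count_roots`` and its inputs are hypothetical. Each root is represented only by the number of samples descending from it, and a root is counted only if that number reaches the threshold, which is the rule tskit documents for ``root_threshold``.

```python
def count_roots(samples_below_each_root, root_threshold=1):
    """Count roots with at least `root_threshold` samples beneath them.

    Toy stand-in for illustration; the real logic lives inside tskit.
    """
    return sum(1 for n in samples_below_each_root if n >= root_threshold)


# A missing-data region: each of the 200 sample nodes is its own isolated root.
missing_region = [1] * 200
# A fully coalesced tree: a single root above all 200 sample nodes.
coalesced_region = [200]

print(count_roots(missing_region))                       # 200
print(count_roots(missing_region, root_threshold=2))     # 0
print(count_roots(coalesced_region))                     # 1
print(count_roots(coalesced_region, root_threshold=2))   # 1
```

In this toy model the isolated samples in the flanking regions drop out once the threshold is 2, so the maximum over all regions is 1, matching the result above.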
To read more about using tree sequences,
see `tskit's documentation <https://tskit.dev/tskit/docs/latest/data-model.html>`__.

.. _sec_tute_analyses:

*******************************
Binary file modified main/objects.inv
Binary file not shown.
2 changes: 1 addition & 1 deletion main/searchindex.js

Large diffs are not rendered by default.

63 changes: 63 additions & 0 deletions main/tutorial.html
@@ -97,6 +97,10 @@
<li class="toctree-l4"><a class="reference internal" href="#using-a-dfe-from-one-species-in-another-species">5. Using a DFE from one species in another species</a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="#tips-tricks-and-gotchas">Tips, tricks, and gotchas</a><ul>
<li class="toctree-l4"><a class="reference internal" href="#missing-data-and-coordinates">Missing data and coordinates</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="#example-analyses-with-stdpopsim">Example analyses with stdpopsim</a><ul>
@@ -1485,6 +1489,65 @@ <h5>Outputting the <code class="docutils literal notranslate"><span class="pre">
as a <a class="reference internal" href="catalog.html#sec_catalog_phosin_annotations"><span class="std std-ref">annotation</span></a>.</p>
</section>
</section>
<section id="tips-tricks-and-gotchas">
<span id="sec-tute-recapitation"></span><h3>Tips, tricks, and gotchas<a class="headerlink" href="#tips-tricks-and-gotchas" title="Permalink to this heading"></a></h3>
<p>Here are a few things about the whole process that might be useful to know.
They may save you some time,
or let you do new things!</p>
<section id="missing-data-and-coordinates">
<span id="sec-tute-missing-data"></span><h4>Missing data and coordinates<a class="headerlink" href="#missing-data-and-coordinates" title="Permalink to this heading"></a></h4>
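<p>Suppose, as above, that we’ve simulated just a portion of a chromosome,
using the <code class="docutils literal notranslate"><span class="pre">left</span></code> and <code class="docutils literal notranslate"><span class="pre">right</span></code> arguments to <code class="docutils literal notranslate"><span class="pre">species.get_contig()</span></code>:</p>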
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">stdpopsim</span>

<span class="n">species</span> <span class="o">=</span> <span class="n">stdpopsim</span><span class="o">.</span><span class="n">get_species</span><span class="p">(</span><span class="s2">&quot;HomSap&quot;</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">species</span><span class="o">.</span><span class="n">get_demographic_model</span><span class="p">(</span><span class="s2">&quot;Africa_1T12&quot;</span><span class="p">)</span>
<span class="c1"># Simulate only the 10Mb-20Mb portion of chromosome 22.</span>
<span class="n">contig</span> <span class="o">=</span> <span class="n">species</span><span class="o">.</span><span class="n">get_contig</span><span class="p">(</span>
    <span class="s2">&quot;chr22&quot;</span><span class="p">,</span> <span class="n">left</span><span class="o">=</span><span class="mf">10e6</span><span class="p">,</span> <span class="n">right</span><span class="o">=</span><span class="mf">20e6</span><span class="p">,</span> <span class="n">mutation_rate</span><span class="o">=</span><span class="n">model</span><span class="o">.</span><span class="n">mutation_rate</span>
<span class="p">)</span>
<span class="c1"># 100 diploid individuals, i.e. 200 sampled genomes</span>
<span class="n">samples</span> <span class="o">=</span> <span class="p">{</span><span class="s2">&quot;AFR&quot;</span><span class="p">:</span> <span class="mi">100</span><span class="p">}</span>
<span class="n">engine</span> <span class="o">=</span> <span class="n">stdpopsim</span><span class="o">.</span><span class="n">get_engine</span><span class="p">(</span><span class="s2">&quot;msprime&quot;</span><span class="p">)</span>
<span class="n">ts</span> <span class="o">=</span> <span class="n">engine</span><span class="o">.</span><span class="n">simulate</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">contig</span><span class="p">,</span> <span class="n">samples</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span>
    <span class="sa">f</span><span class="s2">&quot;Sequence length: </span><span class="si">{</span><span class="n">ts</span><span class="o">.</span><span class="n">sequence_length</span><span class="si">}</span><span class="se">\n</span><span class="s2">&quot;</span>
    <span class="sa">f</span><span class="s2">&quot; First variant: </span><span class="si">{</span><span class="n">ts</span><span class="o">.</span><span class="n">sites_position</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">}</span><span class="se">\n</span><span class="s2">&quot;</span>
    <span class="sa">f</span><span class="s2">&quot; Last variant: </span><span class="si">{</span><span class="n">ts</span><span class="o">.</span><span class="n">sites_position</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span><span class="si">}</span><span class="se">\n</span><span class="s2">&quot;</span>
<span class="p">)</span>
<span class="c1"># Sequence length: 50818468.0</span>
<span class="c1"># First variant: 10000142.0</span>
<span class="c1"># Last variant: 19999926.0</span>
</pre></div>
</div>
<p>We would like the output to preserve the coordinate system,
so that all variants we’d see in a VCF file (for instance) lie between
10Mb and 20Mb. (If you’re just getting a VCF, there’s no need to read
the rest of this!) However, for the tree sequence to
retain the same coordinates, it must start at position 0
and end at the full length of human chromosome 22.
So, the rest of the tree sequence contains “missing data”,
which is encoded as, basically, a big “tree” in which no one is
related to anyone else on those segments (in other words,
before 10Mb and after 20Mb).</p>
<p>This can lead to surprising things.
For instance, the first tree (the tree describing relationships
at position 0 along the sequence) has 200 roots:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">t</span> <span class="o">=</span> <span class="n">ts</span><span class="o">.</span><span class="n">first</span><span class="p">()</span>
<span class="n">t</span><span class="o">.</span><span class="n">num_roots</span>
<span class="c1"># 200</span>
</pre></div>
</div>
<p>Of course, that’s just one root per sample: in other words,
there are actually no trees on this portion of the genome.
If we instead iterate over all the trees using the <code class="docutils literal notranslate"><span class="pre">root_threshold=2</span></code> argument
to <a class="reference external" href="https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.trees" title="(in Project name not set)"><code class="xref py py-meth docutils literal notranslate"><span class="pre">tskit.TreeSequence.trees()</span></code></a>, so that a node only counts as a root
if it has at least two samples below it, then we’ll correctly see
that in fact all trees have fully coalesced (as they should have,
because as discussed above, we have recapitated them):</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="nb">max</span><span class="p">([</span><span class="n">t</span><span class="o">.</span><span class="n">num_roots</span> <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="n">ts</span><span class="o">.</span><span class="n">trees</span><span class="p">(</span><span class="n">root_threshold</span><span class="o">=</span><span class="mi">2</span><span class="p">)])</span>
<span class="c1"># 1</span>
</pre></div>
</div>
<p>To read more about using tree sequences,
see <a class="reference external" href="https://tskit.dev/tskit/docs/latest/data-model.html">tskit’s documentation</a>.</p>
</section>
</section>
</section>
<section id="example-analyses-with-stdpopsim">
<span id="sec-tute-analyses"></span><h2>Example analyses with stdpopsim<a class="headerlink" href="#example-analyses-with-stdpopsim" title="Permalink to this heading"></a></h2>