Commit
Built site for gh-pages
kevinheavey committed Mar 24, 2024
1 parent 36cc928 commit dcd3d96
Showing 14 changed files with 343 additions and 349 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
-29800017
+2064708a
2 changes: 1 addition & 1 deletion index.html
@@ -249,7 +249,7 @@ <h2 class="anchored" data-anchor-id="running-the-code-yourself">Running the code
<p>You can install the exact packages that the book uses with the <a href="https://github.com/kevinheavey/modern-polars/blob/master/env.yml">env.yml</a> file:</p>
<pre class="shell"><code>mamba env create -f env.yml</code></pre>
<p>If you’re not using mamba/conda you can install the following package versions and it should work:</p>
-<pre><code>polars: 0.20.2
+<pre><code>polars: 0.20.16
pyarrow: 10.0.1
pandas: 2.1.1
numpy: 1.23.5
4 changes: 2 additions & 2 deletions indexing.html
@@ -354,7 +354,7 @@ <h2 data-number="1.2" class="anchored" data-anchor-id="read-the-data"><span clas
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>df_pd <span class="op">=</span> pd.read_csv(extracted)</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>df_pd</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
-<pre><code>/tmp/ipykernel_13811/2805799744.py:3: DtypeWarning:
+<pre><code>/tmp/ipykernel_28258/2805799744.py:3: DtypeWarning:

Columns (76,77,84) have mixed types. Specify dtype option on import or set low_memory=False.
</code></pre>
@@ -816,7 +816,7 @@ <h2 data-number="1.5" class="anchored" data-anchor-id="settingwithcopy"><span cl
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a>f[f[<span class="st">'a'</span>] <span class="op">&lt;=</span> <span class="dv">3</span>][<span class="st">'b'</span>] <span class="op">=</span> f[<span class="st">'b'</span>] <span class="op">//</span> <span class="dv">10</span></span>
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a>f</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
-<pre><code>/tmp/ipykernel_13811/1317853993.py:2: SettingWithCopyWarning:
+<pre><code>/tmp/ipykernel_28258/1317853993.py:2: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
4 changes: 2 additions & 2 deletions method_chaining.html
@@ -690,7 +690,7 @@ <h3 data-number="2.4.1" class="anchored" data-anchor-id="daily-flights"><span cl
<span id="cb11-27"><a href="#cb11-27" aria-hidden="true" tabindex="-1"></a> .plot()</span>
<span id="cb11-28"><a href="#cb11-28" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
-<pre><code>/tmp/ipykernel_13915/4223446110.py:17: FutureWarning:
+<pre><code>/tmp/ipykernel_28342/4223446110.py:17: FutureWarning:

The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
</code></pre>
@@ -775,7 +775,7 @@ <h3 data-number="2.4.2" class="anchored" data-anchor-id="planes-with-multiple-da
<span id="cb16-18"><a href="#cb16-18" aria-hidden="true" tabindex="-1"></a>sns.boxplot(x<span class="op">=</span><span class="st">"turn"</span>, y<span class="op">=</span><span class="st">"DepDelay"</span>, data<span class="op">=</span>flights_pd, ax<span class="op">=</span>ax)</span>
<span id="cb16-19"><a href="#cb16-19" aria-hidden="true" tabindex="-1"></a>ax.set_ylim(<span class="op">-</span><span class="dv">50</span>, <span class="dv">50</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
-<pre><code>/tmp/ipykernel_13915/2848021590.py:12: FutureWarning:
+<pre><code>/tmp/ipykernel_28342/2848021590.py:12: FutureWarning:

The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
</code></pre>
18 changes: 9 additions & 9 deletions performance.html
@@ -739,8 +739,8 @@ <h3 data-number="3.2.3" class="anchored" data-anchor-id="performance-comparison"
<span id="cb7-21"><a href="#cb7-21" aria-hidden="true" tabindex="-1"></a> .collect()</span>
<span id="cb7-22"><a href="#cb7-22" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>CPU times: user 182 ms, sys: 26.7 ms, total: 209 ms
-Wall time: 52 ms</code></pre>
+<pre><code>CPU times: user 149 ms, sys: 20.8 ms, total: 170 ms
+Wall time: 42.8 ms</code></pre>
</div>
</div>
</div>
@@ -762,8 +762,8 @@ <h3 data-number="3.2.3" class="anchored" data-anchor-id="performance-comparison"
<span id="cb9-14"><a href="#cb9-14" aria-hidden="true" tabindex="-1"></a> .rename(columns<span class="op">=</span>{<span class="st">"↓OVA"</span>: <span class="st">"OVA"</span>})</span>
<span id="cb9-15"><a href="#cb9-15" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>CPU times: user 7.14 s, sys: 401 ms, total: 7.54 s
-Wall time: 7.57 s</code></pre>
+<pre><code>CPU times: user 6.53 s, sys: 336 ms, total: 6.87 s
+Wall time: 6.86 s</code></pre>
</div>
</div>
</div>
@@ -1027,7 +1027,7 @@ <h3 data-number="3.3.2" class="anchored" data-anchor-id="calculate-great-circle-
<span id="cb16-8"><a href="#cb16-8" aria-hidden="true" tabindex="-1"></a> )</span>
<span id="cb16-9"><a href="#cb16-9" aria-hidden="true" tabindex="-1"></a>).collect()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>4 s ± 159 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
+<pre><code>3.91 s ± 53 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
</div>
</div>
<p>On my machine the NumPy version tends to be 5-20% faster than the pure Polars version:</p>
@@ -1042,7 +1042,7 @@ <h3 data-number="3.3.2" class="anchored" data-anchor-id="calculate-great-circle-
<span id="cb18-8"><a href="#cb18-8" aria-hidden="true" tabindex="-1"></a> )</span>
<span id="cb18-9"><a href="#cb18-9" aria-hidden="true" tabindex="-1"></a>).collect()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>5.19 s ± 296 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
+<pre><code>4.64 s ± 51.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
</div>
</div>
<p>This may not be a huge performance difference, but it at least means you don’t sacrifice speed when relying on NumPy. There are some <a href="https://pola-rs.github.io/polars-book/user-guide/howcani/interop/numpy.html">gotchas</a> though so watch out for those.</p>
@@ -1057,7 +1057,7 @@ <h3 data-number="3.3.2" class="anchored" data-anchor-id="calculate-great-circle-
<span id="cb20-7"><a href="#cb20-7" aria-hidden="true" tabindex="-1"></a> collected[<span class="st">"LONGITUDE_right"</span>].to_numpy()</span>
<span id="cb20-8"><a href="#cb20-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>6.23 s ± 79.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
+<pre><code>5.22 s ± 187 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
</div>
</div>
</section>
@@ -1092,15 +1092,15 @@ <h2 data-number="3.4" class="anchored" data-anchor-id="polars-can-be-slower-than
<div class="cell" data-execution_count="18">
<div class="sourceCode cell-code" id="cb23"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="op">%</span>timeit rand_df_pl.select(polars_transform())</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>3.32 s ± 91.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
+<pre><code>3.08 s ± 153 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
</div>
</div>
</div>
<div id="tabset-6-2" class="tab-pane" role="tabpanel" aria-labelledby="tabset-6-2-tab">
<div class="cell" data-execution_count="19">
<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a><span class="op">%</span>timeit pandas_transform(rand_df_pd)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>2.18 s ± 34.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
+<pre><code>2.18 s ± 27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)</code></pre>
</div>
</div>
</div>
10 changes: 5 additions & 5 deletions scaling.html
@@ -497,8 +497,8 @@ <h2 data-number="6.3" class="anchored" data-anchor-id="executing-multiple-querie
<span id="cb7-21"><a href="#cb7-21" aria-hidden="true" tabindex="-1"></a> comm_subplan_elim<span class="op">=</span><span class="va">False</span>, <span class="co"># cannot use CSE with streaming</span></span>
<span id="cb7-22"><a href="#cb7-22" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>CPU times: user 14.7 s, sys: 2.99 s, total: 17.7 s
-Wall time: 5.78 s</code></pre>
+<pre><code>CPU times: user 13.4 s, sys: 2.66 s, total: 16 s
+Wall time: 5.09 s</code></pre>
</div>
</div>
</div>
@@ -523,8 +523,8 @@ <h2 data-number="6.3" class="anchored" data-anchor-id="executing-multiple-querie
<span id="cb9-17"><a href="#cb9-17" aria-hidden="true" tabindex="-1"></a> avg_transaction_lazy_dd, total_by_employer_lazy_dd, avg_by_occupation_lazy_dd</span>
<span id="cb9-18"><a href="#cb9-18" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>CPU times: user 25.8 s, sys: 3.92 s, total: 29.8 s
-Wall time: 17.2 s</code></pre>
+<pre><code>CPU times: user 25.5 s, sys: 3.99 s, total: 29.5 s
+Wall time: 17 s</code></pre>
</div>
</div>
</div>
@@ -620,7 +620,7 @@ <h3 data-number="6.3.3" class="anchored" data-anchor-id="avg_by_occupation"><spa
white-space: pre-wrap;
}
</style>
-<small>shape: (10, 2)</small><table class="dataframe table table-sm table-striped"><thead><tr><th>OCCUPATION</th><th>TRANSACTION_AMT</th></tr><tr><td>cat</td><td>f64</td></tr></thead><tbody><tr><td>"CHAIRMAN CEO &amp;…</td><td>1.0233e6</td></tr><tr><td>"PAULSON AND CO…</td><td>1e6</td></tr><tr><td>"CO-FOUNDING DI…</td><td>875000.0</td></tr><tr><td></td><td></td></tr><tr><td>"CHIEF EXECUTIV</td><td>500000.0</td></tr><tr><td>"MOORE CAPITAL …</td><td>500000.0</td></tr></tbody></table></div>
+<small>shape: (10, 2)</small><table class="dataframe table table-sm table-striped"><thead><tr><th>OCCUPATION</th><th>TRANSACTION_AMT</th></tr><tr><td>cat</td><td>f64</td></tr></thead><tbody><tr><td>"CHAIRMAN CEO &amp;…</td><td>1.0233e6</td></tr><tr><td>"PAULSON AND CO…</td><td>1e6</td></tr><tr><td>"CO-FOUNDING DI…</td><td>875000.0</td></tr><tr><td></td><td></td></tr><tr><td>"MOORE CAPITAL </td><td>500000.0</td></tr><tr><td>"PERRY HOMES"</td><td>500000.0</td></tr></tbody></table></div>
</div>
</div>
</div>