align fix; theorem style; condition number #137

Merged · 3 commits · Oct 31, 2024
1 change: 1 addition & 0 deletions quarto/.gitignore
@@ -3,4 +3,5 @@
/_freeze/
/*/*_files/
/*/*.ipynb/
/*/references.bib
weave_support.jl
9 changes: 6 additions & 3 deletions quarto/ODEs/differential_equations.qmd
@@ -71,12 +71,13 @@ $$

The authors apply this model to flu statistics from Hong Kong where:


$$
\begin{align*}
S(0) &= 7,900,000\\
I(0) &= 10\\
R(0) &= 0\\
\end{align*}
$$

In `Julia` we define these values, with `N` modeling the total population and `u0` the proportions.
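
A minimal sketch of these definitions, assuming the names used in the text (the book's actual code may differ):

```julia
S0, I0, R0 = 7_900_000, 10, 0
N = S0 + I0 + R0          # total population
u0 = [S0, I0, R0] ./ N    # proportions s(0), i(0), r(0)
```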

@@ -130,12 +131,13 @@ The plot shows steady decay, as there is no mixing of infected with others.

Adding in the interaction requires a bit more work. We now have what is known as a *system* of equations:


$$
\begin{align*}
\frac{ds}{dt} &= -b \cdot s(t) \cdot i(t)\\
\frac{di}{dt} &= b \cdot s(t) \cdot i(t) - k \cdot i(t)\\
\frac{dr}{dt} &= k \cdot i(t)\\
\end{align*}
$$

Systems of equations can be solved in a manner similar to a single ordinary differential equation, though adjustments are made to accommodate the multiple functions.
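
One hedged way such a system might be set up with `DifferentialEquations.jl` (the parameter values for `b` and `k` below are placeholders, not the text's):

```julia
using DifferentialEquations

function sir!(du, u, p, t)
    s, i, r = u
    b, k = p
    du[1] = -b * s * i           # ds/dt
    du[2] =  b * s * i - k * i   # di/dt
    du[3] =  k * i               # dr/dt
end

prob = ODEProblem(sir!, [0.99, 0.01, 0.0], (0.0, 100.0), (1/2, 1/3))
sol = solve(prob)
```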

@@ -277,11 +279,12 @@ We now solve numerically the problem of a trajectory with a drag force from air

The general model is:


$$
\begin{align*}
x''(t) &= - W(t,x(t), x'(t), y(t), y'(t)) \cdot x'(t)\\
y''(t) &= -g - W(t,x(t), x'(t), y(t), y'(t)) \cdot y'(t)\\
\end{align*}
$$

with initial conditions: $x(0) = y(0) = 0$ and $x'(0) = v_0 \cos(\theta), y'(0) = v_0 \sin(\theta)$.
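
As a sketch of one way to treat this numerically, the second-order system can be rewritten in first-order form; the drag term `W` and the initial data below are placeholders, not the text's:

```julia
using DifferentialEquations

g = 9.8
W(t, x, vx, y, vy) = 0.0          # placeholder drag term; replace as needed

function trajectory!(du, u, p, t)
    x, vx, y, vy = u
    du[1] = vx
    du[2] = -W(t, x, vx, y, vy) * vx       # x''(t)
    du[3] = vy
    du[4] = -g - W(t, x, vx, y, vy) * vy   # y''(t)
end

v0, theta = 50.0, pi/4
u0 = [0.0, v0 * cos(theta), 0.0, v0 * sin(theta)]
sol = solve(ODEProblem(trajectory!, u0, (0.0, 10.0)))
```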

3 changes: 2 additions & 1 deletion quarto/ODEs/euler.qmd
@@ -602,12 +602,13 @@ $$

We can try the Euler method here. A simple approach might be this iteration scheme:


$$
\begin{align*}
x_{n+1} &= x_n + h,\\
u_{n+1} &= u_n + h v_n,\\
v_{n+1} &= v_n - h \cdot g/l \cdot \sin(u_n).
\end{align*}
$$

Here we need *two* initial conditions: one for the initial value $u(t_0)$ and one for the initial value of $u'(t_0)$. We have seen that if we start at an angle $a$ and release the bob from rest (so $u'(0)=0$), we get a sinusoidal answer to the linearized model. What happens here? We let $a=1$, $l=5$ and $g=9.8$:
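
A direct sketch of this iteration (assumed step size and helper name; the book's own code follows a similar pattern):

```julia
function pendulum_euler(a, l, g; h = 0.01, n = 1_000)
    u, v = a, 0.0                                  # u(0) = a, u'(0) = 0 (released from rest)
    us = [u]
    for _ in 1:n
        u, v = u + h * v, v - h * g / l * sin(u)   # simultaneous Euler updates
        push!(us, u)
    end
    us
end

us = pendulum_euler(1.0, 5.0, 9.8)
```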

9 changes: 6 additions & 3 deletions quarto/ODEs/odes.qmd
@@ -86,12 +86,13 @@ $$

Again, we can integrate to get an answer for any value $t$:


$$
\begin{align*}
x(t) - x(t_0) &= \int_{t_0}^t \frac{dx}{dt} dt \\
&= (v_0t + \frac{1}{2}a t^2 - at_0 t) |_{t_0}^t \\
&= (v_0 - at_0)(t - t_0) + \frac{1}{2} a (t^2 - t_0^2).
\end{align*}
$$

There are three constants: the initial value for the independent variable, $t_0$, and the two initial values for the velocity and position, $v_0, x_0$. Assuming $t_0 = 0$, we can simplify the above to get a formula familiar from introductory physics:
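
$$
x(t) = x_0 + v_0 t + \frac{1}{2} a t^2.
$$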

@@ -336,11 +337,12 @@ Differential equations are classified according to their type. Different types h

The first-order initial value equations we have seen can be described generally by


$$
\begin{align*}
y'(x) &= F(y,x),\\
y(x_0) &= y_0.
\end{align*}
$$

Special cases include:

@@ -667,12 +669,13 @@ Though `y` is messy, it can be seen that the answer is a quadratic polynomial in

In a resistive medium, there are drag forces at play. If this force is proportional to the velocity, say with proportionality constant $\gamma$, then the equations become:


$$
\begin{align*}
x''(t) &= -\gamma x'(t), & \quad y''(t) &= -\gamma y'(t) -g, \\
x(0) &= x_0, &\quad y(0) &= y_0,\\
x'(0) &= v_0\cos(\alpha),&\quad y'(0) &= v_0 \sin(\alpha).
\end{align*}
$$

We now attempt to solve these.
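
A hedged sketch of how the $x$ equation might be attacked symbolically with `SymPy` (the book's actual code may differ):

```julia
using SymPy

@syms t γ x()
eq = Eq(diff(x(t), t, 2), -γ * diff(x(t), t))   # x''(t) = -γ x'(t)
dsolve(eq, x(t))                                # x(t) = C₁ + C₂ exp(-γ t)
```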

4 changes: 3 additions & 1 deletion quarto/_make_pdf.jl
@@ -17,11 +17,13 @@ execute:
format:
typst:
toc: false
section-numbering: "1."
section-numbering: "1.1.1"
number-depth: 3
keep-typ: false
include-before-body:
- text: |
#set figure(placement: auto)
bibliography: references.bib
---
"""

2 changes: 1 addition & 1 deletion quarto/_quarto.yml
@@ -1,4 +1,4 @@
version: "0.21"
version: "0.22"
engine: julia

project:
3 changes: 2 additions & 1 deletion quarto/alternatives/SciML.qmd
@@ -573,13 +573,14 @@ As well, suppose we wanted to parameterize our function and then differentiate.

Consider $d/dp \int_0^\pi \sin(px) dx$. We can do this integral directly to get


$$
\begin{align*}
\frac{d}{dp} \int_0^\pi \sin(px)dx
&= \frac{d}{dp}\left( \frac{-1}{p} \cos(px)\Big\rvert_0^\pi\right)\\
&= \frac{d}{dp}\left( -\frac{\cos(p\cdot\pi)-1}{p}\right)\\
&= \frac{\cos(p\cdot \pi) - 1}{p^2} + \frac{\pi\cdot\sin(p\cdot\pi)}{p}
\end{align*}
$$

Using `Integrals` with `QuadGK` we have:
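
As a hedged sketch of the idea (here calling `QuadGK` directly rather than through the `Integrals` interface), one can wrap the integral in a function of $p$ and differentiate with `ForwardDiff`:

```julia
using QuadGK, ForwardDiff

F(p) = quadgk(x -> sin(p * x), 0, pi)[1]    # ∫₀^π sin(px) dx
ForwardDiff.derivative(F, 1.0)              # ≈ -2.0, matching the formula at p = 1
```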

46 changes: 24 additions & 22 deletions quarto/derivatives/condition.qmd
@@ -2,73 +2,74 @@

In Part III of @doi:10.1137/1.9781611977165 we find the language of numerical analysis useful for formally describing the zero-finding problem. Key concepts are errors, conditioning, and stability.

Abstractly a *problem* is a mapping, or function, from a domain $X$ of data to a range $Y$ of solutions. Both $X$ and $Y$ have a sense of distance given by a *norm*. A norm is a generalization of the absolute value.
Abstractly a *problem* is a mapping, $F$, from a domain $X$ of data to a range $Y$ of solutions. Both $X$ and $Y$ have a sense of distance given by a *norm*. A norm is a generalization of the absolute value and gives quantitative meaning to terms like small and large.


> A *well-conditioned* problem is one with the property that all small perturbations of $x$ lead to only small changes in $f(x)$.
> A *well-conditioned* problem is one with the property that all small perturbations of $x$ lead to only small changes in $F(x)$.

This sense of "small" is quantified through a *condition number*.
This sense of "small" is measured through a *condition number*.

If we let $\delta_x$ be a small perturbation of $x$ then $\delta_f = f(x + \delta_x) - f(x)$.
If we let $\delta_x$ be a small perturbation of $x$ then $\delta_F = F(x + \delta_x) - F(x)$.

The *forward error* is $\lVert\delta_f\rVert = \lVert f(x+\delta_x) - f(x)\rVert$, the *relative forward error* is $\lVert\delta_f\rVert/\lVert f\rVert = \lVert f(x+\delta_x) - f(x)\rVert/ \lVert f(x)\rVert$.
The *forward error* is $\lVert\delta_F\rVert = \lVert F(x+\delta_x) - F(x)\rVert$, the *relative forward error* is $\lVert\delta_F\rVert/\lVert F\rVert = \lVert F(x+\delta_x) - F(x)\rVert/ \lVert F(x)\rVert$.

The *backward error* is $\lVert\delta_x\rVert$, the *relative backward error* is $\lVert\delta_x\rVert / \lVert x\rVert$.

The *absolute condition number* $\hat{\kappa}$ is worst case of this ratio $\lVert\delta_f\rVert/ \lVert\delta_x\rVert$ as the perturbation size shrinks to $0$.
The relative condition number $\kappa$ divides $\lVert\delta_f\rVert$ by $\lVert f(x)\rVert$ and $\lVert\delta_x\rVert$ by $\lVert x\rVert$ before taking the ratio.
The *absolute condition number* $\hat{\kappa}$ is the worst case of this ratio $\lVert\delta_F\rVert/ \lVert\delta_x\rVert$ as the perturbation size shrinks to $0$.
The *relative condition number* $\kappa$ divides $\lVert\delta_F\rVert$ by $\lVert F(x)\rVert$ and $\lVert\delta_x\rVert$ by $\lVert x\rVert$ before taking the ratio.
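
For a differentiable scalar problem, the relative condition number works out to $\kappa(x) = |x \cdot F'(x)/F(x)|$. A small illustration, not from the text:

```julia
κ(F, F′, x) = abs(x * F′(x) / F(x))      # relative condition number for scalar F

κ(sqrt, x -> 1 / (2sqrt(x)), 2.0)        # 0.5: well-conditioned
κ(x -> x - 1, x -> 1.0, 1 + 1e-10)       # ≈ 1e10: ill-conditioned near x = 1
```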


A *problem* is a mathematical concept, an *algorithm* the computational version. Algorithms may differ for many reasons, such as floating point errors, tolerances, etc. We use notation $\tilde{f}$ to indicate the algorithm.
A *problem* is a mathematical concept, an *algorithm* its computational version. Algorithms may differ for many reasons, such as floating point errors, tolerances, etc. We use the notation $\tilde{F}$ to indicate the algorithm.

The absolute error in the algorithm is $\lVert\tilde{f}(x) - f(x)\rVert$, the relative error divides by $\lVert f(x)\rVert$. A good algorithm would have smaller relative errors.
The absolute error in the algorithm is $\lVert\tilde{F}(x) - F(x)\rVert$; the relative error divides by $\lVert F(x)\rVert$. A good algorithm has small relative errors.

An algorithm is called *stable* if

$$
\frac{\lVert\tilde{f}(x) - f(\tilde{x})\rVert}{\lVert f(\tilde{x})\rVert}
\frac{\lVert\tilde{F}(x) - F(\tilde{x})\rVert}{\lVert F(\tilde{x})\rVert}
$$

is *small* for *some* $\tilde{x}$ relatively near $x$, meaning $\lVert\tilde{x}-x\rVert/\lVert x\rVert$ is small.

> "A *stable* algorithm gives nearly the right answer to nearly the right question."
> A *stable* algorithm gives nearly the right answer to nearly the right question.

(The answer it gives is $\tilde{f}(x)$, the nearly right question is $f(\tilde{x})$.)
(The answer it gives is $\tilde{F}(x)$, the nearly right question: what is $F(\tilde{x})$?)

A related concept is an algorithm $\tilde{f}$ for a problem $f$ is *backward stable* if for each $x \in X$,
A related concept: an algorithm $\tilde{F}$ for a problem $F$ is *backward stable* if for each $x \in X$,

$$
\tilde{f}(x) = f(\tilde{x})
\tilde{F}(x) = F(\tilde{x})
$$

for some $\tilde{x}$ with $\lVert\tilde{x} - x\rVert/\lVert x\rVert$ is small.
for some $\tilde{x}$ with $\lVert\tilde{x} - x\rVert/\lVert x\rVert$ small.

> "A backward stable algorithm gives exactly the right answer to nearly the right question."


The concepts are related by Trefethen and Bao's Theorem 15.1 which says for a backward stable algorithm the relative error $\lVert\tilde{f}(x) - f(x)\rVert/\lVert f(x)\rVert$ is small in a manner proportional to the relative condition number.
The concepts are related by Trefethen and Bau's Theorem 15.1 which says for a backward stable algorithm the relative error $\lVert\tilde{F}(x) - F(x)\rVert/\lVert F(x)\rVert$ is small in a manner proportional to the relative condition number.
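
In symbols, with $\epsilon_{\text{machine}}$ the unit roundoff, the theorem's estimate reads:

$$
\frac{\lVert \tilde{F}(x) - F(x) \rVert}{\lVert F(x) \rVert} = O(\kappa(x)\,\epsilon_{\text{machine}}).
$$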

Applying this to zero-finding, we follow @doi:10.1137/1.9781611975086.

To be specific, the problem is finding a zero of $f$ starting at an initial point $x_0$. The data is $(f, x_0)$, the solution is $r$ a zero of $f$.
To be specific, the problem, $F$, is finding a zero of a function $f$ starting at an initial point $x_0$. The data is $(f, x_0)$; the solution is $r$, a zero of $f$.

Take the algorithm as Newton's method. Any implementation must incorporate tolerances, so this is a computational approximation to the problem. The data is the same, but technically we use $\tilde{f}$ for the function, as any computation depends on machine implementations. The output is $\tilde{r}$, an *approximate* zero.

Suppose for sake of argument that $\tilde{f}(x) = f(x) + \epsilon$ and $r$ is a root of $f$ and $\tilde{r}$ is a root of $\tilde{f}$. Then by linearization:
Suppose, for the sake of argument, that $\tilde{f}(x) = f(x) + \epsilon$, that $f$ has a continuous derivative, and that $r$ is a root of $f$ while $\tilde{r}$ is a root of $\tilde{f}$. Then by linearization:

$$
\begin{align*}
0 &= \tilde{f}(\tilde r) \\
&= f(r + \delta) + \epsilon\\
&\approx f(r) + f'(r)\delta + \epsilon\\
&= 0 + f'(r)\delta + \epsilon
\end{align*}
$$

Rearranging gives $\lVert\delta/\epsilon\rVert \approx 1/\lVert f'(r)\rVert$ leading to:
Rearranging gives $\lVert\delta/\epsilon\rVert \approx 1/\lVert f'(r)\rVert$. But the $|\delta|/|\epsilon|$ ratio is related to the condition number:

> The absolute condition number is $\hat{\kappa}_r = |f'(r)|^{-1}$.


The error formula in Newton's method includes the derivative in the denominator, so we see large condition numbers are tied into larger errors.
The error formula in Newton's method, which measures the distance between the actual root and an approximation, includes the derivative in the denominator, so large condition numbers are tied to possibly larger errors.

Now consider $g(x) = f(x) - f(\tilde{r})$. Call $f(\tilde{r})$ the residual. Then $g$ is near $f$ if the residual is small. The algorithm solves $(g, x_0)$ exactly with $\tilde{r}$, so with a small residual an exact solution to an approximate question is found. Driscoll and Braun state

@@ -83,4 +84,5 @@ Practically these two observations lead to

For the first observation, the example of Wilkinson's polynomial is often used, where $f(x) = (x-1)\cdot(x-2)\cdot\cdots\cdot(x-20)$. When expanded, the coefficients of this function cannot all be represented exactly in floating point; the condition number is large, and some of the roots found are quite different from the mathematical values.
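
A quick hedged illustration using `Polynomials.jl` (not the book's code):

```julia
using Polynomials

p = fromroots(1.0:20.0)          # Wilkinson's polynomial built from its roots
rts = sort(real.(roots(p)))      # roots recovered from the expanded coefficients
maximum(abs.(rts .- (1:20)))     # noticeable error, worst at the larger roots
```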

The second observation helps explain why a problem like finding the zero of $f(x) = x \cdot \exp(-x)$ using Newton's method starting at $2$ might return a value like $5.89\dots$. The residual is checked to be zero in a *relative* manner, basically using a tolerance of `atol + abs(xn)*rtol`. Functions with a horizontal asymptote of $0$ will eventually be smaller than this value.

The second observation follows from the fact that $f(x_n)$ monitors the backward error, while the product of the condition number and the backward error monitors the forward error. This product is on the order of $|f(x_n)/f'(x_n)|$ or $|x_{n+1} - x_n|$.
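
A minimal sketch of this failure mode, with assumed tolerances and a hypothetical helper (not the book's code): Newton's method with a relative residual check, applied to a function with a horizontal asymptote of $0$:

```julia
f(x)  = x * exp(-x)
fp(x) = exp(-x) * (1 - x)

function newton_residual(f, fp, x; atol = 1e-12, rtol = 1e-12, maxiters = 100)
    for _ in 1:maxiters
        fx = f(x)
        abs(fx) <= atol + abs(x) * rtol && return x   # relative residual check
        x -= fx / fp(x)                               # Newton step
    end
    x
end

newton_residual(f, fp, 2.0)   # a large positive x, far from the true zero at 0
```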