align fix; theorem style; condition number #137

Merged · 3 commits · Oct 31, 2024
1 change: 1 addition & 0 deletions quarto/.gitignore
@@ -3,4 +3,5 @@
/_freeze/
/*/*_files/
/*/*.ipynb/
/*/references.bib
weave_support.jl
9 changes: 6 additions & 3 deletions quarto/ODEs/differential_equations.qmd
@@ -71,12 +71,13 @@ $$

The authors apply this model to flu statistics from Hong Kong where:


$$
\begin{align*}
S(0) &= 7,900,000\\
I(0) &= 10\\
R(0) &= 0\\
\end{align*}
$$

In `Julia` we define these values, with `N` modeling the total population and `u0` the proportions.
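
A minimal sketch of these definitions, assuming the names used in the text (the book's actual code may differ):

```julia
S0, I0, R0 = 7_900_000, 10, 0
N = S0 + I0 + R0          # total population
u0 = [S0, I0, R0] ./ N    # proportions s(0), i(0), r(0)
```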

@@ -130,12 +131,13 @@ The plot shows steady decay, as there is no mixing of infected with others.

Adding in the interaction requires a bit more work. We now have what is known as a *system* of equations:


$$
\begin{align*}
\frac{ds}{dt} &= -b \cdot s(t) \cdot i(t)\\
\frac{di}{dt} &= b \cdot s(t) \cdot i(t) - k \cdot i(t)\\
\frac{dr}{dt} &= k \cdot i(t)\\
\end{align*}
$$

Systems of equations can be solved in a manner similar to a single ordinary differential equation, though adjustments are made to accommodate the multiple functions.
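
One hedged way such a system might be set up with `DifferentialEquations.jl` (the parameter values for `b` and `k` below are placeholders, not the text's):

```julia
using DifferentialEquations

function sir!(du, u, p, t)
    s, i, r = u
    b, k = p
    du[1] = -b * s * i           # ds/dt
    du[2] =  b * s * i - k * i   # di/dt
    du[3] =  k * i               # dr/dt
end

prob = ODEProblem(sir!, [0.99, 0.01, 0.0], (0.0, 100.0), (1/2, 1/3))
sol = solve(prob)
```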

@@ -277,11 +279,12 @@ We now solve numerically the problem of a trajectory with a drag force from air

The general model is:


$$
\begin{align*}
x''(t) &= - W(t,x(t), x'(t), y(t), y'(t)) \cdot x'(t)\\
y''(t) &= -g - W(t,x(t), x'(t), y(t), y'(t)) \cdot y'(t)\\
\end{align*}
$$

with initial conditions: $x(0) = y(0) = 0$ and $x'(0) = v_0 \cos(\theta), y'(0) = v_0 \sin(\theta)$.
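
As a sketch of one way to treat this numerically, the second-order system can be rewritten in first-order form; the drag term `W` and the initial data below are placeholders, not the text's:

```julia
using DifferentialEquations

g = 9.8
W(t, x, vx, y, vy) = 0.0          # placeholder drag term; replace as needed

function trajectory!(du, u, p, t)
    x, vx, y, vy = u
    du[1] = vx
    du[2] = -W(t, x, vx, y, vy) * vx       # x''(t)
    du[3] = vy
    du[4] = -g - W(t, x, vx, y, vy) * vy   # y''(t)
end

v0, theta = 50.0, pi/4
u0 = [0.0, v0 * cos(theta), 0.0, v0 * sin(theta)]
sol = solve(ODEProblem(trajectory!, u0, (0.0, 10.0)))
```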

3 changes: 2 additions & 1 deletion quarto/ODEs/euler.qmd
@@ -602,12 +602,13 @@ $$

We can try the Euler method here. A simple approach might be this iteration scheme:


$$
\begin{align*}
x_{n+1} &= x_n + h,\\
u_{n+1} &= u_n + h v_n,\\
v_{n+1} &= v_n - h \cdot g/l \cdot \sin(u_n).
\end{align*}
$$

Here we need *two* initial conditions: one for the initial value $u(t_0)$ and one for the initial value of $u'(t_0)$. We have seen that if we start at an angle $a$ and release the bob from rest (so $u'(0)=0$), we get a sinusoidal answer to the linearized model. What happens here? We let $a=1$, $l=5$ and $g=9.8$:
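
A direct sketch of this iteration (assumed step size and helper name; the book's own code follows a similar pattern):

```julia
function pendulum_euler(a, l, g; h = 0.01, n = 1_000)
    u, v = a, 0.0                                  # u(0) = a, u'(0) = 0 (released from rest)
    us = [u]
    for _ in 1:n
        u, v = u + h * v, v - h * g / l * sin(u)   # simultaneous Euler updates
        push!(us, u)
    end
    us
end

us = pendulum_euler(1.0, 5.0, 9.8)
```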

9 changes: 6 additions & 3 deletions quarto/ODEs/odes.qmd
@@ -86,12 +86,13 @@ $$

Again, we can integrate to get an answer for any value $t$:


$$
\begin{align*}
x(t) - x(t_0) &= \int_{t_0}^t \frac{dx}{dt} dt \\
&= (v_0t + \frac{1}{2}a t^2 - at_0 t) |_{t_0}^t \\
&= (v_0 - at_0)(t - t_0) + \frac{1}{2} a (t^2 - t_0^2).
\end{align*}
$$

There are three constants: the initial value for the independent variable, $t_0$, and the two initial values for the velocity and position, $v_0, x_0$. Assuming $t_0 = 0$, we can simplify the above to get a formula familiar from introductory physics:
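
$$
x(t) = x_0 + v_0 t + \frac{1}{2} a t^2.
$$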

@@ -336,11 +337,12 @@ Differential equations are classified according to their type. Different types h

The first-order initial value equations we have seen can be described generally by


$$
\begin{align*}
y'(x) &= F(y,x),\\
y(x_0) &= y_0.
\end{align*}
$$

Special cases include:

@@ -667,12 +669,13 @@ Though `y` is messy, it can be seen that the answer is a quadratic polynomial in

In a resistive medium, there are drag forces at play. If this force is proportional to the velocity, say with proportionality constant $\gamma$, then the equations become:


$$
\begin{align*}
x''(t) &= -\gamma x'(t), & \quad y''(t) &= -\gamma y'(t) -g, \\
x(0) &= x_0, &\quad y(0) &= y_0,\\
x'(0) &= v_0\cos(\alpha),&\quad y'(0) &= v_0 \sin(\alpha).
\end{align*}
$$

We now attempt to solve these.
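
A hedged sketch of how the $x$ equation might be attacked symbolically with `SymPy` (the book's actual code may differ):

```julia
using SymPy

@syms t γ x()
eq = Eq(diff(x(t), t, 2), -γ * diff(x(t), t))   # x''(t) = -γ x'(t)
dsolve(eq, x(t))                                # x(t) = C₁ + C₂ exp(-γ t)
```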

4 changes: 3 additions & 1 deletion quarto/_make_pdf.jl
@@ -17,11 +17,13 @@ execute:
format:
typst:
toc: false
section-numbering: "1."
section-numbering: "1.1.1"
number-depth: 3
keep-typ: false
include-before-body:
- text: |
#set figure(placement: auto)
bibliography: references.bib
---
"""

2 changes: 1 addition & 1 deletion quarto/_quarto.yml
@@ -1,4 +1,4 @@
version: "0.21"
version: "0.22"
engine: julia

project:
3 changes: 2 additions & 1 deletion quarto/alternatives/SciML.qmd
@@ -573,13 +573,14 @@ As well, suppose we wanted to parameterize our function and then differentiate.

Consider $d/dp \int_0^\pi \sin(px) dx$. We can do this integral directly to get


$$
\begin{align*}
\frac{d}{dp} \int_0^\pi \sin(px)dx
&= \frac{d}{dp}\left( \frac{-1}{p} \cos(px)\Big\rvert_0^\pi\right)\\
&= \frac{d}{dp}\left( -\frac{\cos(p\cdot\pi)-1}{p}\right)\\
&= \frac{\cos(p\cdot \pi) - 1}{p^2} + \frac{\pi\cdot\sin(p\cdot\pi)}{p}
\end{align*}
$$

Using `Integrals` with `QuadGK` we have:
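
As a hedged sketch of the idea (here calling `QuadGK` directly rather than through the `Integrals` interface), one can wrap the integral in a function of $p$ and differentiate with `ForwardDiff`:

```julia
using QuadGK, ForwardDiff

F(p) = quadgk(x -> sin(p * x), 0, pi)[1]    # ∫₀^π sin(px) dx
ForwardDiff.derivative(F, 1.0)              # ≈ -2.0, matching the formula at p = 1
```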

46 changes: 24 additions & 22 deletions quarto/derivatives/condition.qmd
@@ -2,73 +2,74 @@

In Part III of @doi:10.1137/1.9781611977165 we find the language of numerical analysis useful for formally describing the zero-finding problem. Key concepts are errors, conditioning, and stability.

Abstractly a *problem* is a mapping, or function, from a domain $X$ of data to a range $Y$ of solutions. Both $X$ and $Y$ have a sense of distance given by a *norm*. A norm is a generalization of the absolute value.
Abstractly a *problem* is a mapping, $F$, from a domain $X$ of data to a range $Y$ of solutions. Both $X$ and $Y$ have a sense of distance given by a *norm*. A norm is a generalization of the absolute value and gives quantitative meaning to terms like small and large.


> A *well-conditioned* problem is one with the property that all small perturbations of $x$ lead to only small changes in $f(x)$.
> A *well-conditioned* problem is one with the property that all small perturbations of $x$ lead to only small changes in $F(x)$.

This sense of "small" is quantified through a *condition number*.
This sense of "small" is measured through a *condition number*.

If we let $\delta_x$ be a small perturbation of $x$ then $\delta_f = f(x + \delta_x) - f(x)$.
If we let $\delta_x$ be a small perturbation of $x$ then $\delta_F = F(x + \delta_x) - F(x)$.

The *forward error* is $\lVert\delta_f\rVert = \lVert f(x+\delta_x) - f(x)\rVert$, the *relative forward error* is $\lVert\delta_f\rVert/\lVert f\rVert = \lVert f(x+\delta_x) - f(x)\rVert/ \lVert f(x)\rVert$.
The *forward error* is $\lVert\delta_F\rVert = \lVert F(x+\delta_x) - F(x)\rVert$, the *relative forward error* is $\lVert\delta_F\rVert/\lVert F\rVert = \lVert F(x+\delta_x) - F(x)\rVert/ \lVert F(x)\rVert$.

The *backward error* is $\lVert\delta_x\rVert$, the *relative backward error* is $\lVert\delta_x\rVert / \lVert x\rVert$.

The *absolute condition number* $\hat{\kappa}$ is worst case of this ratio $\lVert\delta_f\rVert/ \lVert\delta_x\rVert$ as the perturbation size shrinks to $0$.
The relative condition number $\kappa$ divides $\lVert\delta_f\rVert$ by $\lVert f(x)\rVert$ and $\lVert\delta_x\rVert$ by $\lVert x\rVert$ before taking the ratio.
The *absolute condition number* $\hat{\kappa}$ is the worst case of this ratio $\lVert\delta_F\rVert/ \lVert\delta_x\rVert$ as the perturbation size shrinks to $0$.
The *relative condition number* $\kappa$ divides $\lVert\delta_F\rVert$ by $\lVert F(x)\rVert$ and $\lVert\delta_x\rVert$ by $\lVert x\rVert$ before taking the ratio.
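
For a differentiable scalar problem, the relative condition number works out to $\kappa(x) = |x \cdot F'(x)/F(x)|$. A small illustration, not from the text:

```julia
κ(F, F′, x) = abs(x * F′(x) / F(x))      # relative condition number for scalar F

κ(sqrt, x -> 1 / (2sqrt(x)), 2.0)        # 0.5: well-conditioned
κ(x -> x - 1, x -> 1.0, 1 + 1e-10)       # ≈ 1e10: ill-conditioned near x = 1
```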


A *problem* is a mathematical concept, an *algorithm* the computational version. Algorithms may differ for many reasons, such as floating point errors, tolerances, etc. We use notation $\tilde{f}$ to indicate the algorithm.
A *problem* is a mathematical concept, an *algorithm* its computational version. Algorithms may differ for many reasons, such as floating point errors, tolerances, etc. We use the notation $\tilde{F}$ to indicate the algorithm.

The absolute error in the algorithm is $\lVert\tilde{f}(x) - f(x)\rVert$, the relative error divides by $\lVert f(x)\rVert$. A good algorithm would have smaller relative errors.
The absolute error in the algorithm is $\lVert\tilde{F}(x) - F(x)\rVert$; the relative error divides by $\lVert F(x)\rVert$. A good algorithm has small relative errors.

An algorithm is called *stable* if

$$
\frac{\lVert\tilde{f}(x) - f(\tilde{x})\rVert}{\lVert f(\tilde{x})\rVert}
\frac{\lVert\tilde{F}(x) - F(\tilde{x})\rVert}{\lVert F(\tilde{x})\rVert}
$$

is *small* for *some* $\tilde{x}$ relatively near $x$, meaning $\lVert\tilde{x}-x\rVert/\lVert x\rVert$ is small.

> "A *stable* algorithm gives nearly the right answer to nearly the right question."
> A *stable* algorithm gives nearly the right answer to nearly the right question.

(The answer it gives is $\tilde{f}(x)$, the nearly right question is $f(\tilde{x})$.)
(The answer it gives is $\tilde{F}(x)$, the nearly right question: what is $F(\tilde{x})$?)

A related concept is an algorithm $\tilde{f}$ for a problem $f$ is *backward stable* if for each $x \in X$,
A related concept: an algorithm $\tilde{F}$ for a problem $F$ is *backward stable* if for each $x \in X$,

$$
\tilde{f}(x) = f(\tilde{x})
\tilde{F}(x) = F(\tilde{x})
$$

for some $\tilde{x}$ with $\lVert\tilde{x} - x\rVert/\lVert x\rVert$ is small.
for some $\tilde{x}$ with $\lVert\tilde{x} - x\rVert/\lVert x\rVert$ small.

> "A backward stable algorithm gives exactly the right answer to nearly the right question."


The concepts are related by Trefethen and Bao's Theorem 15.1 which says for a backward stable algorithm the relative error $\lVert\tilde{f}(x) - f(x)\rVert/\lVert f(x)\rVert$ is small in a manner proportional to the relative condition number.
The concepts are related by Trefethen and Bau's Theorem 15.1 which says for a backward stable algorithm the relative error $\lVert\tilde{F}(x) - F(x)\rVert/\lVert F(x)\rVert$ is small in a manner proportional to the relative condition number.
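
In symbols, with $\epsilon_{\text{machine}}$ the unit roundoff, the theorem's estimate reads:

$$
\frac{\lVert \tilde{F}(x) - F(x) \rVert}{\lVert F(x) \rVert} = O(\kappa(x)\,\epsilon_{\text{machine}}).
$$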

Applying this to zero-finding, we follow @doi:10.1137/1.9781611975086.

To be specific, the problem is finding a zero of $f$ starting at an initial point $x_0$. The data is $(f, x_0)$, the solution is $r$ a zero of $f$.
To be specific, the problem, $F$, is finding a zero of a function $f$ starting at an initial point $x_0$. The data is $(f, x_0)$; the solution is $r$, a zero of $f$.

Take the algorithm as Newton's method. Any implementation must incorporate tolerances, so this is a computational approximation to the problem. The data is the same, but technically we use $\tilde{f}$ for the function, as any computation depends on machine implementations. The output is $\tilde{r}$, an *approximate* zero.

Suppose for sake of argument that $\tilde{f}(x) = f(x) + \epsilon$ and $r$ is a root of $f$ and $\tilde{r}$ is a root of $\tilde{f}$. Then by linearization:
Suppose, for the sake of argument, that $\tilde{f}(x) = f(x) + \epsilon$, that $f$ has a continuous derivative, and that $r$ is a root of $f$ while $\tilde{r}$ is a root of $\tilde{f}$. Then by linearization:

$$
\begin{align*}
0 &= \tilde{f}(\tilde r) \\
&= f(r + \delta) + \epsilon\\
&\approx f(r) + f'(r)\delta + \epsilon\\
&= 0 + f'(r)\delta + \epsilon
\end{align*}
$$

Rearranging gives $\lVert\delta/\epsilon\rVert \approx 1/\lVert f'(r)\rVert$ leading to:
Rearranging gives $\lVert\delta/\epsilon\rVert \approx 1/\lVert f'(r)\rVert$. But the $|\delta|/|\epsilon|$ ratio is related to the condition number:

> The absolute condition number is $\hat{\kappa}_r = |f'(r)|^{-1}$.


The error formula in Newton's method includes the derivative in the denominator, so we see large condition numbers are tied into larger errors.
The error formula in Newton's method, which measures the distance between the actual root and an approximation, includes the derivative in the denominator, so large condition numbers are tied to possibly larger errors.

Now consider $g(x) = f(x) - f(\tilde{r})$. Call $f(\tilde{r})$ the residual. Then $g$ is near $f$ if the residual is small. The algorithm solves $(g, x_0)$ exactly with $\tilde{r}$, so with a small residual an exact solution to an approximate question is found. Driscoll and Braun state

@@ -83,4 +84,5 @@ Practically these two observations lead to

For the first observation, the example of Wilkinson's polynomial is often used, where $f(x) = (x-1)\cdot(x-2)\cdot\cdots\cdot(x-20)$. When expanded, the coefficients of this function cannot all be represented exactly in floating point; the condition number is large, and some of the roots found are quite different from the mathematical values.
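
A quick hedged illustration using `Polynomials.jl` (not the book's code):

```julia
using Polynomials

p = fromroots(1.0:20.0)          # Wilkinson's polynomial built from its roots
rts = sort(real.(roots(p)))      # roots recovered from the expanded coefficients
maximum(abs.(rts .- (1:20)))     # noticeable error, worst at the larger roots
```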

The second observation helps explain why a problem like finding the zero of $f(x) = x \cdot \exp(-x)$ using Newton's method starting at $2$ might return a value like $5.89\dots$. The residual is checked to be zero in a *relative* manner, basically using a tolerance of `atol + abs(xn)*rtol`. Functions with a horizontal asymptote of $0$ will eventually be smaller than this value.

The second observation follows from the fact that $f(x_n)$ monitors the backward error, while the product of the condition number and the backward error monitors the forward error. This product is on the order of $|f(x_n)/f'(x_n)|$ or $|x_{n+1} - x_n|$.
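
A minimal sketch of this failure mode, with assumed tolerances and a hypothetical helper (not the book's code): Newton's method with a relative residual check, applied to a function with a horizontal asymptote of $0$:

```julia
f(x)  = x * exp(-x)
fp(x) = exp(-x) * (1 - x)

function newton_residual(f, fp, x; atol = 1e-12, rtol = 1e-12, maxiters = 100)
    for _ in 1:maxiters
        fx = f(x)
        abs(fx) <= atol + abs(x) * rtol && return x   # relative residual check
        x -= fx / fp(x)                               # Newton step
    end
    x
end

newton_residual(f, fp, 2.0)   # a large positive x, far from the true zero at 0
```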