Commit

proofreading
lrnv committed Feb 20, 2024
1 parent 8655de5 commit 1f591f7
Showing 2 changed files with 17 additions and 19 deletions.
35 changes: 17 additions & 18 deletions docs/src/getting_started.md
@@ -4,11 +4,10 @@ CurrentModule = Copulas

## Multivariate random vectors

This section gives some general definitions and tools about dependence structures. Along this journey through the mathematical theory of copulas, we link to the rest of the documentation for more specific and detailed arguments on particular points, or simply to link the technical documentation of our implementation.

The interested theoretical reader can take a look at the standard books on the subject [joe1997,cherubini2004,nelsen2006,joe2014](@cite) or more recently [mai2017, durante2015a, czado2019,grosser2021](@cite). We start here by defining a few concepts about dependence structures and copulas.

This section gives some general definitions and tools about dependence structures, multivariate random vectors and copulas. Along this journey through the mathematical theory of copulas, we link to the rest of the documentation for more specific and detailed arguments on particular points, or simply to the technical documentation of the actual implementation.
The interested theoretical reader can take a look at the standard books on the subject [joe1997,cherubini2004,nelsen2006,joe2014](@cite) or more recently [mai2017, durante2015a, czado2019,grosser2021](@cite).

We start here by defining a few concepts about dependence structures and copulas.
Consider a real valued random vector $\bm X = \left(X_1,...,X_d\right): \Omega \to \mathbb R^d$. The random variables $X_1,...,X_d$ are called the marginals of the random vector $\bm X$.

!!! info "Constructing random variables in Julia via `Distributions.jl`"
@@ -23,10 +22,10 @@ Consider a real valued random vector $\bm X = \left(X_1,...,X_d\right): \Omega \
nothing # hide
```

We refer to [Distributions.jl's documentation](https://github.com/JuliaStats/Distributions.jl) for more details on what you can do with these objects, but here we assume that you are familiar with their API.
We refer to [Distributions.jl's documentation](https://github.com/JuliaStats/Distributions.jl) for more details on what you can do with these objects. We assume here that you are familiar with their API.


The distribution of the random vector $\bm X$ can be characterized by its distribution function $F$:
The probability distribution of the random vector $\bm X$ can be characterized by its *distribution function* $F$:
```math
\begin{align*}
F(\bm x) &= \mathbb P\left(\bm X \le \bm x\right)\\
@@ -70,7 +69,7 @@ u = rand(C,10)
cdf(C,u)
```

At the grounds of the theory of copulas lies Sklar's Theorem [sklar1959](@cite), dating back from 1959.
One of the reasons copulas are so useful was discovered by Sklar [sklar1959](@cite) in 1959:

> **Theorem (Sklar):** For every random vector $\bm X$, there exists a copula $C$ such that
>
@@ -112,19 +111,19 @@ On the other hand, the [`pseudo()`](@ref Pseudo-observations) function computes

!!! note "Independent random vectors"

Distributions.jl provides the [`product_distribution`](https://juliastats.org/Distributions.jl/stable/multivariate/#Product-distributions) function to create those independent random vectors with given marginals. But you'll see that our approach is much more powerful.
Distributions.jl provides the [`product_distribution`](https://juliastats.org/Distributions.jl/stable/multivariate/#Product-distributions) function to create those independent random vectors with given marginals. But you can already see that our approach generalizes to other dependence structures, and is thus much more powerful.

Copulas are bounded functions
Copulas are bounded functions with values in [0,1] since they correspond to probabilities. But their range can be bounded more precisely:

> **Property (Fréchet-Hoeffding bounds [lux2017](@cite)):** For all $\bm x \in [0,1]^d$, every copula $C$ satisfies:
>
>$\langle \bm 1, \bm x - 1 + d^{-1}\rangle_{+} \le C(\bm x) \le \min \bm x,$
>where $y_{+} = \max(0,y)$.
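
As a quick worked instance of this property, consider $d=2$ and write $\bm x = (u,v)$. The inner product then simplifies to $(u - \tfrac12) + (v - \tfrac12) = u + v - 1$, so the bounds read:

```math
\max(0,\, u + v - 1) \le C(u,v) \le \min(u,v).
```

For example, at $(u,v) = (0.5, 0.5)$ the independence copula $\Pi(u,v) = uv$ (a standard example, introduced here only for illustration) takes the value $0.25$, which indeed lies between the lower bound $0$ and the upper bound $0.5$.
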
> **Example (Fréchet-Hoeffding bounds [lux2017](@cite)):** The function $M : \bm x \mapsto \min\bm x$, called the upper Fréchet-Hoeffding bound, is a copula. The function $W : \bm x \mapsto \langle \bm 1, \bm x - 1 + d^{-1}\rangle_{+}$, called the lower Fréchet-Hoeffding bound, is on the other hand a copula only when $d=2$.
The function $M : \bm x \mapsto \min\bm x$, called the upper Fréchet-Hoeffding bound, is a copula. The function $W : \bm x \mapsto \langle \bm 1, \bm x - 1 + d^{-1}\rangle_{+}$, called the lower Fréchet-Hoeffding bound, is on the other hand a copula only when $d=2$.
These two copulas can be constructed through [`MCopula(d)`](@ref MGenerator) and [`WCopula(2)`](@ref WGenerator).
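
To illustrate why $W$ stops being a copula when $d \ge 3$, here is a minimal hand-rolled sketch in plain Julia. It does not use Copulas.jl: the helper names `M`, `W`, and `cube_volume` are assumptions made up for this illustration, not part of the package API. A copula must assign a non-negative volume (a probability) to every hyperrectangle, and the sketch computes the volume of the cube $[1/2,1]^d$ as the usual signed sum over its $2^d$ vertices.

```julia
# Hand-written versions of the two Fréchet-Hoeffding bounds
# (illustrative helpers, not the Copulas.jl implementation).
M(x) = minimum(x)
W(x) = max(0.0, sum(x) - length(x) + 1)

# C-volume of the cube [a, b]^d: signed sum of C over the 2^d vertices,
# with sign (-1)^(number of coordinates equal to the lower endpoint a).
function cube_volume(C, a, b, d)
    vol = 0.0
    for mask in 0:(2^d - 1)
        # Bit i of `mask` decides whether coordinate i sits at a or at b.
        picks_a = [isodd(mask >> (i - 1)) for i in 1:d]
        vertex = [p ? a : b for p in picks_a]
        vol += (-1)^count(picks_a) * C(vertex)
    end
    return vol
end

cube_volume(W, 0.5, 1.0, 2)  # 0.0: non-negative, as required in d = 2
cube_volume(W, 0.5, 1.0, 3)  # -0.5: a negative "probability" in d = 3
cube_volume(M, 0.5, 1.0, 3)  # 0.5: M behaves like a distribution function
```

The negative value in dimension 3 means that $W$ would assign a negative probability to that cube, so it cannot be a distribution function there, while $M$ passes the same check.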


These two copulas can be constructed through [`MCopula(d)`](@ref MGenerator) and [`WCopula(2)`](@ref WGenerator). The upper Fréchet-Hoeffding bound corresponds to the case of a comonotone random vector: a random vector $\bm X$ is said to be comonotone, i.e., to have copula $M$, when each of its marginals can be written as a non-decreasing transformation of the same random variable (say with $\mathcal U\left([0,1]\right)$ distribution). This is a simple but important dependence structure. See, e.g., [kaas2002,hua2017](@cite) on this particular copula. Note that sampling from them is quite straightforward due to their particular shape:
The upper Fréchet-Hoeffding bound corresponds to the case of a comonotone random vector: a random vector $\bm X$ is said to be comonotone, i.e., to have copula $M$, when each of its marginals can be written as a non-decreasing transformation of the same random variable (say with $\mathcal U\left([0,1]\right)$ distribution). This is a simple but important dependence structure. See, e.g., [kaas2002,hua2017](@cite) on this particular copula. Note that implementing their samplers was straightforward due to their particular shapes:

```@example 1
rand(MCopula(2),10) # sampled values are all equal, this is comonotony
@@ -138,11 +137,11 @@ Since copulas are distribution functions, like distribution functions of real-va

## Fitting copulas and compound distributions.

`Distributions.jl` proposes the `fit` function in their API for random vectors and random variables. We used it to implement fitting of multivariate models (copulas, of course, but also compound distributions). It can be used as follows:

`Distributions.jl`'s API contains a `fit` function for random vectors and random variables. We propose an implementation of it for copulas and multivariate compound distributions (composed of a copula and some given marginals). It can be used as follows:

```@example 2
using Copulas, Distributions, Random
# Construct a given model:
X₁ = Gamma(2,3)
X₂ = Pareto()
X₃ = LogNormal(0,1)
@@ -155,14 +154,14 @@ simu = rand(D,1000) # Generate a dataset
D̂ = fit(SklarDist{ClaytonCopula,Tuple{Gamma,Normal,LogNormal}}, simu)
```

We see on the output that the parameters were correctly estimated from this sample. More details on the estimator, including e.g. standard errors, can be obtained from e.g., Bayesian approach, see [this example](@ref Bayesian-inference-with-Turing.jl).
We see on the output that the parameters were correctly estimated from this sample. More details on the estimator, including, e.g., standard errors, may be obtained with more complicated estimation routines. For a Bayesian approach using `Turing.jl`, see [this example](@ref Bayesian-inference-with-Turing.jl).

!!! info "About fitting methods"
[`Distributions.jl` documentation](https://juliastats.org/Distributions.jl/stable/fit/#Distribution-Fitting) states that :
!!! info "Fitting procedures are not part of the API"
[`Distributions.jl` documentation](https://juliastats.org/Distributions.jl/stable/fit/#Distribution-Fitting) states that:

> The fit function will choose a reasonable way to fit the distribution, which, in most cases, is maximum likelihood estimation.

The results of this fitting function should then only be used as "quick-and-dirty" fits, since the fitting method is "hidden" from the user. We embrace this philosophy: from one copula to the other, the fitting method might not be the same.
The results of this fitting function should then only be used as "quick-and-dirty" fits, since the fitting method is "hidden" from the user and might even change without a breaking release. We embrace this philosophy: from one copula to the other, the fitting method might not be the same.

## Going further

1 change: 0 additions & 1 deletion docs/src/index.md
@@ -13,7 +13,6 @@ The [Copulas.jl](https://github.com/lrnv/Copulas.jl) package provides a large co
Since copulas are distribution functions, we fully comply with the [`Distributions.jl`](https://github.com/JuliaStats/Distributions.jl) API. This compliance allows direct interoperability with other packages based on this API such as, e.g., [`Turing.jl`](https://github.com/TuringLang/Turing.jl).

Usually, people that use and work with copulas turn to the `R` package [`copula`](https://cran.r-project.org/web/packages/copula/copula.pdf). While still well-maintained and regularly updated, the `R` package `copula` is a complicated code base for readability, extensibility, reliability, and maintenance.

This is an attempt to provide a very light, fast, reliable and maintainable copula implementation in native Julia. Among others, one of the notable benefits of such a native implementation is the floating point type agnosticism, i.e., compatibility with `BigFloat`, [`DoubleFloats`](https://github.com/JuliaMath/DoubleFloats.jl), [`MultiFloats`](https://github.com/dzhang314/MultiFloats.jl) and other kinds of numbers.

