From a379b74dd91aaab927c1dd44ff69d95dab9876b4 Mon Sep 17 00:00:00 2001
From: "Documenter.jl"
Date: Wed, 15 Jan 2025 10:05:15 +0000
Subject: [PATCH] build based on f3decf2

---
 index.html                              |   1 -
 previews/PR362/distributions/index.html | 460 +---------------------
 previews/PR362/examples/index.html      | 492 +-----------------------
 previews/PR362/index.html               | 464 +---------------------
 previews/PR362/search/index.html        | 460 +---------------------
 previews/PR362/transforms/index.html    | 478 +-----------------------
 6 files changed, 32 insertions(+), 2323 deletions(-)

diff --git a/index.html b/index.html
index 3ac25969..6a5afc30 100644
--- a/index.html
+++ b/index.html
@@ -1,3 +1,2 @@
-
diff --git a/previews/PR362/distributions/index.html b/previews/PR362/distributions/index.html
index afc1719b..dae9beb9 100644
--- a/previews/PR362/distributions/index.html
+++ b/previews/PR362/distributions/index.html
@@ -1,462 +1,5 @@
 Distributions.jl integration · Bijectors

Basic usage

Other than the logpdf_with_trans methods, the package also provides a more composable interface through the Bijector types. Consider for example the one from above with Beta(2, 2).

julia> using Random;
        Random.seed!(42);
 
 julia> using Bijectors;
@@ -477,5 +20,4 @@
 dist: Beta{Float64}(α=2.0, β=2.0)
 transform: Bijectors.Logit{Float64, Float64}(0.0, 1.0)
 )
julia> tdist isa UnivariateDistributiontrue

We can then compute the logpdf for the resulting distribution:

julia> # Some example values
-       x = rand(dist)0.3729689708838569
julia> y = tdist.transform(x)-0.5195007994273649
julia> logpdf(tdist, y)-1.1142791350274923
- + x = rand(dist)0.3111007779639805
julia> y = tdist.transform(x)-0.7949780887258908
julia> logpdf(tdist, y)-1.2888378508955771
diff --git a/previews/PR362/examples/index.html b/previews/PR362/examples/index.html
index ec54ca3d..856ec634 100644
--- a/previews/PR362/examples/index.html
+++ b/previews/PR362/examples/index.html
@@ -1,462 +1,5 @@
 Examples · Bijectors

Univariate ADVI example

But the real utility of TransformedDistribution becomes more apparent when using transformed(dist, b) for any bijector b. To get the transformed distribution corresponding to the Beta(2, 2), we called transformed(dist) before. This is simply an alias for transformed(dist, bijector(dist)). Remember that bijector(dist) returns the constrained-to-unconstrained bijector for that particular Distribution. But we can of course construct a TransformedDistribution using different bijectors with the same dist. This is particularly useful in something called Automatic Differentiation Variational Inference (ADVI).[2] An important part of ADVI is to approximate a constrained distribution, e.g. Beta, as follows:

  1. Sample x from a Normal with parameters μ and σ, i.e. x ~ Normal(μ, σ).
  2. Transform x to y s.t. y ∈ support(Beta), with the transform being a differentiable bijection with a differentiable inverse (a "bijector").

This then defines a probability density with the same support as Beta! Of course, it's unlikely that it will be the same density, but it's an approximation. Creating such a distribution becomes trivial with Bijector and TransformedDistribution:

julia> using StableRNGs: StableRNG
julia> rng = StableRNG(42);
julia> dist = Beta(2, 2)Beta{Float64}(α=2.0, β=2.0)
julia> b = bijector(dist) # (0, 1) → ℝBijectors.Logit{Float64, Float64}(0.0, 1.0)
julia> b⁻¹ = inverse(b) # ℝ → (0, 1)Inverse{Bijectors.Logit{Float64, Float64}}(Bijectors.Logit{Float64, Float64}(0.0, 1.0))
julia> td = transformed(Normal(), b⁻¹) # x ∼ 𝓝(0, 1) then b(x) ∈ (0, 1)UnivariateTransformed{Normal{Float64}, Inverse{Bijectors.Logit{Float64, Float64}}}( - - - - - dist: Normal{Float64}(μ=0.0, σ=1.0) transform: Inverse{Bijectors.Logit{Float64, Float64}}(Bijectors.Logit{Float64, Float64}(0.0, 1.0)) )
julia> x = rand(rng, td) # ∈ (0, 1)0.3384404850130036
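
The density that td defines can be written out with the change-of-variables formula. The following is a sketch, assuming that the inverse of Logit(0, 1) is the logistic function; the symbols q and φ are notation introduced here and are not names from the package.

```latex
\begin{aligned}
x &\sim \mathcal{N}(\mu, \sigma^2), \qquad y = b^{-1}(x) = \frac{1}{1 + e^{-x}} \in (0, 1), \\
q(y) &= \varphi_{\mu,\sigma}\bigl(b(y)\bigr)\,\left|\frac{\mathrm{d}\,b(y)}{\mathrm{d}y}\right|
      = \varphi_{\mu,\sigma}\!\left(\log\frac{y}{1-y}\right)\frac{1}{y\,(1-y)},
\end{aligned}
```

Here φ with subscripts μ, σ denotes the density of Normal(μ, σ) and b is the logit map. In the example above, μ = 0 and σ = 1, so q is a density on (0, 1) that ADVI would tune through μ and σ.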

It's worth noting that support(Beta) is the closed interval [0, 1], while the constrained-to-unconstrained bijection, Logit in this case, is only well-defined as a map (0, 1) → ℝ for the open interval (0, 1). This is of course not an implementation detail. ℝ is itself open, thus no continuous bijection exists from a closed interval to ℝ. But since the boundaries of a closed interval have what's known as measure zero, this doesn't end up affecting the resulting density with support on the entire real line. In practice, this means that

julia> td = transformed(Beta())UnivariateTransformed{Beta{Float64}, Bijectors.Logit{Float64, Float64}}(
@@ -482,27 +25,27 @@
        # Construct the transform
julia> bs = bijector.(dists); # constrained-to-unconstrained bijectors for dists
julia> ibs = inverse.(bs); # invert, so we get unconstrained-to-constrained
julia> sb = Stacked(ibs, ranges) # => Stacked <: Bijector # Mean-field normal with unconstrained-to-constrained stacked bijectorStacked(Any[Inverse{Bijectors.Logit{Float64, Float64}}(Bijectors.Logit{Float64, Float64}(0.0, 1.0)), Base.Fix1{typeof(broadcast), typeof(exp)}(broadcast, exp), Inverse{Bijectors.SimplexBijector}(Bijectors.SimplexBijector())], Any[1:1, 2:2, 3:4], Any[1:1, 2:2, 3:5])
julia> td = transformed(d, sb);
julia> y = rand(td)5-element Vector{Float64}: - 0.24818677837740177 - 0.5928364697010045 - 0.6248150010831096 - 0.18142537934545722 - 0.19375961957143317
julia> 0.0 ≤ y[1] ≤ 1.0true
julia> 0.0 < y[2]true
julia> sum(y[3:4]) ≈ 1.0false

Normalizing flows

A very interesting application is that of normalizing flows.[1] Usually this is done by sampling from a multivariate normal distribution, and then transforming this to a target distribution using invertible neural networks. Currently there are two such transforms available in Bijectors.jl: PlanarLayer and RadialLayer. Let's create a flow with a single PlanarLayer:

julia> d = MvNormal(zeros(2), ones(2));
julia> b = PlanarLayer(2)PlanarLayer(w = [-0.808925540032895, -1.1433193634422454], u = [1.8924634285828001, 0.3041951627443107], b = [-1.0037623366490902])
julia> flow = transformed(d, b)MultivariateTransformed{DiagNormal, PlanarLayer{Vector{Float64}, Vector{Float64}}}( + 0.5986401608225392 + 0.5200405446977606 + 0.6234172789506754 + 0.13996696153448207 + 0.23661575951484248
julia> 0.0 ≤ y[1] ≤ 1.0true
julia> 0.0 < y[2]true
julia> sum(y[3:4]) ≈ 1.0false

Normalizing flows

A very interesting application is that of normalizing flows.[1] Usually this is done by sampling from a multivariate normal distribution, and then transforming this to a target distribution using invertible neural networks. Currently there are two such transforms available in Bijectors.jl: PlanarLayer and RadialLayer. Let's create a flow with a single PlanarLayer:

julia> d = MvNormal(zeros(2), ones(2));
julia> b = PlanarLayer(2)PlanarLayer(w = [0.7420157228528924, -1.1122706487209677], u = [-0.5939452772218818, 0.41440043717052716], b = [0.1617738276713545])
julia> flow = transformed(d, b)MultivariateTransformed{DiagNormal, PlanarLayer{Vector{Float64}, Vector{Float64}}}( dist: DiagNormal( dim: 2 μ: [0.0, 0.0] Σ: [1.0 0.0; 0.0 1.0] ) -transform: PlanarLayer(w = [-0.808925540032895, -1.1433193634422454], u = [1.8924634285828001, 0.3041951627443107], b = [-1.0037623366490902]) +transform: PlanarLayer(w = [0.7420157228528924, -1.1122706487209677], u = [-0.5939452772218818, 0.41440043717052716], b = [0.1617738276713545]) )
julia> flow isa MultivariateDistributiontrue

That's it. Now we can sample from it using rand and compute the logpdf, like any other Distribution.

julia> y = rand(rng, flow)2-element Vector{Float64}:
- -1.7737457675455668
-  0.6652167059699576
julia> logpdf(flow, y) # uses inverse of `b`-1.6918276569402413

Similarly to the multivariate ADVI example, we could use Stacked to get a bounded flow:

julia> d = MvNormal(zeros(2), ones(2));
julia> ibs = inverse.(bijector.((InverseGamma(2, 3), Beta())));
julia> sb = Stacked(ibs) # == Stacked(ibs, [i:i for i = 1:length(ibs)]Stacked((Base.Fix1{typeof(broadcast), typeof(exp)}(broadcast, exp), Inverse{Bijectors.Logit{Float64, Float64}}(Bijectors.Logit{Float64, Float64}(0.0, 1.0))), (1:1, 2:2), (1:1, 2:2))
julia> b = sb ∘ PlanarLayer(2)Stacked((Base.Fix1{typeof(broadcast), typeof(exp)}(broadcast, exp), Inverse{Bijectors.Logit{Float64, Float64}}(Bijectors.Logit{Float64, Float64}(0.0, 1.0))), (1:1, 2:2), (1:1, 2:2)) ∘ PlanarLayer(w = [-1.011579002326965, -1.8476689044719092], u = [0.3290091128622954, -1.3933410354735298], b = [-1.8194606270930913])
julia> td = transformed(d, b);
julia> y = rand(rng, td)2-element Vector{Float64}: - 2.315637398951831 - 0.9112107234876439
julia> 0 < y[1]true
julia> 0 ≤ y[2] ≤ 1true

Want to fit the flow?

julia> using Zygote
+ -0.3337645283244014
+  0.2673166671897306
julia> logpdf(flow, y) # uses inverse of `b`-1.7276091344264914
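
For reference, the planar flow from the cited work [1] has the following form. This is a sketch of the standard formulation with a tanh nonlinearity; whether PlanarLayer matches it term by term (for example in how u is constrained to keep the map invertible) is an assumption, not something stated in this document.

```latex
\begin{aligned}
f(z) &= z + u \tanh\!\left(w^{\top} z + b\right), \\
\log\left|\det J(f, z)\right| &= \log\left|1 + \bigl(1 - \tanh^{2}(w^{\top} z + b)\bigr)\, u^{\top} w\right|, \\
\log q(y) &= \log p\bigl(f^{-1}(y)\bigr) - \log\left|\det J\bigl(f, f^{-1}(y)\bigr)\right|.
\end{aligned}
```

The last line is why logpdf(flow, y) needs the inverse of b: the base density p is evaluated at z = f⁻¹(y).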

Similarly to the multivariate ADVI example, we could use Stacked to get a bounded flow:

julia> d = MvNormal(zeros(2), ones(2));
julia> ibs = inverse.(bijector.((InverseGamma(2, 3), Beta())));
julia> sb = Stacked(ibs) # == Stacked(ibs, [i:i for i = 1:length(ibs)]Stacked((Base.Fix1{typeof(broadcast), typeof(exp)}(broadcast, exp), Inverse{Bijectors.Logit{Float64, Float64}}(Bijectors.Logit{Float64, Float64}(0.0, 1.0))), (1:1, 2:2), (1:1, 2:2))
julia> b = sb ∘ PlanarLayer(2)Stacked((Base.Fix1{typeof(broadcast), typeof(exp)}(broadcast, exp), Inverse{Bijectors.Logit{Float64, Float64}}(Bijectors.Logit{Float64, Float64}(0.0, 1.0))), (1:1, 2:2), (1:1, 2:2)) ∘ PlanarLayer(w = [0.7978102996606238, -1.3279319372428056], u = [0.7064143279484046, -0.503279023618515], b = [0.5819414970506178])
julia> td = transformed(d, b);
julia> y = rand(rng, td)2-element Vector{Float64}: + 3.8404787008384167 + 0.7883708001317883
julia> 0 < y[1]true
julia> 0 ≤ y[2] ≤ 1true

Want to fit the flow?

julia> using Zygote
        
        # Construct the flow.
julia> b = PlanarLayer(2) - # Convenient for extracting parameters and reconstructing the flow.PlanarLayer(w = [-0.5221868857639804, 1.2912535262924756], u = [-0.2721318813113839, -0.8936552464621412], b = [-0.47672753644468874])
julia> using Functors
julia> θs, reconstruct = Functors.functor(b); + # Convenient for extracting parameters and reconstructing the flow.PlanarLayer(w = [1.6791302794428598, 0.7080833957018406], u = [0.06429355293848117, -0.8344666197529531], b = [1.092849709816133])
julia> using Functors
julia> θs, reconstruct = Functors.functor(b); # Make the objective a `struct` to avoid capturing global variables.
julia> struct NLLObjective{R,D,T} reconstruct::R @@ -519,7 +62,7 @@ # Initial loss.
julia> @info "Initial loss: $(f(θs))" - # Train using gradient descent.[ Info: Initial loss: 3161.173444016327
julia> ε = 1e-3;
julia> for i in 1:100 + # Train using gradient descent.[ Info: Initial loss: 3240.907830268544
julia> ε = 1e-3;
julia> for i in 1:100 (∇s,) = Zygote.gradient(f, θs) θs = fmap(θs, ∇s) do θ, ∇ θ - ε .* ∇ @@ -528,9 +71,8 @@ # Final loss
julia> @info "Final loss: $(f(θs))" - # Very simple check to see if we learned something useful.[ Info: Final loss: 2848.5884252876695
julia> samples = rand(transformed(f.basedist, f.reconstruct(θs)), 1000);
julia> mean(eachcol(samples)) # ≈ [0, 0]2-element Vector{Float64}: - 0.04018112488335119 - 0.04546063637378019
julia> cov(samples; dims=2) # ≈ I2×2 Matrix{Float64}: - 0.984065 -0.0121647 - -0.0121647 0.957628

We can easily create more complex flows by simply doing PlanarLayer(10) ∘ PlanarLayer(10) ∘ RadialLayer(10) and so on.

- + # Very simple check to see if we learned something useful.[ Info: Final loss: 2883.758149006311
julia> samples = rand(transformed(f.basedist, f.reconstruct(θs)), 1000);
julia> mean(eachcol(samples)) # ≈ [0, 0]2-element Vector{Float64}: + 0.06970828446062367 + -0.02027640234193051
julia> cov(samples; dims=2) # ≈ I2×2 Matrix{Float64}: + 0.936051 -0.00703972 + -0.00703972 0.964018

We can easily create more complex flows by simply doing PlanarLayer(10) ∘ PlanarLayer(10) ∘ RadialLayer(10) and so on.

diff --git a/previews/PR362/index.html b/previews/PR362/index.html
index e0ee34b1..204f8d60 100644
--- a/previews/PR362/index.html
+++ b/previews/PR362/index.html
@@ -1,462 +1,5 @@
 Home · Bijectors

Bijectors.jl

This package implements a set of functions for transforming constrained random variables (e.g. simplexes, intervals) to Euclidean space. The three main functions implemented in this package are link, invlink and logpdf_with_trans, which are defined for a number of distributions.

Bijectors.linkFunction
link(d::Distribution, x)

Transforms the input x using the constrained-to-unconstrained bijector for distribution d.

See also: invlink.

Example

julia> using Bijectors
 
 julia> d = LogNormal()   # support is (0, Inf)
 LogNormal{Float64}(μ=0.0, σ=1.0)
@@ -468,7 +11,7 @@
 0.0
 
 julia> link(LogNormal(), 1.0)
-0.0
source
Bijectors.invlinkFunction
invlink(d::Distribution, y)

Performs the inverse transform on a value y that was transformed using the constrained-to-unconstrained bijector for distribution d.

It should hold that invlink(d, link(d, x)) = x.

See also: link.

Example

julia> using Bijectors
+0.0
source
Bijectors.invlinkFunction
invlink(d::Distribution, y)

Performs the inverse transform on a value y that was transformed using the constrained-to-unconstrained bijector for distribution d.

It should hold that invlink(d, link(d, x)) = x.

See also: link.
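
The round-trip property can be checked numerically. This is a minimal sketch using only link and invlink as documented here; the helper name `roundtrip` and the test points are illustrative choices, not part of the package.

```julia
using Bijectors  # also re-exports Distributions (LogNormal, Beta, ...)

# Hypothetical helper: map into unconstrained space and back again.
roundtrip(d, x) = invlink(d, link(d, x))

# One positive-support and one unit-interval distribution.
x_pos = roundtrip(LogNormal(), 2.5)
x_unit = roundtrip(Beta(2, 2), 0.3)

println((x_pos, x_unit))
```

The equality holds up to floating-point error, so comparisons are best made with ≈ rather than ==.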

Example

julia> using Bijectors
 
 julia> d = LogNormal()    # support is (0, Inf)
 LogNormal{Float64}(μ=0.0, σ=1.0)
@@ -477,7 +20,7 @@
 0.0
 
 julia> invlink(LogNormal(), 0.0)
-1.0
source
Bijectors.logpdf_with_transFunction
logpdf_with_trans(d::Distribution, x, transform::Bool)

If transform is false, logpdf_with_trans calculates the log probability density function (logpdf) of distribution d at x.

If transform is true, x is transformed using the constrained-to-unconstrained bijector for distribution d, and then the logpdf of the resulting value is calculated with respect to the unconstrained (transformed) distribution. Equivalently, if x is distributed according to d and y = link(d, x) is distributed according to td = transformed(d), then logpdf_with_trans(d, x, true) = logpdf(td, y). This is accomplished by subtracting the log Jacobian of the transformation.

Example

julia> using Bijectors
+1.0
source
Bijectors.logpdf_with_transFunction
logpdf_with_trans(d::Distribution, x, transform::Bool)

If transform is false, logpdf_with_trans calculates the log probability density function (logpdf) of distribution d at x.

If transform is true, x is transformed using the constrained-to-unconstrained bijector for distribution d, and then the logpdf of the resulting value is calculated with respect to the unconstrained (transformed) distribution. Equivalently, if x is distributed according to d and y = link(d, x) is distributed according to td = transformed(d), then logpdf_with_trans(d, x, true) = logpdf(td, y). This is accomplished by subtracting the log Jacobian of the transformation.
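
The equivalence stated above can be checked directly. This is a minimal sketch that uses only functions documented on this page; the variable names and the test point x = 2.0 are illustrative.

```julia
using Bijectors  # also re-exports Distributions (LogNormal, logpdf, ...)

d = LogNormal()
x = 2.0                     # any point in the support (0, Inf)

# Constrained-to-unconstrained value and the transformed distribution.
y = link(d, x)
td = transformed(d)

lp_trans = logpdf_with_trans(d, x, true)                  # single call
lp_td = logpdf(td, y)                                     # via the transformed distribution
lp_manual = logpdf(d, x) - logabsdetjac(bijector(d), x)   # subtracting the log Jacobian by hand

println((lp_trans, lp_td, lp_manual))
```

All three values should agree up to floating-point error, since each evaluates the density of the unconstrained variable at y.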

Example

julia> using Bijectors
 
 julia> logpdf_with_trans(LogNormal(), ℯ, false)
 -2.4189385332046727
@@ -494,5 +37,4 @@
 
 julia> # The difference between the two is due to the Jacobian
        logabsdetjac(bijector(LogNormal()), ℯ)
--1
source

The distributions supported are:

  1. RealDistribution: Union{Cauchy, Gumbel, Laplace, Logistic, NoncentralT, Normal, NormalCanon, TDist},
  2. PositiveDistribution: Union{BetaPrime, Chi, Chisq, Erlang, Exponential, FDist, Frechet, Gamma, InverseGamma, InverseGaussian, Kolmogorov, LogNormal, NoncentralChisq, NoncentralF, Rayleigh, Weibull},
  3. UnitDistribution: Union{Beta, KSOneSided, NoncentralBeta},
  4. SimplexDistribution: Union{Dirichlet},
  5. PDMatDistribution: Union{InverseWishart, Wishart}, and
  6. TransformDistribution: Union{T, Truncated{T}} where T<:ContinuousUnivariateDistribution.

All exported names from the Distributions.jl package are reexported from Bijectors.

Bijectors.jl also provides a nice interface for working with these maps: composition, inversion, etc. The following table lists mathematical operations for a bijector and the corresponding code in Bijectors.jl.

OperationMethodAutomatic
b ↦ b⁻¹inverse(b)
(b₁, b₂) ↦ (b₁ ∘ b₂)b₁ ∘ b₂
(b₁, b₂) ↦ [b₁, b₂]stack(b₁, b₂)
x ↦ b(x)b(x)×
y ↦ b⁻¹(y)inverse(b)(y)×
x ↦ log|det J(b, x)|logabsdetjac(b, x)AD
x ↦ b(x), log|det J(b, x)|with_logabsdet_jacobian(b, x)
p ↦ q := b_* pq = transformed(p, b)
y ∼ qy = rand(q)
p ↦ b such that support(b_* p) = ℝᵈbijector(p)
(x ∼ p, b(x), log|det J(b, x)|, log q(y))forward(q)

In this table, b denotes a Bijector, J(b, x) denotes the Jacobian of b evaluated at x, b_* denotes the push-forward of p by b, and x ∼ p denotes x sampled from the distribution with density p.

The "Automatic" column in the table refers to whether or not you are required to implement the feature for a custom Bijector. "AD" refers to the fact that it can be implemented "automatically" using automatic differentiation.

- +-1source

The distributions supported are:

  1. RealDistribution: Union{Cauchy, Gumbel, Laplace, Logistic, NoncentralT, Normal, NormalCanon, TDist},
  2. PositiveDistribution: Union{BetaPrime, Chi, Chisq, Erlang, Exponential, FDist, Frechet, Gamma, InverseGamma, InverseGaussian, Kolmogorov, LogNormal, NoncentralChisq, NoncentralF, Rayleigh, Weibull},
  3. UnitDistribution: Union{Beta, KSOneSided, NoncentralBeta},
  4. SimplexDistribution: Union{Dirichlet},
  5. PDMatDistribution: Union{InverseWishart, Wishart}, and
  6. TransformDistribution: Union{T, Truncated{T}} where T<:ContinuousUnivariateDistribution.

All exported names from the Distributions.jl package are reexported from Bijectors.

Bijectors.jl also provides a nice interface for working with these maps: composition, inversion, etc. The following table lists mathematical operations for a bijector and the corresponding code in Bijectors.jl.

OperationMethodAutomatic
b ↦ b⁻¹inverse(b)
(b₁, b₂) ↦ (b₁ ∘ b₂)b₁ ∘ b₂
(b₁, b₂) ↦ [b₁, b₂]stack(b₁, b₂)
x ↦ b(x)b(x)×
y ↦ b⁻¹(y)inverse(b)(y)×
x ↦ log|det J(b, x)|logabsdetjac(b, x)AD
x ↦ b(x), log|det J(b, x)|with_logabsdet_jacobian(b, x)
p ↦ q := b_* pq = transformed(p, b)
y ∼ qy = rand(q)
p ↦ b such that support(b_* p) = ℝᵈbijector(p)
(x ∼ p, b(x), log|det J(b, x)|, log q(y))forward(q)

In this table, b denotes a Bijector, J(b, x) denotes the Jacobian of b evaluated at x, b_* denotes the push-forward of p by b, and x ∼ p denotes x sampled from the distribution with density p.

The "Automatic" column in the table refers to whether or not you are required to implement the feature for a custom Bijector. "AD" refers to the fact that it can be implemented "automatically" using automatic differentiation.
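
A few rows of the table can be exercised on a concrete bijector. This is a sketch using the Logit bijector returned for Beta(2, 2); the variable names are illustrative.

```julia
using Bijectors

b = bijector(Beta(2, 2))    # Logit: (0, 1) → ℝ
ib = inverse(b)             # b ↦ b⁻¹
x = 0.3

y = b(x)                                      # x ↦ b(x)
x_back = ib(y)                                # y ↦ b⁻¹(y)
y2, logjac = with_logabsdet_jacobian(b, x)    # value and log|det J(b, x)| together

# Composition: b⁻¹ ∘ b should act as the identity on (0, 1).
roundtrip_map = ib ∘ b

println((y, x_back, logjac, roundtrip_map(x)))
```

For the logit map on (0, 1) the log Jacobian is -log(x(1 - x)), which gives a simple way to sanity-check logjac.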

diff --git a/previews/PR362/search/index.html b/previews/PR362/search/index.html
index 1084bc0f..f18ab09b 100644
--- a/previews/PR362/search/index.html
+++ b/previews/PR362/search/index.html
@@ -1,460 +1,2 @@
-Search · Bijectors

Loading search...

    - - - - - - +Search · Bijectors

    Loading search...

diff --git a/previews/PR362/transforms/index.html b/previews/PR362/transforms/index.html
index 30f4d06d..f0b15cd8 100644
--- a/previews/PR362/transforms/index.html
+++ b/previews/PR362/transforms/index.html
@@ -1,471 +1,14 @@
 Transforms · Bijectors

      Usage

      A very simple example of a "bijector"/diffeomorphism, i.e. a differentiable transformation with a differentiable inverse, is the exp function:

      • The inverse of exp is log.
      • The derivative of exp at an input x is simply exp(x), hence logabsdetjac is simply x.
      julia> using Bijectors
      julia> transform(exp, 1.0)2.718281828459045
      julia> logabsdetjac(exp, 1.0)1.0
      julia> with_logabsdet_jacobian(exp, 1.0)(2.718281828459045, 1.0)

      Some transformations are well-defined for different types of inputs, e.g. exp can also act elementwise on an N-dimensional Array{<:Real,N}. To specify that a transformation should act elementwise, we use the elementwise method:

      julia> x = ones(2, 2)2×2 Matrix{Float64}:
        1.0  1.0
        1.0  1.0
      julia> transform(elementwise(exp), x)2×2 Matrix{Float64}: 2.71828 2.71828 2.71828 2.71828
      julia> logabsdetjac(elementwise(exp), x)4.0
      julia> with_logabsdet_jacobian(elementwise(exp), x)([2.718281828459045 2.718281828459045; 2.718281828459045 2.718281828459045], 4.0)
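
The values 1.0 and 4.0 above follow from the fact that the Jacobian of an elementwise map is diagonal, so its log-determinant is a sum over entries. Written out, with X standing for the 2×2 matrix of ones used in the example:

```latex
\log\left|\det J(\exp, x)\right| = \log e^{x} = x,
\qquad
\log\left|\det J(\mathrm{elementwise}(\exp), X)\right|
  = \sum_{i,j} \log e^{X_{ij}} = \sum_{i,j} X_{ij} = 4 .
```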

      These methods also work nicely for compositions of transformations:

      julia> transform(elementwise(log ∘ exp), x)2×2 Matrix{Float64}:
        1.0  1.0
      - 1.0  1.0

      Unlike exp, some transformations have parameters affecting the resulting transformation they represent, e.g. Logit has two parameters a and b representing the lower- and upper-bound, respectively, of its domain:

      julia> using Bijectors: Logit
      julia> f = Logit(0.0, 1.0)Bijectors.Logit{Float64, Float64}(0.0, 1.0)
      julia> f(rand()) # takes us from `(0, 1)` to `(-∞, ∞)`3.2562415483600335

      User-facing methods

      Without mutation:

      with_logabsdet_jacobian

      With mutation:

      Bijectors.transform!Function
      transform!(b, x[, y])

      Transform x using b, storing the result in y.

      If y is not provided, x is used as the output.

      source
      Bijectors.logabsdetjac!Function
      logabsdetjac!(b, x[, logjac])

      Compute log(abs(det(J(b, x)))) and store the result in logjac, where J(b, x) is the jacobian of b at x.

      source
      Bijectors.with_logabsdet_jacobian!Function
      with_logabsdet_jacobian!(b, x[, y, logjac])

      Compute transform(b, x) and logabsdetjac(b, x), storing the result in y and logjac, respectively.

      If y is not provided, then x will be used in its place.

      Defaults to calling with_logabsdet_jacobian(b, x) and updating y and logjac with the result.

      source

      Implementing a transformation

      Any callable can be made into a bijector by providing an implementation of ChangesOfVariables.with_logabsdet_jacobian(b, x).

      You can also optionally implement transform and logabsdetjac to avoid redundant computations. This is usually only worth it if you expect transform or logabsdetjac to be used heavily without the other.

      The same applies to the mutating versions with_logabsdet_jacobian!, transform!, and logabsdetjac!.

      Working with Distributions.jl

      Bijectors.bijectorFunction
      bijector(d::Distribution)

      Returns the constrained-to-unconstrained bijector for distribution d.

      source
      Bijectors.transformedMethod
      transformed(d::Distribution)
      -transformed(d::Distribution, b::Bijector)

      Couples distribution d with the bijector b by returning a TransformedDistribution.

      If no bijector is provided, i.e. transformed(d) is called, then transformed(d, bijector(d)) is returned.

      source

      Utilities

      Bijectors.elementwiseFunction
      elementwise(f)

      Alias for Base.Fix1(broadcast, f).

      In the case where f::ComposedFunction, the result is Base.Fix1(broadcast, f.outer) ∘ Base.Fix1(broadcast, f.inner) rather than Base.Fix1(broadcast, f).

      source
      Bijectors.isclosedformMethod
      isclosedform(b::Transform)::bool
      -isclosedform(b⁻¹::Inverse{<:Transform})::bool

      Returns true or false depending on whether or not evaluation of b has a closed-form implementation.

      Most transformations have closed-form evaluations, but some do not. For example, the inverse of PlanarLayer has to be evaluated with an iterative procedure.

      source

      API

      Bijectors.TransformType

      Abstract type for a transformation.

      Implementing

      A subtype of Transform should at least implement transform(b, x).

      If the Transform is also invertible:

      • Required:
        • Either of the following:
          • transform(::Inverse{<:MyTransform}, x): the transform for its inverse.
          • InverseFunctions.inverse(b::MyTransform): returns an existing Transform.
        • logabsdetjac: computes the log-abs-det jacobian factor.
      • Optional:
        • with_logabsdet_jacobian: transform and logabsdetjac combined. Useful in cases where we can exploit shared computation in the two.

      For the above methods, there are mutating versions which can optionally be implemented:

      source
      Bijectors.BijectorType

      Abstract type of a bijector, i.e. differentiable bijection with differentiable inverse.

      source
      Bijectors.InverseType
      inverse(b::Transform)
      -Inverse(b::Transform)

      A Transform representing the inverse transform of b.

      source

      Bijectors

      Bijectors.CorrBijectorType
      CorrBijector <: Bijector

      A bijector implementation of Stan's parametrization method for Correlation matrix: https://mc-stan.org/docs/2_23/reference-manual/correlation-matrix-transform-section.html

      Basically, an unconstrained strictly upper triangular matrix y is transformed to a correlation matrix by the following readable, but not especially efficient, procedure:

      K = size(y, 1)
      + 1.0  1.0

      Unlike exp, some transformations have parameters affecting the resulting transformation they represent, e.g. Logit has two parameters a and b representing the lower- and upper-bound, respectively, of its domain:

      julia> using Bijectors: Logit
      julia> f = Logit(0.0, 1.0)Bijectors.Logit{Float64, Float64}(0.0, 1.0)
      julia> f(rand()) # takes us from `(0, 1)` to `(-∞, ∞)`1.2127738543575732

      User-facing methods

      Without mutation:

      with_logabsdet_jacobian

      With mutation:

      Bijectors.transform!Function
      transform!(b, x[, y])

      Transform x using b, storing the result in y.

      If y is not provided, x is used as the output.

      source
      Bijectors.logabsdetjac!Function
      logabsdetjac!(b, x[, logjac])

      Compute log(abs(det(J(b, x)))) and store the result in logjac, where J(b, x) is the jacobian of b at x.

      source
      Bijectors.with_logabsdet_jacobian!Function
      with_logabsdet_jacobian!(b, x[, y, logjac])

      Compute transform(b, x) and logabsdetjac(b, x), storing the result in y and logjac, respectively.

      If y is not provided, then x will be used in its place.

      Defaults to calling with_logabsdet_jacobian(b, x) and updating y and logjac with the result.

      source

      Implementing a transformation

      Any callable can be made into a bijector by providing an implementation of ChangesOfVariables.with_logabsdet_jacobian(b, x).

      You can also optionally implement transform and logabsdetjac to avoid redundant computations. This is usually only worth it if you expect transform or logabsdetjac to be used heavily without the other.

      The same applies to the mutating versions with_logabsdet_jacobian!, transform!, and logabsdetjac!.
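
As an illustration of the minimal requirement, here is a sketch of a custom callable with its own with_logabsdet_jacobian method. The type `Scale` is hypothetical, the example handles scalar inputs only, and it assumes the ChangesOfVariables.jl package (where with_logabsdet_jacobian is defined) is available in the environment. How far the rest of the Bijectors.jl interface works with such a type without further methods (e.g. an inverse) is not verified here.

```julia
import ChangesOfVariables

# Hypothetical example type: multiplication by a non-zero scalar `a`.
struct Scale{T<:Real}
    a::T
end

# Make it callable, so it behaves like an ordinary function.
(s::Scale)(x::Real) = s.a * x

# The one required method: return the transformed value together with
# log|det J|, which for x ↦ a * x on scalars is log|a|.
function ChangesOfVariables.with_logabsdet_jacobian(s::Scale, x::Real)
    return s.a * x, log(abs(s.a))
end

s = Scale(3.0)
y, logjac = ChangesOfVariables.with_logabsdet_jacobian(s, 2.0)
println((y, logjac))
```

Returning both values from one function lets an implementation share intermediate computations between the transform and its log Jacobian.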

      Working with Distributions.jl

      Bijectors.bijectorFunction
      bijector(d::Distribution)

      Returns the constrained-to-unconstrained bijector for distribution d.

      source
      Bijectors.transformedMethod
      transformed(d::Distribution)
      +transformed(d::Distribution, b::Bijector)

      Couples distribution d with the bijector b by returning a TransformedDistribution.

      If no bijector is provided, i.e. transformed(d) is called, then transformed(d, bijector(d)) is returned.

      source

      Utilities

      Bijectors.elementwiseFunction
      elementwise(f)

      Alias for Base.Fix1(broadcast, f).

      In the case where f::ComposedFunction, the result is Base.Fix1(broadcast, f.outer) ∘ Base.Fix1(broadcast, f.inner) rather than Base.Fix1(broadcast, f).

      source
      Bijectors.isclosedformMethod
      isclosedform(b::Transform)::bool
      +isclosedform(b⁻¹::Inverse{<:Transform})::bool

      Returns true or false depending on whether or not evaluation of b has a closed-form implementation.

      Most transformations have closed-form evaluations, but some do not. For example, the inverse of PlanarLayer has to be evaluated with an iterative procedure.

      source

      API

      Bijectors.TransformType

      Abstract type for a transformation.

      Implementing

      A subtype of Transform should at least implement transform(b, x).

      If the Transform is also invertible:

      • Required:
        • Either of the following:
          • transform(::Inverse{<:MyTransform}, x): the transform for its inverse.
          • InverseFunctions.inverse(b::MyTransform): returns an existing Transform.
        • logabsdetjac: computes the log-abs-det jacobian factor.
      • Optional:
        • with_logabsdet_jacobian: transform and logabsdetjac combined. Useful in cases where we can exploit shared computation in the two.

      For the above methods, there are mutating versions which can optionally be implemented:

      source
      Bijectors.BijectorType

      Abstract type of a bijector, i.e. differentiable bijection with differentiable inverse.

      source
      Bijectors.InverseType
      inverse(b::Transform)
      +Inverse(b::Transform)

      A Transform representing the inverse transform of b.

      source

      Bijectors

      Bijectors.CorrBijectorType
      CorrBijector <: Bijector

      A bijector implementation of Stan's parametrization method for Correlation matrix: https://mc-stan.org/docs/2_23/reference-manual/correlation-matrix-transform-section.html

      Basically, an unconstrained strictly upper triangular matrix y is transformed to a correlation matrix by the following readable, but not especially efficient, procedure:

      K = size(y, 1)
       z = tanh.(y)
       
       for j=1:K, i=1:K
      @@ -489,12 +32,12 @@
       [w1'w1 w1'w2 ... w1'wn;
        w2'w1 w2'w2 ... w2'wn;
        ...
      -]

      The diagonal elements are given by wk'wk = 1, thus x is a correlation matrix.

      Every step is invertible, so this is a bijection (bijector).

      Note: The implementation doesn't follow their "manageable expression" directly, because their equation seems wrong (7/30/2020). Instead, it follows the definition given above the "manageable expression", which is also described in the document linked above.

      source
      Bijectors.LeakyReLUType
      LeakyReLU{T}(α::T) <: Bijector

      Defines the invertible mapping

      x ↦ x if x ≥ 0 else αx

      where α > 0.

      source
      Bijectors.StackedType
      Stacked(bs)
      +]

      The diagonal elements are given by wk'wk = 1, thus x is a correlation matrix.

      Every step is invertible, so this is a bijection (bijector).

      Note: The implementation doesn't follow their "manageable expression" directly, because their equation seems wrong (7/30/2020). Instead, it follows the definition given above the "manageable expression", which is also described in the document linked above.

      source
      Bijectors.LeakyReLUType
      LeakyReLU{T}(α::T) <: Bijector

      Defines the invertible mapping

      x ↦ x if x ≥ 0 else αx

      where α > 0.
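
For the scalar case, the inverse and the log Jacobian of this mapping follow directly from the definition. The formulas below are derived here rather than quoted from the package:

```latex
b(x) = \begin{cases} x, & x \ge 0 \\ \alpha x, & x < 0 \end{cases}
\qquad
b^{-1}(y) = \begin{cases} y, & y \ge 0 \\ y / \alpha, & y < 0 \end{cases}
\qquad
\log\left|\frac{\mathrm{d}b}{\mathrm{d}x}\right| = \begin{cases} 0, & x > 0 \\ \log \alpha, & x < 0 \end{cases}
```

The condition α > 0 is what keeps the map strictly increasing and therefore invertible; at x = 0 the derivative is not defined unless α = 1.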

      source
      Bijectors.StackedType
      Stacked(bs)
       Stacked(bs, ranges)
       stack(bs::Bijector...)

      A Bijector which stacks bijectors together so that they can be applied to a vector, where bs[i]::Bijector is applied to x[ranges[i]] and ranges[i]::UnitRange{Int}.

      Arguments

      • bs can be either a Tuple or an AbstractArray of 0- and/or 1-dimensional bijectors
        • If bs is a Tuple, implementations are type-stable using generated functions
        • If bs is an AbstractArray, implementations are not type-stable and use iterative methods
      • ranges needs to be an iterable consisting of UnitRange{Int}
        • length(bs) == length(ranges) needs to be true.

      Examples

      b1 = Logit(0.0, 1.0)
       b2 = identity
       b = stack(b1, b2)
      -b([0.0, 1.0]) == [b1(0.0), 1.0]  # => true
      source
      Bijectors.RationalQuadraticSplineType
      RationalQuadraticSpline{T} <: Bijector

      Implementation of the Rational Quadratic Spline flow [1].

      • Outside of the interval [minimum(widths), maximum(widths)], this mapping is given by the identity map.
      • Inside the interval it's given by a monotonic spline (i.e. monotonic polynomials connected at intermediate points) with endpoints fixed so as to continuously transform into the identity map.

      For the sake of efficiency, there are separate implementations for 0-dimensional and 1-dimensional inputs.

      Notes

      There are two constructors for RationalQuadraticSpline:

      • RationalQuadraticSpline(widths, heights, derivatives): it is assumed that widths, heights, and derivatives satisfy the constraints that make this a valid bijector, i.e.
        • widths: monotonically increasing and length(widths) == K,
        • heights: monotonically increasing and length(heights) == K,
        • derivatives: non-negative and derivatives[1] == derivatives[end] == 1.
      • RationalQuadraticSpline(widths, heights, derivatives, B): other than the lengths, no assumptions are made on the parameters. Therefore we will transform the parameters s.t.:
        • widths_new ∈ [-B, B]ᴷ⁺¹, where K == length(widths),
        • heights_new ∈ [-B, B]ᴷ⁺¹, where K == length(heights),
        • derivatives_new ∈ (0, ∞)ᴷ⁺¹ with derivatives_new[1] == derivatives_new[end] == 1, where (K - 1) == length(derivatives).

      Examples

      Univariate

      julia> using StableRNGs: StableRNG; rng = StableRNG(42);  # For reproducibility.
      +b([0.0, 1.0]) == [b1(0.0), 1.0]  # => true
      source
      Bijectors.RationalQuadraticSplineType
      RationalQuadraticSpline{T} <: Bijector

      Implementation of the Rational Quadratic Spline flow [1].

      • Outside of the interval [minimum(widths), maximum(widths)], this mapping is given by the identity map.
      • Inside the interval it's given by a monotonic spline (i.e. monotonic polynomials connected at intermediate points) with endpoints fixed so as to continuously transform into the identity map.

      For the sake of efficiency, there are separate implementations for 0-dimensional and 1-dimensional inputs.

      Notes

      There are two constructors for RationalQuadraticSpline:

      • RationalQuadraticSpline(widths, heights, derivatives): it is assumed that widths, heights, and derivatives satisfy the constraints that make this a valid bijector, i.e.
        • widths: monotonically increasing and length(widths) == K,
        • heights: monotonically increasing and length(heights) == K,
        • derivatives: non-negative and derivatives[1] == derivatives[end] == 1.
      • RationalQuadraticSpline(widths, heights, derivatives, B): other than the lengths, no assumptions are made on the parameters. Therefore we will transform the parameters s.t.:
        • widths_new ∈ [-B, B]ᴷ⁺¹, where K == length(widths),
        • heights_new ∈ [-B, B]ᴷ⁺¹, where K == length(heights),
        • derivatives_new ∈ (0, ∞)ᴷ⁺¹ with derivatives_new[1] == derivatives_new[end] == 1, where (K - 1) == length(derivatives).

      Examples

      Univariate

      julia> using StableRNGs: StableRNG; rng = StableRNG(42);  # For reproducibility.
       
       julia> using Bijectors: RationalQuadraticSpline
       
      @@ -531,7 +74,7 @@
       julia> b([-1., 5.])
       2-element Vector{Float64}:
        -1.5660106244288925
      -  5.0

      References

      [1] Durkan, C., Bekasov, A., Murray, I., & Papamakarios, G., Neural Spline Flows, CoRR, arXiv:1906.04032 [stat.ML], (2019).

      source
      Bijectors.CouplingType
      Coupling{F, M}(θ::F, mask::M)

      Implements a coupling-layer as defined in [1].

      Examples

      julia> using Bijectors: Shift, Coupling, PartitionMask, coupling, couple
      +  5.0

      References

      [1] Durkan, C., Bekasov, A., Murray, I., & Papamakarios, G., Neural Spline Flows, CoRR, arXiv:1906.04032 [stat.ML], (2019).

      source
      Bijectors.CouplingType
      Coupling{F, M}(θ::F, mask::M)

      Implements a coupling-layer as defined in [1].

      Examples

      julia> using Bijectors: Shift, Coupling, PartitionMask, coupling, couple
       
       julia> m = PartitionMask(3, [1], [2]); # <= going to use x[2] to parameterize transform of x[1]
       
      @@ -558,7 +101,7 @@
       Shift{Vector{Float64}}([2.0])
       
       julia> with_logabsdet_jacobian(cl, x)
      -([3.0, 2.0, 3.0], 0.0)

      References

      [1] Kobyzev, I., Prince, S., & Brubaker, M. A., Normalizing flows: introduction and ideas, CoRR, (2019).

      source
      Bijectors.NamedTransformType
      NamedTransform <: AbstractNamedTransform

      Wraps a NamedTuple of key -> Bijector pairs, implementing evaluation, inversion, etc.

      Examples

      julia> using Bijectors: NamedTransform, Scale
      +([3.0, 2.0, 3.0], 0.0)

      References

      [1] Kobyzev, I., Prince, S., & Brubaker, M. A., Normalizing flows: introduction and ideas, CoRR, (2019).

      source
      Bijectors.NamedTransformType
      NamedTransform <: AbstractNamedTransform

      Wraps a NamedTuple of key -> Bijector pairs, implementing evaluation, inversion, etc.

      Examples

      julia> using Bijectors: NamedTransform, Scale
       
       julia> b = NamedTransform((a = Scale(2.0), b = exp));
       
      @@ -568,7 +111,7 @@
       (a = 2.0, b = 1.0, c = 42.0)
       
       julia> (a = 2 * x.a, b = exp(x.b), c = x.c)
      -(a = 2.0, b = 1.0, c = 42.0)
      source
      Bijectors.NamedCouplingType
      NamedCoupling{target, deps, F} <: AbstractNamedTransform

      Implements a coupling layer for named bijectors.

      See also: Coupling

      Examples

      julia> using Bijectors: NamedCoupling, Scale
      +(a = 2.0, b = 1.0, c = 42.0)
      source
      Bijectors.NamedCouplingType
      NamedCoupling{target, deps, F} <: AbstractNamedTransform

      Implements a coupling layer for named bijectors.

      See also: Coupling
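
      The idea of a named coupling (transform one field with a map built from other fields) can be sketched on plain NamedTuples. The helper `named_coupling_sketch` below is hypothetical and only mimics the behaviour described here; it is not the NamedCoupling implementation and uses a closure in place of a bijector such as Scale.

```julia
# Sketch: build a transform from the `deps` fields and apply it to `target`.
# `named_coupling_sketch` is an illustrative name, not a Bijectors function.
function named_coupling_sketch(target::Symbol, deps, f, x::NamedTuple)
    transform = f((x[d] for d in deps)...)
    new_value = transform(x[target])
    # merge keeps the field order of x and replaces only the target field.
    return merge(x, NamedTuple{(target,)}((new_value,)))
end

# Scale field b by (a + c); a closure stands in for a scaling bijector.
x = (a = 1.0, b = 2.0, c = 3.0)
y = named_coupling_sketch(:b, (:a, :c), (a, c) -> (v -> (a + c) * v), x)
# y == (a = 1.0, b = 8.0, c = 3.0)
```

      Only the target field changes, and the fields it depends on pass through untouched, which is what keeps the transformation invertible as long as the per-field map is.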

      Examples

      julia> using Bijectors: NamedCoupling, Scale
       
       julia> b = NamedCoupling(:b, (:a, :c), (a, c) -> Scale(a + c));
       
      @@ -578,5 +121,4 @@
       (a = 1.0, b = 8.0, c = 3.0)
       
       julia> (a = x.a, b = (x.a + x.c) * x.b, c = x.c)
      -(a = 1.0, b = 8.0, c = 3.0)
      source
      -
      +(a = 1.0, b = 8.0, c = 3.0)
      source