diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0a7d0c436..b1590a9f2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,7 @@
 
 ## Unreleased
 
+- De-interweave SpectralTransform [#587](https://github.com/SpeedyWeather/SpeedyWeather.jl/pull/587)
 - Rossby-Haurwitz wave initial conditions [#591](https://github.com/SpeedyWeather/SpeedyWeather.jl/pull/591)
 - Haversine formula and AbstractSphericalDistance [#588](https://github.com/SpeedyWeather/SpeedyWeather.jl/pull/588)
 - Array-agnostic SpectralTransform [#583](https://github.com/SpeedyWeather/SpeedyWeather.jl/pull/583)
diff --git a/docs/src/speedytransforms.md b/docs/src/speedytransforms.md
index 47b347c25..cd77fb60e 100644
--- a/docs/src/speedytransforms.md
+++ b/docs/src/speedytransforms.md
@@ -129,9 +129,7 @@ S = SpectralTransform(spectral_grid)
 Note that because we chose `dealiasing=3` (cubic truncation) we now match a T5 spectral field
 with a 12-ring octahedral Gaussian grid, instead of the 8 rings as above. So going from
 `dealiasing=2` (default) to `dealiasing=3` increased our resolution on the grid while the
-spectral resolution remains the same. The `SpectralTransform` also has options for the
-recomputation or pre-computation of the Legendre polynomials, fore more information see
-[(P)recompute Legendre polynomials](@ref).
+spectral resolution remains the same.
 
 Passing on `S` the `SpectralTransform` now allows us to transform directly on the grid
 defined therein. Note that we recreate `alms` to be of size 7x6 instead of 6x6 for T5
@@ -173,6 +171,43 @@ As you can see the `7x6 LowerTriangularMatrix` in the description above dropped
 `6x6 LowerTriangularMatrix`, this is the size of the input that is expected (otherwise
 you will get a `BoundsError`).
 
+## `SpectralTransform` generators
+
+While you can always create a `SpectralTransform` from a `SpectralGrid` (which defines
+both spectral and grid space) there are other constructors/generators available:
+
+```@example speedytransforms
+SpectralTransform(alms)
+```
+
+Now we have defined the resolution of the spectral space through `alms` but create
+a `SpectralTransform` by making assumption about the grid space. E.g. `Grid=FullGaussianGrid`
+by default, `dealiasing=2` and `nlat_half` correspondingly. But you can also pass them
+on as keyword arguments, for example
+
+```@example speedytransforms
+SpectralTransform(alms, Grid=OctahedralClenshawGrid, nlat_half=24)
+```
+
+Only note that you don't want to specify both `nlat_half` and `dealiasing` as you would
+otherwise overspecify the grid resolution (`dealiasing` will be ignored in that case).
+This also works starting from the grid space
+
+```@example speedytransforms
+grid = rand(FullClenshawGrid, 12)
+SpectralTransform(grid)
+```
+
+where you can also provide spectral resolution `trunc` or `dealiasing`. You can also
+provide both a grid and a lower triangular matrix to describe both spaces
+
+```@example speedytransforms
+SpectralTransform(grid, alms)
+```
+
+and you will precompute the transform between them. For more generators see the
+docstrings at `?SpectralTransform`.
+
 ## Power spectrum
 
 How to take some data and compute a power spectrum with SpeedyTransforms you may ask.
@@ -227,7 +262,6 @@ to be used in grid-point space!
 
 ## Example: Creating k^n-distributed noise
 
-
 How would we construct random noise in spectral space that follows a certain
 power law and transform it back into grid-point space? Define the wavenumber ``k``
 for T31, the spectral resolution we are interested in.
@@ -260,37 +294,35 @@ nothing # hide
 You can always access the underlying data in `map` via `map.data` in case you
 need to get rid of the wrapping into a grid again!
 
-## (P)recompute Legendre polynomials
+## Precomputed polynomials and allocated memory
+
+!!! info "Reuse `SpectralTransform`s wherever possible"
+    Depending on horizontal and vertical resolution of spectral and grid space,
+    a `SpectralTransform` can be become very large in memory. Also the recomputation
+    of the polynomials and the planning of the FFTs are expensive compared to the
+    actual transform itself. Therefore reuse a `SpectralTransform` wherever possible.
 
 The spectral transform uses a Legendre transform in meridional direction. For this the
 Legendre polynomials are required, at each latitude ring this is a ``l_{max} \times m_{max}``
 lower triangular matrix. Storing precomputed Legendre polynomials therefore quickly
-increase in size with resolution. One can recompute them to save memory, but that 
-uses more arithmetic operations. There is therefore a memory-compute tradeoff.
+increase in size with resolution. It is therefore advised to reuse a precomputed
+`SpectralTransform` object wherever possible. This prevents transforms to allocate
+large memory which would otherwise be garbage collected again after the transform.
 
-For a single transform, there is no need to precompute the polynomials as the
-`SpectralTransform` object will be garbage collected again anyway. For low resolution
-simulations with many repeated small transforms it makes sense to precompute the
-polynomials and SpeedyWeather.jl does that automatically anyway. At very high resolution
-the polynomials may, however, become prohibitively large. An example at T127,
-about 100km resolution
+You get information about the size of that memory (both polynomials and required scratch memory)
+in the terminal "show" of a `SpectralTransform` object, e.g. at T127 resolution
+with 8 layers these are
 
 ```@example speedytransforms
-spectral_grid = SpectralGrid(trunc=127)
-SpectralTransform(spectral_grid, recompute_legendre=false)
-```
-the polynomials are about 3MB in size. Easy that is not much. But at T1023 on the
-O1536 octahedral Gaussian grid, this is already 1.5GB, cubically increasing with the
-spectral truncation T. `recompute_legendre=true` (default `false` when
-constructing a `SpectralTransform` object which may be reused) would lower this
-to kilobytes
-```@example speedytransforms
-SpectralTransform(spectral_grid, recompute_legendre=true)
+spectral_grid = SpectralGrid(trunc=127, nlayers=8)
+SpectralTransform(spectral_grid)
 ```
 
 ## Batched Transforms 
 
-SpeedyTransforms also supports batched transforms. With batched input data the `transform` is performed along the leading dimension, and all further dimensions are interpreted as batch dimensions. Take for example 
+SpeedyTransforms also supports batched transforms. With batched input data the `transform`
+is performed along the leading dimension, and all further dimensions are interpreted as
+batch dimensions. Take for example 
 
 ```@example speedytransforms 
 alms = randn(LowerTriangularMatrix{Complex{Float32}}, 32, 32, 5) 
diff --git a/src/LowerTriangularMatrices/lower_triangular_array.jl b/src/LowerTriangularMatrices/lower_triangular_array.jl
index 06d0ef4c7..e852c9aa8 100644
--- a/src/LowerTriangularMatrices/lower_triangular_array.jl
+++ b/src/LowerTriangularMatrices/lower_triangular_array.jl
@@ -55,19 +55,32 @@ abstract type OneBased <: IndexBasis end
 # get matrix size of LTA from its data array and m, n (number of rows and columns)
 matrix_size(data::AbstractArray, m::Integer, n::Integer) = (m, n, size(data)[2:end]...)
 
+# extend to get the size of the i-th dimension, with 1 returned for any additional dimension
+# as it is also defined for Array
+function matrix_size(data::AbstractArray, m::Integer, n::Integer, i::Integer)
+    i == 1 && return m      # first dimension is the number of rows m
+    i == 2 && return n      # second dimension is the number of columns n
+    return size(data, i-1)  # -1 as m, n are collapsed into a vector in the data array
+end 
+
 """$(TYPEDSIGNATURES)
 Size of a `LowerTriangularArray` defined as size of the flattened array if `as <: AbstractVector`
 and as if it were a full matrix when `as <: AbstractMatrix`` ."""
 Base.size(L::LowerTriangularArray, base::Type{<:IndexBasis}=OneBased; as=Vector) = size(L, base, as)
-Base.size(L::LowerTriangularArray, i::Integer, base::Type{<:IndexBasis}=OneBased; as=Vector) = size(L, base; as=as)[i]
+Base.size(L::LowerTriangularArray, i::Integer, base::Type{<:IndexBasis}=OneBased; as=Vector) = size(L, i, base, as)
 
 # use multiple dispatch to chose the right options of basis and vector/flat vs matrix indexing
 # matrix indexing can be zero based (natural for spherical harmonics) or one-based,
-# vector/flat indexing has only one based indexing
+# vector/flat indexing has only one-based indexing
 Base.size(L::LowerTriangularArray, base::Type{OneBased}, as::Type{Matrix}) = matrix_size(L.data, L.m, L.n)
 Base.size(L::LowerTriangularArray, base::Type{ZeroBased}, as::Type{Matrix}) = matrix_size(L.data, L.m-1, L.n-1)
 Base.size(L::LowerTriangularArray, base::Type{OneBased}, as::Type{Vector}) = size(L.data)
 
+# size(L, i, ...) to get the size of the i-th dimension, with 1 returned for any additional dimension 
+Base.size(L::LowerTriangularArray, i::Integer, base::Type{OneBased}, as::Type{Matrix}) = matrix_size(L.data, L.m, L.n, i)
+Base.size(L::LowerTriangularArray, i::Integer, base::Type{ZeroBased}, as::Type{Matrix}) = matrix_size(L.data, L.m-1, L.n-1, i)
+Base.size(L::LowerTriangularArray, i::Integer, base::Type{OneBased}, as::Type{Vector}) = size(L.data, i)
+
 # sizeof the underlying data vector
 Base.sizeof(L::LowerTriangularArray) = sizeof(L.data)
 
@@ -107,7 +120,6 @@ end
 
 function Base.array_summary(io::IO, L::LowerTriangularMatrix{T}, inds::Tuple{Vararg{Base.OneTo}}) where T
     mn = size(L; as=Matrix)
-    @info Base.dims2string(length.(inds))
     print(io, Base.dims2string(length.(inds)), ", $(mn[1])x$(mn[2]) LowerTriangularMatrix{$T}")
 end
 
diff --git a/src/SpeedyTransforms/SpeedyTransforms.jl b/src/SpeedyTransforms/SpeedyTransforms.jl
index b50c23236..c23bb0e11 100644
--- a/src/SpeedyTransforms/SpeedyTransforms.jl
+++ b/src/SpeedyTransforms/SpeedyTransforms.jl
@@ -14,8 +14,6 @@ import Primes
 using ..LowerTriangularMatrices
 using ..RingGrids
 
-const DEFAULT_GRID = FullGaussianGrid
-
 # TRANSFORM
 export  SpectralTransform,
         transform!,
@@ -42,7 +40,10 @@ export  spectral_truncation,
 
 include("aliasing.jl")
 include("legendrepolarray.jl")
+include("legendre_shortcuts.jl")
 include("spectral_transform.jl")
+include("fourier.jl")
+include("legendre.jl")
 include("spectral_gradients.jl")
 include("spectral_truncation.jl")
 include("spectrum.jl")
diff --git a/src/SpeedyTransforms/fourier.jl b/src/SpeedyTransforms/fourier.jl
new file mode 100644
index 000000000..10bb60810
--- /dev/null
+++ b/src/SpeedyTransforms/fourier.jl
@@ -0,0 +1,204 @@
+# function barrier for batched or serial transforms as FFTW plans cannot be reused for fewer vertical layers
+function _fourier!(f_north, f_south, grids::AbstractGridArray, S::SpectralTransform)
+    _fourier! = if size(grids, 2) == S.nlayers > 1
+        _fourier_batched!
+    else
+        _fourier_serial!
+    end
+    return _fourier!(f_north, f_south, grids, S)
+end
+
+# function barrier for batched or serial transforms as FFTW plans cannot be reused for fewer vertical layers
+function _fourier!(grids::AbstractGridArray, f_north, f_south, S::SpectralTransform)
+    _fourier! = if size(grids, 2) == S.nlayers > 1
+        _fourier_batched!
+    else
+        _fourier_serial!
+    end
+    return _fourier!(grids, f_north, f_south, S)
+end
+
+"""$(TYPEDSIGNATURES)
+(Forward) Fast Fourier transform (grid to spectral) in zonal direction of `grids`,
+stored in scratch memories `f_north`, `f_south` to be passed on to the Legendre transform.
+Batched version that requires the number of vertical layers to be the same as precomputed in `S`.
+Not to be called directly, use `transform!` instead."""
+function _fourier_batched!(                 # GRID TO SPECTRAL
+    f_north::AbstractArray{<:Complex, 3},   # Fourier-transformed output
+    f_south::AbstractArray{<:Complex, 3},   # and for southern latitudes
+    grids::AbstractGridArray,               # gridded input
+    S::SpectralTransform,                   # precomputed transform
+)
+    (; nlat, nlons, nlat_half) = S          # dimensions
+    (; rfft_plans) = S                      # pre-planned transforms
+    nlayers = size(grids, 2)                # number of vertical layers
+
+    @boundscheck ismatching(S, grids) || throw(DimensionMismatch(S, grids))
+    @boundscheck nlayers == S.nlayers || throw(DimensionMismatch(S, grids))
+    @boundscheck size(f_north) == size(f_south) == (S.nfreq_max, S.nlayers, nlat_half) || throw(DimensionMismatch(S, grids))
+
+    rings = eachring(grids)                 # precomputed ring indices
+    @inbounds for j_north in 1:nlat_half    # symmetry: loop over northern latitudes only
+        j = j_north                         # symmetric index / ring-away from pole index
+        j_south = nlat - j_north + 1        # corresponding southern latitude index
+        nlon = nlons[j]                     # number of longitudes on this ring
+        nfreq = nlon÷2 + 1                  # linear max Fourier frequency wrt to nlon
+        not_equator = j_north != j_south    # is the latitude ring not on equator?
+
+        rfft_plan = rfft_plans[j]           # FFT planned wrt nlon on ring
+        ilons = rings[j_north]              # in-ring indices northern ring
+        
+        # FOURIER TRANSFORM in zonal direction, northern latitude
+        # views and copies necessary for stride-1 outputs required by FFTW
+        ring_layers = view(grids.data, ilons, :)
+        out = reshape(view(S.scratch_memory_spec, 1:nfreq*nlayers), (nfreq, nlayers))
+        LinearAlgebra.mul!(out, rfft_plan, ring_layers)     # Northern latitude
+        f_north[1:nfreq, 1:nlayers, j] .= out               # copy into correct stride
+
+        # and southern latitude if not on Equator
+        ilons = rings[j_south]              # in-ring indices southern ring
+        ring_layers = view(grids.data, ilons, :)
+        if not_equator                      # skip FFT, redundant because north did that latitude already
+            LinearAlgebra.mul!(out, rfft_plan, ring_layers)
+        else
+            fill!(out, 0)
+        end
+        f_south[1:nfreq, 1:nlayers, j] .= out               # copy into correct stride
+    end
+end
+
+"""$(TYPEDSIGNATURES)
+(Forward) Fast Fourier transform (grid to spectral) in zonal direction of `grids`,
+stored in scratch memories `f_north`, `f_south` to be passed on to the Legendre transform.
+Serial version that does not require the number of vertical layers to be the same as precomputed in `S`.
+Not to be called directly, use `transform!` instead."""
+function _fourier_serial!(                  # GRID TO SPECTRAL
+    f_north::AbstractArray{<:Complex, 3},   # Fourier-transformed output
+    f_south::AbstractArray{<:Complex, 3},   # and for southern latitudes
+    grids::AbstractGridArray,               # gridded input
+    S::SpectralTransform,                   # precomputed transform
+)
+    (; nlat, nlons, nlat_half) = S          # dimensions
+    rfft_plans = S.rfft_plans_1D            # pre-planned transforms
+    nlayers = size(grids, 2)                # number of vertical layers
+
+    @boundscheck ismatching(S, grids) || throw(DimensionMismatch(S, grids))
+    @boundscheck nlayers <= S.nlayers || throw(DimensionMismatch(S, grids))
+    @boundscheck size(f_north) == size(f_south) == (S.nfreq_max, S.nlayers, nlat_half) || throw(DimensionMismatch(S, grids))
+
+    rings = eachring(grids)                     # precomputed ring indices
+    @inbounds for (k, k_grid) in zip(1:nlayers, eachgrid(grids))
+        for j_north in 1:nlat_half              # symmetry: loop over northern latitudes only
+            j = j_north                         # symmetric index / ring-away from pole index
+            j_south = nlat - j_north + 1        # southern latitude index
+            nlon = nlons[j]                     # number of longitudes on this ring (north or south)
+            nfreq  = nlon÷2 + 1                 # linear max Fourier frequency wrt to nlon
+            not_equator = j_north != j_south    # is the latitude ring not on equator?
+
+            rfft_plan = rfft_plans[j]           # FFT planned wrt nlon on ring
+            ilons = rings[j_north]              # in-ring indices northern ring
+            
+            grid_jk = view(grids.data, ilons, k_grid)   # data on northern ring, vertical layer k
+            out = view(S.scratch_memory_spec, 1:nfreq)  # view on scratch memory to store transformed data
+            LinearAlgebra.mul!(out, rfft_plan, grid_jk) # perform FFT
+            f_north[1:nfreq, k, j] = out
+
+            # southern latitude, don't call redundant 2nd fft if ring is on equator 
+            ilons = rings[j_south]                      # in-ring indices southern ring
+            grid_jk = view(grids.data, ilons, k_grid)   # data on southern ring, vertical layer k
+            if not_equator 
+                LinearAlgebra.mul!(out, rfft_plan, grid_jk) # perform FFT
+            else
+                fill!(out, 0)
+            end
+            f_south[1:nfreq, k, j] = out
+        end
+    end
+end
+
+"""$(TYPEDSIGNATURES)
+Inverse fast Fourier transform (spectral to grid) of Legendre-transformed inputs `g_north` and `g_south`
+to be stored in `grids`. Not to be called directly, use `transform!` instead."""
+function _fourier_batched!(                 # SPECTRAL TO GRID
+    grids::AbstractGridArray,               # gridded output
+    g_north::AbstractArray{<:Complex, 3},   # Legendre-transformed input
+    g_south::AbstractArray{<:Complex, 3},   # and for southern latitudes
+    S::SpectralTransform,                   # precomputed transform
+)
+    (; nlat, nlons, nlat_half) = S          # dimensions
+    (; brfft_plans) = S                     # pre-planned transforms
+    nlayers = size(grids, 2)                # number of vertical layers
+
+    @boundscheck ismatching(S, grids) || throw(DimensionMismatch(S, grids))
+    @boundscheck nlayers == S.nlayers || throw(DimensionMismatch(S, grids))     # otherwise FFTW complains
+    @boundscheck size(g_north) == size(g_south) == (S.nfreq_max, S.nlayers, nlat_half) || throw(DimensionMismatch(S, grids))
+
+    rings = eachring(grids)                 # precomputed ring indices
+    @inbounds for j_north in 1:nlat_half    # symmetry: loop over northern latitudes only
+        j = j_north                         # symmetric index / ring-away from pole index
+        j_south = nlat - j_north + 1        # southern latitude index
+        nlon = nlons[j]                     # number of longitudes on this ring (north or south)
+        nfreq = nlon÷2 + 1                  # linear max Fourier frequency wrt to nlon
+        not_equator = j_north != j_south    # is the latitude ring not on equator?
+
+        brfft_plan = brfft_plans[j]         # FFT planned wrt nlon on ring
+        ilons = rings[j_north]              # in-ring indices northern ring
+
+        # PERFORM FFT, inverse complex to real, hence brfft
+        # FFTW is in-place writing into `out` via `mul`
+        # FFTW requires stride-1 output, hence view on scratch memory
+        out = reshape(view(S.scratch_memory_grid, 1:nlon*nlayers), (nlon, nlayers))
+        LinearAlgebra.mul!(out, brfft_plan, view(g_north, 1:nfreq, 1:nlayers, j))
+        grids[ilons, :] = out
+
+        # southern latitude, don't call redundant 2nd FFT if ring is on equator 
+        ilons = rings[j_south]              # in-ring indices southern ring
+        if not_equator
+            LinearAlgebra.mul!(out, brfft_plan, view(g_south, 1:nfreq, 1:nlayers, j))
+            grids[ilons, :] = out
+        end
+    end
+end
+
+"""$(TYPEDSIGNATURES)
+(Inverse) Fast Fourier transform (spectral to grid) of Legendre-transformed inputs `g_north` and `g_south`
+to be stored in `grids`. Serial version that does not require the number of vertical layers to be the same
+as precomputed in `S`. Not to be called directly, use `transform!` instead."""
+function _fourier_serial!(                  # GRID TO SPECTRAL
+    grids::AbstractGridArray,               # gridded output
+    g_north::AbstractArray{<:Complex, 3},   # Legendre-transformed input
+    g_south::AbstractArray{<:Complex, 3},   # and for southern latitudes
+    S::SpectralTransform,                   # precomputed transform
+)
+    (; nlat, nlons, nlat_half) = S          # dimensions
+    brfft_plans = S.brfft_plans_1D          # pre-planned transforms
+    nlayers = size(grids, 2)                # number of vertical layers
+
+    @boundscheck ismatching(S, grids) || throw(DimensionMismatch(S, grids))
+    @boundscheck nlayers <= S.nlayers || throw(DimensionMismatch(S, grids))     # otherwise FFTW complains
+    @boundscheck size(g_north) == size(g_south) == (S.nfreq_max, S.nlayers, nlat_half) || throw(DimensionMismatch(S, grids))
+
+    rings = eachring(grids)                     # precomputed ring indices
+    @inbounds for (k, k_grid) in zip(1:nlayers, eachgrid(grids))
+        for j_north in 1:nlat_half              # symmetry: loop over northern latitudes only
+            j = j_north                         # symmetric index / ring-away from pole index
+            j_south = nlat - j_north + 1        # southern latitude index
+            nlon = nlons[j]                     # number of longitudes on this ring (north or south)
+            nfreq  = nlon÷2 + 1                 # linear max Fourier frequency wrt to nlon
+            not_equator = j_north != j_south    # is the latitude ring not on equator?
+
+            brfft_plan = brfft_plans[j]         # FFT planned wrt nlon on ring
+            ilons = rings[j_north]              # in-ring indices northern ring
+            
+            gn = view(g_north, 1:nfreq, k, j)       # data on northern ring, vertical layer k
+            out = view(grids.data, ilons, k_grid)   # view on scratch memory to store transformed data
+            LinearAlgebra.mul!(out, brfft_plan, gn)                 # perform FFT
+
+            # southern latitude, don't call redundant 2nd fft if ring is on equator
+            gs = view(g_south, 1:nfreq, k, j)       # data on southern ring, vertical layer k
+            ilons = rings[j_south]                  # in-ring indices southern ring
+            out = view(grids.data, ilons, k_grid)   # data on southern ring, vertical layer k
+            not_equator && LinearAlgebra.mul!(out, brfft_plan, gs)  # perform FFT
+        end
+    end
+end
\ No newline at end of file
diff --git a/src/SpeedyTransforms/legendre.jl b/src/SpeedyTransforms/legendre.jl
new file mode 100644
index 000000000..06e7dd5e1
--- /dev/null
+++ b/src/SpeedyTransforms/legendre.jl
@@ -0,0 +1,172 @@
+# (inverse) legendre transform kernel, called from _legendre!
+@inline function _fused_oddeven_matvec!(
+    north::AbstractVector,      # output, accumulator vector, northern latitudes
+    south::AbstractVector,      # output, accumulator vector, southern latitudes
+    specs::AbstractMatrix,      # input, spherical harmonic coefficients
+    legendre::AbstractVector,   # input, Legendre polynomials
+)
+    lmax, nlayers = axes(specs)             # lmax is the number of degrees at order m, 
+    isoddlmax = isodd(length(lmax))
+    lmax_even = length(lmax) - isoddlmax    # if lmax is odd do last odd element after the loop
+
+    @boundscheck size(north) == size(south) || throw(DimensionMismatch)
+    @boundscheck size(specs, 1) == length(legendre) || throw(DimensionMismatch)
+    @boundscheck size(specs, 2) <= length(north) || throw(DimensionMismatch)
+    
+    @inbounds for k in nlayers
+        # "even" and "odd" coined with 0-based indexing, i.e. the even l=0 mode is 1st element
+        even_k = zero(eltype(south))    # dot product with elements 1, 3, 5, ...
+        odd_k  = zero(eltype(north))    # dot prodcut with elements 2, 4, 6, ...
+
+        for l in 1:2:lmax_even          # dot product in pairs for contiguous memory access
+            even_k = muladd(specs[l,  k], legendre[l],  even_k)
+            odd_k = muladd(specs[l+1, k], legendre[l+1], odd_k)
+        end
+
+        # now do the last row if lmax is odd, all written as muladds
+        even_k = muladd(specs[end, k], isoddlmax*legendre[end], even_k)
+        north[k] = muladd( 1, odd_k, even_k)    # north = even + odd
+        south[k] = muladd(-1, odd_k, even_k)    # south = even - odd
+    end
+
+    return north, south
+end
+
+"""$(TYPEDSIGNATURES)
+Inverse Legendre transform, batched in the vertical. Not to be used
+directly, but called from transform!."""
+function _legendre!(
+    g_north::AbstractArray{<:Complex, 3},   # Legendre-transformed output, northern latitudes
+    g_south::AbstractArray{<:Complex, 3},   # and southern latitudes
+    specs::LowerTriangularArray,            # input: spherical harmonic coefficients
+    S::SpectralTransform;                   # precomputed transform
+    unscale_coslat::Bool = false,           # unscale by cosine of latitude on the fly?
+)
+    (; nlat_half) = S                       # dimensions    
+    (; lmax, mmax ) = S                     # 0-based max degree l, order m of spherical harmonics  
+    (; legendre_polynomials) = S            # precomputed Legendre polynomials    
+    (; mmax_truncation) = S                 # Legendre shortcut, shortens loop over m, 0-based  
+    (; coslat⁻¹, lon_offsets ) = S
+    nlayers = axes(specs, 2)                # get number of layers of specs for fewer layers than precomputed in S
+
+    @boundscheck ismatching(S, specs) || throw(DimensionMismatch(S, specs))
+    @boundscheck size(g_north) == size(g_south) == (S.nfreq_max, S.nlayers, nlat_half) || throw(DimensionMismatch(S, specs))
+
+    north = S.scratch_memory_column_north   # use scratch memory for vertically-batched dot product
+    south = S.scratch_memory_column_south
+
+    @inbounds for j in 1:nlat_half          # symmetry: loop over northern latitudes only
+        g_north[:, nlayers, j] .= 0       # reset scratch memory
+        g_south[:, nlayers, j] .= 0       # reset scratch memory
+
+        # INVERSE LEGENDRE TRANSFORM by looping over wavenumbers l, m
+        lm = 1                              # single running index for non-zero l, m indices
+        for m in 1:mmax_truncation[j] + 1   # Σ_{m=0}^{mmax}, but 1-based index, shortened to mmax_truncation
+            lm_end = lm + lmax-m+1          # last index in column
+
+            # view on lower triangular column, but batched in vertical
+            spec_view = view(specs.data, lm:lm_end, :)
+            legendre_view = view(legendre_polynomials.data, lm:lm_end, j)
+
+            # dot product but split into even and odd harmonics on the fly for better performance
+            # function is 1-based (odd, even, odd, ...) but here use 0-based indexing to name
+            # the "even" and "odd" harmonics, batched in the vertical so it's a mat vec multiplication
+            north, south = _fused_oddeven_matvec!(north, south, spec_view, legendre_view)
+
+            # CORRECT FOR LONGITUDE OFFSETTS (if grid points don't start at 0°E)
+            o = lon_offsets[m, j]           # rotation through multiplication with complex unit vector
+            for k in nlayers
+                g_north[m, k, j] = muladd(o, north[k], g_north[m, k, j])
+                g_south[m, k, j] = muladd(o, south[k], g_south[m, k, j])
+            end
+
+            lm = lm_end + 1                         # first index of next m column
+        end
+
+        if unscale_coslat
+            g_north[:, nlayers, j] .*= coslat⁻¹[j]        # scale in place
+            g_south[:, nlayers, j] .*= coslat⁻¹[j]
+        end
+    end
+end
+
+# (forward) Legendre kernel, called from _legendre!
+@inline function _fused_oddeven_outer_product_accumulate!(
+    specs::AbstractMatrix,      # output, accumulated spherical harmonic coefficients
+    legendre::AbstractVector,   # input, Legendre polynomials
+    even::AbstractVector,       # input, even harmonics
+    odd::AbstractVector,        # input, odd harmonics
+)
+    lmax, nlayers = size(specs)
+    isoddlmax = isodd(lmax)
+    lmax_even = lmax - isoddlmax
+
+    @boundscheck size(odd) == size(even) || throw(DimensionMismatch)
+    @boundscheck size(specs, 1) == length(legendre) || throw(DimensionMismatch)
+    @boundscheck size(specs, 2) <= length(even) || throw(DimensionMismatch)
+
+    @inbounds for k in 1:nlayers
+        even_k, odd_k = even[k], odd[k]
+        for l in 1:2:lmax_even
+            specs[l,   k] = muladd(legendre[l],  even_k, specs[l,   k])
+            specs[l+1, k] = muladd(legendre[l+1], odd_k, specs[l+1, k])
+        end
+        specs[end, k] = muladd(legendre[end], isoddlmax*even_k, specs[end, k])
+    end
+end
+
+"""$(TYPEDSIGNATURES)
+(Forward) Legendre transform, batched in the vertical. Not to be used
+directly, but called from transform!."""
+function _legendre!(                        # GRID TO SPECTRAL
+    specs::LowerTriangularArray,            # Fourier and Legendre-transformed output
+    f_north::AbstractArray{<:Complex, 3},   # Fourier-transformed input, northern latitudes
+    f_south::AbstractArray{<:Complex, 3},   # and southern latitudes
+    S::SpectralTransform,                   # precomputed transform
+)
+    (; nlat, nlat_half) = S                 # dimensions
+    (; lmax, mmax) = S                      # 0-based max degree l, order m of spherical harmonics  
+    (; legendre_polynomials) = S            # precomputed Legendre polynomials    
+    (; mmax_truncation) = S                 # Legendre shortcut, shortens loop over m, 0-based  
+    (; solid_angles, lon_offsets) = S
+    nlayers = axes(specs, 2)                # get number of layers of specs for fewer layers than precomputed in S
+
+    @boundscheck ismatching(S, specs) || throw(DimensionMismatch(S, specs))
+    @boundscheck size(f_north) == size(f_south) == (S.nfreq_max, S.nlayers, nlat_half) || throw(DimensionMismatch(S, specs))
+
+    even = S.scratch_memory_column_north    # use scratch memory for outer product
+    odd = S.scratch_memory_column_south
+
+    fill!(specs, 0)                         # reset as we accumulate into specs
+
+    @inbounds for j_north in 1:nlat_half    # symmetry: loop over northern latitudes only
+        j = j_north                         # symmetric index / ring-away from pole index
+
+        # SOLID ANGLES including quadrature weights (sinθ Δθ) and azimuth (Δϕ) on ring j
+        ΔΩ = solid_angles[j]                # = sinθ Δθ Δϕ, solid angle for a grid point
+
+        lm = 1                              # single running index for spherical harmonics
+        for m in 1:mmax_truncation[j] + 1   # Σ_{m=0}^{mmax}, but 1-based index, shortened to mmax_truncation
+
+            # SOLID ANGLE QUADRATURE WEIGHTS and LONGITUDE OFFSET
+            o = lon_offsets[m, j]           # longitude offset rotation by multiplication with complex unit vector
+            ΔΩ_rotated = ΔΩ*conj(o)         # complex conjugate for rotation back to prime meridian
+            
+            # LEGENDRE TRANSFORM
+            for k in nlayers
+                fn, fs  = f_north[m, k, j], f_south[m, k, j]
+                @fastmath even[k] = ΔΩ_rotated*(fn + fs)
+                @fastmath odd[k]  = ΔΩ_rotated*(fn - fs)
+            end
+
+            # integration over l = m:lmax+1
+            lm_end = lm + lmax-m+1                      # last index in column m
+            spec_view = view(specs.data, lm:lm_end, :)
+            legendre_view = view(legendre_polynomials.data, lm:lm_end, j)
+
+            _fused_oddeven_outer_product_accumulate!(spec_view, legendre_view, even, odd)
+
+            lm = lm_end + 1                             # first index of next column m+1
+        end
+    end
+end
\ No newline at end of file
diff --git a/src/SpeedyTransforms/legendre_shortcuts.jl b/src/SpeedyTransforms/legendre_shortcuts.jl
new file mode 100644
index 000000000..91e5aa020
--- /dev/null
+++ b/src/SpeedyTransforms/legendre_shortcuts.jl
@@ -0,0 +1,49 @@
+"""Legendre shortcut is the truncation of the loop over the order m of the spherical harmonics
+in the Legendre transform. For reduced grids with as few as 4 longitudes around the poles
+(HEALPix grids) or 20 (octahedral grids) the higher wavenumbers in large orders m do not
+project (significantly) onto such few longitudes. For performance reasons the loop over m can
+therefore but truncated early. A Legendre shortcut `<: AbstractLegendreShortcut`
+is implemented as a functor that returns the 0-based maximum order m to retain per latitude ring,
+i.e. to be used for m in 0:mmax_truncation.
+
+New shortcuts can be added by defining `struct LegendreShortcutNew <: AbstractLegendreShortcut end`
+and the functor method `LegendreShortcutNew(nlon::Integer, lat::Real) = ...`, with `nlon` the
+number of longitude points on that ring, and `latd` its latitude in degrees (-90˚ to 90˚N).
+Many implementations may not use the latitude `latd` but it is included for compatibility.
+If unused set to default value to 0. Also define `short_name(::Type{<:LegendreShortcutNew}) = "new"`.
+
+Implementions are `LegendreShortcutLinear`, `LegendreShortcutQuadratic`, `LegendreShortcutCubic`,
+`LegendreShortcutLinQuadCosLat²` and `LegendreShortcutLinCubCoslat`."""
+abstract type AbstractLegendreShortcut end
+short_name(s::Type{<:AbstractLegendreShortcut}) = string(s)
+short_name(s::AbstractLegendreShortcut) = short_name(typeof(s))
+
+struct LegendreShortcutLinear <: AbstractLegendreShortcut end
+"""$(TYPEDSIGNATURES)
+Linear Legendre shortcut, truncates the Legendre loop over order m to nlon/2."""
+LegendreShortcutLinear(nlon::Integer, latd::Real=0) = nlon÷2
+short_name(::Type{<:LegendreShortcutLinear}) = "linear"
+
+struct LegendreShortcutQuadratic <: AbstractLegendreShortcut end
+"""$(TYPEDSIGNATURES)
+Quadratic Legendre shortcut, truncates the Legendre loop over order m to nlon/3."""
+LegendreShortcutQuadratic(nlon::Integer, latd::Real=0) = nlon÷3
+short_name(::Type{<:LegendreShortcutQuadratic}) = "quadratic"
+
+struct LegendreShortcutCubic <: AbstractLegendreShortcut end
+"""$(TYPEDSIGNATURES)
+Cubic Legendre shortcut, truncates the Legendre loop over order m to nlon/4."""
+LegendreShortcutCubic(nlon::Integer, latd::Real=0) = nlon÷4
+short_name(::Type{<:LegendreShortcutCubic}) = "cubic"
+
+struct LegendreShortcutLinQuadCoslat² <: AbstractLegendreShortcut end
+"""$(TYPEDSIGNATURES)
+Linear-Quadratic Legendre shortcut, truncates the Legendre loop over order m to nlon/(2 + cosd(latd)^2)."""
+LegendreShortcutLinQuadCoslat²(nlon::Integer, latd::Real) = floor(Int, nlon/(2 + cosd(latd)^2))
+short_name(::Type{<:LegendreShortcutLinQuadCoslat²}) = "linquadcoslat²"
+
+struct LegendreShortcutLinCubCoslat <: AbstractLegendreShortcut end
+"""$(TYPEDSIGNATURES)
+Linear-Cubic Legendre shortcut, truncates the Legendre loop over order m to nlon/(2 + 2cosd(latd))."""
+LegendreShortcutLinCubCoslat(nlon::Integer, latd::Real) = floor(Int, nlon/(2 + 2cosd(latd)))
+short_name(::Type{<:LegendreShortcutLinCubCoslat}) = "lincubcoslat"
\ No newline at end of file
diff --git a/src/SpeedyTransforms/show.jl b/src/SpeedyTransforms/show.jl
index 471bd2d86..8a17d602a 100644
--- a/src/SpeedyTransforms/show.jl
+++ b/src/SpeedyTransforms/show.jl
@@ -13,11 +13,16 @@ function prettymemory(b)
 end
 
 function Base.show(io::IO, S::SpectralTransform{NF, ArrayType}) where {NF, ArrayType}
-    (; lmax, mmax, Grid, nlat_half) = S
+    (; lmax, mmax, Grid, nlat_half, nlayers) = S
 
-    # add information about size of Legendre polynomials
-    s = S.recompute_legendre ? Base.summarysize(S.Λ) : Base.summarysize(S.Λs)
-    s_str = prettymemory(s)
+    # add information about size of Legendre polynomials and scratch memory
+    polysize_str = prettymemory(Base.summarysize(S.legendre_polynomials))
+    memorysize_str = prettymemory(
+                Base.summarysize(S.scratch_memory_north) +      # add all scratch_memories
+                Base.summarysize(S.scratch_memory_south) + 
+                Base.summarysize(S.scratch_memory_grid) + 
+                Base.summarysize(S.scratch_memory_spec)
+            )
 
     dealias = get_dealiasing(mmax, nlat_half)
     truncations = ["<linear", "linear", "quadratic", "cubic", ">cubic"]
@@ -28,5 +33,6 @@ function Base.show(io::IO, S::SpectralTransform{NF, ArrayType}) where {NF, Array
     println(io, "├ Spectral:   T$mmax, $(lmax+1)x$(mmax+1) LowerTriangularMatrix{Complex{$NF}}")
     println(io, "├ Grid:       $(RingGrids.get_nlat(Grid, nlat_half))-ring $Grid{$NF}")
     println(io, "├ Truncation: dealiasing = $dealiasing ($truncation)")
-      print(io, "└ Legendre:   recompute polynomials $(S.recompute_legendre) ($s_str)")
-end
+    println(io, "├ Legendre:   Polynomials $polysize_str, shortcut: $(short_name(S.LegendreShortcut))")
+      print(io, "└ Memory:     for $nlayers layers ($memorysize_str)")
+end
\ No newline at end of file
diff --git a/src/SpeedyTransforms/spectral_transform.jl b/src/SpeedyTransforms/spectral_transform.jl
index a41d8381d..82f11d671 100644
--- a/src/SpeedyTransforms/spectral_transform.jl
+++ b/src/SpeedyTransforms/spectral_transform.jl
@@ -1,5 +1,9 @@
-"""
-SpectralTransform struct that contains all parameters and precomputed arrays
+const DEFAULT_NLAYERS = 1
+const DEFAULT_GRID = FullGaussianGrid
+
+abstract type AbstractSpectralTransform end
+
+"""SpectralTransform struct that contains all parameters and precomputed arrays
 to perform a spectral transform. Fields are
 $(TYPEDFIELDS)"""
 struct SpectralTransform{
@@ -9,18 +13,21 @@ struct SpectralTransform{
     VectorComplexType,          # <: ArrayType{Complex{NF}, 1},
     VectorIntType,              # <: ArrayType{Int, 1},
     MatrixComplexType,          # <: ArrayType{Complex{NF}, 2},
+    ArrayComplexType,           # <: ArrayType{Complex{NF}, 3},
     LowerTriangularMatrixType,  # <: LowerTriangularArray{NF, 1, ArrayType{NF}},
     LowerTriangularArrayType,   # <: LowerTriangularArray{NF, 2, ArrayType{NF}},
-}
+} <: AbstractSpectralTransform
     # GRID
     Grid::Type{<:AbstractGridArray} # grid type used
     nlat_half::Int                  # resolution parameter of grid (# of latitudes on one hemisphere, Eq incl)
+    nlayers::Int                    # number of layers in the vertical (for scratch memory size)
 
     # SPECTRAL RESOLUTION
-    lmax::Int               # Maximum degree l=[0, lmax] of spherical harmonics
-    mmax::Int               # Maximum order m=[0, l] of spherical harmonics
-    nfreq_max::Int          # Maximum (at Equator) number of Fourier frequencies (real FFT)
-    m_truncs::VectorIntType # Maximum order m to retain per latitude ring
+    lmax::Int                       # Maximum degree l=[0, lmax] of spherical harmonics
+    mmax::Int                       # Maximum order m=[0, l] of spherical harmonics
+    nfreq_max::Int                  # Maximum (at Equator) number of Fourier frequencies (real FFT)
+    LegendreShortcut::Type{<:AbstractLegendreShortcut} # Legendre shortcut for truncation of m loop
+    mmax_truncation::VectorIntType  # Maximum order m to retain per latitude ring
 
     # CORRESPONDING GRID SIZE
     nlon_max::Int           # Maximum number of longitude points (at Equator)
@@ -28,33 +35,39 @@ struct SpectralTransform{
     nlat::Int               # Number of latitude rings
 
     # CORRESPONDING GRID VECTORS
-    colat::VectorType               # Gaussian colatitudes (0, π) North to South Pole 
-    cos_colat::VectorType           # Cosine of colatitudes
-    sin_colat::VectorType           # Sine of colatitudes
+    coslat::VectorType              # Cosine of latitudes, north to south
+    coslat⁻¹::VectorType            # inverse of coslat inv.(coslat)
     lon_offsets::MatrixComplexType  # Offset of first lon per ring from prime meridian
 
     # NORMALIZATION
     norm_sphere::NF                         # normalization of the l=0, m=0 mode
 
-    # FFT plans, one plan for each latitude ring
+    # FFT plans, one plan for each latitude ring, batched in the vertical
     rfft_plans::Vector{AbstractFFTs.Plan}   # FFT plan for grid to spectral transform
     brfft_plans::Vector{AbstractFFTs.Plan}  # spectral to grid transform (inverse)
 
-    # LEGENDRE POLYNOMIALS
-    recompute_legendre::Bool                # Pre or recompute Legendre polynomials
-    Λ::AssociatedLegendrePolMatrix{NF}      # Legendre polynomials for one latitude (requires recomputing)
-    Λs::Vector{LowerTriangularMatrixType}   # Legendre polynomials for all latitudes (all precomputed)
+    # FFT plans, but unbatched
+    rfft_plans_1D::Vector{AbstractFFTs.Plan}
+    brfft_plans_1D::Vector{AbstractFFTs.Plan}
+
+    # LEGENDRE POLYNOMIALS, for all latitudes, precomputed
+    legendre_polynomials::LowerTriangularArrayType
     
+    # SCRATCH MEMORY FOR FOURIER NOT YET LEGENDRE TRANSFORMED AND VICE VERSA
+    # state is undetermined, only read after writing to it
+    scratch_memory_north::ArrayComplexType
+    scratch_memory_south::ArrayComplexType
+    scratch_memory_grid::VectorType                 # scratch memory with 1-stride for FFT output
+    scratch_memory_spec::VectorComplexType
+    scratch_memory_column_north::VectorComplexType  # scratch memory for vertically batched Legendre transform
+    scratch_memory_column_south::VectorComplexType  # scratch memory for vertically batched Legendre transform
+
     # SOLID ANGLES ΔΩ FOR QUADRATURE
     # (integration for the Legendre polynomials, extra normalisation of π/nlat included)
     # vector is pole to pole although only northern hemisphere required
     solid_angles::VectorType                # = ΔΩ = sinθ Δθ Δϕ (solid angle of grid point)
 
-    # RECURSION FACTORS
-    ϵlms::LowerTriangularMatrixType     # precomputed for meridional gradients grad_y1, grad_y2
-
     # GRADIENT MATRICES (on unit sphere, no 1/radius-scaling included)
-    grad_x ::VectorComplexType          # = i*m but precomputed
     grad_y1::LowerTriangularMatrixType  # precomputed meridional gradient factors, term 1
     grad_y2::LowerTriangularMatrixType  # term 2
 
@@ -77,23 +90,21 @@ $(TYPEDSIGNATURES)
 Generator function for a SpectralTransform struct. With `NF` the number format,
 `Grid` the grid type `<:AbstractGrid` and spectral truncation `lmax, mmax` this function sets up
 necessary constants for the spetral transform. Also plans the Fourier transforms, retrieves the colatitudes,
-and preallocates the Legendre polynomials (if recompute_legendre == false) and quadrature weights."""
+and preallocates the Legendre polynomials and quadrature weights."""
 function SpectralTransform(
-    ::Type{NF},                                 # Number format NF
-    Grid::Type{<:AbstractGridArray},            # type of spatial grid used
-    lmax::Int,                                  # Spectral truncation: degrees
-    mmax::Int;                                  # Spectral truncation: orders
-    ArrayType::Type{<:AbstractArray} = Array,   # Array type used for spectral coefficients
-    recompute_legendre::Bool = true,            # re or precompute legendre polynomials?
-    legendre_shortcut::Symbol = :linear,        # shorten Legendre loop over order m
-    dealiasing::Real=DEFAULT_DEALIASING
+    ::Type{NF},                                     # Number format NF
+    lmax::Integer,                                  # Spectral truncation: degrees
+    mmax::Integer,                                  # Spectral truncation: orders
+    nlat_half::Integer;                             # grid resolution, latitude rings on one hemisphere incl equator
+    Grid::Type{<:AbstractGridArray} = DEFAULT_GRID, # type of spatial grid used
+    ArrayType::Type{<:AbstractArray} = Array,       # Array type used for spectral coefficients
+    nlayers::Integer = DEFAULT_NLAYERS,             # number of layers in the vertical (for scratch memory size)
+    LegendreShortcut::Type{<:AbstractLegendreShortcut} = LegendreShortcutLinear,   # shorten Legendre loop over order m
 ) where NF
 
-    Grid = RingGrids.nonparametric_type(Grid)   # always use nonparametric super type
+    Grid = RingGrids.nonparametric_type(Grid)   # always use nonparametric concrete type
 
     # RESOLUTION PARAMETERS
-    nlat_half = get_nlat_half(mmax, dealiasing) # resolution parameter nlat_half,
-                                                # number of latitude rings on one hemisphere incl equator
     nlat = get_nlat(Grid, nlat_half)            # 2nlat_half but one less if grids have odd # of lat rings
     nlon_max = get_nlon_max(Grid, nlat_half)    # number of longitudes around the equator
                                                 # number of longitudes per latitude ring (one hemisphere only)
@@ -101,56 +112,66 @@ function SpectralTransform(
     nfreq_max = nlon_max÷2 + 1                      # maximum number of fourier frequencies (real FFTs)
 
     # LATITUDE VECTORS (based on Gaussian, equi-angle or HEALPix latitudes)
-    colat = RingGrids.get_colat(Grid, nlat_half)    # colatitude in radians                             
+    latd = RingGrids.get_latd(Grid, nlat_half)      # latitude in degrees (90˚Nto -90˚N)
+    colat = RingGrids.get_colat(Grid, nlat_half)    # colatitude in radians
     cos_colat = cos.(colat)                         # cos(colat)
-    sin_colat = sin.(colat)                         # sin(colat)
-
-    # LEGENDRE SHORTCUT OVER ORDER M (to have linear/quad/cubic truncation per latitude ring)
-    if legendre_shortcut in (:linear, :quadratic, :cubic)
-        s = (linear=1, quadratic=2, cubic=3)[legendre_shortcut]
-        m_truncs = [nlons[j]÷(s+1) + 1 for j in 1:nlat_half]
-    elseif legendre_shortcut == :linquad_coslat²
-        m_truncs = [ceil(Int, nlons[j]/(2 + sin_colat[j]^2)) for j in 1:nlat_half]
-    elseif legendre_shortcut == :lincub_coslat
-        m_truncs = [ceil(Int, nlons[j]/(2 + 2sin_colat[j])) for j in 1:nlat_half]
-    else
-        throw(error("legendre_shortcut=$legendre_shortcut not defined."))
-    end
-    m_truncs = min.(m_truncs, mmax+1)    # only to mmax in any case (otherwise BoundsError)
+    coslat = cosd.(latd)                            # cos(lat)
+    coslat⁻¹ = inv.(coslat)                         # 1/cos(lat)
+
+    # LEGENDRE SHORTCUT OVER ORDER M (0-based), truncate the loop over order m 
+    mmax_truncation = [LegendreShortcut(nlons[j], latd[j]) for j in 1:nlat_half]
+    mmax_truncation = min.(mmax_truncation, mmax)   # only to mmax in any case (otherwise BoundsError)
 
-    # PLAN THE FFTs
-    FFT_package = NF <: Union{Float32, Float64} ? FFTW : GenericFFT
-    rfft_plans = [FFT_package.plan_rfft(zeros(NF, nlon)) for nlon in nlons]
-    brfft_plans = [FFT_package.plan_brfft(zeros(Complex{NF}, nlon÷2 + 1), nlon) for nlon in nlons]
-    
     # NORMALIZATION
     norm_sphere = 2sqrt(π)  # norm_sphere at l=0, m=0 translates to 1s everywhere in grid space
-
+    
     # LONGITUDE OFFSETS OF FIRST GRID POINT PER RING (0 for full and octahedral grids)
     _, lons = RingGrids.get_colatlons(Grid, nlat_half)
     rings = eachring(Grid, nlat_half)                       # compute ring indices
     lon1s = [lons[rings[j].start] for j in 1:nlat_half]     # pick lons at first index for each ring
     lon_offsets = [cispi(m*lon1/π) for m in 0:mmax, lon1 in lon1s]
-
-    # PREALLOCATE LEGENDRE POLYNOMIALS, +1 for 1-based indexing
-    # Legendre polynomials for one latitude
-    Λ = AssociatedLegendrePolArray{NF, 2, 1, Vector{NF}}(zeros(LowerTriangularMatrix{NF}, lmax+1, mmax+1))
-
-    # allocate memory in Λs for polynomials at all latitudes or allocate dummy array if precomputed
-    # Λs is of size (lmax+1) x (mmax+1) x nlat_half unless recomputed
-    # for recomputed only Λ is used, not Λs, create dummy array of 0-size instead
-    b = ~recompute_legendre                 # true for precomputed
-    Λs = [zeros(LowerTriangularMatrix{NF}, b*(lmax+1), b*(mmax+1)) for _ in 1:nlat_half]
-
-    if recompute_legendre == false          # then precompute all polynomials
-        Λtemp = zeros(NF, lmax+1, mmax+1)   # preallocate matrix
-        for j in 1:nlat_half                # only one hemisphere due to symmetry
-            Legendre.λlm!(Λtemp, lmax, mmax, Float64(cos_colat[j]))   # always precalculate in Float64
-            # underflow!(Λtemp, sqrt(floatmin(NF)))
-            copyto!(Λs[j], Λtemp)
-        end
+    
+    # PRECOMPUTE LEGENDRE POLYNOMIALS, +1 for 1-based indexing
+    legendre_polynomials = zeros(LowerTriangularMatrix{NF}, lmax+1, mmax+1, nlat_half)
+    legendre_polynomials_j = zeros(NF, lmax+1, mmax+1)      # temporary for one latitude
+    for j in 1:nlat_half                                    # only one hemisphere due to symmetry
+        Legendre.λlm!(legendre_polynomials_j, lmax, mmax, cos_colat[j])             # precompute
+        legendre_polynomials[:, j] = LowerTriangularArray(legendre_polynomials_j)  # store
     end
+    
+    # SCRATCH MEMORY FOR FOURIER NOT YET LEGENDRE TRANSFORMED AND VICE VERSA
+    scratch_memory_north = zeros(Complex{NF}, nfreq_max, nlayers, nlat_half)
+    scratch_memory_south = zeros(Complex{NF}, nfreq_max, nlayers, nlat_half)
+
+    # SCRATCH MEMORY TO 1-STRIDE DATA FOR FFTs
+    scratch_memory_grid  = zeros(NF, nlon_max*nlayers)
+    scratch_memory_spec  = zeros(Complex{NF}, nfreq_max*nlayers)
+
+    # SCRATCH MEMORY COLUMNS FOR VERTICALLY BATCHED LEGENDRE TRANSFORM
+    scratch_memory_column_north = zeros(Complex{NF}, nlayers)
+    scratch_memory_column_south = zeros(Complex{NF}, nlayers)
 
+    # PLAN THE FFTs
+    FFT_package = NF <: Union{Float32, Float64} ? FFTW : GenericFFT
+    rfft_plans = Vector{AbstractFFTs.Plan}(undef, nlat_half)
+    brfft_plans = Vector{AbstractFFTs.Plan}(undef, nlat_half)
+    rfft_plans_1D = Vector{AbstractFFTs.Plan}(undef, nlat_half)
+    brfft_plans_1D = Vector{AbstractFFTs.Plan}(undef, nlat_half)
+
+    fake_grid_data = zeros(Grid{NF}, nlat_half, nlayers)
+
+    for (j, nlon) in enumerate(nlons)
+        real_matrix_input = view(fake_grid_data.data, rings[j], :)
+        complex_matrix_input = view(scratch_memory_north, 1:nlon÷2 + 1, :, j)
+        real_vector_input = view(fake_grid_data.data, rings[j], 1)
+        complex_vector_input = view(scratch_memory_north, 1:nlon÷2 + 1, 1, j)
+
+        rfft_plans[j] = FFT_package.plan_rfft(real_matrix_input, 1)
+        brfft_plans[j] = FFT_package.plan_brfft(complex_matrix_input, nlon, 1)
+        rfft_plans_1D[j] = FFT_package.plan_rfft(real_vector_input, 1)
+        brfft_plans_1D[j] = FFT_package.plan_brfft(complex_vector_input, nlon, 1)
+    end
+    
     # SOLID ANGLES WITH QUADRATURE WEIGHTS (Gaussian, Clenshaw-Curtis, or Riemann depending on grid)
     # solid angles are ΔΩ = sinθ Δθ Δϕ for every grid point with
     # sin(θ)dθ are the quadrature weights approximate the integration over latitudes
@@ -162,8 +183,6 @@ function SpectralTransform(
     ϵlms = get_recursion_factors(lmax+1, mmax)
 
     # GRADIENTS (on unit sphere, hence 1/radius-scaling is omitted)
-    grad_x = [im*m for m in 0:mmax]         # zonal gradient (precomputed currently not used)
-
     # meridional gradient for scalars (coslat scaling included)
     grad_y1 = zeros(LowerTriangularMatrix, lmax+1, mmax+1)      # term 1, mul with harmonic l-1, m
     grad_y2 = zeros(LowerTriangularMatrix, lmax+1, mmax+1)      # term 2, mul with harmonic l+1, m
@@ -217,61 +236,108 @@ function SpectralTransform(
         ArrayType_{Complex{NF}, 1},
         ArrayType_{Int, 1},
         ArrayType_{Complex{NF}, 2},
+        ArrayType_{Complex{NF}, 3},
         LowerTriangularArray{NF, 1, ArrayType_{NF, 1}},
         LowerTriangularArray{NF, 2, ArrayType_{NF, 2}},
     }(
-        Grid, nlat_half,
-        lmax, mmax, nfreq_max, m_truncs,
+        Grid, nlat_half, nlayers,
+        lmax, mmax, nfreq_max, 
+        LegendreShortcut, mmax_truncation,
         nlon_max, nlons, nlat,
-        colat, cos_colat, sin_colat, lon_offsets,
+        coslat, coslat⁻¹, lon_offsets,
         norm_sphere,
-        rfft_plans, brfft_plans,
-        recompute_legendre, Λ, Λs, solid_angles,
-        ϵlms, grad_x, grad_y1, grad_y2,
+        rfft_plans, brfft_plans, rfft_plans_1D, brfft_plans_1D,
+        legendre_polynomials,
+        scratch_memory_north, scratch_memory_south,
+        scratch_memory_grid, scratch_memory_spec,
+        scratch_memory_column_north, scratch_memory_column_south,
+        solid_angles, grad_y1, grad_y2,
         grad_y_vordiv1, grad_y_vordiv2, vordiv_to_uv_x,
         vordiv_to_uv1, vordiv_to_uv2,
         eigenvalues, eigenvalues⁻¹
     )
 end
 
-"""
-$(TYPEDSIGNATURES)
+"""$(TYPEDSIGNATURES)
 Generator function for a `SpectralTransform` struct based on the size of the spectral
-coefficients `alms` and the grid `Grid`. Recomputes the Legendre polynomials by default."""
+coefficients `specs`. Use keyword arguments `nlat_half`, `Grid` or `deliasing` (if `nlat_half`
+not provided) to define the grid."""
 function SpectralTransform(
-    alms::LowerTriangularArray{NF, N, ArrayType};   # spectral coefficients
-    recompute_legendre::Bool = true,                # saves memory
-    Grid::Type{<:AbstractGrid} = DEFAULT_GRID,
-    dealiasing::Real = DEFAULT_DEALIASING,
+    specs::LowerTriangularArray{NF, N, ArrayType};  # spectral coefficients
+    nlat_half::Integer = 0,                         # resolution parameter nlat_half
+    dealiasing::Real = DEFAULT_DEALIASING,          # dealiasing factor
+    kwargs...
 ) where {NF, N, ArrayType}                          # number format NF (can be complex)
-    lmax, mmax = size(alms, ZeroBased, as=Matrix)   # 0-based degree l, order m
-    return SpectralTransform(real(NF), Grid, lmax, mmax; ArrayType, recompute_legendre, dealiasing)
+    lmax, mmax = size(specs, ZeroBased, as=Matrix)  # 0-based degree l, order m
+
+    # get nlat_half from dealiasing if not provided
+    nlat_half = nlat_half > 0 ? nlat_half : get_nlat_half(mmax, dealiasing)
+    nlayers = size(specs, 2)
+    return SpectralTransform(real(NF), lmax, mmax, nlat_half; ArrayType, nlayers, kwargs...)
 end
 
-"""
-$(TYPEDSIGNATURES)
-Generator function for a `SpectralTransform` struct based on the size and grid type of
-gridded field `grids`. Recomputes the Legendre polynomials by default."""
+"""$(TYPEDSIGNATURES)
+Generator function for a `SpectralTransform` struct based on the size and grid type of `grids`.
+Use keyword arugments `trunc`, `dealiasing` (ignored if `trunc` is used) or `one_more_degree`
+to define the spectral truncation."""
 function SpectralTransform(
     grids::AbstractGridArray{NF, N, ArrayType};     # gridded field
-    recompute_legendre::Bool=true,                  # saves memory as transform is garbage collected anyway
-    one_more_degree::Bool=false,                    # returns a square LowerTriangularMatrix by default
-    dealiasing::Real = DEFAULT_DEALIASING,
+    trunc::Integer = 0,                             # spectral truncation
+    dealiasing::Real = DEFAULT_DEALIASING,          # dealiasing factor
+    one_more_degree::Bool = false,                  # returns a square LowerTriangularMatrix by default
+    kwargs...
 ) where {NF, N, ArrayType}                          # number format NF
     Grid = RingGrids.nonparametric_type(typeof(grids))
-    trunc = get_truncation(grids, dealiasing)
-    return SpectralTransform(NF, Grid, trunc+one_more_degree, trunc; ArrayType, recompute_legendre, dealiasing)
+    trunc = trunc > 0 ? trunc : get_truncation(grids, dealiasing)
+    nlayers = size(grids, 2)
+    return SpectralTransform(NF, trunc+one_more_degree, trunc, grids.nlat_half; Grid, ArrayType, nlayers, kwargs...)
 end
 
+"""$(TYPEDSIGNATURES)
+Generator function for a `SpectralTransform` struct to transform between `grids` and `specs`."""
+function SpectralTransform(
+    grids::AbstractGridArray{NF1, N, ArrayType1},
+    specs::LowerTriangularArray{NF2, N, ArrayType2};
+    kwargs...
+) where {NF1, NF2, N, ArrayType1, ArrayType2}           # number formats 1 and 2
+    
+    # infer types for SpectralTransform
+    NF = promote_type(real(eltype(grids)), real(eltype(specs)))
+    Grid = RingGrids.nonparametric_type(typeof(grids))
+    _ArrayType1 = RingGrids.nonparametric_type(ArrayType1)
+    _ArrayType2 = RingGrids.nonparametric_type(ArrayType2)
+    @assert _ArrayType1 == _ArrayType2 "ArrayTypes of grids ($_ArrayType1) and specs ($_ArrayType2) do not match."
+
+    # get resolution
+    lmax, mmax = size(specs, ZeroBased, as=Matrix)      # 0-based degree l, order m
+    nlayers = size(grids, 2)
+    @assert nlayers == size(specs, 2) "Number of layers in grids ($nlayers) and lower triangular matrices ($(size(specs, 2))) do not match."
+    return SpectralTransform(NF, lmax, mmax, grids.nlat_half; Grid, ArrayType=ArrayType1, nlayers, kwargs...)
+end
+
+# make commutative
+SpectralTransform(specs::LowerTriangularArray, grids::AbstractGridArray) = SpectralTransform(grids, specs)
+
 # CHECK MATCHING SIZES
-# TODO use dispatch to return false if array types don't match?
+"""$(TYPEDSIGNATURES)
+Spectral transform `S` and lower triangular matrix `L` match if the
+spectral dimensions `(lmax, mmax)` match and the number of vertical layers is
+equal or larger in the transform (constraints due to allocated scratch memory size)."""
 function ismatching(S::SpectralTransform, L::LowerTriangularArray)
-    return (S.lmax, S.mmax) == size(L, ZeroBased, as=Matrix)[1:2]
+    resolution_math = (S.lmax, S.mmax) == size(L, ZeroBased, as=Matrix)[1:2]
+    vertical_match = length(axes(L, 2)) <= S.nlayers
+    return resolution_math && vertical_match
 end
 
+"""$(TYPEDSIGNATURES)
+Spectral transform `S` and `grid` match if the resolution `nlat_half` and the
+type of the grid match and the number of vertical layers is equal or larger in
+the transform (constraints due to allocated scratch memory size)."""
 function ismatching(S::SpectralTransform, grid::AbstractGridArray)
-    match = S.Grid == RingGrids.nonparametric_type(typeof(grid)) && S.nlat_half == grid.nlat_half
-    return match
+    type_match = S.Grid == RingGrids.nonparametric_type(typeof(grid))
+    resolution_match = S.nlat_half == grid.nlat_half
+    vertical_match = size(grid, 2) <= S.nlayers
+    return type_match && resolution_match && vertical_match
 end
 
 # make `ismatching` commutative
@@ -279,30 +345,24 @@ ismatching(L::LowerTriangularArray, S::SpectralTransform) = ismatching(S, L)
 ismatching(G::AbstractGridArray,    S::SpectralTransform) = ismatching(S, G)
 
 function Base.DimensionMismatch(S::SpectralTransform, L::LowerTriangularArray)
-    s = "SpectralTransform for $(S.lmax+1)x$(S.mmax+1)xN LowerTriangularArrays and $(Base.dims2string(size(L, as=Matrix))) "*
+    s = "SpectralTransform for $(S.lmax+1)x$(S.mmax+1)x$(S.nlayers) LowerTriangularArrays "*
+        "with $(Base.dims2string(size(L, as=Matrix))) "*
         "LowerTriangularArray do not match."
     return DimensionMismatch(s)
 end
 
-function Base.DimensionMismatch(S::SpectralTransform{NF1}, G::AbstractGridArray{NF2}) where {NF1, NF2}
-    s = "SpectralTransform{$NF1}($(S.Grid), nlat_half=$(S.nlat_half)) and "*
-        "$(RingGrids.nonparametric_type(G)){$NF2} with nlat_half=$(G.nlat_half) do not match."
+function Base.DimensionMismatch(S::SpectralTransform, G::AbstractGridArray)
+    sz = (RingGrids.get_npoints2D(S.Grid, S.nlat_half), S.nlayers)
+    s = "SpectralTransform for $(Base.dims2string(sz)) $(S.Grid) with "*
+        "$(Base.dims2string(size(G))) $(RingGrids.nonparametric_type(G)) do not match."
     return DimensionMismatch(s)
 end
 
 """
 $(TYPEDSIGNATURES)
 Recursion factors `ϵ` as a function of degree `l` and order `m` (0-based) of the spherical harmonics.
-ϵ(l, m) = sqrt((l^2-m^2)/(4*l^2-1)) and then converted to number format NF."""
-function ϵlm(::Type{NF}, l::Int, m::Int) where NF
-    return convert(NF, sqrt((l^2-m^2)/(4*l^2-1)))
-end
-
-"""
-$(TYPEDSIGNATURES)
-Recursion factors `ϵ` as a function of degree `l` and order `m` (0-based) of the spherical harmonics.
-ϵ(l, m) = sqrt((l^2-m^2)/(4*l^2-1)) with default number format Float64."""
-ϵlm(l::Int, m::Int) = ϵlm(Float64, l, m)
+ϵ(l, m) = sqrt((l^2-m^2)/(4*l^2-1))."""
+recursion_factor(l::Int, m::Int) = sqrt((l^2-m^2)/(4*l^2-1))
 
 """
 $(TYPEDSIGNATURES)     
@@ -310,13 +370,13 @@ Returns a matrix of recursion factors `ϵ` up to degree `lmax` and order `mmax`
 function get_recursion_factors( ::Type{NF}, # number format NF
                                 lmax::Int,  # max degree l of spherical harmonics (0-based here)
                                 mmax::Int   # max order m of spherical harmonics
-                                ) where {NF<:AbstractFloat}
+                                ) where NF
 
     # +1 for 0-based to 1-based
     ϵlms = zeros(LowerTriangularMatrix{NF}, lmax+1, mmax+1)      
-    for m in 1:mmax+1                   # loop over 1-based l, m
+    for m in 1:mmax+1                                   # loop over 1-based l, m
         for l in m:lmax+1
-            ϵlms[l, m] = ϵlm(NF, l-1, m-1) # convert to 0-based l, m for function call
+            ϵlms[l, m] = recursion_factor(l-1, m-1)     # convert to 0-based l, m for function arguments
         end
     end
     return ϵlms
@@ -333,102 +393,25 @@ number format-flexible but `grids` and the spectral transform `S` have to have t
 Uses the precalculated arrays, FFT plans and other constants in the SpectralTransform struct `S`.
 The spectral transform is grid-flexible as long as the `typeof(grids)<:AbstractGridArray` and `S.Grid`
 matches."""
-function transform!(                # SPECTRAL TO GRID
-    grids::AbstractGridArray,       # gridded output
-    specs::LowerTriangularArray,    # spectral coefficients input
-    S::SpectralTransform{NF};       # precomputed transform
-    unscale_coslat::Bool=false,     # unscale with cos(lat) on the fly?
-) where NF                          # number format NF
-
-    (; nlat, nlons, nlat_half, nfreq_max ) = S
-    (; cos_colat, sin_colat, lon_offsets ) = S
-    (; recompute_legendre, Λ, Λs, m_truncs ) = S
-    (; brfft_plans ) = S
-
+function transform!(                    # SPECTRAL TO GRID
+    grids::AbstractGridArray,           # gridded output
+    specs::LowerTriangularArray,        # spectral coefficients input
+    S::SpectralTransform;               # precomputed transform
+    unscale_coslat::Bool = false,       # unscale with cos(lat) on the fly?
+)
+    # catch incorrect sizes early
     @boundscheck ismatching(S, grids) || throw(DimensionMismatch(S, grids))
     @boundscheck ismatching(S, specs) || throw(DimensionMismatch(S, specs))
 
-    lmax = specs.m - 1            # 0-based maximum degree l of spherical harmonics
-    mmax = specs.n - 1            # 0-based maximum order m of spherical harmonics
-
-    # preallocate work arrays
-    gn = zeros(Complex{NF}, nfreq_max)      # phase factors for northern latitudes
-    gs = zeros(Complex{NF}, nfreq_max)      # phase factors for southern latitudes
-
-    rings = eachring(grids)                 # precomputed ring indices
-
-    Λw = Legendre.Work(Legendre.λlm!, Λ, Legendre.Scalar(zero(Float64)))
-
-    # loop over all specs/grids (e.g. vertical dimension)
-    # k, k_grid only differ when specs/grids have a singleton dimension
-    @inbounds for (k, k_grid) in zip(eachmatrix(specs), eachgrid(grids))
-        for j_north in 1:nlat_half              # symmetry: loop over northern latitudes only
-            j_south = nlat - j_north + 1        # southern latitude index
-            nlon = nlons[j_north]               # number of longitudes on this ring
-            nfreq  = nlon÷2 + 1                 # linear max Fourier frequency wrt to nlon
-            m_trunc = m_truncs[j_north]         # (lin/quad/cub) max frequency to shorten loop over m
-            not_equator = j_north != j_south    # is the latitude ring not on equator?
-
-            # Recalculate or use precomputed Legendre polynomials Λ
-            recompute_legendre && Legendre.unsafe_legendre!(Λw, Λ, lmax, mmax, Float64(cos_colat[j_north]))
-            Λj = recompute_legendre ? Λ : Λs[j_north]
-
-            # INVERSE LEGENDRE TRANSFORM by looping over wavenumbers l, m
-            lm = 1                              # single index for non-zero l, m indices
-            for m in 1:m_trunc                  # Σ_{m=0}^{mmax}, but 1-based index, shortened to m_trunc
-                acc_odd  = zero(Complex{NF})    # accumulator for isodd(l+m)
-                acc_even = zero(Complex{NF})    # accumulator for iseven(l+m)
-
-                # integration over l = m:lmax+1
-                lm_end = lm + lmax-m+1              # first index lm plus lmax-m+1 (length of column -1)
-                even_degrees = iseven(lm+lm_end)    # is there an even number of degrees in column m?
-
-                # anti-symmetry: sign change of odd harmonics on southern hemisphere
-                # but put both into one loop for contiguous memory access
-                for lm_even in lm:2:lm_end-even_degrees     
-                    # split into even, i.e. iseven(l+m)
-                    # acc_even += specs[lm_even] * Λj[lm_even], but written with muladd
-                    acc_even = muladd(specs[lm_even, k], Λj[lm_even], acc_even)
-
-                    # and odd (isodd(l+m)) harmonics
-                    # acc_odd += specs[lm_odd] * Λj[lm_odd], but written with muladd
-                    acc_odd = muladd(specs[lm_even+1, k], Λj[lm_even+1], acc_odd)
-                end
-
-                # for even number of degrees, one acc_even iteration is skipped, do now
-                acc_even = even_degrees ? muladd(specs[lm_end, k], Λj[lm_end], acc_even) : acc_even
-
-                acc_n = (acc_even + acc_odd)        # accumulators for northern
-                acc_s = (acc_even - acc_odd)        # and southern hemisphere
-                
-                # CORRECT FOR LONGITUDE OFFSETTS
-                o = lon_offsets[m, j_north]         # longitude offset rotation            
-
-                gn[m] = muladd(acc_n, o, gn[m])     # accumulate in phase factors for northern
-                gs[m] = muladd(acc_s, o, gs[m])     # and southern hemisphere
-
-                lm = lm_end + 1                     # first index of next m column
-            end
-
-            if unscale_coslat
-                @inbounds cosθ = sin_colat[j_north] # sin(colat) = cos(lat)
-                gn ./= cosθ                         # scale in place          
-                gs ./= cosθ
-            end
-
-            # INVERSE FOURIER TRANSFORM in zonal direction
-            brfft_plan = brfft_plans[j_north]       # FFT planned wrt nlon on ring
-            ilons = rings[j_north]                  # in-ring indices northern ring
-            LinearAlgebra.mul!(view(grids.data, ilons, k_grid), brfft_plan, view(gn, 1:nfreq))  # perform FFT
-
-            # southern latitude, don't call redundant 2nd fft if ring is on equator 
-            ilons = rings[j_south]                  # in-ring indices southern ring
-            not_equator && LinearAlgebra.mul!(view(grids.data, ilons, k_grid), brfft_plan, view(gs, 1:nfreq))  # perform FFT
-
-            fill!(gn, zero(Complex{NF}))            # set phase factors back to zero
-            fill!(gs, zero(Complex{NF}))
-        end
-    end
+    # use scratch memory for Legendre but not yet Fourier-transformed data
+    g_north = S.scratch_memory_north    # phase factors for northern latitudes
+    g_south = S.scratch_memory_south    # phase factors for southern latitudes
+
+    # INVERSE LEGENDRE TRANSFORM in meridional direction
+    _legendre!(g_north, g_south, specs, S; unscale_coslat)
+
+    # INVERSE FOURIER TRANSFORM in zonal direction
+    _fourier!(grids, g_north, g_south, S)
 
     return grids
 end
@@ -441,98 +424,24 @@ number format-flexible but `grids` and the spectral transform `S` have to have t
 Uses the precalculated arrays, FFT plans and other constants in the SpectralTransform struct `S`.
 The spectral transform is grid-flexible as long as the `typeof(grids)<:AbstractGridArray` and `S.Grid`
 matches."""
-function transform!(                    # grid -> spectral
+function transform!(                    # GRID TO SPECTRAL
     specs::LowerTriangularArray,        # output: spectral coefficients
-    grids::AbstractGridArray{NF},       # input: gridded values
-    S::SpectralTransform{NF}            # precomputed spectral transform
-) where NF                              # number format
-    
-    (; nlat, nlat_half, nlons, nfreq_max, cos_colat ) = S
-    (; recompute_legendre, Λ, Λs, solid_angles ) = S
-    (; rfft_plans, lon_offsets, m_truncs ) = S
-    
+    grids::AbstractGridArray,           # input: gridded values
+    S::SpectralTransform,               # precomputed spectral transform
+)
+    # catch incorrect sizes early
     @boundscheck ismatching(S, grids) || throw(DimensionMismatch(S, grids))
     @boundscheck ismatching(S, specs) || throw(DimensionMismatch(S, specs))
 
-    lmax = specs.m - 1            # 0-based maximum degree l of spherical harmonics
-    mmax = specs.n - 1            # 0-based maximum order m of spherical harmonics
-
-    # preallocate work arrays
-    fn = zeros(Complex{NF}, nfreq_max)      # Fourier-transformed northern latitude
-    fs = zeros(Complex{NF}, nfreq_max)      # Fourier-transformed southern latitude
-
-    rings = eachring(grids)                 # precompute ring indices
-
-    # partial sums are accumulated in specs, force zeros initially.
-    fill!(specs, 0)
-
-    Λw = Legendre.Work(Legendre.λlm!, Λ, Legendre.Scalar(zero(Float64)))
-
-    # loop over all specs/grids (e.g. vertical dimension)
-    # k, k_grid only differ when specs/grids have a singleton dimension
-    @inbounds for (k, k_grid) in zip(eachmatrix(specs), eachgrid(grids))
-        for j_north in 1:nlat_half              # symmetry: loop over northern latitudes only
-            j_south = nlat - j_north + 1        # corresponding southern latitude index
-            nlon = nlons[j_north]               # number of longitudes on this ring
-            nfreq  = nlon÷2 + 1                 # linear max Fourier frequency wrt to nlon
-            m_trunc = m_truncs[j_north]         # (lin/quad/cub) max frequency to shorten loop over m
-            not_equator = j_north != j_south    # is the latitude ring not on equator?
-
-            # FOURIER TRANSFORM in zonal direction
-            rfft_plan = rfft_plans[j_north]         # FFT planned wrt nlon on ring
-            ilons = rings[j_north]                  # in-ring indices northern ring
-            LinearAlgebra.mul!(view(fn, 1:nfreq), rfft_plan, view(grids.data, ilons, k_grid))   # Northern latitude
-
-            ilons = rings[j_south]                  # in-ring indices southern ring
-                                                    # Southern latitude (don't call FFT on Equator)
-                                                    # then fill fs with zeros and no changes needed further down
-            not_equator ? LinearAlgebra.mul!(view(fs, 1:nfreq), rfft_plan, view(grids.data, ilons, k_grid)) : fill!(fs, 0)
-
-            # LEGENDRE TRANSFORM in meridional direction
-            # Recalculate or use precomputed Legendre polynomials Λ
-            recompute_legendre && Legendre.unsafe_legendre!(Λw, Λ, lmax, mmax, Float64(cos_colat[j_north]))
-            Λj = recompute_legendre ? Λ : Λs[j_north]
-            
-            # SOLID ANGLES including quadrature weights (sinθ Δθ) and azimuth (Δϕ) on ring j
-            ΔΩ = solid_angles[j_north]                      # = sinθ Δθ Δϕ, solid angle for a grid point
-
-            lm = 1                                          # single index for spherical harmonics
-            for m in 1:m_trunc                              # Σ_{m=0}^{mmax}, but 1-based index
-
-                an, as = fn[m], fs[m]
-
-                # SOLID ANGLE QUADRATURE WEIGHTS and LONGITUDE OFFSET
-                o = lon_offsets[m, j_north]                 # longitude offset rotation
-                ΔΩ_rotated = ΔΩ*conj(o)                     # complex conjugate for rotation back to prime meridian
-
-                # LEGENDRE TRANSFORM
-                a_even = (an + as)*ΔΩ_rotated               # sign flip due to anti-symmetry with
-                a_odd = (an - as)*ΔΩ_rotated                # odd polynomials 
-
-                # integration over l = m:lmax+1
-                lm_end = lm + lmax-m+1                      # first index lm plus lmax-m+1 (length of column -1)
-                even_degrees = iseven(lm+lm_end)            # is there an even number of degrees in column m?
-                
-                # anti-symmetry: sign change of odd harmonics on southern hemisphere
-                # but put both into one loop for contiguous memory access
-                for lm_even in lm:2:lm_end-even_degrees
-                    # lm_odd = lm_even+1
-                    # split into even, i.e. iseven(l+m)
-                    # specs[lm_even] += a_even * Λj[lm_even]#, but written with muladd
-                    specs[lm_even, k] = muladd(a_even, Λj[lm_even], specs[lm_even, k])
-                    
-                    # and odd (isodd(l+m)) harmonics
-                    # specs[lm_odd] += a_odd * Λj[lm_odd]#, but written with muladd
-                    specs[lm_even+1, k] = muladd(a_odd, Λj[lm_even+1], specs[lm_even+1, k])
-                end
-
-                # for even number of degrees, one even iteration is skipped, do now
-                specs[lm_end, k] = even_degrees ? muladd(a_even, Λj[lm_end], specs[lm_end, k]) : specs[lm_end, k]
-
-                lm = lm_end + 1                             # first index of next m column
-            end
-        end
-    end
+    # use scratch memory for Fourier but not yet Legendre-transformed data
+    f_north = S.scratch_memory_north    # phase factors for northern latitudes
+    f_south = S.scratch_memory_south    # phase factors for southern latitudes
+
+    # FOURIER TRANSFORM in zonal direction
+    _fourier!(f_north, f_south, grids, S)    
+    
+    # LEGENDRE TRANSFORM in meridional direction
+    _legendre!(specs, f_north, f_south, S)
 
     return specs
 end
@@ -552,6 +461,7 @@ function transform(             # GRID TO SPECTRAL
     return specs
 end
 
+
 """$(TYPEDSIGNATURES)
 Spherical harmonic transform from `specs` to a newly allocated `grids::AbstractGridArray`
 using the precomputed spectral transform `S`."""
@@ -571,16 +481,14 @@ $(TYPEDSIGNATURES)
 Spectral transform (spectral to grid space) from spherical coefficients `alms` to a newly allocated gridded
 field `map`. Based on the size of `alms` the grid type `grid`, the spatial resolution is retrieved based
 on the truncation defined for `grid`. SpectralTransform struct `S` is allocated to execute `transform(alms, S)`."""
-function transform(                     # SPECTRAL TO GRID
-    specs::LowerTriangularArray{NF};    # spectral coefficients input
-    recompute_legendre::Bool = true,    # saves memory
-    Grid::Type{<:AbstractGrid} = DEFAULT_GRID,
-    dealiasing::Real = DEFAULT_DEALIASING,
-    kwargs...                           # pass on unscale_coslat=true/false(default)
-) where NF                              # number format NF
-    lmax, mmax = size(specs, ZeroBased, as=Matrix)
-    S = SpectralTransform(real(NF), Grid, lmax, mmax; recompute_legendre, dealiasing)
-    return transform(specs, S; kwargs...)
+function transform(
+    specs::LowerTriangularArray;                # SPECTRAL TO GRID
+    unscale_coslat::Bool = false,               # separate from kwargs as argument for transform!
+    kwargs...                                   # arguments for SpectralTransform constructor
+)
+    S = SpectralTransform(specs; kwargs...)     # precompute transform
+
+    return transform(specs, S; unscale_coslat)  # do the transform
 end
 
 """
@@ -588,15 +496,7 @@ $(TYPEDSIGNATURES)
 Spectral transform (grid to spectral space) from `grids` to a newly allocated `LowerTriangularArray`.
 Based on the size of `grids` and the keyword `dealiasing` the spectral resolution trunc is
 retrieved. SpectralTransform struct `S` is allocated to execute `transform(grids, S)`."""
-function transform(
-    grids::AbstractGridArray{NF};       # gridded fields
-    recompute_legendre::Bool = true,    # saves memory
-    one_more_degree::Bool = false,      # for lmax+2 x mmax+1 output size
-    dealiasing::Real = DEFAULT_DEALIASING,
-) where NF                              # number format NF
-
-    Grid = RingGrids.nonparametric_type(typeof(grids))
-    trunc = get_truncation(grids.nlat_half, dealiasing)
-    S = SpectralTransform(NF, Grid, trunc+one_more_degree, trunc; recompute_legendre, dealiasing)
-    return transform(grids, S)
+function transform(grids::AbstractGridArray; kwargs...) # GRID TO SPECTRAL
+    S = SpectralTransform(grids; kwargs...)             # precompute transform
+    return transform(grids, S)                          # do the transform
 end
\ No newline at end of file
diff --git a/src/SpeedyTransforms/spectrum.jl b/src/SpeedyTransforms/spectrum.jl
index cc48f2652..de0539327 100644
--- a/src/SpeedyTransforms/spectrum.jl
+++ b/src/SpeedyTransforms/spectrum.jl
@@ -1,3 +1,6 @@
+"""$(TYPEDSIGNATURES)
+Compute the power spectrum of the spherical harmonic coefficients
+`alms` (lower triangular matrix) of type `Complex{NF}`."""
 function power_spectrum(alms::LowerTriangularMatrix{Complex{NF}};
                         normalize::Bool=true) where NF
     
diff --git a/src/SpeedyWeather.jl b/src/SpeedyWeather.jl
index b709b3742..cf943eadc 100644
--- a/src/SpeedyWeather.jl
+++ b/src/SpeedyWeather.jl
@@ -6,7 +6,6 @@ using DocStringExtensions
 # NUMERICS
 import Primes
 import Random
-import FastGaussQuadrature
 import LinearAlgebra: LinearAlgebra, Diagonal
 
 # GPU, PARALLEL
diff --git a/src/dynamics/spectral_grid.jl b/src/dynamics/spectral_grid.jl
index 099696ddf..75f7a0a7a 100644
--- a/src/dynamics/spectral_grid.jl
+++ b/src/dynamics/spectral_grid.jl
@@ -79,13 +79,13 @@ function Base.show(io::IO, SG::SpectralGrid)
     (; nparticles) = SG
 
     # resolution information
-    res_ave = sqrt(4π*radius^2/npoints)/1000  # in [km]
+    average_resolution = sqrt(4π*radius^2/npoints)/1000  # in [km]
     s(x) = x > 1000 ? @sprintf("%i", x) : @sprintf("%.3g", x)
 
     println(io, "$(typeof(SG)):")
     println(io, "├ Spectral:   T$trunc LowerTriangularMatrix{Complex{$NF}}, radius = $radius m")
     println(io, "├ Grid:       $nlat-ring $Grid{$NF}, $npoints grid points")
-    println(io, "├ Resolution: $(s(res_ave))km (average)")
+    println(io, "├ Resolution: $(s(average_resolution))km (average)")
     if nparticles > 0
     println(io, "├ Particles:  $nparticles")
     end
@@ -97,10 +97,8 @@ end
 $(TYPEDSIGNATURES)
 Generator function for a SpectralTransform struct pulling in parameters from a SpectralGrid struct."""
 function SpeedyTransforms.SpectralTransform(spectral_grid::SpectralGrid;
-                                            recompute_legendre::Bool = false,
                                             one_more_degree::Bool = true,
                                             kwargs...)
-    (; NF, Grid, trunc, dealiasing, ArrayType) = spectral_grid
-    return SpectralTransform(NF, Grid, trunc+one_more_degree, trunc; ArrayType, recompute_legendre, dealiasing, kwargs...)
-end
-
+    (; NF, Grid, trunc, nlat_half, nlayers, ArrayType) = spectral_grid
+    return SpectralTransform(NF, trunc+one_more_degree, trunc, nlat_half; Grid, ArrayType, nlayers, kwargs...)
+end
\ No newline at end of file
diff --git a/test/lower_triangular_matrix.jl b/test/lower_triangular_matrix.jl
index 828eeb3f5..1c97d8d2a 100644
--- a/test/lower_triangular_matrix.jl
+++ b/test/lower_triangular_matrix.jl
@@ -56,6 +56,26 @@ end
                 @test size(L.data) == size(L, as=Vector)
                 @test size(L)[2:end] == size(A)[3:end]
                 @test size(L)[1] == SpeedyWeather.LowerTriangularMatrices.nonzeros(size(A,1), size(A,2))
+                
+                # with integer to request length of one specific dimension
+                @test size(L)[1] == size(L, 1)
+                @test size(L)[1] == size(L, 1, as=Vector)
+                @test size(L)[1] == size(L, 1, OneBased, as=Vector)
+
+                @test size(L, 1, as=Matrix) == lmax
+                @test size(L, 2, as=Matrix) == mmax
+                @test size(L, 1, ZeroBased, as=Matrix) == lmax-1
+                @test size(L, 2, ZeroBased, as=Matrix) == mmax-1
+
+                # all additional dimensions are or size 1, as defined for Array too
+                @test size(L, 2+length(idims)) == 1
+                @test size(L, 3+length(idims)) == 1
+                @test size(L, 4+length(idims)) == 1
+                @test size(L, 5+length(idims)) == 1
+
+                @test size(L, 3+length(idims), as=Matrix) == 1
+                @test size(L, 4+length(idims), as=Matrix) == 1
+                @test size(L, 5+length(idims), as=Matrix) == 1
 
                 for m in 1:mmax
                     for l in 1:lmax
diff --git a/test/spectral_gradients.jl b/test/spectral_gradients.jl
index 1fe4a431e..7084ad99b 100644
--- a/test/spectral_gradients.jl
+++ b/test/spectral_gradients.jl
@@ -233,7 +233,7 @@ end
         alms2 = copy(alms)
         alms3 = copy(alms)
 
-        S = SpectralTransform(alms, recompute_legendre=false)
+        S = SpectralTransform(alms)
 
         # ∇⁻²! same as inverse=true
         SpeedyWeather.∇²!(alms2, alms, S, inverse=true);
diff --git a/test/spectral_transform.jl b/test/spectral_transform.jl
index 8e084eeff..3253f520d 100644
--- a/test/spectral_transform.jl
+++ b/test/spectral_transform.jl
@@ -67,24 +67,25 @@ spectral_resolutions_inexact = (127, 255)
     end
 end
 
-@testset "Transform: Recompute, precompute identical results" begin
-    for trunc in spectral_resolutions
-        for NF in (Float32, Float64)
+# functionality deprecated
+# @testset "Transform: Recompute, precompute identical results" begin
+#     for trunc in spectral_resolutions
+#         for NF in (Float32, Float64)
 
-            SG = SpectralGrid(; NF, trunc)
-            S1 = SpectralTransform(SG, recompute_legendre=true)
-            S2 = SpectralTransform(SG, recompute_legendre=false)
+#             SG = SpectralGrid(; NF, trunc)
+#             S1 = SpectralTransform(SG, recompute_legendre=true)
+#             S2 = SpectralTransform(SG, recompute_legendre=false)
 
-            alms = randn(LowerTriangularMatrix{Complex{NF}}, trunc+2, trunc+1)
+#             alms = randn(LowerTriangularMatrix{Complex{NF}}, trunc+2, trunc+1)
 
-            map1 = transform(alms, S1)
-            map2 = transform(alms, S2)
+#             map1 = transform(alms, S1)
+#             map2 = transform(alms, S2)
         
-            # is only approx as recompute_legendre may use a different precision
-            @test map1 ≈ map2
-        end
-    end
-end
+#             # is only approx as recompute_legendre may use a different precision
+#             @test map1 ≈ map2
+#         end
+#     end
+# end
 
 @testset "Transform: Individual Legendre polynomials" begin
     @testset for trunc in spectral_resolutions
@@ -95,7 +96,7 @@ end
                             OctahedralClenshawGrid)
 
                 SG = SpectralGrid(; NF, trunc, Grid)
-                S = SpectralTransform(SG, recompute_legendre=true)
+                S = SpectralTransform(SG)
 
                 lmax = 3
                 for l in 1:lmax
@@ -125,7 +126,7 @@ end
                             OctahedralClenshawGrid)
 
                 SG = SpectralGrid(; NF, trunc, Grid)
-                S = SpectralTransform(SG, recompute_legendre=true)
+                S = SpectralTransform(SG)
 
                 lmax = 3
                 for l in 1:lmax
@@ -258,7 +259,7 @@ end
                                     FullOctaHEALPixGrid)
                 
                 SG = SpectralGrid(; NF, trunc, Grid)
-                S = SpectralTransform(SG, recompute_legendre=true)
+                S = SpectralTransform(SG,)
 
                 lmax = 3
                 for l in 1:lmax
@@ -295,7 +296,7 @@ end
                 dealiasing = Grid in (FullGaussianGrid, OctahedralGaussianGrid) ? 2 : 3
 
                 SG = SpectralGrid(; NF, trunc, Grid, dealiasing)
-                S = SpectralTransform(SG, recompute_legendre=false)
+                S = SpectralTransform(SG)
                 O = EarthOrography(SG, smoothing=true)
                 E = Earth(SG)
                 initialize!(O, E, S)