compatibility with LazyArrays #925

tiemvanderdeure · 2025-02-07T10:17:43Z

It could be nice to make dimensionaldata and LazyArrays.jl work together. It would probably need an extension. Currently everything I've tried with the @d macro returns errors somewhere and I don't really understand how their @~ macro works. But just to demonstrate some potential use cases I've implemented a wrapper to generate a DimArray with a LazyArray parent:

using DimensionalData, LazyArrays, BenchmarkTools

_maybe_reshape(x::AbstractDimArray, dims) = parent(DimensionalData._maybe_insert_length_one_dims(x, dims))
_maybe_reshape(x, _) = x

function LazyDimArray(a)
    dimargs = filter(x -> x isa AbstractDimArray, a.args)
    isempty(dimargs) && error("No dimensional arguments found")
    dest_dims = DimensionalData.combinedims(dimargs...)
    _args = map(x -> _maybe_reshape(x, dest_dims), dimargs)
    data = LazyArray(@~ a.f.(_args...))
    rebuild(first(dimargs); data, dims = dest_dims)
end

Something like this could be a neat way of getting some really big dimarrays with minimal overhead:

da1 = rand(X(1:1000))
da2 = rand(Y(1:1000))

@btime sum(LazyDimArray(@~ $da1 .* $da2))
@btime sum(@d $da1 .* $da2)
A = parent(da1); B = parent(da2)'
@btime sum(LazyArray(@~ $A .* $B))

  105.847 μs (2 allocations: 96 bytes)
  1.399 ms (5 allocations: 7.63 MiB)
  105.815 μs (0 allocations: 0 bytes)

The text was updated successfully, but these errors were encountered:

rafaqz · 2025-02-07T10:32:42Z

Probably we want to apply this wrapper automatically, so we need to add methods to whatever the constructor of your a object is that rewrap as a DimArray.

So that: (@~ $da1 .* $da2) isa DimArray

(The principle here is that DD aggressively rewraps other constructors to be the outer wrapper wherever it can, without user intervention)

tiemvanderdeure · 2025-02-07T10:48:04Z

Ideally I think something like LazyArray(@d @~ da1 .* da2) would work. But I think we might need some additional dispatches.

What @~ does is basically it stops the materialize. So @~ A .* B is a Base.Broadcast.Broadcasted

rafaqz · 2025-02-07T15:12:45Z

I don't think @d @~ will compose correctly, why not the other way around?

tiemvanderdeure · 2025-02-10T15:36:31Z

why not the other way around

The other way around doesn't work because the expression @d returns is a let block which @~ cannot do anything with.

Another option is to handle all of this within the @d macro and just add a keyword like lazy.

So @d A .* B lazy=true would just return a DimArray with a LazyArray parent. It's pretty clean syntax and we don't have to make LazyArray return a DimArray, but of course it comes at the cost of adding another keyword, and diverging from LazyArrays syntax.

It'd be very easy to implement, we'd just need to add something like

if lazy 
    broadcast_expr = quote
        bc = Base.Broadcast.instantiate(LazyArrays.lazy.($broadcast_expr))
        rebuild(first(vars); data = LazyArray(bc), dims)
    end
end

rafaqz · 2025-02-10T16:54:11Z

It's also a bit weird as we mostly have disk arrays for lazyness, and this laziness may be broken on disk arrays depending how the broadcast styles work out. People will think it means lazy on disk too, but it doesnt

tiemvanderdeure · 2025-02-10T17:18:11Z

In Rasters, yes. But now that @d makes broadcasting across dims so easy, I think this is also a pretty logical use case.

I agree this could potentially be confusing, though.

tiemvanderdeure · 2025-02-10T17:25:37Z

Another thing we need for this to work is to move some code from Broadcast.copy to Broadcast.instantiate (like dims comparisons, and unwrapping). Just spent ages debugging because the axes of a lazy dimensional array I made in this way were broken because there is no Broadcast.instantiate(bc::Broadcasted{DimensionalStyle{S}})

rafaqz · 2025-02-10T17:27:45Z

And YAX...

We can totally integrate with LazyArrays, but there is no way lazy will be a built in keyword for the d macro that means "LazyArrays".

asinghvi17 · 2025-02-10T17:42:47Z

Maybe materialize=false, if it's just a naming issue?

rafaqz · 2025-02-10T17:56:20Z

Name is one thing, but also putting extensions into default keywords is kinda weird.

materialize also makes no sense if the array parent is already a disk array. Then materialise=false would break the already nonmatrializing behaviour.

rafaqz · 2025-02-10T18:02:00Z

Is there a reason not to do modify(LazyArray, A) on you DimArrays? Or DimArray(LazyArray(A), dims) ?

Won't the broadcasts end up lazy?

tiemvanderdeure · 2025-02-10T19:30:15Z

No unfortunately not, because we also need to functionality of @~, which prevents the broadcast from materializing. I still haven't figured out what the best way is to get both @~ and @d.

Once we have that the wrapping of a LazyArray in a DimArray is the easy bit - which we can do with a dispatch in an extension or just have people do manually.

rafaqz · 2025-02-10T19:52:06Z

Hm... I'm just wondering why LazyArray wrappers don't do that with Broadcast styles. That's what DiskArrays does and its a lot more elegant and composable than needing a special macro. Macros compose badly and always lead to these problems.

(e.g. we only have the @d macro because it actually changes the result of the broadcast. LazyArrays doesn't need a macro except for convenience)

tiemvanderdeure · 2025-02-11T12:48:46Z

Yeah it's quite annoying. The design choice is that you can't just wrap an Array in LazyArray to make any operation on it automatically (like with DiskArrays). Instead you have to construct a LazyArray is a result of a broadcast operation. It's not something we're going to be able to change, I guess, but it makes life a lot harder for us.

The way to do this without the macro is to go LazyArrays.lazy.(A .* B) (which is what the macro does), and this almost works with @d, except for that @d wraps the result of A .* B in a _maybe_dimensional_broadcast as well, and so it again materializes.

Maybe this is something we can change? Is there a reason that in @d f.(g.(A,B)), the result of g.(A,B) should be wrapped in _maybe_dimensional_broadcast, if both A and B are anyway?

Otherwise I suppose the conclusion is that this needs thought/isn't possible right now, I can use my bandaid solution for my own use case, and we now have this issue as documentation for anyone who's trying the same thing.

rafaqz · 2025-02-11T13:17:49Z

Ok, yeah that's not great design, I would make an issue there too. It should work like DiskArrays, whicu is really composable even though it does so much more than LazyArrays

We could limit how much _maybe_dimensional_broadcast is applied by passing out a flag for if we found a broadcast lower down. Or maybe it can let more things through without wrapping. It will just take some care

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

compatibility with LazyArrays #925

compatibility with LazyArrays #925

tiemvanderdeure commented Feb 7, 2025

rafaqz commented Feb 7, 2025 •

edited

Loading

tiemvanderdeure commented Feb 7, 2025

rafaqz commented Feb 7, 2025 •

edited

Loading

tiemvanderdeure commented Feb 10, 2025

rafaqz commented Feb 10, 2025

tiemvanderdeure commented Feb 10, 2025

tiemvanderdeure commented Feb 10, 2025

rafaqz commented Feb 10, 2025 •

edited

Loading

asinghvi17 commented Feb 10, 2025

rafaqz commented Feb 10, 2025 •

edited

Loading

rafaqz commented Feb 10, 2025

tiemvanderdeure commented Feb 10, 2025

rafaqz commented Feb 10, 2025 •

edited

Loading

tiemvanderdeure commented Feb 11, 2025

rafaqz commented Feb 11, 2025

compatibility with LazyArrays #925

compatibility with LazyArrays #925

Comments

tiemvanderdeure commented Feb 7, 2025

rafaqz commented Feb 7, 2025 • edited Loading

tiemvanderdeure commented Feb 7, 2025

rafaqz commented Feb 7, 2025 • edited Loading

tiemvanderdeure commented Feb 10, 2025

rafaqz commented Feb 10, 2025

tiemvanderdeure commented Feb 10, 2025

tiemvanderdeure commented Feb 10, 2025

rafaqz commented Feb 10, 2025 • edited Loading

asinghvi17 commented Feb 10, 2025

rafaqz commented Feb 10, 2025 • edited Loading

rafaqz commented Feb 10, 2025

tiemvanderdeure commented Feb 10, 2025

rafaqz commented Feb 10, 2025 • edited Loading

tiemvanderdeure commented Feb 11, 2025

rafaqz commented Feb 11, 2025

rafaqz commented Feb 7, 2025 •

edited

Loading

rafaqz commented Feb 7, 2025 •

edited

Loading

rafaqz commented Feb 10, 2025 •

edited

Loading

rafaqz commented Feb 10, 2025 •

edited

Loading

rafaqz commented Feb 10, 2025 •

edited

Loading