Bug description

By sheer coincidence, @WouterSwierstra and I ran into a bug that arose with the latest Dask release. Lines like this one are problematic:

imod-python/imod/msw/coupler_mapping.py, line 120 in 4c4e6e1

If we load the data into memory as numpy arrays, everything is fine. Keeping them as lazy objects, however, either causes values not to be set or raises an error like:

`ValueError: cannot broadcast shape (10,) to shape (nan,)`

I think I'm doing something here that was never intended to work with Dask and somehow magically worked.

And even worse:

imod-python/imod/msw/grid_data.py, line 98 in 4c4e6e1

Here the code tries to set values on a read-only array: when Dask is used, `.values` returns a read-only array.

Related issue: dask/dask#11753, which provides a potential solution:

> In the meantime, dask-ml should use `da.where` instead (which is what `da.Array.__setitem__` calls internally) unless there is a use case about applying the same functions to numpy arrays, which would take a performance hit?

I therefore think using a `where` is a lot safer here. The operations are on 2D grids, without a time dimension, so I don't foresee huge performance issues from doing this.

Furthermore, we probably missed this because our test bench loads everything into memory; otherwise this bug would have surfaced earlier. Dask 2025.2.0 is in the present dev environment.

Refinement

- In the MetaSWAP mapping derivations that use fancy indexing, use `where` instead.
- Add tests where data is not loaded into memory and run with Dask.
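A minimal sketch of what the `where`-based replacement could look like, and of a check that runs on both in-memory and lazy data. The helper `set_where`, the grid values and the threshold are all hypothetical and are not taken from the imod-python code; the sketch only assumes that the grids are `xarray.DataArray` objects, as the use of `.values` above suggests.

```python
import dask.array as da
import numpy as np
import xarray as xr


def set_where(grid: xr.DataArray, mask: xr.DataArray, fill: float) -> xr.DataArray:
    """Hypothetical helper: return a new array with `fill` where `mask` is True."""
    # xr.where builds a new array instead of writing into grid.values,
    # so the same code path serves numpy-backed and dask-backed data.
    return xr.where(mask, fill, grid)


# Illustrative 2D grid (assumed shape and values).
values = np.arange(12, dtype=float).reshape(3, 4)
eager = xr.DataArray(values, dims=("y", "x"))
lazy = eager.chunk({"y": 3, "x": 2})  # dask-backed, not loaded into memory

eager_result = set_where(eager, eager > 5.0, -1.0)
lazy_result = set_where(lazy, lazy > 5.0, -1.0)

# The lazy input stays lazy until compute() is called explicitly.
stays_lazy = isinstance(lazy_result.data, da.Array)
# Both code paths should give identical values.
same_values = bool((lazy_result.compute() == eager_result).all())
```

Running the same helper on an eager and a chunked copy of the grid is one way to cover the second refinement item: the test data is never loaded into memory before the operation under test runs.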