[Bug] - Derivation MetaSWAP mappings fails with Dask >= 2025.2.0 #1436

Open
2 tasks
JoerivanEngelen opened this issue Feb 19, 2025 · 0 comments
Labels
bug Something isn't working

Bug description

By sheer coincidence, @WouterSwierstra and I ran into a bug that surfaced with the latest Dask release. The problem lies in lines like this:

mod_id.data[idomain_active.data] = np.arange(1, n_mod + 1)

If we load the data into memory as NumPy arrays, everything is fine. Keeping them as lazy Dask arrays, however, either causes the values not to be set or raises an error like: ValueError: cannot broadcast shape (10,) to shape (nan,). I think I'm doing something here that was never intended to work with Dask and only worked by accident.

And even worse:

svat.values[isactive.values] = np.arange(1, index.sum() + 1)

Here the assignment targets a read-only array when Dask is used: .values then returns a read-only array.

Related issue: dask/dask#11753, which suggests a potential solution:

In the meantime, dask-ml should use da.where instead (which is what da.Array.__setitem__ calls internally) unless there is a use case about applying the same functions to numpy arrays, which would take a performance hit?

I therefore think using where is a lot safer here. The operations are on 2D grids without a time dimension, so I don't foresee huge performance issues from doing this.
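As a rough illustration of what a where-based replacement could look like, here is a minimal sketch. The helper name number_active_cells and the toy mask are made up for this example and are not taken from the iMOD Python code base. It assumes the goal is to number active cells 1..n in row-major order and leave inactive cells at 0, which matches what the boolean-mask assignment does on an in-memory NumPy array. The Dask check is optional and only runs if Dask is installed.

```python
import numpy as np


def number_active_cells(active):
    """Number active cells 1..n in row-major order; inactive cells get 0.

    Hypothetical helper for illustration only. It avoids in-place
    boolean-mask assignment and relies on cumsum, reshape and where.
    """
    # Running count of active cells along the flattened (row-major) grid.
    flat = active.ravel().astype(np.int64)
    running_count = flat.cumsum(axis=0)
    ids = running_count.reshape(active.shape)
    # Keep the running count where the cell is active, 0 elsewhere.
    return np.where(active, ids, 0)


# Toy 2D mask, standing in for something like idomain_active.data.
active = np.array([[True, False, True], [False, True, True]])

# Reference result from the eager boolean-mask assignment.
expected = np.zeros(active.shape, dtype=np.int64)
expected[active] = np.arange(1, active.sum() + 1)

result = number_active_cells(active)
assert np.array_equal(result, expected)

# Optional: repeat the check on a lazy, chunked array if Dask is available.
try:
    import dask.array as da
except ImportError:
    da = None

if da is not None:
    lazy_active = da.from_array(active, chunks=(1, 3))
    lazy_result = number_active_cells(lazy_active)
    assert np.array_equal(np.asarray(lazy_result), expected)
```

The same idea should carry over to xarray objects via their own where, but that is not shown here.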

Furthermore, we probably missed this because our test bench loads everything into memory; otherwise this bug would have surfaced earlier. Dask 2025.2.0 is in the present dev environment.

Refinement

  • In the MetaSWAP mapping derivations that use fancy indexing, use where instead.
  • Add tests in which the data is not loaded into memory and run them with Dask.
@JoerivanEngelen JoerivanEngelen added the bug Something isn't working label Feb 19, 2025
@github-project-automation github-project-automation bot moved this to 📯 New in iMOD Suite Feb 19, 2025
@JoerivanEngelen JoerivanEngelen moved this from 📯 New to 📝Refined in iMOD Suite Feb 19, 2025