Check indexing arg types #1293
Comments
My main concern here is whether doing
It does not, which kind of defeats the optimization. This now needs doing. Confirmed by adding a
import anndata as ad, numpy as np, zarr
from scipy import sparse
rng = np.random.default_rng()
g = zarr.open()
X = sparse.random(10_000, 1_000, density=0.01, format="csr", random_state=rng)
ad.experimental.write_elem(g, "X", X, dataset_kwargs={"chunks": 1_000})
X_backed = ad.experimental.sparse_dataset(g["X"])
bool_idx = np.zeros(X.shape[0], dtype=bool)
bool_idx[5000:] = True
bool_idx[:1000] = True
adata = ad.AnnData(X=X_backed)
adata[bool_idx].X
slight aside:
@ilan-gold what’s missing here?
Closing this as it is no longer needed. I've gone through the remaining instances and haven't found any that affect the performance optimizations.
Please describe your wishes and possible alternatives to achieve the desired result.
Working on #1224, we discovered that np.where is called internally by scipy on boolean masks, but there are places in our code where we also call it, likely unnecessarily. It would be good to review these places to see whether they are still necessary, and then see whether removing them has an effect on performance.
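To illustrate the redundancy described here, this is a minimal in-memory sketch. It is not the backed (zarr) code path from the comments, and the matrix size, density, and mask are made up for illustration. It only checks that indexing a scipy CSR matrix with a boolean mask gives the same result as first converting the mask to integer positions with np.where.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
# Illustrative in-memory matrix; sizes are arbitrary assumptions.
X = sparse.random(1_000, 100, density=0.01, format="csr", random_state=rng)

# Boolean row mask selecting two disjoint blocks of rows.
mask = np.zeros(X.shape[0], dtype=bool)
mask[:100] = True
mask[500:] = True

# Index with the boolean mask directly.
by_mask = X[mask]
# Convert to integer positions first (the extra step the issue questions).
by_where = X[np.where(mask)[0]]

# Both routes select the same rows with the same values.
assert by_mask.shape == by_where.shape == (int(mask.sum()), X.shape[1])
assert (by_mask != by_where).nnz == 0
```

Equal results say nothing about speed. Whether dropping the explicit np.where call helps would still need to be timed on the backed code paths in question.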