Creation of object-defined feature tables #27

Closed
gusqgm opened this issue Apr 25, 2022 · 6 comments

@gusqgm
Collaborator

gusqgm commented Apr 25, 2022

Once label maps are created, we would like to calculate object-based features and store them for further analysis. The challenge here is to create user-friendly tables that can store the features of each object, or of objects nested within objects.

It seems like there is some work being done towards integrating AnnData into napari, which could be promising, as it may be able to deal with large tables. Another potentially interesting approach would be to look at awkward arrays, which support variable-sized nested arrays using numpy-like functions. This would work well for an experiment where each parent object contains a different number of sub-objects. I have not found any documentation about reading them in parallel yet. @jluethi have you ever heard of or tried any of this?
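
To make the "variable-sized nested arrays" idea concrete, here is a minimal sketch of a ragged parent/sub-object layout stored as one flat array plus offsets. It uses plain NumPy rather than the awkward library itself, and the data (sub-object areas, three parents) is hypothetical. It only illustrates the kind of structure such a library would manage for us.

```python
import numpy as np

# Hypothetical data: areas of sub-objects, stored flat, grouped by parent.
# Parent 0 has 3 sub-objects, parent 1 has none, parent 2 has 2.
content = np.array([10.0, 12.0, 9.0, 20.0, 22.0])
offsets = np.array([0, 3, 3, 5])  # parent i owns content[offsets[i]:offsets[i + 1]]

# Number of sub-objects per parent.
counts = np.diff(offsets)

# Per-parent sum of areas; bincount handles parents without sub-objects.
parent_index = np.repeat(np.arange(len(counts)), counts)
sums = np.bincount(parent_index, weights=content, minlength=len(counts))


def children_of(parent):
    """Return the sub-object values that belong to one parent."""
    return content[offsets[parent]:offsets[parent + 1]]
```

With this layout, a per-parent reduction never needs padding, which is the main appeal when parents have very different numbers of sub-objects.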

@gusqgm
Collaborator Author

gusqgm commented Apr 25, 2022

The minimal version of such a table would exist before feature extraction, just saving the object labels along with their masks. One idea would be to save the coordinates of e.g. the top-left corner of each object's bounding box, along with the box dimensions and a binary array that contains the mask of the object itself.

Later on, we can populate this label index with more feature columns.
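
A minimal sketch of such a label index, assuming a 2D integer label image where 0 is background. The function name `label_index` and the record keys are made up for illustration and are not part of any existing library.

```python
import numpy as np


def label_index(label_image):
    """Return one record per object: label, bounding box and binary mask."""
    records = []
    for label in np.unique(label_image):
        if label == 0:  # assumption: 0 marks background
            continue
        rows, cols = np.nonzero(label_image == label)
        top, left = int(rows.min()), int(cols.min())
        height = int(rows.max()) - top + 1
        width = int(cols.max()) - left + 1
        # Binary mask of this object, cropped to its bounding box.
        mask = label_image[top:top + height, left:left + width] == label
        records.append({
            "label": int(label),
            "top": top,
            "left": left,
            "height": height,
            "width": width,
            "mask": mask,
        })
    return records


# Tiny hypothetical label image with two objects.
label_image = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 2],
])
index = label_index(label_image)
```

Feature columns could later be added as extra keys on each record, or the records could be converted into a table keyed by `label`.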

@jluethi
Collaborator

jluethi commented Jun 8, 2022

We're currently testing the proposed integration of AnnData into the OME-NGFF spec as a solution to this problem (see here). It looks promising from a conceptual perspective.

@jluethi
Collaborator

jluethi commented Jun 15, 2022

I have been testing this prototype repository more and it's looking promising.

I tested it on the single-well, 2x2 test case. The prototype repository also supports writing label images, so I wrote both my label images and the actual feature measurements for those FOVs to the OME-Zarr file.

In this setting, the feature data is saved per field of view, next to the corresponding label images.

The structure then looks like this:

.
├── 20200812-CardiomyocyteDifferentiation14-Cycle1.zarr
│   └── B
│       └── 03
│           ├── 0
│           │   ├── 0
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 1
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 2
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 3
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 4
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── labels
│           │   │   └── label_image
│           │   └── tables
│           │       └── regions_table
│           ├── 1
│           │   ├── 0
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 1
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 2
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 3
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 4
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── labels
│           │   │   └── label_image
│           │   └── tables
│           │       └── regions_table
│           ├── 2
│           │   ├── 0
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 1
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 2
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 3
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── 4
│           │   │   ├── 0
│           │   │   ├── 1
│           │   │   └── 2
│           │   ├── labels
│           │   │   └── label_image
│           │   └── tables
│           │       └── regions_table
│           └── 3
│               ├── 0
│               │   ├── 0
│               │   ├── 1
│               │   └── 2
│               ├── 1
│               │   ├── 0
│               │   ├── 1
│               │   └── 2
│               ├── 2
│               │   ├── 0
│               │   ├── 1
│               │   └── 2
│               ├── 3
│               │   ├── 0
│               │   ├── 1
│               │   └── 2
│               ├── 4
│               │   ├── 0
│               │   ├── 1
│               │   └── 2
│               ├── labels
│               │   └── label_image
│               └── tables
│                   └── regions_table
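
The layout above can be addressed with a small path helper. This is only a sketch: `fov_paths` is a hypothetical function, and the default names `label_image` and `regions_table` are simply the ones that appear in this tree.

```python
from pathlib import PurePosixPath


def fov_paths(plate, row, column, fov,
              label_name="label_image", table_name="regions_table"):
    """Build the group paths for one field of view, following the tree above."""
    fov_root = PurePosixPath(plate) / row / column / str(fov)
    return {
        "image": fov_root,
        "labels": fov_root / "labels" / label_name,
        "table": fov_root / "tables" / table_name,
    }


plate = "20200812-CardiomyocyteDifferentiation14-Cycle1.zarr"
# The 2x2 test case has four fields of view, numbered 0 to 3.
all_paths = [fov_paths(plate, "B", "03", fov) for fov in range(4)]
```

Because labels and tables sit under the same field-of-view group, a table can always be matched to its label image by path alone.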

And the regions_table folder looks something like this:

│               └── tables
│                   └── regions_table
│                       ├── X
│                       ├── layers
│                       ├── obs
│                       ├── obsm
│                       ├── obsp
│                       ├── uns
│                       ├── var
│                       ├── varm
│                       └── varp
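
These subfolders mirror the AnnData slots: `X` is the observations-by-variables matrix, `obs` and `var` hold per-row and per-column annotations, and the remaining slots hold additional layers, multi-dimensional and pairwise annotations, and unstructured metadata. The sketch below shows how a feature table could map onto the three core slots. It is a plain NumPy and dict stand-in, not the anndata API, and the feature names, values and the choice to store labels in `obs` are assumptions.

```python
import numpy as np

# Hypothetical measurements for three labelled objects in one field of view.
labels = np.array([1, 2, 5])                # label values found in label_image
feature_names = ["area", "mean_intensity"]  # hypothetical feature names
X = np.array([
    [120.0, 0.31],
    [98.0, 0.47],
    [143.0, 0.29],
])                                          # shape (n_objects, n_features)

# Stand-in for the core AnnData slots (NOT the anndata API).
regions_table = {
    "X": X,                             # one row per object, one column per feature
    "obs": {"label": labels},           # per-object annotations
    "var": {"feature": feature_names},  # per-feature annotations
}


def feature_of(table, label, feature):
    """Look up one feature value for one labelled object."""
    row = int(np.flatnonzero(table["obs"]["label"] == label)[0])
    col = table["var"]["feature"].index(feature)
    return float(table["X"][row, col])
```

Keeping the label value as an `obs` column is what ties each row of `X` back to an object in the label image.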

@jluethi
Collaborator

jluethi commented Jun 16, 2022

Another solution to this problem that we should keep an eye on is the OMERO Plus implementation of features using PyTables & HDF5. See here for some details: https://www.glencoesoftware.com/blog/2022/04/01/Beyond-images-with-OME-NGFF.html
I can't find out much about it; maybe it's just something custom for the commercial version. But it's certainly interesting for inspiration.

@jluethi
Collaborator

jluethi commented Jun 20, 2022

I had a follow-up meeting with Kevin today. I think the AnnData approach is looking promising. The proposal to include it in OME-NGFF should be discussed in July, and we already have a working Python implementation in this example repo: https://github.com/kevinyamauchi/ome-ngff-tables-prototype

I have tested this on our UZH 2x2 test case and saved the labels & features. It looks promising. I will contribute this test case to the example repo, so that they are also aware of the HCS setups.

My main takeaways from the meeting:

  1. AnnData in OME-NGFF is being pushed soon. Promising for us to start using as our feature store. We should know by end of July if anyone is blocking AnnData in OME-NGFF or whether it will proceed.
  2. They are planning to look into ways of saving points & polygons with it as well, maybe also regions of interest.
  3. After getting the spec approved, they are aiming to build a separate spatial data library that builds on AnnData in OME-Zarr. This could tackle things like merging data modalities, but also spatial indexing via R-trees or KD-trees (=> better lazy loading)
  4. They aren't familiar with the HCS spec. The HCS spec also seems to be an early version, so maybe it will change again?
  5. Regarding hierarchies, Kevin recommended using "parent/child" feature columns for objects (e.g. each cell has an organoid column with an organoid identifier, either a unique identifier or a label). Especially if the relationships are only a few levels deep, this may be the simplest way to achieve this.
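
The "parent/child" column idea from the last point can be sketched in a few lines of plain Python. The cell records, the `organoid` column name and the `area` feature are hypothetical and only there to show how a flat table still supports per-parent aggregation.

```python
from collections import defaultdict

# Hypothetical cell table: each row carries the label of its parent organoid.
cells = [
    {"label": 1, "organoid": 10, "area": 120.0},
    {"label": 2, "organoid": 10, "area": 98.0},
    {"label": 3, "organoid": 11, "area": 143.0},
]

# Group child rows by their parent identifier.
children = defaultdict(list)
for cell in cells:
    children[cell["organoid"]].append(cell)

# Aggregate child features up to the parent level.
per_organoid = {
    organoid: {
        "n_cells": len(group),
        "total_area": sum(c["area"] for c in group),
    }
    for organoid, group in children.items()
}
```

A deeper hierarchy would just add one more parent column per level, which stays manageable as long as there are only a few levels.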

@tcompa tcompa closed this as completed Jul 20, 2022
Repository owner moved this from TODO to Done in Fractal Project Management Jul 20, 2022
@jluethi jluethi moved this from Done to Done Archive in Fractal Project Management Oct 5, 2022