Support abstract data types that are not meant to be instantiated #44

Open
rly opened this issue Feb 12, 2025 · 8 comments
Labels
category: proposal discussion of proposed enhancements or new features

@rly
Contributor

rly commented Feb 12, 2025

It is sometimes useful to define a hierarchy of neurodata types such that a type (usually the highest level) is an abstract data type, i.e., one that is not meant to be instantiated.

Use Case 1: NeurodataWithoutBorders/nwb-schema#603 proposes creating a BaseImage neurodata type that has Image and ExternalImage as subtypes. It would be useful to allow other neurodata types to link to / include a BaseImage neurodata type. This is not just a Union of Image and ExternalImage; it would also allow new data types that extend BaseImage to be defined and placed wherever a BaseImage is allowed. A BaseImage itself should not be instantiated, though.

Use Case 2: https://github.com/catalystneuro/ndx-microscopy proposes creating an ImagingSpace neurodata type which has PlanarImagingSpace and VolumetricImagingSpace as subtypes.

LinkML supports the abstract key: https://linkml.io/linkml-model/latest/docs/abstract/

cc @bendichter

rly added the "category: proposal" label (discussion of proposed enhancements or new features) Feb 12, 2025
rly modified the milestone: 3.0.0 Feb 12, 2025
@rly
Contributor Author

rly commented Feb 12, 2025

Proposal: Recommend that abstract data type names start with the prefix "Base". Then APIs like PyNWB and MatNWB can strip the "Base" when automatically generating field names. For example, in catalystneuro/ndx-microscopy#41 (comment) we have the schema:

- neurodata_type_def: SegmentationContainer
  neurodata_type_inc: NWBDataInterface
  doc: ...
  groups:
    - neurodata_type_inc: BaseSegmentation
      doc: ...
      quantity: "+"

I think in both PyNWB and MatNWB, the autogenerated field name for the dictionary of BaseSegmentation objects in the SegmentationContainer class would be base_segmentations. I think that's confusing. It would be better if the API just used segmentations. This is ultimately an API problem, but I think the schema language can recommend naming practices that the APIs can take advantage of to build usable interfaces.
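To make the naming concern concrete, here is a minimal sketch of how an API might derive the field name with and without stripping the "Base" prefix. The helper `auto_field_name` and its `strip_base` flag are hypothetical and written only for illustration; this is not the actual PyNWB or MatNWB logic, and the pluralization is deliberately naive.

```python
import re


def auto_field_name(type_name: str, strip_base: bool = False) -> str:
    """Hypothetical helper: derive a plural snake_case field name from a type name."""
    # Strip "Base" only when it is followed by an uppercase letter, so a name
    # like "Baseline" is left untouched.
    if strip_base and re.match(r"Base[A-Z]", type_name):
        type_name = type_name[len("Base"):]
    # CamelCase -> snake_case: insert "_" before every capital except the first.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", type_name).lower()
    # Naive pluralization (assumption): just append "s".
    return snake + "s"


print(auto_field_name("BaseSegmentation"))                   # base_segmentations
print(auto_field_name("BaseSegmentation", strip_base=True))  # segmentations
print(auto_field_name("Baseline", strip_base=True))          # baselines
```

The prefix check is purely lexical, so it can only guess at intent from the name.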

@oruebel
Contributor

oruebel commented Feb 12, 2025

It seems that this is behavior that should be enforced. If that is the case, then I think making this explicit in the schema will be important, e.g., via an explicit key that defines whether a type is abstract, such as abstract: true. Naming conventions are nice, but naming conventions alone do not make intended behavior explicit. A user who wants to create an abstract class needs to find the naming conventions, and a user who doesn't want to create an abstract class may end up accidentally defining a class as abstract by using a name that starts with Base. E.g., is a type Baseline abstract? Having the naming convention is good, but I don't think it is sufficient to explicitly describe such a complex behavior.
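As a rough illustration of what enforcement via an explicit key could look like on the API side, the sketch below models type specs with an `abstract` flag. Everything here (`TypeSpec`, `SPECS`, `is_a`, `instantiate`) is a hypothetical stand-in, not the schema language or any existing HDMF/PyNWB API; the type names are taken from the use cases in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TypeSpec:
    name: str
    parent: Optional[str] = None
    abstract: bool = False  # assumed schema key, analogous to LinkML's abstract: true


# Hypothetical registry of type specs.
SPECS = {
    spec.name: spec
    for spec in [
        TypeSpec("BaseImage", abstract=True),
        TypeSpec("Image", parent="BaseImage"),
        TypeSpec("ExternalImage", parent="BaseImage"),
        TypeSpec("Baseline"),  # starts with "Base" but is not abstract
    ]
}


def is_a(type_name: str, ancestor: str) -> bool:
    """Return True if type_name is ancestor or inherits from it."""
    current: Optional[str] = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = SPECS[current].parent
    return False


def instantiate(type_name: str) -> dict:
    """Create a placeholder object, refusing types flagged as abstract."""
    spec = SPECS[type_name]
    if spec.abstract:
        raise TypeError(f"{type_name} is abstract and cannot be instantiated")
    return {"neurodata_type": type_name}


print(is_a("ExternalImage", "BaseImage"))  # True
print(instantiate("Baseline"))             # {'neurodata_type': 'Baseline'}
```

In this sketch the decision depends only on the flag, so Baseline stays instantiable while instantiate("BaseImage") raises a TypeError, and subtypes remain usable wherever a BaseImage is allowed.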

@oruebel
Contributor

oruebel commented Feb 12, 2025

LinkML also uses the abstract: true key approach. https://linkml.io/linkml/schemas/inheritance.html#abstract-classes-and-slots

@rly
Contributor Author

rly commented Feb 12, 2025

Yes, sorry, I meant to propose the abstract: true key approach like in LinkML in the original post, and on top of that, recommend the "Base" naming convention for those data types.

@oruebel
Contributor

oruebel commented Feb 12, 2025

Sounds good. Having both the abstract: true key combined with a naming convention sounds reasonable.

@bendichter
Collaborator

bendichter commented Feb 12, 2025

I like the key and the "Base" convention, but I worry a bit about the name handling, particularly on the MATLAB side. Thoughts, @ehennestad?

@rly
Contributor Author

rly commented Feb 12, 2025

MatNWB does not need to follow the same convention for generating APIs as HDMF/PyNWB (MatNWB already lacks object mappers, e.g., they have to know about the NWB general group) but I think it would be nice for users.

@rly
Contributor Author

rly commented Feb 12, 2025

An alternate approach to the naming convention for APIs is to allow the spec to make explicit what the field name should be for a collection of objects of type X. That also avoids the Potato -> "potatos" (as opposed to "potatoes") issue.
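A minimal sketch of that alternative, assuming a hypothetical `field_name` key on the group spec (no such key exists in the schema language today) and a deliberately naive fallback that lowercases the type name and appends "s":

```python
def field_name_for(group_spec: dict) -> str:
    """Hypothetical helper: prefer an explicit field name from the spec."""
    explicit = group_spec.get("field_name")  # assumed key, for illustration only
    if explicit is not None:
        return explicit
    # Naive fallback (assumption): lowercase the type name and append "s".
    return group_spec["neurodata_type_inc"].lower() + "s"


print(field_name_for({"neurodata_type_inc": "Potato", "quantity": "*"}))
# potatos
print(field_name_for({"neurodata_type_inc": "Potato", "quantity": "*",
                      "field_name": "potatoes"}))
# potatoes
```

With an explicit name in the spec, the API no longer has to guess plurals or strip prefixes.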
