Approach to sharing metadata alongside files for reuse #153
Comments
Current thoughts on this:
+1 to this. This is flexible enough to include multiple data modalities if needed.
I would recommend considering the following framework: all metadata from the files in every dataset of a collection "rolls up" to the top-level collection. This would work like the "Add all Annotations" function that currently exists in the fileview web UI, but it would need to be implemented for collections. The metadata file made available for the collection would then contain all metadata attributes in one CSV file.
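To make the roll-up idea concrete, here is a minimal sketch in plain Python. It does not use the Synapse client or any existing collection API; the function names, the `dataset` column, and the sample annotations are all hypothetical and only illustrate how annotations from several datasets could be merged into one CSV whose columns are the union of all attributes.

```python
import csv
import io


def roll_up_annotations(datasets):
    """Combine file annotations from every dataset into one list of rows.

    `datasets` maps a dataset name to a list of annotation dicts, one per file.
    Returns the union of attribute names (first-seen order) and the rows.
    """
    rows = []
    columns = []
    for dataset_name, files in datasets.items():
        for annotations in files:
            row = {"dataset": dataset_name, **annotations}
            for key in row:
                if key not in columns:
                    columns.append(key)
            rows.append(row)
    return columns, rows


def to_csv(columns, rows):
    """Write the rolled-up rows as CSV text; missing attributes stay blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical collection with two datasets of different modalities.
collection = {
    "rnaseq_dataset": [{"file_id": "file-1", "assay": "RNA-seq"}],
    "imaging_dataset": [{"file_id": "file-2", "assay": "GeoMx", "aoi": "AOI-7"}],
}
columns, rows = roll_up_annotations(collection)
collection_csv = to_csv(columns, rows)
```

Attributes that only exist for one modality (here `aoi`) simply stay empty for the other files, which is what lets a single file cover multiple data modalities.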
This makes sense and would be a nice feature to have! The problem that I think we need to address is how to surface record-based metadata that we want to share alongside files. Metadata types like Biospecimen and Individual are uploaded via their own manifests and stored in their own tables - they aren't tied to files or applied as file annotations. That way, we can have a database of the record-based info, which can be searched and associated with files, as needed. To associate files with this metadata, there is a system of primary and foreign keys that can be used to refer to the relevant entries. Considering this comment:
I have considered using Datasets/Collections to surface this info as annotations, and this concept actually makes that seem like the better idea. We can extract record-based metadata entries from the source table and apply them to the files in a Dataset, based on foreign key attributes in the metadata. Even if this functionality is currently exclusive to Datasets, having a single manifest with the file, assay, and specimen info seems like a convenient way to share metadata.
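As a rough illustration of the foreign-key approach, the sketch below joins file rows to record rows in plain Python. The column names (`specimen_id`, `tissue`, and so on), the sample data, and the helper `attach_record_metadata` are assumptions made for the example, not the actual schema or an existing function.

```python
def attach_record_metadata(files, records, foreign_key, primary_key):
    """Return one manifest row per file, enriched with its matching record.

    Files whose foreign key has no match in `records` are kept unchanged.
    """
    index = {record[primary_key]: record for record in records}
    manifest = []
    for file_row in files:
        record = index.get(file_row.get(foreign_key), {})
        # File annotations win on a name clash, so existing values are not lost.
        manifest.append({**record, **file_row})
    return manifest


# Hypothetical Biospecimen records, as they might be read from the source table.
biospecimens = [
    {"specimen_id": "S1", "tissue": "lung", "individual_id": "P1"},
    {"specimen_id": "S2", "tissue": "skin", "individual_id": "P2"},
]
# Hypothetical files in a Dataset; `specimen_id` acts as the foreign key.
dataset_files = [
    {"file_id": "file-1", "assay": "RNA-seq", "specimen_id": "S1"},
    {"file_id": "file-2", "assay": "RNA-seq", "specimen_id": "S9"},
]
manifest = attach_record_metadata(
    dataset_files, biospecimens, "specimen_id", "specimen_id"
)
```

The same function could be applied a second time with `individual_id` as the key to pull in Individual records, which is one way a single manifest could carry file, assay, and specimen info together.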
Issue: files can be added to Synapse Datasets, but the only metadata we can directly add to the table is file annotations. We should consider how we want to approach sharing experiment-related record metadata (Biospecimen, Model, Individual, GeoMx AOI info, etc.).
Tables can easily be subset or queried to extract experiment-specific information. However, if we are going to use table subsets to derive experiment-specific record metadata that can be packaged with files for download, what is the best way to go about this?
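One possible shape for the table-subset step, sketched in plain Python: collect the keys that an experiment's files refer to, then keep only the matching rows of the record table. The names `subset_records`, `individual_id`, and the sample rows are hypothetical, and this says nothing about how the subset would actually be stored or packaged in Synapse.

```python
def subset_records(records, files, foreign_key, primary_key):
    """Keep only the records referenced by at least one of the given files."""
    wanted = {
        file_row[foreign_key]
        for file_row in files
        if file_row.get(foreign_key) is not None
    }
    return [record for record in records if record[primary_key] in wanted]


# Hypothetical Individual table covering several experiments.
individuals = [
    {"individual_id": "P1", "species": "human"},
    {"individual_id": "P2", "species": "human"},
    {"individual_id": "P3", "species": "mouse"},
]
# Hypothetical files belonging to a single experiment.
experiment_files = [
    {"file_id": "file-1", "individual_id": "P1"},
    {"file_id": "file-2", "individual_id": "P3"},
    {"file_id": "file-3", "individual_id": "P1"},
]
experiment_individuals = subset_records(
    individuals, experiment_files, "individual_id", "individual_id"
)
```

The resulting subset could be written out as its own record manifest and shipped next to the files, as an alternative to flattening everything into file annotations.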