-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Files Module and Example #29
Merged
Merged
Changes from 9 commits
Commits
Show all changes
13 commits
Select commit
Hold shift + click to select a range
594ad1b
initial commit
a-l-holmes 270b353
:tada: create logical model and part of file profile
a-l-holmes 514e2d6
":sparkles: introduce EDAM code system and write complex extension fo…
a-l-holmes 241af29
":sparkles: introduce EDAM code system and write complex extension fo…
a-l-holmes 593e4ca
:sparkles: finish example, profile, introduce info page
a-l-holmes 81baee1
Merge branch 'files_module' of https://github.com/NIH-NCPI/ncpi-fhir-…
a-l-holmes 677981a
:recycle: restore completeness after making push/pull mistake
a-l-holmes a742247
:recycle: make EDAM for file format a value set
a-l-holmes cd3a666
:memo: added information page about logical model
a-l-holmes b48abde
:adhesive_bandage: correct intro table and redeploy IG preview
a-l-holmes e7f36eb
Merge branch 'main' into files_module
a-l-holmes f4de841
:memo: correct file name inconsistency and add notation about File Me…
a-l-holmes f12a301
Merge branch 'files_module' of https://github.com/NIH-NCPI/ncpi-fhir-…
a-l-holmes File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
Instance: PT-006SP660 | ||
InstanceOf: NcpiFile | ||
Title: "Example file based on CBTN" | ||
Usage: #example | ||
Description: "Use case of file information from CBTN" | ||
* identifier.value = "PT_006SP660" | ||
* subject = Reference(GF_6BAD9S7D) | ||
* description = "Annotated Variant Call" | ||
* type = $edam#operation_3227 "Variant calling" | ||
* status = #current | ||
* content.attachment.url = "s3://kf-strides-study-us-east-1-prd-sd-54g4wg4r/harmonized-data/family-variants/155bb529-2e7b-474f-ba24-cd0656d5f3d0.CGP.filtered.deNovo.vep.vcf.gz" | ||
* extension[location-access].valueReference = Reference(NcpiResearchAccessPolicy.accessType) | ||
* extension[file-format].valueCodeableConcept.coding = $edam#format_3016 "VCF" | ||
* extension[file-size] | ||
* valueQuantity | ||
* value = 1044770380 | ||
* unit = "bytes" | ||
* extension[hash] | ||
* extension[hash-value].valueString = "8f107912d862cf91fbfb77bf9c1bab36-4" | ||
* extension[hash-type].valueCode = #etag |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,142 @@ | ||
/* | ||
Files Module profiles and logical Model | ||
*/ | ||
|
||
Logical: CdmFile | ||
Id: SharedDataModelFile | ||
Title: "Shared Data Model for File" | ||
Description: "The **Shared Data Model for File**" | ||
* participantID 1..1 reference "The participant(s) for whom this file contains data" | ||
* fileExternalID 0..1 string "A related identifier of this file" | ||
* format 1..1 code "The file format used" | ||
* location 1..* List "List of locations where this data can be accessed" | ||
* location.uri 1..1 uri "The URI at which this data can be accessed" | ||
* location.accessPolicy 0..* reference "If present, only those under the specific Access Policy can access the file in this location." | ||
* fileSize 1..1 Quantity "The size of the file, e.g., in bytes." | ||
* hash 0..* List "Provides a list of hashes for confirming file transfers" | ||
* hash.type 0..1 code "Algorithm used to calculate the hash (and size, where applicable)" | ||
* hash.value 1..1 string "Value of hashing the file" | ||
* contentVersion 0..1 string "Version of the file content" | ||
* description 0..1 string "A description of the file" | ||
* type 1..1 code "The type of data contained in this file. Should be as detailed as possible, e.g., Whole Exome Variant Calls." | ||
* relatedFile 0..1 List "Provides a reference to another file that is related to this one" | ||
* relatedFile.file 0..1 reference "The file to which this related file is related" | ||
* relatedFile.type 0..1 code "The relationship of the file to the parent file in reference" | ||
|
||
/* TODO Add Related file to metadata - AH 2024-07-30 */ | ||
|
||
CodeSystem: HashTypeCS | ||
Id: example-hash-type-code-system | ||
Title: "Hash Types Code System" | ||
Description: "Algorithm used to calculate the hash (and size, where applicable)" | ||
* #md5 "md5 hash type" | ||
* #sha256 "sha256 hash type" | ||
* #sha512 "sha512 hash type" | ||
* #sha1 "sha1 hash type" | ||
* #crc32 "crc32 hash type" | ||
* #crc32c "crc32c hash type" | ||
* #etag "etag hash type" | ||
|
||
CodeSystem: RelatedFileTypeCS | ||
Id: related-file-type-code-system | ||
Title: "Related File Type Code System" | ||
Description: "Explains the relationship of this file to the file of reference" | ||
* #index_of "Index of" | ||
* #has_index "Has index" | ||
* #data_dictionary_of "Data dictionary of" | ||
* #has_data_dictionary "Has data dictionary" | ||
* #plink-type-associated-files "Plink-type associated files" | ||
|
||
ValueSet: EDAMOntologyTerms | ||
Id: edam-ontology-terms | ||
Title: "Enumerations for the EDAM ontology" | ||
Description: "Enumerations for the EDAM ontology" | ||
* include codes from system $edam | ||
|
||
Extension: FileFormat | ||
Id: file-format | ||
Title: "The file format used" | ||
Description: "The file format used" | ||
* insert SetContext(DocumentReference) | ||
* value[x] only CodeableConcept | ||
* valueCodeableConcept from $edam (extensible) | ||
|
||
Extension: LocationAccess | ||
Id: location-access | ||
Title: "If present, only those under the specific Access Policy can access the file in this location." | ||
Description: "If present, only those under the specific Access Policy can access the file in this location." | ||
* insert SetContext(DocumentReference.content) | ||
* value[x] only Reference | ||
* valueReference ^short = "If present, only those under the specific Access Policy can access the file in this location." | ||
|
||
Extension: FileSize | ||
Id: file-size | ||
Title: "The size of the file, e.g., in bytes." | ||
Description: "The size of the file, e.g., in bytes." | ||
* insert SetContext(DocumentReference) | ||
* value[x] only Quantity | ||
* valueQuantity ^short = "Indicate the size of the file in reference" | ||
|
||
Extension: ContentVersion | ||
Id: content-version | ||
Title: "Version of the contents of the file" | ||
Description: "Version of the contents of the file" | ||
* insert SetContext(DocumentReference) | ||
* value[x] only string | ||
* valueString ^short = "Indicate the version (e.g., V1) for the contents of this file" | ||
|
||
Extension: HashValue | ||
Id: hash-value | ||
Title: "Value of hashing the file" | ||
Description: "Value of hashing the file" | ||
* insert SetContext(DocumentReference.extension) | ||
* value[x] only string | ||
* valueString ^short = "Value of hashing the file" | ||
|
||
Extension: HashType | ||
Id: hash-type | ||
Title: "Algorithm used to calculate the hash (and size, where applicable)" | ||
Description: "Algorithm used to calculate the hash (and size, where applicable)" | ||
* insert SetContext(DocumentReference.extension) | ||
* value[x] only code | ||
* valueCode ^short = "Algorithm used to calculate the hash (and size, where applicable)" | ||
|
||
Extension: HashExtension | ||
Id: hash-extension | ||
Title: "Provides a list of hashes for confirming file transfers" | ||
Description: "Provides a list of hashes for confirming file transfers" | ||
* insert SetContext(List) | ||
* extension contains HashValue named hash-value 1..1 | ||
* extension[hash-value] ^short = "Value of hashing the file" | ||
* extension contains HashType named hash-type 1..1 | ||
* extension[hash-type] ^short = "Algorithm used to calculate the hash (and size, where applicable)" | ||
|
||
/** TODO Add Related file to metadata - AH 2024-07-30 */ | ||
|
||
Profile: NcpiFile | ||
Parent: DocumentReference | ||
Id: ncpi-file | ||
Title: "NCPI File" | ||
Description: "Information about a file related to a research participant" | ||
* ^version = "0.0.1" | ||
* ^status = #draft | ||
* identifier 0..* /*File External ID*/ | ||
* identifier ^short = "A related external file ID" | ||
* subject 0..1 /*Participant*/ | ||
* subject ^short = "The participant(s) for whom this file contains data (i.e., ParticipantID)" | ||
* extension contains FileFormat named file-format 1..1 /*File Format*/ | ||
* extension[file-format] ^short = "The file format used (EDAM is preferred)" | ||
* content.attachment.url 1..1 /*Location uri*/ | ||
* content.attachment.url ^short = "The URI at which this data can be accessed" | ||
* extension contains LocationAccess named location-access 0..* /*Location Access Policy*/ | ||
* extension[location-access] ^short = "If present, only those under the specific Access Policy can access the file in this location." | ||
* extension contains FileSize named file-size 1..1 /*File Size*/ | ||
* extension[file-size] ^short = "Indicate the size of the file in reference" | ||
* extension contains HashExtension named hash 0..* /*Hash (contains type and value)*/ | ||
* extension contains ContentVersion named content-version 0..1 /*Content Version*/ | ||
* extension[content-version] ^short = "The version of the content in the file" | ||
* description 0..1 /*Description*/ | ||
* description ^short = "A description of the file" | ||
* type 0..1 /*File Type*/ | ||
* type from edam-ontology-terms (extensible) | ||
* type ^short = "The type of data contained in this file." |
11 changes: 11 additions & 0 deletions
11
input/pagecontent/StructureDefinition-SharedDataModelResearchFile-intro.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
### NCPI File | ||
#### Introduction | ||
Files are a common research product. In this straightforward representation, we provide basic details of the file and how to access it. Details about what is contained in the file or how the content was generated should be described with other entities, such as data dictionaries, summaries, or assays. | ||
|
||
#### File Definitions | ||
File contains basic file metadata about the file location and contents. Files are typically associated with one or more participants, though they can also include general study documents. The file content may have different access control restrictions when compared to this entity, which is only the file metadata. | ||
|
||
There can be multiple file location references, for example DRS and cloud storage references, though the access approaches for those locations should be reasonably apparent through the Access Policy for the file content. | ||
|
||
#### Example | ||
If a data file is ONLY accessible through DRS, the underlying bucket locations should not be included here as no user would be able to access them directly. However, if there are multiple Access Policies that provide routes to access the data through different URIs, those can be included. Controlled access release via DRS with a consortium access model permitting direct bucket access could both be stated here to permit consistent reference to the File irrespective of the access mechanism. |
torstees marked this conversation as resolved.
Show resolved
Hide resolved
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
#### Key Guidelines | ||
The NCPI File profile is based on the standard resource type, [DocumentReference](https://hl7.org/fhir/r4/documentreference.html) and is intended to represent the files associated with a participant in a research study. | ||
|
||
##### Added Profile Restrictions | ||
In order to ensure that our resources are interoperable across studies, we have employed a number of restrictions that should make consuming Patient resources more consistent. | ||
|
||
|
||
* participantID **should** be a globally unique identifier associated with the patient. This practice is intended to make constructing queries for the same patient compatible across different servers (such as QA vs PROD) but also to make the resource URLs more meaningful. | ||
|
||
* fileExternalID **should** have all appropriate Identifiers with a meaningful system/value pair. Such identifiers may include DbGAP accession IDs, global and external IDs, etc. | ||
|
||
* format and relatedFile.type **should** use [EDAM](https://edamontology.org/) terminology (i.e., codes) when available. Othe file type code systems are allowed if a suitable EDAM code does not exist. | ||
|
||
|
||
#### Recommended Practices | ||
TDOD: Write Recommended Practices | ||
|
||
##### FHIR Mappings | ||
The following fields from the shared data model are to be mapped to the NCPI File as shown below: | ||
|
||
| **Logical Model Property** | **Cardinality** | **NCPI File Mapping** | **Usage Guidance** | **Notes**| | ||
participantID|1..1|VSReference(5.1.0)|The participant(s) for whom this file contains data| | ||
fileExternalID|0..1|string|A related identifier of this file| | ||
format|1..1|code|The file format used| | ||
location|1..*|List|List of locations where this data can be accessed| | ||
location.uri|1..1|uri|The URI at which this data can be accessed| | ||
location.accessPolicy|0..*|VSReference(5.1.0)|If present, only those under the specific Access Policy can access the file in this location.| | ||
fileSize|1..1|Quantity|The size of the file, e.g., in bytes.| | ||
hash|0..*|List|Provides a list of hashes for confirming file transfers| | ||
hash.type|0..1|code|Algorithm used to calculate the hash (and size, where applicable)| | ||
hash.value|1..1|string|Value of hashing the file| | ||
contentVersion|0..1|string|Version of the file content| | ||
description|0..1|string|A description of the file| | ||
type|1..1|code|The type of data contained in this file. Should be as detailed as possible, e.g., Whole Exome Variant Calls.| | ||
relatedFile|0..1|List|Provides a reference to another file that is related to this one| | ||
relatedFile.file|0..1|VSReference(5.1.0)|The file to which this related file is related| | ||
relatedFile.type|0..1|code|The relationship of the file to the parent file in reference| |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filename here is slightly different from what is expected by the publisher application. It should be StructureDefinition-SahredDataModelFile-intro.md based on the id of your logical model. When you change this filename, the content here should magically appear inside your logical model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch-- I made the change in the file name!