Central schema yaml #89

Merged 10 commits on Dec 6, 2023
2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -74,6 +74,7 @@ jobs:

env:
DATAREG_CONFIG: "${{ github.workspace }}/config.txt"
DATAREG_BACKEND: "postgres"

# Service containers to run with `runner-job`
services:
@@ -150,6 +151,7 @@ jobs:

env:
DATAREG_CONFIG: "${{ github.workspace }}/config.txt"
DATAREG_BACKEND: "sqlite"

# Our strategy lists the OS and Python versions we want to test on.
strategy:
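
The two CI jobs above differ only in the value of `DATAREG_BACKEND` ("postgres" in one, "sqlite" in the other). This diff does not show how the test suite consumes that variable. As a minimal sketch, assuming the tests read it from the environment, a helper might look like the following; the function name `get_backend` and the `sqlite` fallback are hypothetical and not taken from the project.

```python
import os

# Backend names that appear in the CI workflow.
SUPPORTED_BACKENDS = ("postgres", "sqlite")


def get_backend(default="sqlite"):
    """Return the backend named by DATAREG_BACKEND (hypothetical helper).

    The fallback value is an assumption made for this sketch only.
    """
    backend = os.environ.get("DATAREG_BACKEND", default).strip().lower()
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unsupported DATAREG_BACKEND: {backend!r}")
    return backend
```

Rejecting unknown values early keeps a typo in the workflow file from silently running every job against the same backend.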
2 changes: 1 addition & 1 deletion .github/workflows/documentation.yml
@@ -10,7 +10,7 @@ jobs:
run: |
python -m pip install --upgrade pip
python -m pip install .
- pip install sphinx sphinx_rtd_theme sphinx_toolbox sphinxcontrib-autoprogram
+ pip install sphinx sphinx_rtd_theme sphinx_toolbox sphinxcontrib-autoprogram sphinxcontrib.datatemplates
- name: Sphinx build
run: |
sphinx-build docs/source _build
3 changes: 3 additions & 0 deletions docs/source/_static/css/custom.css
@@ -0,0 +1,3 @@
.tight-table td {
    white-space: normal !important;
}
9 changes: 8 additions & 1 deletion docs/source/conf.py
@@ -5,7 +5,8 @@
"sphinx_rtd_theme",
"sphinx.ext.autodoc",
'sphinx.ext.napoleon',
- 'sphinxcontrib.autoprogram'
+ 'sphinxcontrib.autoprogram',
+ 'sphinxcontrib.datatemplates'
]

project = 'DESC data management'
@@ -36,3 +37,9 @@
html_logo = '_static/DREGS_logo_v2.png'

autoclass_content = 'both'

templates_path = ['templates']

html_css_files = [
    'css/custom.css',
]
247 changes: 2 additions & 245 deletions docs/source/reference_schema.rst
@@ -7,248 +7,5 @@ database (e.g., the default and production schemas) follows the same structure.
.. image:: _static/schema_plot.png
:alt: Image missing

The dataset table
-----------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``dataset_id``
- Unique identifier for dataset
- int
* - ``name``
- User given name for dataset
- str
* - ``relative_path``
- Relative path storing the data, relative to `<root_dir>`
- str
* - ``version_major``
- Major version in semantic string (i.e., X.x.x)
- int
* - ``version_minor``
- Minor version in semantic string (i.e., x.X.x)
- int
* - ``version_patch``
- Patch version in semantic string (i.e., x.x.X)
- int
* - ``version_suffix``
- Optional version suffix
- str
* - ``dataset_creation_date``
- Dataset creation date
- datetime
* - ``is_archived``
- True if the data is archived, i.e., the data is no longer within `<root_dir>`
- bool
* - ``is_external_link``
- ???
- bool
* - ``is_overwritten``
- True if the original data for this dataset has been overwritten at some point. This would have required that ``is_overwritable`` was set to ``true`` on the original dataset
- bool
* - ``is_valid``
- ???
- bool
* - ``register_date``
- Date the dataset was registered
- datetime
* - ``creator_uid``
- `uid` (user id) of the person that registered the dataset
- str
* - ``access_API``
- Describes the software that can read the dataset (e.g., "gcr-catalogs", "skyCatalogs")
- str
* - ``execution_id``
- ID of execution this dataset belongs to
- int
* - ``description``
- User provided description of the dataset
- str
* - ``owner_type``
- Dataset owner type; can be "user", "group", "project" or "production".
- str
* - ``owner``
- Owner of the dataset
- str
* - ``data_org``
- Dataset organisation ("file" or "directory")
- str
* - ``nfiles``
- How many files are in the dataset
- int
* - ``total_disk_space``
- Total disk space used by the dataset
- float

The dataset_alias table
-----------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``dataset_alias_id``
- Unique identifier for alias
- int
* - ``name``
- User given alias name
- str
* - ``dataset_id``
- ID of dataset this is an alias for
- int
* - ``supersede_date``
- If a new entry has been added to the table with the same alias name (but
different dataset_id), the old entry will be superseded. ``supersede_date``
in the old entry tracks when this happened. If the entry has not been
superseded, ``supersede_date`` will be None
- datetime
* - ``register_date``
- Date the dataset was registered
- datetime
* - ``creator_uid``
- `uid` (user id) of the person that registered the dataset
- str

The dependency table
--------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``dependency_id``
- Unique identifier for dependency
- int
* - ``execution_id``
- Execution this dependency is linked to
- int
* - ``input_id``
- Dataset ID of the dependent dataset
- int
* - ``register_date``
- Date the dependency was registered
- datetime

The execution table
-------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``execution_id``
- Unique identifier for execution
- int
* - ``description``
- User given description of execution
- str
* - ``name``
- User given execution name
- str
* - ``register_date``
- Date the execution was registered
- datetime
* - ``execution_start``
- Date the execution started
- datetime
* - ``locale``
- Locale of execution (e.g., NERSC)
- str
* - ``configuration``
- Path to configuration file of execution
- str
* - ``creator_uid``
- `uid` (user id) of the person that registered the dataset
- str

The execution_alias table
-------------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``execution_alias_id``
- Unique identifier for execution alias
- int
* - ``execution_id``
- Execution this alias is linked to
- int
* - ``alias``
- User given execution alias name
- str
* - ``register_date``
- Date the execution was registered
- datetime
* - ``supersede_date``
- If a new entry has been added to the table with the same alias name (but
different execution_id), the old entry will be superseded. ``supersede_date``
in the old entry tracks when this happened. If the entry has not been
superseded, ``supersede_date`` will be None
- datetime
* - ``creator_uid``
- `uid` (user id) of the person that registered the dataset
- str

The provenance table
--------------------

.. list-table::
:header-rows: 1

* - row
- description
- type
* - ``provenance_id``
- Unique identifier for provenance
- int
* - ``code_version_major``
- Major version of code when this schema was created
- int
* - ``code_version_minor``
- Minor version of code when this schema was created
- int
* - ``code_version_patch``
- Patch version of code when this schema was created
- int
* - ``code_version_suffix``
- Version suffix of code when this schema was created
- str
* - ``db_version_major``
- Major version of database
- int
* - ``db_version_minor``
- Minor version of database
- int
* - ``db_version_patch``
- Patch version of database
- int
* - ``git_hash``
- Git commit hash when this schema was created
- str
* - ``repo_is_clean``
- Was repository clean when this schema was created
- bool
* - ``update_method``
- "CREATE", "MODIFY" or "MIGRATE"
- str
* - ``schema_enabled_date``
- When was the schema enabled
- datetime
* - ``creator_uid``
- `uid` (user id) of the person that registered the schema
- str
* - ``comment``
- Any comment
- str
.. datatemplate:yaml:: ../../src/dataregistry/schema/schema.yaml
   :template: schema_table.tmpl
22 changes: 22 additions & 0 deletions docs/source/templates/schema_table.tmpl
@@ -0,0 +1,22 @@
.. -*- mode: rst -*-

{% for table in ['execution','provenance','execution_alias','dataset','dependency','dataset_alias'] %}

The {{table}} table
----------------------------------------

.. list-table::
   :header-rows: 1
   :class: tight-table

   * - row
     - description
     - type

   {% for item in data[table] %}
   * - {{item}}
   {% for item2 in ['description', 'type'] %}
     - {{data[table][item][item2]}}
   {% endfor %}
   {% endfor %}
{% endfor %}
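
The template walks a fixed list of table names and, for each column in `data[table]`, emits one `list-table` row holding the column name, its `description` and its `type`. The same loops can be sketched in plain Python to see what reST comes out. The function name `render_schema_tables` and the sample dictionary are hypothetical; the real `schema.yaml` is not part of this diff, and the sketch sizes the underline to the title rather than using the template's fixed-width rule.

```python
def render_schema_tables(data, tables):
    """Emit one reST list-table per schema table, mirroring the template loops."""
    lines = []
    for table in tables:
        title = f"The {table} table"
        lines += [
            title,
            "-" * len(title),
            "",
            ".. list-table::",
            "   :header-rows: 1",
            "   :class: tight-table",
            "",
            "   * - row",
            "     - description",
            "     - type",
        ]
        # One row per column: name first, then description and type.
        for column, spec in data[table].items():
            lines.append(f"   * - {column}")
            for key in ("description", "type"):
                lines.append(f"     - {spec[key]}")
        lines.append("")
    return "\n".join(lines)


# Hypothetical excerpt shaped like data[table][column][field].
example = {
    "dependency": {
        "dependency_id": {
            "description": "Unique identifier for dependency",
            "type": "int",
        },
        "input_id": {
            "description": "Dataset ID of the dependent dataset",
            "type": "int",
        },
    }
}

print(render_schema_tables(example, ["dependency"]))
```

Because the rows come from the data file, a column added to the schema shows up in the documentation without editing `reference_schema.rst` by hand.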
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -38,4 +38,4 @@ where = ["src"]
dregs = "cli.cli:main"

[tool.setuptools.package-data]
- "dataregistry" = ["site_config/site_rootdir.yaml"]
+ "dataregistry" = ["site_config/site_rootdir.yaml", "schema/schema.yaml"]
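
Listing `schema/schema.yaml` under `[tool.setuptools.package-data]` means the file is shipped inside the installed `dataregistry` package. How the project itself opens the file is not shown in this diff. One common way to locate packaged data, given here only as a sketch with a hypothetical helper name, is `importlib.resources.files` (Python 3.9 or later):

```python
from importlib.resources import files


def packaged_file(package, relative_path):
    """Return a handle to a data file shipped inside an installed package.

    Hypothetical helper; the file is not opened until it is read.
    """
    return files(package).joinpath(relative_path)


# Assumes dataregistry is installed with the package-data entry above:
# schema_text = packaged_file("dataregistry", "schema/schema.yaml").read_text()
```

Resolving the path through the package avoids depending on the current working directory or on the layout of a source checkout.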