Central schema yaml #89

stuartmcalpine · 2023-12-02T15:23:25Z

The schema information is replicated in three places:

- When creating the schema using the script
- Describing it in the docs
- The command line args for the CLI

So that we don't have to replicate all the entries in each of these three places, I've put the schema in a yaml file, which all three can read.

This tidies the code a lot, and when we add a new row in a table we only have to do it in one place.

This contains all the tables and row entries in a central location. Now the schema creation script, the CLI and the docs can use this, so we don't have to worry about missing somewhere when adding a new entry to a table.
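To illustrate the idea, here is a minimal sketch of one schema definition feeding more than one consumer. The dict stands in for what parsing schema.yaml might produce; its layout, the `type` key, and the helper names `column_names` and `render_docs` are assumptions made for this example, not the actual contents of the PR.

```python
# Stand-in for the parsed contents of schema.yaml (hypothetical layout).
SCHEMA = {
    "dataset": {
        "dataset_id": {"type": "Integer", "description": "Unique ID of the dataset"},
        "name": {"type": "String", "description": "User-given dataset name"},
    },
    "dataset_alias": {
        "alias": {"type": "String", "description": "Alias name"},
        "dataset_id": {"type": "Integer", "description": "Dataset this alias is linked to"},
    },
}


def column_names(schema, table):
    """Return the column names defined for `table`, in schema order."""
    return list(schema[table])


def render_docs(schema):
    """Render a plain-text description of every table, e.g. for the docs."""
    lines = []
    for table, columns in schema.items():
        lines.append(f"Table: {table}")
        for name, props in columns.items():
            lines.append(f"  {name} ({props['type']}): {props['description']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(column_names(SCHEMA, "dataset"))
    print(render_docs(SCHEMA))
```

Adding a new entry to the dict would then show up in both outputs without touching either function.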

stuartmcalpine · 2023-12-02T16:35:01Z

More CI tests with the CLI registering datasets, e.g., DateTime

JoanneBogart

Please change all those occurrences of "row" (the ones I've marked and anything similar) to "column". That's the main thing. There are also one or two very minor things.

JoanneBogart · 2023-12-04T21:08:56Z

scripts/create_registry_db.py

+schema_yaml = load_schema()
+
+
+def _get_rows(schema, table):


Name should be _get_columns, or _get_column_definitions rather than _get_rows. There are no rows here, only column names and definitions. "Rows" normally refers to rows of data within a table. It's true that the properties of a column can be thought of as a row in a table, but, unless there is some extra explanatory text, that is not the first thing people will think of.

All of the remaining review comments for create_registry_db.py make essentially the same point. I see now that the code being replaced had the same confusing (to me) nomenclature; I apologize for not catching this earlier.

I think I've captured all instances of row -> column now

scripts/create_registry_db.py

src/cli/cli.py

src/dataregistry/schema/schema.yaml

JoanneBogart · 2023-12-06T00:56:49Z

src/dataregistry/schema/schema.yaml

+    foreign_key_table: "dataset"
+    foreign_key_row: "dataset_id"
+    description: "Dataset this alias is linked to"
+


Maybe add a comment here that, for any given row, exactly one of input_id and input_production_id is non-null

In case something like this comes up again: I was actually thinking of a yaml comment preceding the table definition, e.g.
# For each row in this table, exactly one of input_id and input_production_id is non-null
but the way you've done it is perfectly ok, perhaps preferable.
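For reference, the "exactly one is non-null" rule could be checked with something like the sketch below. The function name `has_exactly_one_input` and the dict representation of a row are hypothetical; only the two column names come from the comment above.

```python
def has_exactly_one_input(row):
    """Return True if exactly one of input_id / input_production_id is non-null."""
    has_input = row.get("input_id") is not None
    has_production_input = row.get("input_production_id") is not None
    # The two flags must differ: one set, the other not.
    return has_input != has_production_input


if __name__ == "__main__":
    print(has_exactly_one_input({"input_id": 3, "input_production_id": None}))
    print(has_exactly_one_input({"input_id": 3, "input_production_id": 7}))
    print(has_exactly_one_input({"input_id": None, "input_production_id": None}))
```

Comparing against None rather than relying on truthiness matters here, since an ID of 0 would otherwise be treated as missing.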

stuartmcalpine · 2023-12-06T09:27:15Z

I've changed all the instances of row->column where needed.

I've tweaked the register CLI a bit to make sure it's now able to accept all the properties (it was missing creation date before, for example).

Also, the only other change (other than your review comments) is in the last commit, where I reduce the register doc strings (which were pretty large). Basically, any input args that are column names have a ** to refer to the docs for their description. The description of what the column represents doesn't help the code. Plus, in the spirit of this PR, it reduces the places in the code where these descriptions are duplicated.
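A rough sketch of how register arguments could be generated from the schema, so that a column such as creation date cannot be missed. This assumes an argparse-based CLI; the `cli_optional` key, the type names, and the function names are hypothetical and not taken from the PR.

```python
import argparse
from datetime import datetime

# Hypothetical parsed schema for the dataset table.
DATASET_COLUMNS = {
    "description": {"type": "String", "cli_optional": True},
    "creation_date": {"type": "DateTime", "cli_optional": True},
    "is_overwritable": {"type": "Boolean", "cli_optional": True},
    "dataset_id": {"type": "Integer", "cli_optional": False},
}

# Map schema type names to argparse converters.
_TYPE_PARSERS = {
    "String": str,
    "Integer": int,
    "DateTime": datetime.fromisoformat,
}


def add_schema_arguments(parser, columns):
    """Add one optional argument per column flagged as settable from the CLI."""
    for name, props in columns.items():
        if not props.get("cli_optional", False):
            continue
        if props["type"] == "Boolean":
            parser.add_argument(f"--{name}", action="store_true")
        else:
            parser.add_argument(
                f"--{name}", type=_TYPE_PARSERS[props["type"]], default=None
            )


def build_parser():
    """Build a register-style parser whose options come from the schema."""
    parser = argparse.ArgumentParser(prog="register-sketch")
    parser.add_argument("name", help="Dataset name")
    add_schema_arguments(parser, DATASET_COLUMNS)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["my_dataset", "--creation_date", "2023-12-02T15:23:25"]
    )
    print(args)
```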

JoanneBogart

LGTM. Thanks.

JoanneBogart · 2023-12-06T19:41:48Z

src/dataregistry/schema/schema.yaml

+    foreign_key_table: "dataset"
+    foreign_key_row: "dataset_id"
+    description: "Dataset this alias is linked to"
+


In case something like this comes up again: I was actually thinking of a yaml comment preceding the table definition, e.g.
# For each row in this table, exactly one of input_id and input_production_id is non-null
but the way you've done it is perfectly ok, perhaps preferable.

Add a central schema yaml file.

9edbe08

This contains all the tables and row entries in a central location. Now the schema creation script, the CLI and the docs can use this, so we don't have to worry about missing somewhere when adding a new entry to a table.

stuartmcalpine changed the base branch from main to u/stuart/database_generator December 2, 2023 15:23

stuartmcalpine added 4 commits December 2, 2023 16:25

Add schema.yaml to pyproject.toml

0b360da

Format

375f1eb

Remove defaults from table creation, move to registrar

0b499b8

Modify CLI to use schema.yaml

65f4c6e

Fix for sqlite

5d6b0ce

stuartmcalpine requested a review from JoanneBogart December 2, 2023 16:42

JoanneBogart requested changes Dec 6, 2023

stuartmcalpine added 3 commits December 6, 2023 09:30

Address reviewer comments. Mainly changing references of 'row' to 'column'.

abb8492

Fix CLI for all schema columns when registering a dataset

920bc09

Reduce register doc strings

59da1d8

stuartmcalpine requested a review from JoanneBogart December 6, 2023 09:27

JoanneBogart approved these changes Dec 6, 2023

Merge branch 'u/stuart/database_generator' into u/stuart/central_schema

a289daf

stuartmcalpine merged commit 6a6635c into u/stuart/database_generator Dec 6, 2023
8 checks passed

stuartmcalpine deleted the u/stuart/central_schema branch December 6, 2023 20:14
