Update the docs for overwriting datasets
stuartmcalpine committed Nov 29, 2023
1 parent 670e9af commit 807c33f
Showing 1 changed file with 7 additions and 3 deletions.
10 changes: 7 additions & 3 deletions docs/source/tutorial_notebooks/getting_started.ipynb
@@ -169,7 +169,9 @@
"\n",
"### The relative path\n",
"\n",
-"Datasets are registered at the data registry shared space under a relative path. For those interested, the eventual full path for the dataset will be `<root_dir>/<owner_type>/<owner>/<relative_path>`. The relative path is one of the two required parameters you must specify when registering a dataset (in the example here our relative path is `nersc_tutorial/my_desc_dataset`).\n",
+"Datasets are registered at the data registry shared space under a path relative to the root directory. For those interested, the eventual full path for the dataset will be `<root_dir>/<owner_type>/<owner>/<relative_path>`. This means that the combination of `relative_path`, `owner` and `owner_type` must be unique within the registry, and therefore cannot already be taken when you register a new dataset (an exception to this is if you allow your datasets to be overwritable, see below). \n",
+"\n",
+"The relative path is one of the two required parameters you must specify when registering a dataset (in the example here our relative path is `nersc_tutorial/my_desc_dataset`).\n",
"\n",
"### The version string\n",
"\n",
@@ -186,7 +188,9 @@
"\n",
"### Overwriting datasets\n",
"\n",
-"By default datasets are not overwritable. In those scenarios you will need to choose a combination of `relative_path`, `owner` and `owner_type` that is not already taken in the database. For our example we have set it so that the dataset can be overwritten so that it does not raise an error through multiple tests. Note that when a dataset has `is_overwritable=true`, the data in the shared space will be overwritten with each registration, but the entry in the data registry database is never lost (a new unique entry is created each time, and the 'old' entries will obtain `true` for their `is_overwritten` row).\n",
+"By default, datasets in the data registry, once registered, are not overwritable. You can change this behavior by setting `is_overwritable=true` when registering your datasets. If `is_overwritable=true` on one of your previous datasets, you can register a new dataset with the same combination of `relative_path`, `owner` and `owner_type` as before (be warned that any previous data stored under this path will be deleted first). \n",
+"\n",
+"Note that whilst the data in the shared space will be overwritten with each registration when `is_overwritable=true`, the original entries in the data registry database are never lost (a new unique entry is created each time, and the 'old' entries will obtain `true` for their `is_overwritten` row).\n",
"\n",
"### Copying the data\n",
"\n",
@@ -362,7 +366,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.10.13"
+"version": "3.10.12"
}
},
"nbformat": 4,
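The uniqueness and overwrite rules described in the updated text can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyRegistry` and its `register` method are hypothetical names invented for illustration and are not the data registry's API, and the owner values are placeholders. It models only the database bookkeeping (who may reuse a `relative_path`, `owner`, `owner_type` combination, and how old entries get flagged), not the deletion of data in the shared space.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegistry:
    """Hypothetical in-memory stand-in for the registry database (not the real API)."""

    entries: list = field(default_factory=list)

    def register(self, relative_path, owner, owner_type, is_overwritable=False):
        key = (relative_path, owner, owner_type)
        # Current (not yet overwritten) entries that already use this combination.
        previous = [
            e for e in self.entries if e["key"] == key and not e["is_overwritten"]
        ]
        # The combination must be free unless the existing entry allows overwriting.
        for entry in previous:
            if not entry["is_overwritable"]:
                raise ValueError(f"{key} is already taken and is not overwritable")
        # Old entries are kept, but flagged as overwritten.
        for entry in previous:
            entry["is_overwritten"] = True
        # A new, unique entry is created on every registration.
        self.entries.append(
            {"key": key, "is_overwritable": is_overwritable, "is_overwritten": False}
        )
        return len(self.entries) - 1


registry = ToyRegistry()
# "some_user" and "user" are placeholder values for owner and owner_type.
first = registry.register(
    "nersc_tutorial/my_desc_dataset", "some_user", "user", is_overwritable=True
)
second = registry.register("nersc_tutorial/my_desc_dataset", "some_user", "user")
print(registry.entries[first]["is_overwritten"])
print(registry.entries[second]["is_overwritten"])
```

In this sketch the second registration succeeds only because the first entry was registered as overwritable; a third registration with the same combination would raise, since the second entry uses the default `is_overwritable=False`.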
