Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 11 changed files with 1,071 additions and 30 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
# Export Process | ||
|
||
## Why | ||
|
||
DataJoint does not have any built-in functionality for exporting vertical slices | ||
of a database. A lab can maintain a shared DataJoint pipeline across multiple | ||
projects, but conforming to NIH data sharing guidelines may require that data | ||
from only one project be shared during publication. | ||
|
||
## Requirements | ||
|
||
To export data with the current implementation, you must do the following: | ||
|
||
- All custom tables must inherit from `SpyglassMixin` (e.g., | ||
`class MyTable(SpyglassMixin, dj.ManualOrOther):`) | ||
- Only one export can be active at a time. | ||
- Start the export process with `ExportSelection.start_export()`, run all | ||
functions associated with a given analysis, and end the export process with | ||
`ExportSelection.stop_export()`. | ||
|
||
## How | ||
|
||
The current implementation relies on two classes in the `spyglass` package, | ||
`SpyglassMixin` and `RestrGraph`, and on the `Export` tables. | ||
|
||
- `SpyglassMixin`: See `spyglass/utils/dj_mixin.py` | ||
- `RestrGraph`: See `spyglass/utils/dj_graph.py` | ||
- `Export`: See `spyglass/common/common_usage.py` | ||
|
||
### Mixin | ||
|
||
The `SpyglassMixin` class is a subclass of DataJoint's `Manual` class. A subset | ||
of methods are used to set an environment variable, `SPYGLASS_EXPORT_ID`, and, | ||
while active, intercept all `fetch`/`fetch_nwb` calls to tables. When `fetch` is | ||
called, the mixin grabs the table name and the restriction applied to the table | ||
and stores them in the `ExportSelection` part tables. | ||
|
||
- `fetch_nwb` is specific to Spyglass and logs all analysis nwb files that are | ||
fetched. | ||
- `fetch` is a DataJoint method that retrieves data from a table. | ||
|
||
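The interception described above can be sketched without a database. This is a minimal illustration, not the code in `dj_mixin.py`: `BaseTable`, `LoggingMixin`, and `EXPORT_LOG` are hypothetical stand-ins for a DataJoint table, the mixin, and the `ExportSelection` part tables. Only the environment variable name comes from the text.

```python
import os

EXPORT_ENV_VAR = "SPYGLASS_EXPORT_ID"  # name taken from the docs above
EXPORT_LOG = []  # hypothetical stand-in for the ExportSelection part tables


class BaseTable:
    """Hypothetical stand-in for a DataJoint table with a restriction."""

    table_name = "base"

    def __init__(self, restriction=None):
        self.restriction = restriction

    def fetch(self):
        return [f"{self.table_name} rows where {self.restriction}"]


class LoggingMixin:
    """Log table name and restriction on fetch while an export is active."""

    def fetch(self, *args, **kwargs):
        export_id = os.environ.get(EXPORT_ENV_VAR)
        if export_id:  # only log while the environment variable is set
            EXPORT_LOG.append(
                {
                    "export_id": export_id,
                    "table_name": self.table_name,
                    "restriction": self.restriction,
                }
            )
        return super().fetch(*args, **kwargs)


class MyTable(LoggingMixin, BaseTable):
    table_name = "my_table"
```

Listing the mixin first in the class bases matters here: its `fetch` runs before the table's own, which mirrors the `class MyTable(SpyglassMixin, ...)` ordering required above.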
### Graph | ||
|
||
The `RestrGraph` class uses DataJoint's networkx graph to store each of the | ||
tables and restrictions intercepted by the `SpyglassMixin`'s `fetch` as | ||
'leaves'. The class then cascades these restrictions up from each leaf to all | ||
ancestors. Use is modeled in the methods of `ExportSelection`. | ||
|
||
```python | ||
from spyglass.utils.dj_graph import RestrGraph | ||
|
||
restr_graph = RestrGraph(seed_table=AnyTable, leaves=None, verbose=False) | ||
restr_graph.add_leaves( | ||
leaves=[ | ||
{ | ||
"table_name": MyTable.full_table_name, | ||
"restriction": "any_restriction", | ||
}, | ||
{ | ||
"table_name": AnotherTable.full_table_name, | ||
"restriction": "another_restriction", | ||
}, | ||
] | ||
) | ||
restr_graph.cascade() | ||
restricted_leaves = restr_graph.leaf_ft | ||
all_restricted_tables = restr_graph.all_ft | ||
|
||
restr_graph.write_export(paper_id="my_paper_id") # part of `populate` below | ||
``` | ||
|
||
By default, a `RestrGraph` object is created with a seed table to have access to | ||
a DataJoint connection and graph. One or more leaves can be added at | ||
initialization or later with the `add_leaves` method. The cascade process is | ||
delayed until `cascade`, or another method that requires the cascade, is called. | ||
|
||
Cascading a single leaf involves transforming the leaf's restriction into its | ||
parent's restriction, then repeating the process until all ancestors are | ||
reached. If two leaves share a common ancestor, the restrictions are combined. | ||
This process also accommodates projected fields, which appear as numeric alias | ||
nodes in the graph. | ||
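As a rough illustration of the cascade, the sketch below walks each leaf up to its ancestors and unions restrictions that meet at a shared ancestor. It is not the `RestrGraph` implementation: the schema in `PARENT_OF` and `PRIMARY_KEY` is hypothetical, each table has a single parent, restrictions are modeled as sets of primary-key values, and projected fields are not handled.

```python
# Hypothetical schema: child table -> parent table (None marks a root)
PARENT_OF = {
    "lfp_band": "lfp",
    "lfp": "session",
    "position": "session",
    "session": None,
}
# Hypothetical primary keys; each parent's key is a subset of its child's
PRIMARY_KEY = {
    "session": ("nwb_file",),
    "lfp": ("nwb_file", "lfp_id"),
    "lfp_band": ("nwb_file", "lfp_id", "filter"),
    "position": ("nwb_file", "pos_id"),
}


def project(rows, fields):
    """Keep only `fields` from each row, as a set of hashable key tuples."""
    return {tuple((f, dict(row)[f]) for f in fields) for row in rows}


def cascade(leaves):
    """Map every touched table to the union of restrictions that reach it."""
    restricted = {}
    for table, rows in leaves.items():
        while table is not None:
            keys = project(rows, PRIMARY_KEY[table])
            # Two leaves sharing an ancestor combine here via set union
            restricted.setdefault(table, set()).update(keys)
            rows, table = keys, PARENT_OF[table]
    return restricted
```

With one `lfp_band` leaf and one `position` leaf from different files, the shared `session` ancestor ends up restricted to both files, which is the "combined" behavior described above.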
|
||
### Export Table | ||
|
||
The `ExportSelection` is where users should interact with this process. | ||
|
||
```python | ||
from spyglass.common.common_usage import ExportSelection | ||
from spyglass.common.common_usage import Export | ||
|
||
export_key = {"paper_id": "my_paper_id", "analysis_id": "my_analysis_id"} | ||
ExportSelection().start_export(**export_key) | ||
ExportSelection().restart_export(**export_key) # to clear previous attempt | ||
analysis_data = (MyTable & my_restr).fetch() | ||
analysis_nwb = (MyTable & my_restr).fetch_nwb() | ||
ExportSelection().stop_export() | ||
|
||
# Visual inspection | ||
touched_files = ExportSelection().list_file_paths(**export_key) | ||
restricted_leaves = ExportSelection().preview_tables(**export_key) | ||
|
||
# Export | ||
Export().populate() | ||
``` | ||
|
||
`Export` will invoke `RestrGraph.write_export` to collect cascaded restrictions | ||
and file paths in its part tables, and write out a bash script to export the | ||
data using a series of `mysqldump` commands. The script is saved to Spyglass's | ||
directory, `base_dir/export/paper_id/`, using credentials from `dj_config`. To | ||
use alternative credentials, create a | ||
[mysql config file](https://dev.mysql.com/doc/refman/8.0/en/option-files.html). | ||
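To make the idea of the generated script concrete, here is a minimal sketch that turns a mapping of table names to restrictions into `mysqldump` lines using the `--where` option. The function name `build_dump_script`, the output file name, and the script layout are assumptions for illustration; the script that Spyglass actually writes may differ.

```python
import shlex


def build_dump_script(restrictions, out_file="export.sql"):
    """Return bash script text with one mysqldump call per restricted table.

    `restrictions` maps a DataJoint-style full table name, such as
    "`my_db`.`my_table`", to a SQL WHERE clause (hypothetical input format).
    """
    lines = ["#!/bin/bash"]
    for full_name, where in restrictions.items():
        database, table = full_name.replace("`", "").split(".")
        lines.append(
            f"mysqldump --where={shlex.quote(where)} "
            f"{database} {table} >> {shlex.quote(out_file)}"
        )
    return "\n".join(lines)
```

Quoting the restriction with `shlex.quote` keeps clauses that contain single quotes, such as `nwb_file_name = 'a.nwb'`, intact when the script is run by the shell.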
|
||
## External Implementation | ||
|
||
To implement an export for a non-Spyglass database, you will need to ... | ||
|
||
- Create a modified version of `SpyglassMixin`, including ... | ||
- `_export_table` method to lazy load an export table like `ExportSelection` | ||
- `export_id` attribute, plus setter and deleter methods, to manage the status | ||
of the export. | ||
- `fetch` and other methods to intercept and log exported content. | ||
- Create a modified version of `ExportSelection`, that adjusts fields like | ||
`spyglass_version` to match the new database. | ||
|
||
Or, optionally, you can use the `RestrGraph` class to cascade hand-picked tables | ||
and restrictions without the background logging of `SpyglassMixin`. |
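For the `export_id` attribute mentioned in the list above, one possible shape is a property whose getter, setter, and deleter are backed by an environment variable. This is a sketch under assumptions, not the Spyglass code: the class name `ExportStateMixin` and the variable name `MYLAB_EXPORT_ID` are hypothetical, and `0` is used to mean "no active export".

```python
import os


class ExportStateMixin:
    """Track the active export in an environment variable (hypothetical)."""

    _export_env_var = "MYLAB_EXPORT_ID"  # assumed name for a non-Spyglass lab

    @property
    def export_id(self):
        # 0 means that no export is currently active
        return int(os.environ.get(self._export_env_var, 0))

    @export_id.setter
    def export_id(self, value):
        # Refuse to start a second export while another one is active
        if self.export_id not in (0, value):
            raise RuntimeError("Only one export can be active at a time.")
        os.environ[self._export_env_var] = str(value)

    @export_id.deleter
    def export_id(self):
        # Ending the export clears the variable for all tables in the process
        os.environ.pop(self._export_env_var, None)
```

Because the state lives in the process environment rather than on an instance, every table class that inherits the mixin sees the same active export.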
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,152 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"tags": [] | ||
}, | ||
"source": [ | ||
"# Export\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Intro\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"_Developer Note:_ if you plan to make a PR in the future, be sure to copy this\n", | ||
"notebook, and use the `gitignore` prefix `temp` to avoid future conflicts.\n", | ||
"\n", | ||
"This is one notebook in a multi-part series on Spyglass.\n", | ||
"\n", | ||
"- To set up your Spyglass environment and database, see\n", | ||
" [the Setup notebook](./00_Setup.ipynb)\n", | ||
"- To insert data, see [the Insert Data notebook](./01_Insert_Data.ipynb)\n", | ||
"- For additional info on DataJoint syntax, including table definitions and\n", | ||
" inserts, see\n", | ||
" [these additional tutorials](https://github.com/datajoint/datajoint-tutorials)\n", | ||
"- For information on what's going on behind the scenes of an export, see\n", | ||
" [documentation](https://lorenfranklab.github.io/spyglass/0.5/misc/export/)\n", | ||
"\n", | ||
"In short, Spyglass offers the ability to generate exports of one or more subsets\n", | ||
"of the database required for a specific analysis as long as you do the following:\n", | ||
"\n", | ||
"- Inherit `SpyglassMixin` for all custom tables.\n", | ||
"- Run only one export at a time.\n", | ||
"- Start and stop each export logging process.\n", | ||
"\n", | ||
"**NOTE:** For demonstration purposes, this notebook relies on a more populated\n", | ||
"database to highlight restriction merging capabilities of the export process.\n", | ||
"Adjust the restrictions to suit your own dataset.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Imports\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Let's start by importing the `spyglass` package, along with a few others.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": { | ||
"tags": [] | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stderr", | ||
"output_type": "stream", | ||
"text": [ | ||
"[2024-01-29 16:15:00,903][INFO]: Connecting root@localhost:3309\n", | ||
"[2024-01-29 16:15:00,912][INFO]: Connected root@localhost:3309\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"import os\n", | ||
"import datajoint as dj\n", | ||
"\n", | ||
"# change to the upper level folder to detect dj_local_conf.json\n", | ||
"if os.path.basename(os.getcwd()) == \"notebooks\":\n", | ||
" os.chdir(\"..\")\n", | ||
"dj.config.load(\"dj_local_conf.json\") # load config for database connection info\n", | ||
"\n", | ||
"# import the export tables and the pipeline tables fetched from below\n", | ||
"from spyglass.common.common_usage import Export, ExportSelection\n", | ||
"from spyglass.lfp.analysis.v1 import LFPBandV1\n", | ||
"from spyglass.position.v1 import TrodesPosV1\n", | ||
"from spyglass.spikesorting.v1.curation import CurationV1\n", | ||
"\n", | ||
"# TODO: Add commentary, describe helpers on ExportSelection\n", | ||
"\n", | ||
"paper_key = {\"paper_id\": \"paper1\"}\n", | ||
"ExportSelection().start_export(**paper_key, analysis_id=\"test1\")\n", | ||
"a = (\n", | ||
" LFPBandV1 & \"nwb_file_name LIKE 'med%'\" & {\"filter_name\": \"Theta 5-11 Hz\"}\n", | ||
").fetch()\n", | ||
"b = (\n", | ||
" LFPBandV1\n", | ||
" & {\n", | ||
" \"nwb_file_name\": \"mediumnwb20230802_.nwb\",\n", | ||
" \"filter_name\": \"Theta 5-10 Hz\",\n", | ||
" }\n", | ||
").fetch()\n", | ||
"ExportSelection().start_export(**paper_key, analysis_id=\"test2\")\n", | ||
"c = (CurationV1 & \"curation_id = 1\").fetch_nwb()\n", | ||
"d = (TrodesPosV1 & 'trodes_pos_params_name = \"single_led\"').fetch()\n", | ||
"ExportSelection().stop_export()\n", | ||
"Export().populate_paper(**paper_key)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Up Next\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In the [next notebook](./10_Spike_Sorting.ipynb), we'll start working with\n", | ||
"ephys data with spike sorting.\n" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "spy", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.16" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
# --- | ||
# jupyter: | ||
# jupytext: | ||
# text_representation: | ||
# extension: .py | ||
# format_name: light | ||
# format_version: '1.5' | ||
# jupytext_version: 1.16.0 | ||
# kernelspec: | ||
# display_name: spy | ||
# language: python | ||
# name: python3 | ||
# --- | ||
|
||
# # Export | ||
# | ||
|
||
# ## Intro | ||
# | ||
|
||
# _Developer Note:_ if you plan to make a PR in the future, be sure to copy this | ||
# notebook, and use the `gitignore` prefix `temp` to avoid future conflicts. | ||
# | ||
# This is one notebook in a multi-part series on Spyglass. | ||
# | ||
# - To set up your Spyglass environment and database, see | ||
# [the Setup notebook](./00_Setup.ipynb) | ||
# - To insert data, see [the Insert Data notebook](./01_Insert_Data.ipynb) | ||
# - For additional info on DataJoint syntax, including table definitions and | ||
# inserts, see | ||
# [these additional tutorials](https://github.com/datajoint/datajoint-tutorials) | ||
# - For information on what's going on behind the scenes of an export, see | ||
# [documentation](https://lorenfranklab.github.io/spyglass/0.5/misc/export/) | ||
# | ||
# In short, Spyglass offers the ability to generate exports of one or more subsets | ||
# of the database required for a specific analysis as long as you do the following: | ||
# | ||
# - Inherit `SpyglassMixin` for all custom tables. | ||
# - Run only one export at a time. | ||
# - Start and stop each export logging process. | ||
# | ||
# **NOTE:** For demonstration purposes, this notebook relies on a more populated | ||
# database to highlight restriction merging capabilities of the export process. | ||
# Adjust the restrictions to suit your own dataset. | ||
# | ||
|
||
# ## Imports | ||
# | ||
|
||
# Let's start by importing the `spyglass` package, along with a few others. | ||
# | ||
|
||
# + | ||
import os | ||
import datajoint as dj | ||
|
||
# change to the upper level folder to detect dj_local_conf.json | ||
if os.path.basename(os.getcwd()) == "notebooks": | ||
os.chdir("..") | ||
dj.config.load("dj_local_conf.json") # load config for database connection info | ||
|
||
# import the export tables and the pipeline tables fetched from below | ||
from spyglass.common.common_usage import Export, ExportSelection | ||
from spyglass.lfp.analysis.v1 import LFPBandV1 | ||
from spyglass.position.v1 import TrodesPosV1 | ||
from spyglass.spikesorting.v1.curation import CurationV1 | ||
|
||
# TODO: Add commentary, describe helpers on ExportSelection | ||
|
||
paper_key = {"paper_id": "paper1"} | ||
ExportSelection().start_export(**paper_key, analysis_id="test1") | ||
a = ( | ||
LFPBandV1 & "nwb_file_name LIKE 'med%'" & {"filter_name": "Theta 5-11 Hz"} | ||
).fetch() | ||
b = ( | ||
LFPBandV1 | ||
& { | ||
"nwb_file_name": "mediumnwb20230802_.nwb", | ||
"filter_name": "Theta 5-10 Hz", | ||
} | ||
).fetch() | ||
ExportSelection().start_export(**paper_key, analysis_id="test2") | ||
c = (CurationV1 & "curation_id = 1").fetch_nwb() | ||
d = (TrodesPosV1 & 'trodes_pos_params_name = "single_led"').fetch() | ||
ExportSelection().stop_export() | ||
Export().populate_paper(**paper_key) | ||
# - | ||
|
||
# ## Up Next | ||
# | ||
|
||
# In the [next notebook](./10_Spike_Sorting.ipynb), we'll start working with | ||
# ephys data with spike sorting. | ||
# |