Add `NwbTimeSeriesExtractor` to load non-electrical series data from NWB #3587

h-mayorquin · 2024-12-17T17:09:08Z

Some users would like to load NWB data that is not an ElectricalSeries and analyze it with SpikeInterface. TimeSeries data can be loaded lazily into SpikeInterface, but it lacks the ElectricalSeries infrastructure (such as electrodes), so the current extractor cannot load it. This PR adds a separate extractor for TimeSeries data.

@steevelaquitaine @borrepp

zm711

Just a couple cosmetic things.

src/spikeinterface/extractors/nwbextractors.py

zm711 · 2025-01-13T13:22:09Z

src/spikeinterface/extractors/nwbextractors.py

+    and returns a list with their paths.
+    """
+    if backend == "hdf5":
+        import h5py


I don't know how big h5py is, or zarr for that matter. Is there any benefit to importing just the exact thing we need (i.e., `from h5py import Group` rather than the full package)?

Most packages do not keep the type of discipline that you would need for this to make a difference. I venture to guess that it would not make a difference in this case but I don't have time to profile.

I honestly just didn't know the size of these packages. Something like scipy would be so slow and heavy for this kind of check, but if these are rather small/fast (which has been my experience playing with h5py at least) then it doesn't matter.

Yeah, what I mean is that usually both are equivalent, because importing one attribute imports the full module in most Python packages. Anyway, here are the measurements:

@h-laptop$ conda activate work
(work) @h-laptop$ python -m timeit -s "import h5py"
50000000 loops, best of 5: 5.43 nsec per loop
(work) @h-laptop$ python -m timeit -s "from h5py import Group"
50000000 loops, best of 5: 6.33 nsec per loop
(work) @h-laptop$ python -m timeit -s "from zarr import Group"
50000000 loops, best of 5: 5.39 nsec per loop
(work) @h-laptop$ python -m timeit -s "import zarr"
50000000 loops, best of 5: 5.31 nsec per loop
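A note on reading these numbers: `-s` passes the import as setup code, so the timed statement is empty and the nanosecond figures do not reflect import cost. One way to check the underlying claim (that both import forms load the same modules) is sketched below. It uses the standard-library `json` package as a stand-in because h5py and zarr may not be installed; the helper name `modules_loaded_by` is made up for this illustration and is not part of the PR.

```python
import subprocess
import sys


def modules_loaded_by(statement):
    """Run `statement` in a fresh interpreter and return the modules it newly loaded."""
    code = (
        "import sys\n"
        "before = set(sys.modules)\n"
        f"{statement}\n"
        "print('\\n'.join(sorted(set(sys.modules) - before)))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return set(result.stdout.split())


# `json` stands in for h5py/zarr here (assumption: same import mechanics apply).
full = modules_loaded_by("import json")
partial = modules_loaded_by("from json import dumps")

# Both forms execute the package's __init__, so the loaded module sets match.
print(full == partial)
```

For actual import timings, running `python -X importtime -c "import h5py"` in a fresh interpreter would be a more direct measurement.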

zm711 · 2025-01-13T13:24:02Z

src/spikeinterface/extractors/nwbextractors.py

+        If True, the time vector is loaded into the recording object. Useful when
+        precise timing information is needed.
+    samples_for_rate_estimation : int, default: 1000
+        The number of timestamp samples used for estimating the sampling rate when


I'm trying to understand the writing. This would just be the number of timestamps right? I'm not quite sure what it would mean to say timestamp samples? But I don't know this format so I could be totally off on this :)

Yeah, confusing, the thing is that the object/data container is called timestamps. That time series (timestamps) has samples.

Let me think this through.

I welcome any suggestion though.

I think I would need to see an example. Maybe I'll look at the test and see if I can better understand how this works. If it is confusing naming then we just have to live with it.
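To make the wording concrete, here is a minimal sketch of what estimating a rate from the first samples of a `timestamps` series could look like. It is an assumption for illustration, not the extractor's confirmed implementation: the function name `estimate_sampling_rate` and the use of the median interval are choices made here, and only the parameter name `samples_for_rate_estimation` comes from the docstring under review.

```python
from statistics import median


def estimate_sampling_rate(timestamps, samples_for_rate_estimation=1000):
    """Estimate the sampling rate (Hz) from the first few entries of a timestamps series."""
    # Only the head of the series is read, which keeps lazy loading cheap.
    head = list(timestamps[:samples_for_rate_estimation])
    if len(head) < 2:
        raise ValueError("Need at least two timestamps to estimate a sampling rate.")
    # Intervals between consecutive timestamps, in seconds.
    intervals = [later - earlier for earlier, later in zip(head, head[1:])]
    # The median is used here (an assumption) to be robust to occasional jitter.
    return 1.0 / median(intervals)


# Synthetic timestamps sampled at 250 Hz.
timestamps = [i / 250.0 for i in range(5000)]
rate = estimate_sampling_rate(timestamps)
print(round(rate, 3))
```

Read this way, "timestamp samples" means the entries of the `timestamps` dataset that are read for the estimate, not samples of the recorded signal.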

src/spikeinterface/extractors/nwbextractors.py

zm711 · 2025-01-13T13:28:16Z

src/spikeinterface/extractors/nwbextractors.py

+                self.timeseries_path = list(time_series_dict.keys())[0]
+            else:
+                raise ValueError(
+                    f"Multiple TimeSeries found! Specify 'timeseries_path'. Options: {list(time_series_dict.keys())}"


Co-authored-by: Zach McKenzie <[email protected]>

src/spikeinterface/extractors/nwbextractors.py

add time series extractor from nwb

2739ea7

h-mayorquin self-assigned this Dec 17, 2024

h-mayorquin changed the title ~~Add NwbTimeSeriesExtractor to extract TimeSeries data from NWB~~ Add NwbTimeSeriesExtractor to load non-electrical series data from NWB Dec 17, 2024

h-mayorquin mentioned this pull request Dec 17, 2024

Question: How to insert new data into a recording object. Purpose: "remove_artifacts" by inserting our own cleaned data #3585

Open

alejoe91 added the extractors Related to extractors module label Jan 7, 2025

zm711 reviewed Jan 13, 2025

h-mayorquin and others added 2 commits January 13, 2025 12:48

Apply suggestions from code review

3532788

Co-authored-by: Zach McKenzie <[email protected]>

change to elif

e0e8859

alejoe91 approved these changes Jan 14, 2025

alejoe91 reviewed Jan 14, 2025

src/spikeinterface/extractors/nwbextractors.py Outdated

Update src/spikeinterface/extractors/nwbextractors.py

a985b4f

zm711 Jan 13, 2025

h-mayorquin Jan 13, 2025

zm711 Jan 13, 2025

h-mayorquin Jan 16, 2025

zm711 Jan 13, 2025

h-mayorquin Jan 13, 2025

h-mayorquin Jan 13, 2025

zm711 Jan 13, 2025

zm711 Jan 13, 2025
