
Add documentation of preprocessing and sorting split by channel group. #2316

Merged

Conversation

JoeZiminski
Collaborator

@JoeZiminski JoeZiminski commented Dec 8, 2023

This PR adds a 'How to' page on preprocessing and sorting when splitting the recording into channel groups. I could not get the formatting (e.g. inclusion of notes, warning sections) to work with the sphinx-gallery approach as described in this example. I am not 100% sure why; I thought to have the documentation reviewed and the best place for it decided before persevering with this.
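
Roughly, the workflow the page documents looks like this (a minimal sketch, not the exact code from the page, assuming the standard spikeinterface splitting/aggregation API; the filter settings and group assignment are just placeholders):

import numpy as np
import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
from spikeinterface.core import aggregate_channels

# Toy multi-shank recording; assign each block of 96 channels to a group (shank)
recording, _ = se.toy_example(duration=[1.00], num_segments=1, num_channels=384)
recording.set_channel_groups(np.repeat([0, 1, 2, 3], 96))

# Split into one sub-recording per channel group
split_recordings = recording.split_by("group")  # dict: group id -> recording

# Preprocess each group independently, e.g. so the reference is computed per shank
preprocessed = {}
for group_id, group_recording in split_recordings.items():
    filtered = spre.bandpass_filter(group_recording, freq_min=300, freq_max=6000)
    preprocessed[group_id] = spre.common_reference(filtered, operator="median")

# Re-aggregate into one recording that keeps the per-group preprocessing,
# or alternatively run the sorter on each group separately
combined = aggregate_channels(list(preprocessed.values()))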

I also had some general questions

  • Does probeinterface have a way to show only certain channel groups? If possible I wanted to also include a plot of the different channel groups, highlighted by colour or something (see the sketch after this list).
  • Sphinx is pinned to an older version, and I can see why, as sphinx version changes are a nightmare. Nonetheless the newer versions have some cool features (like dropdowns; currently what would be a dropdown is included as a note at the end). At least, I think the version is why this is not working. Is it worth considering unpinning this dependency?
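
For the first question, this is the kind of plot I had in mind. A sketch only, continuing from the sketch above, and assuming plot_probe accepts per-contact colors via contacts_colors and that the probe's contact order matches the recording's channel order:

import matplotlib.pyplot as plt
from probeinterface.plotting import plot_probe

probe = recording.get_probe()
groups = recording.get_channel_groups()

# one color per contact, chosen by its channel group
cmap = plt.get_cmap("tab10")
contact_colors = [cmap(int(group) % 10) for group in groups]

fig, ax = plt.subplots()
plot_probe(probe, ax=ax, contacts_colors=contact_colors)
plt.show()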

@JoeZiminski JoeZiminski changed the title Split by recording first commit. Add documentatino of preprocessing and sorting split by channel group. Dec 8, 2023
@zm711
Collaborator

zm711 commented Dec 8, 2023

So if the goal of this is to be a full tutorial run locally and converted to .rst for the docs, then the desired flow is to place it here as a .py file that can then be converted later (typically real examples using NPs etc.). This used to be a stricter rule. Super light examples (simulated data or MEArec currently) would go to the module gallery in the appropriate section. I honestly am neutral on the location of the tutorials, but just an FYI as you work on this (in case it is decided it needs to move and be rewritten in .py style).

And thanks! I think the growing number of multishank datasets really requires some documentation for SI users to understand how to do this with the machinery.

@JoeZiminski JoeZiminski changed the title Add documentatino of preprocessing and sorting split by channel group. Add documentation of preprocessing and sorting split by channel group. Dec 8, 2023
@zm711 zm711 added the documentation Improvements or additions to documentation label Dec 8, 2023
@JoeZiminski
Collaborator Author

Thanks @zm711! It is pretty light with toy example data so I think 'Module Example Gallery' would be the right place, maybe under 'core' as it contains information on both preprocessing and sorting?

@JoeZiminski JoeZiminski marked this pull request as ready for review December 11, 2023 15:31
@alejoe91
Member

Hi @JoeZiminski

Thanks so much for this!! @zm711 honestly I think that the gallery is mainly there as legacy and we will remove it in a future refactoring of the docs. The modules documentation is already the place for detailed information about each module, and IMO the gallery is just redundant (plus hard to maintain!). What do you think?

In that regard, I'd keep this in the "How to" section!

@zm711
Collaborator

zm711 commented Dec 12, 2023

I was always in support of the How to. It just makes mental sense to me whereas module gallery was always confusing to me as a concept.

I think getting rid of them would be okay (they do break the most--although I will say they did help me catch a bug in one of the widgets one time :) ). My biggest problem with them is that they currently (almost) all require datalad. So if they stick around I really think they need to switch to being a simulated dataset so that anyone can run them rather than just people with the fortitude to get datalad working.

@samuelgarcia
Member

I think we need to keep the gallery, which is auto-generated, as long as it is fast enough.
The how-to section is heavy stuff that we generate locally, but we keep a jupytext file to be able to regenerate it when necessary.

@alejoe91
Member

@JoeZiminski did you intend to include this: doc/sg_execution_times.rst? What is it?

@samuelgarcia
Member

Hi @JoeZiminski
When we do a how-to doc which is generated from a notebook (I think this is your case), then we need to keep the notebook inside the source code in case we want to modify it one day.
In that case we decided to use jupytext to put the notebook in examples/how_to/*.py to work on it later.
We push the .py but not the .ipynb, and finally convert it to rst and copy/paste the rst into doc/how_to/*.rst (a minimal sketch of this conversion is at the end of this comment).
This is important to be able to re-run the notebook without keeping the .ipynb.
Here you only pushed the converted RST.
Have a look at this https://github.com/SpikeInterface/spikeinterface/tree/main/examples/how_to

Maybe I am wrong and you directly wrote the rst. Then it is OK.

In short, we have 3 types of documentation:

  1. pure RST, which is at the moment more or less modules/*. If we modify the API these files are not changed
  2. generated from examples/*.py and rendered automatically as html/py/ipynb in the sphinx-gallery
  3. locally run notebooks manually transformed into rst.

For me:
case 2 is very important even if Alessio does not like the actual content. We can improve the content.
case 3 should be avoided when the generation is super fast (we should put it in case 2), but when it implies long computation or heavy data we have no choice.
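
For reference, a minimal sketch of the case 3 conversion using the jupytext and nbconvert Python APIs (the file names are placeholders and the jupytext CLI works equally well):

import jupytext
from nbconvert import RSTExporter

# examples/how_to/*.py (jupytext format) -> notebook
notebook = jupytext.read("examples/how_to/process_by_channel_group.py")
jupytext.write(notebook, "process_by_channel_group.ipynb")

# run the notebook locally, then export it to rst and copy it into doc/how_to/
body, resources = RSTExporter().from_filename("process_by_channel_group.ipynb")
with open("doc/how_to/process_by_channel_group.rst", "w") as f:
    f.write(body)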

@zm711
Collaborator

zm711 commented Dec 13, 2023

Maybe I should open an issue where we can discuss the rules for example generation, and then once we have the official rules laid out I can open a PR to add them to the development portion of the docs, so we have it set in stone how we want to organize examples. Does that sound like a plan? I think the structure isn't clear currently, so we keep rehashing where examples should go.

@samuelgarcia
Member

I agree that the structure is not perfect and it should be orthogonal to this case 1, but the sphinx-gallery needs to somehow be a section in the main toctree.
Let's put this in an issue and maybe let's write it in the doc itself!!

@JoeZiminski
Collaborator Author

JoeZiminski commented Dec 14, 2023

Thanks everyone!

For now I did not make the documentation in a .py file because I could not get some cool features like notes / warnings to work. Also, dropdowns are not working; I think this is because of the sphinx version but am not sure. I was going to propose updating the sphinx version, however that is outside the scope of this PR.

I will redo this page in the standard format as described above (.py > .ipynb > .rst) and can move discussion of the docs to #2327 and add those features back later. Cheers!

@h-mayorquin
Collaborator

This looks great.

I second @alejoe91:

@JoeZiminski did you intend to include this: doc/sg_execution_times.rst? What is it?

Out of curiosity, what do you mean here by unpredictable behavior:
[screenshot of the quoted docs warning about unpredictable behavior]

Also you say:

I will redo this page in the standard format as described above (.py > .ipynb > .rst)

Why? I think the rst works better than .py and, as Sam said:

Maybe I am wrong and you directly wrote the rst. Then it is OK.

doc/how_to/process_by_channel_group.rst (outdated, resolved)


Further notes on preprocessing by channel group
Collaborator


I find this section a bit confusing. Is the goal of it to eliminate the confusion that people might have of thinking that when they apply a preprocessing step to the aggregate recording it will still apply per-group?

If so, I would start the section by stating the possible confusion, then saying that this is not correct, and then give your illustrative example.

I feel that the first paragraph is trying to add some context, but I don't think that's necessary, as this is at the end of a tutorial that should have made this all clear.
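
In other words, something like this is the distinction I think the section should lead with (a tiny sketch only, assuming the spikeinterface API used elsewhere in this PR and a recording that already has a 'group' property):

import spikeinterface.preprocessing as spre
from spikeinterface.core import aggregate_channels

# Applied to the aggregated recording: the reference is computed across ALL channels
referenced_all = spre.common_reference(recording, operator="median")

# Applied per group: split, preprocess each group, then aggregate
per_group = [
    spre.common_reference(sub_recording, operator="median")
    for sub_recording in recording.split_by("group").values()
]
referenced_by_group = aggregate_channels(per_group)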

Collaborator Author


Thanks @h-mayorquin, this is a very good point; it is strange to have a section at the end re-visiting the entire article in more detail. I've dropped the second half of this section and integrated the rest into a note within the page body. Let me know what you think!

Collaborator


I re-read the current one and it looks great to me!

import numpy as np
import spikeinterface.extractors as se

# Create a toy 384 channel recording with 4 shanks (each shank contains 96 channels)
recording, _ = se.toy_example(duration=[1.00], num_segments=1, num_channels=384)
Collaborator


Open question:
When do we use toy_example and when do we use generate_recording or generate_ground_truth_recording? I personally would prefer to only use the latter, but maybe it does not matter.
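
For context, the three alternatives being compared (a hedged sketch; exact signatures and defaults may differ between versions):

import spikeinterface.extractors as se
from spikeinterface.core import generate_recording, generate_ground_truth_recording

# toy_example: recording + sorting, with a dummy probe attached
recording1, sorting1 = se.toy_example(duration=[1.0], num_segments=1, num_channels=16)

# generate_recording: recording only
recording2 = generate_recording(num_channels=16, durations=[1.0])

# generate_ground_truth_recording: recording + ground-truth sorting
recording3, sorting3 = generate_ground_truth_recording(num_channels=16, durations=[1.0])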

Collaborator Author


I am not sure on this and will defer to @alejoe91 @samuelgarcia

Member


I think that for docs toy_example is ok! It's actually easier to interpret

doc/how_to/process_by_channel_group.rst (resolved)
@alejoe91 alejoe91 added this to the 0.100.0 milestone Jan 9, 2024
@alejoe91
Member

@JoeZiminski maybe this was buried in the discussion!

@JoeZiminski did you intend to include this: doc/sg_execution_times.rst? What is it?

Did you include it on purpose? If so, should it be referenced somewhere? I would remove it for now!

@alejoe91
Member

@JoeZiminski just pinging you on this! We are planning a release on Friday, so it would be great if you can tackle the remaining points by then :)

@alejoe91 alejoe91 removed this from the 0.100.0 milestone Feb 5, 2024
@JoeZiminski
Collaborator Author

@alejoe91 @h-mayorquin @samuelgarcia my apologies for the delay!

  • the execution times file was a mistake, apologies; it somehow got created when building the docs
  • the warning section on repeatedly applying aggregate_channels is a little unspecific. Once, for a pipeline, I tried to perform the split-aggregate separately for each preprocessing step (for unimportant reasons); however, after this get_traces() became extremely slow. This was resolved by splitting/aggregating only once and performing the whole preprocessing within this single split/aggregate, as suggested in the docs (the two patterns are sketched below). So I thought it was worth mentioning, but did not look into it any further. With the restructuring this is now a sentence in the 'note' section rather than a big warning, so there is less emphasis on this point. Do you think it is still worth including? It would be interesting to look further into this slowing behaviour, but it seems non-critical and is a result of non-standard use of the function, so it is probably low priority.
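
Roughly, the two patterns were as follows (a sketch only; the exact cause of the slowdown was not investigated, and recording is any recording with a 'group' property set):

import spikeinterface.preprocessing as spre
from spikeinterface.core import aggregate_channels

# Pattern that turned out to be slow at get_traces(): split and re-aggregate
# around every preprocessing step
step_1 = [spre.bandpass_filter(r) for r in recording.split_by("group").values()]
rec = aggregate_channels(step_1)
step_2 = [spre.common_reference(r) for r in rec.split_by("group").values()]
rec = aggregate_channels(step_2)

# Recommended pattern: split once, apply the whole per-group chain, aggregate once
preprocessed = []
for group_recording in recording.split_by("group").values():
    group_recording = spre.bandpass_filter(group_recording)
    group_recording = spre.common_reference(group_recording)
    preprocessed.append(group_recording)
rec = aggregate_channels(preprocessed)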

Otherwise from my end I think this is good to go, thanks a lot for your feedback!

Collaborator

@zm711 zm711 left a comment


one super tiny typo :)

doc/how_to/process_by_channel_group.rst (outdated, resolved)
@samuelgarcia samuelgarcia merged commit fe937a1 into SpikeInterface:main Mar 8, 2024
9 of 10 checks passed
@JoeZiminski
Collaborator Author

Thanks all for your reviews!
