
Run / test the notebooks from the command line automatically #334

Open
2 of 5 tasks
jonwright opened this issue Oct 11, 2024 · 14 comments
jonwright commented Oct 11, 2024

We have accumulated a lot of code in notebooks, so I guess I will sometimes break them. Here are some notes on how to test notebooks. The to-dos seem to be:

  • set up or find a few test cases that run (at least) at ESRF using files in /data/id11 that stay around
  • ensure we do not overwrite something when testing
  • define/select the jupyter kernels (e.g. the batteries included one at ESRF and/or a clean venv)
  • add something to run these independently of the GitHub CI (because of big files)
  • look into rendering some notebooks into the documentation
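
One cheap way to satisfy the "do not overwrite" point is to execute each notebook from inside a throwaway working directory, so any files it writes land in scratch space. A minimal stdlib sketch (the helper name is made up):

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def scratch_dir():
    """Run the body of the `with` block inside a temporary directory,
    so notebook outputs cannot clobber real analysis files; the
    directory (and everything written into it) is deleted on exit."""
    old = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            yield tmp
        finally:
            os.chdir(old)
```

A test would then do `with scratch_dir(): ep.preprocess(nb)` so that relative-path writes stay inside the sandbox.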

Some links:

https://github.com/nteract/testbook

https://nbconvert.readthedocs.io/en/latest/execute_api.html

https://github.com/jupyter/nbconvert/blob/main/docs/api_examples/template_path/make_html.py

https://stackoverflow.com/questions/70671733/testing-a-jupyter-notebook

import nbformat
import pytest
from nbconvert.preprocessors import ExecutePreprocessor

@pytest.mark.parametrize("notebook", ["passed.ipynb", "failed.ipynb"])
def test_notebook_exec(notebook):
    with open(notebook) as f:
        nb = nbformat.read(f, as_version=4)
    ep = ExecutePreprocessor(timeout=600, kernel_name="python3")
    try:
        assert ep.preprocess(nb) is not None, f"Got empty notebook for {notebook}"
    except Exception as exc:
        pytest.fail(f"Failed executing {notebook}: {exc}")

jadball commented Oct 18, 2024

Will this still be possible if we move the notebooks to a different repository?

@jonwright

Anything is possible? I would like to have historical as well as modern notebooks working. I guess that means putting version requirements at the top and bumping ImageD11 versions more often. That would turn into testing against several venvs, at least the recent releases on PyPI. Some kind of matrix to run.

The "reproducible" part of FAIR is worth aiming for. As far as I am aware, the code is still backward compatible, except for cases where it had a bug.

@jonwright

Going over to parameterised notebooks, we could try:
https://papermill.readthedocs.io/en/latest/
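
Papermill works by finding the cell tagged "parameters" and injecting a new cell of overrides just after it, before executing the notebook. A stdlib-only sketch of that injection idea (not papermill's actual implementation), operating on the parsed .ipynb JSON:

```python
import json

def inject_parameters(nb, params):
    """Return a copy of a parsed .ipynb (a dict) with an extra code cell
    overriding the defaults from the cell tagged 'parameters' --
    roughly what papermill does before running a notebook."""
    nb = json.loads(json.dumps(nb))  # cheap deep copy
    injected = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["injected-parameters"]},
        "outputs": [],
        "source": [f"{name} = {value!r}\n" for name, value in params.items()],
    }
    for i, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            nb["cells"].insert(i + 1, injected)
            break
    else:
        nb["cells"].insert(0, injected)  # no tagged cell: prepend
    return nb
```

In practice one would just call `papermill.execute_notebook("in.ipynb", "out.ipynb", parameters={...})`, or the CLI: `papermill in.ipynb out.ipynb -p name value`.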

This was referenced Jan 10, 2025
@jadball jadball self-assigned this Jan 21, 2025

jadball commented Jan 21, 2025

currently working on this with papermill

@loichuder

I have worked with testbook in the past: a very good project, but not much activity lately.

For getting hold of the test files/notebooks, there are several possibilities.

For example, we have something in silx to fetch data from silx.org (used in the pyFAI tests). Or you could imagine running the tests on an ESRF GitLab runner that is set up specifically to have access to the test files via GPFS/NFS.

Don't hesitate to contact DAU if you need help with this.


jadball commented Jan 22, 2025

Thanks @loichuder !

We already have a little function to fetch some test data from Zenodo which is working well:
https://github.com/FABLE-3DXRD/ImageD11/blob/master/ImageD11/fetch_data.py

One of our goals anyway is to parameterise the notebooks so that they can be run as ewoks tasks. Ewoks uses the same cell tagging system for notebook parameterization as papermill, so it makes sense for us to use that going forward:
https://ewokscore.readthedocs.io/en/stable/howtoguides/notebook_task.html
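
For reference, the tag that both tools look for lives in the cell metadata of the .ipynb JSON. A hand-written illustration of a tagged defaults cell (the variable names and values here are made up):

```json
{
  "cell_type": "code",
  "execution_count": null,
  "metadata": {"tags": ["parameters"]},
  "outputs": [],
  "source": [
    "sample = 'FeAu'\n",
    "layer = 0\n"
  ]
}
```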

Parameterising the notebooks isn't actually as bad as I thought it would be, so I'm halfway done with that already.

Regarding running the tests - for now the tests need SLURM access (for Astra reconstruction) so they can't run on the Github CI yet, but should run locally in a git checkout on an ESRF machine with SLURM deployment access. Not sure how we could integrate the ESRF Gitlab runner into our existing CI pipeline here?

@loichuder

> We already have a little function to fetch some test data from Zenodo which is working well: https://github.com/FABLE-3DXRD/ImageD11/blob/master/ImageD11/fetch_data.py

Good to know. That can be useful to us to test our tasks!

> One of our goals anyway is to parameterise the notebooks so that they can be run as ewoks tasks. Ewoks uses the same cell tagging system for notebook parameterization as papermill, so it makes sense for us to use that going forward: https://ewokscore.readthedocs.io/en/stable/howtoguides/notebook_task.html

> Parameterising the notebooks isn't actually as bad as I thought it would be, so I'm halfway done with that already.

Indeed. Great to read 🙂

> Regarding running the tests - for now the tests need SLURM access (for Astra reconstruction) so they can't run on the Github CI yet, but should run locally in a git checkout on an ESRF machine with SLURM deployment access. Not sure how we could integrate the ESRF Gitlab runner into our existing CI pipeline here?

Hm, a potential strategy would be that GitHub CI could trigger a pipeline on a Gitlab runner that could have the needed access to launch the tests and provide the results. We could do it either via GitHub webhooks or requests to the GitLab API.

I don't have the details fully fleshed out but I am sure it is possible. I'll ask my colleagues and come back to you.


jadball commented Jan 22, 2025

@loichuder many thanks! For now, the tests can be run manually by myself and @jonwright on our ESRF machines. I've excluded them from the GitHub CI and flake8 linting.
I've now implemented end-to-end testing of the tomographic route and the point-by-point route for scanning 3DXRD, using the single-crystal Si dataset that lives on Zenodo:
master...jadball:ImageD11:master
Right now, we are just testing that the notebooks run all the way through without raising any errors.
At some point we should also start testing the outputs (what UBI we found for the Si grain, what translation we fitted, etc.) - @jonwright do you want me to start working on that?

@jonwright

@jadball : I do not like testing numerical results in test cases, unless the numbers came from a simulation or theory and you can test against a "truth". Getting the same answer as last time is not a priority: the new answer in the future is almost always more accurate, or "better" in some other way. To finish up: can you add rendered notebooks somewhere (docs?), so we can get these out to users (e.g. https://imaged11.readthedocs.io/en/latest/) and only need to review them when doing a release.
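
Testing against a simulated "truth" can still use explicit tolerances rather than exact equality. A stdlib sketch (the helper and the quantity names are hypothetical):

```python
import math

def assert_close_to_truth(fitted, truth, tol=1e-3):
    """Compare fitted quantities (e.g. a grain translation recovered
    from a simulated dataset) against the known ground truth, with an
    explicit absolute tolerance, instead of against 'the answer from
    last time'."""
    for name, expected in truth.items():
        got = fitted[name]
        assert math.isclose(got, expected, rel_tol=0.0, abs_tol=tol), \
            f"{name}: got {got}, expected {expected} within {tol}"
```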

@haixing0a is motivated to find and use some testcase datasets. Maybe we choose a folder on /data/id11/inhouseN for now. With the peak segmentation, I still need to pick out a test dataset to debug that monkeypatch thing.


jadball commented Jan 22, 2025

@jonwright makes sense to me, thanks!
I copied the rendered notebooks from the papermill test outputs for the Si cube data into docs/notebooks/S3DXRD (minus notebook 0, because we import from sparse already) - what do you think?
https://github.com/jadball/ImageD11/tree/master/docs/notebooks/S3DXRD

@jonwright

On GitHub it looks good, as it renders the ipynb. For readthedocs (etc.) I wonder whether we would be better off with HTML, so that the source (ipynb) stays distinct from the output and it does not get rendered again?


jadball commented Jan 22, 2025

will do!


jadball commented Jan 23, 2025

Progress made - end-to-end tomo test for multiple layers with multiple phases now on my fork.

master...jadball:ImageD11:master

todo:

  • PBP route minor phase dedicated notebooks
  • PBP route end-to-end test on FeAu
  • box-beam parameterization
  • box-beam end-to-end test
  • investigate changing the multi-layer looping logic - could use a final notebook to run papermill instead!
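
The "final notebook to run papermill" idea for the layer loop amounts to building one papermill command per layer. A small sketch of the command-building part (the notebook names are placeholders; `-p` is papermill's flag for passing a single parameter):

```python
def papermill_cmd(notebook_in, notebook_out, **params):
    """Build the argv for one papermill run, e.g. to hand to
    subprocess.run(), with one -p flag per injected parameter."""
    cmd = ["papermill", notebook_in, notebook_out]
    for name, value in params.items():
        cmd += ["-p", name, str(value)]
    return cmd

# One parameterised run per layer:
cmds = [papermill_cmd("tomo.ipynb", f"tomo_layer{i}.ipynb", layer=i)
        for i in range(3)]
```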


loichuder commented Jan 31, 2025

> Hm, a potential strategy would be that GitHub CI could trigger a pipeline on a Gitlab runner that could have the needed access to launch the tests and provide the results. We could do it either via GitHub webhooks or requests to the GitLab API.

> I don't have the details fully fleshed out but I am sure it is possible. I'll ask my colleagues and come back to you.

Little update on this: we did manage to get a GitLab runner running our CI on data present in /data. For the GitLab-to-GitHub connection, a quick search showed that there are clearly laid-out steps to achieve it: https://docs.gitlab.com/ee/ci/ci_cd_for_external_repos/github_integration.html
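
On the GitLab side, this could be as simple as a job pinned to the runner that has /data mounted. A hypothetical .gitlab-ci.yml fragment (the runner tag, paths, and test file name are all assumptions):

```yaml
notebook-tests:
  tags: [esrf-data]            # runner set up with GPFS/NFS access to /data
  script:
    - python -m pip install -e .
    - python -m pytest test/test_notebooks.py -v
```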

We could meet next week so that I can gather your exact CI requirements and work on setting this up.


3 participants