Jobs return standardized and flexible output #815

njbrake · 2025-02-06T01:27:10Z

What's changing

TL;DR: this design decouples the presentation layer (frontend) from the data management/storage layer (backend) and from the execution layer (job). Although the frontend still has hardcoded references to several keys, those keys are now organized so that the frontend can adopt more dynamic key handling when the frontend team is ready.

To make it easier to add and manage jobs, this PR creates a loose interface between the job result (the result.json file) and the backend.

A job can define whatever schema it likes, as long as the top level of the file it saves contains only the keys "artifacts", "parameters", and "metrics". This allows for easy storage in the tracking interface (e.g. MLflow) and easy access for the frontend.
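As a minimal sketch of that contract, a job could check its top-level keys before saving the file. The helper names (`validate_result`, `save_result`) and the example values are hypothetical, chosen for illustration only; they are not taken from the Lumigator code.

```python
import json
import tempfile
from pathlib import Path

# The three top-level keys named in the PR description.
ALLOWED_KEYS = {"artifacts", "parameters", "metrics"}


def validate_result(result: dict) -> dict:
    """Reject any top-level key outside the three agreed buckets."""
    unexpected = set(result) - ALLOWED_KEYS
    if unexpected:
        raise ValueError(f"unexpected top-level keys: {sorted(unexpected)}")
    return result


def save_result(result: dict, path: Path) -> None:
    """Validate the result, then write it to disk as JSON."""
    path.write_text(json.dumps(validate_result(result), indent=2))


# Hypothetical job output; the schema inside each bucket is up to the job.
example = {
    "artifacts": {"predictions": ["a summary"]},
    "parameters": {"max_samples": 10},
    "metrics": {"rouge1_mean": 0.42},
}

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "result.json"
    save_result(example, out)
    loaded = json.loads(out.read_text())
```

The check only constrains the top level, so each job stays free to put any keys it wants inside a bucket.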

The design keeps jobs flexible: a job does not need to know anything about how the frontend works. It also allows new visualizations of existing data, because the frontend decides how to display the data and is not tied to the backend.
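To illustrate how a consumer can stay independent of any particular job, the sketch below flattens a result into generic rows without knowing the keys in advance. It is written in Python for brevity rather than in the frontend's Vue code, and `to_rows` and the sample values are assumptions, not part of this PR.

```python
from typing import Any


def to_rows(result: dict[str, Any]) -> list[tuple[str, str, Any]]:
    """Flatten the top-level buckets into (bucket, key, value) rows."""
    rows = []
    for bucket in ("parameters", "metrics", "artifacts"):
        # A bucket may be missing or None; treat both as empty.
        for key, value in (result.get(bucket) or {}).items():
            rows.append((bucket, key, value))
    return rows


# Hypothetical result; no key inside a bucket is hardcoded in to_rows.
result = {
    "parameters": {"max_samples": 10},
    "metrics": {"rouge1_mean": 0.42},
    "artifacts": None,
}
rows = to_rows(result)
```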

If this PR is related to an issue or closes one, please link it here.

Refs #678

How to test it

The unit tests cover the SDK and the backend, but this should also be tested by running `make local-up` and then using the Web UI to upload a dataset, create an experiment, and view all the results.

Additional notes for reviewers

Anything you'd like to add to help the reviewer understand the changes you're proposing.

I already...

Tested the changes in a working environment to ensure they work as expected
Added some tests for any new functionality
Updated the documentation (both comments in code and product documentation under /docs)
Checked if a (backend) DB migration step was required and included it if required

njbrake · 2025-02-06T17:32:56Z

@khaledosman adding you to this review for the frontend perspective. I ran the UI to ensure that the changes I made to the Vue code don't break anything I could see. I think that these changes should enable you to simplify how you load and organize the data you get back from the backend.

javiermtorres

LGTM. Only a couple of issues:

* Add another layer between `artifacts` and the columns (admittedly waivable if we assume these specific jobs will contain only these columns and will only accept one input, but it would make the format more generic, like having several in/out params)
* Try to shape the result object fields (params, metrics, etc.) into a more "appropriate" type that can be more flexible and at the same time more restrictive. Since I don't have a firm proposal yet :-/ we can leave it for the next PR, but just let me know your thoughts on this.

njbrake

Thanks for the review, Javier! You made some great points 💯

njbrake · 2025-02-10T14:06:55Z

LGTM. Only a couple of issues:

* Add another layer between `artifacts` and the columns (admittedly waivable if we assume these specific jobs will contain only these columns and will only accept one input, but it would make the format more generic, like having several in/out params)

Agree that this needs to be improved. I created #835 to track this, because implementing this change will require refactors in a handful of places and I'd like to prevent this PR from getting too unwieldy.

* Try to shape the result object fields (params, metrics, etc) into a more "appropriate" type that can be more flexible and at the same time more restrictive. Since I don't have a firm proposal yet :-/ we can leave it for next PR, but just let me know your thoughts on this.

This is a good comment and I think in a way it will be evergreen: there's a push and pull between making the output type standardized and opinionated while keeping it flexible. I liked your suggestion of turning the fields into at least Pydantic BaseModels and initializing them as empty dicts if the fields aren't provided; I implemented that change. I think this will require more refinement down the road, but at least it gets us into a world where we have three high-level buckets to put things in for the SDK and frontend to access: parameters, metrics, and artifacts.

njbrake · 2025-02-10T15:48:26Z

Try to shape the result object fields (params, metrics, etc) into a more "appropriate" type that can be more flexible and at the same time more restrictive. Since I don't have a firm proposal yet :-/ we can leave it for next PR, but just let me know your thoughts on this.

This is a good comment and I think in a way it will be evergreen: there's a push and pull between making the output type standardized and opinionated while keeping it flexible. I liked your suggestion of turning the fields into at least Pydantic BaseModels and initializing them as empty dicts if the fields aren't provided; I implemented that change. I think this will require more refinement down the road, but at least it gets us into a world where we have three high-level buckets to put things in for the SDK and frontend to access: parameters, metrics, and artifacts.

Update: I had to undo the suggestion of changing the JobResultOutput fields into Pydantic BaseModels: Pydantic models require you to know ahead of time all of the keys that you want the class to hold, so one can't function as an arbitrary dict. :(
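For illustration, one way to keep the three buckets fixed while leaving the keys inside them open is to type each field as a plain dict. The sketch below uses a standard-library dataclass as a stand-in; `ResultBuckets` is a hypothetical name and this is not the `JobResultOutput` definition from this PR.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class ResultBuckets:
    """Three fixed buckets; the keys inside each one are left open."""

    artifacts: dict[str, Any] = field(default_factory=dict)
    parameters: dict[str, Any] = field(default_factory=dict)
    metrics: dict[str, Any] = field(default_factory=dict)


# Keys inside a bucket do not need to be declared ahead of time.
buckets = ResultBuckets(metrics={"bleu": 0.31, "custom_score": [1, 2, 3]})
as_dict = asdict(buckets)
```

The trade-off is the one discussed above: the top-level shape is enforced, but nothing restricts what goes inside each bucket.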

javiermtorres · 2025-02-11T06:42:50Z

Now I remember this issue with using BaseModel :( Thanks for trying it out anyway!

Standardize the results file across jobs

a99b339

njbrake linked an issue Feb 6, 2025 that may be closed by this pull request

Backend should return results data in a task agnostic format for frontend consumption #678

Closed

1 task

github-actions bot added backend frontend schemas Changes to schemas (which may be public facing) labels Feb 6, 2025

njbrake and others added 5 commits February 6, 2025 07:59

Merge branch 'main' into 678-backend-should-return-results-data-in-a-task-agnostic-format-for-frontend-consumption

d7aa289

Unified api for results

19d78cd

Merge branch 'main' into 678-backend-should-return-results-data-in-a-task-agnostic-format-for-frontend-consumption

8e2829b

fix the notebook utility file

933b1a1

fix sdk test

f6231cb

github-actions bot added the sdk label Feb 6, 2025

njbrake marked this pull request as ready for review February 6, 2025 17:30

njbrake requested review from aittalam, ividal and khaledosman February 6, 2025 17:30

njbrake requested review from javiermtorres and removed request for aittalam February 6, 2025 18:02

javiermtorres reviewed Feb 10, 2025

lumigator/schemas/lumigator_schemas/jobs.py (outdated)

javiermtorres reviewed Feb 10, 2025

lumigator/backend/backend/services/jobs.py (outdated)

javiermtorres reviewed Feb 10, 2025

lumigator/backend/backend/services/jobs.py

javiermtorres reviewed Feb 10, 2025

lumigator/backend/backend/services/workflows.py

javiermtorres approved these changes Feb 10, 2025

Apply suggestions from code review

bdfb5e0

Co-authored-by: javiermtorres <[email protected]> Signed-off-by: Nathan Brake <[email protected]>

njbrake commented Feb 10, 2025

lumigator/backend/backend/services/workflows.py

lumigator/backend/backend/services/jobs.py

lumigator/schemas/lumigator_schemas/jobs.py (outdated)

njbrake added 2 commits February 10, 2025 08:47

Merge remote-tracking branch 'origin/main' into 678-backend-should-return-results-data-in-a-task-agnostic-format-for-frontend-consumption

0d15f67

But make it pydantic-y

b194e43

njbrake changed the title ~~Standardize the results file across jobs~~ Jobs return standardized and flexible output Feb 10, 2025

let it be none.

1f58409

javiermtorres and others added 3 commits February 10, 2025 15:47

Merge branch 'main' into 678-backend-should-return-results-data-in-a-task-agnostic-format-for-frontend-consumption

213452f

Can't make an object a pydantic model if you don't know the fields.

72d48dc

Merge branch '678-backend-should-return-results-data-in-a-task-agnostic-format-for-frontend-consumption' of github.com:mozilla-ai/lumigator into 678-backend-should-return-results-data-in-a-task-agnostic-format-for-frontend-consumption

f7f0a11

njbrake merged commit c815c32 into main Feb 10, 2025
15 checks passed

njbrake deleted the 678-backend-should-return-results-data-in-a-task-agnostic-format-for-frontend-consumption branch February 10, 2025 16:12
