Merge branch 'main' into multiple_backends_refactor
stephenpardy authored Nov 16, 2023
2 parents 7a9e67d + e227eb5 commit a05ece4
Showing 7 changed files with 369 additions and 24 deletions.
12 changes: 6 additions & 6 deletions docs/docs-environment.yml
@@ -17,14 +17,14 @@ dependencies:

# install rubicon-ml dependencies
# so local pip install doesn't
- dash<=2.14.0,>=2.0.0
- dash<=2.14.1,>=2.0.0
- dash-bootstrap-components<=1.5.0,>=1.0.0
- fsspec<=2023.9.2,>=2021.4.0
- fsspec<=2023.10.0,>=2021.4.0
- intake[dataframe]<=0.7.0,>=0.5.2
- jsonpath-ng<=1.6.0,>=1.5.3
- numpy<=1.26.0,>=1.22.0
- pandas<=2.1.1,>=1.0.0
- numpy<=1.26.2,>=1.22.0
- pandas<=2.1.3,>=1.0.0
- prefect<=1.2.4,>=0.12.0
- pyarrow<=13.0.0,>=0.18.0
- pyarrow<=14.0.1,>=14.0.1
- PyYAML<=6.0.1,>=5.4.0
- scikit-learn<=1.3.1,>=0.22.0
- scikit-learn<=1.3.2,>=0.22.0
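
Each entry above bounds a dependency with an upper and a lower version. As a rough illustration of how the two bounds combine, the sketch below checks a version against both. `parse` and `satisfies` are hypothetical helpers that only handle plain dotted numeric versions; they do not reproduce the full matching rules of conda or pip.

```python
def parse(version):
    """Turn a dotted numeric version such as "14.0.1" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, lower, upper):
    """True when lower <= version <= upper, compared component by component."""
    return parse(lower) <= parse(version) <= parse(upper)


# pyarrow<=14.0.1,>=14.0.1 leaves a single admissible release in this model.
print(satisfies("14.0.1", lower="14.0.1", upper="14.0.1"))  # True
print(satisfies("13.0.0", lower="14.0.1", upper="14.0.1"))  # False

# numpy<=1.26.2,>=1.22.0 admits a range of releases.
print(satisfies("1.24.3", lower="1.22.0", upper="1.26.2"))  # True
```

Under this simplified model, the new pyarrow entry acts as an exact pin, while the other entries only raise their upper bounds.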
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -137,6 +137,7 @@ To install all extra modules, use the ``all`` extra.
integrations/integration-sklearn
logging-examples/logging-feature-plots
logging-examples/multiple-backend
logging-examples/manage-experiment-relationships
logging-examples/register-custom-schema
logging-examples/set-schema
logging-examples/visualizing-logged-dataframes
14 changes: 7 additions & 7 deletions environment.yml
@@ -6,23 +6,23 @@ dependencies:
- pip

- click<=8.1.7,>=7.1
- fsspec<=2023.9.2,>=2021.4.0
- fsspec<=2023.10.0,>=2021.4.0
- intake[dataframe]<=0.7.0,>=0.5.2
- jsonpath-ng<=1.6.0,>=1.5.3
- numpy<=1.26.0,>=1.22.0
- pandas<=2.1.1,>=1.0.0
- pyarrow<=13.0.0,>=0.18.0
- numpy<=1.26.2,>=1.22.0
- pandas<=2.1.3,>=1.0.0
- pyarrow<=14.0.1,>=14.0.1
- PyYAML<=6.0.1,>=5.4.0
- scikit-learn<=1.3.1,>=0.22.0
- scikit-learn<=1.3.2,>=0.22.0

# for prefect extras
- prefect<=1.2.4,>=0.12.0

# for s3fs extras
- s3fs<=2023.9.2,>=0.4
- s3fs<=2023.10.0,>=0.4

# for viz extras
- dash<=2.14.0,>=2.0.0
- dash<=2.14.1,>=2.0.0
- dash-bootstrap-components<=1.5.0,>=1.0.0

# for testing
235 changes: 235 additions & 0 deletions notebooks/logging-examples/manage-experiment-relationships.ipynb
@@ -0,0 +1,235 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "815af48c-67f4-4335-bf16-11068f7094bb",
"metadata": {},
"source": [
"# Manage Experiment Relationships\n",
"\n",
"``rubicon-ml`` experiments can be tagged with special identifiers to denote a parent/child relationship.\n",
"This can be used to track hierarchical or iterative experiments, among other things.\n",
"\n",
"First, let's create a project."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c44f140a-0d40-4919-b058-8f986dd9bcb1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<rubicon_ml.client.project.Project at 0x121d7af50>"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from rubicon_ml import Rubicon\n",
"\n",
"rubicon = Rubicon(persistence=\"memory\")\n",
"project = rubicon.create_project(name=\"hierarchical experiments\")\n",
"project"
]
},
{
"cell_type": "markdown",
"id": "d9aee6c2-8891-4a5b-98d6-37cce80bb40f",
"metadata": {},
"source": [
"## Hierarchical experiments\n",
"\n",
"Now we can log some experiments in a nested loop. Imagine logging an experiment for each node of a\n",
"gradient boosted tree, or something along those lines.\n",
"\n",
"We can use ``parent_experiment.add_child_experiment(child_experiment)`` to automatically add tags\n",
"to both ``parent_experiment`` and ``child_experiment`` that represent their relationship."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "af1bd79b-dd77-4ccb-affb-8abea69f581b",
"metadata": {},
"outputs": [],
"source": [
"root_experiment = project.log_experiment(name=\"root\")\n",
"\n",
"for n in range(3):\n",
"    node_experiment = project.log_experiment(name=f\"node_{n}\")\n",
"    root_experiment.add_child_experiment(node_experiment)\n",
"\n",
"    for m in range(2):\n",
"        nested_node_experiment = project.log_experiment(name=f\"node_{n}_{m}\")\n",
"        node_experiment.add_child_experiment(nested_node_experiment)"
]
},
{
"cell_type": "markdown",
"id": "a2f6583c-081a-4afe-ba3a-5e6b8744f274",
"metadata": {},
"source": [
"To retrieve experiments, start at the root experiment and call ``get_child_experiments`` to return a\n",
"list of ``rubicon-ml`` objects representing each of the tagged child experiments."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "30068979-44ad-4fd5-9dab-6a2dbee66078",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"id: f8c897d7-852d-4020-8715-264452a5b8ab\n",
"tags: ['child:754dff35-aa87-4385-9bcd-af5c1f5b0b7e', 'child:fdf25d72-d1bb-47bc-8cb2-4a088ed0ba33'] \n",
"\n",
"\tid: 754dff35-aa87-4385-9bcd-af5c1f5b0b7e\n",
"\ttags: ['parent:f8c897d7-852d-4020-8715-264452a5b8ab'] \n",
"\n",
"\tid: fdf25d72-d1bb-47bc-8cb2-4a088ed0ba33\n",
"\ttags: ['parent:f8c897d7-852d-4020-8715-264452a5b8ab'] \n",
"\n",
"id: 5f8c14c1-50d9-4c49-b0b2-6a59c6f3d707\n",
"tags: ['child:d9012c99-2888-43d1-833b-78d51de75a3a', 'child:c65f8c5e-bf5b-4a82-94cb-8b669545b951'] \n",
"\n",
"\tid: d9012c99-2888-43d1-833b-78d51de75a3a\n",
"\ttags: ['parent:5f8c14c1-50d9-4c49-b0b2-6a59c6f3d707'] \n",
"\n",
"\tid: c65f8c5e-bf5b-4a82-94cb-8b669545b951\n",
"\ttags: ['parent:5f8c14c1-50d9-4c49-b0b2-6a59c6f3d707'] \n",
"\n",
"id: da33e918-96c0-4e56-9075-7941515cc18f\n",
"tags: ['child:7c5b2f4b-1e7c-40be-8cda-8f3a00067e98', 'child:cd30f3b2-bd63-4318-974a-6668648bf4ac'] \n",
"\n",
"\tid: 7c5b2f4b-1e7c-40be-8cda-8f3a00067e98\n",
"\ttags: ['parent:da33e918-96c0-4e56-9075-7941515cc18f'] \n",
"\n",
"\tid: cd30f3b2-bd63-4318-974a-6668648bf4ac\n",
"\ttags: ['parent:da33e918-96c0-4e56-9075-7941515cc18f'] \n",
"\n"
]
}
],
"source": [
"for experiment in root_experiment.get_child_experiments():\n",
"    print(\"id:\", experiment.id)\n",
"    print(\"tags:\", [t for t in experiment.tags if \"child\" in t], \"\\n\")\n",
"\n",
"    for nested_experiment in experiment.get_child_experiments():\n",
"        print(\"\\tid:\", nested_experiment.id)\n",
"        print(\"\\ttags:\", nested_experiment.tags, \"\\n\")"
]
},
{
"cell_type": "markdown",
"id": "12bc18eb-0edd-4054-840b-c4b969a150fb",
"metadata": {},
"source": [
"## Iterative experiments\n",
"\n",
"We can leverage ``add_child_experiment`` to maintain iterative relationships too. This could be\n",
"used to log metadata about each iteration of recursive feature elimination and preserve the\n",
"linear history of the model training."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0f104792-7cbf-4508-b37c-11dfb158b608",
"metadata": {},
"outputs": [],
"source": [
"current_experiment = project.log_experiment(name=\"experiment_0\")\n",
"\n",
"for n in range(3):\n",
"    next_experiment = project.log_experiment(name=f\"experiment_{n+1}\")\n",
"    current_experiment.add_child_experiment(next_experiment)\n",
"\n",
"    current_experiment = next_experiment\n",
"\n",
"last_experiment = current_experiment"
]
},
{
"cell_type": "markdown",
"id": "a0f42e74-8fc2-460a-9282-ed0492639d75",
"metadata": {},
"source": [
"Similarly to ``get_child_experiments``, we can use ``get_parent_experiments`` to return a list of\n",
"``rubicon-ml`` objects representing the tagged parent experiments."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "35613e52-84f1-4d2e-8e3c-8c0f2b731d89",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"name: experiment_3\n",
"\tid: a85ba9d9-1473-44c2-b3e9-8a744534485b\n",
"\ttags: ['parent:47ecf8a7-b799-4308-994b-aa6d8698dc2b'] \n",
"\n",
"name: experiment_2\n",
"\tid: 47ecf8a7-b799-4308-994b-aa6d8698dc2b\n",
"\ttags: ['child:a85ba9d9-1473-44c2-b3e9-8a744534485b', 'parent:aea5f005-9792-442c-b98c-8d9b9e39f99b'] \n",
"\n",
"name: experiment_1\n",
"\tid: aea5f005-9792-442c-b98c-8d9b9e39f99b\n",
"\ttags: ['child:47ecf8a7-b799-4308-994b-aa6d8698dc2b', 'parent:f4a393ef-0b32-4f70-ac82-07a0877da328'] \n",
"\n",
"name: experiment_0\n",
"\tid: f4a393ef-0b32-4f70-ac82-07a0877da328\n",
"\ttags: ['child:aea5f005-9792-442c-b98c-8d9b9e39f99b'] \n",
"\n"
]
}
],
"source": [
"experiments = [last_experiment]\n",
"\n",
"while len(experiments) != 0:\n",
"    experiment = experiments[0]\n",
"\n",
"    print(\"name:\", experiment.name)\n",
"    print(\"\\tid:\", experiment.id)\n",
"    print(\"\\ttags:\", experiment.tags, \"\\n\")\n",
"\n",
"    experiments = experiment.get_parent_experiments()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
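
The tagging convention the notebook above walks through can be sketched without rubicon-ml installed. The `SketchExperiment` class below is a hypothetical stand-in, not the library's `Experiment`; it only mimics the `child:<id>` and `parent:<id>` tags visible in the notebook output, and its class-level `registry` dictionary is an assumed substitute for looking experiments up through a project.

```python
import uuid


class SketchExperiment:
    """Hypothetical stand-in for an experiment; not part of rubicon-ml."""

    registry = {}  # maps experiment id -> SketchExperiment, standing in for a project

    def __init__(self, name):
        self.name = name
        self.id = str(uuid.uuid4())
        self.tags = []
        SketchExperiment.registry[self.id] = self

    def add_child_experiment(self, child):
        # Tag both sides so the link can be followed in either direction.
        self.tags.append(f"child:{child.id}")
        child.tags.append(f"parent:{self.id}")

    def _related(self, key):
        prefix = f"{key}:"
        return [
            SketchExperiment.registry[tag[len(prefix):]]
            for tag in self.tags
            if tag.startswith(prefix)
        ]

    def get_child_experiments(self):
        return self._related("child")

    def get_parent_experiments(self):
        return self._related("parent")


# Build a linear chain: experiment_0 -> experiment_1 -> experiment_2 -> experiment_3.
current = SketchExperiment("experiment_0")
for n in range(3):
    following = SketchExperiment(f"experiment_{n + 1}")
    current.add_child_experiment(following)
    current = following

# Walk back up the chain from the last experiment to the first.
history = []
frontier = [current]
while frontier:
    experiment = frontier[0]
    history.append(experiment.name)
    frontier = experiment.get_parent_experiments()

print(history)  # ['experiment_3', 'experiment_2', 'experiment_1', 'experiment_0']
```

Because the relationship lives entirely in tags, no extra storage is needed beyond what an experiment already persists; the trade-off is that following a link requires a lookup by id.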
66 changes: 65 additions & 1 deletion rubicon_ml/client/experiment.py
@@ -1,6 +1,6 @@
from __future__ import annotations

from typing import TYPE_CHECKING
from typing import TYPE_CHECKING, List

from rubicon_ml import domain
from rubicon_ml.client import (
@@ -14,6 +14,7 @@
)
from rubicon_ml.client.utils.exception_handling import failsafe
from rubicon_ml.client.utils.tags import filter_children
from rubicon_ml.exceptions import RubiconException

if TYPE_CHECKING:
    from rubicon_ml.client import Project
@@ -360,6 +361,69 @@ def parameter(self, name=None, id=None):
        else:
            return [p for p in self.parameters() if p.id == id][0]

    def add_child_experiment(self, experiment: Experiment):
        """Add tags to denote `experiment` as a descendent of this experiment.

        Parameters
        ----------
        experiment : rubicon_ml.client.Experiment
            The experiment to mark as a descendent of this experiment.

        Raises
        ------
        RubiconException
            If `experiment` and this experiment are not logged to the same project.
        """
        if experiment.project.id != self.project.id:
            raise RubiconException(
                "Descendents must be logged to the same project. Project "
                f"{experiment.project.id} does not match project {self.project.id}."
            )

        child_tag = f"child:{experiment.id}"
        parent_tag = f"parent:{self.id}"

        self.add_tags([child_tag])
        experiment.add_tags([parent_tag])

    def _get_experiments_from_tags(self, tag_key: str):
        """Get the experiments with `experiment_id`s in this experiment's tags
        that match the format `tag_key:experiment_id`.

        Returns
        -------
        list of rubicon_ml.client.Experiment
            The experiments with `experiment_id`s in this experiment's tags.
        """
        experiments = []

        for tag in self.tags:
            if f"{tag_key}:" in tag:
                experiment_id = tag.split(":")[-1]
                experiments.append(self.project.experiment(id=experiment_id))

        return experiments

    def get_child_experiments(self) -> List[Experiment]:
        """Get the experiments that are tagged as children of this experiment.

        Returns
        -------
        list of rubicon_ml.client.Experiment
            The experiments that are tagged as children of this experiment.
        """
        return self._get_experiments_from_tags("child")

    def get_parent_experiments(self) -> List[Experiment]:
        """Get the experiments that are tagged as parents of this experiment.

        Returns
        -------
        list of rubicon_ml.client.Experiment
            The experiments that are tagged as parents of this experiment.
        """
        return self._get_experiments_from_tags("parent")

    @property
    def id(self):
        """Get the experiment's id."""
Expand Down
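
`_get_experiments_from_tags` in the diff above selects tags with a substring test (`f"{tag_key}:" in tag`) and takes the text after the last colon as the id. A minimal sketch of a stricter parse follows, assuming tags use the `tag_key:experiment_id` format named in the docstring. `related_ids` is a hypothetical helper for illustration and is not part of rubicon-ml.

```python
def related_ids(tags, tag_key):
    """Return the ids from tags of the form `tag_key:experiment_id`."""
    prefix = f"{tag_key}:"
    # startswith avoids matching keys that merely contain tag_key,
    # for example "grandchild:..." when tag_key is "child".
    return [tag[len(prefix):] for tag in tags if tag.startswith(prefix)]


tags = ["child:aaa", "parent:bbb", "grandchild:ccc", "baseline"]

print(related_ids(tags, "child"))   # ['aaa']
print(related_ids(tags, "parent"))  # ['bbb']
```

With the substring test, the `grandchild:ccc` tag in this example would also be picked up for `tag_key="child"`; whether that matters depends on which other tags a project uses.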