doc(katib): update example and description for Push MC.
Signed-off-by: Electronic-Waste <[email protected]>
Electronic-Waste committed Sep 4, 2024
1 parent 0ed3a66 commit 05975ad
Showing 1 changed file with 17 additions and 36 deletions.
53 changes: 17 additions & 36 deletions content/en/docs/components/katib/user-guides/metrics-collector.md
@@ -37,11 +37,10 @@ To define the pull-based metrics collector for your Experiment:
be `timestamp`:

```json
{"epoch": 0, "foo": “bar", “fizz": “buzz", "timestamp": 1638422847.28721…}
{"epoch": 1, "foo": “bar", “fizz": “buzz", "timestamp": 1638422847.287801…}
{"epoch": 2, "foo": “bar", “fizz": “buzz", "timestamp": "2021-12-02T14:27:50.000035161+09:00"…}
{"epoch": 3, "foo": “bar", “fizz": “buzz", "timestamp": "2021-12-02T14:27:50.000037459+09:00"…}
{"epoch": 0, "foo": "bar", "fizz": "buzz", "timestamp": "2021-12-02T14:27:51"}
{"epoch": 1, "foo": "bar", "fizz": "buzz", "timestamp": "2021-12-02T14:27:52"}
{"epoch": 2, "foo": "bar", "fizz": "buzz", "timestamp": "2021-12-02T14:27:53"}
{"epoch": 3, "foo": "bar", "fizz": "buzz", "timestamp": "2021-12-02T14:27:54"}
```
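
As a rough illustration of the format above, the following sketch builds one metrics line with a `timestamp` key in the same second-resolution layout as the updated example. The helper name `metric_line` and its parameters are hypothetical and are not part of Katib.

```python
import json
from datetime import datetime, timezone


def metric_line(epoch, metrics, when=None):
    """Return one JSON line holding the metrics plus a `timestamp` key.

    Hypothetical helper for illustration; Katib does not ship this function.
    """
    when = when or datetime.now(timezone.utc)
    record = {"epoch": epoch, **metrics}
    # Same layout as the updated example lines: YYYY-MM-DDTHH:MM:SS.
    record["timestamp"] = when.strftime("%Y-%m-%dT%H:%M:%S")
    return json.dumps(record)


line = metric_line(0, {"foo": "bar", "fizz": "buzz"}, datetime(2021, 12, 2, 14, 27, 51))
print(line)
```

A training script would append one such line per reporting step to the file that the metrics collector watches.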

Check the file metrics collector example for [`TEXT`](https://github.com/kubeflow/katib/blob/ea46a7f2b73b2d316b6b7619f99eb440ede1909b/examples/v1beta1/metrics-collector/file-metrics-collector.yaml#L14-L24)
@@ -87,15 +86,18 @@ To define the pull-based metrics collector for your Experiment:

## Push-based Metrics Collector

Your training code needs to call [`report_metrics`](https://github.com/kubeflow/katib/blob/master/sdk/python/v1beta1/kubeflow/katib/api/report_metrics.py#L26) function in Python SDK to record metrics.
Your training code needs to call the [`report_metrics()`](https://github.com/kubeflow/katib/blob/e251a07cb9491e2d892db306d925dddf51cb0930/sdk/python/v1beta1/kubeflow/katib/api/report_metrics.py#L26) function from the Python SDK to record metrics.
The `report_metrics()` function packs the values passed in its `metrics` argument into a gRPC request, adds the current timestamp automatically, and sends the request to the Katib DB Manager.

Before calling it, install the `kubeflow-katib` package in your training container.
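
To make the description of `report_metrics()` more concrete, here is a rough sketch of the first two steps it names: turning a metrics dictionary into log entries and stamping them with the current time. `build_metric_logs` and the dictionary layout are hypothetical stand-ins, not Katib's actual protobuf messages, and the gRPC call to the Katib DB Manager is left out entirely.

```python
from datetime import datetime, timezone


def build_metric_logs(metrics, now=None):
    """Hypothetical stand-in: turn {"name": value} into timestamped log entries."""
    now = now or datetime.now(timezone.utc)
    # One timestamp is shared by every metric reported in the same call.
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        {"time_stamp": timestamp, "metric": {"name": name, "value": str(value)}}
        for name, value in metrics.items()
    ]


logs = build_metric_logs({"result": 40}, datetime(2024, 9, 4, tzinfo=timezone.utc))
print(logs)
```

The point of the sketch is that the caller supplies only metric names and values; the timestamp is added for them.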

To define the push-based metrics collector for your Experiment, you have two options:

- YAML File

1. Specify the collector type `Push` in the `.collector.kind` field.

2. Write code in your training container to call `report_metrics` to report metrics.
2. Write code in your training container to call `report_metrics()` to report metrics.

- [`tune`](https://github.com/kubeflow/katib/blob/master/sdk/python/v1beta1/kubeflow/katib/api/katib_client.py#L166) function

@@ -104,44 +106,23 @@ To define the push-based metrics collector for your Experiment, you have two opt
```
import kubeflow.katib as katib
# Step 1. Create an objective function with push-based metrics collection.
def objective(parameters):
    # Import required packages.
    import time
    import kubeflow.katib as katib
    time.sleep(5)
    # Calculate objective function.
    result = 4 * int(parameters["a"]) - float(parameters["b"]) ** 2
    result = 4 * int(parameters["a"])
    # Push metrics to Katib DB.
    katib.report_metrics({"result": result})
# Step 2. Create HyperParameter search space.
parameters = {
    "a": katib.search.int(min=10, max=20),
    "b": katib.search.double(min=0.1, max=0.2)
}
# Step 3. Create Katib Experiment with 4 Trials and 2 CPUs per Trial.
# We choose to install the latest changes of Python SDK because `report_metrics` has not been supported yet.
# Thus, the base image must have `git` command to download the package.
katib_client = katib.KatibClient(namespace="kubeflow")
name = "tune-experiment"
katib_client.tune(
    name=name,
katib.KatibClient(namespace="kubeflow").tune(
    name="push-metrics-exp",
    objective=objective,
    parameters=parameters,
    base_image="electronicwaste/push-metrics-collector:v0.0.9", # python:3.11-slim + git
    parameters={"a": katib.search.int(min=10, max=20)},
    objective_metric_name="result",
    max_trial_count=4,
    resources_per_trial={"cpu": "2"},
    packages_to_install=["git+https://github.com/kubeflow/katib.git@master#subdirectory=sdk/python/v1beta1"],
    # packages_to_install=["kubeflow-katib==0.18.0"],
    max_trial_count=2,
    metrics_collector_config={"kind": "Push"},
    # When SDK is released, replace it with packages_to_install=["kubeflow-katib==0.18.0"].
    # Currently, the training container should have `git` package to install this SDK.
    packages_to_install=["git+https://github.com/kubeflow/katib.git@master#subdirectory=sdk/python/v1beta1"],
)
# Step 4. Wait until Katib Experiment is complete
katib_client.wait_for_experiment_condition(name=name)
# Step 5. Get the best HyperParameters.
print(katib_client.get_optimal_hyperparameters(name))
```
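
Read together, the lines this hunk appears to add suggest an example along the following lines. This is a sketch under assumptions, not the committed text: this view does not mark which neighboring lines were removed, so only the `kubeflow.katib` import is kept inside `objective`, and the `run_experiment` wrapper is a hypothetical addition that lets the file be loaded without a cluster. All argument names and values passed to `tune()` are taken from the hunk itself.

```python
def objective(parameters):
    # Assumption: the function should be self-contained, because tune()
    # ships it to the Trial container, so the import lives in the body.
    import kubeflow.katib as katib

    result = 4 * int(parameters["a"])
    # Push metrics to Katib DB.
    katib.report_metrics({"result": result})


def run_experiment():
    # Hypothetical wrapper: needs the kubeflow-katib SDK and a cluster with Katib.
    import kubeflow.katib as katib

    katib.KatibClient(namespace="kubeflow").tune(
        name="push-metrics-exp",
        objective=objective,
        parameters={"a": katib.search.int(min=10, max=20)},
        objective_metric_name="result",
        max_trial_count=2,
        metrics_collector_config={"kind": "Push"},
        # Installs the SDK from the master branch; the container needs `git`.
        packages_to_install=[
            "git+https://github.com/kubeflow/katib.git@master#subdirectory=sdk/python/v1beta1"
        ],
    )
```

Calling `run_experiment()` from a machine with cluster access would submit the Experiment; nothing is sent until that call is made.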
