implement more useful alerts on chainhook node health #466

lgalabru · 2023-12-22T21:43:00Z

Now that the ping endpoint reports the Bitcoin / Stacks block heights, the alerting should be revisited to take this into account.

lgalabru · 2023-12-22T21:45:32Z

smcclellan · 2024-01-02T21:38:35Z

@lgalabru to indicate which alarms are needed on this repo.

MicaiahReid · 2024-01-09T21:20:37Z

Just adding a quick follow-up here: I'm taking on this issue. We'll be adding alerts to Grafana that will notify us if Chainhook block ingestion falls behind block production for either the Stacks or Bitcoin nodes.

This requires us to update Chainhook to emit Prometheus metrics. A PR will be incoming soon!
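For anyone following along, here is a rough sketch of the kind of check such an alert boils down to, assuming it compares the node's tip height against the height Chainhook last ingested. The function names, parameters, and the default threshold of 3 blocks are illustrative assumptions, not the actual Grafana rule or anything Chainhook defines.

```python
def blocks_behind(node_tip_height: int, ingested_height: int) -> int:
    """Return how many blocks ingestion trails the node tip (never negative)."""
    return max(0, node_tip_height - ingested_height)


def should_alert(node_tip_height: int, ingested_height: int, max_lag: int = 3) -> bool:
    """Return True when ingestion lags the node tip by more than max_lag blocks.

    max_lag is a hypothetical threshold chosen for illustration only.
    """
    return blocks_behind(node_tip_height, ingested_height) > max_lag
```

The same comparison would be run separately for the Stacks and Bitcoin heights, since either chain can fall behind independently.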

### Description

To enable improved alerts on downtime for Hiro's hosted Chainhook service, we need Chainhook to provide metrics that can be ingested by Prometheus. This PR changes some of how we track our metrics (that are served over the `/ping` endpoint of the observer) to enable Prometheus compatibility, and adds a flag to optionally start a server to supply metrics to a Prometheus client.

### Example

Starting chainhook with the `--prometheus-port XXXX` flag now enables a service that can supply Prometheus metrics at `localhost:XXXX/metrics`. If using a config file, this option can be specified via:

```toml
[monitoring]
prometheus_monitoring_port = XXXX
```

Chainhook will behave as usual with this flag omitted: metrics can still be retrieved via the observer's `/ping` endpoint, but they will not be formatted for ingestion by a Prometheus client.

---

### Checklist

- [X] All tests pass
- [X] Tests added in this PR (if applicable)

Fixes #474, addresses #466
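As a minimal sketch of what consuming the `/metrics` output could look like, the snippet below parses unlabelled samples from the Prometheus text exposition format. The metric names in `SAMPLE` are hypothetical placeholders; the names Chainhook actually exports are not stated in this thread.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse unlabelled samples from Prometheus text exposition format.

    Comment lines (# HELP / # TYPE), blank lines, and labelled samples are skipped.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        # parts[0] is the metric name, parts[1] its value;
        # an optional trailing timestamp is ignored.
        samples[parts[0]] = float(parts[1])
    return samples


# Hypothetical payload: these metric names are placeholders, not Chainhook's.
SAMPLE = """\
# HELP example_stacks_block_height Hypothetical metric name.
# TYPE example_stacks_block_height gauge
example_stacks_block_height 135000
example_bitcoin_block_height 825000
"""

metrics = parse_metrics(SAMPLE)
```

In practice the text would come from an HTTP GET against `localhost:XXXX/metrics`, and a Prometheus server would normally do this scraping itself; the parser is only meant to show the shape of the data.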

MicaiahReid · 2024-02-12T15:24:54Z

Closing, because everything is complete on the chainhook side with PR #473. The devops side is being tracked by https://github.com/hirosystems/devops/issues/1543

github-project-automation bot added this to DevTools Dec 22, 2023

github-project-automation bot moved this to 🆕 New in DevTools Dec 22, 2023

smcclellan moved this from 🆕 New to 📋 Backlog in DevTools Jan 2, 2024

smcclellan assigned lgalabru Jan 2, 2024

MicaiahReid assigned MicaiahReid and unassigned lgalabru Jan 9, 2024

MicaiahReid moved this from 📋 Backlog to 🏗 In Progress in DevTools Jan 9, 2024

MicaiahReid changed the title ~~Revisit alarms~~ implement more useful alerts on chainhook node health Jan 9, 2024

MicaiahReid mentioned this issue Jan 11, 2024

feat: optionally serve Prometheus metrics #473

Merged

2 tasks

smcclellan added this to the Q1-2024 milestone Jan 19, 2024

smcclellan moved this from 🏗 In Progress to 📋 Backlog in DevTools Jan 25, 2024

MicaiahReid moved this from 📋 Backlog to 🚢 Ready to Release in DevTools Feb 12, 2024

MicaiahReid moved this from 🚢 Ready to Release to ✅ Done in DevTools Feb 12, 2024

MicaiahReid closed this as completed Feb 12, 2024
