Add Prometheus Metrics #675
Comments
Yes. I'd probably prefer your "nag2prom" idea +docs, to avoid cluttering
the code--at least at first.
On September 20, 2021 09:19:02 Ben Yanke wrote:
Would you be open to either a PR, or documentation on how to use sanoid
--monitoring-* flags but with prometheus? Prometheus is a growing
monitoring tool that many sysadmins are using, including myself.
I'm thinking potential implementation could either be adding three new
flags like --monitor-capacity-prometheus, or documentation on using a third
party tool that could convert nagios format metrics into prometheus, like
this (nag2prom doesn't exist yet, just an example):
sanoid --monitor-health | nag2prom >
/var/lib/prometheus/node-exporter/sanoid-metrics.prom
Would you be open to a code or docs PR like this?
ok - I'll play with that.
@benyanke you should look into zpool_influxdb (included with recent zfs versions), which can be scraped by prometheus via telegraf
@benyanke have you made any progress on this?
Sadly not. My perl is not great, so despite it being somewhat simple, I've not made much progress.
@jimsalterjrs would you be open to a PR that just added a I can see the value in keeping the code uncluttered, so I'm happy to keep the Prometheus-specific stuff in a separate script, but I would prefer a structured way to extract the variables over parsing command-line output.
@Hooloovoo that sounds fine, as long as it's implemented cleanly.
I like that idea, allows far more flexible integration into any monitoring stack, not just Prometheus.
I have started to put something together in: This is my first time ever coding in Perl, so I have likely made silly mistakes. So far I have only done the I have the bones of a simple Python script that uses the Prometheus Python library to write the metrics in text format so that they can be picked up by the textfile collector I already have running on the nodes. I was planning to hold off on submitting an MP until I have actually made this all work, but I wanted to mention it to avoid anyone else duplicating the work. At the moment I have taken the approach of exposing only the metrics and the Sanoid configuration; I have not exported Sanoid's calculations of whether those metrics should result in a warning or critical, as I think this should be possible within e.g. the Prometheus alert (so e.g. the alert would trigger a critical if
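For anyone following along, here is a minimal sketch of the textfile-collector approach described above. It is not the script mentioned in this comment: it writes the text format by hand with only the standard library instead of using the Prometheus Python library, and the metric names, labels, and values are all made up for illustration. It also assumes label values contain no quotes, backslashes, or newlines, which would need escaping.

```python
import os
import tempfile


def render_gauge(name, help_text, samples):
    """Render one gauge family; samples is a list of (labels_dict, value)."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


def write_textfile(path, text):
    """Write via a temp file and rename, so a scrape never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp, 0o644)  # mkstemp creates 0600; the exporter must be able to read it
    os.replace(tmp, path)


if __name__ == "__main__":
    # Hypothetical metric name and made-up sample data, for illustration only.
    labels = {"dataset": "tank/data", "type": "hourly"}
    print(render_gauge(
        "sanoid_snapshot_newest_age_seconds",
        "Age of the newest snapshot in seconds.",
        [(labels, 1200)],
    ), end="")
```

Exposing a raw age alongside the configured threshold as a second gauge would leave the warning/critical decision to the alert rule, which matches the approach described in this comment.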
I have uploaded my simple Python script here:
I have put up a merge proposal that outputs JSON for snapshot information: So far this MP only deals with the snapshot information, as this is what really needs to come from Sanoid. There are other ways to extract the zpool health and capacity (I use a free Grafana Cloud account and recent versions of grafana-agent export metrics like I have designed the output format and code to accommodate someone (maybe me) adding the output of
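To illustrate why structured output is easier to consume than command-line text, here is a rough sketch of flattening a JSON document into metric samples. The JSON layout used here (dataset, then snapshot type, then numeric fields) is purely an assumption for the example and is not the format of the merge proposal; the `sanoid_snapshot_` metric prefix and the field names are likewise hypothetical.

```python
import json


def samples_from_json(document):
    """Flatten {"dataset": {"type": {"field": number}}} into metric samples.

    Returns a list of (metric_name, labels, value) tuples. The input layout
    is an assumed example schema, not the one Sanoid emits.
    """
    data = json.loads(document)
    samples = []
    for dataset, types in sorted(data.items()):
        for snap_type, fields in sorted(types.items()):
            for field, value in sorted(fields.items()):
                # Only numeric fields become metrics; bool is excluded explicitly
                # because it is a subclass of int in Python.
                if isinstance(value, bool) or not isinstance(value, (int, float)):
                    continue
                labels = {"dataset": dataset, "type": snap_type}
                samples.append((f"sanoid_snapshot_{field}", labels, value))
    return samples


if __name__ == "__main__":
    # Made-up input for illustration only.
    example = '{"tank/data": {"hourly": {"newest_age_seconds": 1200}}}'
    for name, labels, value in samples_from_json(example):
        print(name, labels, value)
```

No regular expressions or column counting are needed, so a change in the human-readable output would not break the consumer.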
Would you be open to either a PR, or documentation on how to use sanoid --monitoring-* flags but with prometheus? Prometheus is a growing monitoring tool that many sysadmins are using, including myself.
I'm thinking potential implementation could either be adding three new flags like --monitor-capacity-prometheus, or documentation on using a third party tool that could convert nagios format metrics into prometheus, like this (nag2prom doesn't exist yet, just an example):
sanoid --monitor-health | nag2prom > /var/lib/prometheus/node-exporter/sanoid-metrics.prom
Would you be open to a code or docs PR like this?
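As a rough idea of what such a filter could look like, here is a minimal sketch of a hypothetical nag2prom. It assumes only the general Nagios plugin convention that the status line begins with a state word (OK, WARNING, CRITICAL, UNKNOWN); it does not parse any performance data or the rest of sanoid's message, and the metric name `sanoid_monitor_status` and the `check` label are invented for the example.

```python
#!/usr/bin/env python3
"""nag2prom: hypothetical filter from a Nagios-style status line to Prometheus text."""
import sys

# Standard Nagios plugin states and their conventional exit codes.
NAGIOS_STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def nagios_to_prom(line, check):
    """Convert one status line to Prometheus text exposition format.

    Assumes the line starts with the state word (e.g. "OK ..." or
    "CRITICAL: ..."); anything else is reported as UNKNOWN (3).
    """
    words = line.split()
    first = words[0].rstrip(":").upper() if words else ""
    state = NAGIOS_STATES.get(first, NAGIOS_STATES["UNKNOWN"])
    name = "sanoid_monitor_status"  # hypothetical metric name
    return (
        f"# HELP {name} Nagios-style state (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN).\n"
        f"# TYPE {name} gauge\n"
        f'{name}{{check="{check}"}} {state}\n'
    )


if __name__ == "__main__":
    # The optional first argument names the check; it becomes a label value.
    check = sys.argv[1] if len(sys.argv) > 1 else "health"
    sys.stdout.write(nagios_to_prom(sys.stdin.readline(), check))
```

With this saved as an executable named nag2prom, the pipeline above would work unchanged, though redirecting straight into the .prom file risks the collector reading a half-written file; writing to a temporary name and renaming it is safer.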