Commit

[sophora-server] Adjust alert "SophoraServerAPISlow" (#141)
* [sophora-server] Adjust alert "SophoraServerAPISlow"

* [sophora-server] Use variable in runbook for alert "SophoraServerAPISlow"
muffl0n authored Jan 23, 2025
1 parent e67f23f commit f1db591
Showing 3 changed files with 5 additions and 5 deletions.
2 changes: 1 addition & 1 deletion charts/sophora-server/Chart.yaml
@@ -15,7 +15,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 2.5.2
+version: 2.6.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
4 changes: 2 additions & 2 deletions charts/sophora-server/alerting-runbook.md
@@ -21,7 +21,7 @@ This document is a reference to the alerts this Helm chart can fire.

**Severity:** high

-**Summary:** The API of the server exhibits a response time exceeding 300ms for more than 15 minutes at the 95th percentile.
+**Summary:** The API of the server exhibits a response time exceeding ${threshold} for more than 15 minutes at the 95th percentile.

**Remediation steps:**

@@ -105,4 +105,4 @@ This document is a reference to the alerts this Helm chart can fire.
* Check if the primary server is running
* Check the logs of the server
* Check the logs of the primary server
-* Check whether there are any network issues
+* Check whether there are any network issues
4 changes: 2 additions & 2 deletions charts/sophora-server/templates/prometheusrule.yaml
@@ -21,12 +21,12 @@ spec:
runbook_url: 'https://github.com/subshell/helm-charts/blob/main/charts/sophora-server/alerting-runbook.md'
- alert: SophoraServerAPISlow
for: 15m
-expr: 'histogram_quantile(0.95, sum(rate(sophora_server_contentmanager_call_duration_seconds_bucket{job="{{ include "sophora-server.fullname" . }}"}[1m])) by (pod, le)) > 0.3'
+expr: 'histogram_quantile(0.95, sum(rate(sophora_server_contentmanager_call_duration_seconds_bucket{job="{{ include "sophora-server.fullname" . }}"}[1m])) by (pod, le)) > 0.5'
labels:
severity: high
annotations:
summary: Sophora Server API is slow
-description: The API of the server "{{`{{ $labels.pod }}`}}" exhibits a response time exceeding 300ms for more than 15 minutes at the 95th percentile.
+description: The API of the server "{{`{{ $labels.pod }}`}}" exhibits a response time exceeding 500ms for more than 15 minutes at the 95th percentile.
runbook_url: 'https://github.com/subshell/helm-charts/blob/main/charts/sophora-server/alerting-runbook.md'
- alert: SophoraServerAsyncEventQueueBlocked
for: 10m
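
Note on the threshold change: the `SophoraServerAPISlow` expression compares an estimated 95th percentile against a fixed number of seconds, so raising `0.3` to `0.5` silences pods whose p95 sits between the two values. The sketch below is a rough Python approximation of how `histogram_quantile` interpolates inside a bucket. It is not Prometheus code: the bucket boundaries and counts are made up for illustration, and plain cumulative counts stand in for the per-second rates that `rate(...[1m])` would produce.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: iterable of (upper_bound_seconds, cumulative_count) pairs,
    which must include a +Inf bucket. Simplified sketch of the linear
    interpolation Prometheus applies; assumes 0 < q <= 1.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break

    if upper == math.inf:
        # The quantile falls into the +Inf bucket: report the highest
        # finite upper bound instead.
        return buckets[-2][0] if len(buckets) > 1 else math.nan

    # Interpolate linearly between the bucket's lower and upper bound.
    lower, prev = (0.0, 0.0) if i == 0 else buckets[i - 1]
    return lower + (upper - lower) * (rank - prev) / (count - prev)


# Hypothetical bucket counts for a single pod (not taken from a real server).
buckets = [(0.1, 60), (0.25, 90), (0.5, 100), (1.0, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, buckets)

print(f"p95 = {p95:.3f}s")
print("fires with old threshold (0.3):", p95 > 0.3)
print("fires with new threshold (0.5):", p95 > 0.5)
```

With these made-up counts the estimate lands between the old and the new threshold, so the condition would have held before this commit and no longer holds after it. The alert itself additionally requires the condition to hold for the full `for: 15m` window.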