feat: remove workload detail templates #3433

Merged
merged 1 commit into from
Jan 28, 2025
20 changes: 20 additions & 0 deletions cmd/poller/collector/helpers.go
@@ -31,6 +31,17 @@ import (
 	"strings"
 )

+var ExcludeTemplates = map[string]map[string]struct{}{
+	"ZapiPerf": {
+		"workload_detail":        {},
+		"workload_detail_volume": {},
+	},
+	"RestPerf": {
+		"api/cluster/counter/tables/qos_detail":        {},
+		"api/cluster/counter/tables/qos_detail_volume": {},
+	},
+}
+
 // ImportTemplate looks for a collector's template by searching confPaths for the first template that exists in
 // confPath/collectorName/templateName
 func ImportTemplate(confPaths []string, templateName, collectorName string) (*node.Node, error) {
@@ -135,6 +146,15 @@ nextFile:
 		}
 	}

+	if finalTemplate != nil {
+		if queries, exists := ExcludeTemplates[c.Name]; exists {
+			templateQuery := finalTemplate.GetChildContentS("query")
+			if _, ok := queries[templateQuery]; ok {
+				return nil, "", fmt.Errorf("%w: template '%s' does not support query '%s' in template '%s'", errs.ErrTemplateNotSupported, c.Object, templateQuery, filename)
+			}
+		}
+	}
+
 	return finalTemplate, templatePath, err
 }

47 changes: 0 additions & 47 deletions cmd/tools/generate/counter.yaml
@@ -530,53 +530,6 @@ counters:
       - API: ZAPI
         Unit: b_per_sec

-  - Name: qos_detail_resource_latency
-    Description: |
-      This refers to the average latency for workloads within the subsystems of Data ONTAP. These subsystems are the various modules or components within the system that could contribute to delays or latency during data or task processing. The calculated latency includes both the processing time within the subsystem and the waiting time at that subsystem. Below is the description of subsystems' latency.
-
-      * **frontend**: Represents the delays in the network layer of ONTAP.
-      * **backend**: Represents the delays in the data/WAFL layer of ONTAP.
-      * **cluster**: Represents delays caused by the cluster switches, cables, and adapters which physically connect clustered nodes.If the cluster interconnect component is in contention, it means high wait time for I/O requests at the cluster interconnect is impacting the latency of one or more workloads.
-      * **cp**: Represents delays due to buffered write flushes, called consistency points (cp).
-      * **disk**: Represents slowness due to attached hard drives or solid state drives.
-      * **network**: `Note:` Typically these latencies only apply to SAN not NAS. Represents the wait time of I/O requests by the external networking protocols on the cluster. The wait time is time spent waiting for transfer ready transactions to finish before the cluster can respond to an I/O request. If the network component is in contention, it means high wait time at the protocol layer is impacting the latency of one or more workloads.
-      * **nvlog**: Represents delays due to mirroring writes to the NVRAM/NVLOG memory and to the HA partner NVRAM/NVLOG memory.
-      * **suspend**: Represents delays due to operations suspending on a delay mechanism. Typically this is diagnosed by NetApp Support.
-      * **throttle**: Represents the throughput maximum (ceiling) setting of the storage Quality of Service (QoS) policy group assigned to the workload. If the policy group component is in contention, it means all workloads in the policy group are being throttled by the set throughput limit, which is impacting the latency of one or more of those workloads.
-      * **qos_min**: Represents the latency to a workload that is being caused by QoS throughput floor (expected) setting assigned to other workloads. If the QoS floor set on certain workloads use the majority of the bandwidth to guarantee the promised throughput, other workloads will be throttled and see more latency.
-      * **cloud**: Represents the software component in the cluster involved with I/O processing between the cluster and the cloud tier on which user data is stored. If the cloud latency component is in contention, it means that a large amount of reads from volumes that are hosted on the cloud tier are impacting the latency of one or more workloads.
-    APIs:
-      - API: REST
-        Endpoint: api/cluster/counter/tables/qos_detail
-        ONTAPCounter: Harvest generated
-        Template: conf/restperf/9.12.0/workload_detail.yaml
-        Unit: microseconds
-        Type: average
-        BaseCounter: ops
-      - API: ZAPI
-        Endpoint: perf-object-get-instances workload_detail
-        ONTAPCounter: Harvest generated
-        Template: conf/zapiperf/9.12.0/workload_detail.yaml
-        Unit: microseconds
-        Type: average
-        BaseCounter: ops
-
-  - Name: qos_detail_ops
-    Description: This field is the workload's rate of operations that completed during the measurement interval measured per second.
-    APIs:
-      - API: REST
-        Endpoint: api/cluster/counter/tables/qos, api/cluster/counter/tables/qos_volume
-        ONTAPCounter: ops
-        Template: conf/restperf/9.12.0/workload_detail.yaml
-        Unit: per_sec
-        Type: rate
-      - API: ZAPI
-        Endpoint: perf-object-get-instances workload, workload_volume
-        ONTAPCounter: ops
-        Template: conf/zapiperf/9.12.0/workload_detail.yaml
-        Unit: per_sec
-        Type: rate
-
   - Name: quota_disk_limit
     Description: Maximum amount of disk space, in kilobytes, allowed for the quota target
       (hard disk space limit). The value is -1 if the limit is unlimited.
24 changes: 11 additions & 13 deletions cmd/tools/grafana/dashboard_test.go
@@ -270,19 +270,17 @@ func TestUnitsAndExprMatch(t *testing.T) {

 	// Exceptions are meant to reduce false negatives
 	allowedSuffix := map[string][]string{
-		"_count":                          {"none", "short", "locale"},
-		"_lag_time":                       {"", "s", "short"},
-		"qos_detail_service_time_latency": {"µs", "percent"},
-		"qos_detail_resource_latency":     {"µs", "percent"},
-		"volume_space_physical_used":      {"bytes", "binBps"}, // Growth rate uses bytes/sec unit
-		"volume_space_logical_used":       {"bytes", "binBps"}, // Growth rate uses bytes/sec unit
-		"qos_ops":                         {"iops", "percent"},
-		"qos_total_data":                  {"Bps", "percent"},
-		"aggr_space_used":                 {"bytes", "percent"},
-		"volume_size_used":                {"bytes", "percent"},
-		"shelf_power":                     {"watt", "watth"},
-		"environment_sensor_power":        {"watt", "watth"},
-		"volume_num_compress_fail":        {"percent", "short"},
+		"_count":                     {"none", "short", "locale"},
+		"_lag_time":                  {"", "s", "short"},
+		"volume_space_physical_used": {"bytes", "binBps"}, // Growth rate uses bytes/sec unit
+		"volume_space_logical_used":  {"bytes", "binBps"}, // Growth rate uses bytes/sec unit
+		"qos_ops":                    {"iops", "percent"},
+		"qos_total_data":             {"Bps", "percent"},
+		"aggr_space_used":            {"bytes", "percent"},
+		"volume_size_used":           {"bytes", "percent"},
+		"shelf_power":                {"watt", "watth"},
+		"environment_sensor_power":   {"watt", "watth"},
+		"volume_num_compress_fail":   {"percent", "short"},
 	}

 	// Normalize rates to their base unit
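The diff does not show how `TestUnitsAndExprMatch` consults `allowedSuffix`, so the following is only a sketch of one way such an exception table could be used, assuming a metric is exempt when its name ends with a listed suffix and the panel unit appears in that suffix's list. `unitAllowed` and the metric names in `main` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedSuffix is a subset of the post-merge table from the diff:
// metric-name suffix -> units that are accepted as exceptions.
var allowedSuffix = map[string][]string{
	"_count":    {"none", "short", "locale"},
	"_lag_time": {"", "s", "short"},
	"qos_ops":   {"iops", "percent"},
}

// unitAllowed is a hypothetical helper. It reports whether unit is an
// accepted exception for metric under any suffix that matches its name.
func unitAllowed(metric, unit string) bool {
	for suffix, units := range allowedSuffix {
		if !strings.HasSuffix(metric, suffix) {
			continue
		}
		for _, u := range units {
			if u == unit {
				return true
			}
		}
	}
	return false
}

func main() {
	// Metric names here are illustrative only.
	fmt.Println(unitAllowed("qos_ops", "percent"))
	fmt.Println(unitAllowed("qos_ops", "bytes"))
}
```

Because every matching suffix is checked before returning false, the result does not depend on Go's randomized map iteration order.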
71 changes: 0 additions & 71 deletions conf/restperf/9.12.0/workload_detail.yaml

This file was deleted.

69 changes: 0 additions & 69 deletions conf/restperf/9.12.0/workload_detail_volume.yaml

This file was deleted.

8 changes: 2 additions & 6 deletions conf/restperf/default.yaml
@@ -56,9 +56,5 @@ objects:
   VscanSVM: vscan_svm.yaml

   # Uncomment to collect workload/QOS counters.
-  # Workload: workload.yaml
-  # WorkloadVolume: workload_volume.yaml
-
-  # The following workload templates may slow down data collection due to a high number of metrics.
-  # WorkloadDetail: workload_detail.yaml
-  # WorkloadDetailVolume: workload_detail_volume.yaml
+  Workload: workload.yaml
+  WorkloadVolume: workload_volume.yaml
73 changes: 0 additions & 73 deletions conf/zapiperf/cdot/9.8.0/workload_detail.yaml

This file was deleted.
