Commit
Merge branch 'main' into gsub-processor
vagimeli authored Jun 6, 2024
2 parents 2220da4 + 7dd0961 commit 06958f0
Showing 34 changed files with 4,557 additions and 111 deletions.
2 changes: 2 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -4,6 +4,8 @@ _Describe what this change achieves._
### Issues Resolved
_List any issues this PR will resolve, e.g. Closes [...]._

### Version
_List the OpenSearch version to which this PR applies, e.g. 2.14, 2.12--2.14, or all._

### Checklist
- [ ] By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the [Developers Certificate of Origin](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).
1 change: 1 addition & 0 deletions .github/vale/styles/Vocab/OpenSearch/Products/accept.txt
@@ -8,6 +8,7 @@ Amazon SageMaker
Ansible
Auditbeat
AWS Cloud
Cohere Command
Cognito
Dashboards Query Language
Data Prepper
80 changes: 41 additions & 39 deletions STYLE_GUIDE.md

Large diffs are not rendered by default.

76 changes: 38 additions & 38 deletions TERMS.md

Large diffs are not rendered by default.

158 changes: 158 additions & 0 deletions _aggregations/metric/median-absolute-deviation.md
@@ -0,0 +1,158 @@
---
layout: default
title: Median absolute deviation
parent: Metric aggregations
grand_parent: Aggregations
nav_order: 65
redirect_from:
- /query-dsl/aggregations/metric/median-absolute-deviation/
---

# Median absolute deviation aggregations

The `median_absolute_deviation` metric is a single-value metric aggregation that returns the median absolute deviation of a numeric field. Median absolute deviation is a statistical measure of data variability. Because it measures dispersion from the median rather than the mean, it is a robust measure of variability that is less affected by outliers than measures such as the standard deviation.

Median absolute deviation is calculated as follows:<br>
median_absolute_deviation = median(|X<sub>i</sub> - median(X)|)
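
As a rough illustration of the formula above, the following Python sketch computes the exact median absolute deviation of a small in-memory list. The function name `median_absolute_deviation` and the sample values are made up for this example and are not part of OpenSearch:

```python
from statistics import median


def median_absolute_deviation(values):
    """Return the exact median of absolute deviations from the median.

    Illustrative helper only; not an OpenSearch API.
    """
    if not values:
        raise ValueError("values must not be empty")
    center = median(values)
    # Take the median of each value's distance from the center.
    return median(abs(x - center) for x in values)


sample = [1, 1, 2, 2, 4, 6, 9]  # hypothetical data; the median is 2
print(median_absolute_deviation(sample))  # prints 1
```

The large value `9` barely moves the result, which is the robustness to outliers described earlier.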

The following example calculates the median absolute deviation of the `DistanceMiles` field in the sample dataset `opensearch_dashboards_sample_data_flights`:


```json
GET opensearch_dashboards_sample_data_flights/_search
{
"size": 0,
"aggs": {
"median_absolute_deviation_DistanceMiles": {
"median_absolute_deviation": {
"field": "DistanceMiles"
}
}
}
}
```
{% include copy-curl.html %}

#### Example response

```json
{
"took": 35,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 10000,
"relation": "gte"
},
"max_score": null,
"hits": []
},
"aggregations": {
"median_absolute_deviation_DistanceMiles": {
"value": 1829.8993624441966
}
}
}
```

### Missing

By default, if a field is missing or has a null value in a document, that document is ignored during computation. However, you can use the `missing` parameter to specify a value to substitute for missing or null fields, as shown in the following request:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
"size": 0,
"aggs": {
"median_absolute_deviation_distanceMiles": {
"median_absolute_deviation": {
"field": "DistanceMiles",
"missing": 1000
}
}
}
}
```
{% include copy-curl.html %}

#### Example response

```json
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 10000,
"relation": "gte"
},
"max_score": null,
"hits": []
},
"aggregations": {
"median_absolute_deviation_distanceMiles": {
"value": 1829.6443646143355
}
}
}
```
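
To make the effect of `missing` concrete, the following Python sketch mimics both behaviors on a small list in which `None` stands for a document without the field. It is a simplified model, assuming exact medians rather than the approximation OpenSearch uses; the helper name `mad_with_missing` and the data are hypothetical:

```python
from statistics import median


def mad_with_missing(values, missing=None):
    """Exact MAD over values, where None marks a missing field.

    Illustrative model only; not how OpenSearch computes the result.
    """
    if missing is None:
        # Default behavior: skip documents that lack the field.
        present = [v for v in values if v is not None]
    else:
        # With a substitute: treat each missing field as `missing`.
        present = [missing if v is None else v for v in values]
    center = median(present)
    return median(abs(v - center) for v in present)


distances = [100, None, 300, 500, None]  # hypothetical field values
print(mad_with_missing(distances))                # prints 200
print(mad_with_missing(distances, missing=1000))  # prints 400
```

Substituting a value changes both the median and the deviations from it, so the choice of `missing` can shift the result noticeably when many documents lack the field.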

### Compression

The median absolute deviation is calculated using the [t-digest](https://github.com/tdunning/t-digest/tree/main) data structure, which trades performance against estimation accuracy through the `compression` parameter (default value: `1000`). Lower `compression` values improve performance but may reduce estimation accuracy, while higher values improve accuracy at the cost of increased computational overhead. The following request sets `compression` to `10`:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
"size": 0,
"aggs": {
"median_absolute_deviation_DistanceMiles": {
"median_absolute_deviation": {
"field": "DistanceMiles",
"compression": 10
}
}
}
}
```
{% include copy-curl.html %}

#### Example response

```json
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 10000,
"relation": "gte"
},
"max_score": null,
"hits": []
},
"aggregations": {
"median_absolute_deviation_DistanceMiles": {
"value": 1836.265614211182
}
}
}
```
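
The requests on this page differ only in their optional parameters, so they can be generated from one helper. The following Python sketch builds the request body as a dictionary; the function name `build_mad_request` is an assumption made for this example, and sending the body to a cluster is left to whichever client you use:

```python
import json


def build_mad_request(field, missing=None, compression=None):
    """Build a median_absolute_deviation search body (hypothetical helper)."""
    options = {"field": field}
    # Only include optional parameters that were explicitly provided.
    if missing is not None:
        options["missing"] = missing
    if compression is not None:
        options["compression"] = compression
    agg_name = f"median_absolute_deviation_{field}"
    return {
        "size": 0,  # return only the aggregation, no hits
        "aggs": {agg_name: {"median_absolute_deviation": options}},
    }


body = build_mad_request("DistanceMiles", compression=10)
print(json.dumps(body, indent=2))
```

Deriving the aggregation name from the field keeps the key in the response identical to the key in the request.
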
11 changes: 11 additions & 0 deletions _api-reference/nodes-apis/nodes-stats.md
@@ -54,6 +54,7 @@ indexing_pressure | Statistics about the node's indexing pressure.
shard_indexing_pressure | Statistics about shard indexing pressure.
search_backpressure | Statistics related to search backpressure.
cluster_manager_throttling | Statistics related to throttled tasks on the cluster manager node.
weighted_routing | Statistics relevant to weighted round robin requests.
resource_usage_stats | Node-level resource usage statistics, such as CPU and JVM memory.
admission_control | Statistics about admission control.
caches | Statistics about caches.
@@ -834,6 +835,7 @@ http.total_opened | Integer | The total number of HTTP connections the node has
[shard_indexing_pressure](#shard_indexing_pressure) | Object | Statistics related to indexing pressure at the shard level.
[search_backpressure]({{site.url}}{{site.baseurl}}/opensearch/search-backpressure#search-backpressure-stats-api) | Object | Statistics related to search backpressure.
[cluster_manager_throttling](#cluster_manager_throttling) | Object | Statistics related to throttled tasks on the cluster manager node.
[weighted_routing](#weighted_routing) | Object | Statistics relevant to weighted round robin requests.
[resource_usage_stats](#resource_usage_stats) | Object | Statistics related to resource usage for the node.
[admission_control](#admission_control) | Object | Statistics related to admission control for the node.
[caches](#caches) | Object | Statistics related to caches on the node.
@@ -1294,6 +1296,15 @@ stats | Object | Statistics about throttled tasks on the cluster manager node.
stats.total_throttled_tasks | Long | The total number of throttled tasks.
stats.throttled_tasks_per_task_type | Object | A breakdown of statistics by individual task type, specified as key-value pairs. The keys are individual task types, and their values represent the number of requests that were throttled.

### `weighted_routing`

The `weighted_routing` object contains statistics about weighted round robin requests. Specifically, it contains a counter of the number of times this node has served a request while it was "zoned out".

Field | Field type | Description
:--- |:-----------| :---
stats | Object | Statistics about weighted routing.
stats.fail_open_count | Integer | The number of times a shard on this node has served a request while the routing weight for the node was set to zero.
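
As a sketch of how this counter might be read, the following Python function collects `fail_open_count` for each node from an already parsed nodes stats response. It assumes the response keys nodes by ID under `nodes` and nests the counter under `weighted_routing.stats`; the function name `fail_open_counts`, the node IDs, and the sample values are hypothetical:

```python
def fail_open_counts(nodes_stats):
    """Map each node ID to its fail_open_count (0 if the field is absent).

    Assumes the layout nodes.<node_id>.weighted_routing.stats.fail_open_count.
    """
    counts = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        stats = node.get("weighted_routing", {}).get("stats", {})
        counts[node_id] = stats.get("fail_open_count", 0)
    return counts


# Hypothetical, heavily trimmed response used only for illustration.
sample_response = {
    "nodes": {
        "node-a": {"weighted_routing": {"stats": {"fail_open_count": 3}}},
        "node-b": {},
    }
}
print(fail_open_counts(sample_response))  # prints {'node-a': 3, 'node-b': 0}
```

A nonzero count indicates that the node served requests even though its routing weight was zero.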

### `resource_usage_stats`

The `resource_usage_stats` object contains the resource usage statistics. Each entry is specified by the node ID and has the following properties.
1 change: 1 addition & 0 deletions _dashboards/dashboards-assistant/index.md
@@ -122,3 +122,4 @@ The following screenshot shows a saved conversation, along with actions you can

- [Getting started guide for OpenSearch Assistant in OpenSearch Dashboards](https://github.com/opensearch-project/dashboards-assistant/blob/main/GETTING_STARTED_GUIDE.md)
- [OpenSearch Assistant configuration through the REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/opensearch-assistant/)
- [Build your own chatbot]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/build-chatbot/)