[Schema] Align OpenTelemetry metrics index templates with Data Prepper #197

Open
KarstenSchnitter opened this issue Oct 7, 2024 · 3 comments
Labels
schema schema related issue

Comments

@KarstenSchnitter

Which domain protocol is relevant for this schema?

The catalog describes a schema to be used with OpenTelemetry metrics data at https://github.com/opensearch-project/opensearch-catalog/tree/main/docs/schema/observability/metrics. Unfortunately, this schema is not compatible with the schema generated by Data Prepper. This can be explored using this example.

What is the schema resource?

The Data Prepper schema for OpenTelemetry metrics closely follows the schema used for spans (and logs). All three schemas allow filters on resource attributes and instrumentation scopes to be applied to all signals. data-prepper#3929 introduces a mapping template for the metrics index. The same PR also contains mappings for traces and logs.

Source Schema - Add necessary repository


Do you have any additional context?

To be added on request.

@KarstenSchnitter KarstenSchnitter added schema schema related issue untriaged labels Oct 7, 2024
@juergen-walter

Aligning the schema is a foundation for further investment in visualizations on top. What do you think, @YANG-DB? Can you help push this topic forward?

@YANG-DB
Member

YANG-DB commented Oct 8, 2024

@juergen-walter thanks for your review.
I'm not sure what the exact diff between the two is. Is it only the index_type?

@YANG-DB YANG-DB removed the untriaged label Oct 8, 2024
@KarstenSchnitter
Author

I ran the example linked above to extract a JSON sample and ordered the fields alphabetically.

{
  "_index": "otel_metrics",
  "_id": "i7eJe5IBPqA3feadeJBE",
  "_score": 1,
  "_source": {
    "aggregationTemporality": "AGGREGATION_TEMPORALITY_CUMULATIVE",
    "description": "Total seconds each logical CPU spent on each mode.",
    "exemplars": [],
    "flags": 0,
    "instrumentationScope.name": "otelcol/hostmetricsreceiver/cpu",
    "instrumentationScope.version": "0.97.0",
    "isMonotonic": true,
    "kind": "SUM",
    "metric.attributes.cpu": "cpu0",
    "metric.attributes.state": "system",
    "name": "system.cpu.time",
    "resource.attributes.service@name": "otel-collector",
    "schemaUrl": "https://opentelemetry.io/schemas/1.9.0",
    "serviceName": "otel-collector",
    "startTime": "2024-10-11T12:22:24Z",
    "time": "2024-10-11T12:23:01.611880002Z",
    "unit": "s",
    "value": 0.28
  }
}

Compared with the sum.json sample, there are the following differences:

  • metric attributes are prefixed with metric.attributes. by Data Prepper, not just attributes.;
  • monotonicity is called isMonotonic by Data Prepper, not just monotonic;
  • resource attributes are prefixed with resource.attributes. by Data Prepper, not just resource.;
  • the current time is called time by Data Prepper, not @timestamp;
  • the value is always written as a double by Data Prepper, without distinguishing between value.int and value.double. Because of this naming scheme, value would be a number in one schema and an object in the other, so a field type conflict would occur if Data Prepper were to write into the catalog index.
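
To make the list above concrete, here is a minimal sketch that renames the fields of a Data Prepper metric document into the catalog-style names. The target names (attributes., resource., monotonic, @timestamp, value.double) are taken only from the differences listed above, and the function name to_catalog_style is hypothetical. It is not part of either project and does not cover fields the list does not mention.

```python
from typing import Any, Dict


def to_catalog_style(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Rename flattened Data Prepper metric fields per the differences listed above.

    Assumption: the input uses flattened keys as in the JSON sample, and the
    catalog-style names are exactly those named in the list of differences.
    """
    metric_prefix = "metric.attributes."
    resource_prefix = "resource.attributes."
    out: Dict[str, Any] = {}
    for key, value in doc.items():
        if key.startswith(metric_prefix):
            # metric.attributes.<x> -> attributes.<x>
            out["attributes." + key[len(metric_prefix):]] = value
        elif key.startswith(resource_prefix):
            # resource.attributes.<x> -> resource.<x>
            out["resource." + key[len(resource_prefix):]] = value
        elif key == "isMonotonic":
            out["monotonic"] = value
        elif key == "time":
            out["@timestamp"] = value
        elif key == "value":
            # Data Prepper always writes a double, so map to value.double.
            out["value.double"] = float(value)
        else:
            # Fields not mentioned in the list are passed through unchanged.
            out[key] = value
    return out


sample = {
    "isMonotonic": True,
    "metric.attributes.cpu": "cpu0",
    "resource.attributes.service@name": "otel-collector",
    "time": "2024-10-11T12:23:01.611880002Z",
    "value": 0.28,
}
converted = to_catalog_style(sample)
```

A mapping like this only illustrates the size of the gap. Whether the alignment should happen in the catalog templates or in Data Prepper is the open question of this issue.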

I briefly checked the gauge and histogram examples as well. Similar issues may appear when the data points get richer, e.g., by containing exemplars. In the http histogram samples, I found attributes whose names are not de-dotted (network.protocol.name). That will not happen with Data Prepper.
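
As an illustration of de-dotting, the sketch below replaces the dots inside an attribute name before adding the prefix. The "@" replacement character is inferred from the service@name key in the sample above, and the helper names dedot_key and dedot_attributes are hypothetical. This is not Data Prepper's actual implementation.

```python
from typing import Any, Dict


def dedot_key(key: str, replacement: str = "@") -> str:
    """Replace dots in an attribute name so it is not read as a nested path."""
    return key.replace(".", replacement)


def dedot_attributes(attributes: Dict[str, Any], prefix: str) -> Dict[str, Any]:
    """De-dot every attribute name and put the given prefix in front of it."""
    return {prefix + dedot_key(name): value for name, value in attributes.items()}


resource_fields = dedot_attributes(
    {"service.name": "otel-collector"}, "resource.attributes."
)
```

With this convention the dots that remain in a field name come only from the prefix, so an attribute such as network.protocol.name cannot be interpreted as nested objects in the index mapping.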

These differences should be resolved in a way that leads to compatible index templates for all OpenTelemetry signals. This would enable filtering by resource attributes or timestamps across different signal types in the same dashboard.
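
As a rough sketch of what shared field names would allow, the helper below builds one OpenSearch query body that filters by a resource attribute and a time range. The default field names are taken from the metrics sample above. That the trace and log indices expose the same names, and that the attribute is mapped as a keyword, are assumptions. The function name build_filter_query is hypothetical.

```python
from typing import Any, Dict


def build_filter_query(
    service_name: str,
    start: str,
    end: str,
    time_field: str = "time",
    service_field: str = "resource.attributes.service@name",
) -> Dict[str, Any]:
    """Build a bool/filter query body usable against any index sharing these fields."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Exact match on the resource attribute (assumes a keyword mapping).
                    {"term": {service_field: service_name}},
                    # Restrict to the requested time window.
                    {"range": {time_field: {"gte": start, "lte": end}}},
                ]
            }
        }
    }


query = build_filter_query(
    "otel-collector", "2024-10-11T12:00:00Z", "2024-10-11T13:00:00Z"
)
```

If the templates were aligned, the same body could be sent to the metrics, traces, and logs indices without per-signal field overrides.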
