Unify internal observability documentation - 1 of 3 #4246

Merged: 34 commits, Apr 17, 2024
Changes from 10 commits
f895d47
Break 4192 into smaller PR - 1 of 3
tiffany76 Apr 4, 2024
6197e06
Merge branch 'main' into internal-obs-1
tiffany76 Apr 4, 2024
8522441
Fix refcache issue
tiffany76 Apr 4, 2024
e2cb980
Make copyedits
tiffany76 Apr 4, 2024
e75aa33
Add tabbed panes
tiffany76 Apr 4, 2024
c929e4c
Make markdown linter fix
tiffany76 Apr 4, 2024
43b2cd2
Fix tab labels
tiffany76 Apr 4, 2024
8da4e89
Merge branch 'main' into internal-obs-1
tiffany76 Apr 5, 2024
f7b40e7
Modify intro so doc is MVP-ready
tiffany76 Apr 5, 2024
5b753d8
Add content to metrics and logs sections
tiffany76 Apr 5, 2024
d0e4a0c
Results from /fix:refcache
opentelemetrybot Apr 5, 2024
3f33443
Apply suggestions from review
tiffany76 Apr 8, 2024
b2ab2af
Merge branch 'main' into internal-obs-1
tiffany76 Apr 8, 2024
45c14f0
Make prettier fixes
tiffany76 Apr 8, 2024
ca60916
Fix links to headings
tiffany76 Apr 8, 2024
28d15f7
Make requested changes to first subsection
tiffany76 Apr 8, 2024
91fb850
Merge branch 'main' into internal-obs-1
tiffany76 Apr 10, 2024
5ba8786
Clarify the use of local host for metrics
tiffany76 Apr 10, 2024
007fce4
Fix refcache
tiffany76 Apr 10, 2024
cce84ff
Merge branch 'main' into internal-obs-1
tiffany76 Apr 11, 2024
625af6f
Add cautionary note to self-monitoring section
tiffany76 Apr 11, 2024
0f3ab41
Update format of logs note
tiffany76 Apr 11, 2024
d23e6af
Merge branch 'main' into internal-obs-1
tiffany76 Apr 11, 2024
97035fc
Make linter fixes
tiffany76 Apr 11, 2024
493a389
Add reciprocal links
tiffany76 Apr 11, 2024
41f8c25
Merge branch 'main' into internal-obs-1
tiffany76 Apr 12, 2024
c2fef88
Remove traces
tiffany76 Apr 12, 2024
702ce1e
Merge branch 'main' into internal-obs-1
tiffany76 Apr 15, 2024
eda8108
Add review suggestion
tiffany76 Apr 16, 2024
3634aae
Merge branch 'main' into internal-obs-1
tiffany76 Apr 16, 2024
9f2d105
Remove Grafana dashboard
tiffany76 Apr 16, 2024
c42680f
Fix default log emitting
tiffany76 Apr 16, 2024
a9af6e3
Fix other stdout
tiffany76 Apr 16, 2024
06fa139
Merge branch 'main' into internal-obs-1
svrnm Apr 17, 2024
163 changes: 163 additions & 0 deletions content/en/docs/collector/internal-telemetry.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,163 @@
---
title: Internal Telemetry
weight: 25
cSpell:ignore: journalctl kube otelcol pprof tracez zpages
---

The Collector offers multiple ways to measure and monitor its own health. In
this section, you'll learn how to enable internal observability.

## Enabling observability internal to the Collector

By default, the Collector exposes service telemetry in two ways:

- Internal [metrics](#configuring-metrics) are exposed through a Prometheus
  interface, which defaults to port `8888`.
- [Logs](#configuring-logs) are emitted to `stderr`.

[Traces](#configuring-traces) are not exposed by default, but two feature gates
offer experimental support for a configuration based on the OpenTelemetry
Configuration schema.

### Configuring metrics

Prometheus metrics are exposed locally on port `8888` and path `/metrics`. For
containerized environments, you may want to expose this port on a public
interface instead of only locally.

Set the address under `service::telemetry::metrics` in the Collector
configuration:

```yaml
service:
  telemetry:
    metrics:
      address: ':8888'
```
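
To check what the endpoint serves, you can fetch it and list the metric names.
The following is a minimal sketch, not part of the Collector tooling: the helper
name `metric_names` is hypothetical, and the usage line assumes a Collector
running locally with the default port `8888` and path `/metrics`.

```sh
# Hypothetical helper: reads Prometheus text exposition format on stdin
# and prints each distinct metric name once.
metric_names() {
  # Drop comment lines (# HELP, # TYPE) and blank lines, then cut each
  # sample line at the first '{' or space so only the name remains.
  grep -Ev '^(#|$)' | sed 's/[{ ].*//' | LC_ALL=C sort -u
}

# Usage (assumes the Collector is running on this host):
#   curl -s http://localhost:8888/metrics | metric_names
```

If the `curl` call returns nothing, the address configured above is probably
not reachable from where you run the command.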

To visualize these metrics, you can use the
[Grafana dashboard](https://grafana.com/grafana/dashboards/15983-opentelemetry-collector/),
for example.

You can control how much metrics telemetry the Collector emits with the `level`
field. The possible values are:

- `none` indicates that no telemetry data should be collected.
- `basic` is the recommended value and covers the basics of the service
  telemetry.
- `normal` adds other indicators on top of `basic`.
- `detailed` adds dimensions and views to the previous levels.

For example:

```yaml
service:
  telemetry:
    metrics:
      level: detailed
      address: ':8888'
```

The Collector can also be configured to scrape its own metrics and send them
through configured pipelines. For example:

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        # Scrape the Collector's own Prometheus endpoint.
        - job_name: 'otelcol'
          scrape_interval: 10s
          static_configs:
            - targets: ['0.0.0.0:8888']
          metric_relabel_configs:
            # Drop metrics whose names contain grpc_io.
            - source_labels: [__name__]
              regex: '.*grpc_io.*'
              action: drop
exporters:
  debug:
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: []
      exporters: [debug]
```

### Configuring logs

You can find log output in `stderr`. The verbosity level for logs defaults to
`INFO`, but you can adjust it in the config `service::telemetry::logs`:

```yaml
service:
  telemetry:
    logs:
      level: 'debug'
```

You can also see logs for the Collector on a Linux systemd system using
`journalctl`:

{{< tabpane text=true >}} {{% tab "All logs" %}}

```sh
journalctl | grep otelcol
```

{{% /tab %}} {{% tab "Errors only" %}}

```sh
journalctl | grep otelcol | grep Error
```

{{% /tab %}} {{< /tabpane >}}
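
The `grep Error` filter is case-sensitive, so it can miss entries whose level or
message is written as `error` or `ERROR`. The following sketch shows a
case-insensitive variant. The helper name `errors_only` is hypothetical, and
how the level is spelled in your logs depends on the configured log encoding.

```sh
# Hypothetical helper: keep only the lines that mention "error" in any
# letter case (error, Error, ERROR).
errors_only() {
  grep -i 'error'
}

# Usage on a Linux systemd host:
#   journalctl | grep otelcol | errors_only
```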

### Configuring traces

Although the Collector does not expose traces by default, an effort is underway
to
[change this](https://github.com/open-telemetry/opentelemetry-collector/issues/7532).
The work includes supporting configuration of the OpenTelemetry SDK used to
produce the Collector's internal telemetry. This feature is currently behind two
feature gates:

```sh
--feature-gates=telemetry.useOtelWithSDKConfigurationForInternalTelemetry
```
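
The flag is passed on the command line when the Collector starts. The sketch
below only assembles and prints such a command so you can review it before
running it. The binary name `otelcol` and the file name `config.yaml` are
assumptions; substitute the ones used by your installation.

```sh
# Assumed binary name and config path; adjust both to your setup.
COLLECTOR_BIN="otelcol"
CONFIG_FILE="config.yaml"
FEATURE_GATE="telemetry.useOtelWithSDKConfigurationForInternalTelemetry"

# Print the full start command with the feature gate enabled.
collector_command() {
  printf '%s --config=%s --feature-gates=%s\n' \
    "$COLLECTOR_BIN" "$CONFIG_FILE" "$FEATURE_GATE"
}

collector_command
```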

The gate `useOtelWithSDKConfigurationForInternalTelemetry` enables the Collector
to parse configuration that aligns with the
[OpenTelemetry Configuration](https://github.com/open-telemetry/opentelemetry-configuration)
schema. Support for this schema is still experimental, but it does allow
telemetry to be exported using OTLP.

The following configuration can be used in combination with the aforementioned
feature gates to emit internal metrics and traces from the Collector to an OTLP
backend:

```yaml
service:
  telemetry:
    metrics:
      readers:
        - periodic:
            interval: 5000
            exporter:
              otlp:
                protocol: grpc/protobuf
                endpoint: https://backend:4317
    traces:
      processors:
        - batch:
            exporter:
              otlp:
                protocol: grpc/protobuf
                endpoint: https://backend2:4317
```

See the
[example configuration](https://github.com/open-telemetry/opentelemetry-configuration/blob/main/examples/kitchen-sink.yaml)
for additional configuration options.

> Note that this configuration does not support emitting logs, because the
> OpenTelemetry Go SDK does not support logs at this time.
4 changes: 4 additions & 0 deletions static/refcache.json
Original file line number Diff line number Diff line change
@@ -3047,6 +3047,10 @@
"StatusCode": 200,
"LastSeen": "2024-01-18T19:36:56.082576-05:00"
},
"https://github.com/open-telemetry/opentelemetry-collector/issues/7532": {
"StatusCode": 200,
"LastSeen": "2024-04-04T11:07:15.276911438-07:00"
},
"https://github.com/open-telemetry/opentelemetry-collector/pull/6140": {
"StatusCode": 200,
"LastSeen": "2024-01-30T05:18:24.402543-05:00"