remove certs and simplify telemetry summarize (#1750)
The goal here is to remove the need for certificates. Any worker that is in our VPC can talk directly to fluentbit, and fluentbit will be configured with certificates to talk to Tempo. The implication for the implementation is that telemetry steps must run ONLY on nodes in our VPC. To avoid moving every job to these nodes, each job instead stores its telemetry data temporarily as an artifact, and one final job processes and sends the telemetry for all jobs.
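
As a rough illustration of the stash-then-summarize pattern described above, here is a minimal sketch of a workflow. It is not the shared-actions implementation: the job name `build`, the artifact name `telemetry-build`, the `telemetry/` directory, and the file contents are all hypothetical, and the final step is only a placeholder for the real processing. It assumes the standard `actions/upload-artifact@v4` and `actions/download-artifact@v4` actions.

```yaml
# Hypothetical sketch: names, paths, and file contents are illustrative only.
name: telemetry-artifact-sketch
on: pull_request

jobs:
  build:
    runs-on: ubuntu-latest            # any runner, inside or outside the VPC
    steps:
      - name: Record telemetry data for this job
        run: |
          mkdir -p telemetry
          echo "job=build status=${{ job.status }}" > telemetry/build.txt
      - name: Stash telemetry data as an artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: telemetry-build
          path: telemetry/

  telemetry-summarize:
    needs: build
    if: ${{ !cancelled() }}
    runs-on: linux-amd64-cpu4         # self-hosted runner inside the VPC
    continue-on-error: true
    steps:
      - name: Collect telemetry artifacts from all jobs
        uses: actions/download-artifact@v4
        with:
          pattern: telemetry-*
          merge-multiple: true
          path: telemetry/
      - name: Process and send telemetry
        run: ls telemetry/            # placeholder for the real summarize step
```

With this layout only the final job needs network access to fluentbit, so the upstream jobs can stay on whatever runners they already use.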

Part of rapidsai/shared-workflows#269 and rapidsai/shared-actions#28

Authors:
  - Mike Sarahan (https://github.com/msarahan)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #1750
msarahan authored Dec 17, 2024
1 parent 65858a9 commit 1af03eb
Showing 1 changed file with 7 additions and 10 deletions.
17 changes: 7 additions & 10 deletions .github/workflows/pr.yaml
@@ -19,11 +19,11 @@ jobs:
       - conda-python-build
       - conda-python-tests
       - docs-build
-      - telemetry-setup
       - wheel-build-cpp
       - wheel-build-python
       - wheel-tests
       - devcontainer
+      - telemetry-setup
     secrets: inherit
     uses: rapidsai/shared-workflows/.github/workflows/[email protected]
     if: always()
@@ -33,9 +33,11 @@ jobs:
     runs-on: ubuntu-latest
     continue-on-error: true
     env:
-      OTEL_SERVICE_NAME: "pr-rmm"
+      OTEL_SERVICE_NAME: "pr-rmm"
     steps:
       - name: Telemetry setup
+        # This gate is here and not at the job level because we need the job to not be skipped,
+        # since other jobs depend on it.
         if: ${{ vars.TELEMETRY_ENABLED == 'true' }}
         uses: rapidsai/shared-actions/telemetry-dispatch-stash-base-env-vars@main
   changed-files:
@@ -141,16 +143,11 @@ jobs:
         sccache -s;
   telemetry-summarize:
-    runs-on: ubuntu-latest
+    # This job must use a self-hosted runner to record telemetry traces.
+    runs-on: linux-amd64-cpu4
     needs: pr-builder
     if: ${{ vars.TELEMETRY_ENABLED == 'true' && !cancelled() }}
     continue-on-error: true
     steps:
-      - name: Load stashed telemetry env vars
-        uses: rapidsai/shared-actions/telemetry-dispatch-load-base-env-vars@main
-        with:
-          load_service_name: true
       - name: Telemetry summarize
-        uses: rapidsai/shared-actions/telemetry-dispatch-write-summary@main
-        with:
-          cert_concat: "${{ secrets.OTEL_EXPORTER_OTLP_CA_CERTIFICATE }};${{ secrets.OTEL_EXPORTER_OTLP_CLIENT_CERTIFICATE }};${{ secrets.OTEL_EXPORTER_OTLP_CLIENT_KEY }}"
+        uses: rapidsai/shared-actions/telemetry-dispatch-summarize@main
