Mpp prod move (#541)
Rebased on top of @majamassarini PR, so it's easy to retry next time.
mfocko authored Feb 6, 2024
2 parents 7178bd9 + 5db7ea6, commit 4f8102c
Showing 11 changed files with 75 additions and 40 deletions.
44 changes: 18 additions & 26 deletions docs/deployment/logs.md
@@ -4,15 +4,9 @@ title: Logs

# Logs

-See the research on [Logs aggregation](https://github.com/packit/research/tree/main/logs-aggregation).
+See the research on [Logs aggregation](https://packit.dev/research/monitoring/logs-aggregation).

-Each worker pod has a sidecar container running [Fluentd](https://docs.fluentd.org),
-which is a data collector allowing us to get the logs from a worker via
-[syslog](https://docs.fluentd.org/input/syslog) and send them to Splunk.
-
-We use [our fluentd-splunk-hec image](https://quay.io/repository/packit/fluentd-splunk-hec),
-built via [a workflow](https://github.com/packit/fluent-plugin-splunk-hec/blob/main/.github/workflows/rebuild-and-push-image.yml)
-because we don't want to use the [docker.io/splunk/fluentd-hec image](https://hub.docker.com/r/splunk/fluentd-hec).
+We follow the first solution described in this [document](https://source.redhat.com/departments/it/devit/it-infrastructure/itcloudservices/itocp/it_paas_kb/logging_to_splunk_on_managed_platform), _logging to stdout_, with no need for a forwarder sidecar pod.

## Where do I find the logs?

@@ -21,33 +15,31 @@
First, you have to [get access to Splunk](https://source.redhat.com/departments/

Then go to https://rhcorporate.splunkcloud.com → `Search & Reporting`

+You should be able to see some logs using [this query](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3D%22rh_paas%22%20source%3D%22%2Fvar%2Flog%2Fcontainers%2Fpackit-worker*.log%22):
+
+    index="rh_paas" source="/var/log/containers/packit-worker*.log"
+
+If the above query doesn't return any results, [request access](https://source.redhat.com/departments/it/splunk/splunk_wiki/faq#jive_content_id_How_do_I_request_access_to_additional_data_sets_in_Splunk) to the `rh_paas` index.

+:::caution
+
+If you cannot see _Access to Additional Datasets_ (as suggested by the instructions), use _Update Permissions_ as the _Request Type_ and ask for access to the `rh_paas` index in the additional details.
+
+:::

[The more specific search, the faster it'll be](https://source.redhat.com/departments/it/splunk/splunk_wiki/splunk_training_search_best_practices#jive_content_id_Be_more_specific).
-At least, specify `index`, `source` and `msgid`.
-You can start with [this search](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3Drh_linux%20source%3Dsyslog%20msgid%3Dpackit-prod)
+At least, specify `index` and `source`.
+You can start with [this search](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3D%22rh_paas%22%20source%3D%22%2Fvar%2Flog%2Fcontainers%2Fpackit-worker*.log%22%20NOT%20pidbox)
and tune it from there.
For example:

-- change `msgid=packit-prod` to the service instance you want to see logs from, e.g. `msgid=packit-stg` or `msgid=stream-prod`
-- add `| search message!="pidbox*"` to remove the ["pidbox received method" message which Celery pollutes the log with](https://stackoverflow.com/questions/43633914/pidbox-received-method-enable-events-reply-tonone-ticketnone-in-django-cel)
- add `| reverse` if you want to see the results from oldest to newest
-- add `| fields _time, message | fields - _raw` to leave only the time and message fields
+- add `| fields _raw | fields - _time` to leave only the message field without timestamp duplication

-All in one URL [here](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3Drh_linux%20source%3Dsyslog%20msgid%3Dpackit-prod%20%7C%20search%20message!%3D%22pidbox*%22%20%7C%20reverse%20%7C%20fields%20_time%2C%20message%20%7C%20fields%20-%20_raw) -
-now just export it to csv; and you have almost the same log file
+All in one URL [here](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3D%22rh_paas%22%20source%3D%22%2Fvar%2Flog%2Fcontainers%2Fpackit-worker-short-running-0_packit--stg_packit-worker-*.log%22%20%7C%20fields%20_raw%20%7C%20fields%20-%20_time%20%7C%20reverse) - now just export it to CSV, and you have almost the same log file
as you'd get by exporting logs from a worker pod.

For more info, see (Red Hat internal):

- [demo](https://drive.google.com/file/d/15BIsRl7fP9bPdyLBQvoljF2yHy52ZqHm)
- [Splunk wiki @ Source](https://source.redhat.com/departments/it/splunk)

-## Debugging
-
-To see the sidecar container logs, select a worker pod → `Logs` → `fluentd-sidecar`.
-
-To [manually send some event to Splunk](https://docs.splunk.com/Documentation/SplunkCloud/8.2.2203/Data/UsetheHTTPEventCollector#Send_data_to_HTTP_Event_Collector)
-try this (get the host & token from Bitwarden):
-
-    $ curl -v "https://${SPLUNK_HEC_HOST}:443/services/collector/event" \
-        -H "Authorization: Splunk ${SPLUNK_HEC_TOKEN}" \
-        -d '{"event": "jpopelkastest"}'
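For reference, the "all in one URL" search above, URL-decoded, is the following SPL query; the stg short-running worker pod in `source` is just an example used to narrow the search:

```
index="rh_paas" source="/var/log/containers/packit-worker-short-running-0_packit--stg_packit-worker-*.log"
| fields _raw | fields - _time
| reverse
```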
1 change: 0 additions & 1 deletion openshift/dashboard.yml.j2
@@ -97,7 +97,6 @@ spec:
name: {{ image_dashboard }}
importPolicy:
# Periodically query registry to synchronize tag and image metadata.
-# DOES NOT WORK on Openshift Online.
scheduled: {{ auto_import_images }}
lookupPolicy:
# allows all resources pointing to this image stream to use it in the image field
23 changes: 22 additions & 1 deletion openshift/nginx.yml.j2
@@ -6,6 +6,10 @@ kind: Deployment
apiVersion: apps/v1
metadata:
name: nginx
+annotations:
+  # https://docs.openshift.com/container-platform/4.11/openshift_images/triggering-updates-on-imagestream-changes.html
+  image.openshift.io/triggers: >-
+    [{"from":{"kind":"ImageStreamTag","name":"nginx:{{ deployment }}"},"fieldPath":"spec.template.spec.containers[?(@.name==\"nginx\")].image"}]
spec:
selector:
matchLabels:
@@ -30,7 +34,7 @@ spec:
secretName: flower-htpasswd
containers:
- name: nginx
-image: ghcr.io/nginxinc/nginx-unprivileged
+image: nginx:{{ deployment }}
ports:
- containerPort: 8443
volumeMounts:
@@ -135,3 +139,20 @@ spec:
tls:
insecureEdgeTerminationPolicy: Redirect
termination: passthrough
+---
+kind: ImageStream
+apiVersion: image.openshift.io/v1
+metadata:
+  name: nginx
+spec:
+  tags:
+    - name: {{ deployment }}
+      from:
+        kind: DockerImage
+        name: ghcr.io/nginxinc/nginx-unprivileged
+      importPolicy:
+        # Periodically query registry to synchronize tag and image metadata.
+        scheduled: {{ auto_import_images }}
+  lookupPolicy:
+    # allows all resources pointing to this image stream to use it in the image field
+    local: true
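Taken together, the new ImageStream (with `importPolicy.scheduled`) keeps the `nginx:{{ deployment }}` tag synced from the upstream registry, and the `image.openshift.io/triggers` annotation rolls the Deployment whenever that tag moves. A minimal sketch of inspecting or forcing this by hand, assuming `deployment` is `prod` and the project is named `packit-prod` (both assumptions, not part of the diff):

```
# Re-import the tag right away instead of waiting for the scheduled sync
oc import-image nginx:prod --from=ghcr.io/nginxinc/nginx-unprivileged --confirm -n packit-prod

# List the image triggers wired to the nginx Deployment
oc set triggers deployment/nginx -n packit-prod
```

The pushgateway below follows the same ImageStream-plus-trigger pattern.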
23 changes: 22 additions & 1 deletion openshift/pushgateway.yml.j2
@@ -6,6 +6,10 @@ kind: Deployment
apiVersion: apps/v1
metadata:
name: pushgateway
+annotations:
+  # https://docs.openshift.com/container-platform/4.11/openshift_images/triggering-updates-on-imagestream-changes.html
+  image.openshift.io/triggers: >-
+    [{"from":{"kind":"ImageStreamTag","name":"pushgateway:{{ deployment }}"},"fieldPath":"spec.template.spec.containers[?(@.name==\"pushgateway\")].image"}]
spec:
selector:
matchLabels:
@@ -20,7 +24,7 @@ spec:
spec:
containers:
- name: pushgateway
-image: ghcr.io/zapier/prom-aggregation-gateway:v0.7.0
+image: pushgateway:{{ deployment }}
args:
- "--apiListen=:9091"
imagePullPolicy: IfNotPresent
@@ -54,3 +58,20 @@ spec:
targetPort: 9091
selector:
component: pushgateway
+---
+kind: ImageStream
+apiVersion: image.openshift.io/v1
+metadata:
+  name: pushgateway
+spec:
+  tags:
+    - name: {{ deployment }}
+      from:
+        kind: DockerImage
+        name: ghcr.io/zapier/prom-aggregation-gateway:v0.7.0
+      importPolicy:
+        # Periodically query registry to synchronize tag and image metadata.
+        scheduled: {{ auto_import_images }}
+  lookupPolicy:
+    # allows all resources pointing to this image stream to use it in the image field
+    local: true
2 changes: 1 addition & 1 deletion playbooks/deploy.yml
@@ -31,7 +31,7 @@
# project_dir is set in tasks/project-dir.yml
path_to_secrets: "{{ project_dir }}/secrets/{{ service }}/{{ deployment }}"
# to be used in Image streams as importPolicy:scheduled value
-auto_import_images: "{{(deployment != 'prod')}}"
+auto_import_images: true
# used in dev/zuul deployment to tag & push images to cluster
# https://github.com/packit/deployment/issues/112#issuecomment-673343049
# container_engine: "{{ lookup('pipe', 'command -v podman 2> /dev/null || echo docker') }}"
2 changes: 1 addition & 1 deletion playbooks/import-images.yml
@@ -10,7 +10,7 @@
with_fedmsg: true
with_dashboard: true
with_tokman: true
-with_fluentd_sidecar: true
+with_fluentd_sidecar: false
tasks:
- name: Include variables
ansible.builtin.include_vars: ../vars/{{ service }}/{{ deployment }}.yml
5 changes: 2 additions & 3 deletions secrets/packit/prod/packit-service.yaml.j2
@@ -37,13 +37,12 @@ enabled_projects_for_internal_tf:
command_handler: sandcastle
command_handler_work_dir: /tmp/sandcastle
command_handler_image_reference: quay.io/packit/sandcastle:prod
-command_handler_k8s_namespace: packit-prod-sandbox
+command_handler_k8s_namespace: packit--prod-sandbox
command_handler_pvc_volume_specs:
- path: /repository-cache
pvc_from_env: SANDCASTLE_REPOSITORY_CACHE_VOLUME
read_only: true
-# [TODO]: Switch to <aws-ebs> during migration of prod to MP+
-command_handler_storage_class: gp2
+command_handler_storage_class: aws-ebs

repository_cache: /repository-cache
# The maintenance of the cache (adding, updating) is done externally,
2 changes: 1 addition & 1 deletion vars/fedora-source-git/prod_template.yml
@@ -33,7 +33,7 @@ with_pushgateway: false

with_repository_cache: false

-with_fluentd_sidecar: true
+with_fluentd_sidecar: false

# image to use for service
# image: quay.io/packit/packit-service:{{ deployment }}
8 changes: 6 additions & 2 deletions vars/packit/prod_template.yml
@@ -9,7 +9,8 @@
project: packit-prod

# Openshift cluster url
-host: https://api.auto-prod.gi0n.p1.openshiftapps.com:6443
+# For the URL of the MP+ API endpoint, see Bitwarden Secure note
+host: ‹TODO›

# oc login <the above host value>, oc whoami -t
# OR via Openshift web GUI: click on your login in top right corner, 'Copy Login Command', take the part after --token=
@@ -42,7 +43,7 @@ with_flower: true
# with_repository_cache: true
# repository_cache_storage: 4Gi

-with_fluentd_sidecar: true
+with_fluentd_sidecar: false

# image to use for service
# image: quay.io/packit/packit-service:{{ deployment }}
@@ -70,6 +71,9 @@ with_fluentd_sidecar: true
# If you still want to use docker even when podman is installed, set:
# container_engine: docker

+# We're using 15 on MP+
+postgres_version: 15

# Celery retry parameters
# celery_retry_limit: 2
# celery_retry_backoff: 3
2 changes: 1 addition & 1 deletion vars/packit/stg_template.yml
@@ -42,7 +42,7 @@ with_flower: true
# with_repository_cache: true
# repository_cache_storage: 4Gi

-with_fluentd_sidecar: true
+with_fluentd_sidecar: false

# image to use for service
# image: quay.io/packit/packit-service:{{ deployment }}
3 changes: 1 addition & 2 deletions vars/template.yml
@@ -41,8 +41,7 @@ api_key: ""

# with_repository_cache: true

-# with_fluentd_sidecar: false
-
+with_fluentd_sidecar: false
# image to use for service
# image: quay.io/packit/packit-service:{{ deployment }}

