From 0157c097ceec275e77405c6df8712cbdf238d86d Mon Sep 17 00:00:00 2001
From: Adam Gardner
"},{"location":"#compatibility","title":"Compatibility","text":"Deployment Tutorial Compatible Dynatrace Managed \u274c Dynatrace SaaS \u2714\ufe0f
"},{"location":"automate-srg/","title":"Automate the Site Reliability Guardian","text":"
"},{"location":"automate-srg/#create-a-workflow-to-trigger-guardian","title":"Create a Workflow to Trigger Guardian","text":"export function teardown() {\n // Send event at the end of the test\n let payload = {\n \"entitySelector\": \"type(SERVICE),entityName.equals(checkoutservice)\",\n \"eventType\": \"CUSTOM_INFO\",\n \"properties\": {\n \"tool\": \"k6\",\n \"action\": \"test\",\n \"state\": \"finished\",\n \"purpose\": `${__ENV.LOAD_TEST_PURPOSE}`,\n \"duration\": test_duration\n },\n \"title\": \"k6 load test finished\"\n }\n\n let res = http.post(`${__ENV.K6_DYNATRACE_URL}/api/v2/events/ingest`, JSON.stringify(payload), post_params);\n }\n}\n
Three golden signals (checkoutservice)
screen.
Automate
button. This will create a template workflow.event type
from bizevents
to events
.Filter query
to:event.type == \"CUSTOM_INFO\" and\ndt.entity.service.name == \"checkoutservice\" and\ntool == \"k6\" and\naction == \"test\" and\nstate == \"finished\"\n
run_validation
node.event.timeframe.from
and replace with:now-{{ event()['duration'] }}\n
now-event.duration
.
event.timeframe.to
and replace with: now\n
Click the Save
button.
The workflow is now created and connected to the guardian. It will be triggered whenever the platform receives an event like below.
The workflow is now live and listening for events.
Go to https://github.com/codespaces and delete the codespace which will delete the demo environment.
You may also wish to delete the API token.
Site reliability guardians are a mechanism to automate analysis when changes are made. They can be used in production (on a CRON) or as deployment checks (eg. pre and post deployment health checks, security checks, infrastructure health checks).
We will create a guardian to check the checkoutservice
microservice which is used during the purchase journey.
ctrl + k
search for Site Reliability Guardian
and select the app.+ Guardian
to add a new guardian.Four Golden Signals
choose Use template
.Run query
and toggle 50
rows per page to see more services.checkoutservice
. Click Apply to template (1)
.Saturation
objective and delete it (there are no resource statistics from OpenTelemetry available so this objective cannot be evaluated).Three golden signals (checkoutservice)
.Save
Automate at scale
This process can be automated for at-scale usage using Monaco or Terraform.
Objectives that are set to \"auto baseline\" in Dynatrace Site Reliability Guardians require 5
runs in order to enable the baselines.
In a real scenario, these test runs would likely be spread over hours, days or weeks. This provides Dynatrace with ample time to gather sufficient usage data.
For demo purposes, 5 seperate \"load tests\" will be triggered in quick succession to enable the baselining.
First, open a new terminal window and apply the load test script:
kubectl apply -f .devcontainer/k6/k6-load-test-script.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-first-load-test","title":"Trigger the First Load Test","text":"kubectl apply -f .devcontainer/k6/k6-srg-training-run1.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-second-load-test","title":"Trigger the Second Load Test","text":"Wait a few seconds and trigger the second load test:
kubectl apply -f .devcontainer/k6/k6-srg-training-run2.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-third-load-test","title":"Trigger the Third Load Test","text":"Wait a few seconds and trigger the third load test:
kubectl apply -f .devcontainer/k6/k6-srg-training-run3.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-fourth-load-test","title":"Trigger the Fourth Load Test","text":"Wait a few seconds and trigger the fourth load test:
kubectl apply -f .devcontainer/k6/k6-srg-training-run4.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-final-training-load-test","title":"Trigger the Final Training Load Test","text":"Wait a few seconds and trigger the final (fifth) load test:
kubectl apply -f .devcontainer/k6/k6-srg-training-run5.yaml\n
"},{"location":"enable-auto-baselines/#wait-for-completion","title":"Wait for Completion","text":"Each load test runs for 1 minute. Run this command to wait for all jobs to complete.
This command will appear to hang until the jobs are done. Be patient. It should take about 2mins:
kubectl -n default wait --for=condition=Complete --all --timeout 120s jobs\n
\u279c /workspaces/obslab-release-validation (main) $ kubectl get jobs\nNAME STATUS COMPLETIONS DURATION AGE\nk6-training-run1 Complete 1/1 95s 2m2s\nk6-training-run2 Complete 1/1 93s 115s\nk6-training-run3 Complete 1/1 93s 108s\nk6-training-run4 Complete 1/1 90s 100s\nk6-training-run5 Complete 1/1 84s 94s\n
"},{"location":"enable-auto-baselines/#view-completed-training-runs","title":"View Completed Training Runs","text":"In Dynatrace, go to workflows
and select Executions
. You should see 5 successful workflow executions:
You can also use this DQL to see the Site Reliability Guardian results in a notebook:
fetch bizevents\n| filter event.provider == \"dynatrace.site.reliability.guardian\"\n| filter event.type == \"guardian.validation.finished\"\n| fieldsKeep guardian.id, validation.id, timestamp, guardian.name, validation.status, validation.summary, validation.from, validation.to\n
"},{"location":"enable-auto-baselines/#view-srg-status-in-the-site-reliability-guardian-app","title":"View SRG Status in the Site Reliability Guardian App","text":"The SRG results are also available in the Site Reliabiltiy Guardian app:
ctrl + k
site reliability guardian
or srg
Open
on your guardianYou should see the 5
runs listed:
Training Complete
The automatic baselines for the guardian are now enabled.
You can proceed to use the guardian for \"real\" evaluations.
A product manager informs you that they're ready to release their new feature. They ask you to enable the feature and run the load test in a dev environment.
They tell you that the new feature is behind a flag called paymentServiceFailure
(yes, an obvious name for this demo) and they tell you to change the defaultValue
from off
to on
.
Run the following script which notifies Dynatrace using a CUSTOM_INFO
event of the change inc. the new value.
./runtimeChange.sh paymentServiceFailure on\n
"},{"location":"enable-change/#change-flag-value","title":"Change Flag Value","text":"Locate the flags.yaml
file. Change the defaultValue
of the paymentServiceFailure
flag from \"off\"
to \"on\"
(line 84
).
Apply those changes:
kubectl apply -f $CODESPACE_VSCODE_FOLDER/flags.yaml\n
You should see:
configmap/my-otel-demo-flagd-config configured\n
"},{"location":"enable-change/#run-acceptance-load-test","title":"Run Acceptance Load Test","text":"It is time to run an acceptance load test to see if the new feature has caused a regression.
This load test will run for 3 minutes and then trigger the site reliability guardian again:
kubectl apply -f .devcontainer/k6/k6-after-change.yaml\n
"},{"location":"enable-change/#configuration-change-events","title":"Configuration Change Events","text":"While you are waiting for the load test to complete, it is worth noting that each time a feature flag is changed, you should execute runtimeChange.sh
shell script to send an event to the service that is affected.
The feature flag changes the behaviour of the paymentservice
(which the checkoutservice
depends on).
Look at the paymentservice
and notice the configuration changed events.
Tip
You can send event for anything you like: deployments, load tests, security scans, configuration changes and more.
You must have the following to use this hands on demo.
Save the Dynatrace environment URL:
.apps.
in the URLThe generic format is:
https://<EnvironmentID>.<Environment>.<URL>\n
For example:
https://abc12345.live.dynatrace.com\n
"},{"location":"getting-started/#create-api-token","title":"Create API Token","text":"In Dynatrace:
ctrl + k
. Search for access tokens
.metrics.ingest
logs.ingest
events.ingest
openTelemetryTrace.ingest
Click this button to open the demo environment. This will open in a new tab.
Preparation Complete
The preparation phase is now complete. Everything before now is a one-off task.
In day-to-day operations, you would begin from here.
"},{"location":"run-production-srg/#run-an-evaluation","title":"Run an Evaluation","text":"Now that the Site Reliability Guardian is trained, run another evaluation by triggering a load test.
Tip
Remember, the workflow is currently configured to listen for test finished
events but you could easily create additional workflows with different triggers such as on-demand on time-based CRON triggers.
This provides an ability to continuously test your service (eg. in production).
Run another load test to trigger a sixth evaluation.
kubectl apply -f .devcontainer/k6/k6.yaml\n
Again, wait for all jobs to complete. This run will take longer. Approximately 2mins.
kubectl -n default wait --for=condition=Complete --all --timeout 120s jobs\n
When the above command returns, you should see:
NAME STATUS COMPLETIONS DURATION AGE\nk6-training-run1 Complete 1/1 102s 9m41s\nk6-training-run2 Complete 1/1 100s 9m33s\nk6-training-run3 Complete 1/1 101s 9m23s\nk6-training-run4 Complete 1/1 93s 9m17s\nk6-training-run5 Complete 1/1 91s 9m11s\nrun-k6 Complete 1/1 79s 81s\n
When this evaluation is completed, click the Refresh
button in the Validation history
panel of the site reliability guardian app (when viewing an individual guardian) and the heatmap should look like the image below
Your results may vary
Your results may vary. In this example below, the Traffic
objective failed because the auto-adaptive thresholds detected that a traffic level below 1171
requests is too low and the actual traffic level was 1158
.
Because one objective failed, the guardian failed.
5 training runs and 1 \"real\" run:
Information Only Objectives
It is possible to add objectives that are \"informational only\" and do not contribute to the pass / fail decisions.
This is useful for new services where you are trying to \"get a feel for\" the real-world data values of your metrics.
To set an objective as \"information only\": * Select the objective to open the side panel * Scroll down to Define thresholds
* Select the No thresholds
option
After the codespaces has started, the post creation script should begin. This will install everything and will take a few moments.
When the script has completed, a success message will briefly be displayed (it is so quick you'll probably miss it) and an empty terminal window will be shown.
"},{"location":"validate-telemetry/#wait-for-demo-to-start","title":"Wait For Demo to Start","text":"Wait for the demo application pods to start:
kubectl -n default wait --for=condition=Ready --all --timeout 300s pod\n
"},{"location":"validate-telemetry/#access-demo-user-interface","title":"Access Demo User Interface","text":"Start port forwarding to access the user interface:
kubectl -n default port-forward svc/my-otel-demo-frontendproxy 8080\n
Leave this command running. Open a new terminal window to run any other commands.
Go to ports tab, right click the demo app
entry and choose Open in browser
.
You should see the OpenTelemetry demo:
"},{"location":"validate-telemetry/#validate-telemetry","title":"Validate Telemetry","text":"It is time to ensure telemetry is flowing correctly into Dynatrace.
In Dynatrace, follow these steps:
"},{"location":"validate-telemetry/#validate-services","title":"Validate Services","text":"ctrl + k
. Search for services
. Go to services screen and validate you can see services.SERVICE-****
.CUSTOM_DEVICE-****
:ctrl + k
and search for settings
.Service Detection > Unified services for OpenTelemetry
and ensure the toggle is on.ctrl + k
. Search for distributed traces
.ctrl + k
. Search for metrics
.app.
and validate you can see some metrics.ctrl + k
. Search for notebooks
.+
to add a new DQL
section.fetch logs, scanLimitGBytes: 1\n| filter contains(content, \"conversion\")\n
"},{"location":"validate-telemetry/#telemetry-flowing","title":"Telemetry Flowing?","text":"If these four things are OK, your telemetry is flowing correctly into Dynatrace.
If not, please search for similar problems and / or raise an issue here.
Wait for all jobs to complete:
kubectl -n default wait --for=condition=Complete --all --timeout 120s jobs\n
All jobs (including the acceptance-load-test
) should now be Complete
.
Refresh the Site Reliability Guardian results heatmap again and notice that the guardian has failed.
The guardian has failed due to the error rate being too high.
Navigating to the checkoutservice
(ctrl + k
> services
> checkoutservice
), you can see the increase in failure rate.
Scroll down the services screen until you see the OpenTelemetry traces list. Notice lots of failed requests:
"},{"location":"view-acceptance-test-results/#analyse-a-failed-request","title":"Analyse a Failed Request","text":"Drill into one of the failed requests and notice lots of failures.
These failures are bubbling up through the request chain back towards the checkoutservice.
Ultimately though, the failure comes from the final span in the trace: The call to PaymentService/Charge
.
Investigating the span events the cause of the failure becomes clear: The payment service cuase an exception. The exception message and stacktrace is given:
exception.message PaymentService Fail Feature Flag Enabled\nexception.stacktrace Error: PaymentService Fail Feature Flag Enabled at module.exports.charge\n (/usr/src/app/charge.js:21:11) at process.processTicksAndRejections\n (node:internal/process/task_queues:95:5) at async Object.chargeServiceHandler\n [as charge] (/usr/src/app/index.js:21:22)\nexception.type Error\n
"},{"location":"view-acceptance-test-results/#roll-back-change","title":"Roll Back Change","text":"Inform Dynatrace that a change in configuration is coming. The paymentServiceFailure
flag will be set to off
./runtimeChange.sh paymentServiceFailure off\n
Again edit flags.yaml
and set the defaultValue
of paymentServiceFailure
from \"on\"
to \"off\"
(line 84
)
Apply the chnages:
kubectl apply -f $CODESPACE_VSCODE_FOLDER/flags.yaml\n
"},{"location":"view-acceptance-test-results/#summary","title":"Summary","text":"Looking back at the initial brief, it was your job to:
So how did things turn out?
no go
decision based on evidence provided by OpenTelemetry and the Dynatrace Site Reliability Guardian.Works with any metric
The techniques described here work with any metric, from any source.
You are encouraged to use metrics from other devices and sources (such as business related metrics like revenue).
Success
The Dynatrace Platform, Site Reliability Guardian and Workflows have provided visibility and automated change analysis.
Content about how the user progresses after this demo.
"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Release Validation for DevOps Engineers with Site Reliability Guardian","text":"In this demo, you take on the role of a Product Manager or DevOps engineer. You are running an application, and wish to enable a new feature.
The application is already instrumented to emit tracing data, using the OpenTelemetry standard. The demo system will be automatically configured to transport that data to Dynatrace for storage and processing.
Your job is to:
To achieve these objectives, you will:
checkoutservice
)Your company utilises feature flags to enable new features. A product manager informs you that they wish to release a new feature.
It is your job to:
Below is the \"flow\" of information and actors during this demo.
This architecture also holds true for other load testing tools (eg. JMeter).
A load test is executed. The HTTP requests are annotated with the standard header values.
Metrics are streamed during the load test (if the load testing tool supports this) or sent at the end of the load test.
The load testing tool is responsible for sending an event to signal \"test is finished\". Integrators are responsible for crafting this event to contain any important information required by Dynatrace such as the test duration.
A workflow is triggered on receipt of this event. The workflow triggers the Site Reliability Guardian.
The Site Reliability Guardian processes the load testing metrics to provide an automated load testing report. This can be used for information only or as an automated \"go / no go\" decision point.
Dynatrace users can view the results in a dashboard, notebook or use the result as a trigger for further automated workflows.
Integrators have the choice to send (emit) the results to an external tool. This external tool can then use this result. One example would be sending the SRG result to Jenkins to progress or prevent a deployment.
Site reliability guardians can be automated so they happen whenever you prefer (on demand / on schedule / event based). A Dynatrace workflow is used to achieve this.
In this demo:
Let's plumb that together now.
Sample k6 teardown test finished event
For information only, no action is required.
This is already coded into the demo load test script.
export function teardown() {\n // Send event at the end of the test\n let payload = {\n \"entitySelector\": \"type(SERVICE),entityName.equals(checkoutservice)\",\n \"eventType\": \"CUSTOM_INFO\",\n \"properties\": {\n \"tool\": \"k6\",\n \"action\": \"test\",\n \"state\": \"finished\",\n \"purpose\": `${__ENV.LOAD_TEST_PURPOSE}`,\n \"duration\": test_duration\n },\n \"title\": \"k6 load test finished\"\n }\n\n let res = http.post(`${__ENV.K6_DYNATRACE_URL}/api/v2/events/ingest`, JSON.stringify(payload), post_params);\n}\n
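For reference, the same \"test finished\" event can be assembled and sent outside of k6 with plain shell. This is a hedged sketch, not part of the demo: `DT_URL` and `DT_API_TOKEN` are placeholder variables standing in for your environment URL and an API token with the `events.ingest` scope.

```shell
# Sketch: build the same "test finished" event payload without k6.
# DT_URL and DT_API_TOKEN are placeholders -- set them before using
# the commented curl line at the bottom.
build_test_finished_event() {
  local purpose="$1" duration="$2"
  cat <<EOF
{
  "entitySelector": "type(SERVICE),entityName.equals(checkoutservice)",
  "eventType": "CUSTOM_INFO",
  "properties": {
    "tool": "k6",
    "action": "test",
    "state": "finished",
    "purpose": "${purpose}",
    "duration": "${duration}"
  },
  "title": "k6 load test finished"
}
EOF
}

build_test_finished_event "demo" "3m"

# To actually send it (requires real credentials):
# build_test_finished_event "demo" "3m" | curl -X POST \
#   -H "Authorization: Api-Token ${DT_API_TOKEN}" \
#   -H "Content-Type: application/json" \
#   -d @- "${DT_URL}/api/v2/events/ingest"
```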
"},{"location":"automate-srg/#create-a-workflow-to-trigger-guardian","title":"Create a Workflow to Trigger Guardian","text":"Ensure you are still on the Three golden signals (checkoutservice)
screen.
Automate
button. This will create a template workflow.event type
from bizevents
to events
.Filter query
to:event.type == \"CUSTOM_INFO\" and\ndt.entity.service.name == \"checkoutservice\" and\ntool == \"k6\" and\naction == \"test\" and\nstate == \"finished\"\n
run_validation
node.event.timeframe.from
and replace with:now-{{ event()['duration'] }}\n
The UI will change this to now-event.duration
.
Remove event.timeframe.to
and replace with:
now\n
Click the Save
button.
The workflow is now created and connected to the guardian. It will be triggered whenever the platform receives an event like below.
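An illustrative sketch of such an event's fields (the values are examples only, chosen to match the filter query configured earlier):

```json
{
  "event.type": "CUSTOM_INFO",
  "dt.entity.service.name": "checkoutservice",
  "tool": "k6",
  "action": "test",
  "state": "finished",
  "purpose": "acceptance test",
  "duration": "3m"
}
```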
The workflow is now live and listening for events.
Go to https://github.com/codespaces and delete the codespace which will delete the demo environment.
You may also wish to delete the API token.
Site reliability guardians are a mechanism to automate analysis when changes are made. They can be used in production (on a CRON) or as deployment checks (eg. pre and post deployment health checks, security checks, infrastructure health checks).
We will create a guardian to check the checkoutservice
microservice which is used during the purchase journey.
ctrl + k
search for Site Reliability Guardian
and select the app.+ Guardian
to add a new guardian.Four Golden Signals
choose Use template
.Run query
and toggle 50
rows per page to see more services.checkoutservice
. Click Apply to template (1)
.Saturation
objective and delete it (there are no resource statistics from OpenTelemetry available so this objective cannot be evaluated).Three golden signals (checkoutservice)
.Save
Automate at scale
This process can be automated for at-scale usage using Monaco or Terraform.
Objectives that are set to \"auto baseline\" in Dynatrace Site Reliability Guardians require 5
runs in order to enable the baselines.
In a real scenario, these test runs would likely be spread over hours, days or weeks. This provides Dynatrace with ample time to gather sufficient usage data.
For demo purposes, 5 separate \"load tests\" will be triggered in quick succession to enable the baselining.
First, open a new terminal window and apply the load test script:
kubectl apply -f .devcontainer/k6/k6-load-test-script.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-first-load-test","title":"Trigger the First Load Test","text":"kubectl apply -f .devcontainer/k6/k6-srg-training-run1.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-second-load-test","title":"Trigger the Second Load Test","text":"Wait a few seconds and trigger the second load test:
kubectl apply -f .devcontainer/k6/k6-srg-training-run2.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-third-load-test","title":"Trigger the Third Load Test","text":"Wait a few seconds and trigger the third load test:
kubectl apply -f .devcontainer/k6/k6-srg-training-run3.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-fourth-load-test","title":"Trigger the Fourth Load Test","text":"Wait a few seconds and trigger the fourth load test:
kubectl apply -f .devcontainer/k6/k6-srg-training-run4.yaml\n
"},{"location":"enable-auto-baselines/#trigger-the-final-training-load-test","title":"Trigger the Final Training Load Test","text":"Wait a few seconds and trigger the final (fifth) load test:
kubectl apply -f .devcontainer/k6/k6-srg-training-run5.yaml\n
"},{"location":"enable-auto-baselines/#wait-for-completion","title":"Wait for Completion","text":"Each load test runs for 1 minute. Run this command to wait for all jobs to complete.
This command will appear to hang until the jobs are done. Be patient. It should take about 2 minutes:
kubectl -n default wait --for=condition=Complete --all --timeout 120s jobs\n
\u279c /workspaces/obslab-release-validation (main) $ kubectl get jobs\nNAME STATUS COMPLETIONS DURATION AGE\nk6-training-run1 Complete 1/1 95s 2m2s\nk6-training-run2 Complete 1/1 93s 115s\nk6-training-run3 Complete 1/1 93s 108s\nk6-training-run4 Complete 1/1 90s 100s\nk6-training-run5 Complete 1/1 84s 94s\n
"},{"location":"enable-auto-baselines/#view-completed-training-runs","title":"View Completed Training Runs","text":"In Dynatrace, go to workflows
and select Executions
. You should see 5 successful workflow executions:
You can also use this DQL to see the Site Reliability Guardian results in a notebook:
fetch bizevents\n| filter event.provider == \"dynatrace.site.reliability.guardian\"\n| filter event.type == \"guardian.validation.finished\"\n| fieldsKeep guardian.id, validation.id, timestamp, guardian.name, validation.status, validation.summary, validation.from, validation.to\n
"},{"location":"enable-auto-baselines/#view-srg-status-in-the-site-reliability-guardian-app","title":"View SRG Status in the Site Reliability Guardian App","text":"The SRG results are also available in the Site Reliabiltiy Guardian app:
ctrl + k
site reliability guardian
or srg
Open
on your guardianYou should see the 5
runs listed:
Training Complete
The automatic baselines for the guardian are now enabled.
You can proceed to use the guardian for \"real\" evaluations.
A product manager informs you that they're ready to release their new feature. They ask you to enable the feature and run the load test in a dev environment.
They tell you that the new feature is behind a flag called paymentServiceFailure
(yes, an obvious name for this demo) and they tell you to change the defaultValue
from off
to on
.
Run the following script which notifies Dynatrace using a CUSTOM_INFO
event of the change, including the new value.
./runtimeChange.sh paymentServiceFailure on\n
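The contents of runtimeChange.sh are not reproduced in this guide; the following is a hypothetical sketch of the kind of payload such a script could build from its two arguments (the entity selector and property names here are assumptions, not the script's actual contents):

```shell
# Hypothetical payload builder for a feature-flag change notification.
# $1 = flag name, $2 = new value.
build_flag_change_event() {
  local flag="$1" value="$2"
  cat <<EOF
{
  "entitySelector": "type(SERVICE),entityName.equals(paymentservice)",
  "eventType": "CUSTOM_INFO",
  "properties": { "flag": "${flag}", "new.value": "${value}" },
  "title": "Feature flag ${flag} set to ${value}"
}
EOF
}

build_flag_change_event "paymentServiceFailure" "on"
```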
"},{"location":"enable-change/#change-flag-value","title":"Change Flag Value","text":"Locate the flags.yaml
file. Change the defaultValue
of the paymentServiceFailure
flag from \"off\"
to \"on\"
(line 84
).
Apply those changes:
kubectl apply -f $CODESPACE_VSCODE_FOLDER/flags.yaml\n
You should see:
configmap/my-otel-demo-flagd-config configured\n
"},{"location":"enable-change/#run-acceptance-load-test","title":"Run Acceptance Load Test","text":"It is time to run an acceptance load test to see if the new feature has caused a regression.
This load test will run for 3 minutes and then trigger the site reliability guardian again:
kubectl apply -f .devcontainer/k6/k6-after-change.yaml\n
"},{"location":"enable-change/#configuration-change-events","title":"Configuration Change Events","text":"While you are waiting for the load test to complete, it is worth noting that each time a feature flag is changed, you should execute runtimeChange.sh
shell script to send an event to the service that is affected.
The feature flag changes the behaviour of the paymentservice
(which the checkoutservice
depends on).
Look at the paymentservice
and notice the configuration changed events.
Tip
You can send events for anything you like: deployments, load tests, security scans, configuration changes and more.
You must have the following to use this hands-on demo.
Save the Dynatrace environment URL:
.apps.
in the URLThe generic format is:
https://<EnvironmentID>.<Environment>.<URL>\n
For example:
https://abc12345.live.dynatrace.com\n
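As a quick sanity check, the environment ID is simply the first label of the hostname. A small sketch using the example URL above:

```shell
# Extract the environment ID from a Dynatrace SaaS URL.
environment_id() {
  local url="${1#https://}"  # strip the scheme
  echo "${url%%.*}"          # keep everything before the first dot
}

environment_id "https://abc12345.live.dynatrace.com"  # prints abc12345
```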
"},{"location":"getting-started/#create-api-token","title":"Create API Token","text":"In Dynatrace:
ctrl + k
. Search for access tokens
.metrics.ingest
logs.ingest
events.ingest
openTelemetryTrace.ingest
Click this button to open the demo environment. This will open in a new tab.
Preparation Complete
The preparation phase is now complete. Everything before now is a one-off task.
In day-to-day operations, you would begin from here.
"},{"location":"run-production-srg/#run-an-evaluation","title":"Run an Evaluation","text":"Now that the Site Reliability Guardian is trained, run another evaluation by triggering a load test.
Tip
Remember, the workflow is currently configured to listen for test finished
events but you could easily create additional workflows with different triggers such as on-demand or time-based CRON triggers.
This provides the ability to continuously test your service (eg. in production).
Run another load test to trigger a sixth evaluation.
kubectl apply -f .devcontainer/k6/k6.yaml\n
Again, wait for all jobs to complete. This run will take longer, approximately 2 minutes.
kubectl -n default wait --for=condition=Complete --all --timeout 120s jobs\n
When the above command returns, you should see:
NAME STATUS COMPLETIONS DURATION AGE\nk6-training-run1 Complete 1/1 102s 9m41s\nk6-training-run2 Complete 1/1 100s 9m33s\nk6-training-run3 Complete 1/1 101s 9m23s\nk6-training-run4 Complete 1/1 93s 9m17s\nk6-training-run5 Complete 1/1 91s 9m11s\nrun-k6 Complete 1/1 79s 81s\n
When this evaluation is completed, click the Refresh
button in the Validation history
panel of the Site Reliability Guardian app (when viewing an individual guardian) and the heatmap should look like the image below.
Your results may vary
Your results may vary. In this example below, the Traffic
objective failed because the auto-adaptive thresholds detected that a traffic level below 1171
requests is too low and the actual traffic level was 1158
.
Because one objective failed, the guardian failed.
5 training runs and 1 \"real\" run:
Information Only Objectives
It is possible to add objectives that are \"informational only\" and do not contribute to the pass / fail decisions.
This is useful for new services where you are trying to \"get a feel for\" the real-world data values of your metrics.
To set an objective as \"information only\": * Select the objective to open the side panel * Scroll down to Define thresholds
* Select the No thresholds
option
After the codespace has started, the post creation script should begin. This will install everything and will take a few moments.
When the script has completed, a success message will briefly be displayed (it is so quick you'll probably miss it) and an empty terminal window will be shown.
"},{"location":"validate-telemetry/#wait-for-demo-to-start","title":"Wait For Demo to Start","text":"Wait for the demo application pods to start:
kubectl -n default wait --for=condition=Ready --all --timeout 300s pod\n
"},{"location":"validate-telemetry/#access-demo-user-interface","title":"Access Demo User Interface","text":"Start port forwarding to access the user interface:
kubectl -n default port-forward svc/my-otel-demo-frontendproxy 8080\n
Leave this command running. Open a new terminal window to run any other commands.
Go to ports tab, right click the demo app
entry and choose Open in browser
.
You should see the OpenTelemetry demo:
"},{"location":"validate-telemetry/#validate-telemetry","title":"Validate Telemetry","text":"It is time to ensure telemetry is flowing correctly into Dynatrace.
In Dynatrace, follow these steps:
"},{"location":"validate-telemetry/#validate-services","title":"Validate Services","text":"ctrl + k
. Search for services
. Go to services screen and validate you can see services.SERVICE-****
.CUSTOM_DEVICE-****
:ctrl + k
and search for settings
.Service Detection > Unified services for OpenTelemetry
and ensure the toggle is on.ctrl + k
. Search for distributed traces
.ctrl + k
. Search for metrics
.app.
and validate you can see some metrics.ctrl + k
. Search for notebooks
.+
to add a new DQL
section.fetch logs, scanLimitGBytes: 1\n| filter contains(content, \"conversion\")\n
"},{"location":"validate-telemetry/#telemetry-flowing","title":"Telemetry Flowing?","text":"If these four things are OK, your telemetry is flowing correctly into Dynatrace.
If not, please search for similar problems and / or raise an issue here.
Wait for all jobs to complete:
kubectl -n default wait --for=condition=Complete --all --timeout 120s jobs\n
All jobs (including the acceptance-load-test
) should now be Complete
.
Refresh the Site Reliability Guardian results heatmap again and notice that the guardian has failed.
The guardian has failed due to the error rate being too high.
Navigating to the checkoutservice
(ctrl + k
> services
> checkoutservice
), you can see the increase in failure rate.
Scroll down the services screen until you see the OpenTelemetry traces list. Notice lots of failed requests:
"},{"location":"view-acceptance-test-results/#analyse-a-failed-request","title":"Analyse a Failed Request","text":"Drill into one of the failed requests and notice lots of failures.
These failures are bubbling up through the request chain back towards the checkoutservice.
Ultimately though, the failure comes from the final span in the trace: The call to PaymentService/Charge
.
Investigating the span events, the cause of the failure becomes clear: the payment service caused an exception. The exception message and stack trace are given:
exception.message PaymentService Fail Feature Flag Enabled\nexception.stacktrace Error: PaymentService Fail Feature Flag Enabled at module.exports.charge\n (/usr/src/app/charge.js:21:11) at process.processTicksAndRejections\n (node:internal/process/task_queues:95:5) at async Object.chargeServiceHandler\n [as charge] (/usr/src/app/index.js:21:22)\nexception.type Error\n
"},{"location":"view-acceptance-test-results/#roll-back-change","title":"Roll Back Change","text":"Inform Dynatrace that a change in configuration is coming. The paymentServiceFailure
flag will be set to off:
./runtimeChange.sh paymentServiceFailure off\n
Again edit flags.yaml
and set the defaultValue
of paymentServiceFailure
from \"on\"
to \"off\"
(line 84
).
Apply the changes:
kubectl apply -f $CODESPACE_VSCODE_FOLDER/flags.yaml\n
"},{"location":"view-acceptance-test-results/#summary","title":"Summary","text":"Looking back at the initial brief, it was your job to:
So how did things turn out?
no go
decision based on evidence provided by OpenTelemetry and the Dynatrace Site Reliability Guardian.Works with any metric
The techniques described here work with any metric, from any source.
You are encouraged to use metrics from other devices and sources (such as business related metrics like revenue).
Success
The Dynatrace Platform, Site Reliability Guardian and Workflows have provided visibility and automated change analysis.
Content about how the user progresses after this demo.
"}]} \ No newline at end of file diff --git a/view-acceptance-test-results/index.html b/view-acceptance-test-results/index.html index 1c16226..05c6627 100755 --- a/view-acceptance-test-results/index.html +++ b/view-acceptance-test-results/index.html @@ -681,7 +681,7 @@./runtimeChange.sh paymentServiceFailure off
Again edit flags.yaml
and set the defaultValue
of paymentServiceFailure
from "on"
to "off"
(line 84
)
Apply the chnages:
+Apply the changes:
kubectl apply -f $CODESPACE_VSCODE_FOLDER/flags.yaml