Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pattern for scheduled lambdas #186

Merged
merged 15 commits into from
Jan 25, 2021
Merged

Add pattern for scheduled lambdas #186

merged 15 commits into from
Jan 25, 2021

Conversation

jacobwinch
Copy link
Contributor

What does this change?

This PR is a first attempt at a scheduled lambda pattern. It also introduces a construct for CloudWatch alarms, and a pattern for a lambda-specific CloudWatch alarms, based on error %.

I've introduced the CloudWatch-y stuff at the same time as the scheduled lambda pattern because a number of examples of this pattern 'in the wild' also include an alert. I think we should be encouraging developers to think about monitoring their software whenever they add new compute resources, so hopefully this will help to do that.

Does this change require changes to existing projects or CDK CLI?

No changes are required, however stacks that use schedule-based lambdas can simplify their infrastructure definition slightly.

How to test

  1. Upgrade an existing stack to use this new pattern. The snapshot tests should show 0 (or trivial) changes.
  2. Create a new stack based on this pattern. You should end up with a functional scheduled lambda.
  3. Configure monitoring and force the lambda to fail X% of the time to ensure that the new alert pattern works as expected.

I've tested 2 & 3 in the developerPlayground account.

How can we measure success?

  • Projects already using this library (e.g. tag-janitor and pagerduty-summary) should be able to simplify their infrastructure definition slightly
  • New projects should get sensible monitoring for a tiny amount of extra code/effort

Have we considered potential risks?

  • This is all new functionality; merging this PR should not have an impact on any projects currently using this library.

export interface ErrorPercentageMonitoring
extends Omit<GuAlarmProps, "metric" | "threshold" | "comparisonOperator" | "evaluationPeriods"> {
toleratedErrorPercentage: number;
numberOfFiveMinutePeriodsToEvaluate?: number;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better suggestions for this name are welcome!


interface GuScheduledLambdaProps extends Omit<GuFunctionProps, "rules" | "apis"> {
schedule: Schedule;
monitoringConfiguration?: ErrorPercentageMonitoring;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered about making the opt-out for this more explicit. In Scala I would do something like:

sealed trait LambdaMonitoring
case object NoMonitoring extends LambdaMonitoring
case class ErrorPercentageMonitoring(toleratedErrorPercentage: Int, snsTopicArn: String, numberOfFiveMinutePeriodsToEvaluate: Option[Int]=Some(5)) extends LambdaMonitoring

The interface would then be:

monitoringConfiguration: LambdaMonitoring

So to opt-out, you'd have to state:

monitoringConfiguration: NoMonitoring

This would make it clearer (and highlight to reviewers) that you were intentionally omitting this (hopefully for a good reason e.g. you prefer some different alerting system to CloudWatch), rather than just being (another) optional property that can be easily forgotten.

Any opinions on whether we'd want to do this? If so, any suggestions on how we could go about this in TypeScript? I tried a couple of options using Unions but they were all pretty ugly!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would be nice. I don't know of a nice way of doing that in typescript although I often feel like there must be one that I'm missing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? (I'm not too certain on the as though!)

Copy link
Contributor Author

@jacobwinch jacobwinch Jan 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion @akash1810 - I've used this as the basis for dc3972c. To be honest, I'm not sure whether I like this more than my original implementation, but I've spent quite a bit of time experimenting and I can't find a cleaner way. Happy to revert dc3972c if people are not keen on this approach!

@@ -0,0 +1,32 @@
import { ComparisonOperator, MathExpression } from "@aws-cdk/aws-cloudwatch";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't 100% sure whether this should be a pattern or a construct - any feedback on this would be helpful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first impression is that this should be in constructs as patterns should essentially encompass an entire stack but interested to hear other people's opinions too

Copy link
Contributor Author

@jacobwinch jacobwinch Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having read this comment and listened to @akash1810's excellent explanation of patterns during the presentations, I'm inclined to agree - I'll move this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @jamie-lynch and reminded that InstanceRole needs moving out of the patterns directory! https://docs.aws.amazon.com/cdk/latest/guide/constructs.html is a good read.

import type { GuStack } from "../core";

export interface GuAlarmProps extends AlarmProps {
snsTopicArn: string;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most teams I've worked on have 1 or 2 topics configured for all alerts and then pass the arn into various templates as parameters.

We could also consider taking the topic name (which is not private) instead of the full arn (this contains an AWS account number, which should be kept private), to simplify things slightly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it looks like that's what the GuLogShippingPolicy does. Then it gets the region and account from scope

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth having a single topic, set in param store, for use by multiple stacks? Or is the separation of concerns important?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. My initial thought is that configuring a single topic in parameter store would not work so well in this case because multiple teams sometimes share the same AWS account, but not the same topic for alerts (or the same mailing list). Similarly teams might want to use different topics for alerts of a different priority etc.

I think it'd be worth thinking about how we might standardise this further in the future, but I'm inclined to leave it as is for now.

import type { GuStack } from "../core";

export interface GuAlarmProps extends AlarmProps {
snsTopicArn: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth having a single topic, set in param store, for use by multiple stacks? Or is the separation of concerns important?

@@ -12,7 +12,7 @@ interface ApiProps extends Omit<LambdaRestApiProps, "handler"> {
id: string;
}

interface GuFunctionProps extends Omit<FunctionProps, "code"> {
export interface GuFunctionProps extends Omit<FunctionProps, "code"> {
code: { bucket: string; key: string };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly not for this PR, but would be good for the bucket to come for free, GuGetDistributablePolicy is a vague demonstration of this.

@@ -0,0 +1,32 @@
import { ComparisonOperator, MathExpression } from "@aws-cdk/aws-cloudwatch";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @jamie-lynch and reminded that InstanceRole needs moving out of the patterns directory! https://docs.aws.amazon.com/cdk/latest/guide/constructs.html is a good read.


interface GuScheduledLambdaProps extends Omit<GuFunctionProps, "rules" | "apis"> {
schedule: Schedule;
monitoringConfiguration?: ErrorPercentageMonitoring;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? (I'm not too certain on the as though!)

package.json Outdated
@@ -40,6 +40,7 @@
"dependencies": {
"@aws-cdk/assert": "~1.74.0",
"@aws-cdk/aws-autoscaling": "~1.74.0",
"@aws-cdk/aws-cloudwatch-actions": "~1.74.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This'll need rebasing from main since #191.

Suggested change
"@aws-cdk/aws-cloudwatch-actions": "~1.74.0",
"@aws-cdk/aws-cloudwatch-actions": "1.86.0",

@jacobwinch jacobwinch marked this pull request as ready for review January 25, 2021 12:05
@jacobwinch jacobwinch requested a review from a team January 25, 2021 12:05
Copy link
Contributor

@jamie-lynch jamie-lynch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I've left a couple of comments but either stuff for us to think about in the future or personal preference on how we test so happy for this to be merged.

runtime: Runtime.NODEJS_12_X,
});
new GuAlarm(stack, "alarm", {
alarmName: `Alarm in ${stack.stage}`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One for a future PR but I wonder if it would be useful to append stack, stage and app to the alarm name (or including the information in some other way) as part of the GuAlarm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Ideally we would add these as tags, but unfortunately that's not currently possible, so perhaps we should put these into the name by default for now.

Comment on lines +7 to +22
export type LambdaMonitoring = NoMonitoring | ErrorPercentageMonitoring;

export interface NoMonitoring {
noMonitoring: true;
}

export interface ErrorPercentageMonitoring
extends Omit<GuAlarmProps, "metric" | "threshold" | "comparisonOperator" | "evaluationPeriods"> {
toleratedErrorPercentage: number;
numberOfFiveMinutePeriodsToEvaluate?: number;
noMonitoring?: false;
}

interface GuLambdaAlarmProps extends ErrorPercentageMonitoring {
lambda: GuLambdaFunction;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one for a future PR but I wonder if the noMonitoring stuff is something that should live at a higher level as it's unlikely to be specific to Lambda monitoring and also it feels a bit odd to define a prop here which then isn't used.

Maybe NoMonitoring should be exported from somewhere else and the monitoringConfiguration prop becomes monitoringConfiguration: LambdaMonitoring | NoMonitoring? Then I guess ErrorPercentageMonitoring could extend GuMonitoringProps to add the noMonitoring?: false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - this part of the code needs a little more thought as we cover more use-cases. I think I will go ahead and merge this as is for now, but perhaps we can refactor this slightly as part of adding additional lambda patterns.

Comment on lines 39 to 42
expect(SynthUtils.toCloudFormation(stack)).toMatchSnapshot();
expect(stack).toHaveResource("AWS::CloudWatch::Alarm", {
EvaluationPeriods: 12,
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think given this has been moved to constructs we should prefer direct assertions over snapshots wherever possible. For the first test, the snapshot is maybe still the better choice as it will pick up any unexpected side effects of changes we make but for these tests that also have a direct assertion, I'd vote that the snapshot assertions can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds sensible, I've updated the tests and removed obsolete snapshots in 139d524.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants