Rate Limiting docs (#1877)

Co-authored-by: Flynn <[email protected]>
alpeb and kflynn authored Dec 4, 2024
1 parent 3d76c4c commit e322ec8
Showing 3 changed files with 276 additions and 0 deletions.
55 changes: 55 additions & 0 deletions linkerd.io/content/2-edge/features/rate-limiting.md
@@ -0,0 +1,55 @@
---
title: Rate Limiting
description: Linkerd offers a simple and performant HTTP local rate limiting solution to protect services from misbehaving clients
---

Rate limiting helps protect a service by controlling its inbound traffic flow to
prevent overload, ensure fair resource use, enhance security, manage costs,
maintain quality, and comply with SLAs.

See the [Configuring Rate Limiting
task](../../tasks/configuring-rate-limiting/) for a guided example of deploying
rate limiting, and the [HTTPLocalRateLimitPolicy reference
doc](../../reference/rate-limiting/) for the full resource specification.

## Scope

Linkerd offers a _local_ rate limiting solution: each inbound proxy enforces
the limit for its own pod. This is unlike _global_ rate limiting, which tracks
request volume across all replicas of a service, and therefore requires an
additional service to coordinate that state, making it more complex to deploy
and maintain.

## Fairness

In the `HTTPLocalRateLimitPolicy` CR you can optionally configure a rate limit
that applies to all the inbound traffic for a given Server, regardless of the
source.

Additionally, you can enforce fairness among clients by declaring a limit per
identity. This prevents specific clients from consuming the entire rate limit
quota at the expense of all the other clients. Note that all unmeshed sources
(which don't have an identity) are treated as a single client.

Finally, you can override the configuration for specific clients, selected by
their identity.

## Algorithm

Linkerd uses the [Generic cell rate algorithm
(GCRA)](https://en.wikipedia.org/wiki/Generic_cell_rate_algorithm) to implement
rate limiting, which is more performant than the token bucket and leaky bucket
algorithms usually used for rate limiting.

The GCRA has two parameters: cell rate and tolerance.

In its virtual scheduling description, the algorithm computes a theoretical
arrival time, representing the 'ideal' arrival time of a cell (request) if
cells (requests) were transmitted at equal intervals of time, corresponding to
the cell rate. How closely the flow of requests must abide by that arrival time
is determined by the tolerance parameter.

In Linkerd, we derive the cell rate from the `requestsPerSecond` entries in
`HTTPLocalRateLimitPolicy`, and the tolerance is set to one second. This helps
accommodate small variations or occasional bursts in traffic while ensuring
the long-term rate remains within limits.
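The virtual-scheduling form of GCRA fits in a few lines. The following is an
illustrative Python model, a sketch rather than Linkerd's actual Rust
implementation; the `rate` and `tolerance` parameters correspond to
`requestsPerSecond` and the fixed one-second tolerance described above:

```python
class GCRA:
    """Illustrative model of the Generic Cell Rate Algorithm in its
    virtual-scheduling form (a sketch, not Linkerd's actual code).

    rate:      allowed requests per second (from `requestsPerSecond`)
    tolerance: how far ahead of schedule a request may arrive, in
               seconds (Linkerd fixes this at one second)
    """

    def __init__(self, rate: float, tolerance: float = 1.0):
        self.interval = 1.0 / rate  # ideal spacing between requests
        self.tolerance = tolerance
        self.tat = 0.0              # theoretical arrival time (TAT)

    def allow(self, now: float) -> bool:
        # A request conforms if it arrives no more than `tolerance`
        # seconds ahead of the theoretical arrival time; otherwise it
        # is rejected (Linkerd responds with HTTP 429).
        if now < self.tat - self.tolerance:
            return False
        # Push the TAT forward by one ideal interval.
        self.tat = max(now, self.tat) + self.interval
        return True
```

With `rate=4` and the one-second tolerance, a simultaneous burst admits
`tolerance / interval + 1 = 5` requests before rejections begin, while requests
spaced 250 ms or more apart always conform, keeping the long-term rate at
4 RPS.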
70 changes: 70 additions & 0 deletions linkerd.io/content/2-edge/reference/rate-limiting.md
@@ -0,0 +1,70 @@
---
title: Rate Limiting
description: Reference guide to Linkerd's HTTPLocalRateLimitPolicy resource
---

Linkerd's rate limiting functionality is configured via
`HTTPLocalRateLimitPolicy` resources, each of which points to a
[Server](../../reference/authorization-policy/#server). Note that a
`Server` can only be referenced by a single `HTTPLocalRateLimitPolicy`.

{{< note >}}
A `Server`'s default `accessPolicy` is `deny`. This means that if no
[AuthorizationPolicies](../../reference/authorization-policy/) point to a
Server, it will deny traffic by default. If you want to set up rate limit
policies for a Server without also having to declare authorization policies,
make sure to set `accessPolicy` to a permissive value such as
`all-unauthenticated`.
{{< /note >}}

## HTTPLocalRateLimitPolicy Spec

{{< keyval >}}
| field | value |
|-------|-------|
| `targetRef` | A reference to the [Server](../../reference/authorization-policy/#server) this policy applies to. |
| `total.requestsPerSecond` | Overall rate limit for all traffic sent to the `targetRef`. If unset, no overall limit is applied. |
| `identity.requestsPerSecond` | Fairness for individual identities; each separate client, grouped by identity, is subject to this rate limit. If `total.requestsPerSecond` is also set, `identity.requestsPerSecond` cannot be greater than `total.requestsPerSecond`. |
| `overrides` | An array of [overrides](#overrides) for traffic from specific clients. |
{{< /keyval >}}

### Overrides

{{< keyval >}}
| field | value |
|-------|-------|
| `requestsPerSecond` | The number of requests per second allowed from clients matching `clientRefs`. If `total.requestsPerSecond` is also set, the `requestsPerSecond` for each `overrides` entry cannot be greater than `total.requestsPerSecond`. |
| `clientRefs.kind` | Kind of the referent. Currently only `ServiceAccount` is supported. |
| `clientRefs.namespace` | Namespace of the referent. When unspecified (or set to the empty string), this refers to the local namespace of the policy. |
| `clientRefs.name` | Name of the referent. |
{{< /keyval >}}

## Example

In this example, the policy targets the `web-http` Server, imposing a total
rate limit of 100 RPS and a limit of 20 RPS per identity, with an override of
25 RPS for the `special-client` ServiceAccount in the `emojivoto` namespace:

```yaml
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPLocalRateLimitPolicy
metadata:
  namespace: emojivoto
  name: web-rl
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
  total:
    requestsPerSecond: 100
  identity:
    requestsPerSecond: 20
  overrides:
    - requestsPerSecond: 25
      clientRefs:
        - kind: ServiceAccount
          namespace: emojivoto
          name: special-client
```
151 changes: 151 additions & 0 deletions linkerd.io/content/2-edge/tasks/configuring-rate-limiting.md
@@ -0,0 +1,151 @@
---
title: Configuring Rate Limiting
description: Using HTTP local rate limiting to protect a service
---

In this guide, we'll walk you through deploying an `HTTPLocalRateLimitPolicy`
resource to rate-limit the traffic to a given service.

For more information about Linkerd's rate limiting, see the [Rate Limiting
feature doc](../../features/rate-limiting/) and the [HTTPLocalRateLimitPolicy
reference doc](../../reference/rate-limiting/).

## Prerequisites

To follow this guide, you only need a Kubernetes cluster running Linkerd. You
can set one up by following the [Installing Linkerd guide](../install/).

## Setup

First, inject and install the Emojivoto application, then scale down the
`vote-bot` workload so it doesn't interfere with our testing:

```bash
linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -
kubectl -n emojivoto scale --replicas 0 deploy/vote-bot
```

Next, deploy a workload with an Ubuntu image, open a shell into it, and
install curl:

```bash
kubectl create deployment client --image ubuntu -- bash -c "sleep infinity"
kubectl exec -it client-xxx -- bash
root@client-xxx:/# apt-get update && apt-get install -y curl
```

Leave that shell open so we can use it below when [sending
requests](#sending-requests).

## Creating an HTTPLocalRateLimitPolicy resource

First, we need to create a `Server` resource pointing to the `web-svc`
service. Note that this `Server` has `accessPolicy: all-unauthenticated`,
which means traffic is allowed by default and we don't need to declare
authorization policies associated with it:

```yaml
kubectl apply -f - <<EOF
---
apiVersion: policy.linkerd.io/v1beta3
kind: Server
metadata:
  namespace: emojivoto
  name: web-http
spec:
  accessPolicy: all-unauthenticated
  podSelector:
    matchLabels:
      app: web-svc
  port: http
  proxyProtocol: HTTP/1
EOF
```

Now we can apply the `HTTPLocalRateLimitPolicy` resource pointing to that
`Server`. For now, we'll just set a limit of 4 RPS per identity:

```yaml
kubectl apply -f - <<EOF
---
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPLocalRateLimitPolicy
metadata:
  namespace: emojivoto
  name: web-http
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
  identity:
    requestsPerSecond: 4
EOF
```

## Sending requests

In the Ubuntu shell, issue 10 concurrent requests to `web-svc.emojivoto`:

```bash
root@client-xxx:/# results=$(for i in {1..10}; do curl -s -o /dev/null -w "%{http_code}\n" "http://web-svc.emojivoto" & done; wait)
root@client-xxx:/# echo $results
200 200 200 429 429 429 429 200 429 429
```

We see that only 4 requests were allowed. The requests that got rate-limited
received a response with an HTTP 429 status code.

### Overrides

The previous client had no identity, as it was deployed in the `default`
namespace, where workloads are not injected by default.

Now let's create a new Ubuntu workload in the `emojivoto` namespace, which
will be injected by default, and whose identity will be associated with the
`default` ServiceAccount in that namespace:

```bash
kubectl -n emojivoto create deployment client --image ubuntu -- bash -c "sleep infinity"
kubectl -n emojivoto exec -it client-xxx -c ubuntu -- bash
root@client-xxx:/# apt-get update && apt-get install -y curl
```

Before issuing requests, let's expand the `HTTPLocalRateLimitPolicy` resource,
adding an override for this specific client that allows it to issue up to
6 RPS:

```yaml
kubectl apply -f - <<EOF
---
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPLocalRateLimitPolicy
metadata:
  namespace: emojivoto
  name: web-http
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
  identity:
    requestsPerSecond: 4
  overrides:
    - requestsPerSecond: 6
      clientRefs:
        - kind: ServiceAccount
          namespace: emojivoto
          name: default
EOF
```

Finally, back in the shell, we execute the requests:

```bash
root@client-xxx:/# results=$(for i in {1..10}; do curl -s -o /dev/null -w "%{http_code}\n" "http://web-svc.emojivoto" & done; wait)
root@client-xxx:/# echo $results
429 429 429 429 200 200 200 200 200 200
```

We see that 6 requests were now allowed. If we tried again with the previous
client, we could verify that it would still be limited to 4 requests.
