Rate Limiting docs (#1877)

Co-authored-by: Flynn <[email protected]>
alpeb and kflynn authored Dec 4, 2024
1 parent 3d76c4c commit e322ec8
Showing 3 changed files with 276 additions and 0 deletions.
55 changes: 55 additions & 0 deletions linkerd.io/content/2-edge/features/rate-limiting.md
@@ -0,0 +1,55 @@
---
title: Rate Limiting
description: Linkerd offers a simple and performant HTTP local rate limiting solution to protect services from misbehaving clients
---

Rate limiting helps protect a service by controlling its inbound traffic flow to
prevent overload, ensure fair resource use, enhance security, manage costs,
maintain quality, and comply with SLAs.

See the [Configuring Rate Limiting
task](../../tasks/configuring-rate-limiting/) for a guided example of deploying
rate limiting, and the [HTTPLocalRateLimitPolicy reference
doc](../../reference/rate-limiting/) for the full resource specification.

## Scope

Linkerd offers a _local_ rate limiting solution: each inbound proxy enforces
the limit for its own pod. This is unlike _global_ rate limiting, which tracks
request volume across all replicas of a service, and therefore requires an
additional service to coordinate that state, making it more complex to deploy
and maintain.

## Fairness

In the `HTTPLocalRateLimitPolicy` CR you can optionally configure a rate limit
that applies to all the inbound traffic for a given Server, regardless of the
source.

Additionally, you can enforce fairness among clients by declaring a limit per
identity. This prevents specific clients from consuming the entire rate limit
quota at the expense of all the other clients. Note that all unmeshed sources
(which don't have an identity) are treated as a single client.

Finally, you can override the configuration for specific clients, selected by
their identity.

## Algorithm

Linkerd uses the [Generic cell rate algorithm
(GCRA)](https://en.wikipedia.org/wiki/Generic_cell_rate_algorithm) to implement
rate limiting, which is more performant than the token bucket and leaky bucket
algorithms usually used for rate limiting.

The GCRA has two parameters: cell rate and tolerance.

In its virtual scheduling description, the algorithm computes a theoretical
arrival time, representing the 'ideal' arrival time of a cell (request) if
cells (requests) were transmitted at equal intervals of time, corresponding to
the cell rate. How closely the flow of requests must abide by that arrival time
is determined by the tolerance parameter.

In Linkerd, we derive the cell rate from the `requestsPerSecond` entries in
`HTTPLocalRateLimitPolicy`, and the tolerance is set to one second. This helps
accommodate small variations or occasional bursts in traffic while ensuring
the long-term rate remains within limits.
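The virtual-scheduling form of GCRA fits in a few lines. The following is an
illustrative Python model, a sketch rather than Linkerd's actual Rust
implementation; the `rate` and `tolerance` parameters correspond to
`requestsPerSecond` and the fixed one-second tolerance described above:

```python
class GCRA:
    """Illustrative model of the Generic Cell Rate Algorithm in its
    virtual-scheduling form (a sketch, not Linkerd's actual code).

    rate:      allowed requests per second (from `requestsPerSecond`)
    tolerance: how far ahead of schedule a request may arrive, in
               seconds (Linkerd fixes this at one second)
    """

    def __init__(self, rate: float, tolerance: float = 1.0):
        self.interval = 1.0 / rate  # ideal spacing between requests
        self.tolerance = tolerance
        self.tat = 0.0              # theoretical arrival time (TAT)

    def allow(self, now: float) -> bool:
        # A request conforms if it arrives no more than `tolerance`
        # seconds ahead of the theoretical arrival time; otherwise it
        # is rejected (Linkerd responds with HTTP 429).
        if now < self.tat - self.tolerance:
            return False
        # Push the TAT forward by one ideal interval.
        self.tat = max(now, self.tat) + self.interval
        return True
```

With `rate=4` and the one-second tolerance, a simultaneous burst admits
`tolerance / interval + 1 = 5` requests before rejections begin, while requests
spaced 250 ms or more apart always conform, keeping the long-term rate at
4 RPS.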
70 changes: 70 additions & 0 deletions linkerd.io/content/2-edge/reference/rate-limiting.md
@@ -0,0 +1,70 @@
---
title: Rate Limiting
description: Reference guide to Linkerd's HTTPLocalRateLimitPolicy resource
---

Linkerd's rate limiting functionality is configured via
`HTTPLocalRateLimitPolicy` resources, each of which points to a
[Server](../../reference/authorization-policy/#server). Note that a
`Server` can only be referenced by a single `HTTPLocalRateLimitPolicy`.

{{< note >}}
A `Server`'s default `accessPolicy` is `deny`. This means that if no
[AuthorizationPolicies](../../reference/authorization-policy/) point to a
Server, it will deny traffic by default. If you want to set up rate limit
policies for a Server without also having to declare authorization policies,
make sure to set `accessPolicy` to a permissive value such as
`all-unauthenticated`.
{{< /note >}}

## HTTPLocalRateLimitPolicy Spec

{{< keyval >}}
| field | value |
|-------|-------|
| `targetRef` | A reference to the [Server](../../reference/authorization-policy/#server) this policy applies to. |
| `total.requestsPerSecond` | Overall rate limit for all traffic sent to the `targetRef`. If unset, no overall limit is applied. |
| `identity.requestsPerSecond` | Fairness for individual identities; each separate client, grouped by identity, is subject to this rate limit. If `total.requestsPerSecond` is also set, `identity.requestsPerSecond` cannot be greater than `total.requestsPerSecond`. |
| `overrides` | An array of [overrides](#overrides) for traffic from specific clients. |
{{< /keyval >}}

### Overrides

{{< keyval >}}
| field | value |
|-------|-------|
| `requestsPerSecond` | The number of requests per second allowed from clients matching `clientRefs`. If `total.requestsPerSecond` is also set, the `requestsPerSecond` for each `overrides` entry cannot be greater than `total.requestsPerSecond`. |
| `clientRefs.kind` | Kind of the referent. Currently only `ServiceAccount` is supported. |
| `clientRefs.namespace` | Namespace of the referent. When unspecified (or set to the empty string), this refers to the local namespace of the policy. |
| `clientRefs.name` | Name of the referent. |
{{< /keyval >}}

## Example

In this example, the policy targets the `web-http` Server, imposing a total
rate limit of 100 RPS and a limit of 20 RPS per identity, with an override of
25 RPS for the `special-client` ServiceAccount in the `emojivoto` namespace:

```yaml
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPLocalRateLimitPolicy
metadata:
  namespace: emojivoto
  name: web-rl
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
  total:
    requestsPerSecond: 100
  identity:
    requestsPerSecond: 20
  overrides:
    - requestsPerSecond: 25
      clientRefs:
        - kind: ServiceAccount
          namespace: emojivoto
          name: special-client
```
151 changes: 151 additions & 0 deletions linkerd.io/content/2-edge/tasks/configuring-rate-limiting.md
@@ -0,0 +1,151 @@
---
title: Configuring Rate Limiting
description: Using HTTP local rate limiting to protect a service
---

In this guide, we'll walk you through deploying an `HTTPLocalRateLimitPolicy`
resource to rate-limit the traffic to a given service.

For more information about Linkerd's rate limiting, see the [Rate Limiting
feature doc](../../features/rate-limiting/) and the [HTTPLocalRateLimitPolicy
reference doc](../../reference/rate-limiting/).

## Prerequisites

To follow this guide, you only need a Kubernetes cluster running Linkerd. You
can set one up by following the [Installing Linkerd guide](../install/).

## Setup

First, inject and install the Emojivoto application, then scale down the
`vote-bot` workload so it doesn't interfere with our testing:

```bash
linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -
kubectl -n emojivoto scale --replicas 0 deploy/vote-bot
```

Next, deploy a workload with an Ubuntu image, open a shell into it, and
install curl:

```bash
kubectl create deployment client --image ubuntu -- bash -c "sleep infinity"
kubectl exec -it client-xxx -- bash
root@client-xxx:/# apt-get update && apt-get install -y curl
```

Leave that shell open so we can use it below when [sending
requests](#sending-requests).

## Creating an HTTPLocalRateLimitPolicy resource

First, we need to create a `Server` resource pointing to the `web-svc`
service. Note that this `Server` has `accessPolicy: all-unauthenticated`,
which means traffic is allowed by default and we don't need to declare
authorization policies associated with it:

```yaml
kubectl apply -f - <<EOF
---
apiVersion: policy.linkerd.io/v1beta3
kind: Server
metadata:
  namespace: emojivoto
  name: web-http
spec:
  accessPolicy: all-unauthenticated
  podSelector:
    matchLabels:
      app: web-svc
  port: http
  proxyProtocol: HTTP/1
EOF
```

Now we can apply the `HTTPLocalRateLimitPolicy` resource pointing to that
`Server`. For now, we'll just set a limit of 4 RPS per identity:

```yaml
kubectl apply -f - <<EOF
---
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPLocalRateLimitPolicy
metadata:
  namespace: emojivoto
  name: web-http
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
  identity:
    requestsPerSecond: 4
EOF
```

## Sending requests

In the Ubuntu shell, issue 10 concurrent requests to `web-svc.emojivoto`:

```bash
root@client-xxx:/# results=$(for i in {1..10}; do curl -s -o /dev/null -w "%{http_code}\n" "http://web-svc.emojivoto" & done; wait)
root@client-xxx:/# echo $results
200 200 200 429 429 429 429 200 429 429
```

We see that only 4 requests were allowed. The requests that got rate-limited
received a response with an HTTP 429 status code.

### Overrides

The previous client had no identity, as it was deployed in the `default`
namespace, where workloads are not injected by default.

Now let's create a new Ubuntu workload in the `emojivoto` namespace, which
will be injected by default, and whose identity will be associated with the
`default` ServiceAccount in that namespace:

```bash
kubectl -n emojivoto create deployment client --image ubuntu -- bash -c "sleep infinity"
kubectl -n emojivoto exec -it client-xxx -c ubuntu -- bash
root@client-xxx:/# apt-get update && apt-get install -y curl
```

Before issuing requests, let's expand the `HTTPLocalRateLimitPolicy` resource,
adding an override for this specific client that allows it to issue up to
6 RPS:

```yaml
kubectl apply -f - <<EOF
---
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPLocalRateLimitPolicy
metadata:
  namespace: emojivoto
  name: web-http
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
  identity:
    requestsPerSecond: 4
  overrides:
    - requestsPerSecond: 6
      clientRefs:
        - kind: ServiceAccount
          namespace: emojivoto
          name: default
EOF
```

Finally, back in the shell, we execute the requests:

```bash
root@client-xxx:/# results=$(for i in {1..10}; do curl -s -o /dev/null -w "%{http_code}\n" "http://web-svc.emojivoto" & done; wait)
root@client-xxx:/# echo $results
429 429 429 429 200 200 200 200 200 200
```

We see that 6 requests were now allowed. If we tried again with the previous
client, we could verify that it would still be limited to 4 requests.
