chore(sequencer-relayer)!: minimize resubmissions to Celestia (#1234)
## Summary
The relayer has been updated to avoid resubmitting the same data to
Celestia on restart or on a timeout of the `BroadcastTx` gRPC.

## Background
Currently the relayer will likely resubmit the same data when it
restarts, since the majority of the time spent in the submit loop is
waiting for confirmation of the `BlobTx` having been stored. While in
this state, the on-disk files aren't updated to indicate that we're
waiting for confirmation. Hence if we restart, the new session begins by
submitting all data from the last-confirmed point, i.e. the same data
which was likely already submitted successfully in the final attempt of
the previous session.

A similar situation happens if we time out while waiting for the
`BroadcastTx` response - the retry loop will attempt to resubmit the
same data without first checking if the previous attempt succeeded.
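
As a rough illustration of the check that was missing, the sketch below decides what a retry should do based on the previous attempt's error. `AttemptError`, `NextAction` and `next_action` are hypothetical names used only for this example, not the relayer's actual types, and the sketch assumes the hash of the broadcast `BlobTx` is known when the timeout occurs.

```rust
/// Hypothetical outcome of a failed broadcast attempt.
#[derive(Debug, Clone, PartialEq)]
enum AttemptError {
    /// `BroadcastTx` timed out; the tx may still have been accepted.
    BroadcastTimeout { blob_tx_hash: String },
    /// Any other failure, treated here as "tx was not accepted".
    Other,
}

/// Hypothetical decision taken at the top of each retry.
#[derive(Debug, PartialEq)]
enum NextAction {
    /// Only look up the earlier tx instead of sending the data again.
    ConfirmOnly { blob_tx_hash: String },
    /// Build a new `BlobTx` and broadcast it.
    PrepareAndSubmit,
}

/// Chooses the next step from the previous attempt's error, if any.
fn next_action(previous: Option<&AttemptError>) -> NextAction {
    match previous {
        Some(AttemptError::BroadcastTimeout { blob_tx_hash }) => NextAction::ConfirmOnly {
            blob_tx_hash: blob_tx_hash.clone(),
        },
        // First attempt, or a failure that was not a timeout.
        _ => NextAction::PrepareAndSubmit,
    }
}
```

If the confirmation lookup then fails to find the tx, a caller would presumably fall back to `PrepareAndSubmit`; that step is left out of the sketch.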

## Changes
* Replaced the two state files (presubmit and postsubmit) with a single
one (submission-state), and likewise replaced their two env vars with a
single one. See
[the updated
spec](https://github.com/astriaorg/astria/blob/c110bdbf7d0fd05fba04775d07d046c37bbd7372/specs/sequencer-relayer.md#submission-state-file)
for further details.
* The `submission` module was heavily changed: there are now two primary
state structs `StartedSubmission` and `PreparedSubmission`, between
which the blob submitter toggles during normal operation. There is also
`FreshSubmission` which is only relevant at startup, and an enum
covering these three (`SubmissionStateAtStartup`) which is also only
used during startup. Finally, the `State` enum is an ephemeral object
only used to read/write the relevant state from/to disk.
* `BlobSubmitter::run` was modified to try to confirm the last
submission attempt from the previous session if the state file indicated
the relayer exited while in `prepared` state. If that submission is
confirmed (the most likely outcome), then the sequencer blocks in that
submission are simply skipped in the write loop. (We could try to avoid
even fetching these sequencer blocks, but that would be a significantly
more pervasive change, and is probably not worth the complexity.)
* The relayer's celestia client was changed to split `try_submit` into
`try_prepare` and `try_submit` so that the hash of the prepared `BlobTx`
can be returned from `try_prepare` to be recorded in the state file
before the transaction is broadcast to the Celestia app.
* `submit_with_retry` was updated to check for a broadcast timeout error
in the previous attempt and, in that case, to attempt only to confirm
that submission rather than automatically resubmitting the same data.
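
The toggle between the two primary states can be pictured roughly as follows. This is a simplified sketch, not the code in the `submission` module: the struct names `Started` and `Prepared`, their methods, and the use of `Option` to fold the fresh case into `Started` are assumptions made for illustration. The timestamp and all disk I/O are omitted.

```rust
/// Last confirmed submission; field names follow the state-file JSON.
#[derive(Clone, Copy, Debug, PartialEq)]
struct LastSubmission {
    celestia_height: u64,
    sequencer_height: u64,
}

/// Ready to prepare a new `BlobTx`. `None` stands in for a fresh start
/// (an assumption of this sketch).
#[derive(Debug, PartialEq)]
struct Started {
    last_submission: Option<LastSubmission>,
}

/// A `BlobTx` has been prepared and its hash recorded, but the
/// submission is not yet confirmed.
#[derive(Debug, PartialEq)]
struct Prepared {
    sequencer_height: u64,
    blob_tx_hash: String,
    last_submission: Option<LastSubmission>,
}

impl Started {
    /// Records the hash before broadcasting, so that a restarted
    /// session can look the tx up instead of resubmitting.
    fn into_prepared(self, sequencer_height: u64, blob_tx_hash: String) -> Prepared {
        Prepared {
            sequencer_height,
            blob_tx_hash,
            last_submission: self.last_submission,
        }
    }
}

impl Prepared {
    /// The tx was found on Celestia: advance the confirmed point.
    fn into_started(self, celestia_height: u64) -> Started {
        Started {
            last_submission: Some(LastSubmission {
                celestia_height,
                sequencer_height: self.sequencer_height,
            }),
        }
    }

    /// The tx was not stored: fall back to the previous confirmed point.
    fn revert(self) -> Started {
        Started {
            last_submission: self.last_submission,
        }
    }
}
```

Consuming `self` in each transition means a stale state value cannot be reused after the toggle, which is one reason to model the two states as separate types rather than a flag.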

## Testing
* Unit tests for the new `submission` types.
* Black box test `later_height_in_state_leads_to_expected_relay` was
updated.
* Manually observed expected behaviour against a locally-running
sequencer and Celestia app.

## Breaking Changelist
* Removed `ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH` and
`ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH` config env vars.
* Added `ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH` config env var.
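
The file at the new path holds one of three states (`fresh`, `started`, `prepared`), as documented in the updated `local.env.example`. The sketch below shows the starting height each state implies. It is illustrative only: `SubmissionState`, `first_height_to_relay` and the `is_stored` callback are hypothetical names, and the enum is a cut-down mirror of the JSON rather than the relayer's own type.

```rust
/// Cut-down, hypothetical mirror of the three documented states.
#[derive(Debug, Clone, PartialEq)]
enum SubmissionState {
    Fresh,
    Started {
        /// `last_submission.sequencer_height` in the JSON.
        last_sequencer_height: u64,
    },
    Prepared {
        /// Top-level `sequencer_height` in the JSON.
        sequencer_height: u64,
        /// `last_submission.sequencer_height` in the JSON.
        last_sequencer_height: u64,
        blob_tx_hash: String,
    },
}

/// Returns the first sequencer height to relay. `is_stored` stands in
/// for checking whether the given blob tx is stored on Celestia.
fn first_height_to_relay(state: &SubmissionState, is_stored: impl Fn(&str) -> bool) -> u64 {
    match state {
        SubmissionState::Fresh => 1,
        SubmissionState::Started { last_sequencer_height } => last_sequencer_height + 1,
        SubmissionState::Prepared {
            sequencer_height,
            last_sequencer_height,
            blob_tx_hash,
        } => {
            if is_stored(blob_tx_hash.as_str()) {
                // Last attempt succeeded: continue after it.
                sequencer_height + 1
            } else {
                // Last attempt was lost: redo from the confirmed point.
                last_sequencer_height + 1
            }
        }
    }
}
```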

## Related Issues
Closes #1200.
Fraser999 authored Jul 25, 2024
1 parent df1c206 commit 961294c
Showing 20 changed files with 1,321 additions and 739 deletions.
11 changes: 11 additions & 0 deletions Cargo.lock

2 changes: 1 addition & 1 deletion charts/sequencer-relayer/Chart.yaml
@@ -15,7 +15,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.11.0
version: 0.11.1

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
9 changes: 7 additions & 2 deletions charts/sequencer-relayer/files/scripts/start-relayer.sh
@@ -4,14 +4,19 @@ set -o errexit -o nounset -o pipefail

echo "Starting the Astria Sequencer Relayer..."

if ! [ -f "$ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH" ]; then
if ! [ -z ${ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH+x} ] && ! [ -f "$ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH" ]; then
echo "Pre-submit storage file not found, instantiating with ignore state. Post submit storage file will be created on first submit."
echo "{\"state\": \"ignore\"}" > $ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH
fi

if ! [ -f "$ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH" ]; then
if ! [ -z ${ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH+x} ] && ! [ -f "$ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH" ]; then
echo "Post-submit storage file does not exist, instantiating with fresh state. Will start relaying from first sequencer block."
echo "{\"state\": \"fresh\"}" > $ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH
fi

if ! [ -z ${ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH+x} ] && ! [ -f "$ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH" ]; then
echo "Submission state file does not exist, instantiating with fresh state. Will start relaying from first sequencer block."
echo "{\"state\": \"fresh\"}" > $ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH
fi

exec /usr/local/bin/astria-sequencer-relayer
4 changes: 4 additions & 0 deletions charts/sequencer-relayer/templates/_helpers.tpl
@@ -20,3 +20,7 @@ Namepsace to deploy elements into.
{{- define "sequencer-relayer.storage.postSubmitPath" -}}
{{ include "sequencer-relayer.storage.mountPath" . }}/postsubmit.json
{{- end }}

{{- define "sequencer-relayer.storage.submissionStatePath" -}}
{{ include "sequencer-relayer.storage.mountPath" . }}/submission-state.json
{{- end }}
5 changes: 3 additions & 2 deletions charts/sequencer-relayer/templates/configmaps.yaml
@@ -11,8 +11,6 @@ data:
ASTRIA_SEQUENCER_RELAYER_CELESTIA_APP_GRPC_ENDPOINT: "{{ .Values.config.relayer.celestiaAppGrpc }}"
ASTRIA_SEQUENCER_RELAYER_CELESTIA_APP_KEY_FILE: "/celestia-key/{{ .Values.config.celestiaAppPrivateKey.secret.filename }}"
ASTRIA_SEQUENCER_RELAYER_API_ADDR: "0.0.0.0:{{ .Values.ports.healthAPI }}"
ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH: "{{ include "sequencer-relayer.storage.preSubmitPath" . }}"
ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH: "{{ include "sequencer-relayer.storage.postSubmitPath" . }}"
ASTRIA_SEQUENCER_RELAYER_NO_METRICS: "{{ not .Values.config.relayer.metrics.enabled }}"
ASTRIA_SEQUENCER_RELAYER_METRICS_HTTP_LISTENER_ADDR: "0.0.0.0:{{ .Values.ports.metrics }}"
ASTRIA_SEQUENCER_RELAYER_FORCE_STDOUT: "{{ .Values.global.useTTY }}"
@@ -30,7 +28,10 @@ data:
ASTRIA_SEQUENCER_RELAYER_SEQUENCER_CHAIN_ID: "{{ .Values.config.relayer.sequencerChainId }}"
ASTRIA_SEQUENCER_RELAYER_CELESTIA_CHAIN_ID: "{{ .Values.config.relayer.celestiaChainId }}"
{{- if not .Values.global.dev }}
ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH: "{{ include "sequencer-relayer.storage.preSubmitPath" . }}"
ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH: "{{ include "sequencer-relayer.storage.postSubmitPath" . }}"
{{- else }}
ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH: "{{ include "sequencer-relayer.storage.submissionStatePath" . }}"
{{- end }}
---
apiVersion: v1
6 changes: 3 additions & 3 deletions charts/sequencer/Chart.lock
@@ -1,6 +1,6 @@
dependencies:
- name: sequencer-relayer
repository: file://../sequencer-relayer
version: 0.11.0
digest: sha256:70434f4e37c36660ff9b89258d4de6770f206712020bda7398a22772e8f74fa8
generated: "2024-07-19T12:21:51.250339+02:00"
version: 0.11.1
digest: sha256:9c44f4901c4b89bbf6261f1a92bb18f71aef6d95e536aadc8a4a275a01eec25b
generated: "2024-07-23T20:01:42.179395482+01:00"
4 changes: 2 additions & 2 deletions charts/sequencer/Chart.yaml
@@ -15,7 +15,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.19.0
version: 0.19.1
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
@@ -24,7 +24,7 @@ appVersion: "0.14.0"

dependencies:
- name: sequencer-relayer
version: "0.11.0"
version: "0.11.1"
repository: "file://../sequencer-relayer"
condition: sequencer-relayer.enabled

8 changes: 7 additions & 1 deletion crates/astria-sequencer-relayer/Cargo.toml
@@ -25,6 +25,7 @@ const_format = { workspace = true }
futures = { workspace = true }
hex = { workspace = true, features = ["serde"] }
humantime = { workspace = true }
humantime-serde = "1.1.1"
hyper = { workspace = true }
itoa = { workspace = true }
metrics = { workspace = true }
@@ -39,7 +40,12 @@ tendermint-config = { workspace = true }
thiserror = { workspace = true }
tracing = { workspace = true }
tryhard = { workspace = true }
tokio = { workspace = true, features = ["macros", "rt-multi-thread", "signal"] }
tokio = { workspace = true, features = [
"fs",
"macros",
"rt-multi-thread",
"signal",
] }
tokio-stream = { workspace = true }
tokio-util = { workspace = true }
tonic = { workspace = true }
22 changes: 8 additions & 14 deletions crates/astria-sequencer-relayer/local.env.example
@@ -60,23 +60,17 @@ ASTRIA_SEQUENCER_RELAYER_ONLY_INCLUDE_ROLLUPS=
# The socket address at which sequencer relayer will server healthz, readyz, and status calls.
ASTRIA_SEQUENCER_RELAYER_API_ADDR=127.0.0.1:2450

# The path to which relayer will write its state prior to submitting to Celestia.
# A file must exist at this path, be readable and writable, and contain one of:
# 1. {"state": "ignore"}
# to ignore the pre-submit state entirely and only consider the object stored in
# ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH.
# 2. {"state": "started", "last_submission": <post_submission_state> }
# which is usually only written by sequencer-relayer during normal operation and
# is checked for consistency with ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH at startup.
ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH=/path/to/presubmit.json

# The path to which relayer will write its state after submitting to Celestia.
# The path to which relayer will write its state while submitting to Celestia.
# A file must exist at this path, be readable and writable, and contain one of:
# 1. {"state": "fresh"}
# for relaying sequencer blocks starting at sequencer height 1.
# 2. {"state": "submitted", "celestia_height": <number>, "sequencer_height": <number>}}
# for relaying blocks starting at `<number> + 1`.
ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH=/path/to/postsubmit.json
# 2. {"state":"started","last_submission":{"celestia_height":<number>,"sequencer_height":<number>}}
# for relaying blocks starting at `[last_submission.sequencer_height] + 1`.
# 3. {"state":"prepared","sequencer_height":<number>,"last_submission":{"celestia_height":<number>,"sequencer_height":<number>},"blob_tx_hash":"<hex string>","at":"<timestamp>"}
# for trying to continue from the last submission attempt. Checks if the given blob tx is stored
# on Celestia, and if so, begins relaying blocks starting at `[sequencer_height] + 1`, otherwise
# begins relaying blocks starting at `[last_submission.sequencer_height] + 1`.
ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH=/path/to/submission-state.json

# Set to true to enable prometheus metrics.
ASTRIA_SEQUENCER_RELAYER_NO_METRICS=true
6 changes: 2 additions & 4 deletions crates/astria-sequencer-relayer/src/config.rs
@@ -48,10 +48,8 @@ pub struct Config {
pub metrics_http_listener_addr: String,
/// Writes a human readable format to stdout instead of JSON formatted OTEL trace data.
pub pretty_print: bool,
/// The path to which relayer will write its state prior to submitting to Celestia.
pub pre_submit_path: PathBuf,
/// The path to which relayer will write its state after submitting to Celestia.
pub post_submit_path: PathBuf,
/// The path to which relayer will write its state while submitting to Celestia.
pub submission_state_path: PathBuf,
}

impl Config {
9 changes: 3 additions & 6 deletions crates/astria-sequencer-relayer/src/relayer/builder.rs
@@ -35,8 +35,7 @@ pub(crate) struct Builder {
pub(crate) sequencer_poll_period: Duration,
pub(crate) sequencer_grpc_endpoint: String,
pub(crate) rollup_filter: IncludeRollup,
pub(crate) pre_submit_path: PathBuf,
pub(crate) post_submit_path: PathBuf,
pub(crate) submission_state_path: PathBuf,
pub(crate) metrics: &'static Metrics,
}

@@ -53,8 +52,7 @@ impl Builder {
sequencer_poll_period,
sequencer_grpc_endpoint,
rollup_filter,
pre_submit_path,
post_submit_path,
submission_state_path,
metrics,
} = self;

@@ -93,8 +91,7 @@ impl Builder {
celestia_client_builder,
rollup_filter,
state,
pre_submit_path,
post_submit_path,
submission_state_path,
metrics,
})
}
@@ -103,6 +103,12 @@ pub(in crate::relayer) enum TrySubmitError {
#[derive(Clone, Debug)]
pub(in crate::relayer) struct GrpcResponseError(Status);

impl GrpcResponseError {
pub(in crate::relayer) fn is_timeout(&self) -> bool {
self.0.code() == tonic::Code::Cancelled
}
}

impl Display for GrpcResponseError {
fn fmt(&self, formatter: &mut Formatter<'_>) -> fmt::Result {
write!(