Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manual sync committee update #1164

Merged
merged 9 commits into from
Apr 18, 2024
Merged

Conversation

claravanstaden
Copy link
Contributor

@claravanstaden claravanstaden commented Mar 27, 2024

  • If the sync committee update data retrieval fails (most likely because a beacon state is not available), build an update using the standard Beacon API endpoints.
  • Splits the protocol settings and methods into its own package.
  • Adds tests for the beacon data store.
  • First try to get an exact beacon state matching the slot provided, and if it can't be found, provide the best next update

@claravanstaden claravanstaden marked this pull request as ready for review April 4, 2024 09:18
Copy link
Contributor Author

@claravanstaden claravanstaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the large change - the biggest cause of the change is moving the protocol package out.

relayer/relays/beacon/config/config.go Show resolved Hide resolved
relayer/relays/beacon/header/header.go Show resolved Hide resolved
relayer/relays/beacon/header/header.go Show resolved Hide resolved
}
beaconState, err := s.unmarshalBeaconState(uint64(finalizedHeader.Slot), beaconStateData)

blockRootsProof, err = s.GetBlockRootsFromState(beaconState)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the state is not available, check if it perhaps available in the beacon store.

@yrong
Copy link
Contributor

yrong commented Apr 9, 2024

If the sync committee update data retrieval fails (most likely because a beacon state is not available), build an update using the standard Beacon API endpoints.

That's Cool. Considering now we can always get a valid finality update from API endpoints, is the db option still necessary?


attestedSlot, err := s.findAttestedAndFinalizedHeadersAtBoundary(attestedSlot, lastSyncedFinalizedSlot)
attestedSlot, err := s.FindOldestAttestedHeaderAtInterval(slot, boundary)
Copy link
Contributor

@yrong yrong Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just notice there are 2 methods doing things similar:

FindLatestAttestedHeadersAtInterval vs FindOldestAttestedHeaderAtInterval

So can the search here be replaced with s.FindLatestAttestedHeadersAtInterval(boundry, slot),
or maybe we can try to merge the two functions into one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The methods do similar but different things. FindLatestAttestedHeadersAtInterval finds the latest available slot, while FindOldestAttestedHeaderAtInterval finds the oldest available slot. I have merged the common code into a method called findValidUpdatePair here: https://github.com/Snowfork/snowbridge/pull/1164/files#diff-c589b232c311e67a8690c53fe1886c463a185178fde8ed4aa9dd2def499603b3R578. I think the 2 methods are now small and different enough to warrant 2 functions.

Copy link
Contributor

@yrong yrong Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I understand that.

Just curious seems we use FindLatestAttestedHeadersAtInterval for syncInterimFinalizedUpdate while FindOldestAttestedHeaderAtInterval for SyncCommitteePeriodUpdate, is there a special consideration for that?

Or be more specific can the search s.FindOldestAttestedHeaderAtInterval(slot, boundary) be replaced with s.FindLatestAttestedHeadersAtInterval(boundry, slot) and always search for the latest finality?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started typing a response and never sent it. 😅 The reason why these two are different in my opinion is:

  • Finalized Update: Ideally want a newer update, so that it can cover a "larger" range of SLOTS_PER_HISTORICAL_ROOT (meaning, we want a finalized header as far from the previous finalized header) as we can.
  • Finalized Update with Sync Committee: for this case, an earlier update is better since we'd like to update the next sync committee as soon as we can.

Let me know if you think this is kinda unnecessary, and if we should just use the latest header we can (and scrap the "earlier" update for Finalized Update with Sync Committee.

@claravanstaden
Copy link
Contributor Author

That's Cool. Considering now we can always get a valid finality update from API endpoints, is the db option still necessary?

The fallback finality update uses the DB option as a fallback as well. I would feel more comfortable with having the DB option as a backup. Do you have concerns with it?

@yrong
Copy link
Contributor

yrong commented Apr 10, 2024

I would feel more comfortable with having the DB option as a backup. Do you have concerns with it?

So we'll still stick to lodestar with --chain.archiveStateEpochFrequency=1 which already saves finality state every epoch, right?

Not much concern. Just from the perspective of DevOps seems we need a separate cloud instance to run the command storing beacon state periodically for high availability, then there is some maintenance cost.

@claravanstaden
Copy link
Contributor Author

So we'll still stick to lodestar with --chain.archiveStateEpochFrequency=1 which already saves finality state every epoch, right?

Yes. 😄

Not much concern. Just from the perspective of DevOps seems we need a separate cloud instance to run the command storing beacon state periodically for high availability, then there is some maintenance cost.

I think it can be a simple cronjob on the EC2 server that will run the relayer anyway. Agreed on the cost. We can always remove it, but having it for launch gives me great assurance.

@yrong
Copy link
Contributor

yrong commented Apr 13, 2024

I think it can be a simple cronjob on the EC2 server that will run the relayer anyway

IMHO the cronjob make sense only on a seperate EC2 instance, it's for backup when the EC2 server running lodestar is down.

So I'd assume it points to a different beacon node for which maybe we can run a backup lodestar with the mock options --execution.engineMock --eth1=false

In this way we have a more reliable fallback option but it also means more cost.

@claravanstaden
Copy link
Contributor Author

I think it can be a simple cronjob on the EC2 server that will run the relayer anyway

IMHO the cronjob make sense only on a seperate EC2 instance, it's for backup when the EC2 server running lodestar is down.

So I'd assume it points to a different beacon node for which maybe we can run a backup lodestar with the mock options --execution.engineMock --eth1=false

In this way we have a more reliable fallback option but it also means more cost.

I had more the scenario in mind where just the beacon relayer is down (perhaps restarting because of a bug or some other issue) like we had that weekend where the beacon relayer was down because it could not find the beacon state. In that case, it would have been fine to have a cron just download beacon states every now and then.

I am also not sure if we will be able to justify running 2 EC instances just for this backup, especially if other people also start running relayers.

@yrong
Copy link
Contributor

yrong commented Apr 16, 2024

Yeah, I'm also not keen to add another instance for this. Just in case when the instance is down for a while(i.e. including relayer, lodestar, db) seems we don't have a rescue solution, do we?

Anyway, this PR is very nice to provide another reliable fallback option so should be good to merge.

Copy link
Contributor

@yrong yrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

# Conflicts:
#	relayer/cmd/store_beacon_state.go
@claravanstaden claravanstaden merged commit 7f41605 into main Apr 18, 2024
1 check passed
@claravanstaden claravanstaden deleted the manual-sync-committee-update branch April 18, 2024 08:37
claravanstaden added a commit that referenced this pull request Apr 18, 2024
* adds sync committee to update method

* progress

* manual sync committee update

* fix tests

* cleanup

* remove unnecessary check

* simplify checkpoint populate

* fix method

---------

Co-authored-by: claravanstaden <Cats 4 life!>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants