Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block availability data enum #6866

Open
wants to merge 8 commits into
base: unstable
Choose a base branch
from

Conversation

dapplion
Copy link
Collaborator

Issue Addressed

PeerDAS has undergone multiple refactors + the blending with the get_blobs optimization has generated technical debt.

A function signature like this

fn get_blobs_or_columns_store_op(
&self,
block_root: Hash256,
block_epoch: Epoch,
blobs: Option<BlobSidecarList<T::EthSpec>>,
data_columns: Option<DataColumnSidecarList<T::EthSpec>>,
data_column_recv: Option<oneshot::Receiver<DataColumnSidecarList<T::EthSpec>>>,
) -> Result<Option<StoreOp<T::EthSpec>>, String> {

Allows at least the following combination of states:

  • blobs: Some / None
  • data_columns: Some / None
  • data_column_recv: Some / None
  • Block has data? Yes / No
  • Block post-PeerDAS? Yes / No

In reality, we don't have that many possible states, only:

  • NoData: pre-deneb, pre-PeerDAS with 0 blobs or post-PeerDAS with 0 blobs
  • Blobs(BlobSidecarList<E>): post-Deneb pre-PeerDAS with > 0 blobs
  • DataColumns(DataColumnSidecarList<E>): post-PeerDAS with > 0 blobs
  • DataColumnsRecv(oneshot::Receiver<DataColumnSidecarList<E>>): post-PeerDAS with > 0 blobs, but we obtained the columns via reconstruction

^ this are the variants of the new AvailableBlockData enum

So we go from 2^5 states to 4 well-defined. Downstream code benefits nicely from this clarity and I think it makes the whole feature much more maintainable.

Currently is_available returns a bool, and then we construct the available block in make_available. In a way the availability condition is duplicated in both functions. Instead, this PR constructs AvailableBlockData in is_available so the availability conditions are written once

if let Some(block_data) = is_available(..) {
    let available_block = make_available(block_data);
}

@dapplion dapplion added the das Data Availability Sampling label Jan 25, 2025
@dapplion dapplion added the ready-for-review The code is ready for review label Jan 25, 2025
let num_received_blobs = self.num_received_blobs();
let num_received_columns = self.num_received_data_columns();
if spec.is_peer_das_enabled_for_epoch(block.epoch()) {
if self.verified_data_columns.len() >= custody_column_count {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've discussed about this before - we don't allow for duplicates when adding columns to the available cache, so if len() is greater than custody column count, we likely have added non-custody column to the cache, and this would result in:

  1. inconsistent columns in DB
  2. incorrectly identifying a block as available where it isn't (getting some non custody columns, but custody columns may be missing)

With the current approach, we must be able to detect if there's a bug here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a TODO in the line below to fix this case later. We were not handling this case before and it's out of scope for this PR IMO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the other comment, our code on unstable right now requires an exact match for either blobs or columns, why change this?

"received_columns" => num_received_columns,
"expected_columns" => expected_columns_msg,
);
// TODO(das) get only the columns that we need, filter the rest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this filter necessary with the current approach?
Is this TODO to account for peer sampling?
Feels like it's inconsistent in the context of subnet sampling though - because we could have enough verified_data_columns but end up storing less than custody_column_count?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what to do in this case. Erroring here is bad as it will prevent block import. This should not happen and points to an internal error. I would rather think how to handle the > cases in a separate PR.

} else {
// Before PeerDAS, blobs
let num_received_blobs = self.verified_blobs.iter().flatten().count();
if num_received_blobs >= num_expected_blobs {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's definitely a bug if > happens, I'm not sure if we just make it available and store the the "uncommitted blobs"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the previous handling is good - and if there's unexpected blob there should be an error AvailabilityCheckError::Unexpected

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we did not handle the > case, we had a .take() on the blobs iter, we would just ignore the extra blobs

Copy link
Member

@jimmygchen jimmygchen Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The take() is unnecessary I believe, because we check for duplicates AND exact match before this PR

let all_blobs_received = block_kzg_commitments_count_opt
.is_some_and(|num_expected_blobs| num_expected_blobs == num_received_blobs);
let all_columns_received = expected_columns_opt
.is_some_and(|num_expected_columns| num_expected_columns == num_received_columns);

beacon_node/beacon_chain/src/data_availability_checker.rs Outdated Show resolved Hide resolved
"block_root" => %block_root,
"count" => blobs.len(),
);
Ok(Some(StoreOp::PutBlobs(block_root, blobs)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep the is_empty check for both blobs and columns so we dont't store empty values?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AvailableBlockData::Blobs enum by definition must have more than zero blobs. If we do a check here we should error, instead of ignoring the Op

beacon_node/beacon_chain/src/data_availability_checker.rs Outdated Show resolved Hide resolved
beacon_node/beacon_chain/src/data_availability_checker.rs Outdated Show resolved Hide resolved
@jimmygchen jimmygchen added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Jan 30, 2025
Copy link
Member

@jimmygchen jimmygchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, very elegant solution, and much cleaner.
I've added some comments, the main questions I have are the DA checker is_available checks and edge case handling.

@dapplion dapplion requested a review from jimmygchen January 31, 2025 23:06
@dapplion dapplion added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Jan 31, 2025
Copy link

mergify bot commented Feb 3, 2025

This pull request has merge conflicts. Could you please resolve them @dapplion? 🙏

.unwrap_or("unknown".to_string());
/// Returns Some if the block has received all its required data for import. The return value
/// must be persisted in the DB along with the block.
pub fn is_available(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit on convention: the name is_available implies an immutable method that does not mutate self and return a bool - although I can't think of a better name.

@jimmygchen jimmygchen added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Feb 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
das Data Availability Sampling waiting-on-author The reviewer has suggested changes and awaits thier implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants