Data validation on EVM-based chains #67

Open
6 of 9 tasks
denisbsu opened this issue Sep 30, 2024 · 0 comments
denisbsu commented Sep 30, 2024

There are a number of data validators we should have in the ETH data ingestion system. Some of them are already implemented.

Primary data consistency:

  • Is it even a chain? block.parentHash == prev_block.hash
  • Is it even a block? block.hash == hash(block)
  • Are transactions intact? block.transactionsRoot == MPT(block.transactions).root
  • Are receipts intact? block.receiptsRoot == MPT(block.receipts).root
  • Is state transition correct?
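
As an illustration of the first check in the list above, here is a minimal sketch of a chain-linkage validator. It assumes blocks arrive as dicts with an integer `number` and hex-string `hash` and `parentHash` fields; the function name and the dict shape are hypothetical and not part of the existing ingestion code. It takes `hash` on trust, so it complements the `block.hash == hash(block)` check rather than replacing it.

```python
from typing import Any, Iterable, List, Mapping, Optional, Tuple


def find_broken_links(blocks: Iterable[Mapping[str, Any]]) -> List[Tuple[int, str]]:
    """Return (block number, reason) for every block that does not extend its predecessor.

    Blocks are expected in ascending order. Hex strings are compared
    case-insensitively because providers differ in how they case them.
    """
    problems: List[Tuple[int, str]] = []
    prev: Optional[Mapping[str, Any]] = None
    for block in blocks:
        if prev is not None:
            if block["number"] != prev["number"] + 1:
                problems.append((block["number"], "gap in block numbers"))
            elif block["parentHash"].lower() != prev["hash"].lower():
                problems.append((block["number"], "parentHash mismatch"))
        prev = block
    return problems
```

An empty result means the sequence is linked end to end; any entry points at the first block of a broken link.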

Derivative data correctness:

  • Is the transaction "from" field valid? transaction.from == recover_sender(transaction)
  • Are traces correct?
  • Are state diffs correct?

Data internal consistency:

  • Is block bloom correct? block.bloomFilter == OR(block.receipts.bloomFilter)
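
The bloom check needs no hashing, so it can be sketched with the standard library alone. The example assumes blooms are passed as hex strings of 256 bytes (2048 bits), which is how Ethereum JSON-RPC typically reports them; the function names are hypothetical.

```python
from functools import reduce
from typing import Iterable

BLOOM_BYTES = 256  # an Ethereum logs bloom is 2048 bits long


def _bloom_to_int(hex_bloom: str) -> int:
    """Decode a hex bloom (with or without a 0x prefix) into an integer."""
    digits = hex_bloom[2:] if hex_bloom.startswith(("0x", "0X")) else hex_bloom
    raw = bytes.fromhex(digits)
    if len(raw) != BLOOM_BYTES:
        raise ValueError(f"bloom must be {BLOOM_BYTES} bytes, got {len(raw)}")
    return int.from_bytes(raw, "big")


def block_bloom_matches(block_bloom: str, receipt_blooms: Iterable[str]) -> bool:
    """Check that the block bloom equals the bitwise OR of all receipt blooms."""
    combined = reduce(lambda acc, b: acc | _bloom_to_int(b), receipt_blooms, 0)
    return combined == _bloom_to_int(block_bloom)
```

A block with no receipts is expected to carry an all-zero bloom, which this sketch handles through the initial value of the reduction.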

During the first pass we should identify Primary data fields not covered by consistency checks and Derivative data fields not covered by correctness checks, in order to formulate tasks for the second pass. Some internal consistency checks may be added in response to customer requests, and a failure of such a check may trigger data rewrites.

Some validations are straightforward and fully described by the asserts above, but others are less obvious and are described here in detail.

How to prove State Transition validity and State Diffs validity

State transition correctness and state diff correctness go hand in hand, as we can prove both simultaneously: we just need to build the total state diff for a block (accounting for transaction order) and then request Merkle proofs for all "before" and "after" values from the data provider. From these proofs we can build two partial MPTs (before and after). The roots of these tries should be the state roots of the previous and current blocks respectively, and all differences between them should be explained by the total state diff. This check is very involved and takes a lot of time to write properly, but it relieves us from maintaining a full node and calculating the full state MPT for each block.
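
The first step, building the total state diff while accounting for transaction order, could look roughly like the sketch below. It assumes each transaction's diff is a mapping from an opaque key (for example an `(address, field)` pair) to a `(before, after)` tuple; this shape and the function name are hypothetical and chosen only for illustration. Proof retrieval and trie construction are out of scope here.

```python
from typing import Dict, Hashable, Iterable, Mapping, Tuple

Change = Tuple[object, object]  # (value before, value after)


def total_state_diff(tx_diffs: Iterable[Mapping[Hashable, Change]]) -> Dict[Hashable, Change]:
    """Fold per-transaction diffs, given in execution order, into one block-level diff.

    For each key the result keeps the value before the first touching
    transaction and the value after the last one. A transaction whose
    "before" does not match the running "after" signals inconsistent input.
    """
    total: Dict[Hashable, Change] = {}
    for index, diff in enumerate(tx_diffs):
        for key, (before, after) in diff.items():
            if key in total:
                first_before, running_after = total[key]
                if before != running_after:
                    raise ValueError(
                        f"transaction {index}: {key!r} starts at {before!r}, "
                        f"expected {running_after!r}"
                    )
                total[key] = (first_before, after)
            else:
                total[key] = (before, after)
    # Keys that end where they started cannot change the state root, so drop them.
    return {key: change for key, change in total.items() if change[0] != change[1]}
```

The continuity check is a cheap by-product: it catches reordered or missing per-transaction diffs before any Merkle proofs are requested.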

How to prove Traces validity

Traces are even more involved. They can be reproduced by running parts of geth (or any other EVM implementation), but with a twist: we need Merkle proofs not only for written data but for read data too. The validation steps are:

  • Parse traces to figure out what data was read
  • Get Merkle Proofs of read data from API
  • Merge Read Proofs with "before" subtrie from State Transitions/State Diffs validation
  • Run all transactions, using the partial trie as the initial state
  • Compare generated trace against API
  • (Future) To make a ZK proof from this validator, we would need to check the resulting state against the "after" trie from State Transitions/State Diffs validation
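
The comparison step in the list above can be sketched independently of the EVM replay. The example assumes both traces are flat, ordered lists of dict frames (fields such as `type`, `to` or `gasUsed` are placeholders, not a fixed schema) and reports the first point of divergence; the function name is hypothetical.

```python
from itertools import zip_longest
from typing import Any, Mapping, Optional, Sequence, Tuple


def first_trace_mismatch(
    generated: Sequence[Mapping[str, Any]],
    reported: Sequence[Mapping[str, Any]],
) -> Optional[Tuple[int, str]]:
    """Return (frame index, field name) of the first difference, or None if equal.

    The pseudo-field "length" is reported when one trace ends before the other.
    """
    for index, (ours, theirs) in enumerate(zip_longest(generated, reported)):
        if ours is None or theirs is None:
            return index, "length"
        # Sort field names so the reported mismatch is deterministic.
        for field in sorted(set(ours) | set(theirs)):
            if ours.get(field) != theirs.get(field):
                return index, field
    return None
```

Returning the first mismatch rather than a plain boolean makes a failed validation easier to investigate, since the frame index points at the call where the replay and the API disagree.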