Skip to content
This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Introduce native rpc support for witness generation #51

Closed
wants to merge 14 commits into from

Conversation

frisitano
Copy link
Contributor

Work in progress

@daniel-bcw
Copy link

Hey @frisitano this is such a useful change, i guess zero-bin would consume any evm based rpc request after ? (geth/erigon)
Is there any part that is blocking the PR, happy to try to contribute

@Nashtare
Copy link
Collaborator

@daniel-bcw That's correct, although initially the code @frisitano migrated isn't fully complete so may fail at generating some blocks, unlike the zero-bin payloads processed from Jerigon.

@frisitano Given that the eth-tx-proof codebase was in a fairly WIP state, I wouldn't expect you to fully clean it up in a single PR, so feel free to drop TODOs wherever you noticed refactor needs, and we can address them as follow up work.

@frisitano
Copy link
Contributor Author

Hey @frisitano this is such a useful change, i guess zero-bin would consume any evm based rpc request after ? (geth/erigon) Is there any part that is blocking the PR, happy to try to contribute

Hey @daniel-bcw, yes that is correct. Following this change zero-bin can target any client that implements the standard RPC API. Thank you for your offer of support, it's appreciated but no support required at this point in time. There are no blockers, I am just waiting on direction from repository maintainers on questions around versioning - you can follow the discussion here #33.

As an aside I've just stumbled upon the work your team have done producing helm templates for zero-bin - https://github.com/BCWResearch/zero-bin-bcw/tree/develop/deploy/helm/zero-bin. Would love to talk more about this and explore ways to collaborate. Whats the best way to get in touch?

@frisitano
Copy link
Contributor Author

@daniel-bcw That's correct, although initially the code @frisitano migrated isn't fully complete so may fail at generating some blocks, unlike the zero-bin payloads processed from Jerigon.

@frisitano Given that the eth-tx-proof codebase was in a fairly WIP state, I wouldn't expect you to fully clean it up in a single PR, so feel free to drop TODOs wherever you noticed refactor needs, and we can address them as follow up work.

I used eth-tx-proof as more of a guide rather than migrating actual code. The code in this PR is "new" based on my understanding of the trace protocol by inspecting the zero tracer in jerigon and the zk_evm repo. Do you have any specific insights of what I should double check / test?

@Nashtare
Copy link
Collaborator

Nashtare commented Apr 22, 2024

@frisitano Probably a good call to move away from it and implement a clean solution directly, thank you for this!
With the test-only feature, we can simulate proofs fairly quickly, so one could easily target a few hundreds of Shanghai blocks with the new native mode you introduced and make sure everything runs smoothly

@frisitano frisitano changed the base branch from develop to testing April 22, 2024 14:56
@daniel-bcw
Copy link

Hey @frisitano this is such a useful change, i guess zero-bin would consume any evm based rpc request after ? (geth/erigon) Is there any part that is blocking the PR, happy to try to contribute

Hey @daniel-bcw, yes that is correct. Following this change zero-bin can target any client that implements the standard RPC API. Thank you for your offer of support, it's appreciated but no support required at this point in time. There are no blockers, I am just waiting on direction from repository maintainers on questions around versioning - you can follow the discussion here #33.

As an aside I've just stumbled upon the work your team have done producing helm templates for zero-bin - https://github.com/BCWResearch/zero-bin-bcw/tree/develop/deploy/helm/zero-bin. Would love to talk more about this and explore ways to collaborate. Whats the best way to get in touch?

hey @frisitano , sure we can definitely get in touch and share some of the work done on that end or other collaborations. You can reach out to me at [email protected] and we can hook up a meeting later down the week/next week.
In the meantime great stuff excited to see this rolling out !

@Nashtare
Copy link
Collaborator

Nashtare commented Apr 23, 2024

@frisitano I ran your branch on a 100 random blocks on Ethereum mainnet, between blocks 19240650 and 19240750

I got 34/100 ❌, 66/100 ✅

  • Missing key xxx when creating sub-partial tries (always storage trie): 4
  • Trie operation error: Found a Hash node during an insert in a PartialTrie! These should not be able to be traversed during an insert!: 15
  • Failed to get proof for account (missing trie node): 4
  • Kernel Panic at mpt_read_hash_node: 9
  • trusted rlp should be valid: RlpExpectedToBeData: 1

I've attached my logs if this helps (though info is minimal as I didn't want them to take hours or clog my disk space).

block_logs.zip

@frisitano
Copy link
Contributor Author

@frisitano I ran your branch on a 100 random blocks on Ethereum mainnet, between blocks 19240650 and 19240750

I got 34/100 ❌, 66/100 ✅

  • Missing key xxx when creating sub-partial tries (always storage trie): 4
  • Trie operation error: Found a Hash node during an insert in a PartialTrie! These should not be able to be traversed during an insert!: 15
  • Failed to get proof for account (missing trie node): 4
  • Kernel Panic at mpt_read_hash_node: 9
  • trusted rlp should be valid: RlpExpectedToBeData: 1

I've attached my logs if this helps (though info is minimal as I didn't want them to take hours or clog my disk space).

block_logs.zip

Thank you! I'm working through this now. Out of interest, is there a public RPC endpoint that supports the zeroTracer on ethereum mainnet that I can use for comparison / debugging?

@Nashtare
Copy link
Collaborator

Hmm I don't think there's any public RPC endpoint, as you need to have access to debug_traceTransaction

@Nashtare Nashtare linked an issue Apr 24, 2024 that may be closed by this pull request
* feat: provide IR for debugging upon failure

* fix: fix clippy issues

* fix: fix pr comments

* fix: make evm_arithmetization non optional for ops

* fix: fix pr comments

* fix: fix clippy issues

* fix: fix clippy issue

* fix: fix pr comment

* fix: fix clippy issue

* fix: fix cargo lock

* fix: fix pr comments

* fix: fix format issues

* fix: fix save input on error for test-only

* fix: fix pr comments

* fix: fix

* fix: fix clippy issue

* fix: fmt fix

---------

Co-authored-by: Vladimir Trifonov <[email protected]>
@frisitano
Copy link
Contributor Author

frisitano commented Apr 24, 2024

Hmm I don't think there's any public RPC endpoint, as you need to have access to debug_traceTransaction

I was hoping that someone (potentially within the team) may have synced the jerigon node with Ethereum mainnet such that we could use the zero tracer to produce traces and prove blocks which are failing using the native tracer. This would give us confidence that it is in fact the trace generation and not some issue within the zk_evm. We could also use it to compare block traces produced using zero tracer and native.

I've fixed one of the issues which was related to empty storage tries, however I'm still experiencing more issues.

Something I'm a little confused about when inspecting logs for the following transaction.

In the logs we can see the following:

2024-04-24T16:20:48.247562Z DEBUG evm_arithmetization::witness::transition: Cycle 348312, ctx=1, pc=mpt_read_state_trie, instruction=Push(1), stack=[844935717204094928098547272543109538884590266852, 4691, 844935717204094928098547272543109538884590266852, 9100, 1, 1, 844935717204094928098547272543109538884590266852, 4323, 1156976560561266838347921199117367450816737848457277449313465846, 240000000000000]    

which suggests (from my understanding of the kernel) we are trying to read the account data for account 844935717204094928098547272543109538884590266852 (0x94003a59f4172db0079f0f0ca50d69796b3469e4). When I look at the transaction traces (both prestateTracer and callTracer) there are no references of this account except for the address being a storage value, which is read, associated with contract 0x2ec705d306b51e486b1bc0d6ebee708e0661add1.

            "storage": {
                "0x0000000000000000000000000000000000000000000000000000000000000000": "0x00000000000000000000000094003a59f4172db0079f0f0ca50d69796b3469e4"
            }

I believe there are two possible explanations for this:

  • the prestateTracer is not correctly reporting on access to this account.
  • the kernel is incorrectly accessing this account when it should'nt.

transaction logs:
transaction.log.zip

vladimir-trifonov and others added 2 commits April 25, 2024 02:41
* fix: fix circuit version consistency check

* fix: fix clippy issues

* fix: update deps

---------

Co-authored-by: Vladimir Trifonov <[email protected]>
Co-authored-by: Robin Salen <[email protected]>
@frisitano
Copy link
Contributor Author

frisitano commented Apr 24, 2024

So I've been investigating this issue and I suspect the following is happening. When the call to the 0x94003a59f4172db0079f0f0ca50d69796b3469e4 address is made gas is exhausted and the transaction should terminate but the kernel still tried to access the account.

If we look at the sys_call kernel method seen below we can see that the invocation of the %insert_accessed_addresses method takes place before the %call_charge_gas method. Therefore the kernel believes that we have accessed this address when in fact we have not because the transaction failed due to exhausting gas.

Relevant logs:

2024-04-24T16:20:48.245978Z TRACE evm_arithmetization::witness::transition: Cycle 348113, ctx=1, pc=8876, below sys_gas, instruction=Swap(0), stack=[4323, 1141283806222800136438331725559348284776481959846117363031225845, 844935717204094928098547272543109538884590266852, 240000000000000, 1175, 0, 1175, 0, 1175, 240000000000000]    
2024-04-24T16:20:48.245986Z TRACE evm_arithmetization::witness::transition: Cycle 348114, ctx=1, pc=8877, below sys_gas, instruction=ExitKernel, stack=[1141283806222800136438331725559348284776481959846117363031225845, 4323, 844935717204094928098547272543109538884590266852, 240000000000000, 1175, 0, 1175, 0, 1175, 240000000000000]    
2024-04-24T16:20:48.245992Z DEBUG evm_arithmetization::witness::operation: Exiting to 13813, is_kernel=false    
2024-04-24T16:20:48.245995Z DEBUG evm_arithmetization::cpu::kernel::interpreter: User instruction: Syscall(241, 7, false)    
2024-04-24T16:20:48.245998Z DEBUG evm_arithmetization::witness::operation: Syscall to sys_call    
2024-04-24T16:20:48.246002Z DEBUG evm_arithmetization::witness::transition: Cycle 348116, ctx=1, pc=sys_call, instruction=Dup(3), stack=[1141283806222800136438331725559348284776481959846117363031225846, 4323, 844935717204094928098547272543109538884590266852, 240000000000000, 1175, 0, 1175, 0, 1175, 240000000000000]    
2024-04-24T16:20:48.246009Z TRACE evm_arithmetization::witness::transition: Cycle 348117, ctx=1, pc=1107, below sys_call, instruction=Iszero, stack=[240000000000000, 1141283806222800136438331725559348284776481959846117363031225846, 4323, 844935717204094928098547272543109538884590266852, 240000000000000, 1175, 0, 1175, 0, 1175]    
2024-04-24T16:20:48.246017Z TRACE evm_arithmetization::witness::transition: Cycle 348118, ctx=1, pc=1108, below sys_call, instruction=Iszero, stack=[0, 1141283806222800136438331725559348284776481959846117363031225846, 4323, 844935717204094928098547272543109538884590266852, 240000000000000, 1175, 0, 1175, 0, 1175]    
2024-04-24T16:20:48.246024Z TRACE evm_arithmetization::witness::transition: Cycle 348119, ctx=1, pc=1109, below sys_call, instruction=Push(5), stack=[1, 1141283806222800136438331725559348284776481959846117363031225846, 4323, 844935717204094928098547272543109538884590266852, 240000000000000, 1175, 0, 1175, 0, 1175]    
2024-04-24T16:20:48.246031Z TRACE evm_arithmetization::witness::transition: Cycle 348120, ctx=1, pc=1115, below sys_call, instruction=GetContext, stack=[25769803784, 1, 1141283806222800136438331725559348284776481959846117363031225846, 4323, 844935717204094928098547272543109538884590266852, 240000000000000, 1175, 0, 1175, 0]    

I'm not sure on the specifics of why this is tripping up in the kernel but I suspect it may be something along these lines.

@frisitano
Copy link
Contributor Author

Similar situation with the following transaction - relating to a sys_call invocation which fails:

@@ -13,6 +13,9 @@ pub(crate) enum Commands {
/// The RPC URL
#[arg(short = 'u', long, value_hint = ValueHint::Url)]
rpc_url: String,
/// The RPC Tracer Type
#[arg(short = 't', long, default_value = "jerigon")]
Copy link

@fizikarubi fizikarubi May 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you could define an enum to avoid panicing on unexpected string:

#[derive(ValueEnum)]
pub enum RpcType {
    Jerigon, 
    Native
}

...

#[arg(short = 't', long, default_value_t = RpcType::Jerigon)]
rpc_type: RpcType,

Comment on lines +40 to +47
.then(|tx_hash| {
let accounts_state = accounts_state.clone();
let provider = Arc::clone(&provider);
let code_db = Arc::clone(&code_db);
async move {
super::txn::process_transaction(provider, tx_hash, accounts_state, code_db).await
}
})
Copy link

@fizikarubi fizikarubi May 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

q: does the order in which we update the accounts state matter? if not we could use FuturesOrdered instead to execute the processing of transaction concurrenltly for a better performance.

if the order matters, I'm curious why can't we fetch_tx_data here instead and update the accounts_state synchronously to avoid using arc/lock as we have to process the txs sequentially anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a change locally which changes this logic. I'll push them shortly.

Comment on lines +297 to 302
pub(crate) struct RpcBlockMetadata {
pub(crate) block_by_number: EthGetBlockByNumberResponse,
pub(crate) chain_id: EthChainIdResponse,
prev_hashes: Vec<H256>,
checkpoint_state_trie_root: H256,
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fairly minor, but it may be cleaner to have this struct (and EthGetBlockByNumberResponse / EthChainIdResponse along with it) be defined in a common module, as it is being used both by jerigon and native modules.

@frisitano
Copy link
Contributor Author

Closing as replaced by #81.

@frisitano frisitano closed this May 24, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Support fetching witnesses via eth-tx-proof
6 participants