Update with latest plonky2 and rely on BlockchainTests instead #37

Nashtare · 2023-09-28T16:11:12Z

This PR updates the way tests are being deserialized to handle the latest version of plonky2.

Major points:

variants in BlockchainTests are on a per transaction basis, which required changing a bit the parsing logic
with latest plonky2, roots are being checked directly in the kernel, and result in a KernelPanic error if there is a mismatch, so there is no need for revm_variants as we would error before.
deserialization during test parsing is, I agree, much slower than previously. The main issue is that the amount of data being parsed is much larger, and it seems serde wants to access all values in a HashMap, which kinda bothers me (I originally wanted to read keys and only process values for the ones related to Shanghai).
Related to last point, a way that should be faster would be to read only the RLP of a block, and decode its header from it, though I've been having trouble making it work, so in the meantime I'm still opening the PR for review, as doing this should only impact the deserialize.rs file anyway.
Some tests are not being parsed at the moment, as they don't have a transactions field and need special handling (usually ones with exception cases). This can be done through txn RLP decoding + ecrecover-like process to access the sender address. I added it as a TODO for now, also as it should have limited impact (only the deserialize.rs file). On the other hand some tests that were not parsed previously are not correctly deserialized.

@BGluth there's no rush at all for reviewing this, as I can still use it locally to run some tests. There are probably refactoring that could be done, as I may have done this in a hurried way.

Following discussion in #35, for now we still process tests as before, i.e. alter manually the gaslimit if possible, run the test, and ignore it if it was altered and still resulted in a panic. I'll update this once we handle 2-limb gaslimit / gas_used on plonky2 side.

FYI @wborgeaud I've run the updated runner against a small sample. I got 140 ✅ tests, and 1 ❌ (creationTxInitCodeSizeLimit), but haven't dived into why yet. If it's useful to you, I've gathered all the serialized GenerationInputs from the parsed tests here: https://github.com/topos-protocol/evm-tests-suite-parsed.

closes #35

Not needed anymore

README.md

Nashtare · 2023-09-29T18:26:09Z

@wborgeaud Actually it seems there's an issue with the RLP encoding derived from the LegacyTransactionRlp struct on plonky2 (and probably type 1 / type 2 as well) coming from the data segment. For many tests, I can't retrieve the txn rlp computed inside the block rlp. This may be the source of the discrepancy for the mentioned failing test, as in this particular case it fails with a transaction_trie_root mismatch. Still investigating.

Nashtare · 2023-10-01T00:47:13Z

I've addressed the TODOs mentioned in the two last points, ~~but it's pegged to the branch in 0xPolygonZero/plonky2#1263 for now as I needed to derive Clone + fix encoding of AccessLists, I'll push the changes once that PR is merged~~.

It doesn't solve the issue mentioned above, arising for instance in creationTxInitCodeSizeLimit -> the error seems to be coming from the rlp library instead of plonky2.

Nashtare · 2023-10-04T01:26:23Z

Note that once 0xPolygonZero/plonky2#1276 is merged, we should be good here. I'll update the serialized json files once again after the final bump.

Nashtare · 2023-10-04T12:02:44Z

I've updated the serialized inputs at https://github.com/topos-protocol/evm-tests-suite-parsed/tree/main.

@wborgeaud Just FYI, so far I have 2310 ✅ / 13 ❌ .
Out of the failing ones:

9 MeasureGas for which we know why they are failing (I think Jacqueline had written a write-up on how she was envisioning handling the gas, as those tests are erroring at sys_stop)
creationTxInitCodeSizeLimit_0 as mentioned earlier. I was hoping the to field fix would solve it but it seems it has another issue somewhere.
expPower256Of256
exp_4
twoOps

We'll start diving into them to see why they are failing. I'll keep you updated on discord if other tests appear to fail!

wborgeaud · 2023-10-04T17:01:56Z

Thanks @Nashtare !
It looks like exp_4 is passing on my machine. What kind of error are you getting?

Nashtare · 2023-10-04T17:36:27Z

Have you tried running the others? I noticed that they were getting out of order when being processed, so that may be the reason the indices aren't right?
I haven't run the runner with logs, so I couldn't say what was the error, I can check when I'm back.

On an unrelated note, most of the stEip1559/intrinsic tests are failing, but if I understand them it is expected, as plonky2 is expecting only well formed transactions? As those aren't and we are still passing them to the prover, we get a txn root / receipt root mismatch (it wasn't caught in the past because we didn't support those tries yet).

Nashtare · 2023-10-04T17:49:39Z

I could also update the logic to keep their original name like foo_d1g0v0_Shanghai, shouldn't be too hard to do

wborgeaud · 2023-10-04T19:07:42Z

Have you tried running the others? I noticed that they were getting out of order when being processed, so that may be the reason the indices aren't right? I haven't run the runner with logs, so I couldn't say what was the error, I can check when I'm back.

Ah yes that's it. For me it's exp_0 that's failing with the verifier error "Mismatch between evaluation and opening of quotient polynomial".

On an unrelated note, most of the stEip1559/intrinsic tests are failing, but if I understand them it is expected, as plonky2 is expecting only well formed transactions? As those aren't and we are still passing them to the prover, we get a txn root / receipt root mismatch (it wasn't caught in the past because we didn't support those tries yet).

Yes that sounds right.

I could also update the logic to keep their original name like foo_d1g0v0_Shanghai, shouldn't be too hard to do

As you prefer. I think it could be easier that way to pinpoint exactly which tests are failing.

Nashtare · 2023-10-04T19:29:04Z

For me it's exp_0 that's failing with the verifier error "Mismatch between evaluation and opening of quotient polynomial".

Yes indeed I just reproduced locally (for me it's still exp_4), I'll add a log::warn! message if verification fails for clarity.

On an unrelated note, most of the stEip1559/intrinsic tests are failing, but if I understand them it is expected, as plonky2 is expecting only well formed transactions? As those aren't and we are still passing them to the prover, we get a txn root / receipt root mismatch (it wasn't caught in the past because we didn't support those tries yet).

Yes that sounds right.

I'll update the parsing logic, those malformed txns usually have a "valid": false field, should be able to prune them during parsing.

I could also update the logic to keep their original name like foo_d1g0v0_Shanghai, shouldn't be too hard to do

As you prefer. I think it could be easier that way to pinpoint exactly which tests are failing.

I'll do that!

Nashtare · 2023-10-06T16:08:46Z

@wborgeaud FYI We can now run it with the actual variant test names. I've updated https://github.com/topos-protocol/evm-tests-suite-parsed accordingly.

Nashtare · 2023-10-06T16:54:32Z

Also there appear to still be issues with the RLP encoding. I think I spotted them, I'll probably refactor things this afternoon, assuming we're always having a single txn per block in our tests (which is the case). This should simplify the case disjunctions with all those weird RLP encoding choices.

Nashtare · 2023-11-07T21:26:34Z

Latest commits adds withdrawals from plonky2 GenerationInputs (although not a single test for Shanghai in BlockchainTests/GeneralStateTests has a non-empty withdrawals vector). Also updated the serialized tests accordingly.

BGluth

LGTM!

Remove intermediary block blooms

Nashtare added 2 commits September 28, 2023 11:38

Deserialize tests from BlockchainTests folder instead

9bb9281

Remove revm variants

7887a60

Not needed anymore

Nashtare requested a review from BGluth September 28, 2023 16:11

Nashtare self-assigned this Sep 28, 2023

Nashtare commented Sep 28, 2023

View reviewed changes

README.md Show resolved Hide resolved

Nashtare force-pushed the update_latest branch from cf795ec to 7887a60 Compare October 1, 2023 00:43

Nashtare added 5 commits October 1, 2023 08:52

Deserialize from RLPs directly and handle all test cases

7472749

Bump deps and process all tests

db8bf43

Update serialization and cleanup

448575f

Clippy

e897b99

Move encoding to deserialize file

19098ee

Nashtare force-pushed the update_latest branch from f9d72f9 to 19098ee Compare October 3, 2023 17:27

Nashtare added 3 commits October 3, 2023 15:25

Fix storage keys that are wrongly encoded

46e350c

Clippy

60e2710

Bump proc-macro2 and update rust-toolchain file

5415f2d

Bump plonky2

b37420f

Add logging for failed verification

31a4940

Nashtare added 3 commits October 4, 2023 16:47

Ignore malformed txns

f9efb50

Keep original name for test variants

95bf77b

Bump plonky2 rev

a1c5f36

Nashtare added 5 commits October 6, 2023 15:42

Ignore tests with non-canonical encodings

e3c745c

Bump plonky2 version

72c5331

Bump plonky2

5a70480

Bump with latest plonky2 fix

a196a1c

Bump plonky2

8ae06dc

Nashtare mentioned this pull request Oct 23, 2023

Update EVM Tests to Support Cancun 0xPolygonZero/zk_evm#46

Closed

Nashtare added 3 commits October 25, 2023 11:54

Bump plonky2

c564482

Bump plonky2

4763b85

Add withdrawals to generation inputs

4cb91e4

Nashtare added 6 commits November 9, 2023 16:58

Bump plonky2 and update signed_txn type

b2a1768

Bump plonky2

ace6e02

Bump

3126e31

Ignore malformed tests with gas > 2^32-1

4abd402

Remove comment

75f1448

Clippy

385dcdb

Nashtare force-pushed the update_latest branch from dcd3c71 to 385dcdb Compare November 17, 2023 21:31

Bump plonky2

f95700d

BGluth approved these changes Nov 30, 2023

View reviewed changes

Final plonky2 bump

b8b17fd

Remove intermediary block blooms

Nashtare merged commit c87a49f into 0xPolygonZero:main Nov 30, 2023
2 checks passed

Nashtare deleted the update_latest branch November 30, 2023 18:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update with latest plonky2 and rely on BlockchainTests instead #37

Update with latest plonky2 and rely on BlockchainTests instead #37

Nashtare commented Sep 28, 2023 •

edited

Loading

Nashtare commented Sep 29, 2023 •

edited

Loading

Nashtare commented Oct 1, 2023 •

edited

Loading

Nashtare commented Oct 4, 2023

Nashtare commented Oct 4, 2023

wborgeaud commented Oct 4, 2023 •

edited

Loading

Nashtare commented Oct 4, 2023

Nashtare commented Oct 4, 2023

wborgeaud commented Oct 4, 2023

Nashtare commented Oct 4, 2023

Nashtare commented Oct 6, 2023

Nashtare commented Oct 6, 2023

Nashtare commented Nov 7, 2023

BGluth left a comment

Update with latest plonky2 and rely on BlockchainTests instead #37

Update with latest plonky2 and rely on BlockchainTests instead #37

Conversation

Nashtare commented Sep 28, 2023 • edited Loading

Nashtare commented Sep 29, 2023 • edited Loading

Nashtare commented Oct 1, 2023 • edited Loading

Nashtare commented Oct 4, 2023

Nashtare commented Oct 4, 2023

wborgeaud commented Oct 4, 2023 • edited Loading

Nashtare commented Oct 4, 2023

Nashtare commented Oct 4, 2023

wborgeaud commented Oct 4, 2023

Nashtare commented Oct 4, 2023

Nashtare commented Oct 6, 2023

Nashtare commented Oct 6, 2023

Nashtare commented Nov 7, 2023

BGluth left a comment

Choose a reason for hiding this comment

Nashtare commented Sep 28, 2023 •

edited

Loading

Nashtare commented Sep 29, 2023 •

edited

Loading

Nashtare commented Oct 1, 2023 •

edited

Loading

wborgeaud commented Oct 4, 2023 •

edited

Loading