Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update with latest plonky2 and rely on BlockchainTests instead #37

Merged
merged 31 commits into from
Nov 30, 2023

Conversation

Nashtare
Copy link
Contributor

@Nashtare Nashtare commented Sep 28, 2023

This PR updates the way tests are being deserialized to handle the latest version of plonky2.

Major points:

  • variants in BlockchainTests are on a per transaction basis, which required changing a bit the parsing logic
  • with latest plonky2, roots are being checked directly in the kernel, and result in a KernelPanic error if there is a mismatch, so there is no need for revm_variants as we would error before.
  • deserialization during test parsing is, I agree, much slower than previously. The main issue is that the amount of data being parsed is much larger, and it seems serde wants to access all values in a HashMap, which kinda bothers me (I originally wanted to read keys and only process values for the ones related to Shanghai).
  • Related to last point, a way that should be faster would be to read only the RLP of a block, and decode its header from it, though I've been having trouble making it work, so in the meantime I'm still opening the PR for review, as doing this should only impact the deserialize.rs file anyway.
  • Some tests are not being parsed at the moment, as they don't have a transactions field and need special handling (usually ones with exception cases). This can be done through txn RLP decoding + ecrecover-like process to access the sender address. I added it as a TODO for now, also as it should have limited impact (only the deserialize.rs file). On the other hand some tests that were not parsed previously are not correctly deserialized.

@BGluth there's no rush at all for reviewing this, as I can still use it locally to run some tests. There are probably refactoring that could be done, as I may have done this in a hurried way.

Following discussion in #35, for now we still process tests as before, i.e. alter manually the gaslimit if possible, run the test, and ignore it if it was altered and still resulted in a panic. I'll update this once we handle 2-limb gaslimit / gas_used on plonky2 side.

FYI @wborgeaud I've run the updated runner against a small sample. I got 140 ✅ tests, and 1 ❌ (creationTxInitCodeSizeLimit), but haven't dived into why yet. If it's useful to you, I've gathered all the serialized GenerationInputs from the parsed tests here: https://github.com/topos-protocol/evm-tests-suite-parsed.

closes #35

@Nashtare Nashtare requested a review from BGluth September 28, 2023 16:11
@Nashtare Nashtare self-assigned this Sep 28, 2023
README.md Show resolved Hide resolved
@Nashtare
Copy link
Contributor Author

Nashtare commented Sep 29, 2023

@wborgeaud Actually it seems there's an issue with the RLP encoding derived from the LegacyTransactionRlp struct on plonky2 (and probably type 1 / type 2 as well) coming from the data segment. For many tests, I can't retrieve the txn rlp computed inside the block rlp. This may be the source of the discrepancy for the mentioned failing test, as in this particular case it fails with a transaction_trie_root mismatch. Still investigating.

@Nashtare
Copy link
Contributor Author

Nashtare commented Oct 1, 2023

I've addressed the TODOs mentioned in the two last points, but it's pegged to the branch in 0xPolygonZero/plonky2#1263 for now as I needed to derive Clone + fix encoding of AccessLists, I'll push the changes once that PR is merged.

It doesn't solve the issue mentioned above, arising for instance in creationTxInitCodeSizeLimit -> the error seems to be coming from the rlp library instead of plonky2.

@Nashtare
Copy link
Contributor Author

Nashtare commented Oct 4, 2023

Note that once 0xPolygonZero/plonky2#1276 is merged, we should be good here. I'll update the serialized json files once again after the final bump.

@Nashtare
Copy link
Contributor Author

Nashtare commented Oct 4, 2023

I've updated the serialized inputs at https://github.com/topos-protocol/evm-tests-suite-parsed/tree/main.

@wborgeaud Just FYI, so far I have 2310 ✅ / 13 ❌ .
Out of the failing ones:

  • 9 MeasureGas for which we know why they are failing (I think Jacqueline had written a write-up on how she was envisioning handling the gas, as those tests are erroring at sys_stop)
  • creationTxInitCodeSizeLimit_0 as mentioned earlier. I was hoping the to field fix would solve it but it seems it has another issue somewhere.
  • expPower256Of256
  • exp_4
  • twoOps

We'll start diving into them to see why they are failing. I'll keep you updated on discord if other tests appear to fail!

@wborgeaud
Copy link
Contributor

wborgeaud commented Oct 4, 2023

Thanks @Nashtare !
It looks like exp_4 is passing on my machine. What kind of error are you getting?

@Nashtare
Copy link
Contributor Author

Nashtare commented Oct 4, 2023

Have you tried running the others? I noticed that they were getting out of order when being processed, so that may be the reason the indices aren't right?
I haven't run the runner with logs, so I couldn't say what was the error, I can check when I'm back.

On an unrelated note, most of the stEip1559/intrinsic tests are failing, but if I understand them it is expected, as plonky2 is expecting only well formed transactions? As those aren't and we are still passing them to the prover, we get a txn root / receipt root mismatch (it wasn't caught in the past because we didn't support those tries yet).

@Nashtare
Copy link
Contributor Author

Nashtare commented Oct 4, 2023

I could also update the logic to keep their original name like foo_d1g0v0_Shanghai, shouldn't be too hard to do

@wborgeaud
Copy link
Contributor

Have you tried running the others? I noticed that they were getting out of order when being processed, so that may be the reason the indices aren't right? I haven't run the runner with logs, so I couldn't say what was the error, I can check when I'm back.

Ah yes that's it. For me it's exp_0 that's failing with the verifier error "Mismatch between evaluation and opening of quotient polynomial".

On an unrelated note, most of the stEip1559/intrinsic tests are failing, but if I understand them it is expected, as plonky2 is expecting only well formed transactions? As those aren't and we are still passing them to the prover, we get a txn root / receipt root mismatch (it wasn't caught in the past because we didn't support those tries yet).

Yes that sounds right.

I could also update the logic to keep their original name like foo_d1g0v0_Shanghai, shouldn't be too hard to do

As you prefer. I think it could be easier that way to pinpoint exactly which tests are failing.

@Nashtare
Copy link
Contributor Author

Nashtare commented Oct 4, 2023

For me it's exp_0 that's failing with the verifier error "Mismatch between evaluation and opening of quotient polynomial".

Yes indeed I just reproduced locally (for me it's still exp_4), I'll add a log::warn! message if verification fails for clarity.

On an unrelated note, most of the stEip1559/intrinsic tests are failing, but if I understand them it is expected, as plonky2 is expecting only well formed transactions? As those aren't and we are still passing them to the prover, we get a txn root / receipt root mismatch (it wasn't caught in the past because we didn't support those tries yet).

Yes that sounds right.

I'll update the parsing logic, those malformed txns usually have a "valid": false field, should be able to prune them during parsing.

I could also update the logic to keep their original name like foo_d1g0v0_Shanghai, shouldn't be too hard to do

As you prefer. I think it could be easier that way to pinpoint exactly which tests are failing.

I'll do that!

@Nashtare
Copy link
Contributor Author

Nashtare commented Oct 6, 2023

@wborgeaud FYI We can now run it with the actual variant test names. I've updated https://github.com/topos-protocol/evm-tests-suite-parsed accordingly.

@Nashtare
Copy link
Contributor Author

Nashtare commented Oct 6, 2023

Also there appear to still be issues with the RLP encoding. I think I spotted them, I'll probably refactor things this afternoon, assuming we're always having a single txn per block in our tests (which is the case). This should simplify the case disjunctions with all those weird RLP encoding choices.

@Nashtare
Copy link
Contributor Author

Nashtare commented Nov 7, 2023

Latest commits adds withdrawals from plonky2 GenerationInputs (although not a single test for Shanghai in BlockchainTests/GeneralStateTests has a non-empty withdrawals vector). Also updated the serialized tests accordingly.

Copy link
Collaborator

@BGluth BGluth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Remove intermediary block blooms
@Nashtare Nashtare merged commit c87a49f into 0xPolygonZero:main Nov 30, 2023
2 checks passed
@Nashtare Nashtare deleted the update_latest branch November 30, 2023 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use tests in BlockchainTests
3 participants