Missing creator transaction hashes after migration #1573
If we go with the second option, we are going to have a dirty DB; that is, contracts will have a creatorTxHash but no creation bytecode. I think we should re-verify them. We'll need a script that connects to both filesystems: the current one on GCP and the legacy one running on Argo. This script needs two lists:
Then it should take the intersection of these two lists (address+chainId as identifiers). This would give us the full list that needs modification and also the number of contracts affected. We can write this list of contracts to a new database instance (a local one, etc.) to mark off the ones that are verified. Finally, we send these contracts for re-verification on the migration instance (the instance's container image needs to be updated). Bear in mind we need a small change in the code to trigger updating the row of that contract if it goes from a ... Also, before all of this, review #1572 |
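The intersection step could be sketched roughly as below. This is only an illustration: the function and variable names are hypothetical, the contents of the two lists are left open, and lowercasing addresses before comparing is an assumption (so that checksummed and non-checksummed forms of the same address match).

```python
from typing import Iterable, Set, Tuple

# (chain_id, lowercased address) identifies a contract, as suggested above.
ContractKey = Tuple[int, str]


def to_key(chain_id: int, address: str) -> ContractKey:
    """Normalize an identifier so checksummed and lowercase addresses compare equal."""
    return (int(chain_id), address.lower())


def intersect_contracts(
    first_list: Iterable[Tuple[int, str]],
    second_list: Iterable[Tuple[int, str]],
) -> Set[ContractKey]:
    """Return the contracts present in both lists (hypothetical helper)."""
    first = {to_key(chain_id, address) for chain_id, address in first_list}
    second = {to_key(chain_id, address) for chain_id, address in second_list}
    return first & second


affected = intersect_contracts(
    [(1, "0x00000000219ab540356cBB839Cbe05303d7705Fa")],
    [(1, "0x00000000219ab540356cbb839cbe05303d7705fa")],
)
print(len(affected))  # number of contracts affected
```

The size of the resulting set is the number of affected contracts, and its elements are what would be written to the tracking table.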
I think we can optimize this process by:
|
Yes, we have to do 1. and 2. regardless. Let's do that. I'm just not sure if using the prod database is a good idea. Maybe create a user that has read-only rights to the main tables but can write to the new table.

Regarding the rest, I'm not sure if it's a good idea to accept the creatorTxHash even if we can't fetch the Transaction from this hash. That's basically what this PR is doing.

First, we are not checking whether this txHash is indeed the hash of the transaction that created this contract. Yes, we get it from a somewhat reliable source (Etherscan, Blockscout, etc.), but I think we should (not must) check this. How we do this now is by fetching the Transaction object and checking whether the 'new contract address algorithm', that is, the deployer address + nonce, yields the address of the contract being verified.

Second, even if we decide it's OK to write the creatorTxHash without double checking, we still need the values deployer, block_number, and tx_index of the contract. These are available only if we fetch the Transaction via this hash. I don't think it would make sense to write only the creatorTxHash but not these values.

What we can maybe do is fail the whole verification if we can't fetch the Transaction from the expected API like Etherscan, instead of saving this contract as just a "runtime match". The verification will fail and needs to be submitted again. In that case (or actually regardless), we should monitor the errors in fetching Transactions and get notified if, e.g., we are hitting Etherscan API limits. |
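To make the "fail instead of storing a runtime-only match" idea concrete, here is a minimal sketch. Everything in it is hypothetical and not the actual Sourcify code: `require_creation_tx`, `CreationTxUnavailable`, the injected `fetch_transaction` callable, and the assumed shape of the returned transaction (`from`, `blockNumber`, `transactionIndex`). It deliberately leaves out the deployer + nonce address check.

```python
from collections import Counter
from typing import Callable, Optional


class CreationTxUnavailable(Exception):
    """Raised when the creation Transaction cannot be fetched (hypothetical)."""


def require_creation_tx(
    chain_id: int,
    creator_tx_hash: str,
    fetch_transaction: Callable[[int, str], Optional[dict]],
    failures: Counter,
) -> dict:
    """Fetch the creation Transaction or fail the verification.

    `failures` counts fetch errors per chain so they can be monitored,
    e.g. to notice when an API rate limit is being hit.
    """
    try:
        tx = fetch_transaction(chain_id, creator_tx_hash)
    except Exception as err:
        failures[chain_id] += 1
        raise CreationTxUnavailable(
            f"chain {chain_id}: fetching {creator_tx_hash} failed"
        ) from err
    if tx is None:
        failures[chain_id] += 1
        raise CreationTxUnavailable(
            f"chain {chain_id}: {creator_tx_hash} not found"
        )
    # Only with the Transaction in hand do we have all the values to store.
    return {
        "transaction_hash": creator_tx_hash,
        "deployer": tx["from"],
        "block_number": tx["blockNumber"],
        "tx_index": tx["transactionIndex"],
    }
```

The caller would let `CreationTxUnavailable` abort the verification so the contract has to be submitted again, and would export the `failures` counter to whatever alerting is in place.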
I agree we should not accept creatorTxHash regardless. From my POV, we should implement a new service that goes through the database and tries to re-verify all the contracts that don't have creation information. But I'm wondering if it makes sense to first implement a more reliable way to fetch the creation bytecode, for example, by starting to index all creation bytecode. |
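A service like that would need to walk the deployments without creation information in batches. The sketch below shows one way to do that with keyset pagination. It is an assumption-laden illustration: it uses an in-memory SQLite table with a simplified, hypothetical `contract_deployments` schema and an integer `id`, not the production PostgreSQL schema, and `iter_missing_creation_info` is a made-up name.

```python
import sqlite3
from typing import Iterator, List, Tuple


def iter_missing_creation_info(
    conn: sqlite3.Connection, batch_size: int = 2
) -> Iterator[List[Tuple[int, int, str]]]:
    """Yield batches of (id, chain_id, address) whose transaction_hash is NULL or empty."""
    last_id = 0
    while True:
        rows = conn.execute(
            """
            SELECT id, chain_id, address
            FROM contract_deployments
            WHERE id > ?
              AND (transaction_hash IS NULL OR length(transaction_hash) = 0)
            ORDER BY id
            LIMIT ?
            """,
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Continue after the last id seen instead of using OFFSET.
        last_id = rows[-1][0]


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory table with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contract_deployments ("
        "id INTEGER PRIMARY KEY, chain_id INTEGER, address TEXT, transaction_hash BLOB)"
    )
    conn.executemany(
        "INSERT INTO contract_deployments VALUES (?, ?, ?, ?)",
        [
            (1, 1, "0xaa", None),
            (2, 1, "0xbb", b"\x12"),
            (3, 11155111, "0xcc", b""),
            (4, 1, "0xdd", b"\x34"),
            (5, 10, "0xee", None),
        ],
    )
    return conn


for batch in iter_missing_creation_info(make_demo_db()):
    print(batch)  # each batch would be handed to re-verification
```

Paging by `id > last_id` keeps each query cheap even on millions of rows, and rows that get fixed between batches simply drop out of later results.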
I'm looking to backfill this data by re-verifying those contracts with missing tx hashes, and consequently without ... I'll report progress here and leave some queries for future reference:

Query to select contracts without a `transaction_hash`

SELECT
vc.id as verification_id,
vc.created_at,
vc.created_by,
cd.chain_id,
cd.address,
cd.transaction_hash,
cd.block_number,
cd.transaction_index,
vc.creation_match,
vc.runtime_match
FROM verified_contracts vc
JOIN contract_deployments cd ON vc.deployment_id = cd.id
WHERE cd.transaction_hash = '\x'::bytea
OR cd.transaction_hash IS NULL
ORDER BY cd.chain_id, cd.block_number
LIMIT 100;

Missing `transaction_hash`es per chain

SELECT
cd.chain_id,
COUNT(*) as missing_tx_hash_count
FROM verified_contracts vc
JOIN contract_deployments cd ON vc.deployment_id = cd.id
WHERE cd.transaction_hash = '\x'::bytea
OR cd.transaction_hash IS NULL
GROUP BY cd.chain_id
ORDER BY missing_tx_hash_count DESC;

Result of above, missing `transaction_hash`es per chain in production

Query to see the missing `transaction_hash`es as a percentage of total verified per chain

WITH missing_txs AS (
SELECT
cd.chain_id,
COUNT(*) as missing_tx_count
FROM verified_contracts vc
JOIN contract_deployments cd ON vc.deployment_id = cd.id
WHERE cd.transaction_hash = '\x'::bytea
OR cd.transaction_hash IS NULL
GROUP BY cd.chain_id
),
total_verifications AS (
SELECT
cd.chain_id,
COUNT(*) as total_count
FROM verified_contracts vc
JOIN contract_deployments cd ON vc.deployment_id = cd.id
GROUP BY cd.chain_id
)
SELECT
t.chain_id,
COALESCE(m.missing_tx_count, 0) as missing_tx_count,
t.total_count as total_verifications,
ROUND(COALESCE(m.missing_tx_count::numeric * 100 / t.total_count, 0), 2) as missing_percentage
FROM total_verifications t
LEFT JOIN missing_txs m ON t.chain_id = m.chain_id
ORDER BY missing_tx_count DESC;

Result of above

Query to see the total number of connections on the DB

SELECT
client_ip,
connection_count,
percentage
FROM (
SELECT
CASE
WHEN client_addr IS NULL THEN 'local/null'
ELSE client_addr::text
END as client_ip,
COUNT(*) as connection_count,
ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM pg_stat_activity), 2) as percentage,
2 as sort_order
FROM pg_stat_activity
GROUP BY client_addr
UNION ALL
SELECT
'TOTAL' as client_ip,
COUNT(*) as connection_count,
100.00 as percentage,
1 as sort_order
FROM pg_stat_activity
) t
ORDER BY sort_order, connection_count DESC; |
Done verifying 99% of the contracts. Some are left, but we don't want to dig into each of them for now. We can run another round of re-verification after #1665 and #1367 and see if they verify.

Results
|
We found that some contracts that had a `creator-tx-hash.txt` in the `/files/any/` endpoint shown in our old Kubernetes deployment didn't have it after the migration to our new setup. The `transaction_hash` column in `contract_deployments` is simply NULL for these cases.

I investigated two examples:
0x00000000219ab540356cBB839Cbe05303d7705Fa on chain 1
The logs show that there was a creatorTxHash found but fetching the creation bytecode failed. The latter probably happened due to an unresponsive RPC.
We should still store the creatorTxHash in these cases. This is fixed by #1572
0x4E7095a3519A33dF3D25774c2F9D7a89eB99745D on chain 11155111
Here the log is a bit misleading. The fetching of the creatorTxHash actually failed, because no `creatorTx` is in the jsonPayload. Therefore, lib-sourcify also didn't match with the creation tx. I also fixed the logging in the above PR.
So in this case the cause was probably the Etherscan API being unresponsive.
Database problem
What still remains is that, after the migration, we are probably missing the creation tx hashes for a lot of contracts. I queried the DB, and it shows 3.7M contract_deployments without a tx_hash and 1.4M with one. That seems like a lot without a tx hash to me. Unfortunately, we cannot know how it was before, because we never fully recovered the old production DB.
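For a sense of scale, the share of deployments without a tx hash implied by those two figures can be worked out as below. This is just arithmetic on the rounded numbers quoted above, so the result is approximate.

```python
# Rounded figures quoted above; the exact counts will differ slightly.
without_tx_hash = 3_700_000
with_tx_hash = 1_400_000

total = without_tx_hash + with_tx_hash
share_missing = round(100 * without_tx_hash / total, 1)

print(f"{share_missing}% of {total} contract_deployments have no tx hash")
```

That is, roughly 72.5% of the deployments, going by the rounded counts.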
Two options to fix this problem now:
`creator-tx-hash.txt` files and add them manually to the database if not present yet.

A side note: There are probably more problems after the migration. We should check the database more thoroughly. An unresponsive RPC could also have caused a change in the match status after the migration.