In Orchestrator's Ethereum event watcher, the function get_block_delay() is used to determine the block processing delay with respect to the latest block observed on Ethereum:
```rust
/// Let's make a conservative assumption of 1% chance of an uncle being a two block deep reorg
/// (actual is closer to 0.3%) and assume that continues as we increase the depth.
/// Given an uncle every 2.8 minutes, a 6 deep reorg would be 2.8 minutes * (100^4) or one
/// 6 deep reorg every 53,272 years.
///
pub async fn get_block_delay(web3: &Web3) -> Uint256 {
    let net_version = get_net_version_with_retry(web3).await;

    match net_version {
        // Mainline Ethereum, Ethereum classic, or the Ropsten, Kotti, Mordor testnets
        // all POW Chains
        1 | 3 | 6 | 7 => 6u8.into(),
        // Dev, our own Gravity Ethereum testnet, and Hardhat respectively
        // all single signer chains with no chance of any reorgs
        2018 | 15 | 31337 => 0u8.into(),
        // Rinkeby and Goerli use Clique (POA) Consensus, finality takes
        // up to num validators blocks. Number is higher than Ethereum based
        // on experience with operational issues
        4 | 5 => 10u8.into(),
        // assume the safe option (POW) where we don't know
        _ => 6u8.into(),
    }
}
```
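The arithmetic in the doc comment can be reproduced with a small sketch. The function name and parameters below are hypothetical and only restate the comment's own model (one uncle every 2.8 minutes, a factor of 100 per extra block of depth, 365-day years). Under that model the quoted 53,272 years corresponds to five factors of 100 applied to the uncle interval, which is the same as the comment's 100^4 applied to the 280-minute interval between two-block reorgs; applying 100^4 directly to 2.8 minutes would give roughly 533 years instead.

```rust
/// Minutes in a 365-day year.
const MINUTES_PER_YEAR: f64 = 60.0 * 24.0 * 365.0;

/// Expected years between reorgs of `depth` blocks under the doc comment's model:
/// one uncle every `uncle_interval_min` minutes, probability `p` that an uncle
/// becomes a two-block reorg, and one further factor of `p` per additional block.
/// (Hypothetical helper, not part of the Gravity Bridge code base.)
fn years_between_reorgs(uncle_interval_min: f64, p: f64, depth: u32) -> f64 {
    assert!(depth >= 2, "the model starts at two-block reorgs");
    // Depth 2 costs one factor of p, depth 3 costs two, ..., depth d costs d - 1.
    let probability = p.powi(depth as i32 - 1);
    uncle_interval_min / probability / MINUTES_PER_YEAR
}
```

Calling `years_between_reorgs(2.8, 0.01, 6)` reproduces the figure in the comment. This only checks the arithmetic; it says nothing about whether the independence assumption behind the model holds.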
get_block_delay() is employed uniformly in the function check_for_events() in the following way (shown here only for deposits):
As can be seen, the block delay is subtracted from the latest block number, and the events are then retrieved up to the adjusted block number, and processed.
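The subtraction described above can be sketched as follows. This is not the check_for_events() code: `DepositEvent`, `events_to_process`, and the use of `u64` instead of `Uint256` are simplifying assumptions made for illustration.

```rust
/// Hypothetical, simplified deposit event; the real type carries more fields.
#[derive(Debug, Clone, PartialEq)]
struct DepositEvent {
    block_height: u64,
    amount: u64,
}

/// Returns the events in (last_checked, latest_block - block_delay].
/// Events closer to the tip than `block_delay` are left for a later pass.
fn events_to_process(
    events: &[DepositEvent],
    last_checked: u64,
    latest_block: u64,
    block_delay: u64,
) -> Vec<DepositEvent> {
    // checked_sub avoids underflow while the chain is shorter than the delay.
    let safe_height = match latest_block.checked_sub(block_delay) {
        Some(height) => height,
        None => return Vec::new(),
    };
    events
        .iter()
        .filter(|e| e.block_height > last_checked && e.block_height <= safe_height)
        .cloned()
        .collect()
}
```

With a latest block of 100 and a delay of 6, only events at height 94 or below are returned; anything in the last six blocks is considered too recent.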
There are three problems with the above approach.
The Ethereum block delay is too low
The above calculation of probabilities is not well justified: probability theory does not apply here because Ethereum reorganizations are not random events. Estimating the correct block delay is a risk assessment, which is a very complicated process. We are not in a position to advise on the correct block delay, but it makes sense to look at the Ethereum block delays used by crypto exchanges:
Kraken: 20 blocks
Coinbase: 35 blocks
Binance: 12 blocks (changed from 30 two years ago)
As can be seen, the minimum block delay used by crypto exchanges is 12 and the average is around 22, both much larger than the 6 used by Gravity Bridge.
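The minimum and average quoted above follow directly from the three exchange figures in this issue (Kraken 20, Coinbase 35, Binance 12). A minimal sketch of the calculation, with a hypothetical helper name:

```rust
/// Returns (minimum, mean) of a list of block delays, or None for an empty list.
/// Hypothetical helper used only to check the numbers quoted in this issue.
fn delay_stats(delays: &[u32]) -> Option<(u32, f64)> {
    let min = *delays.iter().min()?;
    let mean = delays.iter().sum::<u32>() as f64 / delays.len() as f64;
    Some((min, mean))
}
```

For `[20, 35, 12]` this gives a minimum of 12 and a mean of 67 / 3, about 22.3.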
No detection / mitigation plan in case of a long reorg
No block delay can guarantee that a longer reorg won't eventually happen, as the DAO / Ethereum Classic case demonstrates. While nothing can protect users against long reorgs, such events should be detected and mitigated. Gravity Bridge, however, doesn't include any detection or mitigation mechanism against such events, and will proceed regardless, possibly causing more damage than necessary.
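One possible shape for the detection part is sketched below, assuming the watcher remembers the hash of every block it has already processed events from and can later ask its Ethereum node for the canonical hash at a given height. All names here (`ProcessedBlocks`, `ReorgCheck`, the `canonical_hash` lookup) are hypothetical and do not exist in Gravity Bridge.

```rust
use std::collections::BTreeMap;

/// Hypothetical 32-byte block hash.
type BlockHash = [u8; 32];

#[derive(Debug, PartialEq)]
enum ReorgCheck {
    /// Every processed block is still on the canonical chain.
    Consistent,
    /// A block whose events were already processed has been replaced.
    DeepReorg { first_divergent_height: u64 },
}

/// Hashes of the blocks the watcher has already processed, keyed by height.
#[derive(Default)]
struct ProcessedBlocks {
    hashes: BTreeMap<u64, BlockHash>,
}

impl ProcessedBlocks {
    fn record(&mut self, height: u64, hash: BlockHash) {
        self.hashes.insert(height, hash);
    }

    /// `canonical_hash` is an assumed lookup (for example backed by an RPC call)
    /// returning the hash currently at `height` on the node's canonical chain.
    fn check<F>(&self, canonical_hash: F) -> ReorgCheck
    where
        F: Fn(u64) -> Option<BlockHash>,
    {
        // BTreeMap iterates in ascending height order, so the first mismatch
        // is the lowest processed block affected by the reorg.
        for (&height, recorded) in &self.hashes {
            if canonical_hash(height) != Some(*recorded) {
                return ReorgCheck::DeepReorg { first_divergent_height: height };
            }
        }
        ReorgCheck::Consistent
    }
}
```

A real implementation would bound the history it keeps and would need to decide how to report a `DeepReorg` result to validators; both are left open here.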
No testing against reorgs
As can be seen in the above code, testnets do not exercise reorgs at all, even up to the given bound: the block delay for testnets is taken to be 0. This provides no confidence, not even a minimal one, that the code will behave correctly when deployed on the Ethereum mainnet.
Problem Scenarios
A reorg happens on the Ethereum mainnet that is longer than the block delay currently assumed by Gravity Bridge (6)
This creates a possibility of double spends or other undesirable consequences.
Without a detection / mitigation plan, Gravity Bridge will proceed as if nothing has happened, possibly causing more damage.
The testnets use 0 block delay
It is not clear how the code will behave in the presence of reorgs, even those no longer than the current block delay.
Recommendation
Short term
Undertake a risk assessment for Gravity Bridge. At the very least, follow the Ethereum block delays used by crypto exchanges, and increase the value used by Gravity Bridge to the average block delay employed there (~20 blocks).
In the testnets, test against Ethereum reorgs both up to and above the assumed block delay. Make sure that the code behaves correctly for reorgs up to the assumed block delay, and thoroughly understand the possible effects of longer reorgs.
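The property such tests should establish can be illustrated on a toy model before running it against a real testnet. The sketch below is an assumption-laden simplification: `MockChain` stores one fork tag per block instead of real block contents, and `reorg_invalidates_processed` is a hypothetical helper, not an existing test utility.

```rust
/// Toy chain: the index is the block height, the value tags the fork a block belongs to.
#[derive(Clone, Debug, PartialEq)]
struct MockChain {
    blocks: Vec<u32>,
}

impl MockChain {
    fn new(length: usize) -> Self {
        MockChain { blocks: vec![0; length] }
    }

    /// Replaces the last `depth` blocks with blocks from fork `fork_id`,
    /// keeping the chain length unchanged.
    fn reorg(&mut self, depth: usize, fork_id: u32) {
        let keep = self.blocks.len().saturating_sub(depth);
        for block in &mut self.blocks[keep..] {
            *block = fork_id;
        }
    }

    /// Blocks at least `delay` behind the tip, i.e. what the watcher has processed.
    fn processed_prefix(&self, delay: usize) -> &[u32] {
        let end = self.blocks.len().saturating_sub(delay);
        &self.blocks[..end]
    }
}

/// True if a reorg of `depth` blocks changes a block that a watcher running
/// with `delay` has already processed.
fn reorg_invalidates_processed(chain_len: usize, delay: usize, depth: usize) -> bool {
    let before = MockChain::new(chain_len);
    let mut after = before.clone();
    after.reorg(depth, 1);
    before.processed_prefix(delay) != after.processed_prefix(delay)
}
```

In this model a reorg of depth 6 leaves a delay-6 watcher untouched, a reorg of depth 7 does not, and with a delay of 0 any reorg at all affects processed blocks. Tests on an actual testnet would need to check the same boundary against the real event-processing code.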
Long term
Implement detection / mitigation mechanisms for the case where an Ethereum reorg longer than the assumed block delay happens. The detection mechanism should ensure, at the very least, that the event is detected and communicated to validators. The mitigation mechanism should probably just stop Gravity Bridge operation; it should also provide a way to resume operation after damage assessment and repair (e.g. in the form of a special message signed by a quorum of validators).
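A minimal sketch of the stop-and-resume behaviour follows. It assumes a quorum means strictly more than two thirds of total voting power, which this issue does not specify, and it omits signature verification entirely. `Bridge`, `BridgeState`, and `try_resume` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum BridgeState {
    Running,
    /// Operation is stopped after a reorg was detected at `reorg_height`.
    Halted { reorg_height: u64 },
}

struct Bridge {
    state: BridgeState,
    /// Assumed validator set: validator name -> voting power.
    validators: HashMap<String, u64>,
}

impl Bridge {
    fn halt(&mut self, reorg_height: u64) {
        self.state = BridgeState::Halted { reorg_height };
    }

    /// Resumes only if the distinct signers hold more than 2/3 of total power.
    /// The 2/3 threshold is an assumption; verifying the signatures is out of scope.
    fn try_resume(&mut self, signers: &[&str]) -> bool {
        if self.state == BridgeState::Running {
            return true;
        }
        let total: u64 = self.validators.values().sum();
        let mut seen = HashSet::new();
        let mut signed = 0u64;
        for signer in signers {
            // Count each signer once, and ignore names outside the validator set.
            if seen.insert(*signer) {
                signed += self.validators.get(*signer).copied().unwrap_or(0);
            }
        }
        if signed * 3 > total * 2 {
            self.state = BridgeState::Running;
            true
        } else {
            false
        }
    }
}
```

With three equally weighted validators, two signatures are not enough under this threshold and the bridge stays halted until all three sign.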
Original Issue
Surfaced from @informalsystems audit of Althea Gravity Bridge at commit 19a4cfe
severity: High
type: Protocol / Implementation bug
difficulty: Unknown