
Accumulate candidates and votes from previous iterations, same round #1038

Closed
wants to merge 35 commits

Conversation


@goshawk-3 goshawk-3 commented Sep 5, 2023

To accomplish the experiment's goal, this PR implements the following ideas (a simplified sketch of the per-iteration agreement tracking follows the list):

  • consensus: Always keep the aggregator state of both reduction steps within the context of a single round.
  • consensus: When a message from a former step arrives, collect it instead of discarding it.
  • node-data: Use different topic IDs for the first and second reduction messages.
  • consensus: Generate an agreement for any iteration once step_votes for both reductions are produced.
  • consensus: Vote for a candidate block from a former iteration of the same round.
  • node: Broadcast any block equal to the local tip.
  • fallback: Try to accept the received block if a node falls back to the previous block only.
  • fallback: Filter out blacklisted blocks in both modes (inSync and OutOfSync).
  • fallback: Blacklist the hash from the fork.
  • node: Initiate a recovery procedure if a node is on a fork.
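
A minimal sketch of the agreement idea above, using simplified stand-in types (RoundState, IterationVotes, and on_step_votes are illustrative names, not the node's actual API):

```rust
use std::collections::HashMap;

// Illustrative stand-in: the real node keeps BLS-aggregated StepVotes per
// reduction step; here a bool simply marks "StepVotes produced".
#[derive(Default, Clone, Copy)]
struct IterationVotes {
    first_reduction: bool,
    second_reduction: bool,
}

/// Tracks, per iteration of the current round, whether StepVotes exist for
/// both reduction steps, so an Agreement can be generated for any iteration,
/// even if the votes arrive while the node is already on a later one.
#[derive(Default)]
struct RoundState {
    iterations: HashMap<u8, IterationVotes>,
}

impl RoundState {
    /// Records StepVotes for one reduction step of `iteration` and returns
    /// true once both steps are covered (i.e. an Agreement can be produced).
    fn on_step_votes(&mut self, iteration: u8, second_reduction: bool) -> bool {
        let entry = self.iterations.entry(iteration).or_default();
        if second_reduction {
            entry.second_reduction = true;
        } else {
            entry.first_reduction = true;
        }
        entry.first_reduction && entry.second_reduction
    }
}

fn main() {
    let mut round = RoundState::default();
    // First reduction of iteration 1 completes...
    assert!(!round.on_step_votes(1, false));
    // ...the node moves on, then late second-reduction StepVotes for
    // iteration 1 arrive and still complete the agreement for it.
    assert!(round.on_step_votes(1, true));
}
```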

@goshawk-3 goshawk-3 changed the title Accumulate messages from previous iterations Accumulate candidates and votes from previous iterations, same round Sep 5, 2023
@goshawk-3 goshawk-3 requested a review from fed-franz September 20, 2023 08:50
@goshawk-3 goshawk-3 marked this pull request as ready for review September 20, 2023 09:04
}

if let Err(e) = self.outbound.send(msg.clone()).await {
error!("could not send newblock msg due to {:?}", e);
Contributor

should this adapt to the msg type instead of newblock?

Contributor Author

Indeed.

@@ -18,7 +18,9 @@ use tracing::{debug, error, warn};
 /// voters.StepVotes Mapping of a block hash to both an aggregated signatures
 /// and a cluster of bls voters.
 #[derive(Default)]
-pub struct Aggregator(BTreeMap<Hash, (AggrSignature, Cluster<PublicKey>)>);
+pub struct Aggregator(
+    BTreeMap<(u8, Hash), (AggrSignature, Cluster<PublicKey>)>,
Contributor

I'm not sure we need the "step" part in the key.
There's no collision between the block hashes...

Does it have another goal?

Contributor Author

If this is not introduced, we accumulate votes for the empty hash across all iterations. Before introducing this fix, I noticed that votes for an empty hash exceeded 67%.
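
A toy illustration of the collision being described, using plain counters instead of the real aggregated signatures (all names here are illustrative):

```rust
use std::collections::BTreeMap;

type Hash = [u8; 32];
const EMPTY_HASH: Hash = [0u8; 32];

fn main() {
    // Keyed by hash only: NIL votes from every iteration of the round land in
    // the same empty-hash bucket, inflating its share of the vote.
    let mut by_hash: BTreeMap<Hash, usize> = BTreeMap::new();
    for _step in [1u8, 3, 5] {
        *by_hash.entry(EMPTY_HASH).or_insert(0) += 1;
    }
    assert_eq!(by_hash[&EMPTY_HASH], 3);

    // Keyed by (step, hash), as in the new Aggregator signature: each
    // iteration's NIL votes are counted separately.
    let mut by_step_hash: BTreeMap<(u8, Hash), usize> = BTreeMap::new();
    for step in [1u8, 3, 5] {
        *by_step_hash.entry((step, EMPTY_HASH)).or_insert(0) += 1;
    }
    assert_eq!(by_step_hash[&(1u8, EMPTY_HASH)], 1);
}
```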

Contributor

I see.
This is due to the ambiguity of votes. The vote should be separate from the block hash, so as to distinguish the case of an invalid candidate from that of the candidate not having been received. Still, in the case of an empty candidate, the iteration/step is indeed needed.

That said, I think nodes should not vote on the empty hash. Instead, they should vote NIL only on invalid candidates.
That is (sketched below):

  • if no candidate is received, do not vote
  • if an invalid candidate is received, vote NIL on the candidate

Note that voting NIL for not having received a block is also at odds with waiting for (and voting on) previous-iteration blocks, as it could lead to the same provisioner voting NIL (because it didn't receive the block) and then voting for the block's hash upon receiving it while on a later iteration.
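
A minimal sketch of the rule being proposed (the names and structure are illustrative, not an actual implementation):

```rust
type Hash = [u8; 32];

/// The proposed behaviour: abstain when no candidate was received, vote NIL
/// only on an invalid candidate, otherwise vote for the candidate's hash.
#[derive(Debug)]
enum VoteDecision {
    Abstain,
    Nil,
    For(Hash),
}

fn decide(candidate: Option<Hash>, is_valid: impl Fn(&Hash) -> bool) -> VoteDecision {
    match candidate {
        // No candidate received: do not vote at all.
        None => VoteDecision::Abstain,
        // Invalid candidate received: vote NIL on it.
        Some(h) if !is_valid(&h) => VoteDecision::Nil,
        // Valid candidate: vote its hash.
        Some(h) => VoteDecision::For(h),
    }
}

fn main() {
    assert!(matches!(decide(None, |_| true), VoteDecision::Abstain));
    assert!(matches!(decide(Some([1u8; 32]), |_| false), VoteDecision::Nil));
    assert!(matches!(decide(Some([1u8; 32]), |_| true), VoteDecision::For(_)));
}
```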

Contributor Author

> That said, I think nodes should not vote on the empty hash.

What metrics do you expect to improve with this patch (not voting on the empty hash)? And why do you propose it here as a comment instead of posting a DIP with a better and more complete explanation?

Contributor

I wrote it here as part of the discussion. DIPs/Issues can always stem from comments :)
I'll write a proposal for this but, in short, I think voting on the empty hash yields the same issues as not collecting votes from previous iterations.
Either way, if we vote NIL when not receiving a block, we shouldn't vote on previous-iteration blocks, or we could produce a double (and opposite) vote on the same block.

Contributor

Created an Issue here

Contributor

As mentioned in the DIP Issue, relative to this PR: if the node votes NIL for a round/iteration after the timeout and then votes on the candidate when receiving it (vote_for_former_candidate), it produces two votes for the same block, creating potential indeterminism (nodes can choose either of the two votes).

Contributor Author
@goshawk-3 goshawk-3 Sep 29, 2023

Simply put:

  • A NIL vote means a vote for the tuple (empty hash, round, step).
  • A real vote means a vote for the tuple (candidate_block_hash, round, step).

> nodes can choose either of the two votes

Not the case at all. Nodes can reach quorum for an empty hash at iteration 1 and then move forward and reach quorum for the candidate block of iteration 1 while running the second iteration. The output of this case is a valid finalized block.

> creating potential indeterminism (nodes can choose either of the two votes).

There is no indeterminism, as these are not the same blocks. Also, voting NIL only helps provisioners move forward when a candidate block is not generated; it has no other impact.
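
For clarity, a small sketch of the two tuples described above (illustrative types only):

```rust
type Hash = [u8; 32];
const EMPTY_HASH: Hash = [0u8; 32];

/// A reduction vote as described above: a NIL vote is simply a vote whose
/// hash is the empty hash, bound to the same (round, step) a real vote uses.
#[derive(Debug, PartialEq, Eq)]
struct Vote {
    hash: Hash,
    round: u64,
    step: u8,
}

impl Vote {
    fn nil(round: u64, step: u8) -> Self {
        Self { hash: EMPTY_HASH, round, step }
    }
}

fn main() {
    let candidate: Hash = [7u8; 32];
    let nil = Vote::nil(10, 3);
    let real = Vote { hash: candidate, round: 10, step: 3 };
    // Same (round, step), different hashes: quorums on them are tallied
    // independently, which is the point being argued here.
    assert_ne!(nil, real);
}
```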

Contributor

We do not currently distinguish between a NIL for an unknown candidate and a NIL for an invalid block.
In both cases, the vote will be (round, step, empty_hash).
So, a NIL vote is effectively valid as a negative vote on the candidate, whichever that might be.

If we produce both a NIL and a candidate vote, they would be contradictory votes on the same block.
And if I receive both votes, I can use either for the quorum; both would be valid votes.
In fact, in extreme cases, there could be a quorum on both NIL and the candidate.

Example 1:

  • I'm at round/step
  • I don't receive the block, so I vote NIL (empty,round,step)
  • I then receive the block and vote (candidate,round,step)

Now there are two votes for the same round/step.
Let's say the committee is split in half (half voted NIL, half voted the candidate) and my vote is the one deciding the quorum.
Nodes receiving the NIL vote first will reach quorum on NIL and move to the next iteration;
nodes receiving the candidate vote first will reach quorum on the block and move to the next round.
Both groups will have a valid quorum for the same round and step, but one is on NIL and the other is on the candidate.

Example 2:

  • let's say all nodes are at round/step
  • the candidate is delayed, so all provisioners vote NIL
  • then the block is broadcasted and they all vote for it

Depending on which votes they receive first, nodes can either reach quorum on the candidate or on the empty hash.

> Not the case at all. Nodes can reach quorum for an empty hash at iteration 1 and then move forward and reach quorum for the candidate block of iteration 1 while running the second iteration. The output of this case is a valid finalized block.

You're only seeing the best-case scenario.
In this case you still have two valid quorums for the same round/iteration.
If a quorum is reached on iteration 2, it's possible that some nodes will move to the next round on the iteration-2 block while others will accept the first-iteration block and move to the next round.
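
A toy illustration of the split in Example 1, with a made-up committee size and quorum threshold (none of this reflects the node's actual parameters):

```rust
use std::collections::HashMap;

type Hash = [u8; 32];
const EMPTY_HASH: Hash = [0u8; 32];

/// Tallies votes for a single (round, step) in arrival order and returns the
/// first hash to reach the quorum threshold.
fn first_quorum(votes: &[Hash], threshold: usize) -> Option<Hash> {
    let mut tally: HashMap<Hash, usize> = HashMap::new();
    for &hash in votes {
        let count = tally.entry(hash).or_insert(0);
        *count += 1;
        if *count >= threshold {
            return Some(hash);
        }
    }
    None
}

fn main() {
    let candidate: Hash = [1u8; 32];
    // Five committee members, quorum of three: two voted NIL, two voted the
    // candidate, and the last member emitted *both* votes (NIL on timeout,
    // then the candidate once it finally arrived).
    let nil_arrives_first = [EMPTY_HASH, EMPTY_HASH, candidate, candidate, EMPTY_HASH, candidate];
    let candidate_arrives_first = [EMPTY_HASH, EMPTY_HASH, candidate, candidate, candidate, EMPTY_HASH];

    // Which quorum a node observes depends purely on delivery order.
    assert_eq!(first_quorum(&nil_arrives_first, 3), Some(EMPTY_HASH));
    assert_eq!(first_quorum(&candidate_arrives_first, 3), Some(candidate));
}
```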

Contributor Author

Reaching quorum on a tuple (empty_hash, round, step) does not affect reaching quorum for any candidate block of the same round.

I'd suggest we pair up (when you're available) and review the implementation together to make sure we're on the same page.

consensus/src/execution_ctx.rs (resolved conversation)
// the network for that broadcast may damage the
// fallback for nodes on different fork.

self.network.write().await.broadcast(msg).await;
Contributor

I don't think this is needed.
If consensus accepted the block, it means we have already broadcast it (or at least, we should have).
Also, if we already accepted the block from the network, we should not re-broadcast it, to avoid possible DoS attacks (e.g. flooding the network with duplicate blocks).
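
A sketch of the kind of duplicate guard this concern points at, assuming a simple set of already-seen block hashes (the guard and its names are hypothetical, not the node's code):

```rust
use std::collections::HashSet;

type Hash = [u8; 32];

/// Hypothetical guard: relay a block only the first time its hash is seen,
/// so duplicates received from the network are not flooded back out.
#[derive(Default)]
struct BroadcastGuard {
    seen: HashSet<Hash>,
}

impl BroadcastGuard {
    /// Returns true if the block has not been seen before and may be relayed.
    fn should_broadcast(&mut self, block_hash: Hash) -> bool {
        self.seen.insert(block_hash)
    }
}

fn main() {
    let mut guard = BroadcastGuard::default();
    let tip: Hash = [9u8; 32];
    assert!(guard.should_broadcast(tip));  // first sighting: relay it
    assert!(!guard.should_broadcast(tip)); // duplicate: drop it
}
```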

Contributor Author

I'm aware of that. This would also be the case when we implement fallback to any former (non-finalized) block, as we'd need to re-broadcast blocks with a lower tip.

The reasoning behind this change: any node that has already accepted a block at the 1st iteration will not re-propagate it, which effectively eclipses the network for other (non-consensus) nodes that are still on the wrong fork.

Contributor

I don't follow the reasoning...
If the block is known, whether because it was received from peers or produced by consensus, it would already have been propagated to the node's peers.
Re-broadcasting the same block when it is received from the network would be a duplicate.
If nodes accepted that block at the 1st iteration, they will have broadcast the same message they are receiving now.

Contributor Author

Noted. I'll remove it once I confirm that, in the case of a fallback, all nodes receive the correct block.

@goshawk-3 goshawk-3 mentioned this pull request Oct 2, 2023

goshawk-3 commented Oct 23, 2023

Closed in favor of PR: #1098

@goshawk-3 goshawk-3 closed this Oct 23, 2023
@goshawk-3 goshawk-3 deleted the collect_past_messages branch February 8, 2024 10:01