+ +This design augments the existing Zcash Proof‑of‑Work (PoW) network with a new consensus layer which provides trailing finality, called the Trailing Finality Layer (TFL).
+This layer enables blocks produced via PoW to become final which ensures they may never be rolled back. This enables safer and simpler wallets and other infrastructure, and aids trust-minimized cross-chain bridges.
+This consensus layer uses a finalizing Proof-of-Stake (PoS) consensus protocol, and enables ZEC holders to earn protocol rewards for contributing to the security of the Zcash network. By integrating a PoS layer with the current PoW Zcash protocol, this design specifies a hybrid consensus protocol.
+The integration of the current PoW consensus with the TFL produces a new top-level consensus protocol referred to as PoW+TFL.
+In the following subchapters we introduce the Design at a Glance, then provide an overview of the major components of the design.
+Following this overview chapter, we proceed into a detailed Protocol Specification (TODO).
+ +Crosslink is the proposed hybrid construction for the Trailing Finality Layer. The current version is Crosslink 2.
+We are now ready to give a description of a protocol that takes into account the issues described in Notes on Snap‑and‑Chat, and that implements bounded availability. We call this the “Crosslink” construction; more precisely the version described here is “Crosslink 2”.
+This description will attempt to be self-contained, but [NTT2020] (arXiv version) is useful background on the general model of Ebb-and-Flow protocols.
+“$\star$” is a metavariable for the name of a protocol. We also use it as a wildcard in protocol names of a particular type, for example “$\star$bc” for the name of some best‑chain protocol.
+Protocols are referred to as $\Pi_{\star}$ for a name “$\star$”. Where it is useful to avoid ambiguity, when referring to a concept defined by $\Pi_{\star}$ we prefix it with “$\star$‑”.
+We do not take synchrony or partial synchrony as an implicit assumption of the communication model; that is, unless otherwise specified, messages between protocol participants can be arbitrarily delayed or dropped. A given message is received at most once, and messages are nonmalleably authenticated as originating from a given sender whenever needed by the applicable protocol. Particular subprotocols may require a stronger model.
+For an overview of communication models used to analyze distributed protocols, see this blog post by Ittai Abraham.
+Discussion of incorrect applications of the GST formalization of partial synchrony to continuously operating protocols.
+The original context for the definition of the partially synchronous model in [DLS1988] was for “one‑shot” Byzantine Agreement (called “the consensus problem” in that paper). The following argument is used to justify assuming that all messages from the Global Stabilization Time onward are delivered within the upper time bound $\Delta$:
+> Therefore, we impose an additional constraint: For each execution there is a global stabilization time (GST), unknown to the processors, such that the message system respects the upper bound $\Delta$ from time GST onward.
+> This constraint might at first seem too strong: In realistic situations, the upper bound cannot reasonably be expected to hold forever after GST, but perhaps only for a limited time. However, any good solution to the consensus problem in this model would have an upper bound $L$ on the amount of time after GST required for consensus to be reached; in this case it is not really necessary that the bound $\Delta$ hold forever after time GST, but only up to time GST $+\, L$. We find it technically convenient to avoid explicit mention of the interval length $L$ in the model, but will instead present the appropriate upper bounds on time for each of our algorithms.
+
Several subsequent authors applying the partially synchronous model to block chains appear to have forgotten or neglected this context. In particular, the argument depends on the protocol completing soon after GST. Obviously a block‑chain protocol does not satisfy this assumption; it is not a “one‑shot” consensus problem.
+This assumption could be removed, but some authors of papers about block‑chain protocols have taken it to be an essential aspect of modelling partial synchrony. I believe this is contrary to the intent of [DLS1988]:
+> Instead of requiring that the consensus problem be solvable in the GST model, we might think of separating the correctness conditions into safety and termination properties. The safety conditions are that no two correct processors should ever reach disagreement, and that no correct processor should ever make a decision that is contrary to the specified validity conditions. The termination property is just that each correct processor should eventually make a decision. Then we might require an algorithm to satisfy the safety conditions no matter how asynchronously the message system behaves, that is, even if $\Delta$ does not hold eventually. On the other hand, we might only require termination in case $\Delta$ holds eventually. It is easy to see that these safety and termination conditions are [for the consensus problem] equivalent to our GST condition: If an algorithm solves the consensus problem when $\Delta$ holds from time GST onward, then that algorithm cannot possibly violate a safety property even if the message system is completely asynchronous. This is because safety violations must occur at some finite point in time, and there would be some continuation of the violating execution in which $\Delta$ eventually holds.
+
This argument is correct as stated, i.e. for the one‑shot consensus problem. Subtly, essentially the same argument can be adapted to protocols with safety properties that need to be satisfied continuously. However, it cannot correctly be applied to liveness properties of non‑terminating protocols. The authors (Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer) would certainly have known this: notice how they carefully distinguish “the GST model” from “partial synchrony”. They cannot plausibly have intended this GST formalization to be applied unmodified to analyze liveness in such protocols, which seems to be common in the block‑chain literature, including in the Ebb-and-Flow paper [NTT2020] and the Streamlet paper [CS2020].
+The Ebb-and-Flow paper acknowledges the issue by saying “Although in reality, multiple such periods of (a‑)synchrony could alternate, we follow the long‑standing practice in the BFT literature and study only a single such transition.” This is not adequate: “long‑standing practice” notwithstanding, it is not valid in general to infer that properties holding for the first transition to synchrony also apply to subsequent transitions (where the protocol can be in states that would not occur initially), and it is plausible that this inference could fail for real protocols. The Streamlet paper also refers to “periods of synchrony” which indicates awareness of the issue, but then it uses the unmodified GST model in the proofs.
+Informally, to solve this issue it is necessary to also prove that existing progress is maintained during periods of asynchrony, and that during such periods the protocol remains in states where it will be able to take advantage of a future period of synchrony to make further progress.
+This provides further motivation to avoid taking the GST formalization of partial synchrony as a basic assumption.
+Note that the recent result [CGSW2024] does not contradict anything we say here. Although the GST and Unknown Latency models are “equally demanding” in the sense of existence of protocols that satisfy a given goal, this result does not show that the models are equivalent for any specific protocol. In particular the requirements of the “clock‑slowing” technique fail in practice for any protocol involving Proof‑of‑Work.
+A $\star$‑execution is the complete set of events (message sends/receives and decisions by protocol participants) that occur in a particular run of $\Pi_{\star}$ from its initiation up to a given time. A prefix of a $\star$‑execution is also a $\star$‑execution. Since executions always start from protocol initiation, a strict suffix of a $\star$‑execution is not a $\star$‑execution.
+Times are modelled as values of a totally ordered type $\mathsf{Time}$ with minimum value $0$. For convenience, we consider all protocol executions to start at time $0$.
+Although protocols may be nondeterministic, an execution fixes the events that occur and times at which they occur, for the purpose of modeling.
+For simplicity, we assume that all events occur at global times in a total ordering. This assumption is not realistic in an asynchronous communication model, but it is not essential to the design or analysis and could be removed: we could use a partial happens-before ordering on events in place of a total ordering on times.
+A “$\star$‑node” is a participant in $\Pi_{\star}$ (the protocol may be implicit). A $\star$‑node is “honest at time $t$” in a given execution iff it has followed the protocol up to and including time $t$ in that execution.
+A time series on type $U$ is a function $S :: \mathsf{Time} \to U$ assigning a value of $U$ to each time in an execution. By convention, we will write the time as a superscript: $S^t = S(t)$.
+A $\star$‑chain is a nonempty sequence of $\star$‑blocks, starting at the “genesis block” $\mathcal{O}_{\star}$, in which each subsequent block refers to its preceding or “parent block” by a collision‑resistant hash. The “tip” of a $\star$‑chain is its last element.
+For convenience, we conflate $\star$‑blocks with $\star$‑chains; that is, we identify a chain with the block at its tip. This is justified because, assuming that the hash function used for parent links is collision‑resistant, there is exactly one $\star$‑chain corresponding to a $\star$‑block; and conversely there is exactly one $\star$‑block at the tip of a $\star$‑chain.
+If $C$ is a $\star$‑chain, $C \lceil_{\star}^{k}$ means $C$ with the last $k$ blocks pruned, except that if $k \geq \mathsf{len}(C)$, the result is the genesis $\star$‑chain consisting only of $\mathcal{O}_{\star}$.
+The block at depth $k \in \mathbb{N}$ in a $\star$‑chain $C$ is defined to be the tip of $C \lceil_{\star}^{k}$. Thus the block at depth $k$ in a chain is the last one that cannot be affected by a rollback of length $k$ (this also applies when $k \geq \mathsf{len}(C)$ because the genesis $\star$‑chain cannot roll back).
+Our usage of “depth” is different from [NTT2020], which uses “depth” to refer to what Bitcoin and Zcash call “height”. It also differs by $1$ from the convention for confirmation depths in zcashd, where the tip is considered to be at depth $1$, rather than $0$.
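The truncation and depth conventions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the specification: a chain is represented as a list of block identifiers whose first element is the genesis block, and the names `truncate` and `block_at_depth` are hypothetical.

```python
from typing import List, Sequence

Block = str  # hypothetical stand-in for a block identifier


def truncate(chain: Sequence[Block], k: int) -> List[Block]:
    """Return `chain` with its last k blocks pruned, never pruning genesis."""
    if not chain:
        raise ValueError("a chain is a nonempty sequence of blocks")
    if k < 0:
        raise ValueError("k must be a natural number")
    # If k >= len(chain), only the genesis block remains.
    keep = max(len(chain) - k, 1)
    return list(chain[:keep])


def block_at_depth(chain: Sequence[Block], k: int) -> Block:
    """The block at depth k is the tip of the chain truncated by k blocks."""
    return truncate(chain, k)[-1]
```

With this convention the tip is at depth 0, and any depth at or beyond the chain length resolves to the genesis block.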
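+For $\star$‑blocks $B$ and $C$, the notation $B \preceq_{\star} C$ means that the $\star$‑chain with tip $B$ is a prefix of the one with tip $C$ (including the case $B = C$), and $B \preceq\hspace{-0.5em}\succeq_{\star} C$ means that either $B \preceq_{\star} C$ or $C \preceq_{\star} B$; that is, “one of $B$ and $C$ is a prefix of the other”.
+A function $S :: I \to \star\text{-block}$ is $\star$‑linear iff for every $t, u :: I$ where $t \leq u$ we have $S(t) \preceq_{\star} S(u)$. (This definition can be applied to time series where $I = \mathsf{Time}$, or to sequences of $\star$‑blocks where values of $I$ are indices.)
+Lemma. If $A \preceq_{\star} C$ and $B \preceq_{\star} C$ then $A \preceq\hspace{-0.5em}\succeq_{\star} B$.
+Proof: The chain of ancestors of $C$ is $\star$‑linear, and $A$, $B$ are both on that chain.
+The notation $[f(X) \text{ for } X \preceq_{\star} Y]$ means the sequence of $f(X)$ for each $\star$‑block $X$ in chain order from genesis up to and including $Y$. ($X$ is a bound variable within this construct.)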
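As a small illustration of the prefix relation, the “one is a prefix of the other” relation, and linearity, here is a sketch that conflates a block with the chain ending at it (as the text does) and represents that chain as a Python list starting at genesis. The function names `is_prefix`, `comparable` and `is_linear` are hypothetical and chosen only for this example.

```python
from typing import List, Sequence

Chain = List[str]  # a block is identified with the chain ending at it


def is_prefix(b: Chain, c: Chain) -> bool:
    """True iff chain b is a prefix of chain c (including b == c)."""
    return len(b) <= len(c) and c[:len(b)] == b


def comparable(b: Chain, c: Chain) -> bool:
    """True iff one of b and c is a prefix of the other."""
    return is_prefix(b, c) or is_prefix(c, b)


def is_linear(series: Sequence[Chain]) -> bool:
    """True iff every earlier entry is a prefix of every later entry."""
    return all(
        is_prefix(series[t], series[u])
        for t in range(len(series))
        for u in range(t, len(series))
    )
```

The lemma about two prefixes of a common chain corresponds to the observation that if `is_prefix(a, c)` and `is_prefix(b, c)` both hold, then `comparable(a, b)` holds.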
+remove this if not used:
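+We use $\mathsf{log} \preceq \mathsf{log}'$ (without a subscript on $\preceq$) to mean that the transaction ledger $\mathsf{log}$ is a prefix of $\mathsf{log}'$. Similarly to above, $\mathsf{log} \preceq\hspace{-0.5em}\succeq \mathsf{log}'$ means that either $\mathsf{log} \preceq \mathsf{log}'$ or $\mathsf{log}' \preceq \mathsf{log}$; that is, “one of $\mathsf{log}$ and $\mathsf{log}'$ is a prefix of the other”.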
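+In the simplest case, a block‑chain protocol $\Pi_{\star}$ provides a single “view” that, for a given $\star$‑execution, provides each $\star$‑node with a time series on $\star$‑chains. More generally a protocol may define several “views” that provide each $\star$‑node with time series on potentially different chain types.
+We model a $\star$‑view as a function $V :: \mathsf{Node} \times \mathsf{Time} \to \star\text{-chain}$. By convention, we will write the node index as a subscript and the time as a superscript: $V_i^t = V(i, t)$.
+Definition (Agreement on a view).
An execution of $\Pi_{\star}$ has Agreement on the view $V$ iff for all times $t$, $u$ and all nodes $i$, $j$ (potentially the same) such that $i$ is honest at time $t$ and $j$ is honest at time $u$, we have $V_i^t \preceq\hspace{-0.5em}\succeq_{\star} V_j^u$ (that is, one of $V_i^t$ and $V_j^u$ is a prefix of the other).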
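The Agreement property can be checked mechanically on a finite record of an execution. The following sketch assumes a hypothetical encoding: the view is a dictionary from `(node, time)` pairs to chains (lists of block identifiers starting at genesis), and `is_honest(node, time)` reports whether the node is honest at that time. It checks only the samples that were recorded, so it illustrates the definition and does not prove anything about a protocol.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Chain = List[str]
View = Dict[Tuple[str, int], Chain]  # (node, time) -> chain in that node's view


def one_is_prefix(a: Chain, b: Chain) -> bool:
    """True iff one of the two chains is a prefix of the other."""
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    return longer[:len(shorter)] == shorter


def has_agreement(view: View, is_honest: Callable[[str, int], bool]) -> bool:
    """Check Agreement over all pairs of recorded honest (node, time) samples."""
    honest_samples = [key for key in view if is_honest(*key)]
    return all(
        one_is_prefix(view[x], view[y])
        for x, y in product(honest_samples, repeat=2)
    )
```

Note that the two times are unrelated: the check compares every honest sample against every other, including samples from the same node at different times.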
+As in Snap‑and‑Chat, we depend on a BFT protocol $\Pi_{\mathrm{origbft}}$, and a best‑chain protocol $\Pi_{\mathrm{origbc}}$.
+See this terminology note for why we do not call $\Pi_{\mathrm{origbc}}$ a “longest‑chain” protocol.
+We modify $\Pi_{\mathrm{origbft}}$ (resp. $\Pi_{\mathrm{origbc}}$) to give $\Pi_{\mathrm{bft}}$ (resp. $\Pi_{\mathrm{bc}}$) by adding structural elements, changing validity rules, and changing the specified behaviour of honest nodes.
+A Crosslink 2 node must participate in both $\Pi_{\mathrm{bft}}$ and $\Pi_{\mathrm{bc}}$; that is, it must maintain a view of the state of each protocol. Acting in more specific roles such as bft‑proposer, bft‑validator, or bc‑block‑producer is optional, but we assume that all such actors are Crosslink 2 nodes.
+A bft‑node’s view includes a set of bft‑block chains each rooted at a fixed genesis bft‑block $\mathcal{O}_{\mathrm{bft}}$. There is a bft‑block‑validity rule (specified below), which depends only on the content of the block and its ancestors. A non‑genesis block can only be bft‑block‑valid if its parent is bft‑block‑valid. A bft‑valid‑chain is a chain of bft‑block‑valid blocks.
+Execution proceeds in a sequence of epochs. In each epoch, an honest proposer for that epoch may make a bft‑proposal.
+A bft‑proposal refers to a parent bft‑block, and specifies the proposal’s epoch. The content of a proposal is signed by the proposer using a strongly unforgeable signature scheme. We consider the proposal to include this signature. There is a bft‑proposal‑validity rule, depending only on the content of the proposal and its parent block, and the validity of the proposer’s signature.
+We extend the $\lceil$ notation to bft‑proposals in the obvious way: if $k \geq 1$, $P$ is a bft‑proposal and $A$ its parent bft‑block, then $P \lceil_{\mathrm{bft}}^{k} \,:=\, A \lceil_{\mathrm{bft}}^{k-1}$.
+We will shorten “bft‑block‑valid bft‑block” to “bft‑valid‑block”, and “bft‑proposal‑valid bft‑proposal” to “bft‑valid‑proposal”.
+For each epoch, there is a fixed number of voting units distributed between the bft‑nodes, which they use to vote for a bft‑proposal. We say that a voting unit has been cast for a bft‑proposal $P$ at a given time in a bft‑execution, if and only if $P$ is bft‑proposal‑valid and a ballot for $P$ authenticated by the holder of the voting unit exists at that time.
+Using knowledge of ballots cast for a bft‑proposal $P$ that collectively satisfy a notarization rule at a given time in a bft‑execution, and only with such knowledge, it is possible to obtain a valid bft‑notarization‑proof $\mathsf{proof}_P$. The notarization rule must require at least a two‑thirds absolute supermajority of voting units in $P$’s epoch to have been cast for $P$. It may also require other conditions.
+A voting unit is cast non‑honestly for an epoch’s proposal iff:
+Note that a unit should be considered to be cast non-honestly in the case of key compromise, because it is then effectively under the control of an adversary. The key compromise may or may not be attributable to another flaw in the protocol, but such a flaw would not be a break of the consensus mechanism per se.
+Definition (One‑third bound on non‑honest voting).
An execution of $\Pi_{\mathrm{bft}}$ has the one‑third bound on non‑honest voting property iff for every epoch, strictly fewer than one third of the total voting units for that epoch are ever cast non‑honestly.
+It may be the case that a ballot cast for $P$ is not in honest view when it is used to create a notarization proof for $P$. Since we are not assuming synchrony, it may also be the case that such a ballot is in honest view but that any given node has not received it (and perhaps will never receive it).
+There may be multiple distinct ballots or distinct ballot messages attempting to cast a given voting unit for the same proposal; this is undesirable for bandwidth usage, but it is not necessary to consider it to be non‑honest behaviour for the purpose of security analysis, as long as such ballots are not double‑counted toward the two‑thirds threshold.
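The two‑thirds threshold and the requirement not to double‑count ballots can be illustrated as follows. This is a hedged sketch, not the notarization rule of any particular BFT protocol: ballots are assumed to be already authenticated and to refer to a bft‑proposal‑valid proposal, each ballot is encoded as a hypothetical `(voting_unit_id, proposal_id)` pair, and the function names are invented for this example.

```python
from typing import Iterable, Set, Tuple

Ballot = Tuple[int, str]  # (voting unit id, proposal id); hypothetical encoding


def units_cast_for(ballots: Iterable[Ballot], proposal: str) -> Set[int]:
    """Distinct voting units cast for `proposal`; repeated ballots count once."""
    return {unit for unit, voted_for in ballots if voted_for == proposal}


def meets_two_thirds(ballots: Iterable[Ballot], proposal: str, total_units: int) -> bool:
    """True iff at least two thirds of all voting units were cast for `proposal`.

    The comparison is against the total for the epoch (an absolute
    supermajority), and uses integer arithmetic to avoid rounding.
    """
    cast = len(units_cast_for(ballots, proposal))
    return 3 * cast >= 2 * total_units
```

Collecting voting units into a set is what prevents several ballot messages for the same unit from being counted more than once toward the threshold.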
++
The one‑third bound on non‑honest voting property considers all ballots cast in the entire execution. In particular, it is possible that a validator’s key is compromised and then used to cast its voting units for a proposal of an epoch long finished. If the number of voting units cast non-honestly for any epoch ever reaches one third of the total voting units for that epoch during an execution, then the one‑third bound on non‑honest voting property is violated for that execution.
+Therefore, validator keys of honest nodes must remain secret indefinitely. Whenever a key is rotated, the old key must be securely deleted. For further discussion and potential improvements, see tfl-book issue #140.
+A bft‑block consists of $(P, \mathsf{proof}_P)$ re‑signed by the same proposer using a strongly unforgeable signature scheme. It is bft‑block‑valid iff:
+A bft‑proposal’s parent reference hashes the entire parent bft‑block, i.e. proposal, proof, and outer signature.
+Neither $\mathsf{proof}_P$ nor the proposer’s outer signature is unique for a given $P$. The proposer’s outer signature is however third‑party nonmalleable, by definition of a strongly unforgeable signature scheme. An “honest bft‑proposal” is a bft‑proposal made for a given epoch by a proposer who is honest in that epoch. Such a proposer will create only one proposal and will sign at most once for each epoch, and so there will be at most one “honestly submitted” bft‑block for each epoch.
+It is possible for there to be multiple bft‑valid‑blocks for the same proposal, with different notarization proofs and/or outer signatures, if the proposer is not honest. However, the property that there will be at most one “honestly submitted” bft‑block for each epoch is important for liveness, even though we cannot guarantee that any particular proposer for an epoch is honest.
+check that we are correctly using this in the liveness analysis.
+There is an efficiently computable function $\textsf{bft-last-final} :: \text{bft-block} \to \text{bft-block} \cup \{\bot\}$. For a bft‑block‑valid input block $C$, this function outputs the last ancestor of $C$ that is final in the context of $C$.
+The chain of ancestors is unambiguously determined because a bft‑proposal’s parent reference hashes the entire parent bft‑block; each bft‑block commits to a proposal; and the parent hashes are collision‑resistant. This holds despite the caveat mentioned above that there may be multiple bft‑valid‑blocks for the same proposal.
+The last‑final function $\textsf{bft-last-final}$ must satisfy all of the following:
+It is correct to talk about the “last final block” of a given chain (that is, each bft‑valid‑block $C$ unambiguously determines a bft‑valid‑block $\textsf{bft-last-final}(C)$), but it is not correct to refer to a given bft‑block as objectively “bft‑final”.
+A particular BFT protocol might need adaptations to fit it into this model for $\Pi_{\mathrm{origbft}}$, before we apply the Crosslink 2 modifications to obtain $\Pi_{\mathrm{bft}}$. Any such adaptations are necessarily protocol‑specific. In particular:
+The intuition behind the following safety property is that:
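+We say that a bft‑block is “in honest view” if a party observes it at some time at which that party is honest.
+Definition (Final Agreement).
An execution of $\Pi_{\mathrm{bft}}$ has Final Agreement iff for all bft‑valid blocks $C$ in honest view at time $t$ and $C'$ in honest view at time $t'$, we have $\textsf{bft-last-final}(C) \preceq\hspace{-0.5em}\succeq_{\mathrm{bft}} \textsf{bft-last-final}(C')$, where $\textsf{bft-last-final}(C)$ denotes the last final block in the context of $C$; that is, one of the two last final blocks is a prefix of the other.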
+Note that it is possible for this property to hold for an execution of a BFT protocol in an asynchronous communication model. As previously mentioned, if the one‑third bound on non‑honest voting property is ever broken at any time in an execution, then it may not be possible to maintain Final Agreement from that point on.
+Adapting the Streamlet BFT protocol.
+ +Streamlet as described in [CS2020] has three possible states of a block in a player’s view:
+By “valid” the Streamlet paper means just that it satisfies the structural property of being part of a block chain with parent hashes. The role of bft‑block‑validity in our model corresponds roughly to Streamlet’s “notarized”. It turns out that with some straightforward changes relative to Streamlet, we can identify “origbft‑block‑valid” with “notarized” and consider an origbft‑valid‑chain to only consist of notarized blocks. This is not obvious, but is a useful simplification.
+Here is how the paper defines “notarized”:
+> When a block gains votes from at least $2n/3$ distinct players, it becomes notarized. A chain is notarized if its constituent blocks are all notarized.
+
This implies that blocks can be added to chains independently of notarization. However, the paper also says that an honest leader always proposes a block extending from a notarized chain. Therefore, only notarized chains really matter in the protocol.
+In unmodified Streamlet, the order in which a player sees signatures might cause it to view blocks as notarized out of order. Streamlet’s security analysis is in a synchronous model, and assumes for liveness that any vote will have been received by all players (Streamlet nodes) within two epochs.
+In Crosslink 2, however, we need origbft‑block‑validity to be an objectively and feasibly verifiable property. We also would prefer reliable message delivery within bounded time not to be a basic assumption of our communication model. (This does not dictate what assumptions about message delivery are made for particular security analyses.) If we did not make a modification to the protocol to take this into account, then some Crosslink 2 nodes might receive a two‑thirds absolute supermajority of voting messages and consider a BFT block to be notarized, while others might never receive enough of those messages.
+Obviously a proposal cannot include signatures on itself, but the block formed from it can include proofs about the proposal and signatures. We can therefore say that when a proposal $P$ gains a two‑thirds absolute supermajority of signatures, a block is created from it that contains a proof $\mathsf{proof}_P$ (such as an aggregate signature) that it had such a supermajority. For example, we can have the proposer itself make this proof once it has enough votes, sign the resulting $(P, \mathsf{proof}_P)$ to create a block, then submit that block in a separate message. (The proposer has most incentive to do this in order to gain whatever reward attaches to a successful proposal; it can outsource the proving task if needed.) Then the origbft‑block‑validity rule can require a valid supermajority proof, which is objectively and feasibly verifiable. Players that see an origbft‑valid‑block can immediately consider it notarized.
+Note that for the liveness analysis to be unaffected, we need to assume that the combined latency of messages, of collecting and aggregating signatures, and of block submission is such that all adapted‑Streamlet nodes will receive a notarized block corresponding to a given proposal (rather than just all of the votes for the proposal) within two epochs. Alternatively we could re‑do the timing analysis.
+With this change, “origbft‑block‑valid” and “notarized” do not need to be distinguished.
+Streamlet’s finality rule is:
+> If in any notarized chain, there are three adjacent blocks with consecutive epoch numbers, the prefix of the chain up to the second of the three blocks is considered final. When a block becomes final, all of its prefix must be final too.
+
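+We can straightforwardly express this as an $\textsf{origbft-last-final}$ function of a context block $C$, as required by the model:
+For an origbft‑valid‑block $C$, $\textsf{origbft-last-final}(C)$ is the last origbft‑valid‑block $B \preceq_{\mathrm{origbft}} C$ (that is, the last $B$ on the chain ending at $C$) such that either $B = \mathcal{O}_{\mathrm{origbft}}$ or $B$ is the second block of a group of three adjacent blocks with consecutive epoch numbers.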
+Note that “When a block becomes final, all of its prefix must be final too.” is implicit in the model.
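The finality rule above, read as a function of a context chain, can be sketched as follows. The encoding is an assumption made for illustration: a notarized chain is given as the list of its blocks' epoch numbers in chain order, with the genesis block at index 0 and assumed here to carry epoch number 0. The function name `last_final_index` is hypothetical.

```python
from typing import Sequence


def last_final_index(epochs: Sequence[int]) -> int:
    """Index of the last final block in a notarized chain.

    `epochs[i]` is the epoch number of the i-th block, `epochs[0]` being
    genesis. A block is final if it is genesis, or if it is the second of
    three adjacent blocks with consecutive epoch numbers. The last such
    block is returned; everything before it is final as well.
    """
    if not epochs:
        raise ValueError("a chain is a nonempty sequence of blocks")
    last = 0  # genesis is always final
    for i in range(1, len(epochs) - 1):
        if epochs[i] == epochs[i - 1] + 1 and epochs[i + 1] == epochs[i] + 1:
            last = i
    return last
```

For example, in a chain with epochs `[0, 1, 3, 4, 5]` the blocks at epochs 3, 4, 5 are adjacent and consecutive, so the block at index 3 (epoch 4) is the last final block in that context.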
+A node’s view in $\Pi_{\mathrm{bc}}$ includes a set of bc‑block chains each rooted at a fixed genesis bc‑block $\mathcal{O}_{\mathrm{bc}}$. There is a bc‑block‑validity rule (often described as a collection of “consensus rules”), depending only on the content of the block and its ancestors. A non‑genesis block can only be bc‑block‑valid if its parent is bc‑block‑valid. By “bc‑valid‑chain” we mean a chain of bc‑block‑valid blocks.
+The terminology commonly used in the block‑chain community does not distinguish between rules that are part of the consensus protocol proper, and rules required for validity of the economic computation supported by the block chain. Where it is necessary to distinguish, the former can be called “L0” consensus rules, and the latter “L1” consensus rules.
+The definition of bc‑block‑validity is such that it is hard for a block producer to extend a bc‑valid‑chain unless they are selected by a random process that chooses a block producer in proportion to their resources with an approximately known and consistent time distribution, subject to some assumption about the total proportion of resources held by honest nodes.
+There is a function $\mathsf{score} :: \text{bc-valid-chain} \to \mathsf{Score}$, with $<$ a strict total ordering on $\mathsf{Score}$. An honest node will choose one of the bc‑valid‑chains with highest score as the bc‑best‑chain in its view. Any rule can be specified for breaking ties.
+The $\mathsf{score}$ function is required to satisfy $\mathsf{score}(H \lceil_{\mathrm{bc}}^{1}) < \mathsf{score}(H)$ for any non‑genesis bc‑valid‑chain $H$.
+For a Proof‑of‑Work protocol, the score of a bc‑chain should be its accumulated work.
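A minimal sketch of best‑chain selection by accumulated work is shown below. The encoding is an assumption for illustration only: a chain is the list of per‑block work values with genesis first, the tie‑break is “first chain seen” (the text allows any rule), and the function names are hypothetical.

```python
from typing import List, Sequence

Chain = Sequence[int]  # per-block work values; chain[0] is the genesis block


def score(chain: Chain) -> int:
    """Accumulated work of the chain."""
    return sum(chain)


def best_chain(chains: List[Chain]) -> Chain:
    """Pick a chain with the highest score; ties go to the first one seen."""
    if not chains:
        raise ValueError("no candidate chains")
    return max(chains, key=score)  # max() returns the first maximal element


def score_strictly_increases(chain: Chain) -> bool:
    """Check that each non-genesis prefix scores higher than its parent chain."""
    return all(
        score(chain[:n]) < score(chain[:n + 1])
        for n in range(1, len(chain))
    )
```

The requirement that a chain scores strictly higher than its parent holds here whenever every non‑genesis block contributes positive work. The example in the test data also shows that the chain with the most blocks need not be the one with the highest score.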
+Unless an adversary is able to censor knowledge of other chains from a node’s view, it should be difficult to cause the node to switch to a chain with a last common ancestor more than $\sigma$ blocks back from the tip of its previous bc‑best‑chain.
+Let $\mathsf{ch}$ be a view such that $\mathsf{ch}_i^t$ is node $i$’s bc‑best‑chain at time $t$. (This matches the notation used in [NTT2020].) We define to be .
+A bc‑valid‑block is assumed to commit to a collection (usually, a sequence) of bc‑transactions. Unlike in Crosslink 1 or Snap-and-Chat, we do not need to explicitly model bc‑transaction validity or impose any additional constraints on it. The consensus rules applying to bc‑transactions are entirely unchanged, including any rules that depend on bc‑block height or previous bc‑blocks. This is because Crosslink 2 never reorders or selectively “sanitizes” transactions as Snap-and-Chat does. If a bc‑block is included in a Crosslink 2 block chain then its entire parent bc‑block chain is included just as it would have been in $\Pi_{\mathrm{origbc}}$ (only modified by the structural additions described later), so block heights are also preserved.
+A “coinbase transaction” is a bc‑transaction that only distributes newly issued funds and has no inputs.
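+Define $\textsf{is-coinbase-only} :: \text{bc-block} \to \mathsf{boolean}$ so that $\textsf{is-coinbase-only}(B) = \mathsf{true}$ iff $B$ has exactly one transaction that is a coinbase transaction.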
+Each bc‑block is summarized by a bc‑header that commits to the block. There is a notion of bc‑header‑validity that is necessary, but not sufficient, for validity of the block. We will only make the distinction between bc‑headers and bc‑blocks when it is necessary to avoid ambiguity.
+Header validity for Proof‑of‑Work protocols.
+ +In a Proof‑of‑Work protocol, it is normally possible to check the Proof‑of‑Work of a block using only the header. There is a difficulty adjustment function that determines the target difficulty for a block based on its parent chain. So, checking that the correct difficulty target has been used relies on knowing that the header’s parent chain is valid.
+Checking header validity before expending further resources on a purported block can be relevant to mitigating denial‑of‑service attacks that attempt to inflate validation cost.
+Typically, Bitcoin‑derived best chain protocols do not need much adaptation to fit into this model. The model still omits some details that would be important to implementing Crosslink 2, but distracting for this level of abstraction.
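+We make an assumption on executions of $\Pi_{\mathrm{bc}}$ that we will call Prefix Consistency (introduced in [PSS2016, section 3.3] as just “consistency”):
+Definition (Prefix Consistency).
An execution of $\Pi_{\mathrm{bc}}$ has Prefix Consistency at confirmation depth $\sigma$, iff for all times $t \leq u$ and all nodes $i$, $j$ (potentially the same) such that $i$ is honest at time $t$ and $j$ is honest at time $u$, we have that $\mathsf{ch}_i^t \lceil_{\mathrm{bc}}^{\sigma} \preceq_{\mathrm{bc}} \mathsf{ch}_j^u$; that is, node $i$’s best chain at time $t$ with its last $\sigma$ blocks pruned is a prefix of node $j$’s best chain at time $u$.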
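Prefix Consistency can likewise be checked on a finite record of an execution. The sketch below assumes a hypothetical encoding in which the recorded best chains are a dictionary from `(node, time)` to a list of block identifiers starting at genesis, `is_honest(node, time)` reports honesty, and `sigma` is the confirmation depth. It checks only recorded samples and is an illustration of the definition rather than an analysis tool.

```python
from typing import Callable, Dict, List, Tuple

Chain = List[str]
BestChains = Dict[Tuple[str, int], Chain]  # (node, time) -> best chain


def truncate(chain: Chain, k: int) -> Chain:
    """Prune the last k blocks, but never the genesis block."""
    return chain[:max(len(chain) - k, 1)]


def is_prefix(b: Chain, c: Chain) -> bool:
    """True iff chain b is a prefix of chain c."""
    return len(b) <= len(c) and c[:len(b)] == b


def has_prefix_consistency(
    ch: BestChains, is_honest: Callable[[str, int], bool], sigma: int
) -> bool:
    """For all honest samples (i, t) and (j, u) with t <= u, require that
    i's sigma-truncated chain at t is a prefix of j's chain at u."""
    for (i, t), chain_i in ch.items():
        for (j, u), chain_j in ch.items():
            if t <= u and is_honest(i, t) and is_honest(j, u):
                if not is_prefix(truncate(chain_i, sigma), chain_j):
                    return False
    return True
```

Because the pair `(i, t)`, `(j, u)` may involve the same node at two different times, this check also covers consistency of a node with its own “future self”.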
+Explain the confusion in the literature about what variants of this property are called.
+ +The literature uses the same name, “common‑prefix property”, for two different properties of very different strength.
+[PSS2016, section 3.3] introduced the stronger variant. That paper first describes the weaker variant, calling it the “common‑prefix property by Garay et al [GKL2015].” Then it explains what is essentially a bug in that variant, and describes the stronger variant which it just calls “consistency”:
+> The common‑prefix property by Garay et al [GKL2015], which was already considered and studied by Nakamoto [Nakamoto2008], requires that in any round $r$, the record chains of any two honest players $i$, $j$ agree on all, but potentially the last $T$, records. We note that this property (even in combination with the other two desiderata [of Chain Growth and Chain Quality]) provides quite weak guarantees: even if any two honest parties perfectly agree on the chains, the chain could be completely different on, say, even rounds and odd rounds. We here consider a stronger notion of consistency which additionally stipulates players should be consistent with their “future selves”.
+> Let $\mathsf{consistent}^T(\mathsf{view}) = 1$ iff for all rounds $r \leq r'$, and all players $i$, $j$ (potentially the same) such that $i$ is honest at $\mathsf{view}^r$ and $j$ is honest at $\mathsf{view}^{r'}$, we have that the prefixes of $\mathcal{C}_i^r(\mathsf{view})$ and $\mathcal{C}_j^{r'}(\mathsf{view})$ consisting of the first $\ell = |\mathcal{C}_i^r(\mathsf{view})| - T$ records are identical.
+
Unfortunately, [GKL2020], which is a revised version of [GKL2015], switches to the stronger variant without changing the name.
+(The eprint version history may be useful; the change was made in version 20181013:200033, page 17.)
+Note that [GKL2020] uses an adaptive‑corruption model, “meaning that the adversary is allowed to take control of parties on the fly”, and so their wording in Definition 3:
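+> ... for any pair of honest players $P_1$, $P_2$ adopting the chains $\mathcal{C}_1$, $\mathcal{C}_2$ at rounds $r_1 \leq r_2$ in view $\mathrm{VIEW}^{t,n}_{\Pi,\mathcal{A},\mathcal{Z}}$ respectively, it holds that $\mathcal{C}_1^{\lceil k} \preceq \mathcal{C}_2$.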
+
is intended to mean the same as our
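+> ... for all times $t \leq u$ and all nodes $i$, $j$ (potentially the same) such that $i$ is honest at time $t$ and $j$ is honest at time $u$, we have that $\mathsf{ch}_i^t \lceil_{\mathrm{bc}}^{\sigma} \preceq_{\mathrm{bc}} \mathsf{ch}_j^u$.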
+
The latter is closer to [PSS2016].
+Incidentally, this property does not seem to be mentioned in [Nakamoto2008], contrary to the [PSS2016] authors’ assertion. Maybe implicitly, but it’s a stretch.
+Discussion of [GKL2020]’s communication model and network partition.
+When Prefix Consistency is taken to hold of typical PoW-based block‑chain protocols like Bitcoin (as it often is), this implies that, in the relevant executions, the network of honest nodes is never partitioned, unless any partition lasts only for a short length of time relative to block times. If node $i$ is on one side of a full partition and node $j$ on the other, then after node $i$’s best chain has been extended by more than $\sigma$ blocks, $\mathsf{ch}_i^t \lceil_{\mathrm{bc}}^{\sigma}$ will contain information that has no way to get to node $j$. And even if the partition is incomplete, we cannot guarantee that the Prefix Consistency property will hold for any given pair of nodes.
+It might be possible to maintain Prefix Consistency if the honest nodes on one side of the partition knew that they should not continue building on their chain until the partition has healed, but it is unclear how that would be done in general without resorting to a BFT protocol (as opposed to in specific cases like a single node being unable to connect to the rest of the network). Certainly there is no mechanism to explicitly detect and respond to partitions in protocols derived from Bitcoin.
+And yet, [GKL2020] claims to prove Prefix Consistency from other assumptions. So we know that those assumptions must also rule out a long partition between honest nodes. In fact the required assumption is implicit in the communication model:
+We might be concerned that these implicit assumptions are stronger than we would like. In practice, the peer‑to‑peer network protocol of Bitcoin and Zcash attempts to flood blocks to all nodes. This protocol might have weaknesses, but it is not intended to (and plausibly does not) depend on all messages being received. (Incidentally, Streamlet also implicitly floods messages to all nodes.)
+Also, Streamlet and many other BFT protocols do not assume for safety that the network is not partitioned. That is, BFT protocols can be safe in a fully asynchronous communication model with unreliable messaging. That is why we avoid taking synchrony or partial synchrony as an implicit assumption of the communication model, or else we could end up with a protocol with weaker safety properties than alone.
+This leaves the question of whether the Prefix Consistency property is still too strong, even if we do not rely on it for the analysis of safety when $\Pi_{\mathrm{bft}}$ has not been subverted. In particular, if a particular node $h$ is not well-connected to the rest of the network, then that will inevitably affect node $h$’s security, but should not affect other honest nodes’ security.
+Fortunately, it is not the case that disconnecting a single node $h$ from the network causes the security assumption to be voided. The solution is to view $h$ as not honest in that case (even though it would follow the protocol if it could). This achieves the desired effect within the model, because other nodes can no longer rely on $h$’s honest input. Although viewing $h$ as potentially adversarial might seem conservative from the point of view of other nodes, bear in mind that an adversary could censor an arbitrary subset of incoming and outgoing messages from the node, and this may be best modelled by considering it to be effectively controlled by the adversary.
+Prefix Consistency compares the $\sigma$‑truncated chain of some node $i$ with the untruncated chain of node $j$. For our analysis of safety of the derived ledgers, we will also need to make an assumption on executions of $\Pi_{\mathrm{bc}}$ that at any given time $t$, any two honest nodes $i$ and $j$ agree on their confirmed prefixes, with only the caveat that one may have observed more of the chain than the other. That is:
++
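An execution of $\Pi_{\mathrm{bc}}$ has Prefix Agreement at confirmation depth $\sigma$ iff it has Agreement on the view $(i, t) \mapsto \mathsf{ch}_i^t \lceil_{\mathrm{bc}}^{\sigma}$, that is, on each node’s $\sigma$‑truncated best chain.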
+Why are this property, and Prefix Consistency above, stated as unconditional properties of protocol executions, rather than as probabilistic assumptions?
+ +Our security arguments that depend on these properties will all be of the form “in an execution where ⟨safety properties⟩ are not violated, ⟨undesirable thing⟩ cannot happen”.
+It is not necessary to involve probability in arguments of this form. Any probabilistic reasoning can be done separately.
+In particular, if a statement of this form holds, and ⟨safety properties⟩ are violated with probability at most $p$ under certain conditions, then it immediately follows that under those conditions ⟨undesirable thing⟩ happens with probability at most $p$. Furthermore, ⟨undesirable thing⟩ can only happen after ⟨safety properties⟩ have been violated, because the execution up to that point has been an execution in which ⟨safety properties⟩ are not violated.
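The step from the unconditional statement to the probabilistic one can be written out in one line. In this sketch, $S$ (the event that the safety properties hold throughout the execution), $U$ (the event that the undesirable thing happens) and the bound $p$ are names introduced only for illustration.

```latex
\begin{align*}
U \subseteq \neg S
  &\quad\text{(the unconditional statement: in an execution where $S$ holds, $U$ cannot happen)} \\
\Pr[U] \;\le\; \Pr[\neg S] \;\le\; p
  &\quad\text{(monotonicity of probability, then the assumed bound on violating $S$)}
\end{align*}
```

No property of the protocol beyond the inclusion $U \subseteq \neg S$ is used, which is why the probability simply passes through the argument.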
+With few exceptions, involving probability in a security argument is best done only to account for nondeterministic choices in the protocol itself. This is opinionated advice, but a lot of security proofs would likely be simpler if inherently probabilistic arguments were more distinctly separated from unconditional ones.
+In the case of the Prefix Agreement property, an alternative approach would be to prove that Prefix Agreement holds with some probability given Prefix Consistency and some other chain properties. This is what [NTT2020] does in its Theorem 2, which essentially says that under certain conditions Prefix Agreement holds except with probability .
+The conclusions that can be obtained from this approach are necessarily probabilistic, and depending on the techniques used, the proof may not be tight; that is, the proof may obtain a bound on the probability of failure that is (either asymptotically or concretely) higher than needed. This is the case for [NTT2020, Theorem 2]; footnote 10 in that paper points out that the expression for the probability can be asymptotically improved:
+++Using the recursive bootstrapping argument developed in [DKT+2020, Section 4.2], it is possible to bring the error probability as close to an exponential decay as possible. In this context, for any , it is possible to find constants , such that is secure after +C with confirmation time except with probability .
+
(Here is the probability that any given node gets to produce a block in any given time slot.)
+In fact none of the proofs of security properties for Snap‑and‑Chat depend on the particular expression ; for example in the proofs of Lemma 5 and Theorem 1, this probability just “passes through” the proof from the premisses to the conclusion, because the argument is not probabilistic. The same will be true of our safety arguments.
+Talking about what is possible in particular executions has further advantages:
+Why, intuitively, should we believe that Prefix Agreement and Prefix Consistency for a large enough confirmation depth hold with high probability for executions of a PoW‑based best‑chain protocol?
+ +Roughly speaking, the intuition behind both properties is as follows:
+Honest nodes are collectively able to find blocks faster than an adversary, and communication between honest nodes is sufficiently reliable that they act as a combined network racing against that adversary. Then by the argument in [Nakamoto2008], modified by [GP2020] to correct an error in the concrete analysis, a private mining attack that attempts to cause a $\sigma$‑block rollback will, with high probability, fail for large enough $\sigma$. A private mining attack is optimal by the argument in [DKT+2020].
+Any further analysis of the conditions under which these properties hold should be done in the context of a particular .
+Why is the quantification in Prefix Agreement over two different times t and t′?
+ +This strengthens the security property, relative to quantifying over a single time. The question can then be split into several parts:
+Crosslink 2 is parameterized by a bc‑confirmation‑depth $\sigma$ (as in Snap‑and‑Chat), and also a finalization gap bound $L$ with $L$ significantly greater than $\sigma$.
+Each node $i$ always uses the fixed confirmation depth $\sigma$ to obtain its view of the finalized chain $\mathsf{fin}_i^t$. Unlike in Snap‑and‑Chat or Crosslink 1, this is just a block chain; because we do not need sanitization, there is no need to express it as a log of transactions rather than blocks.
+Each node $i$ chooses a potentially different bc‑confirmation‑depth $\mu$ where $0 \le \mu \le \sigma$ to obtain its view of the bounded‑available ledger at time $t$, $(\mathsf{ba}_\mu)_i^t$. (We make the restriction $\mu \le \sigma$ because there is no reason to choose a larger $\mu$.)
+Choosing $\mu < \sigma$ is at the node’s own risk and may increase the risk of rollback attacks against $(\mathsf{ba}_\mu)_i^t$ (it does not affect $\mathsf{fin}_i^t$). Using small values of $\mu$ is not recommended. The default should be $\mu = \sigma$.
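The constraints on the parameters can be sketched as a small validation helper. This is only an illustration: the names `CrosslinkParams`, `sigma`, `gap_bound` and `effective_depth` are hypothetical, and the defaulting of the node-chosen depth to the fixed confirmation depth is an assumption read from the paragraph above rather than a specified API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CrosslinkParams:
    sigma: int      # fixed bc-confirmation-depth used for the finalized chain
    gap_bound: int  # finalization gap bound (hypothetical field name)

    def __post_init__(self) -> None:
        if self.sigma < 1 or self.gap_bound < 1:
            raise ValueError("both parameters must be positive")
        # The text only says "significantly greater"; this checks the weakest form.
        if self.gap_bound <= self.sigma:
            raise ValueError("gap bound must be greater than sigma")


def effective_depth(params: CrosslinkParams, mu: Optional[int] = None) -> int:
    """Depth a node uses for its bounded-available view (assumed default: sigma)."""
    if mu is None:
        return params.sigma
    if not 0 <= mu <= params.sigma:
        raise ValueError("node-chosen depth must lie between 0 and sigma")
    return mu
```

A node that wants lower latency at its own risk would pass a smaller `mu`; the finalized view is unaffected by that choice.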
+Consider, roughly speaking, the number of bc‑blocks that are not yet finalized at time $t$ (a more precise definition will be given in the section on $\Pi_{\mathrm{bc}}$ changes from $\Pi_{\mathrm{origbc}}$). We call this the “finality gap” at time $t$. Under an assumption about the distribution of bc‑block intervals, if this gap stays roughly constant then it corresponds to the approximate time that transactions take to be finalized after being included in a bc‑block (if they are finalized at all) just prior to time $t$.
+As explained in detail by The Arguments for Bounded Availability and Finality Overrides, if this gap exceeds a threshold $L$, then it likely signals an exceptional or emergency condition, in which it is undesirable to keep accepting user transactions that spend funds into new bc‑blocks. In practice, $L$ should be at least $2\sigma$.
+The condition that the network enters in such cases will be called “Stalled Mode”. For a given higher‑level transaction protocol, we can define a policy for which bc‑blocks will be accepted in Stalled Mode. This will be modelled by a predicate $\textsf{isStalledBlock}$ on bc‑blocks. A bc‑block for which $\textsf{isStalledBlock}$ returns $\mathsf{true}$ is called a “stalled block”.
+A bc‑block producer is only constrained to produce stalled blocks while, roughly speaking, its view of the finalization point is not advancing. In particular an adversary that has subverted the BFT protocol in a way that does not keep the finalization point from advancing, can always avoid being constrained by Stalled Mode.
+The desired properties of stalled blocks and a possible Stalled Mode policy for Zcash are discussed in the How to block hazards section of The Arguments for Bounded Availability and Finality Overrides.
+In practice a node's view of the finalized chain, $\mathsf{fin}_i^t$, is likely to lag only a few blocks behind $(\mathsf{ba}_\sigma)_i^t$ (depending on the latency overhead imposed by $\Pi_{\mathrm{bft}}$), unless the chain has entered Stalled Mode. So when $\mu = \sigma$, the main factor influencing the choice of a given application to use $\mathsf{fin}_i^t$ or $(\mathsf{ba}_\mu)_i^t$ is not the average latency, but rather the desired behaviour in the case of a finalization stall: i.e. stall immediately, or keep processing user transactions until $L$ blocks have passed.
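+For a bft‑block or bft‑proposal $B$, define
+$$\mathsf{snapshot}(B) := \begin{cases} \mathcal{O}_{\mathrm{bc}}, & \text{if } B.\mathsf{headers\_bc} = \mathsf{null} \\ B.\mathsf{headers\_bc}[0] \lceil_{\mathrm{bc}}^{1}, & \text{otherwise} \end{cases}$$
+where $\mathcal{O}_{\mathrm{bc}}$ is the genesis bc‑block.
+For a bc‑block $H$, define
+$$\mathsf{LF}(H) := \textsf{bft-last-final}(H.\mathsf{context\_bft})$$
+$$\mathsf{candidate}(H) := \mathsf{lastcommon}(\mathsf{snapshot}(\mathsf{LF}(H)),\ H \lceil_{\mathrm{bc}}^{\sigma})$$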
+When $H$ is the tip of a node’s bc‑best‑chain, $\mathsf{candidate}(H)$ will give the candidate finalization point, subject to a condition described below that prevents local rollbacks.
+Use of the headers_bc field, and its relation to the ch field in Snap‑and‑Chat.
+For a bft‑proposal or bft‑block $B$, the role of the bc‑chain snapshot referenced by $B.\mathsf{headers\_bc}$ is comparable to the snapshot referenced by $B.\mathsf{ch}$ in the Snap‑and‑Chat construction from [NTT2020]. The motivation for the additional headers is to demonstrate, to any party that sees a bft‑proposal (resp. bft‑block), that the snapshot had been confirmed when the proposal (resp. the block’s proposal) was made.
+Typically, a node that is validating an honest bft‑proposal or bft‑block will have seen at least the snapshotted bc‑block (and possibly some of the subsequent bc‑blocks in the chain) before. For this not to be the case, the validator’s bc‑best‑chain would have to be more than $\sigma$ bc‑blocks behind the honest proposer’s bc‑best‑chain at a given time, which would violate the Prefix Consistency property of $\Pi_{\mathrm{bc}}$.
+If the headers do not connect to any bc‑valid‑chain known to the validator, then the validator should be suspicious that the proposer might not be honest. It can assign a lower priority to validating the proposal in this case, or simply drop it. The latter option could drop a valid proposal, but this does not in practice cause a problem as long as a sufficient number of validators are properly synced (so that Prefix Consistency holds for them).
+If the headers do connect to a known bc‑valid‑chain, it could still be the case that the whole header chain up to and including the last of these headers is not a bc‑valid‑chain. Therefore, to limit denial‑of‑service attacks the validator should first check the Proofs‑of‑Work and difficulty adjustment (which it can do locally using only the headers) before attempting to download and validate any bc‑blocks that it has not already seen. This is why we include the full headers rather than just the block hashes. Nodes may “trim” (i.e. not explicitly store) headers in a bft‑block that overlap with those referred to by its ancestor bft‑block(s).
+Why is a distinguished value needed for the headers_bc field in the genesis bft‑block?
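+It would be conceptually nice for the $\mathsf{headers\_bc}$ field of the genesis bft‑block $\mathcal{O}_{\mathrm{bft}}$ to refer to the genesis bc‑block $\mathcal{O}_{\mathrm{bc}}$, as well as $\mathcal{O}_{\mathrm{bc}}.\mathsf{context\_bft}$ being $\mathcal{O}_{\mathrm{bft}}$ so that $\mathsf{LF}(\mathcal{O}_{\mathrm{bc}}) = \mathcal{O}_{\mathrm{bft}}$. That reflects the fact that we know “from the start” that neither genesis block can be rolled back.
+This is not literally implementable using block hashes because it would involve a hash cycle, but we achieve the same effect by defining a $\mathsf{snapshot}$ function that allows us to “patch” $\mathsf{snapshot}(\mathcal{O}_{\mathrm{bft}})$ to be $\mathcal{O}_{\mathrm{bc}}$. We do it this way around rather than “patching” the link from a bc‑block to a bft‑block, because the genesis bft‑block already needs a special case since there are not $\sigma$ bc‑headers available.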
+Why is the context_bft field needed? Why not use a final_bft field to refer directly to the last final bft‑block before the context block?
+The finality of some bft‑block is only defined in the context of another bft‑block. One possible design would be for a bc‑block to have both $\mathsf{final\_bft}$ and $\mathsf{context\_bft}$ fields, so that the finality of $\mathsf{final\_bft}$ could be checked objectively in the context of $\mathsf{context\_bft}$.
+However, specifying just the context block is sufficient information to determine its last final ancestor. There would never be any need to give a context block and a final ancestor that is not the last one. The $\textsf{bft-last-final}$ function can be computed efficiently for typical BFT protocols. Therefore, having just the $\mathsf{context\_bft}$ field is sufficient.
+Each node $i$ keeps track of a “locally finalized” bc‑chain $\mathsf{fin}_i^t$ at time $t$. Each node’s locally finalized bc‑chain starts at the genesis bc‑block. However, this chain state should not be exposed to clients of the node until it has synced.
++
Node $i$ has Local finalization linearity up to time $t$ iff the time series of bc‑blocks $\mathsf{fin}_i^{r}$ for $r \le t$ is bc‑linear.
+When node $i$’s bc‑best‑chain view is updated from $\mathsf{ch}_i^s$ to $\mathsf{ch}_i^t$, the node’s $\mathsf{fin}_i^t$ will become $\mathsf{candidate}(\mathsf{ch}_i^t)$ if and only if this is a descendant of $\mathsf{fin}_i^s$. Otherwise $\mathsf{fin}_i^t$ will stay at $\mathsf{fin}_i^s$. This guarantees Local finalization linearity by construction.
+If when making this update, $\mathsf{candidate}(\mathsf{ch}_i^t)$ conflicts with $\mathsf{fin}_i^s$ (i.e. the two are on different forks), then the node should record a finalization safety hazard. This can only happen if global safety assumptions are violated. Note that Local finalization linearity on each node is not sufficient for Assured Finality, but it is necessary.
+This can be expressed by the following state update algorithm, where $s$ is the time of the last update and $t$ is the time of the current update:
++
A safety hazard record should include $\mathsf{ch}_i^t$ and the history of $\mathsf{fin}_i$ updates including and since the last one that was an ancestor of $\mathsf{candidate}(\mathsf{ch}_i^t)$.
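The update described above can be sketched as follows. This is a minimal illustration, not the specified algorithm: chains are modelled as tuples of block identifiers starting at genesis, the candidate finalization point is passed in already computed, and the names `is_prefix`, `update_fin` and `hazards` are hypothetical.

```python
from typing import List, Tuple

Chain = Tuple[str, ...]  # block identifiers, genesis first (modelling assumption)


def is_prefix(a: Chain, b: Chain) -> bool:
    """True iff chain a is an ancestor of (or equal to) chain b."""
    return len(a) <= len(b) and b[:len(a)] == a


def update_fin(fin_prev: Chain, candidate: Chain,
               hazards: List[Tuple[Chain, Chain]]) -> Chain:
    """Return the new locally finalized chain after a best-chain update."""
    if is_prefix(fin_prev, candidate):
        # The candidate extends (or equals) what was already finalized.
        return candidate
    if not is_prefix(candidate, fin_prev):
        # Neither is an ancestor of the other: they are on different forks.
        hazards.append((fin_prev, candidate))
    # Never roll back: keep the previous locally finalized chain.
    return fin_prev
```

Because the function only ever returns `fin_prev` or one of its descendants, the sequence of returned values is linear by construction, which mirrors the Local finalization linearity property.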
++
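In any execution of Crosslink 2, for any node $i$ that is honest at time $t$, there exists a time $r \le t$ such that $\mathsf{fin}_i^t \preceq_{\mathrm{bc}} \mathsf{ch}_i^r \lceil_{\mathrm{bc}}^{\sigma}$.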
+Proof: By the definition of we have for all times . Let be the last time at which changed, or the genesis time if it has never changed. Then for we have , and for we have (because , and truncating always yields ).
+Why does $\mathsf{fin}_i$ need to be maintained using local state?
+ +When a node’s view of the bc‑best‑chain reorgs to a different fork (even if the reorg is shorter than blocks), it may be the case that rolls back. If Final Agreement of holds up to time , the new snapshot should in that case be an ancestor of the old one. If all is well then this snapshot will subsequently roll forward along the same path. However, we do not want applications using the node to see the temporary rollback.
++Assured Finality is our main safety goal for Crosslink 2. It is essentially the same goal as Final Agreement but applied to nodes’ locally finalized bc‑chains; intuitively it means that honest nodes never see conflicting locally finalized chains. We intend to prove that this goal holds under reasonable assumptions about either or .
+An execution of Crosslink 2 has Assured Finality iff for all times $t$, $u$ and all nodes $i$, $j$ (potentially the same) such that $i$ is honest at time $t$ and $j$ is honest at time $u$, the chains $\mathsf{fin}_i^t$ and $\mathsf{fin}_j^u$ do not conflict (one is an ancestor of, or equal to, the other).
+Note that if an execution of Crosslink 2 has Assured Finality, then all nodes that are honest for that execution have Local finalization linearity. That is because the restriction of Assured Finality to the case $i = j$ is equivalent to Local finalization linearity for node $i$ up to any time at which node $i$ is honest.
+Why do we need to use candidate(H) rather than snapshot(LF(H))?
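+This ensures that the candidate is at least $\sigma$‑confirmed.
+In practice $\mathsf{candidate}(H)$ will rarely differ from $\mathsf{snapshot}(\mathsf{LF}(H))$, but using the former patches over a potential gap in the safety proof. The Last Final Snapshot rule specified later will guarantee that $\mathsf{snapshot}(\mathsf{LF}(H)) \preceq_{\mathrm{bc}} H$, and this ensures that $\mathsf{snapshot}(\mathsf{LF}(H))$ does not conflict with $H \lceil_{\mathrm{bc}}^{\sigma}$. However, the depth of $\mathsf{snapshot}(\mathsf{LF}(H))$ relative to $H$ is not guaranteed to be at least $\sigma$. For the proof we will need $\mathsf{candidate}(H) \preceq_{\mathrm{bc}} H \lceil_{\mathrm{bc}}^{\sigma}$, so that we can use the Local fin‑depth lemma together with Prefix Agreement of $\Pi_{\mathrm{bc}}$ at confirmation depth $\sigma$ to prove Assured Finality.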
+An alternative would be to change the Last Final Snapshot rule to directly require .
+Choose between these options based on what works well for the security proofs and finalization latency.
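+Define the locally bounded‑available chain on node $i$ for bc‑confirmation‑depth $\mu$, as
+$$(\mathsf{ba}_\mu)_i^t = \begin{cases} \mathsf{ch}_i^t \lceil_{\mathrm{bc}}^{\mu}, & \text{if } \mathsf{fin}_i^t \preceq_{\mathrm{bc}} \mathsf{ch}_i^t \lceil_{\mathrm{bc}}^{\mu} \\ \mathsf{fin}_i^t, & \text{otherwise.} \end{cases}$$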
+Like the locally finalized bc‑chain, this chain state should not be exposed to clients of the node until it has synced.
++
For any node $i$ that is honest at time $t$, and any confirmation depth $\mu$, $\mathsf{fin}_i^t \preceq_{\mathrm{bc}} (\mathsf{ba}_\mu)_i^t$.
+Proof: By construction of .
++
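In any execution of Crosslink 2, for any confirmation depth $\mu \le \sigma$ and any node $i$ that is honest at time $t$, there exists a time $r \le t$ such that $(\mathsf{ba}_\mu)_i^t \preceq_{\mathrm{bc}} \mathsf{ch}_i^r \lceil_{\mathrm{bc}}^{\mu}$.
+Proof: Either $(\mathsf{ba}_\mu)_i^t = \mathsf{fin}_i^t$, in which case the result follows by the Local fin‑depth lemma since $\mu \le \sigma$, or $(\mathsf{ba}_\mu)_i^t = \mathsf{ch}_i^t \lceil_{\mathrm{bc}}^{\mu}$ in which case it follows trivially with $r = t$.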
+Our security goal for $\mathsf{ba}_\mu$ will be Agreement on $\mathsf{ba}_\mu$ as already defined.
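The two-case shape of the bounded‑available chain can be illustrated with a short sketch. It assumes, as a reading of the surrounding text, that the chain is the node's best chain truncated by its chosen depth whenever that truncation extends the locally finalized chain, and the locally finalized chain otherwise. Chains are modelled as tuples of block identifiers, and all function names are hypothetical.

```python
from typing import Tuple

Chain = Tuple[str, ...]  # block identifiers, genesis first (modelling assumption)


def is_prefix(a: Chain, b: Chain) -> bool:
    """True iff chain a is an ancestor of (or equal to) chain b."""
    return len(a) <= len(b) and b[:len(a)] == a


def truncate(chain: Chain, depth: int) -> Chain:
    """Drop the last `depth` blocks, but never truncate below genesis."""
    keep = max(1, len(chain) - depth)
    return chain[:keep]


def bounded_available(best_chain: Chain, fin: Chain, mu: int) -> Chain:
    """Assumed definition: truncated best chain if it extends fin, else fin."""
    confirmed = truncate(best_chain, mu)
    return confirmed if is_prefix(fin, confirmed) else fin
```

In both branches the result has `fin` as a prefix, which is why the Ledger prefix property holds by construction.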
+Why is the ‘otherwise’ case in the definition of $(\mathsf{ba}_\mu)_i^t$ necessary?
+Assume for this discussion that $\Pi_{\mathrm{bc}}$ uses PoW.
+Depending on the value of $\sigma$, the timestamps of bc‑blocks, and the difficulty adjustment rule, it can be the case that if $\mathsf{ch}_i$ switches to a different fork, the difficulty on that fork is greater than on the chain of the previous snapshot. Then, the new bc‑chain could reach a higher score than the previous chain in fewer than $\sigma$ blocks from the fork point, and so $\mathsf{ch}_i^t \lceil_{\mathrm{bc}}^{\mu}$ might not be a descendant of $\mathsf{fin}_i^t$ (which is more likely if $\mu = \sigma$). This can occur even when all safety assumptions are satisfied.
+For Zcash’s difficulty adjustment algorithm, the difficulty of each block is adjusted based on the median timestamps and difficulty target thresholds over a range of 17 blocks, where each median is taken over 11 blocks. Other damping factors and clamps are applied in order to prevent instability and to reduce the influence that adversarially chosen timestamps can have on difficulty adjustment. This makes it unlikely that an adversary could gain a significant advantage by manipulating the difficulty adjustment. So it is safe to use $\mathsf{fin}_i^t$ in this case: even though it does not have $\mu$ confirmations relative to $\mathsf{ch}_i^t$, it does have at least the required amount of work “on top” of it.
+Defining $(\mathsf{ba}_\mu)_i^t$ this way also has the advantage of making the proof of the Ledger prefix property trivial.
+It is recommended that node implementations “bake in” a checkpointed bft‑block to each released version, and that node should only expose and to its clients once it is “synced”, that is:
++Genesis bft‑block rule: is bft‑block‑valid.
+A bft‑proposal (resp. non‑genesis bft‑block) is bft‑proposal‑valid (resp. bft‑block‑valid) iff all of the following hold:
++The “corresponding validity rules” are assumed to include the Parent rule that ’s parent is bft‑valid.
+Note: origbft‑block‑validity rules may be different to origbft‑proposal‑validity rules. For example, in adapted Streamlet, an origbft‑block needs evidence that it was voted for by a supermajority, and an origbft‑proposal doesn’t. Such differences also apply to bft‑block‑validity vs bft‑proposal‑validity.
+Why have validity rules been separated from the honest voting condition below?
+The reason to separate the validity rules from the honest voting condition is that the validity rules are objective: they don’t depend on an observer’s view of the bc‑best‑chain. Therefore, they can be checked independently of validator signatures. Even a proposal voted for by 100% of validators will not be considered bft‑proposal‑valid by other nodes unless it satisfies the above rules. If more than two thirds of voting units are cast for an invalid proposal, something is seriously and visibly wrong; in any case, the block will not be accepted as a bft‑valid‑block. Importantly, a purportedly valid bft‑block will not be recognized as such by any honest Crosslink 2 node even if it includes a valid notarization proof, if it does not meet other bft‑block‑validity rules.
+This is essential to making the finalized chain safe against a flaw in or its security assumptions (even, say, a complete break of the validator signature algorithm), as long as remains safe.
+What does the Linearity rule do?
+ +This rule is key to combining simplicity with strong security properties in Crosslink 2. It essentially says that, in a given bft‑valid‑chain, the snapshots pointed to by blocks in that chain cannot roll back.
+This allows the informal safety argument for Crosslink 2 to be rather intuitive.
+Informally, if $\Pi_{\mathrm{bft}}$ has Final Agreement, then all nodes see only one consistent bft‑linear chain (restricting to bft‑blocks that are final in the context of some bft‑block in honest view). Within such a bft‑chain, the Linearity rule ensures by construction that the sequence of referenced bc‑chain snapshots is bc‑linear. This implies Assured Finality, without needing to assume any safety property of $\Pi_{\mathrm{bc}}$.
+We will also be able to prove safety of the finalized snapshots based only on safety of $\Pi_{\mathrm{bc}}$ (for a confirmation depth of $\sigma$), without needing to assume any safety property of $\Pi_{\mathrm{bft}}$. Informally, that is because each node sees each candidate final snapshot at a given time as a $\sigma$‑confirmed prefix of its bc‑best‑chain at that time (this can be proven based on the Last Final Snapshot rule and the fact that a snapshot includes $\sigma$ subsequent headers), and Prefix Agreement implies that honest nodes agree on this prefix. We will leave a more detailed argument until after we have presented the $\Pi_{\mathrm{bc}}$ changes from $\Pi_{\mathrm{origbc}}$.
+The Linearity rule replaces the “Increasing Score rule” used in Crosslink 1. The Increasing Score rule required that each snapshot in a bft‑valid‑chain either be the same snapshot, or a higher-scoring snapshot to that of its parent block. Since scores strictly increase within a bc‑valid‑chain, the Linearity rule implies the Increasing Score rule. It retains the same or stronger positive effects:
+Note that the adversary could take advantage of an “accidental” fork and start its attack from the base of that fork, so that not all of this work is done by it alone. This is also possible in the case of a standard “private mining” attack, and is not so much of a problem in practice because accidental forks are expected to be short. In any case, $\sigma$ should be chosen to take it into account.
+The Linearity rule is also critical to removing the need for one of the most complex elements of Snap‑and‑Chat and Crosslink 1, “sanitization”. In those protocols, because bc‑chain snapshots could be unrelated to each other, it was necessary to sanitize the chain formed from these snapshots to remove transactions that were contextually invalid (e.g. because they double‑spend). The negative consequences of this are described in Notes on Snap‑and‑Chat; avoiding it is much simpler.
+The linearity property is intentionally always relative to the snapshot of the parent bft‑block, even if it is not final in the context of the current bft‑block. This is because the rule needs to hold if and when it becomes final in the context of some descendant bft‑block.
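The effect of the Linearity rule on a bft‑chain can be sketched as a check over the snapshots referenced by successive bft‑blocks. This is a simplified model for illustration: each snapshot is represented by the full bc‑chain up to the snapshotted block (a tuple of block identifiers, genesis first), and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Chain = Tuple[str, ...]  # block identifiers, genesis first (modelling assumption)


def is_prefix(a: Chain, b: Chain) -> bool:
    """True iff chain a is an ancestor of (or equal to) chain b."""
    return len(a) <= len(b) and b[:len(a)] == a


def satisfies_linearity(snapshots: Sequence[Chain]) -> bool:
    """Check that each snapshot is the same as, or a descendant of, its parent's.

    `snapshots` lists the bc-chain snapshots of the bft-blocks in one bft-chain,
    in parent-to-child order.
    """
    return all(is_prefix(parent, child)
               for parent, child in zip(snapshots, snapshots[1:]))
```

Repeating the parent's snapshot passes the check, while rolling back or switching to a different fork does not.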
+PoS Desideratum: we want leader selection with good security / performance properties that will be relevant to this rule. (Suggested: PoSAT.)
+Why does the Linearity rule allow keeping the same snapshot as the parent?
+This is necessary in order to preserve liveness of $\Pi_{\mathrm{bft}}$ relative to $\Pi_{\mathrm{origbft}}$. Liveness of $\Pi_{\mathrm{origbft}}$ might require honest proposers to make proposals at a minimum rate. That requirement could be consistently violated if it were not always possible to make a valid proposal. But given that it is allowed to repeat the same snapshot as in the parent bft‑block, neither the Linearity rule nor the Tail Confirmation rule can prevent making a valid proposal, and all other rules of $\Pi_{\mathrm{bft}}$ affecting the ability to make valid proposals are the same as in $\Pi_{\mathrm{origbft}}$. (In principle, changes to voting in $\Pi_{\mathrm{bft}}$ could also affect its liveness; we’ll discuss that in the liveness proof later.)
+For example, Streamlet requires three notarized blocks in consecutive epochs in order to finalize a block [CS2020, section 1.1]. Its proof of liveness depends on the assumption that in each epoch for which the leader is honest, that leader will make a proposal, and that during a “period of synchrony” this proposal will be received by every node [CS2020, section 3.6]. This argument can also be extended to adapted‑Streamlet.
+We could alternatively have allowed to always make a “null” proposal, rather than to always make a proposal with the same snapshot as the parent. We prefer the latter because the former would require specifying the rules for null proposals in .
+As a clarification, no BFT protocol that uses leader election can require a proposal in each epoch, because the leader might be dishonest. The above issue concerns liveness of the protocol when assumptions about the attacker’s share of bft‑validators or stake are met, so that it can be assumed that sufficiently long periods with enough honest leaders to make progress (5 consecutive epochs in the case of Streamlet), will occur with significant probability.
+The finality rule for bft‑blocks in a given context is unchanged from origbft‑finality. That is, is defined in the same way as (modulo referring to bft‑block‑validity and ).
+An honest proposer of a bft‑proposal $P$ chooses $P.\mathsf{headers\_bc}$ as the $\sigma$‑block tail of its bc‑best‑chain, provided that it is consistent with the Linearity rule. If it would not be consistent with that rule, it sets $P.\mathsf{headers\_bc}$ to the same $\mathsf{headers\_bc}$ field as $P$’s parent bft‑block. It does not make proposals until its bc‑best‑chain is at least $\sigma + 1$ blocks long.
+Why σ + 1?
+If the length were less than $\sigma + 1$ blocks, it would be impossible to construct the $\mathsf{headers\_bc}$ field of the proposal.
+Note that when the length of the proposer’s bc‑best‑chain is exactly $\sigma + 1$ blocks, the snapshot must be of the genesis bc‑block. But this does not violate the Linearity rule, because the genesis bc‑block matches the previous snapshot by the genesis bft‑block.
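The proposer's choice can be sketched under explicit assumptions: the headers field is taken to hold the last `sigma` blocks of the best chain, the snapshot is taken to be the chain ending just below the first of those headers, and headers are modelled by block identifiers. The function name and signature are hypothetical.

```python
from typing import Optional, Tuple

Chain = Tuple[str, ...]  # block identifiers, genesis first (modelling assumption)


def is_prefix(a: Chain, b: Chain) -> bool:
    """True iff chain a is an ancestor of (or equal to) chain b."""
    return len(a) <= len(b) and b[:len(a)] == a


def choose_headers(best_chain: Chain, parent_headers: Optional[Chain],
                   parent_snapshot: Chain, sigma: int) -> Optional[Chain]:
    """Pick the headers for a new proposal, or None if no proposal can be made yet.

    Assumes sigma >= 1.
    """
    if len(best_chain) < sigma + 1:
        return None  # chain too short to supply sigma headers above a snapshot
    tail = best_chain[-sigma:]       # the sigma-block tail
    snapshot = best_chain[:-sigma]   # chain ending just below the tail
    if is_prefix(parent_snapshot, snapshot):
        return tail                  # consistent with the Linearity rule
    return parent_headers            # otherwise repeat the parent's headers
```

With a best chain of exactly `sigma + 1` blocks, the computed snapshot is the one‑block chain containing only genesis, matching the note above.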
+How is it possible that the Linearity rule would not be satisfied by choosing headers from an honest proposer’s bc‑best‑chain?
+As in the answer to Why is the ‘otherwise’ case in the definition of $(\mathsf{ba}_\mu)_i^t$ necessary? above, after a reorg on the bc‑chain, the $\sigma$‑confirmed block on the new chain might not be a descendant of the $\sigma$‑confirmed block on the old chain, which could break the Linearity rule.
+An honest validator considering a proposal $P$ first updates its view of both subprotocols with the bc‑headers given in $P.\mathsf{headers\_bc}$, downloading bc‑blocks for these headers and checking their bc‑block‑validity.
+For each downloaded bc‑block, the bft‑chain referenced by its $\mathsf{context\_bft}$ field might need to be validated if it has not been seen before.
+Wait what, how much validation is that?
+ +In general the entire referenced bft‑chain needs to be validated, not just the referenced block — and for each bft‑block, the bc‑chain in needs to be validated, and so on recursively. If this sounds overwhelming, note that:
+In summary, the order of validation is important to avoid denial‑of‑service — but it already is in Bitcoin and Zcash.
+After updating its view, the validator will vote for a proposal only if:
+Blocks in a bc‑best‑chain are by definition bc‑block‑valid. If we’re checking the Confirmed best‑chain criterion, why do we need to have separately checked that the blocks referenced by the headers are bc‑block‑valid?
+ +The Confirmed best‑chain criterion is quite subtle. It ensures that is bc‑block‑valid and has bc‑block‑valid blocks after it in the validator’s bc‑best‑chain. However, it need not be the case that is part of the validator’s bc‑best‑chain after it updates its view. That is, the chain could fork after .
+The bft‑proposal‑validity rule must be objective; it can’t depend on what the validator’s bc‑best‑chain is. The validator’s bc‑best‑chain may have been updated to (if it has the highest score), but it also may not.
+However, if the validator’s bc‑best‑chain was updated, that makes it more likely that it will be able to vote for the proposal.
+In any case, if the validator does not check that all of the blocks referenced by the headers are bc‑block‑valid, then its vote may be invalid.
+How does this compare to Snap‑and‑Chat?
+ +Snap‑and‑Chat already had the voting condition:
+++An honest node only votes for a proposed BFT block $B$ if it views $B.\mathsf{ch}$ as confirmed.
+
but it did not give the headers potentially needed to update the validator’s view, and it did not require a proposal to be for an objectively confirmed snapshot as a matter of validity.
+If a Crosslink‑like protocol were to require an objectively confirmed snapshot but without including the bc‑headers in the proposal, then validators would not immediately know which bc‑blocks to download to check its validity. This would increase latency, and would be likely to lead proposers to be more conservative and only propose blocks that they think will already be in at least a two‑thirds absolute supermajority of validators’ best chains.
+That is, showing to all of the validators is advantageous to the proposer, because the proposer does not have to guess what blocks the validators might have already seen. It is also advantageous for the protocol goals in general, because it improves the trade‑off between finalization latency and security.
++Genesis bc‑block rule: For the genesis bc‑block we must have , and therefore .
+A bc‑block is bc‑block‑valid iff all of the following hold:
+Explain the definition of finality‑depth.
+The finality depth must be objectively defined, since it is used in a consensus rule. Therefore it should measure the height of $H$ relative to $\mathsf{snapshot}(\mathsf{LF}(H))$, which is an objectively defined function of $H$, rather than relative to $\mathsf{fin}_i^t$. (These will only differ for $H$ in node $i$’s bc‑best‑chain when node $i$ has just reorged, and only then in corner cases.)
+Note that the Last Final Snapshot rule ensures that it is meaningful to simply use the difference in heights, since $\mathsf{snapshot}(\mathsf{LF}(H)) \preceq_{\mathrm{bc}} H$.
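As a rough sketch of how such a depth could feed into a consensus check: the code below takes the finality depth to be a plain difference of heights, and assumes (as an interpretation, since the rule itself is not reproduced here) that a block whose depth exceeds the gap bound is only acceptable if it is a stalled block. All names are hypothetical.

```python
def finality_depth(block_height: int, last_final_snapshot_height: int) -> int:
    """Height of the block relative to the snapshot of its last final bft-block."""
    return block_height - last_final_snapshot_height


def passes_finality_depth_check(block_height: int,
                                last_final_snapshot_height: int,
                                gap_bound: int,
                                is_stalled_block: bool) -> bool:
    """Assumed rule: depth within the bound, or else the block must be stalled."""
    depth = finality_depth(block_height, last_final_snapshot_height)
    return depth <= gap_bound or is_stalled_block
```

Both inputs to the depth are determined by the block itself, so every node evaluating the check on the same block reaches the same answer.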
+The consensus rule changes above are all non-contextual. Modulo these changes, contextual validity in $\Pi_{\mathrm{bc}}$ is the same as in $\Pi_{\mathrm{origbc}}$.
+An honest producer of a bc‑block must follow the consensus rules under block validity above. In particular, it must produce a stalled block if required to do so by the Finality depth rule.
+To choose $H.\mathsf{context\_bft}$, the producer considers a subset of the tips of bft‑valid‑chains in its view: It chooses one of the longest of these chains, $C$, breaking ties by maximizing the score of the last final snapshot of $C$, and if there is still a tie then by taking $C$ with the smallest hash.
+The honest block producer then sets $H.\mathsf{context\_bft}$ to $C$.
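The selection among candidate tips can be sketched as a single ordering: longest chain first, then highest score of the last final snapshot, then smallest hash. The `BftTip` record and its field names are hypothetical, and the filtering that produces the candidate subset is assumed to have been done already.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BftTip:
    block_hash: bytes               # hash of the tip bft-block
    chain_length: int               # length of the bft-chain ending at this tip
    last_final_snapshot_score: int  # score of the snapshot of its last final block


def choose_context_bft(tips: Iterable[BftTip]) -> BftTip:
    """Pick the tip to commit to; raises ValueError if there are no candidates."""
    return min(
        tips,
        key=lambda t: (-t.chain_length, -t.last_final_snapshot_score, t.block_hash),
    )
```

Using one composite key keeps the choice deterministic for a given set of candidates, so two producers with the same view pick the same tip.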
+An honest bc‑block‑producer must not use information from the BFT protocol, other than the specified consensus rules, to decide which bc‑valid‑chain to follow. The specified consensus rules that depend on $\Pi_{\mathrm{bft}}$ have been carefully constructed to preserve safety of $\Pi_{\mathrm{bc}}$ relative to $\Pi_{\mathrm{origbc}}$. Imposing any additional constraints could potentially allow an adversary that is able to subvert $\Pi_{\mathrm{bft}}$ to influence the evolution of the bc‑best‑chain in ways that are not considered in the safety argument.
+Why not choose $T$ such that $H \lceil_{\mathrm{bc}}^{1} .\, \mathsf{context\_bft} \preceq_{\mathrm{bft}} \textsf{bft-last-final}(T)$?
+ +The effect of this would be to tend to more often follow the last bft‑block seen by the producer of the parent bc‑block, if there is a choice. It is not always possible to do so, though: the resulting set of candidates for might be empty.
+Also, it is not clear that giving the parent bc‑block‑producer the chance to “guide” what bft‑block should be chosen next is beneficial, since that producer might be adversarial and the resulting incentives are difficult to reason about.
+Why choose the longest $C$, rather than the longest $\textsf{bft-last-final}(C)$?
+We could have instead chosen $C$ to maximize the length of $\textsf{bft-last-final}(C)$. The rule we chose follows Streamlet, which builds on the longest notarized chain, not the longest finalized chain. This may call for more analysis specific to the chosen BFT protocol.
+Why this tie‑breaking rule?
+ +Choosing the bft‑chain that has the last final snapshot with the highest score, tends to inhibit an adversary’s ability to finalize its own chain if it has a lesser score. (If it has a greater score, then it has already won a hash race and we cannot stop the adversary chain from being finalized.)
+For discussion of potentially unifying the roles of bc‑block producer and bft‑proposer, see What about making the bc‑block‑producer the bft‑proposer? in Potential changes to Crosslink.
+At this point we have completed the definition of Crosslink 2. In Security Analysis of Crosslink 2, we will prove it secure.
+The discussion in The Arguments for Bounded Availability and Finality Overrides is at an abstract level, applying to any Ebb‑and‑Flow-like protocol.
+This document considers specifics of the Snap‑and‑Chat construction proposed in [NTT2020] (arXiv version).
+We are trying to be precise in this document about use of the terms “Ebb‑and‑Flow”, which is the security model and goal introduced in [NTT2020], vs “Snap‑and‑Chat”, which is the construction proposed in the same paper to achieve that goal. There are other ways to design an Ebb‑and‑Flow protocol that don’t run into the difficulties described in this section (or that run into different difficulties).
+A general problem with the Snap‑and‑Chat construction is that it does not follow, from enforcement of the original consensus rules on blocks produced in , that the properties they are intended to enforce hold for the or ledgers. Less obviously, the converse also does not follow: enforcing unmodified consensus rules on blocks is both too lax and too strict.
+Recall from the paper how and are constructed (starting at the end of page 8 of [NTT2020]):
++++
+- Ledger extraction: Finally, how honest nodes compute and from and is illustrated in Figure 6. Recall that is an ordering of snapshots, i.e., a chain of chains of LC blocks. First, is flattened, i.e. the chains of blocks are concatenated as ordered to arrive at a single sequence of LC blocks. Then, all but the first occurrence of each block are removed (sanitized) to arrive at the finalized ledger of LC blocks. To form the available ledger , , which is a sequence of LC blocks, is appended to and the result again sanitized.
+
This says that and are sequences of transactions, not sequences of blocks. Therefore, consensus rules defined at the block level are not applicable.
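The ledger extraction quoted above (flatten, then keep only first occurrences) can be sketched as follows. This is a minimal illustration, not code from [NTT2020]: the function names `flatten`, `sanitize` and `extract_ledgers` are hypothetical, and ledger items are modelled as plain hashable identifiers.

```python
from typing import Hashable, Iterable, List, Sequence, Tuple


def flatten(snapshots: Iterable[Sequence[Hashable]]) -> List[Hashable]:
    """Concatenate the finalized snapshots in the order the BFT chain gives them."""
    return [item for snapshot in snapshots for item in snapshot]


def sanitize(items: Iterable[Hashable]) -> List[Hashable]:
    """Keep only the first occurrence of each item, preserving order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def extract_ledgers(
    snapshots: Iterable[Sequence[Hashable]],
    best_chain: Sequence[Hashable],
) -> Tuple[List[Hashable], List[Hashable]]:
    """Return (finalized ledger, available ledger) as described in the quote.

    The available ledger appends the node's current best chain to the
    finalized ledger and sanitizes the result again.
    """
    log_fin = sanitize(flatten(snapshots))
    log_da = sanitize(log_fin + list(best_chain))
    return log_fin, log_da
```

Note that items from a rolled-back fork stay in the finalized ledger once a snapshot containing them has been finalized, which is the source of several of the issues discussed in this section.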
+Most of these rules are Proof‑of‑Work‑related checks that can be safely ignored at this level. Some are related to the hashBlockCommitments
field intended for use by the FlyClient protocol. It is not at all clear how to make FlyClient (or other uses of this commitment) work with the Snap‑and‑Chat construction. In particular, the hashEarliest{Sapling,Orchard}Root
, hashLatest{Sapling,Orchard}Root
, and n{Sapling,Orchard}TxCount
fields don’t make sense in this context since they could only reflect the values in , which have no relation in general to those for any subrange of transactions in . This problem occurs as a result of sanitization and so will be avoided by Crosslink 2, which does not need sanitization.
Since does not have blocks, it is not well-defined whether it has “coinbase‑only blocks” when in Stalled Mode. That by itself is not so much of a problem because it would be sufficient for it to have only coinbase transactions in that mode.
+The issuance schedule of Zcash was designed under the assumption that blocks only come from a single chain, and that the difficulty adjustment algorithm keeps the rate of block mining roughly constant over time.
+For Snap‑and‑Chat, if there is a rollback longer than blocks in , additional coinbase transactions from the rolled-back chain will be included in .
+We can argue that this will happen rarely enough not to cause any significant problem for the overall issuance schedule. However, it does mean that issuance is less predictable, because the block subsidies will be computed according to their depth in the chain on which they were mined. So it would no longer be the case that coinbase transactions issue a deterministic, non-increasing sequence of block subsidies. (Again, this problem will be avoided by Crosslink 2.)
+The order of transactions in any particular is not in general preserved in either or . This is considered in the paper (middle of the left column on page 10) but it is very easy to miss it:
+++Thus, snapshots taken by different nodes or at different times can conflict. However, is still safe and thus orders these snapshots linearly. Any transactions invalidated by conflicts are sanitized during ledger extraction.
+
 That is, a transaction from one snapshot might double-spend an output already spent in a different transaction of a different snapshot earlier in the flattening order. If it is omitted, then later transactions could depend on the outputs of the omitted one. The paper is saying that each transaction is only included if (in Bitcoin and Zcash terminology) it satisfies contextual checks for double-spending and existence of inputs at the point in the ledger where it would be added.
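A minimal sketch of this transaction-level sanitization, under simplifying assumptions: the model is transparent-only, `Tx` and `sanitize_contextually` are hypothetical names, and the only contextual checks are double-spends and missing inputs. It does not model shielded nullifiers, anchors, or expiry.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Tx:
    txid: str
    spends: FrozenSet[str]   # identifiers of the outputs this transaction consumes
    creates: FrozenSet[str]  # identifiers of the outputs this transaction produces


def sanitize_contextually(
    txs: Iterable[Tx],
    initial_unspent: FrozenSet[str] = frozenset(),
) -> List[Tx]:
    """Walk the flattened transactions in order, keeping each one only if all
    of its inputs exist and are unspent at the point where it would be added."""
    unspent = set(initial_unspent)
    included = set()
    ledger = []
    for tx in txs:
        if tx.txid in included:
            continue  # duplicate of a transaction that is already in the ledger
        if not tx.spends <= unspent:
            continue  # an input is missing or has already been spent
        unspent -= tx.spends
        unspent |= tx.creates
        included.add(tx.txid)
        ledger.append(tx)
    return ledger
```

In this model, a transaction that depends on an omitted double-spend is itself omitted because its input never appears in the unspent set.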
+Since nullifiers for shielded spends are public, it is possible to do this even for shielded transactions. Each node will construct commitment trees in the order given by
+This means that if is extended by a block that is not the next block in after the finalization point (and that has different note commitments), then all shielded transactions from that point onward in the previous will be invalidated. It could be possible to do better at the expense of a more complicated note commitment tree structure. In any case, this situation is expected to be rare, because it can only occur if there is a rollback of more than blocks in the consensus chain or a failure of BFT safety.
+There are two possible ways to interpret how are constructed in Snap‑and‑Chat:
+These are equivalent in the setting considered by [NTT2020], but the argument for their equivalence is not obvious. We definitely want them to be equivalent: in practice there will be many duplicate blocks from chain prefixes in the input to sanitization, and so a literal implementation of the first variant would have to recheck all duplicate transactions for contextual validity. That would have at least complexity (more likely ) in the length of the block chain, because the length of each final snapshot grows with .
+Suppose that, in a particular , the only reasons for a transaction to be contextually invalid are double-spends and missing inputs. In that case the argument for equivalence is:
+Note that any other reason for transactions to be contextually invalid might interfere with this argument. Therefore, strictly speaking Snap‑and‑Chat should require of that there is no such other reason. This does not seem to be explicitly stated anywhere in [NTT2020].
+In Zcash a transaction can also be contextually invalid because it has expired, or because it has a missing anchor. Expiry can be handled by extending the above argument as follows:
+It is not obvious how to extend it to handle missing anchors, because it is technically possible for a duplicate transaction that was invalid because of a missing anchor to become valid in a subsequent block. That situation would require careful manipulation of the commitment trees, but there does not seem to be anything preventing it from being provoked intentionally. The argument that was used above for missing inputs does not work here, because there is no corresponding DAG formed by the transactions with missing anchors: the same commitment treestate can be produced by two unrelated transactions.
+Transactions in need to be able to spend outputs that are not necessarily from any previous transaction in . This is because, from the point of view of a user of node at time , the block chain includes all transactions in . All of the transactions after the finalization point are guaranteed to also be in , but the ones before the finalization point (i.e. in ) are not, because they could be from some other for and (intuitively, from some long chain fork that was once considered confirmed by enough nodes).
+Honest nodes only ever vote for confirmed snapshots, that is, prefixes of their best chain truncated by the confirmation depth . Obviously the whole point of having the BFT protocol is that chain forks longer than can occur in — otherwise we'd just use and have done. So it is not that we expect this case to be common, but if it happens then it will never fix itself: the consensus chain in will continue on without ever including the transactions from that were obtained from a snapshot of another fork.
+A user must be able to spend outputs for which they hold the spending key from any finalized transaction, otherwise there would be no point to the finalization.
+The authors of [NTT2020] probably just missed this: the paper only has evidence that they simulated their construction, rather than implementing it for Bitcoin or any other concrete block chain as . Let’s try to repair it.
+Suppose that node is trying to determine whether is a consensus-valid chain, which is necessary for deterministic consensus in . It cannot decide whether to allow transactions in to spend outputs not in the history of on the basis of its own finalized view because and are not in general the same.
+Of course, we hope that and are consistent, i.e. one is a prefix of the other. But even if they are consistent, they are not necessarily the same length. In particular, if is shorter than then node does not have enough information to fill in the gap — and so it may incorrectly view a transaction in as spending an output that does not exist, when actually it does exist in Conversely if were longer and node were to allow spending an output in that would be using information that is not necessarily available to other nodes, and so node could diverge from consensus.
+Consensus validity of the block at the tip of can only be a deterministic function of the block itself and its ancestors in . It is crucial to be able to eventually spend outputs from the finalized chain. We are forced to conclude that the chain must include the information needed to calculate for some not too far behind . That is, must be modified to ensure that this is the case. This leads us to strengthen the required properties of an Ebb‑and‑Flow protocol to include another property, “finalization availability”.
+In the absence of security flaws and under the security assumptions required by the finality layer, the finalization point will not be seen by any honest node to roll back. However, that does not imply that all nodes will see the same finalized height — which is impossible given network delays and unreliable messaging.
+Both in order to optimize the availability of applications that require finality, and in order to solve the technical issue of spending finalized outputs described in the previous section, we need to consider availability of the information needed to finalize the chain up to a particular point.
+Note that in Bitcoin-like consensus protocols, we don’t generally consider it to be an availability flaw that a block header only commits to the previous block hash and to the Merkle tree of transactions in the block, rather than including them directly. These commitments allow nodes to check that they have the correct information, which can then be requested separately.
+Suppose, then, that each block header in commits to the Last Final BFT block known by the block producer. For an LC block chain with block at its tip, we will refer to this commitment as . We refer to the parent block of as (this is a special case of a notation that will be defined in The Crosslink 2 Construction).
+We require, as a consensus rule, that if is not the genesis block header, then this BFT block either descends from or is the same as the final BFT block committed to by the block’s parent. That is, .
+This Extension rule will be preserved into Crosslink 2.
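The Extension rule can be sketched as an ancestor-or-equal check on the BFT chain. This is an illustration only: the BFT chain is modelled as a hypothetical map from each BFT block identifier to its parent identifier, and the genesis case is represented by the parent block having no commitment (`None`).

```python
from typing import Dict, Optional


def is_descendant_or_equal(
    bft_parent: Dict[str, Optional[str]],
    candidate: str,
    ancestor: str,
) -> bool:
    """True if `candidate` equals `ancestor` or descends from it in the BFT chain."""
    current: Optional[str] = candidate
    while current is not None:
        if current == ancestor:
            return True
        current = bft_parent.get(current)
    return False


def satisfies_extension_rule(
    bft_parent: Dict[str, Optional[str]],
    committed_by_block: str,
    committed_by_parent: Optional[str],
) -> bool:
    """Check a block's last-final commitment against its parent's commitment.

    `committed_by_parent` is None when the block being checked is the genesis
    block, which the rule exempts.
    """
    if committed_by_parent is None:
        return True
    return is_descendant_or_equal(bft_parent, committed_by_block, committed_by_parent)
```

The check depends only on the block, its parent, and the referenced BFT blocks, so every node that has obtained those BFT blocks reaches the same verdict.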
+The Extension rule does not prevent the BFT chain from rolling back, if the security assumptions of were violated. However, it means that if a node does not observe a rollback in at confirmation depth , then it will also not observe any instability in , even if the security assumptions of are violated. This property holds by construction, and in fact regardless of .
+In the Snap‑and‑Chat construction, we also have BFT block proposals committing to snapshots (top of right column of [NTT2020, page 7]):
+++In addition, is used as side information in to boycott the finalization of invalid snapshots proposed by the adversary.
+
 This does not cause any circularity, because each protocol only commits to earlier blocks of the other. In fact, BFT validators have to listen to transmission of block headers anyway, so that could also be the protocol over which they get the information needed to make and broadcast their own signatures or proposals. (A possible reason not to broadcast individual signatures to all nodes is that with large numbers of validators, the proof that a sufficient proportion of validators/stake has signed can use an aggregate signature, which could be much smaller. Also, nodes only need to know about successful BFT block proposals.)
+Now suppose that, in a Snap‑and‑Chat protocol, the BFT consensus finalizes a snapshot that does not extend the snapshot in the previous block (which can happen if either is unsafe, or suffers a rollback longer than blocks). In that case we will initially not be able to spend outputs from the old snapshot in the new chain. But eventually for some node that sees the header at the tip of its best chain at time , will be such that from then on (i.e. at time ), includes the output that we want to spend. This assumes liveness of and safety of .
+That is, including a reference to a recent final BFT block in block headers both incentivizes nodes to propagate this information, and can be used to solve the “spending finalized outputs” problem.
+Optionally, we could incentivize the block producer to include the latest information it has, for example by burning part of the block reward or by giving the producer some limited mining advantage that depends on how many blocks back the finalization information is.
+This raises the question of how we measure how far ahead a given block is relative to the finalization information it provides. As we said before, is a sequence of transactions, not blocks. The transactions will in general be in a different order, and also some transactions from may have been omitted from (and even ) because they were not contextually valid.
+In Crosslink 2, we will sidestep this problem by avoiding the need for sanitization — that is, will correspond exactly to a chain of blocks that is a prefix of . Actually we use the notation to reflect the fact that it is a bc‑chain, not a sequence of bc‑transactions. This invariant is maintained statefully on each node : any rollback past will be ignored. If a new would conflict with the old one, the node will refuse to use it. This allows each node to straightforwardly measure how many blocks is ahead of as the difference in heights. Since this document is intended to explain the development of Crosslink from Snap‑and‑Chat, here we describe the more complicated approach that we originally came up with for Crosslink 1 — which also serves to motivate the simplification in Crosslink 2.
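The simpler Crosslink 2 bookkeeping mentioned above (a finalized chain that is kept statefully, never rolled back, and compared with the best chain by height) might look roughly like this. It is a sketch under assumptions: chains are modelled as lists of block identifiers starting at genesis, and `FinalityTracker` is a hypothetical name rather than anything defined by the Crosslink 2 construction.

```python
from typing import List, Sequence


def is_prefix(shorter: Sequence[str], longer: Sequence[str]) -> bool:
    """True if `shorter` is a (not necessarily strict) prefix of `longer`."""
    return len(shorter) <= len(longer) and list(longer[: len(shorter)]) == list(shorter)


class FinalityTracker:
    """Holds the finalized bc-chain seen by one node."""

    def __init__(self) -> None:
        self.finalized: List[str] = []

    def update(self, candidate: Sequence[str]) -> bool:
        """Adopt `candidate` only if it extends the current finalized chain.

        Conflicting or shorter candidates are ignored, so the finalized chain
        never rolls back. Returns whether the candidate was adopted.
        """
        if is_prefix(self.finalized, candidate):
            self.finalized = list(candidate)
            return True
        return False

    def gap(self, best_chain: Sequence[str]) -> int:
        """Number of blocks by which `best_chain` is ahead of the finalized chain."""
        if not is_prefix(self.finalized, best_chain):
            raise ValueError("best chain does not extend the finalized chain")
        return len(best_chain) - len(self.finalized)
```

Because the finalized chain is a prefix of the best chain, the gap is a plain difference of lengths and no sanitization is involved.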
+Assume that a block unambiguously specifies its ancestor chain. For a block , define:
+Here is the BFT block we are providing information for, and is the corresponding snapshot. For a node that sees as the most recent final BFT block at time , will definitely contain transactions from blocks up to , but usually will not contain subsequent transactions on ’s fork.
+Strictly speaking, it is possible that a previous BFT block took a snapshot that is between and . This can only happen if there have been at least two rollbacks longer than blocks (i.e. we went more than blocks down ’s fork from , then reorged to more than blocks down ’s fork, then reorged again to ’s fork). In that case, the finalized ledger would already have the non-conflicting transactions from blocks between and — and it could be argued that the correct definition of finality depth in such cases is the depth of relative to , not of relative to .
+However,
+By the way, the “tailhead” of a tailed animal is the area where the posterior of the tail joins the rump (also called the “dock” in some animals).
+We could alternatively just rely on the fact that some proportion of block producers are honest and will include the latest information they have. However, it turns out that having a definition of finality depth will also be useful to enforce going into Stalled Mode.
+Specifically, if we accept the above definition of finality depth, then the security property we want is
+Bounded hazard-freeness for a finality gap bound of blocks: There is never, for any node at time , observed to be a more-available ledger with a hazardous transaction that comes from block of such that .
+This assumes that transactions in the non-finalized suffix come from blocks in . In Snap‑and‑Chat they do by definition, but ideally we wouldn’t depend on that. The difficulty in finding a more general security definition is due to the ledgers in an Ebb‑and‑Flow protocol being specified as sequences of transactions, so that a depth in the ledger would have only a very indirect correspondence to time. We could instead base a definition on timestamps, but that could run into difficulties in ensuring timestamp accuracy.
+Another possibility would be to count the number of coinbase transactions in before the hazardous transaction. This would still be somewhat ad hoc (it depends on the fact that coinbase transactions happen once per block and cannot conflict with any other distinct transaction).
+In any case, if sometimes overestimates the depth, that cannot weaken this security definition.
+Note that a node that is validating a chain must fetch all the chains referenced by BFT blocks reachable from it (back to an ancestor that it has seen before). In theory, there could be a partition that causes there to be multiple disjoint snapshots that get added to the BFT chain in quick succession. However, in practice we expect such long rollbacks to be rare if is meeting its security goals.
+Going into Stalled Mode if there is a long finalization stall helps to reduce the cost of validation when the stall resolves. That is, if there is a partition and nodes build on several long chains, then in unmodified Snap‑and‑Chat, it could be necessary to validate an arbitrary number of transactions on each chain when the stall resolves. Having only coinbase transactions after a certain point in each chain would significantly reduce the concrete validation costs in this situation.
+Nodes should not simply trust that the BFT blocks are correct; they should check validator signatures (or aggregate signatures) and finalization rules. Similarly, snapshots should not be trusted just because they are referenced by BFT blocks; they should be fully validated, including the proofs-of-work.
+It is also possible for a snapshot reference to include the subsequent block headers, which are guaranteed to be available for a confirmed snapshot. Having all nodes validate the proofs-of-work in these headers is likely to significantly increase the work that an attacker would need to perform to cause disruption under a partial failure of either or ’s security properties.
+Note that [NTT2020] (bottom of right column, page 9) makes a safety assumption about in order to prove the consistency of with the output of :
+++As indicated by Algorithm 1, a snapshot of the output of becomes final as part of a BFT block only if that snapshot is seen as confirmed by at least one honest node. However, since is safe [i.e., does not roll back further than the confirmation depth ], the fact that one honest node sees that snapshot as confirmed implies that every honest node sees the same snapshot as confirmed.
+
We claim that, while this may be a reasonable assumption to make for parts of the security analysis, in practice we should always require any adversary to do the relevant amount of Proof‑of‑Work to construct block headers that are plausibly confirmed. This is useful even though we cannot require, for every possible attack, that it had those headers at the time they should originally have appeared.
+The following idea for enforcing finalization availability and a bound on the finality gap was originally conceived before we had switched to advocating the Stalled Mode approach. It’s simpler to explain first in that variant.
+Suppose that for an -block availability bound, we required each block header to include the information necessary for a node to finalize to blocks back. This would automatically enforce a chain stall after the availability bound without any further explicit check, because it would be impossible to produce a block after the bound.
+Note that if full nodes have access to the BFT chain, knowing is sufficient to tell whether the correct version of any given BFT block in ’s ancestor chain has been obtained.
+Suppose that the finality gap bound is blocks. Having already defined , the necessary consensus rule is attractively simple:
+ +To adapt this approach to enforce Stalled Mode instead of stalling the chain, we can allow the alternative of producing a block that follows the Stalled Mode restrictions:
+ +Note that Stalled Mode will be exited automatically as soon as the finalization point catches up to within blocks (if it does without an intentional rollback). Typically, after recovery from whatever was causing the finalization stall, the validators will be able to obtain consensus on the same chain as , and so there will be no rollback (or at least not a long one) of .
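As a rough illustration of the shape of the adapted rule, the sketch below accepts a block if its finality depth is within the bound, or otherwise only if it obeys the Stalled Mode restrictions. Several things here are assumptions rather than part of the specification: the finality depth is taken as precomputed, Stalled Mode is modelled simply as "coinbase transactions only", and the names `Block`, `follows_stalled_mode` and `satisfies_gap_bound_rule` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Block:
    finality_depth: int               # assumed to be computed elsewhere
    tx_is_coinbase: Tuple[bool, ...]  # one flag per transaction in the block


def follows_stalled_mode(block: Block) -> bool:
    """Assumed Stalled Mode restriction: every transaction is a coinbase transaction."""
    return all(block.tx_is_coinbase)


def satisfies_gap_bound_rule(block: Block, gap_bound: int) -> bool:
    """Accept the block if finalization is recent enough, or if the block
    restricts itself to Stalled Mode."""
    return block.finality_depth <= gap_bound or follows_stalled_mode(block)
```

With this shape, leaving Stalled Mode needs no explicit state: as soon as the finality depth drops back within the bound, unrestricted blocks pass the first branch again.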
+An earlier iteration of this idea required the finalization information to be included in block headers. This is not necessary when we assume that full nodes have access to the BFT chain and can obtain arbitrary BFT blocks. This also sidesteps any need to relax the rule in order to bound the size of block headers. block producers are still incentivized to make the relevant BFT blocks available, because without them the above consensus rule cannot be checked, and so their blocks would not be accepted.
+There is, however, a potential denial-of-service attack by claiming the existence of a BFT block that is very far ahead of the actual BFT chain tip. This attack is not very serious as long as nodes limit the number of BFT blocks they will attempt to obtain in parallel before having checked validator signatures.
+Consider Lemma 5:
+++Moreover, for a BFT block to become final in the view of an honest node under , at least one vote from an honest node is required, and honest nodes only vote for a BFT block if they view the referenced LC block as confirmed.
+
The stated assumptions are:
+++formalizes the model of P2, a synchronous network under dynamic participation, with respect to a bound on the fraction of awake nodes that are adversarial:
++
+- At all times, is required to deliver all messages sent between honest nodes in at most slots.
+- At all times, determines which honest nodes are awake/asleep and when, subject to the constraint that at all times at most fraction of awake nodes are adversarial and at least one honest node is awake.
+
is defined as .
+Now consider this statement and figure:
+++Even if is unsafe (Figure 9c), finalization of a snapshot requires at least one honest vote, and thus only valid snapshots become finalized.
+ +
This argument is technically correct but has to be interpreted with care. It only applies when the number of malicious nodes is such that . What we are trying to do with Crosslink is to ensure that a similar conclusion holds even if is completely subverted, i.e. the adversary has 100% of validators (but only < 50% of hash rate).
+ +This page documents suggestions that have not had the same attention to security analysis as the Crosslink 2 construction. Some of them are broken. Some of them also increase the complexity of the protocol (while some simplify it or have a mixed effect on complexity), and so we need to consider the security/complexity trade‑off of each suggestion before we could include it.
+This page has not yet been updated for the changes from Crosslink 1 to Crosslink 2.
+We can allow honest bc‑block‑producers to record information about every proposed and notarized bft‑block, rather than just the one in the field.
+Duplicate information that has already been given in an ancestor bc‑block would be omitted.
+This would automatically expose the following shenanigans to public view (as long as enough bc‑block‑proposers are honest, which is already assumed):
+We could also expose attempts to double‑vote.
+Note that double‑proposal and double‑voting could be a sign that a proposer or validator’s private key is compromised, rather than that it belongs to the adversary per se. However, the security analysis must treat such a proposer/validator as non‑honest in any case.
+The current Increasing Score rule concerns the score of the snapshot:
+Increasing Snapshot Score rule: Either or .
+We could instead require the score of to increase:
+Increasing Tip Score rule: Either or .
+Pros:
+Con:
+Apart from the above con, the original motivations for the Increasing Snapshot Score rule also apply to the Increasing Tip Score rule. In particular,
+If we switch to using the Increasing Tip Score rule, then it would be more consistent for block producers to also change the tie‑breaking rule for choosing to use the tip score, i.e. .
+A variation on this suggestion effectively keeps both the Increasing Snapshot Score rule and the Increasing Tip Score rule:
+Combined Increasing Score rule: Either ( and ), or .
+Note that if , both scores are necessarily equal.
+This variation does not simplify honest bft‑proposer behaviour.
+Basic idea: Detect the case where the bc‑snapshot is rolling back, and impose a longer confirmation depth to switch to the new bc‑chain. Also temporarily stall finalization of the existing bc‑chain until the conflict has been resolved.
+Let be the Crosslink 1 definition of , i.e.
+When , we want to go into a mode where we require a longer confirmation depth , say . Because we don’t know in this situation whether the old bc‑chain or the new bc‑chain will win, we stop finalizing both until a winner is clear.
+The simplest option is to record the state saying that we are in this mode explicitly, and add a consensus rule requiring it to be correct. That is, add an field to bft‑proposals and bft‑blocks, and add a bft‑proposal and bft‑block validity rule as follows:
+where:
+It is intentional that takes precedence over .
+Then redefine as follows:
+Since , the recursion will terminate.
+Note that there is an interaction between the Increasing Snapshot Score rule and this change: the Increasing Snapshot Score rule should arguably use instead of . The Increasing Tip Score rule, on the other hand, works fine as‑is, and so it makes sense to use both of these changes together. The combination of both changes also fixes the con discussed above for the Increasing Tip Score rule; it ensures that the score of the snapshot must increase.
+Pros:
+Cons:
+In the case of a private mining attack, the adversary will typically conceal the existence of the overtaking chain until it can be used to cause a rollback in $\mathsf{LOG}_{\mathrm{bda}}$. So the approach used in the previous section seems to be all we can do against such an attack.
+In the case of a partitioning attack, on the other hand, the adversary relies on honest nodes to do mining work on each side of the partition. This relies on the successful miners on each side knowing about their chain, but not the chain on the other side. Subtly, it does not rely on a perfect network partition. An adversary could, for example, attempt to create partitions around the most successful mining pools. Occasional leaks of information across a partition also do not necessarily foil the attack unless that information gets to a successful miner. Therefore, measures that constrain the adversary’s ability to make use of an incomplete partition can be useful.
+This also has the benefit of making the protocol more robust against non‑malicious incomplete partitions.
+Given that in such an attack the competing chains may be visible to some proposers, there is the possibility of detecting a potential rollback even before it gets snapshotted, by using the fact that previous bft‑blocks created by honest bft‑proposers have been recording the bc‑best‑chain tip blocks ahead. Also, depending on what proportion of validators an adversary has, they may rely on honest validators on each side to ensure that a snapshot of each chain appears in a bft‑valid block; in that case, including information about competing chains in validators' votes (see the next subsection) may be useful.
+It is still possible that if an adversary has several consecutive proposal slots, it can get its chain snapshotted. However, if there is an intervening slot with an honest proposer, we can potentially compare its tip with the adversary’s tip and anticipate the need to go into mode.
+In order to get this to work, we need to propose a definition to identify bc‑chains that are competing with the current best chain, such that there is some risk of a “long” rollback to a competing chain. Let be a measure of how close (in terms of bc‑blocks) a competing chain’s score needs to be to that of the bc‑best‑chain, and let be a lower bound on the rollback depth we would consider significant if the competing chain were to immediately catch up. (The condition is necessary to avoid false positives that might only be a single‑block fork.)
+A node identifies ‑competing chains as follows based on its current view at time :
+Details, including how to modify the and conditions.
+For now we will assume that all of the competing chain information in a bft‑block has to be checked as bc‑block‑valid in order for that block to be bft‑block‑valid. This might introduce validation DoS attacks and needs to be considered more carefully.
+This complements the above idea by letting a validator that has seen a competing chain signal it in its signed vote. Then, as long as the adversary is reliant on some votes from honest validators that are signalling the existence of competing chains, we would go into mode without relying on honest proposers to have an intervening slot.
+The notarization proof that appears in a bft‑block would need to be modified to preserve these signals. More precisely, it is necessary for a bft‑block to preserve at least:
+This is also motivated by the suggested change in the next section.
+Enforcing this is relatively straightforward if the evidence is a SNARK. It can also be enforced with aggregate signatures even for schemes that only allow aggregation of signatures over a common message: we just collect the distinct messages (corresponding to either “no competing chain” or each distinct competing chain) and aggregate them separately.
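The "aggregate separately per distinct message" approach can be sketched as a simple grouping step. This is illustrative only: votes are modelled as `(message, signature, validator_id)` tuples, and `aggregate` is a caller-supplied placeholder for whatever same-message aggregation the chosen signature scheme provides. No particular signature library is assumed.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Vote = Tuple[bytes, bytes, str]  # (signed message, signature, validator id)


def aggregate_by_message(
    votes: Iterable[Vote],
    aggregate: Callable[[List[bytes]], bytes],
) -> Dict[bytes, Tuple[bytes, List[str]]]:
    """Group votes by their signed message and aggregate each group on its own.

    Returns, for each distinct message (for example "no competing chain" or
    one message per distinct competing chain), the aggregated signature and
    the list of validators that signed that message.
    """
    signatures: Dict[bytes, List[bytes]] = defaultdict(list)
    signers: Dict[bytes, List[str]] = defaultdict(list)
    for message, signature, validator_id in votes:
        signatures[message].append(signature)
        signers[message].append(validator_id)
    return {
        message: (aggregate(signatures[message]), signers[message])
        for message in signatures
    }
```

The resulting evidence contains one aggregate per distinct signal, so its size grows with the number of distinct competing chains mentioned rather than with the number of validators.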
+Assume that votes include competing chain information as discussed above. We can assume that an honest proposer has read all of this information from its parent bft‑block. Therefore, we can require the tip score of its proposal to have at least the score of the best tip implied by that information:
+Let be the tip mentioned in bft‑block with the highest score. A bft‑block “mentions” the two best tips defined in the previous section.
+Strong Increasing Tip Score rule: Either or .
+Note that this rule is really quite constraining for a potential adversary, especially in partitioning attacks. It means that if the adversary does not want to acknowledge the existence of a given chain, it cannot use any votes or build on any previous bft‑block that signals the existence of that chain. Essentially, a partitioning adversary with control over only the minimum one‑third of the stake would have to ensure a perfectly complete partition; it could not get away with any information leakage between honest validators.
+The Crosslink 1 design imposes a finalization latency of at least block times. Intuitively, this is because in is at least blocks back from (as argued in Questions about Crosslink 1), and therefore blocks back from . So the total finalization latency is block times + BFT overhead + block times + snapshot overhead.
+However, the snapshot headers contain information about the proposer’s bc‑best‑chain.
+Define . Although it is not guaranteed, normally will be an ancestor of . What if we were to optimistically allow the last snapshot to be taken as ? After all, we know that is confirmed.
+Oh, this won’t work. The problem is that we want safety of not to depend on safety of . So we cannot assume (for this purpose) that nodes see the same .
+What if we instead take this to be the definition of $\mathsf{LOG}_{\mathrm{opt}}$, replacing $\mathsf{LOG}_{\mathrm{bda}}$ (“opt” meaning optimistic)?
+As stated, a malicious proposer can try to maximize the latency of $\mathsf{LOG}_{\mathrm{opt}}$ (subject to the Increasing Score rule). For example, if there exists a fork of length , the malicious proposer can force the latency of $\mathsf{LOG}_{\mathrm{opt}}$ to be block times + BFT overhead. However, this can be improved by applying the idea to each bft‑block in turn after the one pointed to by the best confirmed bc‑block. Then a malicious proposer cannot do anything that it could not do anyway (keeping the finalization point at its current position).
+Pros:
+Cons:
+The following idea is broken for safety when has been subverted:
+We have two potential sources of information about blocks that could plausibly be considered finalized:
+We cannot rely only on 1. because we want assured finalization even under partition.
+We cannot rely only on 2. because if has been subverted, then the chain of final bft‑blocks could fork.
+But intuitively, if we combine these sources of information, using them over the Crosslink 1 finalization only when they are consistent, the resulting protocol should still be as safe as the safer of and . In particular, 2. will not roll back unless has been subverted.
+If this idea were to pan out, it could improve the latency of finalization by block times.
+This approach is essentially a hybrid of Snap‑and‑Chat and Crosslink 1:
+To explain the safety problem with this idea: suppose that has been subverted. In that case it is possible for a snapshot to be finalized without having been confirmed as in any honest node’s bc‑best‑chain; that is, it is possible for to include transactions from a snapshot in bft‑block such that is not on the consensus bc‑best‑chain. And, because has been subverted, it is also possible that a conflicting final bft‑block omits . And so a node that has seen will think that it is consistent with the best bc‑chain (so that its does not include but does include later transactions on the consensus bc‑best‑chain), but a node that has seen will compute a that does include .
+More detailed specification of the above broken idea.
+ +Define as before.
+For simplicity assume that extends by only one bft‑block. (This assumption could have been removed if the idea had panned out.)
+Then this proposal was to consider this bc‑block as contributing the last finalized snapshot: +
+There is no need for a tie‑breaking rule for 2.: if we ever see two context bft‑blocks for which the last‑final blocks are conflicting, we know that has been subverted, so we should stall or crash.
+Caveat: for a given node, can in theory roll back past , therefore can also roll back. It is okay if we keep state here and refuse to roll back. We should set a “crisis flag”, and unset it if at any point extends . (If is safe and live, it will.)
+A similar rule that would give the same result in almost all circumstances is:
+The answer given for this question at The Crosslink 2 Construction is:
+++If this were enforced, it could be an alternative way of ensuring that every bft‑proposal snapshots a new bc‑block with a higher score than previous snapshots, potentially making the Increasing Score rule redundant. However, it would require merging bc‑block‑producers and bft‑proposers, which could have concerning knock‑on effects (such as concentrating security into fewer participants).
+
This may have been too hasty. It is not clear that merging bc‑block‑producers and bft‑proposers actually does “concentrate security into fewer participants” in a way that can have any harmful effect.
+Remember, the job of a bft‑proposer in Crosslink is primarily to snapshot the bc‑best‑chain (even more so if the Increasing Tip Score rule is adopted). An honest miner by definition is claiming to build on the best chain, and miners have a strong economic incentive to do so. Therefore, it is entirely reasonable for every newly produced block to be treated as a bft‑proposal. This arguably decentralizes the task of proposing bft‑blocks more effectively than using a leader election protocol would — especially given that in a hybrid protocol we necessarily still rely on there being sufficient honest miners.
+[DKT2021], for example, argues for the importance of “the complete unpredictability of who will get to propose a block next, even by the winner itself.” The main basis of this argument is that it makes subversion of the proposer significantly more difficult. A PoW protocol has that property, and most PoS protocols do not. (It is not that PoS protocols are unable to provide this property; indeed, [DKT2021] constructs a PoS protocol, “PoSAT”, that provides it.)
+So let’s explore this in more detail. A newly produced bc‑block would implicitly be a bft‑proposal with itself as the tip. The field is therefore not needed. The Tail Confirmation rule goes away since its intent is automatically satisfied. This is already a significant simplification.
+The inner proposer signature is also not needed (since the bc‑header is self-authenticating), but the block producer would have to include a public key that can be used to verify its outer signature. It would sign the notarized bft‑block with the corresponding private key. This change is a wash in terms of protocol complexity.
+Considered as a bft‑proposal, a bc‑block needs to refer to a parent bft‑block, which requires a field in the bc‑header. With some caveats depending on the design of , it might be possible to merge this with the field, but for now we will assume that it is not merged.
+What are the caveats?
+ +If we are in an execution where Final Agreement holds for , then it is possible to show that merging the two fields has no negative effect, provided that has no additional rules that could disallow it in some cases.
+This is because, by Final Agreement, for any potential bft‑block that the bc‑block‑producer of a new block could choose as . Suppose that the bc‑block‑producer freely chooses according to the desired honest behaviour for a bft‑proposer in , and then chooses to be the same block (which is always reasonable as long as it is allowed).
+In the case , we are done, because this choice of is allowed by the Extension rule.
+In the case , we can argue that would be a better choice than for as well as for , because it has a later final ancestor. This is where the argument might fall down if (and therefore ) has any additional rules that could disallow this choice. For now let’s suppose that situation does not arise, but it is one of the caveats.
+Another potential problem is that in an execution where Final Agreement does not hold for , we can no longer infer that either or . In particular it could be the case that the producer of was adversarial, and chose in such a way as to favour its own bft‑block that is final in that context.
+However, in that situation it must be possible for the bc‑block‑producer to see (and prove) that the bft‑chain has a final fork. That is, it can produce a witness to the violation of Final Agreement, showing that does not hold, as discussed in the section Recording more info about the bft‑chain in bc‑blocks above.
+The second caveat is that in that situation, we still need to set and in order to be able to recover, and they typically should not be the same in order to do so.
+The Increasing Tip Score rule is still needed, but it can be simplified. A newly produced bc‑block is also a bft‑proposal such that . This would yield the following bft‑proposal / bc‑block validity rule:
+++[Candidate rule for discussion] Either or .
+
except that cannot be , because is newly produced. It turns out we can just drop that part:
+++Increasing Tip Score (producer = proposer) rule: .
+
This works because, if does not have a higher score than the bc‑block , the bc‑block‑producer should instead have built on top of that bc‑block — which was necessarily known to the producer in order for it to set in the header of the new block.
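+The simplified rule can be sketched as a validity check. This is only an illustration: the `BcBlock` and `BftBlock` types, the `tip` and `parent_bft` field names, the integer `score`, and the exemption for a block with no parent bft‑block are all assumptions made for the example, not part of the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BftBlock:
    """Hypothetical stand-in for a notarized bft-block; only its tip matters here."""
    tip: "BcBlock"


@dataclass(frozen=True)
class BcBlock:
    """Hypothetical bc-block that doubles as a bft-proposal (producer = proposer)."""
    name: str
    score: int                              # toy cumulative score of the chain ending here
    parent_bft: Optional[BftBlock] = None   # parent bft-block named in the header


def satisfies_increasing_tip_score(new_block: BcBlock) -> bool:
    """True iff the new block strictly outscores the tip of its parent bft-block."""
    if new_block.parent_bft is None:
        # Assumption for the sketch: a block with no parent bft-block is exempt.
        return True
    return new_block.score > new_block.parent_bft.tip.score


old_tip = BcBlock("T", score=100)
parent = BftBlock(tip=old_tip)
fresh = BcBlock("H1", score=101, parent_bft=parent)   # built on top of the known tip
stale = BcBlock("H2", score=100, parent_bft=parent)   # does not outscore the known tip
```

+Here `stale` is rejected: its producer necessarily knew of `old_tip` in order to name `parent`, and so should have built on `old_tip` instead.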
+The voting would be the same, performed by the same parties. Therefore, there is no concentration of voting into fewer parties. There is no change in the producer/proposer’s incentive to make the bft‑notarization‑proof or its soundness properties. Everything else is roughly the same, including the use of the field of a bc‑block and the validity rules related to it. As far as I can see, all of the security analysis goes through essentially unchanged.
+There may be some complication due to the fact that BFT protocols are typically designed to use epochs with a fixed period, whereas bc‑blocks are found at less predictable intervals. However, as long as BFT messages are labelled with the bc‑block they apply to, it seems like most BFT protocols would be tolerant to this change. In fact the adaptations of Snap‑and‑Chat to Hotstuff and PBFT in [NTT2020] already assume that BFT messages can be queued and processed at a later time, and rely on those BFT protocols' tolerance to this.
+In most PoS protocols, the requirement to have a minimum amount of stake in order to make a proposal acts as a gatekeeping filter on proposals, and potentially allows parties that make invalid proposals to be slashed.
+Strictly speaking, whether there is a stake requirement to make a proposal is independent of whether bc‑block‑producers (e.g. miners) are merged with bft‑proposers. It could be, for example, that a miner is still able to produce bc‑blocks, but is not able to make them into a proposal unless they satisfy a stake requirement. (This would have significant effects on the economics of mining that would need to be analyzed, and that might have governance consequences.)
+In a system that uses PoS, validators by definition need to have stake in order to control the ability to vote. This also allows validators to be slashed.
+On the other hand, there is no technical reason why the ability to make a bft‑proposal has to be gatekept by a stake requirement — given the situation of Zcash in which we already have a mining infrastructure, and that in a Snap‑and‑Chat or Crosslink‑style hybrid protocol we necessarily still rely on miners not to censor transactions. The potential to make proposals that are expensive to validate as a denial of service is made sufficiently difficult by proof‑of‑work. This option has probably been underexplored by previous PoS protocols because they cannot assume an existing mining infrastructure.
+It could be argued that this approach goes less far toward a pure PoS‑based block‑chain protocol, leaving more to be done in the second stage. However, there is a clear route to that second stage, by replacing PoW with a protocol like PoSAT that has full unpredictability and dynamic availability. PoSAT does this using a VDF, and as it happens, Halo 2 is a strong candidate to be used to construct such a VDF.
+If the arguments in [DKT2021] about the need for proposer unpredictability are persuasive, then this approach defers the complexity of requiring a VDF without losing any security, since Zcash’s PoW is already unpredictable.
+Building on the previous idea, we can try to eliminate the explicit bft‑chain by piggybacking the information it would hold onto a bc‑block (in the header and/or the block contents). In the previous section we merged the concepts of a bft‑proposal and a bc‑block; the and fields of a bft‑proposal were moved into the and fields of a bc‑header respectively. A field was also added to hold the producer’s public key, so that the producer can sign the bft‑block constructed from it using the corresponding private key.
+This left the concept of a bft‑block intact. Recall that in Crosslink 1, a bft‑block consists of signed by the proposer. So in “Crosslink with proposer = producer”, a bft‑block consists of signed by the producer.
+What if a bc‑block were to “inline” its parent and context bft‑blocks rather than referring to them? I.e. a bc‑block with referring to signed for , would instead +literally include (either in the header or the coinbase transaction) signed for — and similarly for .
+It would still be necessary to have the message type that the proposer/producer previously used to submit a notarized bft‑block. (It cannot be merged with a bc‑block announcement: the producer of a new block is not in general the producer of its parent, and their incentives may differ; also we cannot wait until a new block is found before publishing the previous notarization.) It would also still be necessary for Crosslink nodes to keep track of notarizations that have not made it into any bc‑block. Nevertheless, this is a potential simplification.
+Note that unless notarization proofs are particularly short and constant-length, it would not be appropriate to include them in the bc‑block headers, and so they would have to go into the coinbase transaction or another similar variable-length data structure. In that case we would still have an indirection to obtain the bft‑block information; it would just be merged with the indirection to obtain a coinbase transaction (or similar) — which is already needed in order to check validity of the bc‑block.
+As discussed under Recording more info about the bft‑chain in bc‑blocks above, we might in any case want to record information about other proposed and notarized bft‑blocks, and the data structure needed for this would necessarily be variable-length. The complexity burden of doing so would be shared between these two changes.
+It would be possible to save some space in headers (while keeping them fixed-length), by inlining only one of and in the header and keeping the other as a hash. As discussed under “What are the caveats?” above, the only reason for the two bft‑blocks referred to by these fields to be different, is that the bc‑block producer has observed a violation of Final Agreement in . In that case, we can include an inlining of the other block, and any other information needed to prove that a violation of Final Agreement has occurred, in a variable-length overflow structure.
+Pros:
+Con:
+A potential simplification can be obtained by combining the following two ideas:
+Str4d’s suggestion can be written as:
+Linearity rule: .
+Notice that this implies the existing Increasing Score rule in Crosslink 1, because score necessarily increases within a bc‑valid‑chain. Therefore it would in practice be a replacement for the Increasing Score rule. It does not imply the Increasing Tip Score rule discussed above, and in fact it could make sense to enforce both the Linearity rule and the Increasing Tip Score rule.
+The Linearity rule implies that it is no longer possible for a bft‑valid‑chain to snapshot a bc‑chain that rolls back relative to the previous snapshot. This makes it unnecessary to sanitize : the sequence of snapshots considered by is linear, and so the “sanitization” would just return the transactions in the last snapshot.
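+As a rough illustration of what the Linearity rule checks, the sketch below assumes (following the prose above) that the rule requires the snapshot of a bft‑block's parent to be an ancestor of, or equal to, the bft‑block's own snapshot. The `BcBlock` type, the chain-length score, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BcBlock:
    """Hypothetical bc-block identified by its name and parent link."""
    name: str
    parent: Optional["BcBlock"] = None

    @property
    def score(self) -> int:
        # Toy score: number of blocks after genesis (real scores are cumulative work).
        return 0 if self.parent is None else self.parent.score + 1


def is_ancestor_or_equal(a: BcBlock, b: BcBlock) -> bool:
    """True iff `a` lies on the chain ending at `b` (including a == b)."""
    cur: Optional[BcBlock] = b
    while cur is not None:
        if cur == a:
            return True
        cur = cur.parent
    return False


def satisfies_linearity(parent_snapshot: BcBlock, snapshot: BcBlock) -> bool:
    """Assumed form of the rule: the new snapshot extends the previous one."""
    return is_ancestor_or_equal(parent_snapshot, snapshot)


genesis = BcBlock("G")
a = BcBlock("A", genesis)
b = BcBlock("B", a)          # extends A
c = BcBlock("C", a)          # sibling of B: snapshotting C after B would roll B back
```

+In this toy model a snapshot of `c` following a snapshot of `b` is rejected, while any accepted pair has a non-decreasing score, which is why the rule subsumes the Increasing Score rule.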
+To remove the need to sanitize as well, we need a further modification to . Recall that in Crosslink 1 we define: +The Linearity rule ensures that is a linear sequence of snapshots, but for to be linear, we also need . In order for this to hold for any choice of with , we require the strongest version of this condition with , i.e. .
+Since we can only enforce that this holds for by enforcing that it holds for an arbitrary bc‑valid‑block , the rule becomes:
+Last Final Snapshot rule: .
+This is exactly Nate’s suggestion discussed in Questions about Crosslink. In that document we argued against this rule, but that argument was made in the context of a protocol without the Linearity rule (and originally, even without the Increasing Score rule).
+Combining the Linearity rule and Last Final Snapshot rule, on the other hand, completely eliminates the need for sanitization. This could be a huge simplification — and potentially safer, since it would avoid breaking assumptions that may be made by existing Zcash node implementations and other consumers of the Zcash block chain.
+To spell out the resulting simplifications to the definitions of \LOG^t_{\fin,i} and \LOG^t_{\bda,\mu,i}, we would just have:
+\begin{array}{rcl}
+\LOG_{\fin,i}^t &:=& \snapshotlf{\ch_i^t \trunc_{\bc}^\sigma} \\
+\LOG_{\bda,\mu,i}^t &:=& \ch_i^t \trunc_{\bc}^\mu
+\end{array}
+Here it is no longer necessary to define \LOG^t_{\fin,i} and \LOG^t_{\bda,\mu,i} as sequences of transactions, since the final and bounded-available chains are both just bc‑valid‑chains.
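+A minimal sketch of these two definitions follows, treating a chain as a list of block names. The `trunc` helper assumes that truncation never removes the genesis block, and `snapshot_lf` is passed in as a caller-supplied function because computing the snapshot of the last final bft‑block depends on protocol state not modelled here. All names are hypothetical.

```python
from typing import Callable, List, Sequence

Block = str  # hypothetical: a block is identified by its name


def trunc(chain: Sequence[Block], depth: int) -> List[Block]:
    """Drop the last `depth` blocks, keeping at least genesis (an assumption)."""
    keep = max(len(chain) - depth, 1)
    return list(chain[:keep])


def log_bda(chain: Sequence[Block], mu: int) -> List[Block]:
    """Bounded-available ledger: the node's best chain truncated by mu blocks."""
    return trunc(chain, mu)


def log_fin(
    chain: Sequence[Block],
    sigma: int,
    snapshot_lf: Callable[[Sequence[Block]], List[Block]],
) -> List[Block]:
    """Finalized ledger: snapshot of the last final bft-block seen sigma blocks back."""
    return snapshot_lf(trunc(chain, sigma))


# Toy stand-in: pretend finalization always lags two blocks behind the given tip.
def toy_snapshot_lf(chain: Sequence[Block]) -> List[Block]:
    return trunc(chain, 2)


best_chain = ["G", "b1", "b2", "b3", "b4", "b5", "b6"]
fin = log_fin(best_chain, sigma=3, snapshot_lf=toy_snapshot_lf)
bda = log_bda(best_chain, mu=1)
```

+Both results are plain chains rather than sanitized transaction sequences, and in this toy run the finalized chain is a prefix of the bounded-available one.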
+The definition of in the Finality depth rule becomes much simpler: As before, either or .
+Avoiding sanitization also means that the bug we described in Snap‑and‑Chat, that could prevent spending outputs from a snapshotted chain after a -block rollback, cannot occur by construction. That is, the changes in contextual validity relative to are not needed any more.
+This almost seems too simple, and indeed we should be skeptical, because the security analysis essentially has to be redone. The reason why Snap‑and‑Chat didn’t take this approach is that it requires a more complicated argument to show that it is reasonable to believe in the safety assumptions of whenever it is reasonable to believe in the corresponding assumptions for . This is because the ... We will need to do some work to show that the changes are benign.
+The key observation needed for this analysis is that neither the Linearity rule nor the Last Final Snapshot rule affect the evolution of unless we are in a situation where its Prefix Consistency or Prefix Agreement properties would be violated.
+This implies that any safety property that we can prove given Prefix Consistency plus ****.
+ +Rationale: Can we outright prevent rollbacks > from ever appearing in ?
+This document analyzes the effect of this rule on its own. For the effect in combination with an additional Linearity rule, see the Linearity and Last Final Snapshot rules section of Potential changes for Crosslink 2.
+Daira-Emma: In a variant of Crosslink with this rule, an adversary’s strategy would be to keep the fields in its blocks as such that when the attack starts, and then fork from the bc‑best‑chain that extends . If its private chain falls behind the public bc‑best‑chain, it resets, just like in a conventional private mining attack.
+Note that the proposed rule does not prevent the adversary’s private chain from just staying at the same block. The reason is that Crosslink does not change the fork-choice rule of . That is, even if the adversary’s chain has a that is far behind the current bft‑block, it is still allowed to become the bc‑best‑chain.
+(Eventually the adversary’s chain using this strategy will hit the finality gap bound of blocks. But that must be significantly greater than , to avoid availability problems. So it does not prevent the adversary from performing a rollback longer than blocks before they hit the bound. Also, going into Stalled Mode for new blocks does not prevent the attacker’s chain from having included harmful transactions before that point.)
+It is possible to change the fork-choice rule, for example so that the bc‑best‑chain for a node is required to extend where is the last final block for any bft-chain in node ’s view.
+This would break the current safety and liveness arguments for Crosslink. But for the sake of argument, suppose we did it anyway.
+The adversary’s strategy would change slightly: it resets if either its private chain falls behind the public bc‑best‑chain, or its private chain is invalidated because it forks before for some new last final block of a bft-chain. During the attack, it also attempts to impede progress of the BFT protocol as far as possible.
+In that case, the proposed rule still does not preclude a rollback of more than blocks, for several reasons:
+In general we can’t say anything about how many bc‑blocks are mined in any given interval, so it could be the case that more than blocks are mined on both the honest chain and the adversary’s chain before it would be realistically possible to go through even a single round of the BFT protocol.
+Nor can we say anything about how quickly those blocks are finalized, unless we enforce it, which we don’t. (In Crosslink we do enforce a finalization gap bound , but as explained above must be significantly greater than , so that doesn’t really help.)
+will typically be at least blocks back from . +The argument for that goes:
+++None of the block hashes in can point to because that would be a hash cycle. In a typical case where no block withholding and no other rollback (not caused by the adversary) occurs on the honestly mined chain, the proposer of the last final block before a context bft‑block that can point to will have, at the latest, as . Under these conditions, will point, at the latest, to blocks before , i.e. blocks before .
+
This means that by the time could catch up to , on average block times will have occurred. So, roughly speaking, the rule that does not usefully constrain the adversary until after block times. Wlog let’s assume uses PoW: if is chosen reasonably, then being able to do a -block rollback at all probably requires having somewhere close to 50% of network hash rate. And so in block times the adversary has a significant chance of being able to do the required rollback before the suggested rule “kicks in”. (It also has however much additional latency is added by the BFT protocol, which is simultaneously under attack to maximize this latency.)
+All that said, does the suggested rule help? First we have to ask whether it introduces any weaknesses.
+Okay, but is it a good idea to make that change to the fork-choice rule anyway?
+Probably not. I don’t know how to repair the safety and liveness arguments.
+The change was that the bc‑best‑chain for a node would be required to extend where is the last final bft‑block in node ’s view.
+From the point of view of any modular analysis that treats as potentially subverted, we cannot say anything useful about . It seems as though any repair would have to assume much more about the BFT protocol than is desirable.
+In general, changes to fork‑choice rules are tricky; it was a fork-choice rule problem that allowed the liveness attack against Casper FFG described in [NTT2020, Appendix E].
+What if validators who see that a long rollback occurred, refuse to vote for it?
+Yep that is allowed. The rule is “An honest validator will only vote for a proposal if ...” (not if‑and‑only‑if). If an honest validator sees a “good” reason not to vote for a proposal, including reasons based on out‑of‑band information, they should not. The Complementarity argument made in The Argument for Bounded Availability and Finality Overrides actually depends on this. Obviously, it may affect BFT liveness (and that’s okay).
+The only reason why we don’t make this part of the voting condition is that it’s a stateful rule. A new validator could come along and wouldn’t have the state needed to enforce it. Perhaps that could be fixed.
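+The "only if" nature of the voting rule can be sketched as follows. The proposal representation, the `MAX_TOLERATED_ROLLBACK` threshold, and the particular veto shown are hypothetical; the point is only that the protocol conditions are necessary, while a validator's local policy may add further reasons to abstain.

```python
from typing import Callable, Dict, Iterable

Proposal = Dict[str, int]            # hypothetical proposal representation
Check = Callable[[Proposal], bool]


def honest_vote(
    proposal: Proposal,
    necessary_conditions: Iterable[Check],
    local_vetoes: Iterable[Check] = (),
) -> bool:
    """Vote only if every protocol condition holds and no local veto applies."""
    if not all(cond(proposal) for cond in necessary_conditions):
        return False
    return not any(veto(proposal) for veto in local_vetoes)


MAX_TOLERATED_ROLLBACK = 10          # hypothetical local policy value


def is_protocol_valid(p: Proposal) -> bool:
    return p["valid"] == 1


def saw_long_rollback(p: Proposal) -> bool:
    # Stateful in practice: depends on what this validator has observed locally.
    return p["observed_rollback_depth"] > MAX_TOLERATED_ROLLBACK


ordinary = {"valid": 1, "observed_rollback_depth": 0}
suspicious = {"valid": 1, "observed_rollback_depth": 50}
invalid = {"valid": 0, "observed_rollback_depth": 0}
```

+A validator without the relevant history would simply run with no vetoes, which is the statefulness concern raised above.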
+ +This document analyzes the security of Crosslink 2 in terms of liveness and safety. It assumes that you have read The Crosslink 2 Construction which defines the protocol.
+First note that Crosslink 2 intentionally sacrifices availability if there is a long finalization stall.
+This is technically independent of the other changes; you can omit the Finality depth rule and the protocol would still have security advantages over Snap‑and‑Chat, as well as being much simpler and solving its “spending from finalized outputs” issue. In that case the incentives to “pull” the finalization point forward to include new final bft‑blocks would be weaker, but honest bc‑block‑producers would still do it.
+It would still be a bug if there were any situation in which failed to be live, though, because that would allow tail‑thrashing attacks.
+Crosslink 2 involves a bidirectional dependence between and . The Ebb‑and‑Flow paper [NTT2020] argues in Appendix E ("Bouncing Attack on Casper FFG") that it can be more difficult to reason about liveness given a bidirectional dependence between protocols:
+++To ensure consistency among the two tiers [of Casper FFG], the fork choice rule of the blockchain is modified to always respect ‘the justified checkpoint of the greatest height [*]’ [22]. There is thus a bidirectional interaction between the block proposal and the finalization layer: blocks proposed by the blockchain are input to finalization, while justified checkpoints constrain future block proposals. This bidirectional interaction is intricate to reason about and a gateway for liveness attacks.
+
[*] The quotation changes this to “[depth]”, but our terminology is consistent with Ethereum here and not with [NTT2020]’s idiosyncratic use of “depth” to mean “block height”.
+The argument is correct as far as it goes. The main reason it does not present any great difficulty for proving liveness of Crosslink is a fundamental difference from Casper FFG: in Crosslink 2 the fork‑choice rule of is not modified.
+Let be the subset of bc‑blocks . Assume that is such that a bc‑block‑producer can always produce a block in .
+In that case it is straightforward to convince ourselves that the additional bc‑block‑validity rules are never an obstacle to producing a new bc‑block in :
+The instructions to an honest bc‑block‑producer allow it to produce a block in . Therefore, remains live under the same conditions as .
+The additional bft‑proposal‑validity, bft‑block‑validity, bft‑finality, and honest voting rules are also not an obstacle to making, voting for, or finalizing bft‑proposals:
+Therefore, remains live under the same conditions as .
+The only other possibility for a liveness issue in Crosslink 2 would be if the change to the constructions of or could cause either of them to stall, even when and are both still live.
+However, liveness of and the Linearity rule together imply that at each point in time, provided there are sufficient honest bft‑proposers/validators, eventually a new bft‑block with a higher-scoring snapshot will become final in the context of the longest bft‑valid‑chain. TODO: make that more precise.
+Because of the Extension rule, this new bft‑block must be a descendant of the previous final bft‑block in the context visible to bc‑block‑producers. Therefore, the new finalized chain will extend the old finalized chain.
+Finally, we need to show that Stalled Mode is only triggered when it should be; that is, when the assumptions needed for liveness of are violated. Informally, that is the case because, as long as there are sufficient honest bc‑block‑producers and sufficient honest bft‑proposers/validators, the finalization point implied by the field at the tip of the bc‑best‑chain in any node’s view will advance fast enough for the finalization gap bound not to be hit. This depends on the value of relative to , the network delay, the hash rate of honest bc‑block‑producers, the number of honest bft‑proposers and the proportion of voting units they hold, and other details of the BFT protocol. TODO: a more detailed argument is needed, especially for the dependence on .
+Not updated for Crosslink 2 below this point.
+Recall the definition of Assured Finality.
++
An execution of Crosslink 2 has Assured Finality iff for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , we have .
+First we prove that Assured Finality is implied by Prefix Agreement of .
++
An execution of has Prefix Agreement at confirmation depth , iff for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , we have .
++
In an execution of Crosslink 2 for which the subprotocol has Prefix Agreement at confirmation depth , that execution has Assured Finality.
+Proof: Suppose that we have times , and nodes , (potentially the same) such that is honest at time and is honest at time . Then, by the Local fin-depth lemma applied to each of node and node , there exist times at which node is honest and at which node is honest, such that and . By Prefix Agreement at confirmation depth , we have . Wlog due to symmetry, suppose . Then (by transitivity of ) and (as above), so by the Linear prefix lemma.
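+The final step of the proof relies on the Linear prefix lemma. Its statement is not reproduced in this section, so the sketch below assumes the form suggested by how it is used: if two sequences are both prefixes of a third, then one of them is a prefix of the other (they "agree"). The code checks this exhaustively over short sequences; it is a sanity check, not a proof, and all names are hypothetical.

```python
from itertools import product
from typing import Iterator, List, Sequence


def is_prefix(a: Sequence[str], b: Sequence[str]) -> bool:
    """True iff `a` is a (not necessarily strict) prefix of `b`."""
    return len(a) <= len(b) and list(b[:len(a)]) == list(a)


def agrees(a: Sequence[str], b: Sequence[str]) -> bool:
    """True iff one of the two sequences is a prefix of the other."""
    return is_prefix(a, b) or is_prefix(b, a)


def all_sequences(alphabet: str, max_len: int) -> Iterator[List[str]]:
    """Every sequence over `alphabet` of length at most `max_len`."""
    for n in range(max_len + 1):
        for seq in product(alphabet, repeat=n):
            yield list(seq)


seqs = list(all_sequences("xy", 3))

# Search for a counterexample to the assumed lemma statement.
counterexamples = [
    (a, b, c)
    for a in seqs
    for b in seqs
    for c in seqs
    if is_prefix(a, c) and is_prefix(b, c) and not agrees(a, b)
]
```

+No counterexample exists, since two prefixes of the same sequence can differ only in length.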
+Then we prove that Assured Finality is also implied by Final Agreement of .
++
An execution of has Final Agreement iff for all bft‑valid blocks in honest view at time and in honest view at time , we have .
++
In an execution of Crosslink 2 for which the subprotocol has Final Agreement, that execution has Assured Finality.
+Proof: Suppose that we have times , and nodes , (potentially the same) such that is honest at time and is honest at time . Then, by the Local fin-depth lemma applied to each of node and node , there exist times at which node is honest and at which node is honest, such that and . By Prefix Agreement at confirmation depth , we have . Wlog due to symmetry, suppose . Then (by transitivity of ) and (as above), so by the Linear prefix lemma.
++
By the Local ba-depth lemma, we have:
+++In any execution of Crosslink 2, for any confirmation depth and any node that is honest at time , there exists a time such that .
+
Renaming to and to in the definition of Prefix Consistency gives:
+++An execution of has Prefix Consistency at confirmation depth , iff for all times and all nodes , (potentially the same) such that is honest at time and is honest at time , we have that .
+
Since any node that is honest at time is also honest at time , and by transitivity of , we therefore have:
+++In any execution of Crosslink 2 that has Prefix Consistency at confirmation depth , for all times and all nodes , (potentially the same) such that is honest at time and is honest at time , we have that .
+
The Extension rule ensures that, informally, if a given node ’s view of its bc‑best‑chain at a depth of blocks does not roll back, then neither does its view of the bft‑final block referenced by its bc‑best‑chain, and therefore neither does its view of .
+This does not by itself imply that all nodes are seeing the “same” confirmed bc‑best‑chain (up to propagation timing), or the same . If the network is partitioned and is subverted, it could be that the nodes on each side of the partition follow a different fork, and the adversary arranges for each node’s view of the last final bft‑block to be consistent with the fork it is on. It can potentially do this if it has more than one third of validators, because if the validators are partitioned in the same way as other nodes, it can vote with an additional one third of them on each side of the fork.
+This is, if you think about it, unavoidable. doesn’t include the mechanisms needed to maintain finality under partition; it takes a different position on the CAP trilemma. In order to maintain finality under partition, we need not to be subverted (and to actually work!)
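+The arithmetic behind the "more than one third" threshold can be checked directly. The sketch assumes a quorum of strictly more than two thirds of voting units, which is typical of BFT protocols but is not stated in this section, and it assumes that the adversary votes on both sides of the partition while each honest validator votes only on its own side.

```python
from fractions import Fraction

# Assumption: a bft-block needs strictly more than 2/3 of voting units.
QUORUM = Fraction(2, 3)


def can_finalize_on_both_sides(
    adversary: Fraction, honest_left: Fraction, honest_right: Fraction
) -> bool:
    """True iff adversary votes plus each side's honest votes exceed the quorum."""
    if adversary + honest_left + honest_right != 1:
        raise ValueError("stake fractions must sum to 1")
    left_ok = adversary + honest_left > QUORUM
    right_ok = adversary + honest_right > QUORUM
    return left_ok and right_ok


def split_evenly(adversary: Fraction) -> bool:
    """Partition the honest stake into two equal halves and test both sides."""
    half = (1 - adversary) / 2
    return can_finalize_on_both_sides(adversary, half, half)


just_over_a_third = Fraction(1, 3) + Fraction(1, 100)
exactly_a_third = Fraction(1, 3)
under_a_third = Fraction(3, 10)
```

+With an even split, an adversary holding slightly more than one third reaches the quorum on both sides, while one holding exactly one third or less does not.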
+So what is the strongest security property we can realistically get? It is stronger than what Snap‑and‑Chat provides. Snap‑and‑Chat is unsafe even without a partition if is subverted. Ideally we would have a protocol with safety that is only limited by attacks “like” the unavoidable attack described above — which also applies to used on its own.
+In order to capture the intuition that it is hard in practice to cause a consistent partition of the kind described in the previous section, we will need to assume that the Prefix Agreement safety property holds for the relevant executions of . The structural and consensus modifications to relative to seem unlikely to have any significant effect on this property, given that we proved above that they do not affect liveness. ==TODO: that is a handwave; we should be able to do better, as we do for below.== So, to the extent that it is reasonable to assume that Prefix Agreement holds for executions of under some conditions, it should also be reasonable to assume it holds for executions of under the same conditions.
+Recall that .
++
If , are bc‑valid blocks with , then .
+Proof: Using the Extension rule, by induction on the distance between and .
+Using the Prefix Lemma once for each direction, we can transfer the Prefix Agreement property to the referenced bft‑blocks:
++
In an execution of that has Prefix Agreement at confirmation depth , for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , we have .
+Let be the sequence of transactions in the given chain , starting from genesis.
+Recall that
+Because takes the form , we have that . (This would be true for any well‑typed and .)
+By this observation and the Prefix Agreement Lemma, we also have that, in an execution of Crosslink 2 where has the Prefix Agreement safety property at confirmation depth , for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , .
+Because only considers previous state, ∘ must be a prefix-preserving map; that is, if then . Therefore:
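As an informal illustration of why a map that only consults earlier state preserves prefixes, here is a minimal Python sketch. The `sanitize` function, the toy `(sender, receiver, amount)` transaction format, and the balance table are all hypothetical stand-ins chosen for this example; they are not the sanitization function defined by the construction.

```python
from typing import Dict, List, Tuple

# Toy transaction format (an assumption for illustration only).
Tx = Tuple[str, str, int]  # (sender, receiver, amount)


def sanitize(txs: List[Tx], genesis: Dict[str, int]) -> List[Tx]:
    """Keep a transaction only if it is valid against the state built
    from the transactions that precede it. Later transactions are never
    consulted, which is what makes the map prefix-preserving."""
    balances = dict(genesis)
    kept: List[Tx] = []
    for sender, receiver, amount in txs:
        if amount > 0 and balances.get(sender, 0) >= amount:
            balances[sender] -= amount
            balances[receiver] = balances.get(receiver, 0) + amount
            kept.append((sender, receiver, amount))
    return kept


def is_prefix(a: List[Tx], b: List[Tx]) -> bool:
    """True iff list a is a prefix of list b."""
    return len(a) <= len(b) and b[: len(a)] == a


GENESIS = {"alice": 10}
LONG = [("alice", "bob", 7), ("alice", "carol", 7), ("bob", "carol", 5)]
SHORT = LONG[:2]  # SHORT is a prefix of LONG

# The second transaction overdraws alice and is dropped in both cases,
# so the sanitized SHORT log is still a prefix of the sanitized LONG log.
assert is_prefix(sanitize(SHORT, GENESIS), sanitize(LONG, GENESIS))
```

A filter that could look ahead (for example, dropping a transaction because a later one conflicts with it) would not have this property, which is why the restriction to previous state matters.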
+In an execution of Crosslink 2 where has Prefix Agreement at confirmation depth , for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , .
+Notice that this does not depend on any safety property of , and is an elementary proof. ([NTT2020, Theorem 2] is a much more complicated proof that takes nearly 3 pages, not counting the reliance on results from [PS2017].)
+In addition, just as in Snap‑and‑Chat, safety of can be inferred from safety of , which follows from safety of . We prove this based on the Final Agreement property for executions of :
+An execution of has the Final Agreement safety property iff for all origbft‑valid blocks in honest view at time and in honest view at time , we have .
+The changes in relative to only add structural components and tighten bft‑block‑validity and bft‑proposal‑validity rules. So for any legal execution of there is a corresponding legal execution of , with the structural additions erased and with the same nodes honest at any given time. A safety property, by definition, only asserts that executions not satisfying the property do not occur. Safety properties of necessarily do not refer to the added structural components in . Therefore, for any safety property of , including Final Agreement, the corresponding safety property holds for .
+By the definition of as above, in an execution of Crosslink 2 where has Final Agreement, for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , . Therefore, by an argument similar to the one above using the fact that ∘ is a prefix-preserving map:
+In an execution of Crosslink 2 where has Final Agreement, for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , .
+From the two Safety theorems and the Ledger prefix property, we immediately have:
+Let be an arbitrary choice of confirmation depth for each node . Consider an execution of Crosslink 2 where either has Prefix Agreement at confirmation depth or has Final Agreement.
+In such an execution, for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , either or .
+Corollary: Under the same conditions, if wlog , then .
+The above property is not as strong as we would like for practical uses of , because it does not say anything about rollbacks up to the finalization point, and such rollbacks may be of unbounded length. (Loosely speaking, the number of non‑Stalled Mode bc‑blocks after the consensus finalization point is bounded by , but we have also not proven that so far.)
+As documented in the Model for BFT protocols section of The Crosslink 2 Construction:
+++For each epoch, there is a fixed number of voting units distributed between the players, which they use to vote for a bft‑proposal. We say that a voting unit has been cast for a bft‑proposal at a given time in a bft‑execution, if and only if is bft‑proposal‑valid and a ballot for authenticated by the holder of the voting unit exists at that time.
+Using knowledge of ballots cast for a bft‑proposal that collectively satisfy a notarization rule at a given time in a bft‑execution, and only with such knowledge, it is possible to obtain a valid bft‑notarization‑proof . The notarization rule must require at least a two‑thirds absolute supermajority of voting units in ’s epoch to have been cast for . It may also require other conditions.
+A voting unit is cast non‑honestly for an epoch’s proposal iff:
++
+- it is cast other than by the holder of the unit (due to key compromise or any flaw in the voting protocol, for example); or
+- it is double‑cast (i.e. there are two ballots casting it for distinct proposals); or
+- the holder of the unit following the conditions for honest voting in , according to its view, should not have cast that vote.
+
+
An execution of has the one‑third bound on non‑honest voting property iff for every epoch, strictly fewer than one third of the total voting units for that epoch are ever cast non‑honestly.
++
By a well known argument often used to prove safety of BFT protocols, in an execution of Crosslink 2 where has the one‑third bound on non‑honest voting property (and assuming soundness of notarization proofs), any bft‑valid block for a given epoch in honest view must commit to the same proposal.
+Proof (adapted from [CS2020, Lemma 1]): Suppose there are two bft‑proposals and , both for epoch , such that is committed to by some bft‑block‑valid block , and is committed to by some bft‑block‑valid block . This implies that and have valid notarization proofs. Let the number of voting units for epoch be . Assuming soundness of the notarization proofs, it must be that at least voting units for epoch , denoted as the set , were cast for , and at least voting units for epoch , denoted as the set , were cast for . Since there are voting units for epoch , must have size at least . In an execution of Crosslink 2 where has the one‑third bound on non‑honest voting property, must therefore include at least one voting unit that was cast honestly. Since a voting unit for epoch that is cast honestly is not double-cast, it must be that .
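The counting step in this proof can be checked numerically. The sketch below assumes, following the notarization rule quoted above, that each notarized proposal has at least two thirds of the epoch's voting units cast for it, and that strictly fewer than one third of the units are cast non-honestly. The function names are hypothetical and exist only for this illustration.

```python
def min_quorum(n: int) -> int:
    """Smallest integer number of voting units that is at least 2n/3."""
    return (2 * n + 2) // 3  # integer form of ceil(2n/3)


def min_overlap(n: int) -> int:
    """Smallest possible intersection of two quorums drawn from n units."""
    return max(0, 2 * min_quorum(n) - n)


def max_non_honest(n: int) -> int:
    """Largest integer strictly less than n/3."""
    return (n - 1) // 3


def min_honest_in_overlap(n: int) -> int:
    """Lower bound on honestly cast units that appear in both quorums."""
    return min_overlap(n) - max_non_honest(n)


# For every n checked, the two quorums share at least one honestly cast
# voting unit, so the two proposals must be the same.
assert all(min_honest_in_overlap(n) >= 1 for n in range(1, 10_000))
```

The check only counts units; like the proof, it does not require any honest node to observe the double-cast ballots directly.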
+In the case of a network partition, votes may not be seen on both/all sides of the partition. Therefore, it is not necessarily the case that honest nodes can directly see double‑voting. The above argument does not depend on being able to do so.
+Therefore, in an execution of Crosslink 2 for which has the one‑third bound on non‑honest voting property, for each epoch there will be at most one bft‑proposal‑valid proposal , and at least one third of honestly cast voting units must have been cast for it. Let be the (necessarily nonempty) set of nodes that cast these honest votes; then, for all at the times of their votes in epoch . (For simplicity, we assume that for each honest node there is only one time at which it obtains a successful check for the voting condition in epoch , which it uses for any votes that it casts in that epoch.)
+Let be any bft‑block for epoch such that , where is some bft‑block‑valid block. Since , is bft‑block‑valid. So by the argument above, commits to the only bft‑proposal‑valid proposal for epoch , and was voted for in that epoch by a nonempty subset of honest nodes .
+Let be any bc‑valid block. We have by definition: So, taking , each for of epoch in the result of satisfies for all in some nonempty honest set of nodes .
+For an execution of Crosslink 2 in which has the Prefix Consistency property at confirmation depth , for every epoch , for every such , for every node that is honest at any time , we have . Let . Then, by transitivity of :
+In an execution of Crosslink 2 where has the one‑third bound on non‑honest voting property and has the Prefix Consistency property at confirmation depth , every bc‑chain in (and therefore every snapshot that contributes to ) is, at any time , in the bc‑best‑chain of every node that is honest at time (where commits to at epoch and is the time of the first honest vote for ).
+A similar (weaker) statement holds if we replace with , since the time of the first honest vote for necessarily precedes the time at which the signed is submitted as a bft‑block, which necessarily precedes . (Whether or not the notarization proof depends on the first honest vote for ’s proposal , it must depend on some honest vote for that proposal that was not made earlier than .)
+Furthermore, we have
+So in an execution of Crosslink 2 where has the Prefix Consistency property at confirmation depth , if node is honest at time then is also, at any time , in the bc‑best‑chain of every node that is honest at time .
+If an execution of has the Prefix Consistency property at confirmation depth , then it necessarily also has it at confirmation depth . Therefore we have:
+In an execution of Crosslink 2 where has the one‑third bound on non‑honest voting property and has the Prefix Consistency property at confirmation depth , every bc‑chain snapshot in (and therefore every snapshot that contributes to ) is, at any time , in the bc‑best‑chain of every node that is honest at time .
+Sketch: we also need the sequence of snapshots output from fin to only be extended in the view of any node. In that case we can infer that the node does not observe a rollback in LOG_ba.
+Recall that in the proof of safety for , we showed that in an execution of Crosslink 2 where (or ) has Final Agreement, for all times , and all nodes , (potentially the same) such that is honest at time and is honest at time , .
+What we want to show is that, under some conditions on executions, ...
+Unlike Snap‑and‑Chat, Crosslink 2 requires structural and consensus rule changes to both and . On the other hand, several of those changes are arguably necessary to fix a show‑stopping bug in Snap‑and‑Chat (not being able to spend some finalized outputs).
+For a given choice of , the finalization latency is higher. The snapshot of the BFT chain used to obtain is obtained from the block at depth on node ’s best chain, which will on average lead to a finalized view that is about blocks back (in ), rather than $\sigma_{\sac}$ blocks in Snap‑and‑Chat. This is essentially the cost of ensuring that safety is given by the stronger of the safety of (at confirmations) and the safety of .
+On the other hand, the relative increase in expected finalization latency is only $\frac{\mu + 1 + \sigma}{\sigma_{\sac}}$, i.e. at most slightly more than a factor of 2 for the case $\mu = \sigma = \sigma_{\sac}$.
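As a quick arithmetic check of the "slightly more than a factor of 2" claim, the following sketch evaluates the ratio of μ + 1 + σ to the Snap‑and‑Chat confirmation depth with exact fractions. The function name and the sample depths are assumptions made for this example, not parameters proposed by the design.

```python
from fractions import Fraction


def latency_ratio(mu: int, sigma: int, sigma_sac: int) -> Fraction:
    """Relative expected finalization latency: (mu + 1 + sigma) / sigma_sac."""
    return Fraction(mu + 1 + sigma, sigma_sac)


# When all three depths are equal to k, the ratio is (2k + 1) / k = 2 + 1/k,
# which approaches 2 from above as k grows.
for k in (6, 10, 100):  # sample depths, chosen arbitrarily
    ratio = latency_ratio(k, k, k)
    assert ratio == 2 + Fraction(1, k)
    print(k, float(ratio))
```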
+See the Liveness section above.
+In order to show that Crosslink 2 is at a local optimum in the security/complexity trade‑off space, for each rule we show attacks on safety and/or liveness that could be performed if that rule were omitted or simplified.
+Edit: some rules, e.g. the Linearity rule, only contribute heuristically to security in the analysis so far.
+This document considers disadvantages of allowing transactions to continue to be included at the chain tip while the gap from the last finalized block becomes unbounded, and our perspective on what should be done instead. This condition is allowed by Ebb‑and‑Flow protocols [NTT2020].
+We also argue that it is necessary to allow for the possibility of overriding finalization in order to respond to certain attacks, and that this should be explicitly modelled and subject to a well-defined governance process.
+This is a rewritten version of this forum post, adapting the main argument to take into account the discussion of “tail‑thrashing attacks” and finalization availability from the Addendum. More details of how bounded availability could be implemented in the context of a Snap‑and‑Chat protocol are in Notes on Snap‑and‑Chat.
+The proposed changes end up being significant enough to give our construction a new name: “Crosslink”, referring to the cross-links between blocks of the BFT and best-chain protocols. Crosslink has evolved somewhat, and now includes other changes not covered in either this document or Notes on Snap‑and‑Chat. The current version is called Crosslink 2.
+“Ebb‑and‑Flow”, as described in [NTT2020] (arXiv version), is a security model for consensus protocols that provide two transaction logs, one with dynamic availability, and a prefix of it with finality.
+The paper proposes an instantiation of this security model called a “Snap‑and‑Chat” construction. It composes two consensus subprotocols, a BFT subprotocol and a best-chain subprotocol (it calls this the “longest chain protocol”). The above logs are obtained from the output of these subprotocols in a non-trivial way.
+This is claimed by the paper to “resolve” the tension between finality and dynamic availability. However, a necessary consequence is that in a situation where the “final” log stalls and the “available” log does not, the “finalization gap” between the finalization point and the chain tip can grow without bound. In particular, this means that transactions that spend funds can remain unfinalized for an arbitrary length of time.
+In this document, we argue that this is unacceptable, and that it is preferable to sacrifice strict dynamic availability. The main idea behind Ebb‑and‑Flow protocols is a good one, and allowing the chain tip to run ahead of the finalization point does make sense and has practical advantages. However, we also argue that it should not be possible to include transactions that spend funds in blocks that are too far ahead of the finalization point.
+Naive ways of preventing an unbounded finalization gap, such as stopping the chain completely in the case of a finalization stall, turn out to run into serious security problems — at least when the best-chain protocol uses Proof‑of‑Work. We’ll discuss those in detail in the section on tail‑thrashing attacks.
+Our proposed solution will be to require coinbase-only blocks during a long finalization stall. This solution has the advantage of not complicating the security analysis.
+We argue that losing strict dynamic availability in favour of “bounded availability” is preferable to the consequences of the unbounded finality gap, if/when a “long finalization stall” occurs.
+We also argue that it is beneficial to explicitly allow “finality overrides” under the control of a well-documented governance process. Such overrides allow long rollbacks that may be necessary in the case of an exploited security flaw. This is complementary to the argument for bounded availability, because the latter limits the period of user transactions that could be affected. The governance process can impose a limit on the length of this long rollback if desired.
+Since partition between nodes sufficient for finalization cannot be prevented, loosely speaking the CAP theorem implies that any consistent protocol (and therefore any protocol with finality) may stall for at least as long as the partition takes to heal.
+That “loosely speaking” is made precise by [LR2020].
+Dynamic availability implies that the chain tip will continue to advance, and so the finalization gap increases without bound.
+Partition is not necessarily the only condition that could cause a finalization stall; it is just the one that most easily proves that this conclusion is impossible to avoid.
+Both the available protocol, and the subprotocol that provides finality, will be used in practice — otherwise, one or both of them might as well not exist. There is always a risk that blocks may be rolled back to the finalization point, by definition.
+Suppose, then, that there is a long finalization stall. The final and available protocols are not separate: there is no duplication of tokens between protocols, but the rules about how to determine best-effort balance and guaranteed balance depend on both protocols, how they are composed, and how the history after the finalization point is interpreted.
+The guaranteed minimum balance of a given party is not just the minimum of their balance at the finalization point and their balance at the current tip. It is the minimum balance taken over all possible transaction histories that extend the finalized chain — taking into account that a party’s previously published transactions might be able to be reapplied in a different context without its explicit consent. The extent to which published transactions can be reapplied depends on technical choices that we must make, subject to some constraints (for example, we know that shielded transactions cannot be reapplied after their anchors have been invalidated). It may be desirable to further constrain re-use in order to make guaranteed minimum balances easier to compute.
+As the finalization gap increases, the negative consequences of rolling back user transactions that spend funds increase. (Coinbase transactions do not spend funds; they are a special case that we will discuss later.)
+There are several possible —not mutually exclusive— outcomes:
+Any of these might precipitate a crisis of confidence, and there are reasons to think this effect might be worse than if the chain had switched to a “Stalled Mode” designed to prevent loss of user funds. Any such crisis may have a negative effect on token prices and long-term adoption.
+Note that adding finalization using an Ebb‑and‑Flow protocol does not by itself increase the probability of a rollback in the available chain, provided the PoW remains as secure against rollbacks of a given length as before. But that is a big proviso. We have a design constraint (motivated by limiting token devaluation and by governance issues) to limit issuance to be no greater than that of the original Zcash protocol up to a given height. Since some of the issuance is likely needed to reward staking, the amount of money available for mining rewards is reduced, which may reduce overall hash rate and security of the PoW. Independently, there may be a temptation for design decisions to rely on finality in a way that reduces security of PoW (“risk compensation”). There is also pressure to reduce the energy usage of PoW, which necessarily reduces the global hash rate, and therefore the cost of performing an attack that depends on the adversary having any given proportion of global hash rate.
+It could be argued that the issue of availability of services that depend on finality is mainly one of avoiding over-claiming about what is possible. Nevertheless there are also real usability issues if balances as seen by those services can differ significantly and for long periods from balances at the chain tip.
+Regardless, incorrect assumptions about the extent to which the finalized and available states can differ are likely to be exposed if there is a finalization stall. And those who made the assumptions may (quite reasonably!) not accept “everything is fine, those assumptions were always wrong” as a satisfactory response.
+An intuitive notion of “availability” for block‑chain protocols includes the ability to use the protocol as normal to spend funds. So, just to be clear, in a situation where that cannot happen we have lost availability, even if the block chain is advancing.
+Bounded availability is a weakening of dynamic availability [DKT2020]. It means that we intentionally sacrifice availability when some potentially hazardous operation —a “hazard” for short— would occur too far after the current finalization point. For now, assume for simplicity that our only hazard is spending funds. More generally, the notion of bounded availability can be applied to a wider range of protocols by tailoring the definition of “hazard” to the protocol.
+This talk by Soubhik Deb accompanying [DKT2020] provides a good explanation of the advantages of dynamic availability. We do not define bounded availability formally in this document, but informally, we aim to preserve the ability to securely adapt to large changes in hash rate or total stake.
+[NTT2020] calls the dynamically available block‑chain protocol that provides input to the rest of the construction the “longest chain” protocol. There are two reasons to avoid this terminology:
+The error of conflating the “longest chain” with the observed consensus-valid chain with most accumulated work originates in the Bitcoin whitepaper. [Nakamoto2008, page 3]
+We will use the term “best‑chain protocol” instead. Note that this corresponds roughly to in the Snap‑and‑Chat construction, although the Crosslink 2 protocol that we propose will end up having other significant differences from Snap-and-Chat.
+We have not yet decided how to block hazards during a long finalization stall. We could do so directly, or by stopping block production in the more-available protocol. For reasons explained in the section on tail‑thrashing attacks below, it’s desirable not to stop block production. And so it’s consistent to have bounded availability together with another liveness property —which can be defined similarly to dynamic availability— that says the more-available protocol’s chain is still advancing. This is what we will aim for.
+We will call this method of blocking hazards, without stopping block production, “going into Stalled Mode”.
+This concept of Stalled Mode is very similar to a feature that was discussed early in the development of Zcash, but never fully designed or implemented. (After originally being called “Stalled Mode”, it was at some point renamed to “Emergency Mode”, but then the latter term was used for something else.)
+For Zcash, we propose that the main restriction of Stalled Mode should be to require coinbase-only blocks. This achieves a similar effect, for our purposes, as actually stalling the more-available protocol’s chain. Since funds cannot be spent in coinbase-only blocks, the vast majority of attacks that we are worried about would not be exploitable in this state.
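A coinbase-only restriction of this kind could be expressed as a block validity check along the following lines. This is a minimal sketch under stated assumptions: the `Block` and `Transaction` types, the `gap_bound` parameter, and the choice to measure the gap in block heights are all hypothetical, and the actual consensus rule may be defined differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transaction:
    is_coinbase: bool  # hypothetical flag; real transactions carry far more


@dataclass(frozen=True)
class Block:
    height: int
    transactions: List[Transaction]


def in_stalled_mode(block_height: int, last_final_height: int, gap_bound: int) -> bool:
    """True when the block is further past the finalization point than the bound allows."""
    return block_height - last_final_height > gap_bound


def passes_stalled_mode_rule(block: Block, last_final_height: int, gap_bound: int) -> bool:
    """Outside Stalled Mode any block passes this rule; inside it,
    only coinbase-only blocks do, so no funds can be spent."""
    if not in_stalled_mode(block.height, last_final_height, gap_bound):
        return True
    return all(tx.is_coinbase for tx in block.transactions)


coinbase_only = Block(height=150, transactions=[Transaction(is_coinbase=True)])
with_spend = Block(height=150, transactions=[Transaction(True), Transaction(False)])

# With the finalization point at height 40 and an assumed bound of 100,
# height 150 is in Stalled Mode: the spending block is rejected.
assert passes_stalled_mode_rule(coinbase_only, last_final_height=40, gap_bound=100)
assert not passes_stalled_mode_rule(with_spend, last_final_height=40, gap_bound=100)
```

Note that block production itself is never refused by this check, which matches the intent of keeping the more-available chain advancing during a stall.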
+It is possible that a security flaw could affect coinbase transactions. We might want to turn off shielded coinbase for Stalled Mode blocks in order to reduce the chance of that.
+Also, mining rewards cannot be spent in a coinbase-only block; in particular, mining pools cannot distribute rewards. So there is a risk that an unscrupulous mining pool might try to do a rug-pull after mining of non-coinbase-only blocks resumes, if there were a very long finalization stall. But this approach works at least in the short term, and probably for long enough to allow manual intervention into the finalization protocol, or governance processes if needed.
+An analogy for the effect of this on availability that may be familiar to many people, is that it works like video streaming. All video streaming services use a buffer to paper over short-term interruptions or slow-downs of network access. In most cases, this buffer is bounded. This allows the video to be watched uninterrupted and at a constant rate in most circumstances. But if there is a longer-term network failure or insufficient sustained bandwidth, the playback will unavoidably stall. In our case, block production does not literally stall, but it’s the same as far as users’ ability to perform “hazardous” operations is concerned.
+So, why do we advocate this over:
+The reason to reject option 1 is straightforward: finality is a valuable security property that is necessary for some use cases.
+If a protocol only provides finality (option 2), then short-term availability is directly tied to finalization. It may be possible to make finalization stalls sufficiently rare or short-lived that this is tolerable. But that is more likely to be possible if and when there is a well-established staking ecosystem. Before that ecosystem is established, the protocol may be particularly vulnerable to stalls. Furthermore, it’s difficult to get to such a protocol from a pure PoW system like current Zcash.
+We argued in the previous section that allowing hazards in an unbounded finalization gap is bad. Option 3 entails an unbounded finalization gap that will allow hazards. However, that isn’t sufficient to argue that bounded availability is better. Perhaps there are no good solutions! What are we gaining from a bounded availability approach that would justify the complexity of a hybrid protocol without obtaining strict dynamic availability?
+The argument goes like this:
+The argument that it is difficult to completely prevent finalization stalls is supported by experience on Ethereum in May 2023, when there were two stalls within 24 hours, one for about 25 minutes and one for about 64 minutes. This experience is consistent with our argument:
+Retaining short-term availability does not result in a risk compensation hazard:
+A potential philosophical objection to lack of strict dynamic availability is that it creates a centralization risk to availability. That is, it becomes more likely that a coalition of validators can deliberately cause a denial of service. This objection may be more prevalent among people who would object to adding a finality layer or PoS at all.
+Consensus protocols sometimes fail. Potential causes of failure include:
+In these situations, overriding finality may be better than any other alternative.
+An example is a balance violation flaw due to a 64-bit integer overflow that was exploited on Bitcoin mainnet on 15th August 2010. The response was to roll back the chain to before the exploitation, which is widely considered to have been the right decision. The time between the exploit (at block height 74638) and the forked chain overtaking the exploited chain (at block height 74691) was 53 blocks, or around 9 hours.
+Of course, Bitcoin used and still uses a pure‑PoW consensus. But the applicability of the example does not depend on that: the flaw was independent of the consensus mechanism.
+Another example of a situation that prompted this kind of override was the DAO recursion exploit on the Ethereum main chain in June 2016. The response to this was the forced balance adjustment hard fork on 20th July 2016 commonly known as the DAO fork. Although this adjustment was not implemented as a rollback, and although Ethereum was using PoW at the time and did not make any formal finality guarantees, it did override transfers that would heuristically have been considered final at the fork height. Again, this flaw was independent of the consensus mechanism.
+The DAO fork was of course much more controversial than the Bitcoin fork, and a substantial minority of mining nodes split off to form Ethereum Classic. In any case, the point of this example is that it’s always possible to override finality in response to an exceptional situation, and that a chain’s community may decide to do so. The fact that Ethereum 2.0 now does claim a finality guarantee would not in practice prevent a similar response in future that would override that guarantee.
+The question then is whether the procedure to override finality should be formalized or ad hoc. We argue that it should be formalized, including specifying the governance process to be used.
+This makes security analysis — of the consensus protocol per se, of the governance process, and of their interaction — much more feasible. Arguably a complete security analysis is not possible at all without it.
+It also front‑loads arguing about what procedure should be followed, and so it is more likely that stakeholders will agree to follow the process in any time‑critical incident.
+There is another possible way to model a protocol that claims finality but can be overridden in practice. We could say that the protocol after the override is a brand‑new protocol and chain (inheriting balances from the previous one, possibly modulo adjustments such as those that happened in the DAO fork).
+Although that would allow saying that the finality property has technically not been violated, it does not match how users think about an override situation. They are more likely to think of it as a protocol with finality that can be violated in exceptional cases — and they would reasonably want to know what those cases are and how they will be handled. It also does nothing to help with security analysis of such cases.
+Finality overrides and bounded availability are complementary in the following way: if a problem is urgent enough, then validators can be asked to stop validating. For genuinely harmful problems, it is likely to be in the interests of enough validators to stop that this causes a finalization stall. If this lasts longer than the availability bound then the protocol will go into Stalled Mode, giving time for the defined governance process to occur and decide what to do. And because the unfinalized consensus chain will contain only a limited period of user transactions that spend funds, the option of a long rollback remains realistically open.
+If, on the other hand, there is time pressure to make a governance decision about a rollback in order to reduce its length, that may result in a less well-considered decision.
+A possible objection is that there might be a coalition of validators who ignore the request to stop (possibly including the attacker or validators that an attacker can bribe), in which case the finalization stall would not happen. But that just means that we don’t gain the advantage of more time to make a governance decision; it isn’t actively a disadvantage relative to alternative designs. This outcome can also be thought of as a feature rather than a bug: going into Stalled Mode should be a last resort, and if the argument given for the request to stop failed to convince a sufficient number of validators that it was reason enough to do so, then perhaps it wasn’t a good enough reason.
+This resolves one of the main objections to the original Stalled Mode idea that stopped us from implementing it in Zcash. The original proposal was to use a signature with a key held by ECC to trigger Stalled Mode, which would arguably have been too centralized. The Stalled Mode described in this document, on the other hand, can only be entered by consensus of a larger validator set, or if there is an availability failure of the finalization protocol.
+It is also possible to make the argument that the threshold of stake needed is imposed by technical properties of the finality protocol and by the resources of the attacker, which might not be ideal for the purpose described above. However, we would argue that it does not need to be ideal, and will be in the right ballpark in practice.
+There’s a caveat related to doing intentional rollbacks when using the Stalled Mode approach, where block production in the more-available protocol continues during a long finalization stall. What happens to incentives of block producers (miners in the case of Proof‑of‑Work), given that they know the consensus chain might be intentionally rolled back? They might reasonably conclude that it is less valuable to produce those blocks, leading to a reduction of hash rate or other violations of the security assumptions of .
+This is actually fairly easy to solve. We have the governance procedures say that if we do an intentional rollback, the coinbase-only mining rewards will be preserved. I.e. we produce a block or blocks that include those rewards paid to the same addresses (adjusting the consensus to allow them to be created from thin air if necessary), have everyone check it thoroughly, and require the chain to restart from that block. So as long as block producers believe that this governance procedure will be followed and that the chain will eventually recover at a reasonable coin price, they will still have incentive to produce on , at least for a time.
+Although the community operating the governance procedures has already obtained the security benefit of mining done on the rolled-back chain by the time it creates the new chain, there is a strong incentive not to renege on the agreement with miners, because the same situation may happen again.
+Earlier we said that there were two possible approaches to preventing hazards during a long finalization stall:
+a) go into a Stalled Mode that directly disallows hazardous transactions (for example, by requiring blocks to be coinbase-only in Zcash);
+b) temporarily cause the more-available chain to stall.
+This section describes an important class of potential attacks on approach b) that are difficult to resolve. They are based on the fact that when the unfinalized chain stalls, an adversary has more time to find blocks, and this might violate security assumptions of the more-available protocol. For instance, if the more-available protocol is PoW-based, then its security in the steady state is predicated on the fact that an adversary with a given proportion of hash power has only a limited time to use that power, before the rest of the network finds another block.
+For an analysis of the concrete security of Nakamoto-like protocols, see [DKT+2020] and [GKR2020]. These papers confirm the intuition that the “private attack” —in which an adversary races privately against the rest of the network to construct a forking chain— is optimal, obtaining the same tight security bound independently using different techniques.
+During a chain stall, the adversary no longer has a limited time to construct a forking chain. If, say, the adversary has 10% hash power, then it can on average find a block in 10 block times. And so in 100 block times it can create a 10-block fork.
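+The arithmetic above can be checked with a small sketch. It assumes that a "block time" is the average interval in which the whole network (including the adversary) finds one block, that difficulty stays fixed during the stall, and that the adversary's share of hash power is constant. The function names are hypothetical.

```python
def block_times_per_adversary_block(hash_fraction: float) -> float:
    """Average number of network block times the adversary needs to find one block."""
    if not 0.0 < hash_fraction <= 1.0:
        raise ValueError("hash_fraction must be in (0, 1]")
    return 1.0 / hash_fraction


def expected_adversary_blocks(hash_fraction: float, elapsed_block_times: float) -> float:
    """Expected number of blocks the adversary finds in the given number of block times."""
    if not 0.0 < hash_fraction <= 1.0:
        raise ValueError("hash_fraction must be in (0, 1]")
    if elapsed_block_times < 0:
        raise ValueError("elapsed_block_times must be non-negative")
    return hash_fraction * elapsed_block_times


if __name__ == "__main__":
    # 10% of hash power: about one block per 10 block times,
    # so about 10 blocks over a stall lasting 100 block times.
    print(block_times_per_adversary_block(0.10))
    print(expected_adversary_blocks(0.10, 100))
```

+These are expectations only; the actual number of blocks found in a given interval is random, and the estimate grows if honest miners leave during the stall.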
+It may in fact be worse than this: once miners know that a finalization stall is happening, their incentive to continue mining is reduced, since they know that there is a greater chance that their blocks might be rolled back. So we would expect the global hash rate to fall —even before the finality gap bound is hit— and then the adversary would have a greater proportion of hash rate.
+Even in a pure Ebb‑and‑Flow protocol, a finalization stall could cause miners to infer that their blocks are more likely to be rolled back, but the fact that the chain is continuing would make that more difficult to exploit. This issue with the global hash rate is mostly specific to the more-available protocol being PoW: if it were PoS, then its validators might as well continue proposing blocks because it is cheap to do so. There might be other attacks when the more-available protocol is PoS; we haven’t spent much time analyzing that case.
+The problem is that the more-available chain does not necessarily just halt during a chain stall. In fact, for a finality gap bound of blocks, an adversary could cause the -block “tail” of the chain as seen by any given node to “thrash” between different chains. We will call this a tail‑thrashing attack.
+If a protocol allowed such attacks then it would be a regression relative to the security we would normally expect from an otherwise similar PoW-based protocol. It only occurs during a finalization stall, but note that we cannot exclude the possibility of an adversary being able to provoke a finalization stall.
+Note that in the Snap‑and‑Chat construction, snapshots of are used as input to the BFT protocol. That implies that the tail‑thrashing problem could also affect the input to that protocol, which would be bad (not least for security analysis of availability, which seems somewhat intractable in that case).
+Also, when restarting , we would need to take account of the fact that the adversary has had an arbitrary length of time to build long chains from every block that we could potentially restart from. It could be possible to invalidate those chains by requiring blocks after the restart to be dependent on fresh randomness, but that sounds quite tricky (especially given that we want to restart without manual intervention if possible), and there may be other attacks we haven’t thought of. This motivates using approach a) instead.
+Note that we have still glossed over precisely how consensus rules would change to enforce a). This will be covered later in The Crosslink 2 Construction, but first we will discuss other issues with Snap-and-Chat.
+ +The PoW+TFL consensus protocol is logically an extension of the Zcash consensus rules to introduce trailing finality. This is achieved by compartmentalizing the top-level PoW+TFL protocol into two consensus subprotocols, one embodying most of the current consensus logic of Zcash and the other embodying the TFL. These subprotocols interact through a hybrid construction, which specifies how they interact and what changes from "off-the-shelf" behavior, if any, need to be imposed on them. Each of these components (the two subprotocols and the hybrid construction) is somewhat modular: different subprotocols or hybrid constructions may be combined (with some modification) to produce a candidate PoW+TFL protocol.
+TODO: Add a protocol component diagram to "Design at a Glance" #122
+The hybrid construction is a major design component of the full consensus protocol which specifies how the subprotocols integrate. So far we have considered three candidates:
+We believe Crosslink is the best candidate, due to its more rigorous specification and security analysis, and due to the issues with Snap-and-Chat described in Notes on Snap-and-Chat.
+The PoW+TFL hybrid consensus consists of two interacting subprotocols:
+Note that the hybrid construction may require modification to the "off-the-shelf" versions of these subprotocols. In particular Crosslink requires each protocol to refer to the state of the other to provide objective validity.
+ +Here we strive to lay out our high-level TFL design goals.
+These are ideal goals. As we develop a complete design, we will likely encounter trade-offs, some of which may preclude achieving the full idealized goals. Wherever possible, we motivate design decisions by these goals, and when goals are impacted by trade-offs we describe that impact and the rationale for the trade-off decision.
+For example, one ideal user experience goal below is to avoid disruption to existing wallets. However, the Crosslink construction may require wallets to alter their context of valid transactions differently from the current NU5/NU6 consensus protocol.
+We strive to start our protocol design process from user experience (UX) and use case considerations first, since at the end of the day all that matters in a protocol is which user needs it meets and how well.
+For a full PoS transition, ecosystem developers of products such as consensus nodes, wallets, mining services, chain analytics, and more will certainly need to update their code to support the transition. However, we carve out a few goals as exceptions for this category of users:
+Zcash has always had exemplary safety, security, and privacy, and we aim to continue that tradition:
+TODO: Define privacy goals of TFL #118
+TODO: Define PoS Subprotocol desiderata which are distinct from Crosslink integration #117
+We want to follow some conservative design heuristics to minimize risk and mistakes:
+These are not goals of the TFL design, either to simplify the scope of the initial design (a.k.a. Out-of-Scope Goals), or because we believe some potential goal should not be supported (a.k.a. Anti-goals).
+While these desiderata may be common across the block‑chain consensus design space, they are not specific goals for the initial TFL design. Note that these may be goals for future protocol improvements.
+Prioritizing minimal time-to-finality over other considerations (such as protocol simplicity, impact on existing use cases, or other goals above).
+In-protocol liquid staking derivatives.
+Maximizing the PoS staked-voter count ceiling. For example, Tendermint BFT has a relatively low ceiling of ~hundreds of staked voters, whereas Ethereum’s Gasper supports hundreds of thousands of staked voters.
+Reducing energy usage. While this would presumably be a goal of a pure PoS transition, it likely cannot be achieved for hybrid PoW/PoS without loss of security.
+Distinct from Out-of-Scope Goals, we track "anti-goals": potential goals that we explicitly reject and aim not to support even in future protocol improvements.
+We currently have no defined anti-goals.
+This requirement comes from a request from a DEX developer. While we have not yet surveyed DEX and Bridge designs, we're relying on this as a good starting point.
+This book introduces and specifies a Trailing Finality Layer for the Zcash network. This is version 0.1.0 of the book.
+This design augments the existing Zcash Proof‑of‑Work (PoW) network with a new consensus layer which provides trailing finality. This layer enables transactions included via PoW to become final, which assures that they cannot be reverted by the protocol. This enables safer and simpler wallets and other infrastructure, and aids trust-minimized cross-chain bridges. This consensus layer uses Proof-of-Stake (PoS) consensus, and enables ZEC holders to earn protocol rewards for contributing to the security of the Zcash network. By integrating a PoS layer with the current PoW Zcash protocol, this design specifies a hybrid consensus protocol dubbed PoW+TFL.
+The rest of this introductory chapter is aimed at a general audience interested in the context of this proposal within Zcash development, status and next steps, motivations, a primer on finality, and tips to get involved.
+ +