-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathpreliminaries.tex
134 lines (105 loc) · 11.8 KB
/
preliminaries.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
\section{Preliminaries}
\subsection{The UTXO Model}
In the UTXO model a transaction consists of inputs and outputs. An output consists of an amount and a locking script (called \texttt{scriptPubKey} in Bitcoin). In Bitcoin the locking script is described in a stack-based language called Bitcoin Script. An input consists of an unlocking script (called \texttt{scriptSig} in Bitcoin) and an output pointer (also called an outpoint) which is a tuple of the form \tpl{\texttt{txID}, \texttt{outputIndex}}. We say that an input \emph{spends} an output, where \texttt{txID} refers to the transaction id that includes the output and \texttt{outputIndex} refers to the zero-based index of the output to be spent in that transaction.
The global state of UTXO based cryptocurrencies includes all transaction outputs, and all spent transaction outputs. UTXO stands for unspent transaction outputs, which can be trivially computed from any state $\btcstate$ as $\btcstate.\utxo \triangleq \btcstate.\outputs \setminus \btcstate.\spentOutputs$.
An input is valid with regards to a state $\btcstate$ (a) if the output it references exists (i.e. $\in \btcstate.\outputs$) and is not already spent (i.e. $\notin \btcstate.\spentOutputs$) and (b) if the unlocking script of the input ``unlocks'' the locking script of the output. In the case of Bitcoin, the unlocking script runs on an empty stack, and subsequently the locking script runs on the resulting stack of the unlocking script's execution. The input is considered to unlock the output if the locking script terminates successfully.
The transaction as a whole is valid if all inputs are valid and the sum of the values generated by its outputs is at most the sum of the values brought in by its inputs. The absolute value of the difference of values is called the \emph{fee} and is used to incentivise miners in the network as we are going to see shortly. The state can be updated by a valid transaction as shown in \cref{alg.btcapply}, where the new outputs are added and newly spent outputs are marked as such.
The most usual form of output is Pay-to-Public-Key-Hash (P2PKH) which, as its name implies, includes the public key hash of the recipient. For an input spending it to be valid, it needs to contain a signature of the corresponding secret key over the new transaction skeleton and the output to be spent.
\begin{algorithm}[H]
\caption{\label{alg.btcapply} The $\btcapply$ function given a state $\btcstate$ and a transaction $tx$.}
\begin{algorithmic}[1]
\Function{$\btcapply$}{$\btcstate, tx$}
\For{$o \in tx.\outputs$}
\State{$\btcstate.\outputs \cupeq \{o\}$}
\EndFor
\For{$i \in tx.\inputs$}
\State{$\btcstate.\spentOutputs \cupeq \{i.\texttt{output}\}$}
\EndFor
\State\Return{$\btcstate$}
\EndFunction
\end{algorithmic}
\end{algorithm}
\subsection{The Account Model}
In the account model, a transfer transaction consists of a sender, a receiver, an amount, a nonce, a fee and a signature. The sender and receiver are public keys. The fee acts as an additional miner reward as in the UTXO model.
Each address has its own state. The state of an address is defined as $\tpl{\bal, \nonce}$. For a previously unused address all fields are assumed to be of zero value, specifically: $\ethstate[\texttt{address}] = \{\bal{:}\,0, \nonce{:}\,0\}$. With $\ethstate[\texttt{address}]$ we denote the state of the account with the specified address.
For a transfer transaction to be valid with respect to a state $\ethstate$, the following conditions need to hold.
\begin{itemize}
\item The signature must be valid and generated by the secret key of the sender over the rest of the transfer transaction.
\item The nonce must be equal to $\ethstate[\sender].\nonce + 1$.
\item The sender must have adequate balance.
\[
\ethstate[\sender].\bal \ge \amount+\fee
\]
\end{itemize}
We remark that the nonce is necessary in order to avoid replay attacks~\cite{ethereum-glossary}, for if it were absent some receiver would be able to repeat the same transaction ad infinitum until the sender account was completely drained. Note that a potential solution to this would be to maintain a set of processed transaction ids and add a requirement that the transaction to be evaluated for validity does not belong in that set. If the sender ever wishes to send the same amount to the same recipient, they will need to generate a new signature and thus the transaction will have a different transaction id. In theory this solution stands, however in practice due to the malleability of ECDSA a signature can be modified in a way that it changes and consequently the transaction id changes, while retaining its validity~\cite{malleability}.
A valid transaction according to a state can be applied to it to generate a new state. We perform this state transformation with a function $\ethapply(\ethstate, tx)$ which debits the sender, credits the receiver and increments the sender's nonce. The full definition of $\ethapply$ is given in \cref{alg.ethapply}.
\begin{algorithm}[H]
\caption{\label{alg.ethapply} The $\ethapply$ function given a $\ethstate$ and a transaction.}
\begin{algorithmic}[1]
\Function{$\ethapply$}{$\ethstate, tx$}
\State{$\ethstate[tx.\sender].\nonce \pluseq 1$}
\State{$\ethstate[tx.\sender].\bal \minuseq tx.\amount$}
\State{$\ethstate[tx.\receiver].\bal \pluseq tx.\amount$}
\State\Return{$\ethstate$}
\EndFunction
\end{algorithmic}
\end{algorithm}
\subsection{The Blockchain}
Transactions are usually arranged in batches and included in so-called \emph{blocks}. Blocks are arranged in a structure which resembles a linked list, in which each block contains a pointer to its previous block in the form of its hash. The only exception to this rule is the first block of the list, which is called the \emph{genesis} and does not include any such pointer. This linked list is called the \emph{chain}, which we usually denote with \chain. With $\chain[i]$ we denote the $i$-th block of the chain, and we customarily denote the genesis block as $\Gen \triangleq \chain[0]$.
We adopt the Python notation to refer to parts of the chain, which is illustrated in~\cref{notation}. We call $\chain[:-k]$ the \emph{stable} part of \chain.
\begin{table*}
\caption{The notation used throughout this work.\label{notation}}
\centering
\begin{tabular}{r|l}
\chain & the chain \\
$|\chain|$ & the chain length \\
$\chain[i]$ & the $i$-th block of the chain ($i \ge 0$)\\
\Gen & the genesis block ($\chain[0]$) \\
$\chain[-i]$ & the $i$-th block from the end of the chain ($i > 0$) \\
$\chain[i:j]$ & all chain blocks starting from $\chain[i]$ up to but not including $\chain[j]$ \\
$\chain[:j]$ & all chain blocks from genesis up to but not including $\chain[j]$ (equivalent to $\chain[0:j]$) \\
$\chain[i:]$ & all chain blocks from $\chain[i]$ up to and including the last (equivalent to $\chain[i:|\chain|]$) \\
$k$ & the stability parameter of the blockchain protocol \\
\end{tabular}
\end{table*}
Whether in the UTXO or account model, we say that each block has a state. The state of a block $\chain[i]$ is defined as the state of the previous block $\chain[i-1]$ upon application of all transactions present in $\chain[i]$. More formally, we define the state of a block as follows:
\[
\text{state}(\chain[i]) = \left\{
\begin{array}{cl}
\text{apply}(\text{state}(\chain[i-1]), \chain[i].\texttt{txs}) & \text{if } i \ge 0 \\
\epsilon & \text{otherwise}
\end{array}
\right.
\]
Application of a sequence of transactions occurs by applying each transaction and using the new state for the next application. This can be written more formally as follows
\[
\left\{
\begin{array}{rl}
\text{apply}(st, [tx_1, \dots, tx_z]) &= \text{apply}(\text{apply}(st, tx_1), [tx_2, \dots, tx_z]) \\
\text{apply}(st, []) &= st
\end{array}
\right.
\]
A block $b$ is split in two parts, the header and the body. The header is a fixed-size string, and the body is the place where all transactions reside. All the transactions in a block's body are committed to in the block's header, usually in the form of a Merkle Tree root over all of them. A block with its body is what we call a \emph{full block}. We will use $\chain[i]$ to refer to a block header or the full block at height $i$ interchangeably, as well as $\chain$ to refer to both the header chain and the full block chain.
Money generation happens with coinbase transactions. A coinbase transaction is a special transaction that provides miners (in Proof-of-Work based cryptocurrencies) or minters (in Proof-of-Stake based cryptocurrencies) with a reward. Each block may have a single coinbase transaction. For cryptocurrencies in the UTXO model the coinbase transaction contains no inputs and generates only as much value as the block reward per the consensus rules, plus any transaction fees. Coinbase transactions could also exist in account-based cryptocurrencies, for example having a $\sender$ field of all zeros. In practice Ethereum-based cryptocurrencies don't have them but instead directly include the miner's address inside the block header. For our purposes we will not consider the change of state due to miner rewards.
\subsection{Network Actors}
\paragraph{Full Nodes}
A \emph{full node} maintains the full block chain, along with every transaction. Every transaction is verified for validity and a block that contains an invalid transaction is considered invalid by the full node. A full node participates in a Peer-to-Peer (P2P) network. Other nodes may request blocks or transactions from it, and it serves them akin to a relay.
\paragraph{Miners}
A \emph{miner} is a full node that also attempts to mine a new block on top of the best block chain. If a block is successfully mined, it is broadcasted to the P2P network of full nodes for them to consider adopting it.
\subsection{Simple Payment Verification}
Simple Payment Verification (SPV) first appeared in the Bitcoin whitepaper~\cite{bitcoin} as a way for nodes to figure out their transactions without the need to download and store full blocks. An SPV client works by obtaining the header-chain from other full nodes and validating it. Upon doing that, it can verify any transaction is included in one of the blocks by verifying a Merkle inclusion proof against a Merkle root in one of the headers. We will now formally define an SPV client.
\begin{definition}[SPV client]
An \emph{SPV client} is an ITM that knows a genesis block \textsf{Gen} and connects to a set of full nodes $\mathcal{P}$, at least one of which is honest. It may only request block headers at a range of heights. It obtains all header-chains $\chain_1, \dots, \chain_{|\mathcal{P}|}$, keeping only those for which the chain structure holds ($\forall 1 < i \le |\chain|: H(\chain[i-1]) = \chain[i].previd$). It subsequently compares them and adopts the chain with the most difficulty.
\end{definition}
The assumption of at least one honest full node is necessary to avoid eclipse attacks~\cite{eclipse,eclipse-ethereum}.
We state some results pertaining to the adopted chain of an SPV client which have appeared previously in~\cite{sok,sidechains} and are more formally derivable from the works of~\cite{backbone,garay2017bitcoin,pass2017analysis}.
\begin{definition}[Honest majority assumption]
The adversarial mining power is at all times upper bounded by the half of the total mining power of the network.
\end{definition}
\begin{theorem}[SPV security]
Under the honest majority assumption, the stable part of an SPV client's adopted chain $\chain[:-k]$ is a prefix of the chain of an honest full node.
\end{theorem}
\begin{corollary}
Under the honest majority assumption, the underlying full chain of the stable part of an SPV client's adopted chain produces a valid state.
\end{corollary}