Let’s start with transactions, Bitcoin’s fundamental building block. We’re going to use a simplified model of a ledger for the moment. Instead of blocks, let’s suppose individual transactions are added to the ledger one at a time.
How can we build a currency on top of such a ledger? The first model you might think of, which is actually the mental model many people have for how Bitcoin works, is that you have an
account-based system. You can add some transactions that create new coins and credit them to somebody. And then later you can transfer them. A transaction would say something like “we’re moving 17 coins from Alice to Bob”, and it will be signed by Alice. That’s all the information about the transaction that’s contained in the ledger. In Figure 3.1, after Alice receives 25 coins in the first transaction and then transfers 17 coins to Bob in the second, she’d have 8 Bitcoins left in her account.
The downside to this way of doing things is that anyone who wants to determine if a transaction is valid will have to keep track of these account balances. Take another look at Figure 3.1. Does Alice have the 15 coins that she’s trying to transfer to David? To figure this out, you’d have to look backwards in time forever to see every transaction affecting Alice, and whether or not her net balance at the time that she tries to transfer 15 coins to David is greater than 15 coins. Of course we can make this a little bit more efficient with some data structures that track Alice’s balance after each transaction. But that’s going to require a lot of extra housekeeping besides the ledger itself.
Because of these downsides, Bitcoin doesn’t use an account-based model. Instead, Bitcoin uses a ledger that just keeps track of transactions similar to ScroogeCoin.
Transactions specify a number of inputs and a number of outputs (recall PayCoins in ScroogeCoin). You can think of the inputs as coins being consumed (created in a previous transaction) and the outputs as coins being created. For transactions in which new currency is being minted, there are no coins being consumed (recall CreateCoins in ScroogeCoin). Each transaction has a unique identifier. Outputs are indexed beginning with 0, so we will refer to the first output as “output 0”.
Let’s now work our way through Figure 3.2. Transaction 1 has no inputs because this transaction is creating new coins, and it has an output of 25 coins going to Alice. Also, since this is a transaction where new coins are being created, no signature is required. Now let’s say that Alice wants to send some of those coins over to Bob. To do so, she creates a new transaction, transaction 2 in our example. In the transaction, she has to explicitly refer to the previous transaction where these coins are coming from. Here, she refers to output 0 of transaction 1 (indeed the only output of transaction 1), which assigned 25 bitcoins to Alice. She also must specify the output addresses in the transaction.
In this example, Alice specifies two outputs, 17 coins to Bob, and 8 coins to Alice. And, of course, this whole thing is signed by Alice, so that we know that Alice actually authorizes this transaction.
Change addresses. Why does Alice have to send money to herself in this example? Just as coins in ScroogeCoin are immutable, in Bitcoin, the entirety of a transaction output must be consumed by another transaction, or none of it. Alice only wants to pay 17 bitcoins to Bob, but the output that she owns is worth 25 bitcoins. So she needs to create a new output where 8 bitcoins are sent back to herself. It could be a different address from the one that owned the 25 bitcoins, but it would have to be owned by her. This is called a change address.
Efficient verification. When a new transaction is added to the ledger, how easy is it to check if it is valid? In this example, we need to look up the transaction output that Alice referenced, make sure that it has a value of 25 bitcoins, and that it hasn’t already been spent. Looking up the transaction output is easy since we’re using hash pointers. To ensure it hasn’t been spent, we need to scan the block chain between the referenced transaction and the latest block. We don’t need to go all the way back to the beginning of the block chain, and it doesn’t require keeping any additional data structures (although, as we’ll see, additional data structures will speed things up).
Consolidating funds. As in ScroogeCoin, since transactions can have many inputs and many outputs, splitting and merging value is easy. For example, say Bob received money in two different transactions
- 17 bitcoins in one, and 2 in Bob might say, I’d like to have one transaction I can spend later where I have all 19 bitcoins. That’s easy — he creates a transaction with the two inputs and one output, with the output address being one that he owns. That lets him consolidate those two transactions.
Joint payments. Similarly, joint payments are also easy to do. Say Carol and Bob both want to pay David. They can create a transaction with two inputs and one output, but with the two inputs owned by two different people. And the only difference from the previous example is that since the two outputs from prior transactions that are being claimed here are from different addresses, the transaction will need two separate signatures — one by Carol and one by Bob.
Transaction syntax. Conceptually that’s really all there is to a Bitcoin transaction. Now let’s see how it’s represented at a low level in Bitcoin. Ultimately, every data structure that’s sent on the network is a string of bits. What’s shown in Figure 3.3 is very low-level, but this further gets compiled down to a compact binary format that’s not human-readable.
As you can see in Figure 3.3, there are three parts to a transaction: some metadata, a series of inputs, and a series of outputs.
- Metadata. There’s some housekeeping information — the size of the transaction, the number of inputs, and the number of There’s the hash of the entire transaction which serves as a unique ID for the transaction. That’s what allows us to use hash pointers to reference transactions. Finally there’s a “lock_time” field, which we’ll come back to later.
- The transaction inputs form an array, and each input has the same form. An input specifies a previous transaction, so it contains a hash of that transaction, which acts as a hash pointer to it. The input also contains the index of the previous transaction’s outputs that’s being claimed. And then there’s a signature. Remember that we have to sign to show that we actually have the ability to claim those previous transaction outputs.
- The outputs are again an array. Each output has just two fields. They each have a value, and the sum of all the output values has to be less than or equal to the sum of all the input values. If the sum of the output values is less than the sum of the input values, the difference is a transaction fee to the miner who publishes this transaction.
And then there’s a funny line that looks like what we want to be the recipient address. Each output is supposed to go to a specific public key, and indeed there is something in that field that looks like it’s the hash of a public key. But there’s also some other stuff that looks like a set of commands. Indeed, this field is a script, and we’ll discuss this presently.