Industry experts are very enthusiastic about Blockchain. Articles, blogs are being written about how it will transform Fintech. Innovative solutions based on blockchain are a reality in areas other than crypto-currencies. By understanding the design elements of Blockchain and Bitcoin many more innovative solutions can be created.
Usage of cryptography
Knowledge of the basics of cryptography is very important to make sense of it all, hence we delve into it first.
Cryptographic hash function
If you provide a stream of bytes to a cryptographic hash function, it generates a number called hash (sometimes referred to as digest). The following properties make it a very useful tool for Blockchain:
- It is impossible to guess the generated hash value for a stream of bytes. To get the hash value, one has to run the algorithm, there is no shortcut.
- For a given input it always generates the same hash value.
- It is infeasible to deduce the input based on the hash value. That means it is an irreversible mathematical function.
- No two different stream of bytes result in the same hash value. Even a small change in the input stream generates a totally different number. (More in the comments section below regarding this.)
The two main hash functions used by Bitcoin are:
- SHA-256 (returns 256 bit unsigned integers)
- RIPEMD-160 (returns 160 bit unsigned integers)
Sample hash values:
There are two things we observe from the table above:
- With a slight change in input, the hash values change dramatically and by looking only at the hash values, one cannot conclude that the first and second are even closely related.
- The values mentioned are actually text strings and do not look like numbers, although we expected them to be numbers.
They are actually numbers, represented this way to reduce the size of the presented text. For example, binary representation of the number 255 is ‘11111111’. Decimal representation is ‘255’. Hexadecimal representation is ‘ff’ or ‘FF’. The representations in the table above are hexadecimal representations of 32 byte and 20 byte numbers. If you notice the size of the representation reduces as we move from binary to decimal to hexadecimal representations. There are more compact representations commonly used in the industry like base36 encoding (eg. a PNR of an airline ticket with numbers and all the alphabets included in the representation).
Bitcoin makes extensive use of Base58 and Base58check encoding. These are much more compact form of representing the same data (essentially 32 byte and 20 byte integers).
Elliptic Curve Cryptography is used extensively in Bitcoin. ECC is a form of public-key cryptography also known as asymmetric cryptography. In this, a sender encrypts a stream of bytes using a public key. The encrypted data seems gibberish to everyone. Only the one in possession of the corresponding private key can decrypt the data and get the original data.
Usually the private-key, and public-key are generated on the computer / device of the one who will hold the private-key. The person then sends over the public key openly to everyone. Using the public-key one can only encrypt. For decryption private-key is needed.
In ECC, one can derive the public-key from the private-key. Just as in case of the hash function, the computation is irreversible. You cannot compute private-key from the public-key.
|ECC is safe to use until Quantum Computing becomes mainstream. There are papers which suggest how Quantum Computing will enable deriving private-keys from public-keys for ECC. The same technique is infeasible for other public-key cryptography like RSA even with Quantum Computing.|
There are some advantages of using ECC, which are very well exploited by Bitcoin. The size of keys published can be reduced drastically. The Elliptic Curve can actually be plotted on a graph paper (assuming the paper is huge). Each point has an x-coordinate and a y-coordinate. The curve can be plotted using the following formula:
y2 = x3+ax+b over Fp (where a, b, p are constants)
Both public-key and private-key are basically two separate points on a curve with coordinates (x,y). If you knew x, you can compute ‘two’ values for y (+ve and -ve). So if you represent +/- and provide x, you can easily get y, and thus the whole key. This way is used by Bitcoin to reduce the size of the stored keys.
Bitcoin uses Elliptic Curve Digital Signature Algorithm (ECDSA). In quite a few countries Digital signatures have a legal standing. Governments consider it safe enough.
To sign a document is to basically provide 2 numbers (computed through irreversible mathematical functions). Inputs for the signing mathematical function are:
- the document (stream of bytes)
- the private-key
At the end of the document the two numbers are appended.
Anyone with the possession of public-key can validate that the signature by using a verification function. Inputs to the verification function are:
- the document (stream of bytes)
- the signature (the two numbers at the end of the document)
- the public-key
What has been described above suffices for this post. Details can be found at http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf
A Blockchain is made of blocks of data which are linked together. In Bitcoin, the data happens to be a list of transactions, which were performed in the past 10 mins. Anybody can add blocks and link them into the Blockchain. The Blockchain still remains a very authentic source of data. A few rogue elements will not be able to disrupt the authenticity of the data. Even if they try to add blocks and link them in a disruptive way, the rest of the network will catch up and discard or neutralise their rogue blocks. To understand how this is done, we must first understand how the Blockchain is formed and how blocks are added to it.
A peer-to-peer network
The Blockchain is stored on / with a network of participants (often referred to as nodes in network terminology). The participants adhere to certain rules, and share data with each other. The rules which are applied to sharing and submitting data are referred to as the bitcoin protocol.
Each participant connects to a few other participants. Each newly connected participant introduces more participants for connections. Typically a node does not connect to all the other nodes, instead maintains a few minimum number of connections. As all these nodes have equal power in the network and abide by the same rules, it is called a peer-to-peer network. There is no central node or leader node etc in this network.
There are many details of the network which are not needed for purpose of this post, and hence not covered here. Such details can be found at https://bitcoin.org/en/developer-guide#p2p-network and https://en.bitcoin.it/wiki/Network.
The Block, Miners
A block consists of two main components, the block-header and the data (a list of transactions).
As transactions happen, they are submitted to the network, and each participant of the network validates the transaction and passes it further into the network. There are a few nodes which collect the transactions, form a block and then submit the block to the network. These nodes are called miners. The miners get rewarded for their work through Bitcoins, hence the incentive of mining (forming and submitting blocks from transactions).
The block-header is used to form the chain. It has a “link” to the previous block. The link is actually a double hash of the previous block’s header. This is set in a field called hashPrevBlock in the newly formed block’s header.
hashPrevBlock = SHA-256 ( SHA-256 (block_header_of_the_previous_block) )
This looks very easy. However, the block will not get accepted by rest of the peers in the network so easily. It has to pass a particular “proof-of-work” test.
The Difficulty, Nonce and Proof-Of-Work
Peers accept a new block on the condition that, the hash of the submitted block’s header must be less than a “target”. Both the hash and the target are 256 bit integers.
The target is computed from a 32 bit field called difficulty (which is also stored in the block-header).
hash_value = SHA-256 (header_of_submitted_block) if ( hash_value < target ) accept (submitted_block) else reject (submitted_block)
The above logic masks an important aspect of mining:
- The hash value of any given stream of bytes will always be the same.
- The miner submits blocks based on transactions done so far.
- The hash value generated from the transactions received so far may be greater than the target (which will lead to rejection of the block)
- So will some blocks of transactions never get accepted?
The block-header has a field called “Nonce”. The value of Nonce can be changed by the miner. When Nonce is changed, a new hash value is generated. This hash value may be below the target, and thus meet the acceptance criterion. If not, then the Nonce is changed again. This process continues until the target is achieved. This is a very computation intensive process, and guessing a value for Nonce is not feasible due to the nature of cryptographic hash functions. There is no shortcut and hence different values for Nonce are tried by the miners, until they stumble upon a value which will get the block accepted. The miners competition is ‘Who finds such a value for Nonce first?’
|Sometimes adjusting Nonce may not be enough to generate a valid block. There are more provisions for making an impact on the hash value of the block-header. In one particular type of transaction, called the coinbase transaction additional Nonce can be provided.|
The miner has to try way too many combination of Nonce values to get the hash below the target. A sample of how nonce impacts the hash value is given in the proof-of-work link on bitcoin wiki.
Lesser the target value, harder it is to mine a block.
As technology evolves, the processing power increases. This is happening at a very rapid pace. To keep up with the evolving technology, the difficulty keeps adjusting itself. This difficulty adjustment is enforced by each node in the network. The adjustment ensures that a new block can only be submitted roughly in 10 mins of time. There is a timestamp in each block-header. If the last 2016 blocks have taken less than 20160 minutes, then the difficulty value is adjusted such that it becomes harder to mine blocks. i.e. the target value computed from difficulty value is a smaller number. Thus mining complexity always catches up with technological advances! The details can be found here:
The Block’s data: Transaction List
The transaction list in the block impacts the block-header. If the sequence of the same list of transactions is altered, the block-header changes. The block-header has a field called hashMerkleRoot. This is a 32 byte hash value of all the transactions in the block.
Merkle Root Computation
Let us assume there are 3 transactions (T1, T2, T3) in the block. A double hash is applied on each transaction. A double hash is exactly like the double hash in prevBlockHeader.
d1 = SHA-256 ( SHA-256 (T1) ) d2 = SHA-256 ( SHA-256 (T2) ) d3 = SHA-256 ( SHA-256 (T3) )
Now d1 and d2 are concatenated and another double hash is generated.
d12 = SHA-256 ( SHA-256 (d1 concat d2) )
As we are left with only one double hash corresponding to transaction T3, we concatenate d3 with itself and generate the double hash again.
d34 = SHA-256 ( SHA-256 (d3 concat d3) )
We follow this process until only one double hash is left, and we call it the merkle root. To generate the merkle root:
hashMerkleRoot = SHA-256 ( SHA-256 (d12 concat d34) )
Notice how we formed a binary tree of hashes. As the number of transactions increase, the depth of the tree increases.
If any of the transactions change or the sequence is altered, the hashMerkleRoot will change and thus the block-header will change. It has a huge cascading effect. All the peers can figure out independently, even if only the sequence is altered. The hash values will not match the data.
The miner’s reward
For the successful operation of Blockchain, miners must remain independent entities. The large the number of such entities, harder it is to rig the system. Honest miners have an incentive to be a part of this network. As the blocks that are submitted by miners get accepted through the network, the miner receives 12.5 BTC (Bitcoin Currency). This is as of Dec 2016. This will reduce in future, as Bitcoin is designed to be a deflationary currency. As this is a discussions around economics and monetary policies, it is not a part of this post. The links below provide details on these aspects:
Claiming the reward
The miner claims this reward by injecting a transaction called the coinbase / generation transaction as the first transaction in the block. If the block is accepted as a part of the Blockchain by other peers in the network, this reward is available for the miner to spend. Whenever a block is submitted, all the transactions and the hash values get validated by the peers.
Commission / Tip for the miner
Unless a transaction is included in the Blockchain, it is not considered confirmed. If I want to get a transaction executed fast, I need to provide an incentive to the miner to include my transaction in his block. I do this by leaving a tip for the miner.
In any transaction there is an input (source account and amount of money) and output (destination account and amount of money). The difference in the amounts of input and output typically is the tip. While transacting government currency, the banks / payment processing companies charge the processing fees. In Bitcoin it is up to you, if you want to pay any tip. Your transaction will eventually make it to some block.
Bitcoin and Transactions
Bitcoins can be bought from a friend who has some or from an ATM. Either ways you need “someone” who trades Bitcoins for some work or equivalent government currency. As Bitcoin is not a physical currency, it needs an electronic store. The Bitcoins you own are actually recorded in the Blockchain in the form of transactions. A Bitcoin transaction is like a transfer of ownership. This transfer of ownership happens to an account or what is called an address.
There are different kinds of addresses like joint-accounts, multi-signature accounts, accounts that can be used for arbitration etc. For now, we will focus on single owner accounts / addresses.
The address is not exactly an equivalent of an account. Once an address is used for a transaction, do not reuse the same address for another transaction. This is not a limitation, as you can create as many addresses as you need without even being online. Remember every node in the network is a peer, there is no central authority which will grant addresses to you! Address is actually a cryptographic hash of the public-key. The following link describes issues with address reuse: Address reuse.
For the single owner address:
address' = RIPEMD-160 ( SHA-256 (public-key) )
The address’ is a 160 bit number (20 bytes integer). This can be represented in various ways and the chosen way to represent this address is base58check encoding. Details can be found here. However, for now it suffices that this is a much shorter form of representation.
address = base58check_encode ( address' )
|There are different forms and formats of addresses, which are not discussed in this post, to keep the explanation simple. These other forms may be reviewed online later: Bitcoin Address and Technical Background.|
The Bitcoin one owns is actually stored as a transaction in the Blockchain. Let us assume that I buy 100 Bitcoins from a friend of mine. He pulls out this Bitcoin client (the bitcoin wallet), creates a transaction such that the output has 100 Bitcoins addressed to me, and submits it to the network. I provide him the address.
Every transaction has Inputs and Outputs. My friend’s wallet chose 3 previous transaction which had a total of 105 Bitcoins and created the inputs. As he wants to pay me only 100 Bitcoins, and wants to leave 1 Bitcoin as a tip for the miner, he pays himself 4 Bitcoins. The inputs add up to 105 Bitcoins, while the outputs add up to 104 Bitcoins. This leaves 1 BTC as tip for the miner.
Peculiarity of Bitcoins:
- There are no such 100, 50, 20, 10, 1 denominations. You can have bitcoins of 0.985 or any number greater than .00000001 (10-8). This smallest unit in Bitcoin is called Satoshi.
- When you spend the Bitcoin, it must be spent totally. If I want to spend 2 Bitcoins at a restaurant from the 100 Bitcoins that I have received, I would need to spend the 100 Bitcoins, and create an output of 2 Bitcoins for the restaurant and another output of 98 Bitcoins paying myself. Assuming no tip for miner.
- The difference in value between Inputs and Outputs is left as tip for miner.
- The unspent transaction outputs (the Bitcoins available to spend) are often referred to as UTXOs.
- Once a Bitcoin (output of a transaction) is spend, it cannot be spent again.
Transaction: Claiming ownership and spending Bitcoins
A transaction is identified by a TxnID, which is nothing by a double hash of the transaction. The link to previous transaction’s output is provided by the pair (TxnID, index_of_output). The transaction described above has 2 output items, and the one given to me is at index 0.
TxnID = SHA-256 ( SHA-256 (Transaction) )
The transaction record shown above is stored in the open, in the Blockchain. In essence all your Bitcoins are actually out there on the network. Looking at the transaction record no one can figure out who this belongs to. The address is just a 160 bit number after all.
If you want to spend your Bitcoins, you need to claim them. To claim the Bitcoins you need to send over a “script” along with the inputs in the transaction record. If the script succeeds, then you get to claim the Bitcoin and the transaction is accepted by the network.
The Output in which 100 Bitcoins are addressed to me, actually has a script. The script has the address. This address could be the public-key hashed as described above or something else. Yes, its not necessary that Bitcoins have to be addressed to a public-key. That is just one of the common ways to write a Bitcoin transaction. The script in the output is called the “lock script”.
The Input also has a script in which it claims the Output of the previous transaction. This script is called the “unlock script”.
One of the common ways to create the unlock and lock script is:
Unlock script provided with input: 1. Provide a Digital Signature of the transaction until that input. 2. Provide the public_key. Lock script recorded with output: 3. Make a copy of the public_key from unlock-script 4. Create address from that public_key 5. Compare the address in the lock script with the created address 6. Check digital signature with the provided public_key
The scripting language is very elaborate for the nature of the work it does, and therefore is not limited to the above usage. The details of the script can be found here: Bitcoin Script.
The script is interpreted by each node in the network. Therefore a bad transaction with an invalid script will not be able to make it through. None of the nodes will accept such a transaction.
|The technique of using lock and unlock scripts is a wonderful way of validation and claim. I believe, this is one such technique which has gotten much less attention by the industry, than what it deserves.
One may argue that the claim validation mechanism could have been much simpler. This might be because of the initial reading of the link: Bitcoin Script. However, such an argument undermines the versatility it provides and potential future usage.
The main Blockchain and forks
It is quite likely that multiple miners would be able to submit their blocks for inclusion in the block chain around the same time. If Miner_1 created Block_443350 and submitted it to a Peer_110345 and at the same time Miner_2 created the same Block number and submitted it to Peer_2345. Both the Blocks will get accepted. This is known as a fork.
|In all likelihood, the block-header created for the same block by Miner_1 and Miner_2 will be different. This is because the transactions submitted could be far away from one miner compared to the other. Different propagation delays for the miners cause difference in the sequence of the transaction list.|
The possibility described above is very likely and happens often. It is the responsibility of the peers to relay valid blocks further into the network, and thus after some time the fork becomes evident to all the peers.
When such a situation arises, both the blocks are accepted. Eventually after 10 mins either of the blocks will get extended by Block_443351. If it happens so, the extended block automatically becomes the main branch.
However, it is possible that both the blocks got extended, again at the same time by Miner3 and Miner4. The fork continues. Usually within 6 blocks only one main branch would have emerged.
The rule for each peer is to choose a main branch such that cumulative difficulty of the branch until the first block ever created is the greatest. The first block is known as the genesis block.
Note that the consensus is emerging over blocks, and not at the time of submission / acceptance of the block! Hence, this is called “emergent consensus”. The consensus evolves as all peers follow certain rules: Protocol rules.
Confirmation for transaction
Each block mined upon your transaction is a confirmation. To make sure that your transaction will be fully honoured, wait for 6 confirmations. Usually forks resolve in 6 blocks, hence the most common recommendation is to wait for 6 confirmations.
In short you may consider your transaction “settled” (in common banking terminology) within 1 hour (10 mins per confirmation).
The following link provides more details on confirmations: Confirmation.
It is practically impossible to make a change in any transaction or block and recompute the hash values after so many confirmations. The re-computation / mining would have to be done for all the transactions done after the modified transaction / block. Even if such a change is done and a block is submitted, none of the peers will consider it as the main branch of the Blockchain. This is because the existing main branch will have a far greater cumulative difficulty than the fork that has been maliciously created.
The duration needed to compute each block will remain roughly around 10 mins. All the other honest miners will keep adding blocks at that speed, making it harder for the malicious miner to try to convert his branch as the main branch.
The emergent consensus, mining, the mining difficulty, tip, reward, lock-unlock scripts and users willing to try Bitcoin together makes the Bitcoin and Blockchain technology work. The bitcoin protocol followed by the peer-to-peer network nodes prevent rogue elements from disrupting the Blockchain.
A lot of design problems encountered in distributed systems and financial systems have been solved by Bitcoin and Blockchain. Quite a few of the techniques and tricks can be used by developers in their work, and some more innovative solutions can be created from scratch.
The code for Bitcoin is open for anyone to see, take and contribute. It can be found here: https://github.com/bitcoin/bitcoin. It is available under MIT license which is one of the most permissive open source licenses.
I hope this post serves as a good starting point for someone who is interested in knowing Bitcoin and Blockchain well.