From the previous blogs, we understood the following:
- Complexity & cost of the current financial systems and what constitutes a blockchain
- How hashing ensures data integrity & trust-less verification of blocks
In this blog, we will understand how the data is stored and how decentralization plays a critical role in securing the Bitcoin network
Immutable Ledger
Let us start with the immutable ledger. As we saw in earlier blogs, each block has a list of transactions, and the miner who “wins” in cracking the cryptographic puzzle gets to link their block to the blockchain. This “about to be added” block carries the latest block’s hash in its previous hash field, which ensures the link is proper. Similarly, the next block to be added will carry this current block’s hash as its previous hash, ensuring all the blocks are sequentially connected in a chain
Now if someone wants to change the data of an already approved block, it becomes almost impossible, because the changed block gets a new hash and every subsequent block would show that its previous hash no longer matches (as we already saw with the beauty of SHA256 hashing, even a single space has an avalanche effect on the hash). The network would therefore not allow it to replace the already attached block. This is why we call this an immutable ledger. To be clearer, I will explain with an example
In the above diagram, the first blockchain is the correct blockchain, which has passed all the checks. Now let us say a bad actor wants to insert a transaction that benefits them. This person generates a block with a fraudulent transaction at B#:56004 (black block) and tries inserting it into the blockchain. The moment the fraudulent miner does this, the subsequent blocks (B# 56005 onward) turn red, indicating that the previous hash does not match. For the fraudulent miner to make their version of the blockchain the one everyone “keeps”, they would need to redo all the subsequent blocks. This change cannot stay in their own copy either: the rest of the network has to accept the rewritten chain, which in practice means controlling a majority (51%) of the network's mining power (we will see this in the subsequent section)
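The example above can be tried out with a small Python sketch. It is a toy model, not Bitcoin's actual block format: the field names, the way the block contents are joined into a string before hashing, and the made-up transactions are all assumptions for illustration. It builds a short chain, tampers with block 56004, and lists the blocks that "turn red".

```python
import hashlib

def block_hash(block):
    """SHA-256 over the block's contents, including the previous block's hash."""
    # Assumption: a simple "|"-joined string stands in for a real block header.
    payload = f"{block['number']}|{block['transactions']}|{block['prev_hash']}"
    return hashlib.sha256(payload.encode()).hexdigest()

def build_chain(start_number, tx_lists):
    """Build a toy chain where each block stores the hash of the block before it."""
    chain, prev_hash = [], "0" * 64
    for offset, txs in enumerate(tx_lists):
        block = {"number": start_number + offset, "transactions": txs, "prev_hash": prev_hash}
        chain.append(block)
        prev_hash = block_hash(block)
    return chain

def invalid_blocks(chain):
    """Return the numbers of blocks that can no longer be trusted.

    Once one link fails to match, every later block is flagged as well,
    because it sits on top of a block that is no longer valid.
    """
    invalid, broken = [], False
    for earlier, later in zip(chain, chain[1:]):
        if not broken:
            broken = later["prev_hash"] != block_hash(earlier)
        if broken:
            invalid.append(later["number"])
    return invalid

# Hypothetical transactions for blocks 56003 to 56006.
chain = build_chain(56003, ["A pays B 1", "C pays D 2", "E pays F 3", "G pays H 4"])
print(invalid_blocks(chain))   # [] : the untouched chain is consistent

# The bad actor rewrites a transaction inside block 56004.
chain[1]["transactions"] = "C pays Mallory 200"
print(invalid_blocks(chain))   # [56005, 56006]
```

To get rid of the flagged blocks, the bad actor would have to recompute block 56005 with the new hash of 56004, which in turn changes the hash that 56006 expects, and so on up to the tip of the chain.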
P2P Network
In the previous blog, we briefly came across the term node, where the node verifies the block of a “successful” miner and accepts or rejects it. Let us go a little further and understand what a node is
A node is a computer that holds a *full copy of the Bitcoin ledger and is connected to every other node, either directly or through other nodes
* Due to the increasing size of the whole Bitcoin blockchain, currently standing at 493 GB as of Oct 2022, there are also nodes that do not hold a copy of the entire blockchain
The nodes therefore form a distributed peer to peer (P2P) computer network in which they are constantly talking to each other. Every time a new block is added, which for Bitcoin is around every 10 minutes, every node on the network verifies the new block's hash (along with many more checks) and, if satisfied, links the new block to its local copy of the blockchain and propagates it to the other connected nodes. If a node has a mismatch, it gets detected quickly because the nodes are continuously talking to each other; the out-of-step node then drops its unmatched blocks in favour of the chain the rest of the network agrees on (in Bitcoin, the valid chain with the most accumulated proof of work), thus ensuring inconsistent information is removed
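The verify, link and propagate steps can be sketched in a few lines of Python. This is a simplified illustration and not how Bitcoin's software is written: the `Node` class, its `receive` method and the block fields are hypothetical names, and the only check performed is the previous hash link, whereas a real node runs many more checks.

```python
import hashlib

def block_hash(block):
    """SHA-256 over a toy block (assumed fields: number, transactions, prev_hash)."""
    payload = f"{block['number']}|{block['transactions']}|{block['prev_hash']}"
    return hashlib.sha256(payload.encode()).hexdigest()

class Node:
    """Hypothetical node holding its own local copy of the chain."""

    def __init__(self, name, chain):
        self.name = name
        self.chain = list(chain)   # each node keeps an independent copy
        self.peers = []            # nodes this node talks to directly

    def receive(self, block):
        """Verify a block against the local tip; if valid, link it and relay it."""
        tip = self.chain[-1]
        is_next = block["number"] == tip["number"] + 1
        links_to_tip = block["prev_hash"] == block_hash(tip)
        if not (is_next and links_to_tip):
            return False           # rejected: not stored, not relayed
        self.chain.append(block)
        for peer in self.peers:
            peer.receive(block)    # peers that already have it reject the repeat
        return True

genesis = {"number": 0, "transactions": "genesis", "prev_hash": "0" * 64}
a, b, c = Node("a", [genesis]), Node("b", [genesis]), Node("c", [genesis])
a.peers, b.peers, c.peers = [b], [a, c], [b]   # a and c only reach each other through b

good = {"number": 1, "transactions": "A pays B 1", "prev_hash": block_hash(genesis)}
bad = {"number": 2, "transactions": "C pays Mallory 200", "prev_hash": "f" * 64}

accepted = a.receive(good)   # valid: spreads from a to b to c
rejected = a.receive(bad)    # previous hash does not match a's tip
print(accepted, rejected, [len(n.chain) for n in (a, b, c)])   # True False [2, 2, 2]
```

Node c never talks to node a directly, yet it still ends up with the good block, because every node that accepts a block passes it on. The bad block stops at the first node that checks it.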
The distributed network also ensures that redundancy is built into the design. If any node logs off, or if its data is corrupted or lost, the blockchain continues uninterrupted; the impacted node simply copies the latest data from another node and starts verifying again when it is back online. This is the stark opposite of a centralized system, where if the Bank’s server goes down, you cannot transact (you do recollect the SMS where the Bank informs you that their systems would be down for scheduled maintenance)
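As a rough sketch of that catch-up step, the Python below lets a node that was offline copy the blocks it missed from a peer, checking each link as it goes. The function names (`make_block`, `catch_up`) and block fields are made up for illustration, and it assumes the returning node's existing blocks are the same as the peer's first blocks; real nodes do considerably more.

```python
import hashlib

def block_hash(block):
    """SHA-256 over a toy block (assumed fields: number, transactions, prev_hash)."""
    payload = f"{block['number']}|{block['transactions']}|{block['prev_hash']}"
    return hashlib.sha256(payload.encode()).hexdigest()

def make_block(number, transactions, prev_block=None):
    """Create a toy block linked to prev_block (or to an all-zero hash if there is none)."""
    prev_hash = block_hash(prev_block) if prev_block else "0" * 64
    return {"number": number, "transactions": transactions, "prev_hash": prev_hash}

def catch_up(local_chain, peer_chain):
    """Return the local chain extended with the blocks it is missing from the peer.

    Every copied block must link to the block before it; otherwise the
    peer's data is refused instead of being trusted blindly.
    """
    synced = list(local_chain)
    for block in peer_chain[len(synced):]:
        if block["prev_hash"] != block_hash(synced[-1]):
            raise ValueError(f"block {block['number']} does not link to the local chain")
        synced.append(block)
    return synced

# The network kept going while our node was offline after block 2.
peer_chain = [make_block(0, "genesis")]
for n in range(1, 5):
    peer_chain.append(make_block(n, f"transactions of block {n}", peer_chain[-1]))

local_chain = peer_chain[:3]                    # our node only knows blocks 0 to 2
local_chain = catch_up(local_chain, peer_chain)
print(len(local_chain), local_chain == peer_chain)   # 5 True
```

Because each copied block is verified against the node's own chain, the node does not have to trust the peer it copies from: a peer handing out blocks that do not link up is simply refused.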
In the case of the malicious actor we discussed above, when they try to swap in a malicious block, they not only need to change block #56004 and all the blocks mined after it in their local copy, but they also need the majority of the network to accept these changes. As you can see, that is a practically impossible task due to the distributed P2P network
Another example
If you want a more relatable example to understand the sheer impossibility, let us overlay the above example with that of a land record transaction. Many land records are maintained in books or digitized in centralized servers, which means that if you get access to the right person, you would only need to make one change, either in a hard copy ledger or in a centralized database, and lo & behold, the land title has changed! However, if the same land records are maintained in a distributed, cryptographically secured blockchain, this act becomes practically impossible, both because of hashing, which would break the chain, and because of decentralization, where the same copy of the blockchain is running on thousands of nodes across the world. Here is an example of how the blockchain technology can solve property purchase
What have we learnt so far?
In this series, so far we have understood
- Complexity & cost of the current financial systems and what constitutes a blockchain
- How hashing ensures data integrity & trust-less verification of blocks
- How the immutable ledger and distributed P2P network ensure secure & trust-less transactions with built-in redundancy
I hope by now you are convinced of how we are able to completely remove the “middle man” and have a trust-less, robust, censorship resistant network by using cryptography & decentralization. The next nagging question you may have is what exactly the difficulty coefficient we spoke about in the earlier blog is, and how it affects mining. We shall dive into this in the next blog
Btw, unlike mining, which requires expensive special ASIC mining machines, a node has been designed such that you & I can run one on a laptop/PC to help secure the Bitcoin network. I will write about this in a later blog