An absolute beginner's guide to blockchain - Part 3
So here is the long-awaited Part 3 of my absolute beginner’s guide to blockchain.
Before I go on I’m going to introduce an amusing paradox: if you have read Parts 1 & 2, then you are far from being an absolute beginner when it comes to blockchain technology and Bitcoin. You know all about the fallibility of fiat currency, you know about distributed ledgers, and you even understand a bit about how encryption is used to ensure transactions are secure.
But I quite like the title, and changing it now feels like it would be a big deal. So we’re going to stick with it for now.
Maintaining The Distributed Ledger
The thing we need to talk about today is how we ensure that everyone’s version of the ledger is the same. There is no point in relying on a distributed ledger if each version of it is different.
How do we make sure that Mark is able to spend the bitcoin I sent him? For that to happen, the accepted version of the ledger must record him as having received that bitcoin in the first place. What if I, in a fit of cunning, decide to broadcast my message only to Mark, instead of to the network as a whole? Mark will think I have paid him as we agreed, but the distributed ledger as a whole will not agree. At this point we essentially have two different ledgers, which I am sure you will agree is far from ideal.
The simplest and most accurate answer, and the one that most people give, is that the version of the ledger that has had the most computational work put into it is the trusted one.
But clearly that makes no sense at all unless you already understand it, and so it isn’t very helpful as statements go.
The principle behind it, however, is that carrying out any form of fraudulent activity on the system requires so much computational work that it becomes non-viable. To explain that we need to look at what “blocks” are, how they are created, and how they are “chained” together.
NOTE: You don’t actually need to know this to buy and sell bitcoin, but if you’ve got this far I’m going to assume that you think it is as cool as I do. So here goes…
Using Encryption to Validate Information
SHA256 is what is known as a cryptographic hash function. In its most basic form, it is a way of generating a seemingly random list of characters (this list is what is known as a “Hash”). However, it is not actually random in any way.
If you put the same information in, you get the same list out.
EVERY SINGLE TIME.
If I put in the entire text of Part 1 of this series it returns the Hash:
No matter how many times I run this function the resulting hash will always be the same. As long as I don’t change the input.
If I change the input even slightly, say by one character out of hundreds of thousands, I get a completely different output. And because of the way SHA256 generates this output I cannot predict how it will be different.
As an example, if I change the first upper case letter of Part 1 to a lower-case letter and run the function again it produces this result:
You can see that the two hashes are completely different despite the input only being different by one character. If I put that upper-case letter back in, then I get the very same Hash I got earlier.
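If you would rather see this for yourself, here is a minimal sketch using Python's built-in hashlib module. The sample text is a short stand-in I have made up for this example, not the full text of Part 1, so the hashes it prints are not the ones I got above. The helper name sha256_hex is also just for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Stand-in inputs for this example: they differ only in the first letter.
original = "An absolute beginner's guide to blockchain"
altered = "an absolute beginner's guide to blockchain"

hash_original = sha256_hex(original)
hash_altered = sha256_hex(altered)

print(hash_original)
print(hash_altered)

# The same input gives the same hash, every single time.
print(sha256_hex(original) == hash_original)

# Count how many of the 64 characters differ between the two hashes.
differing = sum(a != b for a, b in zip(hash_original, hash_altered))
print(f"{differing} of {len(hash_original)} characters differ")
```

Run it as many times as you like: the first two lines never change, and they never look alike.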
There are two facts you need to accept about this to understand what I am going to say next.
1. This result is consistent. The same input will ALWAYS produce the same Hash.
2. There is currently no way to reverse engineer an input from a SHA256 Hash.
Number 2 is the tricky one to accept. It feels like it should be possible to run the system backwards, but it really isn’t.
You can play with this here.
If all you had was a Hash, and you were attempting to figure out what the original input was, the only way you could do that would be to guess an input, run it through SHA256, and then compare the output to the one you already had.
This doesn’t sound like a particularly arduous process, but bear in mind that there are 2^256 possible outputs, and that is the number of guesses you would need to make to guarantee you got the right one.
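I’ve always found it hard to really get my head round numbers expressed as powers, so here is the same number in its long form: 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936.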
That is significantly more guesses than there are atoms in the entire Milky Way Galaxy, so suffice it to say that it’s not going to be possible to guess the answer.
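Python handles whole numbers of any size, so a few lines are enough to print the number of possible SHA256 outputs in full and get a rough feel for how long guessing would take. The guessing speed of one trillion hashes per second is purely an assumption I have picked for illustration.

```python
# A SHA256 hash is 256 bits long, and each bit is either 0 or 1.
possible_hashes = 2 ** 256
print(possible_hashes)
print(len(str(possible_hashes)), "digits")

# Assumption for illustration only: one trillion guesses per second.
guesses_per_second = 10 ** 12
seconds_per_year = 60 * 60 * 24 * 365

# Whole years needed to try every possible output at that speed.
years_needed = possible_hashes // (guesses_per_second * seconds_per_year)
print(f"roughly {years_needed:.2e} years to try every output")
```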
What does this have to do with bitcoin security?
Well it’s lucky you ask that, because I’m about to tell you!
To add transactions to the official ledger they need to be part of a “block”.
That block is simply a list of the transactions that have happened in the 10 minutes or so prior to its creation.
If you take a list of transactions of bitcoin, and run that list through SHA256 you will of course get a unique hash.
The chances of the first 18 characters of that hash all being “0”s are infinitesimally small.
However, if the entire network has agreed beforehand that for a block to be considered valid it must produce a hash that starts with 18 “0”s then we must change the input.
We do that by adding a number to the end of the list and then running SHA256 again.
If you take that same list and add the number “1” to the end and try again it will generate a different hash. The odds will still be so small as to be almost impossible that the resulting hash would begin with 18 “0”s. But if you try enough different numbers, you will eventually get a hash that does indeed start with 18 “0”s.
From our brief discussion of SHA256 above we know that this is no mean feat, and so insisting on this ensures that a truly massive number of guesses must be carried out. The only feasible way for these guesses to be carried out is through the use of specialist computers running trillions of guesses. This is what we mean when we talk about the “computational work” that must be put in to ensure the list of transactions is considered valid.*
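The guessing game described above can be sketched in a few lines of Python. Treat this as a toy under stated assumptions rather than how Bitcoin software is written: the transaction list is made up, the function name mine is my own, and I only ask for 4 leading zeros instead of 18 so that it finishes quickly on an ordinary laptop. The real network hashes a compact block header (twice) and compares the result against a numeric target, but the idea of "keep changing a number until the hash looks right" is the same.

```python
import hashlib


def mine(transactions: str, zeros: int) -> tuple[int, str]:
    """Try the numbers 0, 1, 2, ... until the hash starts with `zeros` zeros."""
    prefix = "0" * zeros
    number = 0
    while True:
        # Add the number to the end of the list and hash the whole thing.
        candidate = f"{transactions}{number}"
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return number, digest
        number += 1


# Hypothetical transaction list, invented for this example.
transactions = "Me pays Mark 1 BTC; Mark pays Alice 0.5 BTC;"

# 4 zeros rather than 18, purely to keep the example fast.
number, digest = mine(transactions, zeros=4)
print("number found:", number)
print("hash:", digest)
```

Each extra zero you ask for makes the search about 16 times longer, because each character of the hash can take 16 different values.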
Once someone else has generated the hash, it is incredibly easy to check that the number is correct. Just run the block through SHA256 yourself and you’ll see the list of 18 zeros right in front of you. But finding that number in the first place involves a huge amount of work.
This number is known as a “proof of work”. And that number is intrinsically tied to the transactions in the list. If you change even 1 character in 1 of the transactions the proof of work is no longer valid – SHA256 will no longer return the same Hash.
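Here is a small sketch of both halves of that idea: checking a proof of work takes a single hash, and tampering with a transaction breaks it. The transactions, the helper names and the choice of 4 zeros are all assumptions made for this example.

```python
import hashlib

ZEROS = 4  # toy difficulty for this example, far lower than the real thing


def block_hash(transactions: str, number: int) -> str:
    """Hash the transaction list with the number added to the end."""
    return hashlib.sha256(f"{transactions}{number}".encode("utf-8")).hexdigest()


def is_valid(transactions: str, number: int, zeros: int) -> bool:
    """Checking takes one hash, however long the search took."""
    return block_hash(transactions, number).startswith("0" * zeros)


def find_number(transactions: str, zeros: int) -> int:
    """The slow part: keep guessing until the hash has enough zeros."""
    number = 0
    while not is_valid(transactions, number, zeros):
        number += 1
    return number


transactions = "Me pays Mark 1 BTC"  # hypothetical transaction
number = find_number(transactions, ZEROS)
print("valid:", is_valid(transactions, number, ZEROS))

# Change one character and reuse the same number: the hash is now
# completely different, so it almost certainly no longer has the zeros.
tampered = "Me pays Mark 9 BTC"
print("still valid after tampering:", is_valid(tampered, number, ZEROS))
```

To make the tampered list valid again I would have to redo the whole search from scratch.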
This is not all there is in a block however.
How "Blocks" are "Chained"
Each block must not only contain all the transactions from the specific period, it must also contain the hash of the previous block. This effectively chains the “block” to the one preceding it, and so on down the line, to the very first block.
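To show what that chaining looks like, here is a stripped-down sketch. I have left the proof of work out entirely so the linking is easier to see, and the Block class, the helper names and the transactions are all invented for this example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    transactions: str
    previous_hash: str  # the hash of the block that came before

    def hash(self) -> str:
        data = f"{self.previous_hash}|{self.transactions}"
        return hashlib.sha256(data.encode("utf-8")).hexdigest()


def build_chain(transaction_lists: list[str]) -> list[Block]:
    chain = []
    previous_hash = "0" * 64  # placeholder: the first block has no predecessor
    for transactions in transaction_lists:
        block = Block(transactions, previous_hash)
        chain.append(block)
        previous_hash = block.hash()
    return chain


def links_are_intact(chain: list[Block]) -> bool:
    """Every block must store the actual hash of the block before it."""
    return all(
        later.previous_hash == earlier.hash()
        for earlier, later in zip(chain, chain[1:])
    )


chain = build_chain([
    "Me pays Mark 1 BTC",
    "Mark pays Alice 0.5 BTC",
    "Alice pays Bob 0.2 BTC",
])
print("intact:", links_are_intact(chain))

# Rewrite history in the first block: the second block no longer points at it.
chain[0].transactions = "Me pays Mark 100 BTC"
print("intact after tampering:", links_are_intact(chain))
```

Because every block's hash depends on the one before it, changing an old block means redoing that block and every block after it.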
This is the “Block-chain”, and this is what the ledger is made from. And remember that it is this ledger that is actually the currency itself (if this still feels weird go back to Part 1).
However, this system does not happen by itself. The idea of the blockchain is that it allows anyone who wants to be a “block creator” to do so. In effect these people listen for transactions, group them together, and then race to create a valid “proof of work”.
This process is not simple, or even cheap. Running the sort of computer able to create a proof of work costs a lot of money, and so block creators are rewarded by being allowed to add to their list a transaction that pays themselves 12.5 bitcoins.
This process is what is often called “mining” and it is how new coins are created. Over time the number they receive for creating a valid proof of work will decrease until the supply of bitcoin reaches its fixed cap of 21 million coins.
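The shrinking reward follows a simple rule, which can be sketched as below. The details here come from how Bitcoin works rather than from anything I have covered in this series: the reward started at 50 bitcoins, it halves every 210,000 blocks, and amounts are counted in whole "satoshis" (one hundred-millionth of a bitcoin). The function names are my own.

```python
SATOSHI_PER_BTC = 100_000_000
INITIAL_REWARD = 50 * SATOSHI_PER_BTC  # reward for the very first blocks
HALVING_INTERVAL = 210_000             # blocks between each halving


def block_reward(height: int) -> int:
    """Reward in satoshis for the block at a given position in the chain."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    # Halve the reward once per interval, dropping any fraction of a satoshi.
    return INITIAL_REWARD >> halvings


def total_supply() -> int:
    """Add up every reward until it shrinks to nothing."""
    total = 0
    height = 0
    while block_reward(height) > 0:
        total += block_reward(height) * HALVING_INTERVAL
        height += HALVING_INTERVAL
    return total


print(block_reward(0) / SATOSHI_PER_BTC)        # 50.0
print(block_reward(420_000) / SATOSHI_PER_BTC)  # 12.5
print(total_supply() / SATOSHI_PER_BTC)         # just under 21 million
```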
You’ll remember we said earlier we always trust the form of the ledger that has the most computational work?
What this basically means is that it is always harder to commit fraud than not. To put that in some sort of perspective, in 2015 the global bitcoin network was assessed as having 100 times more computational power than Google.
If I wanted to fool Mark into thinking that the block I created with the fraudulent transaction in it was correct, I would not only have to create a proof of work faster than anyone else, I would then have to continue to beat everyone else with every subsequent block, forever.
It’s possible that by pure luck I might end up getting the first block out before anyone else, but sooner or later my luck is going to run out and my fraudulent chain will be shorter than the genuine chain. At this point my version of the ledger, the chain of blocks I have created, is ditched as containing less computational work, and therefore not “true”. The term for this is “consensus”.
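You can watch that luck run out with a small simulation. This is a deliberately crude model with assumptions of my own: each new block is found by the fraudster with a probability equal to their share of the total computing power, the fraudster starts one block ahead (the lucky first block), and the fraudulent chain is ditched the moment it falls behind. Real nodes compare the total work in each chain, but the shape of the result is the same.

```python
import random


def attacker_keeps_lead(attacker_share: float, blocks: int, rng: random.Random) -> bool:
    """Return True if the fraudulent chain never falls behind the genuine one."""
    lead = 1  # the attacker got lucky with the first block
    for _ in range(blocks):
        if rng.random() < attacker_share:
            lead += 1  # attacker finds the next block
        else:
            lead -= 1  # the rest of the network finds it
        if lead < 0:
            return False  # genuine chain is now longer, fraud is ditched
    return True


def survival_rate(attacker_share: float, blocks: int = 100,
                  trials: int = 2000, seed: int = 1) -> float:
    """Fraction of simulated races in which the fraud survives."""
    rng = random.Random(seed)
    wins = sum(attacker_keeps_lead(attacker_share, blocks, rng) for _ in range(trials))
    return wins / trials


for share in (0.1, 0.3, 0.45):
    print(f"attacker with {share:.0%} of the power survives "
          f"{survival_rate(share):.1%} of the time")
```

Unless I control most of the computing power in the network, the longer the race goes on, the more certain it is that I lose.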
However, this has been a massive digression.
In effect, what all of this means is that it is practically impossible to defraud the system, and that the number of new coins is extremely tightly controlled.
Just like we said money needed to be in Part 2.
And with that I’m going to stop.
In summary, Bitcoin is a valuable currency, it has an immutable ledger of transactions, currency can only be spent once, and the amount of currency is tightly controlled.
It matches our criteria exactly.
And it does so without the need for a trusted third party.
If you have got through all of this, and actually understood it, you now know more about Blockchain and Bitcoin than almost everyone out there.
In Part 4 (yes there will be a part 4, sorry…) we’ll look at some of the other potential uses for blockchain instead of simply bitcoin. That’s when things start to get really exciting!
*The number of “0”s changes depending on how many people are currently mining. It is chosen in order to ensure it still takes approximately the same amount of time to create a valid proof of work.
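For the curious, the adjustment in that footnote can be sketched roughly as follows. This is a simplification with details taken from how Bitcoin works rather than from this series: the network aims for one block every 10 minutes, it re-tunes the puzzle every 2016 blocks, and instead of counting zeros it uses a numeric "target" that the hash must fall below (a smaller target means more leading zeros). The function name and the starting target are made up for the example.

```python
TARGET_SECONDS_PER_BLOCK = 10 * 60
BLOCKS_PER_ADJUSTMENT = 2016


def adjust_target(old_target: int, actual_seconds: int) -> int:
    """Return the new target. A smaller target is harder to hit."""
    expected_seconds = TARGET_SECONDS_PER_BLOCK * BLOCKS_PER_ADJUSTMENT
    # Limit any single adjustment to a factor of 4 in either direction.
    clamped = max(expected_seconds // 4, min(actual_seconds, expected_seconds * 4))
    # Blocks arriving too quickly shrink the target, making the puzzle harder.
    return old_target * clamped // expected_seconds


old_target = 2 ** 200  # arbitrary starting value for this example
one_week = 7 * 24 * 60 * 60

# 2016 blocks should take two weeks. If they only took one, more people
# are mining, so the target halves and the puzzle becomes twice as hard.
new_target = adjust_target(old_target, one_week)
print(new_target < old_target)
```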