[Part 2/2] What Is The Blockchain And How It Keeps Digital Currencies Secure
In the first part of the series we saw the basic structure and functionality of the Blockchain. In this part, we will introduce you to one of cryptography's most valuable tools, the hash function, and see how it is applied within the Blockchain system. Then, we'll talk about how one or more users can try to modify the transaction record to their advantage, and whether they can succeed or not.
- The sealing process within the Blockchain
- How is a transactions document finally sealed?
- Protection from future modifications
- What if the malicious users are many?
Before we get to see how a document is sealed off, let’s talk about hashing.
In general, a function is a machine that you feed with data (input) and that gives you back a result (output). So if, for example, you insert the number “90” into the machine as input, it will return the string "d3k" as output.
Without going into mathematical details that are unnecessary for the purposes of this article, we’ll define a hash function as a function that accepts data of arbitrary size as input and converts it into a value of fixed size.
Additionally, hash functions have an important property that makes them extremely useful: the conversion from input to output cannot practically be reversed. That is, if you have the string "d3k", there is no feasible way to work out which number was fed into the machine to produce it. On the other hand, every time you use the same input, you will always get the same output (it is a deterministic process).
So, we could write that process down as a simple mathematical model this way:
hash(90) = “d3k”
If you try another number, then the hash function will return another result:
hash(5) = “download3k”
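To make this more concrete, here is a minimal sketch in Python. It assumes SHA-256 (from Python's standard hashlib module) as the hash function, and the helper name toy_hash is our own invention for illustration. Real outputs are long hexadecimal strings rather than the short "d3k" of our toy example.

```python
import hashlib

def toy_hash(value) -> str:
    """Hash the text form of any value with SHA-256 (an assumed choice)
    and return the digest as 64 hexadecimal characters."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()

print(toy_hash(90))                   # some 64-character string
print(toy_hash(5))                    # a completely different string
print(toy_hash(90) == toy_hash(90))   # True: same input, same output

# The output size is fixed, no matter how large the input is.
print(len(toy_hash(90)), len(toy_hash("x" * 10_000)))  # 64 64
```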
Suppose that you want to find the number which returns as an output a string that starts with three zeros, i.e.:
hash(x) = “000…”
You need to keep feeding the machine different numbers until it returns a string that starts with three zeros. If you are lucky, after a few thousand trials perhaps, you will find a number that produces such a string. This brute-force method is slow and exhausting. However, if you just want to know whether, for example, the number 986 returns a string that starts with three zeros, things are a lot easier: all you have to do is try that number and check whether the result meets the requirement.
From this we can conclude that if someone has a number (input) and a string (output), he can easily check whether the two form a valid pair for the hash function: all it takes is to feed the machine with the input and see if the resulting string matches his own.
On the contrary, it is very difficult to find out which number a string came from, if the string is all you have. An enormous number of trials would have to be performed, entering a different number every time and checking whether the machine returned the desired output.
As we mentioned above, this is the most important property of hash functions. If we know the output, it is incredibly difficult to figure out the input, and if we have an input-output pair, it is very easy to check their validity.
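The difference between searching and checking can be sketched in a few lines of Python. This is only an illustration: SHA-256 is an assumed choice of hash function, and the names toy_hash, find_input and is_valid_pair are hypothetical helpers.

```python
import hashlib
from itertools import count

def toy_hash(value) -> str:
    # SHA-256 is an assumption; any cryptographic hash would do here.
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()

def find_input(prefix: str = "000"):
    """Brute force: try 0, 1, 2, ... until the hash starts with `prefix`.
    Returns the input that was found and the number of trials it took."""
    for x in count():
        if toy_hash(x).startswith(prefix):
            return x, x + 1

def is_valid_pair(x, output: str) -> bool:
    """Checking a given input-output pair takes a single hash."""
    return toy_hash(x) == output

x, trials = find_input()
print(f"found x = {x} after {trials} trials")
print(is_valid_pair(x, toy_hash(x)))   # True
```

With three leading hexadecimal zeros, the search needs a few thousand trials on average, while the check always needs exactly one.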
Let’s see how a hash function works in the Blockchain.
Suppose that we have the number 88234, and we need to find which number should be added to it so that the hash function returns a string that starts with three zeros. This case is similar to the previous one, since in order to find the number we are looking for we must try a great many candidates.
Suppose now that the number we are looking for is 64274, which, when added to 88234 and processed through the hash function, results in a string that meets our requirements. In this case, the number 64274 becomes the seal for the number 88234. If we suppose that we have a transactions document with the number 88234, then in order to seal it you need to mark it with the number 64274.
This number is the so-called "Proof Of Work", a name suggesting that effort has been made to calculate it. Anyone who wants to check whether a document has been modified only has to add that number to the contents of the page and feed the result through the hash function.
In order to seal a document containing the network’s transactions, we need to calculate a number that, when attached to the transactions list and given as input to the hash function, produces a string starting with three zeros.
Note: The three-zeros example is just for grasping the context of the process. More complex computations are performed in the real Blockchain.
Once this number has been calculated, at the cost of time and energy, the document is sealed. The number allows anyone to check the integrity of the page: if someone changes its contents, the seal will no longer produce the required result.
So, let's go back to where we left off: we've written some transactions on a page, it becomes full, and all ten people in the network calculate the seal number. The first one who finds the result tells the others. When the others hear the number, they make a validation check. If it produces the required result, everyone adds it to their page and stores it in their folder.
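The sealing process described above could look roughly like this in Python. Treat it as a sketch under stated assumptions: SHA-256 stands in for the hash function, "attaching" the number means appending it to the page text after a "|" separator, and the transactions and helper names (find_seal, check_seal) are made up for the example.

```python
import hashlib
from itertools import count

PREFIX = "000"  # the toy requirement used in this article

def toy_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def find_seal(page: str) -> int:
    """Try 0, 1, 2, ... until page + number hashes to a string
    that starts with three zeros. This is the expensive part."""
    for seal in count():
        if toy_hash(f"{page}|{seal}").startswith(PREFIX):
            return seal

def check_seal(page: str, seal: int) -> bool:
    """Anyone can verify a seal with a single hash."""
    return toy_hash(f"{page}|{seal}").startswith(PREFIX)

page = "Alice pays Bob $5; Bob pays Carol $2"   # hypothetical transactions
seal = find_seal(page)
print(seal, check_seal(page, seal))   # the check prints True

# Changing the page breaks the seal (barring a roughly 1-in-4096 fluke
# at this toy difficulty).
tampered = page.replace("$5", "$500")
print(check_seal(tampered, seal))
```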
But what if someone else in the network (e.g. user #3) does not get the right result with this number? Such cases are not uncommon, since user #3 may have:
- Misinterpreted a transaction that was communicated to him in the network.
- Incorrectly written down a transaction that was announced in the network.
- Tried to steal when recording the transactions in order to favor himself or someone else on the network.
Whatever the reason might be, user #3 has only one option: to discard his document, copy the correct one from someone else and add it to his own folder. If the page is not added to the folder, user #3 will be unable to keep recording transactions and will thus cease to be part of the network.
Thus, the number that the majority agrees on is considered to be the trusted seal number.
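Here is one way to picture that agreement step in Python. It is a simplified sketch, not the real protocol: ten users each hold a copy of the page, user #8 is assumed to find the seal first, user #3 holds a wrong copy, and everyone adopts the page that most users hold. SHA-256 and all names are assumptions made for the example.

```python
import hashlib
from collections import Counter
from itertools import count

PREFIX = "000"

def toy_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def find_seal(page: str) -> int:
    for seal in count():
        if toy_hash(f"{page}|{seal}").startswith(PREFIX):
            return seal

def check_seal(page: str, seal: int) -> bool:
    return toy_hash(f"{page}|{seal}").startswith(PREFIX)

honest_page = "Alice pays Bob $5; Bob pays Carol $2"
pages = {user: honest_page for user in range(1, 11)}
pages[3] = "Alice pays Bob $5; Bob pays Carol $20"   # user #3 got it wrong

seal = find_seal(pages[8])   # user #8 finishes first and announces the seal

# Every user validates the announced seal against his own copy.
accepted = [user for user, page in pages.items() if check_seal(page, seal)]
print("seal accepted by users:", accepted)

# The page held by the majority is the trusted one; anyone who
# disagrees discards his copy and takes the trusted page instead.
trusted_page, votes = Counter(pages.values()).most_common(1)[0]
for user in list(pages):
    if pages[user] != trusted_page:
        pages[user] = trusted_page
print(f"{votes} of 10 users already held the trusted page")
```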
However, since someone will eventually find the result and announce it to others, why not just chill and wait for him to finish the hard work?
At this point, motivation comes into play. All those who are part of the Blockchain are entitled to rewards. The first one to calculate the seal number is rewarded with a sum of money for the computational power and time consumed.
So, if user #8 finds the sealing number first, he will be given a certain amount of money, e.g. $1, without taking it from another user. In other words, $1 was produced.
Now, you can think of a document as a transactions Block, and the folder as a Chain of Blocks, thus the term “Blockchain”.
If someone decides to alter a transaction to his advantage, we can find him through the page’s seal number. But what if he also modifies this number and points to the modified page with it?
To avoid the scenario in which someone goes to a previous Block and changes both the Block and its security number, the protocol is actually a bit different.
We said earlier that we have a known number, e.g. 88234, which represents the list of transactions per page. We add to it one more number (after computing it), e.g. 64274, which is the number with which we seal the page.
Actually, there is also a third known number in this operation. It is the result of the hash function for the previous document (Block).
With this simple trick, each Block is guaranteed to depend on the previous one. So, if one of the users wanted to go back and alter a Block in order to steal, he would also have to recompute the seal numbers of that Block and of every Block that follows it. This requires a huge effort on this user’s part.
An altered Block will lead to a different Blockchain from the one created by the majority of the team. The one who attempts to steal must create the new, false chain on his own. Therefore, he would have to make a disproportionate effort to build his chain at the same speed as the proper one is built by the rest of the team.
Due to the fact that the “good” users are (at least in our example) the majority, the true chain will always be the longest in the network.
If the “bad” users in our example number six (more than half of the total users), the protocol will collapse. This scenario is known as the "51% Attack". If the majority of the network’s users decide to steal from the rest of the network, the protocol will fail by design. The longest chain will no longer be the right one; it will be the one modified in the interest of the malicious users.
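The chaining idea can be sketched as follows. Again, this is an illustration under assumptions: SHA-256 as the hash function, blocks as plain dictionaries, a string of 64 zeros as a hypothetical "previous hash" for the very first Block, and invented transactions.

```python
import hashlib
from itertools import count

PREFIX = "000"
GENESIS = "0" * 64   # hypothetical placeholder: the first Block has no predecessor

def block_hash(block: dict) -> str:
    # The three known numbers: previous hash, transactions, seal.
    text = f"{block['prev_hash']}|{block['transactions']}|{block['seal']}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def seal_block(transactions: str, prev_hash: str) -> dict:
    """Search for a seal so that the Block's hash starts with three zeros."""
    for seal in count():
        block = {"prev_hash": prev_hash, "transactions": transactions, "seal": seal}
        if block_hash(block).startswith(PREFIX):
            return block

def is_valid_chain(chain: list) -> bool:
    prev_hash = GENESIS
    for block in chain:
        if block["prev_hash"] != prev_hash:             # broken link
            return False
        if not block_hash(block).startswith(PREFIX):    # broken seal
            return False
        prev_hash = block_hash(block)
    return True

chain, prev = [], GENESIS
for txs in ["Alice pays Bob $5", "Bob pays Carol $2", "Carol pays Dave $1"]:
    block = seal_block(txs, prev)
    chain.append(block)
    prev = block_hash(block)

print(is_valid_chain(chain))   # True

# Altering an old Block changes its hash, so the next Block no longer
# points at it and the whole chain is rejected.
chain[0]["transactions"] = "Alice pays Bob $500"
print(is_valid_chain(chain))   # False
```

To make the altered chain pass the check again, the cheater would have to recompute the seal of the changed Block and of every Block after it.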
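A rough sketch of the "longest chain wins" rule follows. It assumes SHA-256, blocks stored as dictionaries, a made-up placeholder hash for the first Block, and a lone cheater (called Mallory here purely for illustration) who forks the chain after the first Block. None of these names come from a real implementation.

```python
import hashlib
from itertools import count

PREFIX = "000"
GENESIS = "0" * 64   # hypothetical placeholder for the first Block

def block_hash(block: dict) -> str:
    text = f"{block['prev_hash']}|{block['transactions']}|{block['seal']}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def extend(chain: list, transactions: str) -> list:
    """Return a new chain with one more sealed Block at the end."""
    prev_hash = block_hash(chain[-1]) if chain else GENESIS
    for seal in count():
        block = {"prev_hash": prev_hash, "transactions": transactions, "seal": seal}
        if block_hash(block).startswith(PREFIX):
            return chain + [block]

def is_valid_chain(chain: list) -> bool:
    prev_hash = GENESIS
    for block in chain:
        if block["prev_hash"] != prev_hash or not block_hash(block).startswith(PREFIX):
            return False
        prev_hash = block_hash(block)
    return True

def choose_chain(candidates: list) -> list:
    """Among the valid chains, trust the longest one."""
    return max((c for c in candidates if is_valid_chain(c)), key=len)

shared = extend([], "Alice pays Bob $5")
honest = shared
for txs in ["Bob pays Carol $2", "Carol pays Dave $1", "Dave pays Alice $3"]:
    honest = extend(honest, txs)          # the majority keeps adding Blocks
attacker = extend(shared, "Bob pays Mallory $2")   # the cheater's lonely fork

print(len(honest), len(attacker))                    # 4 2
print(choose_chain([attacker, honest]) is honest)    # True
```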
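To get a feel for why the majority matters, we can borrow a standard gambler's-ruin approximation. This model is an assumption we bring in for illustration, not part of the story above: it supposes that every user has the same computing power, so the cheaters' chance of sealing the next Block equals their share of the users, and it asks how likely they are to ever catch up from a given number of Blocks behind.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Chance that the cheaters' chain ever catches up with the honest one.

    q is the cheaters' chance of sealing the next Block and p = 1 - q is
    the honest users' chance. With q >= p the cheaters always catch up
    eventually; otherwise the chance shrinks as (q / p) ** blocks_behind.
    """
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0
    return (q / p) ** blocks_behind

# Ten users, as in our example, with the cheaters three Blocks behind.
for bad_users in (1, 3, 6):
    chance = catch_up_probability(bad_users / 10, blocks_behind=3)
    print(f"{bad_users} bad users: {chance:.4f}")
```

Under this model, a single cheater among ten users almost never catches up, while six cheaters are certain to.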
Although this is a low-probability scenario, it is also one of Blockchain's few security gaps. Blockchain has been built with the noble assumption that the majority in a group of people will always be “good”.
There you go, now you know in simple terms how Blockchain works. If you have any questions or thoughts, let us know in the comments section below!