Blockchain 101
An introductory guide to blockchain
Blockchain is a technology that enables information to be recorded in a shared database. It is often referred to as a "distributed digital ledger technology".
History
Blockchain originated from a technological solution proposed in 2008 by a person using the pseudonym "Satoshi Nakamoto" for the purpose of creating a platform that enables the secure exchange of a digital currency (named "Bitcoin") using cryptography without the intervention of an intermediary such as a bank or other payments processing agent.1 It is important to understand that the implementation of blockchain technology by the Bitcoin network is expected to be different from the implementation of the technology in other contexts. Blockchain technology has a broad range of potential uses that extend far beyond "Bitcoin" and other digital payments.
How it works
The database is shared among a network of participants operating the blockchain, meaning there is no single centralised version. Each participant hosts a copy of the database, which is automatically updated across the network as new information is added. This decentralisation means that the information is not controlled by any single person.
All copies of the database must remain in sync with one another or else the information would be deemed corrupted. To prevent potential corruption, blockchain uses multiple cryptographic tools to store the information into "blocks" which are verified by the network participants before being added to the database. Each blockchain network will set its own rules or "consensus protocol" which establishes how the verification process must work.
The information in the blocks is stored in a manner that enables users interacting at the application layer, which sits above the blockchain technology and would be customised for the specific requirements of the participants, to access that information.
Simplified blockchain technology stack
The particular usage of cryptography in blockchain technology, combined with blockchain's distributed nature, means that it is effectively impossible to edit information once it has been recorded in the database without the change being detected. This is where the ledger analogy comes in. A ledger is a permanent record of debits and credits; if an error is made in the ledger, an entry cannot be struck out. Instead, additional entries of debits and/or credits must be recorded to rectify the error. Therefore, participants can trust that information stored in the blockchain has not been altered. This security is key to the appeal of blockchain to businesses around the world.
Underlying technologies
Blockchain is unique in its application of certain pre-existing technological concepts to achieve an extremely secure method of recording information. As a result, information stored in a blockchain can be trusted to be not only correct, but effectively incorruptible. An explanation of these underlying technologies follows.
Public key encryption
Blockchain does not rely on the username and password system which is most commonly implemented for achieving access to and protection of secure information. Instead, it relies on a two-step authentication process using public key encryption.
Each participant is issued a public key. Each public key is an algorithmically-generated string of numbers and/or letters which represents that participant. The public key is recorded as being linked to the relevant participant and can be shared with other participants to enable them to interact.
Each participant is also issued one or multiple private keys. Each private key is also an algorithmically-generated string of numbers and/or letters, but unlike the public key, these must be kept secure by the participant. Private keys are typically stored in a text file on a computer or mobile device and can themselves be encrypted to prevent others from using them. If a participant's private key were to become known to another person, that person would be able to impersonate the participant in the blockchain network.
Issuance of public and private keys, and their pairings, is managed through software at the application layer. The asymmetric algorithm used to create each set of public and private keys is such that the paired keys have a mathematical relationship that allows the private key to decrypt information encrypted using the public key.
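The mathematical relationship between paired keys can be illustrated with a toy example. The sketch below uses textbook RSA with deliberately tiny primes. RSA, the prime values and the function names are assumptions chosen for illustration only (the article does not say which asymmetric algorithm a given blockchain uses), and keys this small offer no real security.

```python
# Toy textbook RSA with tiny primes: illustration only, NOT secure.
# RSA is an assumption here; the article does not name a specific algorithm.

def make_toy_keypair(p: int = 61, q: int = 53, e: int = 17):
    """Return ((e, n), (d, n)): a public key and its paired private key."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (requires Python 3.8+)
    return (e, n), (d, n)

def encrypt(message: int, public_key) -> int:
    """Anyone holding the public key can encrypt a (small) number."""
    e, n = public_key
    return pow(message, e, n)

def decrypt(ciphertext: int, private_key) -> int:
    """Only the holder of the paired private key can recover the number."""
    d, n = private_key
    return pow(ciphertext, d, n)

public_key, private_key = make_toy_keypair()
secret = 65
ciphertext = encrypt(secret, public_key)
recovered = decrypt(ciphertext, private_key)
print(ciphertext, recovered)  # prints: 2790 65
```

Real systems use far larger keys and vetted cryptographic libraries; the point here is only that a value locked with one key of the pair is unlocked by the other.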
Therefore, with public key encryption, participants in a blockchain network can send information in a secure manner. The sender will sign his or her message with a private key and, because the corresponding public key will be the only key mathematically linked to the sender's private key, any participant who knows the sender's public key (which in theory could be everyone else in the network) can be confident that the message came from the sender. The rest of the blockchain network can then validate this information through the relevant consensus protocol, after which the information can be collected to form part of a block to be recorded in the database.
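Signing and checking a message can be sketched in the same spirit. The example below is a minimal illustration that reuses a toy RSA key pair and reduces a SHA-256 digest to fit the tiny modulus. The key values, the message text and the function names are hypothetical, and the scheme is far too weak for real use.

```python
import hashlib

# Toy RSA key pair built from the primes 61 and 53: illustration only, NOT secure.
N = 61 * 53                                           # modulus shared by both keys
PUBLIC_EXPONENT = 17                                  # public key is (PUBLIC_EXPONENT, N)
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, 60 * 52)  # private key, kept secret by the sender

def digest_as_int(message: str) -> int:
    """Hash the message with SHA-256 and reduce it to fit the toy modulus."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N

def sign(message: str) -> int:
    """Only the holder of the private exponent can produce this value."""
    return pow(digest_as_int(message), PRIVATE_EXPONENT, N)

def verify(message: str, signature: int) -> bool:
    """Anyone who knows the public key can check the signature."""
    return pow(signature, PUBLIC_EXPONENT, N) == digest_as_int(message)

signature = sign("Alice pays Bob 5 units")
print(verify("Alice pays Bob 5 units", signature))  # True: message and signature match

# An altered message should fail the check. With a modulus this small an
# accidental match is possible (roughly 1 in 3,233), which is one reason
# real keys are vastly larger.
print(verify("Alice pays Bob 50 units", signature))
```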
In a blockchain network, the public keys of the participants would be knowable by the other participants, although the real identity of each participant could be protected and, thus, remain unknown. The ability to remain pseudo-anonymous in a blockchain network is particularly attractive from a data protection perspective.
Public key encryption explained
Alice has a box with a lock on it. It is a special lock that unlike typical locks, has three states: A (locked), B (unlocked) and C (locked).
There are two keys to this lock. The first key can turn the lock clockwise (from A to B to C). The second key can turn the lock counter-clockwise (from C to B to A).
The first key is Alice's private key. The second key is Alice's public key. Alice can give copies of the public key to anyone she likes.
If Alice wants to store confidential information in her box, she can lock it with her public key (leaving the lock position at A). The only way to retrieve this information is with the private key, which only Alice has.
If someone else wants to give Alice some information, they can leave it in her box, locking it with a copy of the public key, which she has given them (also leaving the lock position at A). The only person who can retrieve this information is Alice, because she has the private key, which is the only key which unlocks the box from the A position.
If Alice wants to share information by leaving it in her box, she can lock it with her private key (leaving the lock position at C). Anyone with a copy of Alice's public key can unlock the box to retrieve that information.
Cryptographic hashing
In addition to ensuring the transmittal and sharing of information across the blockchain network is secured using public key encryption, the information itself is stored in an encrypted form.
While encryption is not a new concept, it is still not widely adopted by either businesses or end users for pure data storage purposes. This means that much of the information being stored in databases appears in plain text, easily readable and thus potentially usable for malicious purposes.
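Blockchain uses a technique called "cryptographic hashing" to seek to eliminate the risk that an unauthorised person could figure out what the underlying information stored in the database is. Strictly speaking, hashing is a one-way transformation rather than encryption, because a hash is never decrypted back into its input.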
There are two important concepts that must be understood to grasp what cryptographic hashing entails and why it provides superior security of data.
Hashing
A "hash" is the output of a computer programme which takes an input and converts it into an alphanumeric string. "Hashing" is the act of creating a hash.
The size of the alphanumeric string depends on how the hash function is programmed, but all of the outputs of the same hash function will have the same size. For example, the cryptographic hash function Secure Hash Algorithm 256-bit (SHA-256)2, which is used by the Bitcoin network, produces a 256-bit output that is usually written as an alphanumeric string of 64 hexadecimal characters, regardless of the size of the input. So, an input consisting of the entirety of a novel and an input consisting of a person's name would each produce a 64-character hash.
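The fixed output size is easy to check with Python's standard `hashlib` module. In this short sketch the sample inputs and the helper name `sha256_hex` are illustrative assumptions.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short_input = "Alice"
# A stand-in for "the entirety of a novel": one sentence repeated many times.
long_input = "It was the best of times, it was the worst of times. " * 10_000

print(len(sha256_hex(short_input)))  # 64
print(len(sha256_hex(long_input)))   # 64
```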
The same input will always have the same hash when using the same algorithmic hash function. For example, each time the input "Fox" is run through SHA-256, the same 64-character hash will be output.
In addition, any change to the input (changing a single character, changing the capitalisation of a letter, or introducing an extra space or punctuation mark) will cause the output hash to be completely different. For example, the inputs "Fox" and "fox" run through SHA-256 will have completely different hash outputs.
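Both properties, repeatability and sensitivity to small changes, can be observed directly. The following sketch hashes "Fox" and "fox" with SHA-256 and counts how many of the 256 output bits differ; the helper names are illustrative assumptions.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_bits(hash_a: str, hash_b: str) -> int:
    """Count the bit positions at which two hexadecimal hashes differ."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")

fox_upper = sha256_hex("Fox")
fox_lower = sha256_hex("fox")

print(fox_upper == sha256_hex("Fox"))  # True: same input, same hash
print(fox_upper == fox_lower)          # False: one letter's case changed
print(differing_bits(fox_upper, fox_lower), "of 256 bits differ")
```

For a well-designed hash function, roughly half of the output bits would be expected to change for any small change to the input.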
Finally, a hash function cannot be "reverse engineered". The only way to determine what input created a hash output is to guess possible inputs, over and over again, until the input entered produces the hash in question. Based on currently available computer processing power, it is "computationally infeasible" to guess the input of a cryptographically strong hash in this manner, provided the range of possible inputs is large.
All of this means that hashing is an extremely secure method of protecting stored information.
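Guessing can be sketched for a deliberately tiny search space. The example below assumes, purely for illustration, that the attacker already knows the input is three lowercase letters; the function names are hypothetical.

```python
import hashlib
from itertools import product
from string import ascii_lowercase

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def guess_input(target_hash: str, length: int):
    """Try every lowercase string of the given length until one matches."""
    candidates = product(ascii_lowercase, repeat=length)
    for attempt, letters in enumerate(candidates, start=1):
        candidate = "".join(letters)
        if sha256_hex(candidate) == target_hash:
            return candidate, attempt
    return None, 26 ** length

target = sha256_hex("fox")
found, attempts = guess_input(target, length=3)
print(found, attempts)  # prints: fox 3768
```

Three lowercase letters allow only 26³ = 17,576 candidates, so the guess succeeds quickly. Each extra character multiplies the number of candidates by 26, which is why guessing becomes infeasible for inputs that are long or unpredictable.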
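Hash functions
Key features of a hash function:
- The output always has the same size, regardless of the size of the input.
- The same input always produces the same output.
- Any change to the input produces a completely different output.
- The output cannot be reverse engineered to reveal the input.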
Merkle Trees
As stated above, a hash function can take an input of any size and convert it into the same sized output and the resulting hash cannot be reverse engineered. This means that it is possible to run two hashes through a hash function to get a resulting third hash, which is even harder to reverse engineer than the two originals. And this can be repeated again and again.
For example, eight initial pieces of information can be run through a hash function to produce eight initial hashes. These eight hashes can be grouped by two and run through a hash function to produce four hashes. Continuing on in this manner would result in one ultimate hash which represents the entirety of those eight initial pieces of information.
All of these hashes, when taken together, create a tree structure known as a "Merkle Tree"3.
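The eight-item example can be sketched as follows. This is a simplified illustration and not the procedure of any particular network: it hashes the concatenated hexadecimal strings of each pair, and it duplicates the last hash when a level has an odd number of entries. Both choices, and the sample item names, are assumptions.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def merkle_levels(items):
    """Return every level of the tree, from the initial hashes up to the top."""
    level = [sha256_hex(item) for item in items]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad an odd-sized level by repeating its last hash.
            level = level + [level[-1]]
        # Hash each neighbouring pair together to form the next level up.
        level = [sha256_hex(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

items = [f"information {n}" for n in range(1, 9)]  # eight pieces of information
levels = merkle_levels(items)

print([len(level) for level in levels])  # [8, 4, 2, 1]
print(levels[-1][0])                     # the single hash at the top of the tree
```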
Merkle Tree
A Merkle Tree is effectively a tamper-proof log of the initial eight pieces of information at the bottom of the tree. Any change to an input to a hash function results in a completely different output; if any change were made to one of the initial pieces of information, every hash on the path from that piece of information to the top of the tree would be different, including the ultimate hash at the top of the tree.
Attempts at tampering with the information in a Merkle Tree are immediately discernible by looking at this ultimate hash sitting at the top of the tree, which is called the "Merkle Root". If the Merkle Root has changed, someone has tampered with the information.
Therefore, applying cryptographic hashing enables a person to easily verify whether a Merkle Tree is a true record of information.
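A short sketch of that verification step: compute the Merkle Root of the original records, change one record, and compare. The pairing scheme, the sample records and the function names are illustrative assumptions, and the helper assumes the number of records is a power of two.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def merkle_root(items):
    """Hash pairs level by level until one hash remains (power-of-two count assumed)."""
    level = [sha256_hex(item) for item in items]
    while len(level) > 1:
        level = [sha256_hex(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

records = [f"payment {n}: 10 units" for n in range(1, 9)]
original_root = merkle_root(records)

tampered = list(records)
tampered[2] = "payment 3: 1000 units"  # a single record is altered
tampered_root = merkle_root(tampered)

print(original_root == merkle_root(records))  # True: unchanged data, same root
print(original_root == tampered_root)         # False: tampering is visible at the root
```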
Blockchaining
The final technological concept necessary to understanding blockchain is the method of blockchaining itself.
A participant in a blockchain network can collect multiple sets of information to be recorded in the database to form a block. Before that block is actually added to the blockchain, the following steps must take place:
- The information is hashed together to form a Merkle Tree. The Merkle Tree is stored in the body of the block.
- The Merkle Root is stored in the block header. Also contained in the block header are: the block number (this will be sequential in the chain); a timestamp (so the network knows when the block was formed); and the previous block's header hash (called the "pointer").
- Certain elements in the header are hashed together to output the header hash of the block.
- The details of the block are broadcast out to the blockchain network for verification.
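The steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the field names, the use of JSON to serialise the header, the choice to hash all four header fields, and the all-zero pointer for the first block are hypothetical. For brevity the body holds the raw items instead of the full Merkle Tree.

```python
import hashlib
import json
import time
from dataclasses import dataclass, asdict

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def compute_merkle_root(items):
    """Hash pairs level by level until one hash remains (power-of-two count assumed)."""
    level = [sha256_hex(item) for item in items]
    while len(level) > 1:
        level = [sha256_hex(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

@dataclass
class BlockHeader:
    number: int        # sequential position in the chain
    timestamp: float   # when the block was formed
    pointer: str       # header hash of the previous block
    merkle_root: str   # summary of the information in the body

    def header_hash(self) -> str:
        # Assumption: all four header fields are hashed together.
        return sha256_hex(json.dumps(asdict(self), sort_keys=True))

def build_block(number, pointer, items, timestamp=None):
    header = BlockHeader(
        number=number,
        timestamp=time.time() if timestamp is None else timestamp,
        pointer=pointer,
        merkle_root=compute_merkle_root(items),
    )
    return {"header": header, "body": list(items)}

first = build_block(0, "0" * 64, ["a", "b", "c", "d"], timestamp=0.0)
second = build_block(1, first["header"].header_hash(), ["e", "f", "g", "h"], timestamp=1.0)
print(second["header"].pointer == first["header"].header_hash())  # True
```

A block built this way would then be broadcast to the network for verification.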
Anatomy of a block
Whether or not a block is actually added to the blockchain depends on the consensus protocol adopted by the blockchain network. A consensus protocol can be any set of rules adopted by the network. It may involve the participants being required to solve a mathematical challenge (for example, as required by the Bitcoin network) or it may only require a certain percentage of participants in the network to agree that the block should be added. The requirements of each blockchain network will dictate what consensus protocol is used.
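The two styles of rule mentioned above can each be sketched in a few lines. Both functions are toy illustrations under assumed parameters: the "leading zeros" challenge, its difficulty of three, and the 75% approval threshold are not taken from any real network.

```python
import hashlib

def meets_challenge(header_hash: str, nonce: int, difficulty: int = 3) -> bool:
    """Check whether hashing the header hash with the nonce gives enough leading zeros."""
    attempt = hashlib.sha256(f"{header_hash}{nonce}".encode("utf-8")).hexdigest()
    return attempt.startswith("0" * difficulty)

def solve_challenge(header_hash: str, difficulty: int = 3) -> int:
    """Mathematical challenge: search for a nonce that satisfies the rule."""
    nonce = 0
    while not meets_challenge(header_hash, nonce, difficulty):
        nonce += 1
    return nonce

def approved_by_vote(votes, required_percent: int = 75) -> bool:
    """Agreement rule: the block is added if enough participants vote yes."""
    return sum(votes) * 100 >= required_percent * len(votes)

header_hash = hashlib.sha256(b"example block header").hexdigest()
nonce = solve_challenge(header_hash)
print(meets_challenge(header_hash, nonce))           # True: finding it took many attempts, checking takes one
print(approved_by_vote([True, True, True, False]))   # True: 75% agree
print(approved_by_vote([True, True, False, False]))  # False: only 50% agree
```

The challenge is costly to solve but cheap for other participants to check, which is what makes it usable as a verification rule.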
Once the requirements of the relevant consensus protocol have been satisfied, the block is added to the blockchain and the database is updated across the network. The blockchaining process will continue as participants seek to record additional information in the database, with each block being chained together by the pointers.
Visual representation of a blockchain
The pointer in each block contains the block header hash of the previous block, thus forming a sequential chain of information.
It should now be evident why blockchain is considered to be so secure. The chain of information stored in blocks is represented through linked hashes, each with a specific relationship to the underlying information. If a person were to seek to change any part of that underlying information, a cascade of errors would propagate across the entire blockchain, causing it to fail completely. All of the subsequent hashes in the relevant Merkle Tree and, therefore, the block header hash and, therefore, the pointer would be changed because an initial input had changed. If the pointer no longer points to the previous block then every subsequent block would cease to be chained properly.
A cascade of errors
A change to the underlying information will cascade up the Merkle Tree and change the Merkle Root of the block. This, in turn, will change the inputs for the block header hash. If the block header hash is changed, the pointer in the subsequent block will break and every subsequent block will cease to be chained properly.
Therefore, to change any information in the database, every single piece of information from that point onward would need to be updated in every single copy of the database hosted by every participant in the network. The amount of effort required to achieve this is a significant deterrent to would-be hackers or other persons attempting to interfere with the blockchain.
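The cascade can be demonstrated with a toy chain. In this sketch a single hash of each block's data stands in for the Merkle Root, and the block layout and function names are illustrative assumptions. Altering one block breaks the pointer in the next block, and repairing that pointer simply moves the break one block further along.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def header_hash(block) -> str:
    # Simplification: one hash of the data stands in for the Merkle Root.
    data_hash = sha256_hex("|".join(block["data"]))
    return sha256_hex(f'{block["number"]}|{block["pointer"]}|{data_hash}')

def build_chain(batches):
    chain, pointer = [], "0" * 64  # assumed placeholder pointer for the first block
    for number, data in enumerate(batches):
        block = {"number": number, "pointer": pointer, "data": list(data)}
        chain.append(block)
        pointer = header_hash(block)
    return chain

def first_broken_link(chain):
    """Return the number of the first block whose pointer no longer matches, else None."""
    for previous, current in zip(chain, chain[1:]):
        if current["pointer"] != header_hash(previous):
            return current["number"]
    return None

chain = build_chain([["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]])
print(first_broken_link(chain))  # None: the chain is intact

chain[1]["data"][0] = "tampered"
print(first_broken_link(chain))  # 2: block 2 no longer points at block 1

chain[2]["pointer"] = header_hash(chain[1])  # the attacker repairs block 2...
print(first_broken_link(chain))  # 3: ...which breaks the link into block 3
```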
Security
Although using blockchain technology makes the recording and storage of information extremely secure, it is not invulnerable to attack. However, as noted above, the amount of computing power and effort to effectively attack a blockchain is a significant deterrent. Additionally, in general, as the interests of participants in blockchains are aligned, attacking the system from within is self-defeating.
A key factor that will influence the level of security risk is whether the network is "permissionless" (public) or "permissioned" (private).
Permissionless networks
A permissionless network is one in which anyone is able to (i) view the information in the database, (ii) submit information to be recorded in the database and (iii) host a version of the database and participate in the verification of information and blocks. Bitcoin is the most well-known permissionless blockchain network.
Because there could be hundreds or thousands of participants in a permissionless network, there is an increased risk that one of them may have malicious intent, but also a decreased risk that any one of them would have the computing power to carry out an attack on the blockchain.
While a permissionless network could empower a single administrator or group of administrators to dictate, control or change the consensus protocol, it is more likely that control will remain fully decentralised.
In addition, participation in a permissionless network will typically be pseudo-anonymous, given the information recorded in the database (even if encrypted) is publicly accessible.
Permissioned networks
A permissioned network is one in which its participants are pre-selected or subjected to either specified participation criteria or approval by an administrator or group of administrators.
Permissioned networks are likely to comprise a smaller group of participants who have decided to form the blockchain network together to achieve a common purpose. The database in a permissioned network is also more likely to be accessible only by its participants. However, there could be uses where the information is made available to the public or to specific external entities.
Permissioned networks are expected to be more commonly adopted for uses where more sensitive information is being recorded, such as in relation to financial, identity or health services.
Because there are fewer participants in a permissioned network, there is an increased risk attached if one of them has malicious intent and the computing power to carry out an attack on the blockchain. However, where the blockchain network comprises commercial partners, it seems unlikely that a participant would be incentivised to abuse its power.
Potential attacks
- 51% attacks
As discussed above, each blockchain network can establish a consensus protocol setting out rules for how and when information can be verified and added to the database. The power to control the network can be widely distributed (as in a permissionless network) or concentrated in only a few participants (as is more likely in a permissioned network).
If one or a group of participants controls at least 51% of the hashing power on the network, meaning it provides more than half of the computing power being used to operate the blockchain, that participant or group of participants could dominate activity on the blockchain.
If a dominant participant has malicious intent, it could effect a denial of service attack on the blockchain by filling blocks with useless data while preventing other participants from submitting or recording information. Or, the dominant participant could control which participants are able to interact on the blockchain.
Attacking the blockchain in this manner would render it useless.
- Zero confirmation attacks
In a scenario where the information being recorded on the blockchain is meant to form a record of transactions, such as payments or transfers of other assets, the recording of that information in a verified blockchain acts as a confirmation that the transaction is valid.
A participant could attempt a fraud on the network, for example, by seeking to record information about two different transactions in which it seeks to transfer the same asset. Using public key encryption, the participant might sign each transaction with its private key and provide both counterparties with its public key.
The counterparties to these transactions would likely not be aware that the participant has promised to transfer the asset to both of them. If the consensus protocol for the blockchain network is such that the first verified block containing information about one of the transactions gets added to the blockchain (without requiring a check as to whether or not there are conflicting transactions), then the other transaction would be invalid at this point. The counterparty to this invalid transaction would lose out on the asset.
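The scenario can be sketched with a toy ownership record. Both pending transfers look plausible to their recipients before anything is confirmed, but once the first is recorded the second fails. The data layout, the names and the "first recorded wins" rule are illustrative assumptions.

```python
def confirm_transfers(owners, pending):
    """Apply pending transfers in order; reject any whose sender no longer owns the asset."""
    confirmed, rejected = [], []
    for asset, sender, recipient in pending:
        if owners.get(asset) == sender:
            owners[asset] = recipient  # ownership moves to the recipient
            confirmed.append((asset, sender, recipient))
        else:
            rejected.append((asset, sender, recipient))
    return confirmed, rejected

owners = {"asset-1": "Mallory"}

# Mallory promises the same asset to two different counterparties.
pending = [
    ("asset-1", "Mallory", "Bob"),
    ("asset-1", "Mallory", "Carol"),
]

confirmed, rejected = confirm_transfers(owners, pending)
print(confirmed)          # [('asset-1', 'Mallory', 'Bob')]
print(rejected)           # [('asset-1', 'Mallory', 'Carol')]
print(owners["asset-1"])  # Bob
```

A counterparty who acts on a transfer before it has been confirmed (that is, with zero confirmations) risks being in Carol's position.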
Implementing blockchain
Blockchain is just one technology in the arsenal of next generation technologies underlying "Web 3.0". Businesses and organisations around the world are looking towards these Web 3.0 technologies to help transform all aspects of what they do and how they serve their end users.
Implementing blockchain, however, is not a straightforward task. Most of the commercial applications of blockchain remain at the proof of concept stage. It is expected that widespread commercial deployment of blockchain will occur in the mid-term. Pending that, there are also significant legal and regulatory considerations which need to be resolved (we will look at these in other publications).
However, significant players in the technology industry, such as IBM, the Linux Foundation, Cisco and Intel, are working together to create a blockchain platform that can be adopted more widely by businesses. New players have also emerged to provide blockchain-as-a-service solutions.
Businesses looking to implement blockchain need to start engaging now.
Notes
1 See Bitcoin: A Peer-to-Peer Electronic Cash System, Satoshi Nakamoto (October 2008), available at https://bitcoin.org/bitcoin.pdf.
2 See http://passwordsgenerator.net/sha256-hash-generator/ to see an example of SHA-256 in action.
3 Named after Ralph Merkle, an American computer scientist and the inventor of cryptographic hashing.