After the discussion at
https://bitsharestalk.org/index.php?topic=6584, I realize it is really important to have a coherent argument to address the POS vs POW debate. The hard part of getting other cryptocurrency fans, who are not already enlightened about POS, over to BitShares is going to be addressing all of their concerns about POS, Nothing-at-Stake, and their belief that POW is necessary for secure consensus. The other difficult challenge that will need to be addressed is convincing the POS believers, and NXTers in particular, that there is an appropriate balance between centralization and decentralization, and that hopefully DPOS has properly struck that balance to be decentralized enough (and with low enough barrier to entry) to be corruption-free and trustless, but centralized enough to be efficient (low transaction fees and fast block production). There is already great discussion on that topic happening at
https://bitsharestalk.org/index.php?topic=5564. But this topic is not about the centralization vs decentralization argument but rather the POW vs POS argument.
This is a first draft, and I would appreciate feedback. I hope I didn't make mistakes in understanding some of the details of the technologies, but please correct me if I am wrong. I am also interested in what people think about my arguments about the economics of fake blockchain history attacks for resync periods less than the threshold (and whether 6 months is even an appropriate threshold or not). I really want to try to develop an argument that can address POW supporters' concerns and convince them that POS is the right way to go.
POW vs POS consensus systems
People in the POW (Proof-of-Work) community generally accept the concept of Nothing-at-Stake as a fundamental flaw in POS (Proof-of-Stake) consensus systems that makes those systems, in their view, inferior to POW consensus systems. What is Nothing-at-Stake, and is it actually a legitimate concern in practical POS systems? I hope that I will be able to convince people that it is not really an issue, and that, on the contrary, POS has many advantages over POW.
All consensus systems require all participants to maintain a consistent view of a shared database as the database is modified over time. In blockchain-based consensus systems, this database is a log of appended blocks of data, in which each block (other than the first block, called the genesis block) contains a hash of the content of the previous block. This forms a cryptographically secure chain of blocks such that modifying any block requires modifying all of the blocks that come after it in the chain. In a blockchain-based consensus system in which all participants are always connected to the network, consensus can be maintained as long as all participants can agree on which block to append next. Temporary network disruptions may cause some short forks which need to be resolved quickly. So, some other mechanism is also needed to resolve these forks. Finally, since all participants cannot be online all of the time, some mechanism is needed for participants to safely resync with the network after some period of time offline.
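To make the chaining property concrete, here is a minimal sketch in Python. The block layout (a dict with `prev_hash` and `data`) and the helper names are my own assumptions for illustration, not the format of any real client.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's full content, including its link to the previous block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def append_block(chain: list, data: str) -> None:
    """Append a block that commits to the hash of the current head block."""
    prev = block_hash(chain[-1]) if chain else None  # the genesis block has no parent
    chain.append({"prev_hash": prev, "data": data})

def chain_is_consistent(chain: list) -> bool:
    """Check that every non-genesis block points at the hash of its parent."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
for payload in ["genesis", "alice pays bob 5", "bob pays carol 2"]:
    append_block(chain, payload)

intact = chain_is_consistent(chain)

# Tampering with a middle block breaks the link stored in the block after it,
# so the attacker would have to rewrite every later block as well.
chain[1]["data"] = "alice pays mallory 5"
tampered_detected = not chain_is_consistent(chain)
```

Changing block 1 invalidates the `prev_hash` stored in block 2, which is why modifying any block requires modifying all of the blocks after it.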
Deciding which block to append next:
POW systems use a stochastic computational process, called mining, to determine which block everyone agrees to accept and append to the blockchain. It is essentially a cryptographic lottery in which the probability of winning is a function of a specific form of computational power (called hashing power) and the current consensus-accepted difficulty. Anyone can verify that a block won the cryptographic lottery by looking at the block and knowing the current difficulty.
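The lottery can be sketched roughly as follows. This is a simplified illustration, assuming a single SHA-256 over a header plus a nonce and a target of `MAX_TARGET // difficulty`; real coins use their own header formats and difficulty encodings, and the function names here are hypothetical.

```python
import hashlib

MAX_TARGET = 2 ** 256  # size of the SHA-256 output space

def pow_hash(header: bytes, nonce: int) -> int:
    """Interpret the SHA-256 digest of header+nonce as a big integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")

def is_winner(header: bytes, nonce: int, difficulty: int) -> bool:
    """A block wins the lottery if its hash falls below the difficulty target.

    Anyone can run this check given only the block and the current difficulty.
    """
    return pow_hash(header, nonce) < MAX_TARGET // difficulty

def mine(header: bytes, difficulty: int, max_tries: int = 1_000_000):
    """Try nonces in order; each try wins with probability about 1/difficulty."""
    for nonce in range(max_tries):
        if is_winner(header, nonce, difficulty):
            return nonce
    return None  # gave up without finding a winning nonce

nonce = mine(b"block header", 1000)
```

Finding the nonce takes on the order of `difficulty` hash attempts, but verifying it takes a single hash, which is what lets everyone cheaply agree on who won.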
The POS system in Peercoin also uses mining to determine which block everyone agrees to accept next. However, in this case the probability of success is a function of hashing power, the current consensus-accepted network-level difficulty, and the coin-age (the product of the unspent transaction output value and the time elapsed since the transaction was created) of the stake used for block production. The function has a very strong dependence on the coin-age, such that it is more profitable to buy more coins (or stake) to increase coin-age than to buy more computing power to increase hashing power. Again, anyone can verify that a block won the lottery by looking at the block, knowing the current network-level difficulty, and verifying that the coin-age used to produce the block is legitimate (which requires having a consistent view of the blockchain up to that point).
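As a rough sketch of that dependence, assume the winning target is simply the network-level target multiplied by the coin-age of the staked output. This is a simplification I am assuming for illustration; Peercoin's actual kernel check has more inputs, and the names below are hypothetical.

```python
MAX_TARGET = 2 ** 256  # size of the hash output space

def coin_age(value: int, days_held: int) -> int:
    """Coin-age: unspent output value times the time since it was created."""
    return value * days_held

def stake_target(difficulty: int, value: int, days_held: int) -> int:
    """Target a stake hash must fall below; it grows linearly with coin-age."""
    base_target = MAX_TARGET // difficulty
    return min(MAX_TARGET, base_target * coin_age(value, days_held))

small_stake = stake_target(10 ** 9, value=100, days_held=30)
large_stake = stake_target(10 ** 9, value=1000, days_held=30)

# Ten times the coins gives ten times the chance per hash attempt,
# so buying stake helps far more than buying hashing power.
advantage = large_stake // small_stake
```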
The POS system in BitShares is called Delegated Proof of Stake (DPOS), because it allows stake owners to delegate the power that their stake provides to other users called delegates. This power is the same power that stake provides in Peercoin and other POS systems: the power to create new blocks. Given a consistent view of the blockchain up to a certain point, anyone can know who the current active delegates are, the order in which these delegates will produce blocks in the current round, and thus which specific delegate is responsible to produce the next block. A cryptographic lottery in the form of mining is not necessary to determine who earns the right to produce the next block. Random values from the non-colluding delegates of the previous round determine the random ordering of the delegates for the next round. Stakeholders are able to vote for delegates using their stake through cryptographically-secure transactions that are stored in the blockchain, and the voting activity by the stakeholders can change the set of active delegates at any time.
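Here is one way the round ordering could be derived deterministically from the delegates' random values. It is only a sketch: the seed construction, the hash-driven Fisher-Yates shuffle, and the function names are assumptions of mine, not the actual BitShares implementation.

```python
import hashlib

def round_seed(revealed_values: list) -> bytes:
    """Combine the random values revealed by the previous round's delegates."""
    h = hashlib.sha256()
    for value in revealed_values:
        h.update(value)
    return h.digest()

def delegate_order(active_delegates: list, seed: bytes) -> list:
    """Shuffle the delegates with a Fisher-Yates pass driven by the seed.

    Every node that has the same chain view computes the same order.
    """
    order = sorted(active_delegates)  # canonical starting order
    for i in range(len(order) - 1, 0, -1):
        digest = hashlib.sha256(seed + i.to_bytes(4, "big")).digest()
        j = int.from_bytes(digest, "big") % (i + 1)
        order[i], order[j] = order[j], order[i]
    return order

def producer_for_slot(order: list, slot: int) -> str:
    """The delegate responsible for a given time slot within the round."""
    return order[slot % len(order)]

delegates = [f"delegate-{n}" for n in range(101)]
seed = round_seed([b"secret-a", b"secret-b", b"secret-c"])
order = delegate_order(delegates, seed)
next_producer = producer_for_slot(order, 0)
```

No lottery is involved: once the seed is fixed, the producer of every slot in the round is known in advance.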
Resolving short forks:
Due to the nondeterministic nature of mining as well as network propagation delays, short forks are possible in both POW systems and Peercoin-like POS systems. Network propagation delays may also cause extremely short forks in DPOS.
POW systems resolve the forks by agreeing to build on the chain with the most work done (the sum of the difficulty values at each block up to the current head block in the blockchain). If everyone follows this rule, eventually all the nodes will come to a consensus on one particular chain, thus resolving the fork.
Peercoin-like POS systems can resolve the fork by building on the chain with the largest total of some other metric, like the total amount of coin-age consumed. Again, as long as everyone follows the same rule, the network will eventually naturally converge to just one of the forks.
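Both rules have the same shape: score each competing chain by summing a per-block metric and build on the highest score. A minimal sketch, where the block fields `difficulty` and `coin_age_consumed` are names I am assuming for illustration:

```python
def chain_score(chain: list, metric: str) -> int:
    """Sum a per-block metric over the whole chain."""
    return sum(block[metric] for block in chain)

def pick_chain(candidates: list, metric: str) -> list:
    """Fork-choice rule: build on the candidate with the largest total."""
    return max(candidates, key=lambda chain: chain_score(chain, metric))

# A longer fork does not win if less total work went into it.
fork_a = [{"difficulty": 10}, {"difficulty": 10}, {"difficulty": 10}]
fork_b = [{"difficulty": 10}, {"difficulty": 25}]
pow_choice = pick_chain([fork_a, fork_b], "difficulty")

# The same rule with a different metric for a Peercoin-like system.
fork_c = [{"coin_age_consumed": 400}, {"coin_age_consumed": 100}]
fork_d = [{"coin_age_consumed": 400}, {"coin_age_consumed": 350}]
pos_choice = pick_chain([fork_c, fork_d], "coin_age_consumed")
```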
Although DPOS is able to randomize the order of delegates within a round, the order of the delegates in a given round is known prior to any of the delegates producing blocks in that round. For this reason, block production order can be considered deterministic. Nevertheless, very small forks could be possible because of network issues. For example, if block N is delayed by the network for too long, the producer of block N+1 might assume that the producer of block N was not available to produce his block at his designated time slot, and so will instead chain off block N-1. The producer of block N+2 may have seen block N and/or block N+1. If he saw both, he always chooses the one that is supposed to come later in time. If he saw only one or the other, he builds off of the one he saw. Thus, the chain is built with either block N or block N+1 considered missing, but the network is able to quickly get back to a consensus on the true chain because of the deterministic ordering of block producers.
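The choice made by the producer of block N+2 can be written as a very small rule. This sketch assumes each competing head block is identified by its scheduled slot number; the function name and data layout are hypothetical.

```python
def choose_parent(seen_heads: dict) -> str:
    """Pick the block to build on from the competing head blocks seen so far.

    seen_heads maps scheduled slot number -> block id. If several competing
    blocks were seen, the one scheduled later in time wins; if only one was
    seen, it is chosen by default.
    """
    latest_slot = max(seen_heads)
    return seen_heads[latest_slot]

N = 10
saw_both = choose_parent({N: "block-N", N + 1: "block-N+1"})
saw_only_n = choose_parent({N: "block-N"})
saw_only_n_plus_1 = choose_parent({N + 1: "block-N+1"})
```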
Resyncing with the network after some period of time offline:
So far, the assumption has been that all participants were well connected to the network and therefore able to easily maintain consensus on the blockchain. Under these assumptions POW does not provide any advantages over POS. But realistically, users cannot always be online. And yet, they need some way of reconnecting with the network and getting up to speed on the current state of the blockchain from where they last left it without allowing an attacker to fool them onto a fake chain. If an attacker is able to fool the user into believing in fake additions to the blockchain since the last block seen by the user's client, the attacker can break consensus and thus damage the value of the network. In particular, the user becomes vulnerable to a double-spend attack by the attacker since they think they are getting tokens of value in exchange for goods/services (due to belief in the fake transaction history) but others on the true chain know that those tokens are worthless and will therefore not accept them in exchange for goods/services.
POW resolves this issue by using the same method used to resolve short forks: pick the chain with the most work done. Attackers have no way of faking the block acceptance criteria. They need to put in the work necessary to match the difficulty requirements at that point in the blockchain. Attackers can create a fake blockchain history by putting in the necessary work, but if they have less than 50% of the hashing power, their accumulated amount of work will be less than the accumulated work of the true chain. As long as the true chain is made visible to the resyncing user, he can easily pick it over the fake chains.
POS tries to resolve this issue by also making it difficult for attackers to fake the block acceptance criteria. In the case of Peercoin-like POS systems, it needs to be difficult for attackers to get coin-age (which is ultimately dependent on the amount of stake in the attacker's control). In the case of DPOS, it needs to be difficult for the attacker to get control of the delegates. Because of the way delegates work, the attacker would actually need to control nearly all of the 101 delegates to fake the blockchain history (see here and here for details). However, if the attacker controlled more than 50% of the stake, he could vote in all of his own delegates. So all POS systems are ultimately vulnerable if the attacker is able to get the majority of the stake. For a POS system to be secure from fake blockchain history attacks, the majority of the stake in the system needs to be kept away from the control of an attacker during the time a user is offline. However, if an attacker was able to capture only a small minority of the stake while the user was offline, the attacker cannot create a fake blockchain that the user would accept as valid.
POW supporters like to point out that the attacker does not need to control >50% on a live system; as long as an attacker controls >50% of the stake at any point in time t on the blockchain, that attacker could easily build a fake blockchain from that point forward that would fool a user's client if its last resync point was before time t. For a completely new user synchronizing from the genesis block, this means the attacker only needs to control >50% of the stake at any point in time in the history of the blockchain. This is the meaning behind the Nothing-at-Stake name. The users who owned >50% of the stake in the system in the past may no longer own any stake in the system in the present. While it would be foolish for a present-day >50% stake holder to harm the network, someone who held >50% of the stake in the past but holds no stake in the present has nothing to lose with an attack attempt.
As bad as this may look for POS systems, with more careful analysis, it is clear it is not actually a problem. A user in a POS system will always have a checkpoint in the not-too-distant past. This checkpoint either comes from the last block of the locally-saved, trusted blockchain (or perhaps just the locally-saved hash of the last seen block), or it can be hardcoded into the particular version of the wallet. As long as that checkpoint is in the not-too-distant past, users would not be vulnerable to fake blockchain history attacks in realistic scenarios. If the checkpoint is older than some threshold, then other measures are needed. This threshold can vary depending on the circumstances of the network and the paranoia of the user, but I think a threshold of 6 months is sufficient in most realistic scenarios.
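A client-side version of this rule might look like the following sketch. The 183-day constant stands in for the 6 month threshold suggested above, and the function names and the list-of-hashes chain representation are assumptions for illustration only.

```python
from datetime import datetime, timedelta

RESYNC_THRESHOLD = timedelta(days=183)  # roughly the 6 months suggested above

def checkpoint_is_fresh(checkpoint_time: datetime, now: datetime,
                        threshold: timedelta = RESYNC_THRESHOLD) -> bool:
    """True if the checkpoint is recent enough to resync without extra steps."""
    return now - checkpoint_time <= threshold

def chain_matches_checkpoint(candidate_hashes: list, checkpoint_height: int,
                             checkpoint_hash: str) -> bool:
    """Reject any offered chain that does not contain the trusted checkpoint."""
    return (
        len(candidate_hashes) > checkpoint_height
        and candidate_hashes[checkpoint_height] == checkpoint_hash
    )

# The last block this client saw before going offline.
trusted_height, trusted_hash = 2, "hash-c"

true_chain = ["hash-a", "hash-b", "hash-c", "hash-d", "hash-e"]
fake_chain = ["hash-a", "hash-b", "hash-x", "hash-y", "hash-z"]  # forks before the checkpoint

accepts_true = chain_matches_checkpoint(true_chain, trusted_height, trusted_hash)
accepts_fake = chain_matches_checkpoint(fake_chain, trusted_height, trusted_hash)

recent_enough = checkpoint_is_fresh(datetime(2014, 1, 1), datetime(2014, 5, 1))
too_old = not checkpoint_is_fresh(datetime(2014, 1, 1), datetime(2014, 9, 1))
```

The checkpoint only rules out histories that fork before it. An attacker forking after the checkpoint still needs >50% of the stake that existed during the offline window, which is the case the next paragraphs analyze.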
Resyncing after being offline for less than 6 months should not be a cause for concern about fake blockchain history attacks. The only way such an attack can successfully work is if the attacker obtains ownership of >50% of the stake existing at some point during that 6 month period. The attacker would like to buy old private keys at very low cost from users who had stake in the system in the 6 month period but now no longer do. These sellers must no longer have stake in the system; otherwise they would be foolish to sell old private keys to someone whose only purpose for buying old keys is clearly to attack the system and thus reduce the value of the seller's existing stake. But the attacker will not be able to find enough private key sellers who match those criteria, because it is extremely unlikely for stakeholders with >50% of the stake to completely exit the system within a 6 month period. The attacker is forced to legitimately buy into the system at a high cost if he wants to attack the network. But an attacker who grows his stake over some period of time until it reaches >50% would likely not attack the network while still holding the stake, since he would cause the most damage to himself. If the attacker is able to begin and finish selling his >50% of the stake during that 6 month period, then he has the opportunity to carry out a fake blockchain history attack against a victim who was offline for 6 months. However, the price one pays trading assets depends on how quickly one needs to finish the trade. The attacker can take his time building up the stake so that he does not have to overpay to incentivize stake holders to sell, but he is forced to sell at a discount to incentivize enough people to buy so that he can sell off his stake before the 6 month deadline. Pulling off this kind of buy-sell cycle is going to cost the attacker a lot of money.
It is only rational to do this if this one buy-sell cycle provides him with enough opportunity to recover his costs through double-spend attacks. But the only people he can attack are people who were offline for about 6 months. Most people would be resyncing at much higher frequencies than that, which would make them really hard to attack. Trying to sell >50% of the stake in one week would cause a flash crash of the price of the coin (hurting the attacker the most). Also, from a practical standpoint, the attacker doesn't have any good way of knowing who has been offline during the same time period in which he set up the buy-sell cycle, so he cannot actually target these individuals. So, even if there are a decent number of people out there that the attacker could target to make his money back, it isn't trivial to identify them.
So what about resyncing after being offline for more than 6 months? With the exception of resyncing from a genesis block on a new computer, it would be a very unusual circumstance to be doing this. The vast majority of people would be resyncing on a much more frequent basis. Nevertheless, in these rare cases, users would follow the same procedure that users who are resyncing from a genesis block on a new computer would follow. First, if the user already has an up-to-date blockchain on one computer and just wants to set up his wallet on a new computer, the clients could provide an easy method for the existing trusted client to communicate a hash of a recent block to the new client. Since a user obviously trusts himself and the client he has already been using, he can carry over that trust to the new device. What about a completely new user who has never been part of this network before? Or someone who lost their hard drive (but still has a backup of their private keys) and wants to reinstall the client from scratch on their computer? In these cases, the users would rely on the snapshot hardcoded in the latest version of the client software (which would be <6 months old). A new user needs to download the client software anyway, and needs to have some way of trusting the software he downloads. If the attacker was able to provide a fake client with a fake snapshot, the user would again be vulnerable to the fake blockchain history attack. But if the attacker was able to provide a fake client, the user would be compromised in so many other ways. The fake client could just steal the user's private keys! Or if the user has a hardware wallet, the fake client could present a false view of the blockchain to make the user think he got paid when he didn't.
Ultimately there has to be some trust when it comes to these consensus technologies. Bitcoin users may not worry about fake blockchain histories because of POW, but if their wallet is compromised none of that matters. Therefore, Bitcoin users still need to somehow trust the Bitcoin client software they run on their computers. They can compile from source, but they still need to trust that the source is safe. They can rely on other people to audit the source code and tell them it is safe, but then they are just placing their trust in the auditors. Those auditors could collude to attack the user. If the user is really geeky, he can audit the source code himself, which would take a very long time.
Similarly, in a POS system, the users also typically rely on either the developers or auditors to tell them that a particular client is safe to use. But that also carries with it the information of the most recent snapshot. If the developers try to change the snapshot hash to carry out a massive fake blockchain attack, auditors who have the legitimate blockchain up to the time of the client upgrade stored locally on their computers can check to see that the snapshot hash does not match any of the blocks on their stored blockchain and sound the alarm. If the user does not want to trust the developers or the auditors, he can audit the source code himself, but he would also need to somehow verify the latest snapshot hash. If he has a stored version of the blockchain up to client upgrade time, he can verify it the same way the auditors did. If he is evaluating this starting from scratch, then he needs to ask people he trusts, who have already been on the network for a while, whether the hash is correct (and thus whether the program he has on his computer is going to connect him to the thing everyone else is already connected to). This may seem like a lot of work, but it is far less work than the code audit.
Advantages of POS over POW:
The point of all of the above was to show that in realistic scenarios, the cost of a 51% attack is too high to benefit the attacker. It is typically too expensive to get 51% of the hashing power of a high hash rate proof of work coin for the minimal benefits it provides (killing the network and/or difficult-to-achieve double-spends). And, it is typically too expensive to get 51% of the stake in a high market cap coin for the minimal benefits it provides (killing the network and/or difficult-to-achieve double-spends). But users trade the guarantee that fake blockchain history attacks are virtually impossible in POW systems for an assurance that fake blockchain history attacks are merely highly improbable in POS systems. If that was the only difference between POW and POS, it would make sense to use POW. However, there are a few very important ways that POS is actually superior to POW.
In a POW system, if the attacker has enough hashing power to attack that system, he can also attack any other weaker POW system that uses a similar hashing algorithm. On the other hand, buying up 51% of the stake of a POS system does not give the attacker any advantage for attacking another POS system. On the contrary, it is likely going to consume the attacker's money, leaving him too little to do it again. This is incredibly important when one realizes what would likely happen if someone was foolish enough to try to buy 51% of the stake in a DPOS DAC only to kill it by taking over the delegates and refusing to sign blocks. People can take a snapshot of the failed DAC, identify the unspent transaction outputs which were voting for the corrupt delegates at the time of the DAC failure, and create a new genesis block from that snapshot with those particular unspent transaction outputs made void. A DAC identical to the previous failed one is created using this new genesis block, which takes stake control away from the attacker, leaving the other innocent 49% of stake holders with 100% of the stake of the new DAC. Those who sold to the attacker should be happy because they made a voluntary exchange and got out before the DAC failed; those who did not sell to the attacker are also happy because they roughly doubled their share of the stake in a DAC that will quickly regain its old value, which should hopefully compensate them for the brief outage of the DAC.
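The recovery arithmetic can be checked with a toy snapshot. The output records, owner names, and delegate names below are made up for illustration; only the 51%/49% split comes from the scenario above.

```python
from fractions import Fraction

def new_genesis(outputs: list, corrupt_delegates: set) -> list:
    """Void every unspent output that was voting for a corrupt delegate."""
    return [o for o in outputs if o["votes_for"] not in corrupt_delegates]

def stake_share(outputs: list, owner: str) -> Fraction:
    """Fraction of the total stake held by one owner."""
    total = sum(o["amount"] for o in outputs)
    owned = sum(o["amount"] for o in outputs if o["owner"] == owner)
    return Fraction(owned, total)

# Toy snapshot of the failed DAC: the attacker bought 51% of the stake.
snapshot = [
    {"owner": "attacker", "amount": 51, "votes_for": "corrupt-delegate"},
    {"owner": "alice", "amount": 29, "votes_for": "delegate-a"},
    {"owner": "bob", "amount": 20, "votes_for": "delegate-b"},
]

genesis = new_genesis(snapshot, {"corrupt-delegate"})

share_before = stake_share(snapshot, "alice")  # 29/100
share_after = stake_share(genesis, "alice")    # 29/49
growth = share_after / share_before            # 100/49, a bit more than double
attacker_after = stake_share(genesis, "attacker")
```

Each honest holder's share grows by a factor of 100/49, which is where the "doubled their stake" figure comes from, while the attacker's purchased stake is worth nothing in the new DAC.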
The other major benefit of POS over POW is the cost needed to secure the network. In a POW system, the security of the network is directly tied to the cost of mining. If the cost of mining is cheap, an attacker can afford to gain more than 50% of the hashing power. No amount of clever mining algorithms or ASICs will change that relationship. As the electricity cost per gigahash goes down, the difficulty will go up to keep the total cost of mining high enough to secure the network. At some point, when growth in the value of the system saturates and it can no longer support large coin inflation to pay miners, as is the case in Bitcoin currently, the cost of securing the network will fall entirely on the transaction fees. POW will either have higher transaction fees than POS systems to secure the network, or, if users do not accept transaction fees that are too high, the security of the network will get worse. POS systems do not suffer from these issues because they do not need to waste the money from fees on electricity to secure the network. And in fact, as the system gets bigger (meaning the market cap grows), the POS system naturally becomes even more secure because it becomes harder for an attacker to acquire >50% of the stake.