Author Topic: Can we get formal documentation on the blockchain consensus algorithm?  (Read 7511 times)

0 Members and 1 Guest are viewing this topic.

Offline bytemaster

The protocol is VERY simple, longest valid chain wins.   

Blocks from other forks are not considered until those forks are longer than the current fork. 

The code is written with a maximum "automatic" rollback period, any forks longer than that automatic period require manual intervention.  This is not part of consensus, it is there as a protection to the user because if it happens it indicates an attack of some form and closer attention may be needed.

Everything else I wrote is merely identifying what the possible scenarios are for the "longest chain" and explaining why the "longest chain" is all that matters.

There are THREE primary use cases:

1. Majority of Witnesses Collude
2. Minority of Witnesses Collude
3. No Collusion, just network problems / bugs.

The minority will never be able to produce a longer chain than the majority unless the majority resign and yield to the minority or the majority have poor performance for a long period of time.

If the Majority of Witnesses Collude we must ask what they are colluding to do:

1. Block transactions
2. Block anyone from voting them out.

As long as they are not colluding to prevent voting then the network will suffer from low delegate participation (as the minority delegates are ignored) until the majority is voted out.
If the are preventing people from voting then the minority needs to stop producing and users need to stop transacting on majority chain.  Everyone would pick a minority fork and vote out the bad witnesses.  At this point the minority fork would have 100% block production and the old majority fork will still be stumbling along at 50 to 90% corrupt-witness participation.   Eventually the minority fork will overtake it and become the longest chain.

Technically it is possible for the majority of stakeholders to collude to fracture the network and that is what most of your "attacks" are assuming: that the same users will vote differently on different forks and that they will not work to reach consensus.    I think this is not a realistic case.

   

For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline arhag

  • Hero Member
  • *****
  • Posts: 1214
    • View Profile
    • My posts on Steem
  • BitShares: arhag
  • GitHub: arhag
There are certainly a lot of edge cases that we need to carefully document and I think we can vaguely categorize them by the percentage of colluding witnesses and the time horizon for the attack:

100% collusion of witnesses means they can block transactions that would vote them out.  The network would be held hostage to their demands until a hard-fork can be distributed. 

Given that reality, the case of witnesses turning corrupt *AFTER* they have been voted out is rather unlikely and far less effective; therefore, it is safe to assume it wouldn't happen.

Can you please answer what the clients would do according to the protocol in the scenario I described in my previous post? I first want to just understand the protocol well enough and understand what attacks are possible before considering the attack economics and the probability of attacks occurring or the feasibility of successfully carrying out a particular attack.

Also, I don't understand why you think it is unlikely for witnesses that have already been voted out to collude to create a fake blockchain. At that point they have nothing to lose, especially if their real identities were never revealed.


The network is "safe" against both of those attacks so long as 5% of the witnesses do not collude.  That 5% would be enough to allow the minority chain that had support of the stakeholders to elect new witnesses and eventually overtake the attackers chain which would be stuck at 95% participation assuming that in a healthy network participation averages over 95%.

I think this is a different scenario from the one I was discussing, but an interesting one nonetheless. I don't understand how the protocol allows a minority chain to recover. I understand that a minority chain could include transactions that change enough votes to vote out the other 95% of bad witnesses by the next maintenance interval, but how exactly do clients decide to switch to that chain given that it is a minority chain? What if there were multiple minority chains that elected a different set of witnesses? How do clients choose which one to adopt?


During the attack it is not safe to transact without PICKING a fork that you know will eventually win.  It would essentially be a "bank holiday" while votes are cast and a new consensus is reached.  Fortunately, everyone would be well aware of what was going on because the "minority" witnesses would indicate their intent to "fork" after the colluding witnesses start rejecting their otherwise valid blocks only because they vote against the colluding witnesses.   It would be an "open and shut" case and the network would likely recover in hours.

This is all very interesting to further explore in detail and try to work out the edge cases. But I would also like clarification on how much of what you are describing is part of the current protocol that either is already implemented or will be by the time BitShares 2.0 is released, and how much of this are just future ideas that could and probably should eventually be included as part of the protocol and implemented in code in the clients. In other words, if right now 95% of delegates started rejecting transactions that voted them out or voted others in, would BitShares automatically recover (and by automatically I am still assuming enough stakeholders actually sign and broadcast transactions that change their votes to new non-censoring delegates) without any work necessary from the developers and without trusted checkpoints?

Offline bytemaster

There are certainly a lot of edge cases that we need to carefully document and I think we can vaguely categorize them by the percentage of colluding witnesses and the time horizon for the attack:

100% collusion of witnesses means they can block transactions that would vote them out.  The network would be held hostage to their demands until a hard-fork can be distributed. 

Given that reality, the case of witnesses turning corrupt *AFTER* they have been voted out is rather unlikely and far less effective; therefore, it is safe to assume it wouldn't happen.

The network is "safe" against both of those attacks so long as 5% of the witnesses do not collude.  That 5% would be enough to allow the minority chain that had support of the stakeholders to elect new
witnesses and eventually overtake the attackers chain which would be stuck at 95% participation assuming that in a healthy network participation averages over 95%.

During the attack it is not safe to transact without PICKING a fork that you know will eventually win.  It would essentially be a "bank holiday" while votes are cast and a new consensus is reached.  Fortunately, everyone would be well aware of what was going on because the "minority" witnesses would indicate their intent to "fork" after the colluding witnesses start rejecting their otherwise valid blocks only because they vote against the colluding witnesses.   It would be an "open and shut" case and the network would likely recover in hours.

For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline arhag

  • Hero Member
  • *****
  • Posts: 1214
    • View Profile
    • My posts on Steem
  • BitShares: arhag
  • GitHub: arhag
1.  The blockchain always picks the longest chain signed by valid witnesses that doesn't have timestamps in the future.

When does the client choose a blockchain? Is it always evaluating all possible valid chains? Or rather is it always evaluating all possible valid chains other than ones that would cause it to reorganize from its current chain in a way such that the fork point of the change is older than 1000 blocks? I guess I need to explain that better (see the scenario I describe below).

2.  The blockchain maintains an "UNDO" history for the last 1000 blocks which means that if there is a fork it must be resolved within 1000 blocks or manual intervention will be required via a checkpoint.   With 1 second blocks this is about 16 minutes *IF* one block was produced every second, but if that were the case there couldn't be a fork.   So if w assume "worst case" a 50/50 split in the network, then it would take 32 minutes of un-interrupted hard-forking before manual intervention would be required.

Let's say Alice has the following blockchain stored locally: B1, B2, ..., B1500.
The real consensus blockchain at time T is: B1, B2, ..., B3168.
There is a fake blockchain floating around in the network which first appeared at time T and looks like this: B1, B2, ..., B2000, B2001', B2002', ..., B3200'.
The common fork point between this fake blockchain and the real consensus chain is B2000. After this point this blockchain diverges form the real blockchain. The witnesses (which I will call "fake witnesses" from this point forward) at the time of B2000 continue to be active in the fake blockchain whereas they were all voted out in the real chain and replaced by new witnesses (which I will call "real witnesses" from this point forward) at block B2100 (start of new maintenance interval). Let's assume all the significant votes that caused the "fake witnesses" to be replaced by the "real witnesses" appeared between blocks B2001 and B2099.
This fake blockchain is a valid chain signed by witnesses according to the rules of the protocol following that chain. At time T it is longer than the real blockchain by about 32 blocks because the "fake witnesses" were able to have 100% witness participation while the "real witnesses" were only able to maintain 99% witness participation over the last 1000 blocks. Therefore, by the longest chain rule, people should prefer the fake chain over the real chain, but because the fork point is more than 1000 blocks old at the time T when the fake blockchain appeared on the network, none of the nodes online at that time would be tricked onto that fake chain (am I correct in this understanding?).

Now right around this time (a little bit after time T), Alice connects with the network and starts syncing. She can see there are two blockchains that build off the chain she currently has stored locally and both seem to follow the protocol rules. Does she pick the real chain (B1, B2, ..., B3168), the fake chain (B1, B2, ..., B2000, B2001', B2002', ..., B3200'), does she get stuck at block B2000, or something else entirely? Is the outcome dependent on which blocks she receives first. In other words, what if she gets the blocks from the fake blockchain well before the blocks from the real one. Can she sync to the present timestamp and have her local blockchain become the fake chain, and then later when she gets the blocks from the real chain she rejects them because the fork point is more than 1000 blocks old from her new current head block? And similarly, if she got the blocks from the real blockchain well before the fake one, would she instead sync to the real blockchain and reject the fake one?
« Last Edit: July 12, 2015, 06:13:09 pm by arhag »

Offline bytemaster

This is on my mind a lot.   It is hard to pick and choose which things should take priority.   Right now I think having a working product released is highest priority. 

I can answer some questions here:

1.  The blockchain always picks the longest chain signed by valid witnesses that doesn't have timestamps in the future.
2.  The blockchain maintains an "UNDO" history for the last 1000 blocks which means that if there is a fork it must be resolved within 1000 blocks or manual intervention will be required via a checkpoint.   With 1 second blocks this is about 16 minutes *IF* one block was produced every second, but if that were the case there couldn't be a fork.   So if w assume "worst case" a 50/50 split in the network, then it would take 32 minutes of un-interrupted hard-forking before manual intervention would be required.
3.  The 1000 block number can be increased to as large as desired and in theory could be all the way back to the last "checkpoint" or even genesis.   If it went all the way back to genesis then you would never get "stuck" on a fork, but this is overkill.     In theory we could make it dynamic and only hold history while witness participation is below 90% and then clear it as soon as it get above 90%.   
4. TaPOS is there to protect users from having their transactions included on alternative forks which may have different meanings (different account IDs, etc).  TaPOS could in theory be used as a check on the witnesses and be used to determine the "best chain", but it is much more costly to do and ultimately unnecessary.   No metrics have been defined for using TaPOS as part of the "best chain" consensus algorithm.

For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline xeroc

  • Board Moderator
  • Hero Member
  • *****
  • Posts: 12922
  • ChainSquad GmbH
    • View Profile
    • ChainSquad GmbH
  • BitShares: xeroc
  • GitHub: xeroc
We need this to be published in a DPOS/TaPOS whitepaper .. can devs spare some time and put the tech together? I can help do the texting/formating around but I need a formal description first ..

Offline sittingduck

  • Sr. Member
  • ****
  • Posts: 246
    • View Profile
Good questions. 


Sent from my iPhone using Tapatalk

Offline arhag

  • Hero Member
  • *****
  • Posts: 1214
    • View Profile
    • My posts on Steem
  • BitShares: arhag
  • GitHub: arhag
I'm looking for good documentation about the consensus protocol of DPOS (either BitShares 0.x or 2.0 is fine, but I would prefer the 2.0 one) and how nodes decide upon the particular blockchain to use. Specifically, I want to know everything about blockchain reorganization. When does a client, given some existing validated blockchain stored locally, accept a reorganization and when does it not? I want to understand all the possible failure modes we are aware of and how they can be achieved. How does the timing of which blocks are received in which order and by when (relative to the syncing process) change the outcome of which blockchain a client ends up choosing? Under what conditions can a full node get stuck and be unable to sync forward? Are all such cases bugs or are there cases in which this is the legitimate behavior? If is is the latter (which I believe to be the case), is the only way to fix the problem to add a trusted checkpoint to the client? If the problem can be fixed by resyncing the blockchain from scratch (which I have noticed had been recommended in the past as a solution to these stuck blockchain problems that had existed before) is that not an indicator of a bug in that particular case? In other words, if it wasn't a bug and got legitimately stuck, shouldn't it get stuck in the same place again (otherwise how is such non-deterministic behavior not a bug)?