The network is "safe" against both of those attacks so long as 5% of the witnesses do not collude. That 5% would be enough to allow the minority chain that had the support of the stakeholders to elect new witnesses and eventually overtake the attackers' chain, which would be stuck at 95% participation, assuming that in a healthy network participation averages over 95%.
I think this is a different scenario from the one I was discussing, but an interesting one nonetheless. I don't understand how the protocol allows a minority chain to recover. I understand that a minority chain could include transactions that change enough votes to vote out the other 95% of bad witnesses by the next maintenance interval, but how exactly do clients decide to switch to that chain given that it is a minority chain? What if there were multiple minority chains that elected a different set of witnesses? How do clients choose which one to adopt?
I also want to add to these questions a bit to hopefully make it more clear what I am asking here.
Let's say there are N = 100 witnesses all producing blocks and everything is good. A new round begins (which is also the second to last round of this maintenance interval) and it is ordered according to: W1, W2, ..., W100. When it is W2's turn, they produce a block that has a transaction that votes out witnesses W1 and W3 to W100 and replaces them with W101 to W198 (this transaction is the deciding vote, meaning if included it will kick those witnesses out and replace them with W101 to W198). When it is W3's turn to produce a block, they build off of W1's block, not W2's block, but nevertheless include that transaction that votes out the other witnesses. When it is W4's turn to produce a block, they build off of W1's block, not that of W2 or W3, and they do not include that vote changing transaction. W5 builds off of W4's block and doesn't include the vote changing transaction either. And similarly for W6 to W100, thus ending that round with a witness participation of 98% (since the blocks produced by W2 and W3 were not included in the majority chain and are therefore considered missing).
Now at the start of the next round (which is the last round of this maintenance interval), a witness shuffle takes place in the majority chain using the accumulated random number (which I guess did not include W2's and W3's random numbers since they were not revealed in the main chain). W2 and W3 are ordered somewhere in the middle of that round and will not produce blocks that round; even if they did, their blocks would likely be skipped over by the other witnesses. The other witnesses will also not include the vote change transaction.
What about the other orphan forks by W2 and W3? What happens at the end of the previous round, when 99% of their witnesses were missing blocks? Is there a witness shuffle? If so, how does the algorithm work? In fact, more generally, how does the random number commitment, reveal, and accumulation process work in light of missing blocks in rounds, or witnesses being voted in and out at maintenance interval transitions? I presume that, however the witnesses are ordered in the new round, both W2 and W3 will have an opportunity to produce a block in each of their chains. So let's say W2 builds a block off their own block in their chain, and W3 does the same in their own chain.
Let me first get some notation out of the way.
B(W, t) means a block signed by witness W with timestamp t.
The blockchain interval is Δt.
B_1 ← B_2 is a blockchain subsequence that means block B_2 builds off of block B_1.
B_1 ← B_2 = B(W3, t) ← B_3 denotes a blockchain subsequence of three blocks where the middle block is given the alias B_2, has timestamp t, and was signed by W3.
B(n) denotes a blockchain starting from the genesis block and ending with a block with block number n.
P(t : time, i : bounded_int( [0, size(L)!) ), L : [optional(witness_t)]) defines a particular blockchain subsequence given by the following pseudocode:
for all types I,
fn permuted<I>(acc : [I], seq : [I], j : uint) -> [I];
// Deterministically permutes seq using index j according to whatever blockchain protocol rules require.
// Note that with no restriction on the permutation, the type of j should actually be bounded_int( [0, size(seq)!) ).
// However, if the blockchain protocol rules are more restrictive, then the bounds on j need to be tighter.
// For completeness, here is one permutation implementation that allows any permutation of seq (no further restrictions like those needed for DPOS 2.0).
fn permuted<I>(acc : [I], seq : [I], j : uint) -> [I]
{
    ASSERT( j < size(seq)! );
    match size(seq) {
        0 => acc,
        m => let j1 = j % m in
             let j2 = j / m in
             let swapped_seq = swap(seq, 0, j1) in // swapped_seq equals seq except with the values at indices 0 and j1 swapped
             permuted<I>(append(acc, head(swapped_seq)), tail(swapped_seq), j2)
    }
}
fn helper(start : time, acc : [(time, witness_t)], seq : [optional(witness_t)]) -> [(time, witness_t)] {
    match seq {
        empty_list => acc,
        prepend(head, tail) => match head {
            Some(w : witness_t) => helper(start + Δt, append(acc, (start, w)), tail),
            None => helper(start + Δt, acc, tail) // a missed witness still consumes its Δt time slot
        }
    }
}
datatype block = Block(witness_t, time)
datatype block_subsequence = Rest | Chain(block_subsequence, block)
fn helper2(L' : [(time, witness_t)], acc : block_subsequence) -> block_subsequence {
    match L' {
        prepend((head_t, head_w), tail) => helper2(tail, Chain(acc, Block(head_w, head_t))),
        _ => acc
    }
}
fn P(t : time, i : bounded_int( [0, size(L)!) ), L : [optional(witness_t)]) -> block_subsequence {
    let L' = helper(t, [], permuted<optional(witness_t)>([], L, i)) in
    helper2(L', Rest)
}
L is a sequence of witnesses, e.g. [Some(W1), Some(W2), None, Some(W5), ...], that defines the witnesses that will be participating in block signing and how many are missing. As a shorthand, [Some(W1), Some(W2), None, Some(W5), ...] can be written as {W1, W2, _, W5, ...}. L' is a sequence of (time, witness_t) tuples, e.g. [(t, W2), (t+2Δt, W5), (t+3Δt, W1), ...], that defines the blocks of the blockchain subsequence returned by P, e.g. B(W2, t) ← B(W5, t+2Δt) ← B(W1, t+3Δt) ← ..., in the same order. Note that the timestamps of the blocks of the resulting blockchain subsequence will always be within the interval [t, t+(size(L)-1)*Δt].
Examples:
B(W1, t) ← P(t+Δt, 0, {W2, W3}) is equivalent to the blockchain subsequence B(W1, t) ← B(W2, t+Δt) ← B(W3, t+2Δt).
B(W1, t) ← P(t+Δt, 1, {W2, W3}) is equivalent to the blockchain subsequence B(W1, t) ← B(W3, t+Δt) ← B(W2, t+2Δt).
B(W1, t) ← P(t+Δt, 6, {W2, _, W3, W4}) is equivalent to the blockchain subsequence B(W1, t) ← B(W3, t+Δt) ← B(W2, t+2Δt) ← B(W4, t+4Δt).
B(W1, t) ← P(t+Δt, 2, {W2, _ x 5, W3}), which is the shorthand for B(W1, t) ← P(t+Δt, 6, {W2, _, _, _, _, _, W3}), is equivalent to the blockchain subsequence B(W1, t) ← B(W2, t+3Δt) ← B(W3, t+7Δt).
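To make the definition concrete, here is a short executable sketch of P in Python, a direct transliteration of the pseudocode above (with a missed witness consuming a single Δt slot, as the examples require). The list-of-tuples representation of the block subsequence and the choice of Δt = 1 time unit are my own:

```python
from math import factorial

DT = 1  # stands in for Δt, the block production interval

def permuted(seq, j):
    """Deterministically permute seq using index j, with 0 <= j < len(seq)!."""
    assert j < factorial(len(seq))
    seq, out = list(seq), []
    for m in range(len(seq), 0, -1):
        j, j1 = divmod(j, m)               # j1 = j % m, then j = j / m
        seq[0], seq[j1] = seq[j1], seq[0]  # swap the front element with index j1
        out.append(seq.pop(0))             # emit the front element
    return out

def P(t, i, L):
    """Return the blocks of P(t, i, L) as (timestamp, witness) tuples.
    A missing witness (None) still consumes its Δt time slot."""
    blocks = []
    for w in permuted(L, i):
        if w is not None:
            blocks.append((t, w))
        t += DT
    return blocks

# Reproduces the examples above (taking t = 0, so t+Δt = 1):
print(P(1, 6, ['W2', None, 'W3', 'W4']))
# [(1, 'W3'), (2, 'W2'), (4, 'W4')]
print(P(1, 2, ['W2'] + [None] * 5 + ['W3']))
# [(3, 'W2'), (7, 'W3')]
```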
Okay, enough notation.
So by the end of this round (the last round of the maintenance interval), just after time t+199Δt, the three blockchains look like this:
Chain 1 (majority chain with length n + 196):
Bbase ← B_f = B(W1, t) ← B(W4, t+3Δt) ← B(W5, t+4Δt) ← ... ← B_{l,1} = B(W100, t+99Δt) ← P(t+100Δt, i_{1,1}, {W1, _, _, W4, W5, ..., W100}),
where i_{1,1} is some index derived from the accumulated random value as of block B_{l,1}, and Bbase = B(n) for some positive integer n.
Chain 2 (minority chain with length n + 3):
Bbase ← B(W1, t) ← B_{l,2} = B(W2, t+Δt) ← B_{v,2} = B(W2, t+k_1*Δt),
where 100 ≤ k_1 ≤ 199 depending on the witness shuffling algorithm determined just after time t+99Δt using information as of block B_{l,2}.
Chain 3 (minority chain with length n + 3):
Bbase ← B(W1, t) ← B_{l,3} = B(W3, t+2Δt) ← B_{v,3} = B(W3, t+k_2*Δt),
where 100 ≤ k_2 ≤ 199 depending on the witness shuffling algorithm determined just after time t+99Δt using information as of block B_{l,3}.
The common fork point of all three chains is block B_f, which has block number n+1.
By time t+200Δt, a new maintenance interval has started and therefore the votes have been updated and some witnesses might have been removed and replaced by others. This happens in each of the three chains despite the fact that chain 2 and chain 3 had only 1 block in the previous round, correct? Let's assume that no change of witnesses occur in chain 1. So the same witness W1 to W100 are still active in that chain, there is just a regular reordering since it is the start of a new round.
But in chain 2 and 3, we can imagine enough votes were accumulated in B_{v,2} and B_{v,3}, respectively, that witnesses W1 and W4 to W100 were voted out and replaced by witnesses W101 to W198. Therefore, I assume that in the next round, a witness shuffling will occur for the sets {W2, W3, W101, W102, ..., W198} in both chain 2 and chain 3 (but each with their own random number used for the shuffling), correct? If so, the blockchain another 100 blocks later could end up something like this:
Chain 1 (majority chain with length n + 294):
Bbase ← B(W1, t) ← B(W4, t+3Δt) ← B(W5, t+4Δt) ← ... ← B(W100, t+99Δt) ← P(t+100Δt, i_{1,1}, {W1, _, _, W4, W5, ..., W100}) ← P(t+200Δt, i_{1,2}, {W1, _, _, W4, W5, ..., W100}),
where i_{1,2} is some index that gives the "appropriate" permutation to satisfy the blockchain protocol rules for witness shuffling at this point in the blockchain.
Chain 2 (majority chain, at least for this past round since it has 91% witness participation, with length n + 94):
Bbase ← B(W1, t) ← B_{l,2} = B(W2, t+Δt) ← B(W2, t+k_1*Δt) ← P(t+200Δt, i_{2,2}, {_ x 9, W2, W101, W102, ..., W190}).
Chain 3 (minority chain, since it only has 9% witness participation in this last round, with length n + 12):
Bbase ← B(W1, t) ← B_{l,3} = B(W3, t+2Δt) ← B(W3, t+k_2*Δt) ← P(t+200Δt, i_{3,2}, {_ x 91, W3, W191, W192, ..., W198}).
Here I assumed that nearly all (90 out of 98) of the newly elected witnesses chose to build on chain 2, while the rest (8 out of 98) decided to support chain 3. Notice that none of the 98 could build on chain 1 because they were active witnesses in that chain (the transactions that got them elected in were not included in chain 1).
Also notice that someone looking at just the past round would think that chain 2 is doing pretty decently since 91% of its active witnesses produced blocks in the last round (despite the fact that only 1% of witnesses produced blocks in the prior round). However, going by the longest chain metric neither chain 2 nor chain 3 win compared to chain 1, which is clearly the longest chain. Therefore, all full nodes following the protocol still stay on chain 1. If we assume that the remaining witnesses that are wasting their time on chain 3 decide to finally join chain 2, then chain 2 can theoretically achieve 100% witness participation from that point forward.
If witnesses W2, W3, W191, ..., W198 continue to always produce blocks for every round thereafter, and if witnesses W1, W4, ..., W100 continue to always produce blocks for every round as well, then the block height of chain 1 will increase by 98 each round while the block height of chain 2 will increase by 100 each round. So after r rounds, chain 1 has n1 = n + 294 + 98*r blocks and chain 2 has n2 = n + 94 + 100*r blocks. We can see that n1 equals n2 at r = 100, so after 100 more rounds (or 10,000 more blocks in chain 2, or just before time t + 10199Δt) chain 1 and chain 2 are of equal length. In the next round afterwards, chain 2 will become the longest chain. However, by this point the common fork point between these two chains (which remember is block B_f at block number n+1) is way more than 1000 blocks old. In fact it is at least 10094 blocks old. So none of the live full nodes will switch over to chain 2, correct?
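The catch-up arithmetic above is easy to check numerically, using the chain lengths as given (the common offset n cancels out):

```python
def crossover_round(len_a, rate_a, len_b, rate_b):
    """Smallest round r at which chain B becomes strictly longer than chain A,
    where chain A has len_a + rate_a*r blocks and chain B has len_b + rate_b*r."""
    r = 0
    while len_a + rate_a * r >= len_b + rate_b * r:
        r += 1
    return r

# Chain 1: n + 294 + 98*r blocks; chain 2: n + 94 + 100*r blocks.
print(294 + 98 * 100 == 94 + 100 * 100)   # equal lengths at r = 100: True
print(crossover_round(294, 98, 94, 100))  # chain 2 first strictly longer at r = 101
```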
Now some other user, Alice, runs their BitShares full node client and connects to the network. The last block they had synced in their previous session of using the client just prior to shutting down was block B_f, which has block number n+1 and timestamp t+Δt. Now it is time t+10201Δt (10200 seconds, or 2.83 hours, later with Δt = 1 second) and Alice's client needs to sync the last nearly 3 hours worth of blocks. From Alice's perspective, the longest chain is chain 2, not chain 1. Her client was not online to see what happened between time t+Δt and t+10201Δt. The client has no way of knowing that everyone has decided to stick with chain 1 because the fork point was too old to trigger a blockchain reorganization. So after getting the two candidate chains, does Alice's client conclude that it must get stuck at block B_f and warn the user? At what time in the syncing process does it figure this out? Given the two chains, 1 and 2, stored locally in the state as I described it as of time t+10199Δt, is the decision Alice's client makes deterministic, or does it depend on when the client receives the corresponding blocks as it is syncing? Does the client keep going back and forth between chain 1 and 2 (using its undo history to constantly rewind back to B_f) as each chain alternates in winning over the other in longest length, like a close horse race, as blocks from each chain come in through the peer-to-peer network?
My understanding of how this should work is that clients should be designed to get stuck unless they are able to sync to the head block whose valid timestamp gets as close as possible (relative to the time reported by NTP) without exceeding it, AND where the common fork point of any competing valid chain that diverges after the block the client last synced in its previous session is not too old to prevent reorganization. I believe that is the only case where it makes sense to keep moving forward with the longest chain with no error; any other scenario requires the client to get stuck and warn the user (would you say this characterization of the problem is correct?).
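Here is a hypothetical sketch of that rule in Python, purely to make the proposed characterization concrete. The structure, names, and the `REORG_LIMIT` constant are invented for illustration and do not reflect any actual BitShares client code:

```python
REORG_LIMIT = 1000  # assumed maximum fork-point age (in blocks) that still allows a reorg

def can_advance(head_time, now, candidate_chains):
    """Proposed rule: keep following the longest chain only if its head
    timestamp does not exceed NTP time, and every competing chain forks
    off recently enough that a reorganization is still possible.
    Each chain is a dict with its total 'length' and the block number
    'fork_num' where it diverges from the longest chain."""
    if head_time > now:
        return False  # head block timestamp is in the future: something is wrong
    best = max(candidate_chains, key=lambda c: c["length"])
    for chain in candidate_chains:
        if chain is best:
            continue
        if best["length"] - chain["fork_num"] > REORG_LIMIT:
            return False  # fork point too old: get stuck and warn the user
    return True

# Alice's situation: the fork point (block n+1) is ~10,000 blocks behind both heads.
alice = [{"length": 10294, "fork_num": 1}, {"length": 10294, "fork_num": 1}]
print(can_advance(10199, 10201, alice))  # False: the client should get stuck
```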
Now, putting those questions aside for now, I think it is worth looking at a simpler scenario of just two chains and trying to determine what it takes, in terms of honest witnesses and average witness participation, for the "good" chain to win in longest length prior to the chain reorganization limit being reached. So in this new scenario we will suppose that there are G good witnesses (W_{orig,1}, ..., W_{orig,G}) that start on the new chain as soon as they have enough votes to vote out the bad witnesses (W_{bad,1}, ..., W_{bad,N-G}), presumably because they are censoring transactions, and no one wastes time on any other minority chains. The newly elected witnesses (W_{new,1}, ..., W_{new,N-G}) along with the original good witnesses then produce blocks each round with an average witness participation rate of f (1 - G/N < f ≤ 1). In that case we might have the following chains early in the process (just before time t + 2*N*Δt):
Bad chain (length n + 2*(N-G)):
Bbase ← P(t, i_{b,1}, {_ x G, W_{bad,1}, ..., W_{bad,N-G}}) ← P(t + N*Δt, i_{b,2}, {_ x G, W_{bad,1}, ..., W_{bad,N-G}}).
Good chain (length n + k + G + N):
Bbase ← B(W_{bad,j1}, t) ← B(W_{bad,j2}, t+Δt) ← ... ← B_fork = B(W_{bad,jk}, t+(k-1)*Δt) ← P(t+k*Δt, i_{g,1}, {_ x (N-G-k), W_{orig,1}, ..., W_{orig,G}}) ← P(t + N*Δt, i_{g,2}, {W_{orig,1}, ..., W_{orig,G}, W_{new,1}, ..., W_{new,N-G}}),
where 0 ≤ k ≤ N - G.
At this point the bad chain is the longer chain as long as N - k - 3G > 0. If the good witnesses get lucky (k = N - G), then the good chain will already be the longer chain by this point in time. While we cannot tell what k will be (since it depends on the witness shuffling and thus on the random values), we can calculate the probability distribution p(k) assuming a uniformly random permutation. The distribution is p(k) = (G * (N-G)! * (N-k-1)!) / ((N-G-k)! * N!) for 0 ≤ k ≤ N - G, and so the expected value for k is ∑ k*p(k) from k = 0 to N-G, which according to WolframAlpha is (N-G)/(G+1). So substituting that expression for k in the inequality and manipulating the terms a little gives the inequality 3*G - N + 2 < 0, or N > T(G) where T(G) = 3*G + 2. When N > T(G), the bad chain will at this point still be the longer chain, but if N < T(G), then the good chain will have already become the longer chain. With N = 100, T(G) first becomes greater than N at G = 33.
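The distribution and its mean are easy to verify numerically with exact rational arithmetic; this is just a sanity check of the formulas above, not part of the argument:

```python
from fractions import Fraction
from math import factorial

def p(k, N, G):
    """Probability that exactly k bad witnesses are scheduled before the
    first good witness in a uniformly random permutation of N witnesses,
    G of which are good."""
    return Fraction(G * factorial(N - G) * factorial(N - k - 1),
                    factorial(N - G - k) * factorial(N))

N, G = 100, 33
dist = [p(k, N, G) for k in range(N - G + 1)]
mean = sum(k * pk for k, pk in enumerate(dist))
print(sum(dist) == 1)                  # the probabilities sum to 1: True
print(mean == Fraction(N - G, G + 1))  # the mean matches (N-G)/(G+1): True
```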
Despite estimating the expected value for k, I will proceed with the worst-case assumption that k = 0. Assuming the bad chain is still the longer chain, we can see what happens after several rounds. We should expect the bad chain to add (N-G) blocks each round, but we will only assume the good chain can add f*N blocks each round. So after r additional rounds (just before time t + (2+r)*N*Δt) the length of the bad chain is n + (2+r)*(N-G), and the length of the good chain is at least n + G + N + r*f*N. The difference between the length of the good chain and the length of the bad chain after r rounds is approximately D(r) = 3*G - N + r*(G - (1-f)*N). We can see that D(r) will be positive under the constraints (1-f)*N < G < N/2 and 0.5 < f < 1 when r > (g^2 - 2*f*g + 2*f - 1)/(f+g-1), where g = G/N. We need D(r) to become positive (so that the good chain becomes the longest chain) before the fork block, B_fork at block number n+k, becomes older than R = 1000 blocks. This means that as soon as (2+r)*(N-G) > R, which is at about r = R/(N-G) - 2, D(r) must be positive, which thus requires that r = (R/N)/(1-g) - 2 > (g^2 - 2*f*g + 2*f - 1)/(f+g-1), or R > N*(1-g)*(2 + (g^2 - 2*f*g + 2*f - 1)/(f+g-1)). This can also be rewritten as f > (g^2 - 2*(1-g) + R' - 1) / (R'/(1-g) - 2*(2-g)), where R' = R/N, assuming R' > 2*(1-g)*(2-g), which given the previous inequality holds true for f > 2/3. So with f > 2/3, which means g < 1/3 (for g > 1/3 the good chain wins without any extra rounds being necessary), and the constraint that 1-g < f < 1, we find that R' > (-g^3 + g^2 - g + 1)/g.
So with R = 1000 and N = 100 (thus R' = 10), the minimum value of g necessary to guarantee the good chain will become the longer chain prior to the chain reorganization limit being reached is approximately 0.0916. This means at least 10 good witnesses out of the total 100 witnesses are needed. And the minimum value of f needed as a function of g is given by this plot. With G = 10, the average witness participation needs to be at least 98.62%. With G = 15, we get a more sensible value of 90.8% as the minimum witness participation needed to ensure the good chain becomes the longest chain prior to the chain reorganization limit being reached.
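These last numbers follow directly from the inequalities above; a small script (using the formulas exactly as derived, with bisection for the g threshold; the function names are mine) reproduces them:

```python
def f_min(g, Rp):
    """Minimum average participation rate f from the inequality
    f > (g^2 - 2*(1-g) + R' - 1) / (R'/(1-g) - 2*(2-g))."""
    return (g * g + 2 * g - 3 + Rp) / (Rp / (1 - g) - 2 * (2 - g))

def g_min(Rp, lo=1e-6, hi=1 / 3):
    """Smallest g satisfying R' > (-g^3 + g^2 - g + 1)/g, found by bisection
    (the right-hand side is decreasing in g on (0, 1/3))."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if (-mid**3 + mid**2 - mid + 1) / mid > Rp:
            lo = mid  # g still too small: threshold not yet met
        else:
            hi = mid
    return (lo + hi) / 2

Rp = 1000 / 100  # R' = R/N with R = 1000, N = 100
print(round(g_min(Rp), 4))        # ≈ 0.0916, i.e. at least 10 good witnesses
print(round(f_min(0.10, Rp), 4))  # G = 10 → ≈ 0.9862 minimum participation
print(round(f_min(0.15, Rp), 4))  # G = 15 → ≈ 0.908 minimum participation
```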