Author Topic: Slight change in the block validation to better secure lightweight clients  (Read 2335 times)

0 Members and 1 Guest are viewing this topic.

Offline arhag

  • Hero Member
  • *****
  • Posts: 1214
    • View Profile
    • My posts on Steem
  • BitShares: arhag
  • GitHub: arhag
I think the lightweight client will need to talk to many delegate servers to make sure it can actually trust the information it is getting. Even better would be to use an approach similar to what Ethereum is taking where the entire state of the database is encoded with a single hash (root hash of the Merkle Patricia tree). The most recent snapshot of the database state would be taken by each delegate (delegates would coordinate so they are working with the same recent snapshot), they would calculate the root hash (SNAPSHOT_ROOT_HASH), they would verify with one another that they all have the same hash, work together using a threshold signature scheme to generate a Schnorr signature on HASH(CHAIN_ID || SNAPSHOT_BLOCK_HEIGHT || SNAPSHOT_ROOT_HASH), and then include that in the next block. Then they would repeat the process all over again with another most recent snapshot of the database state.

Either way, the lightweight client will need some way of knowing who the top 101 active delegates are at the moment. They need to be able to confidently obtain this fact even if they have been offline for a while. This proposal discusses one way this can be done while providing decent confidence to the lightweight client user that they aren't being tricked.

I propose that we tweak the way the delegates sign blocks. I presume that currently some kind of hash of the current block is taken and that hash is signed. Perhaps a chain ID (CHAIN_ID) is included in this hash as well before signing. Instead, each delegate would sign HASH(CHAIN_ID || CUR_BLOCK_HEIGHT || CUR_BLOCK_HEADER_HASH || PREV_ROUND_DELEGATES_HASH) where CUR_BLOCK_HEIGHT is the block height of the current block, and CUR_BLOCK_HEADER_HASH is the hash of the header of the current block. The block headers would include the hash of the body of the current block (CUR_BLOCK_BODY_HASH), the timestamp (CUR_BLOCK_TIME), the previous block header hash (PREV_BLOCK_HEADER_HASH), the signing delegate's random number committment for the next round (CUR_BLOCK_RANDOM_COMMIT), the signing delegate's random number reveal for the current round (CUR_BLOCK_RANDOM_REVEAL), and potentially a some other bits of informations (particularly SNAPSHOT_ROOT_HASH, SNAPSHOT_BLOCK_HEIGHT, and the threshold signature, as defined in the first paragraph, if we were to represent the database state as a Merkle Patricia tree).

PREV_ROUND_DELEGATES_HASH is calculated according to the following procedure: take the set of 101 active delegates as of the last block in the previous round; lexigraphically order the BTS addresses of the delegate accounts (the addresses corresponding to the private keys the delegates use to sign the blocks); and calculate the hash of the concatenation of these 101 BTS addresses in their lexigraphical order, which is called PREV_ROUND_DELEGATES_HASH.

It now becomes possible for someone to provide the up to 101 signatures within a given round along with their corresponding BLOCK_HEADER_HASH, the BLOCK_HEIGHT, the BLOCK_TIME, the BLOCK_HEADER_HASH of the block at the end of the round prior to the given round, the CHAIN_ID, and the set of 101 active delegates at the end of the round prior to the given round, and anyone can prove that the claimed delegates of the current round signed off that they are the valid active delegates of that round. This may seem useless because it appears to be circular reasoning, but if any of these delegates lied about this (provided signatures about fake claims that were not consistent with what they provided on the real live blockchain), the recipient of the proof would only need to have the signature of the bad delegate, CHAIN_ID, PREV_ROUND_DELEGATES_HASH, and the BLOCK_HEIGHT and BLOCK_HEADER_HASH of the block the delegate supposedly signed to prove to anyone else that the delegate double signed. Thus, if these signatures belonged to delegates that were active delegates in the present time, one can be fairly confident that they are not providing fake claims because otherwise it would be very easy for the proof recipient to get them fired.

The lightweight client keeps track of the set of active delegates, S, as of block N that it believes it knows is true for some particular value of N. As it can later confidently update its belief about the set of active delegates, S', as of a newer block N' > N, it will replace its old knowledge with this new knowledge. So the amount of information to store in this regard remains constant instead of growing linearly with time. For the lightweight client to update its belief that the set of active delegates as of some block N' is S', it requires:
  • that N' be the block at the end of a round,
  • that it receive proof (the proof described in the previous paragraph) from M delegates in S' (ideally all 101) that claim the active delegates as of block N' are S',
  • and, those M delegates in S' make up at least 51% of the delegates in set S.
Each of the signatures (+ corresponding BLOCK_HEIGHT and BLOCK_HEADER_HASH and PREV_ROUND_DELEGATES_HASH) for a delegate that used to be in one of the sets S that the lightweight client trusted at one point but are no longer in the most recent set S', are placed in a database of "old delegate claims". The similar data for delegates that are currently in the set S' are placed in a database of "current delegate claims". The size of the "current delegate claims" database does not grow over time, but the size of "old delegate claims" does. As the "old delegate claims" database grows too big, the lightweight client is free to delete the signatures and corresponding data that have the smallest BLOCK_HEIGHTs (signatures from oldest blocks) until the database becomes manageable in size again. The idea is that the older these claims become the less likely the user is going to need them to prove double signs, since they would have likely already done so if the claims were in fact false. Furthermore, anytime the lightweight client is upgraded with a new built-in trusted checkpoint, all signatures and corresponding data that are from blocks older than that checkpoint can be removed from the "old delegate claims" database.

The above setup means that current delegates that want to make false claims can know that it is incredibly likely that their victims will have proof to get them fired (once they discover they were tricked) even if they collude to create a scenario where they were "voted out" and replaced by other delegates (that are actually fake delegates they control). However, the lightweight client user only gets protection from delegates that are active delegates in the present since they are the ones who have something to lose. If more than 50% of the delegates that were active by the time the lightweight client last synchronized their client were replaced by other delegates, those >50% delegates might collude to trick the lightweight client to believe in a new set S' the next time it syncs that is not the real present set of active delegate. The lightweight client will assume that the probability of this kind of turnover happening over a period of time less than the period that the lightweight client has been offline AND >50% of the now retired delegates deciding to collude with the lightweight client sync server to trick the lightweight client is very low that it can ignore that threat. What would typically happen if there was a period of rapid turnover while the lightweight client was offline is that the lightweight client sync server would be unable to find a proof for the lightweight client that satisfied the three bullet points above. This would trigger the lightweight client to demand the user to provide a trusted recent checkpoint in order to resync after a long period of being offline. For extra security, the lightweight client user may want to just provide a manual trusted checkpoint beforehand, without being prompted for it, if the user is synchronizing the client after being offline for a long time (months).

So the typical flow for a lightweight client syncing after some period of being offline is to first connect to a lightweight client sync server (which can be provided at no cost to users and paid for by delegates) which will get the lightweight client to update its set S to set of current active delegates (while storing the proofs necessary to get delegates fired in case they lie). It can also periodically communicate with this sync server to keep this set up-to-date while the client is running. The sync server can also provide statements signed by the delegates (the sync server can have a communication link to the delegate servers) that expire frequently (say every 30 minutes) and provide a list of mirror servers for each delegate with the corresponding public key responsible for that mirror server. This way lightweight clients can connect to servers run by the delegates that do not store the same private key that the delegate block signing server stores. If any of the mirror servers are compromised, the delegate's private key isn't compromised, the delegate can stop including that mirror server and corresponding public key in its signed statements, and lightweight clients will not be tricked into connecting to that compromised server within at most 30 minutes (likely sooner if a signed revocation statement reaches the lightclient via the sync server). The sync server does not need to do any other work. It does not provide the results to blockchain_* commands via the RPC. The delegate mirror servers are responsible for that.

While the delegate mirror servers could still lie in their responses to the RPC, it is assumed that they will not because they are known to be the servers of the current active delegates who are trusted by the stakeholders. Each RPC response could be signed by the private key associated with that mirror server. In theory, enough information would be provided to the lightweight clients to act as proof that either the delegate was being evil or their mirror server was compromised. Either way, if the delegate tried to be evil on a large scale or was just negligent and irresponsible, they would very likely be caught and voted out. Just the threat that the protocol allows lightweight clients to potential store their signed responses could be enough of a deterrent to force the delegates to behave, even if the official lightweight clients aren't built with that functionality to begin with. Also, for the sake of extra security, the lightweight client would need to get responses to their RPC from many of the active delegates and make sure they all say the same thing. Fluxer555 has discussed some of these ideas in this post.

While the level of security provided as described by the above paragraph would work adequately initially. Eventually, I would prefer something better. That is why I want the entire database state to be encoded as a Merkle Patricia tree (with the most recent database snapshot root hash and corresponding block height and threshold signature included in the block header) so that a single lightweight server that is not run by the delegates can provide log(N) proof of the existence of any value for a corresponding key within the database as of a given block. This makes it unnecessary for the lightweight client to locally store any additional proofs, other than the ones needed to securely sync to the most recent set S, and yet still be confident about the validity of the RPC responses from a single untrusted lightweight server.