BitShares Forum
Main => General Discussion => Topic started by: sumantso on January 04, 2015, 11:33:00 pm
-
https://bitcointalk.org/index.php?topic=913605.0
If I remember correctly, doesn't BitShares throw away older blocks after a certain number? I am not sure, so I didn't post anything, but if it is true, a post there would draw attention.
-
Never throws away anything.
-
Well it could after 24 hrs
-
Never throws away anything.
I was mistaken, then. I thought the maximum number of blocks was fixed: once past a certain number, every additional block would cause the earliest block to disappear.
-
Well it could after 24 hrs
How? Wouldn't you need some kind of checkpoint not just of a recent block hash but also of the state of the entire database at that point? And that checkpoint has to be cross-platform so everyone would be in consensus on the database state hash.
By the way, I really really would like to see BitShares have that feature in the future. Ethereum has it with their Patricia tree (https://github.com/ethereum/wiki/wiki/Patricia-Tree) database.
Edit: Now that I think about it more, I suspect you meant the local client can throw away those old blocks from the raw_chain but the entire blockchain needs to still exist somewhere on the internet for a new client to resync from genesis. That's less exciting.
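A checkpoint like the one described above only works if every client serializes the database state identically before hashing it, regardless of platform. A minimal sketch of that requirement (the address-to-balance state and all names here are illustrative, not actual BitShares structures):

```python
import hashlib
import json

def state_hash(state: dict[str, int]) -> str:
    """Hash the entire database state (here, address -> balance).
    Sorting the keys and fixing the separators makes the serialization
    canonical, so every client derives the same hash from the same state."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Two clients that built the same state in different orders still agree:
a = state_hash({"alice": 100, "bob": 50})
b = state_hash({"bob": 50, "alice": 100})
assert a == b
```

The point is only the canonicalization step; a real consensus checkpoint would need a richer structure than a flat JSON dump.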
-
https://bitcointalk.org/index.php?topic=913605.0
If I remember correctly, doesn't BitShares throw away older blocks after a certain number? I am not sure, so I didn't post anything, but if it is true, a post there would draw attention.
I seem to remember when blockchain design was being discussed that a trailing two years of transactions would be kept, and the rest pruned.
-
At one point, there were discussions of a penalty for those who do not move their balances at least once per year. I believe the stated reasons were to encourage people to keep delegate votes up-to-date, encourage people to combine their balances, and provide a mechanism to permanently destroy "dust" balances.
However, AFAIK this scheme has never been implemented, and is currently no longer under consideration.
-
Does the following make sense:
Select the first N blocks.
Create a snapshot of all unspent balances at the N-th block.
Hardcode that snapshot in some future release that forks the network at the N+K-th block.
After the N+K-th block, use the snapshot as the input for all transactions (older clients will not work, but this happens with most updates anyway).
This should manually dust off the blockchain with a single update.
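The steps above could be sketched as a bootstrap rule: past block N+K, a syncing client starts from the hardcoded snapshot and replays only later blocks. All names and numbers here (N, K, SNAPSHOT, the functions) are hypothetical illustrations, not actual BitShares code:

```python
# Hypothetical hard-fork bootstrap rule for the proposed scheme.
N = 500_000          # snapshot taken at the N-th block (illustrative)
K = 50_000           # fork activates at block N + K (illustrative)

# Hardcoded in the release: all unspent address:amount pairs at block N.
SNAPSHOT = {
    "BTSaddr1": 5_000,
    "BTSaddr2": 125,
}

def initial_state(current_height: int) -> dict[str, int]:
    """Return the state a syncing client starts from."""
    if current_height >= N + K:
        # Past the fork: trust the hardcoded snapshot and replay only
        # blocks N+1 .. current_height. Blocks 1..N can then be pruned.
        return dict(SNAPSHOT)
    # Before the fork: replay everything from genesis as usual.
    return {}

def blocks_to_replay(current_height: int) -> range:
    """Which blocks a syncing client must still process."""
    if current_height >= N + K:
        return range(N + 1, current_height + 1)
    return range(1, current_height + 1)
```

This shows the pruning payoff: once the fork activates, the first N blocks are no longer needed to compute the current state.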
-
Does the following make sense:
Select the first N blocks.
Create a snapshot of all unspent balances at the N-th block.
Hardcode that snapshot in some future release that forks the network at the N+K-th block.
After the N+K-th block, use the snapshot as the input for all transactions (older clients will not work, but this happens with most updates anyway).
This should manually dust off the blockchain with a single update.
The snapshot step is the hard part. You need to represent the entire state of the database snapshot with a single root hash. And every client, regardless of the platform it runs on and the particular implementation of the client (what if they don't want to use LevelDB?), needs to be able to deterministically derive the same root hash from a given snapshot of the consensus state of the database. This is why I linked to the Ethereum wiki page on Merkle Patricia trees (https://github.com/ethereum/wiki/wiki/Patricia-Tree).
Besides allowing the blockchain to be pruned and allowing a client to bootstrap to the present state more quickly by using a recent snapshot instead of the genesis, it also has the benefit of providing better lightweight client security (https://github.com/ethereum/wiki/wiki/Light-client-protocol).
C'mon, let's do this. Let's not let Ethereum have the superior technology in this case.
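To make the determinism requirement concrete, here is a toy Merkle root over a snapshot of address:balance pairs. This is not Ethereum's Patricia tree, just the simplest structure with the key property: any two correct implementations, whatever their storage backend, derive the same root from the same state.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf(key: str, value: int) -> bytes:
    # Canonical leaf encoding: key and value in one fixed text form.
    return h(f"{key}:{value}".encode())

def merkle_root(state: dict[str, int]) -> bytes:
    """Derive one root hash for the whole database state.
    Entries are sorted by key so every implementation agrees."""
    level = [leaf(k, v) for k, v in sorted(state.items())]
    if not level:
        return h(b"")
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Note that the determinism comes entirely from the sorted order and the fixed leaf encoding; any client that deviates from either produces a different root and falls out of consensus.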
-
Does the following make sense:
Select the first N blocks.
Create a snapshot of all unspent balances at the N-th block.
Hardcode that snapshot in some future release that forks the network at the N+K-th block.
After the N+K-th block, use the snapshot as the input for all transactions (older clients will not work, but this happens with most updates anyway).
This should manually dust off the blockchain with a single update.
The snapshot step is the hard part. You need to represent the entire state of the database snapshot with a single root hash. And every client, regardless of the platform it runs on and the particular implementation of the client (what if they don't want to use LevelDB?), needs to be able to deterministically derive the same root hash from a given snapshot of the consensus state of the database. This is why I linked to the Ethereum wiki page on Merkle Patricia trees (https://github.com/ethereum/wiki/wiki/Patricia-Tree).
Besides allowing the blockchain to be pruned and allowing a client to bootstrap to the present state more quickly by using a recent snapshot instead of the genesis, it also has the benefit of providing better lightweight client security (https://github.com/ethereum/wiki/wiki/Light-client-protocol).
C'mon, let's do this. Let's not let Ethereum have the superior technology in this case.
I will talk to your employer tomorrow...
-
Does the following make sense:
Select the first N blocks.
Create a snapshot of all unspent balances at the N-th block.
Hardcode that snapshot in some future release that forks the network at the N+K-th block.
After the N+K-th block, use the snapshot as the input for all transactions (older clients will not work, but this happens with most updates anyway).
This should manually dust off the blockchain with a single update.
The snapshot step is the hard part. You need to represent the entire state of the database snapshot with a single root hash. And every client, regardless of the platform it runs on and the particular implementation of the client (what if they don't want to use LevelDB?), needs to be able to deterministically derive the same root hash from a given snapshot of the consensus state of the database. This is why I linked to the Ethereum wiki page on Merkle Patricia trees (https://github.com/ethereum/wiki/wiki/Patricia-Tree).
Besides allowing the blockchain to be pruned and allowing a client to bootstrap to the present state more quickly by using a recent snapshot instead of the genesis, it also has the benefit of providing better lightweight client security (https://github.com/ethereum/wiki/wiki/Light-client-protocol).
C'mon, let's do this. Let's not let Ethereum have the superior technology in this case.
So you can't hardcode a few hundred megabytes of snapshot data in some format and read it on any platform?
Is it that hard?
-
So you can't hardcode a few hundred megabytes of snapshot data in some format and read it on any platform?
Is it that hard?
Fine, maybe it's not hard, but it will take a lot of additional code and some careful design.
First, the data format has to be well defined, and if we are going to do this, we'd better do it right. Hence the Merkle Patricia tree, so that clients can do operations on the objects in the database in O(log N) and lightweight clients can look at just the parts of the database they are interested in (and prove they are in the database with a small proof). What is stored in the database is important. There is no need to store extra indexes that speed up lookups but can easily be regenerated from the rest of the database by the client. On the other hand, certain extra cached data may be really important to include, to allow lightweight clients to quickly prove what they need to about the state of the database without having to iterate through everything in it. So all of this has to be carefully designed.
Second, the snapshotting process should be continuously happening in parallel. I don't think it should be synchronous like in Ethereum (part of creating the new block is to already have the Merkle root hash of the new state of the database). But the clients should always be generating the Merkle root hash of a recent snapshot of the database and signing its validity in the block headers as they go.
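A sketch of the lightweight-client side of this: given a signed Merkle root from a block header, a client can check that one balance is in the snapshot with an O(log N)-sized proof. This uses a plain binary Merkle tree rather than the Patricia tree arhag references, but it illustrates the proof-size argument; all names are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Build all tree levels, bottom (leaves) to top (root)."""
    levels = [leaves[:]]
    while len(levels[-1]) > 1:
        cur = levels[-1][:]
        if len(cur) % 2:                 # duplicate the last node on odd levels
            cur.append(cur[-1])
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hash at each level: an O(log N)-sized proof."""
    proof = []
    for level in levels[:-1]:
        lvl = level[:] if len(level) % 2 == 0 else level + [level[-1]]
        proof.append((lvl[index ^ 1], index % 2 == 1))  # (sibling, sibling_is_left)
        index //= 2
    return proof

def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one leaf plus its sibling path."""
    node = leaf
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

The tree itself lives on full nodes; the lightweight client only ever handles one leaf, log N sibling hashes, and the signed root.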
-
@arhag
My proposal was a little bit more radical (and dirty):
Snapshot data is a list of all address:amount pairs.
It is hardcoded in the clients and trusted explicitly (of course, clients should be able to verify it by referring to the first N blocks, but once the snapshot is accepted, the first N blocks should be kept for historical reasons only).
It should shrink the blockchain and more or less do the trick without much coding.
EDIT: Actually it looks too dirty to be even considered... I shouldn't propose anything after 23:00...