Author Topic: Blockchain size projections  (Read 4851 times)


Offline arhag

Quote from: vikram
Sounds doable (maybe one day..); but note that the full state is still not bounded (nor has it been partitioned), so delegate resource requirements will still keep growing. Whether the growth will be fast enough to be a serious problem, I don't know.

Sure, the database state still grows. But that is to be expected when our community is growing! Every new account registration and asset registration grows the database. I think reasonable bounds will exist, though:
 - There is only a limited number of accounts people will want (there is only a finite number of people on this planet), and if that isn't enough of a bound, it means our account registration fee is too low.
 - There is probably some reasonable bound on the number of unique balances each account will hold.
 - There are reasonable bounds on the number of orders in any given order book: individuals will split their orders to a fine price granularity only up to a limit, and typically only at prices within a few percent of the current price.
 - Asset registrations are hopefully bounded to a sensible number by the fees needed to register them.
So I don't think of the database as growing indefinitely with time, but rather as growing proportionally to user adoption (and there is a point where adoption has to saturate).

And obviously the database shouldn't store anything that grows with time rather than with adoption (such as the various histories of price feeds, trades, etc., or previously defined delegate slates that are no longer used), at least not for more than some limited window of time. Those things could be useful or even necessary to store in a cache database (and perhaps that cache database should be in the Merkle Patricia tree so that even light wallets can access it), but since such a cache is focused on a limited moving window of time, it would still be bounded in size as the old, unnecessary data is progressively garbage collected. A rough illustration of that bounded cache (a minimal Python sketch, not actual BitShares code; the name WindowedCache is made up):
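Code: [Select]
import time
from collections import OrderedDict

class WindowedCache:
    """Cache retaining only entries from a limited moving window of time."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.entries = OrderedDict()  # key -> (timestamp, value), oldest first

    def put(self, key, value, now=None):
        now = time.time() if now is None else now
        self.entries.pop(key, None)       # re-inserting moves the key to the newest end
        self.entries[key] = (now, value)
        self._collect_garbage(now)

    def get(self, key):
        entry = self.entries.get(key)
        return entry[1] if entry is not None else None

    def _collect_garbage(self, now):
        # Evict from the oldest end until everything left falls inside the
        # window, so the cache stays bounded no matter how long the node runs.
        while self.entries:
            oldest_key = next(iter(self.entries))
            timestamp, _ = self.entries[oldest_key]
            if now - timestamp <= self.window_seconds:
                break
            del self.entries[oldest_key]

The cache size is then bounded by the data rate times the window length, rather than by the age of the chain.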

And certainly expiring balances would help with this. That is another reason I prefer the expiration fee, which seems to no longer be of interest. It clears out all those balances that are probably never going to be used (due to lost private keys) but would otherwise take up space in the database indefinitely.
« Last Edit: February 05, 2015, 01:39:35 am by arhag »

Offline vikram

Sounds doable (maybe one day..); but note that the full state is still not bounded (nor has it been partitioned), so delegate resource requirements will still keep growing. Whether the growth will be fast enough to be a serious problem, I don't know.

Offline arhag

Quote from: vikram
The blocks themselves are not the problem--they are just the inputs to the state machine and can be thrown out. The problem is that the amount of information that defines the full state of the network in our system grows without bound. In theory, if we changed the system to expire all the different kinds of data after some amount of time, we might be able to bound the size, but I am not aware of any work that's been done in this direction.

You still need the full blockchain for a new machine to get to the present state of the database even if the client includes a recent trusted checkpoint.

What about the idea of encoding the running snapshot of the database in such a way that it can be deterministically reduced to a single hash that commits to the entire state of the database, and having the delegates include that hash (along with the height of the head block at the time of the snapshot the hash refers to) in the block headers as part of their job? The hash would have to be correct for the block to be valid. Once delegates finish computing the hash of one recent snapshot (in parallel with regular block production), they begin computing the hash of the state of the database as it stood immediately after the previous hash was computed. All full nodes (including, of course, the delegate nodes) are able to coordinate on which database snapshot they will compute the hash of next.
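To make "deterministically reduced to a single hash" concrete, here is a minimal sketch, assuming the database is just a flat mapping of byte-string keys to byte-string values (a real implementation would use a Merkle tree so the hash can be maintained incrementally and can support the proofs discussed below):

Code: [Select]
import hashlib

def state_hash(database):
    """Deterministically reduce a {key: value} snapshot to one 32-byte digest.

    Every node holding the same snapshot computes the same digest, so
    delegates can commit to it in a block header and any full node can
    check the commitment independently."""
    h = hashlib.sha256()
    for key in sorted(database):              # canonical ordering is essential
        h.update(hashlib.sha256(key).digest())
        h.update(hashlib.sha256(database[key]).digest())
    return h.digest()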

Then it becomes possible for a client with a recent trusted checkpoint to also get the hash of a recent trusted database state. It can ask any node that has a copy of that state (or that has the full blockchain and can regenerate it) to provide the full state. The requester computes the hash to make sure it is in fact the true trusted database state, and then continues evolving the database from that point using the portion of the blockchain starting from that point. Suppose all clients store the same database state snapshot once every 6 months (the one as of block N, where N is the largest multiple of 1,577,846, roughly 6 months of 10-second blocks, that is less than (current block height - 1,577,846)) and only keep the most recent snapshot to save space. Then a new full client (with a trusted checkpoint no older than 6 months) only needs to download and process 6 to 12 months' worth of the blockchain rather than the entire thing, in addition to downloading and validating the entire database state as it existed somewhere between 6 and 12 months in the past. (If it can find a node with a database snapshot more recent than 6 months but still older than its trusted checkpoint, it only has to download and process less than 6 months' worth of the blockchain.)
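That arithmetic, worked out in a small sketch (the block height in the example is made up):

Code: [Select]
SNAPSHOT_INTERVAL = 1577846  # ~6 months of 10-second blocks

def stored_snapshot_height(current_height):
    """Largest multiple of SNAPSHOT_INTERVAL strictly below
    (current_height - SNAPSHOT_INTERVAL). The snapshot every client
    keeps is therefore always between ~6 and ~12 months old."""
    return ((current_height - SNAPSHOT_INTERVAL - 1) // SNAPSHOT_INTERVAL) * SNAPSHOT_INTERVAL

# Example: at block height 5,000,000 the stored snapshot is the one taken
# at block 3,155,692, i.e. about 7 months (1,844,308 blocks) in the past.
assert stored_snapshot_height(5000000) == 2 * SNAPSHOT_INTERVAL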

Furthermore, if the database encoding were done in a way that anyone can provide log(K)-sized proofs of the existence of some (key, value) pair in the database (where K is the number of unique keys that hold a value in the database), and the database were designed in a clever way, it becomes possible for lightweight clients to easily verify a proof provided by an untrusted third party (say, the light wallet server) about something they want to know about the blockchain as it existed in the most recent snapshot. The only trust required is that the current active delegates will not go against their economic interest to defraud users when they are guaranteed to eventually be caught. Ideally the snapshots would happen frequently enough that the most recent one is always less than a minute old. This of course assumes that the light wallet knows who the current active 101 delegates are and has a way to update that set over time (under normal circumstances) in a lightweight manner, without needing to rely on trusted checkpoints built into the client (other than to initialize its belief about the active delegate set on a new machine or after a long time offline). If the snapshot period is less than 17 minutes (one full round of 101 delegates at 10-second blocks), then ideally all (or >80%) of the delegates should sign the previous block header every block, which creates a verification chain to the most recent snapshot, so that the light wallet can actually have enough trust in the validity of the root hash of an extremely recent database snapshot.
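The verification a light wallet would run against such a delegate-signed root hash looks roughly like this. This is a plain binary Merkle tree sketch; the (sibling_hash, sibling_is_left) proof format is an illustrative assumption, and a Merkle Patricia tree as mentioned above works the same way in principle:

Code: [Select]
import hashlib

def sha256(data):
    return hashlib.sha256(data).digest()

def verify_merkle_proof(leaf, proof, root):
    """Check a proof that `leaf` is in the tree committed to by `root`.

    `proof` is the list of (sibling_hash, sibling_is_left) pairs on the
    path from the leaf to the root, so its length is the tree depth:
    about log2(K) hashes for a database with K keys."""
    node = sha256(leaf)
    for sibling_hash, sibling_is_left in proof:
        if sibling_is_left:
            node = sha256(sibling_hash + node)
        else:
            node = sha256(node + sibling_hash)
    return node == root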
« Last Edit: February 05, 2015, 12:20:56 am by arhag »

Offline vikram


Quote from: gamey
Has the idea of pruning the blockchain been abandoned?

The blocks themselves are not the problem--they are just the inputs to the state machine and can be thrown out. The problem is that the amount of information that defines the full state of the network in our system grows without bound. In theory, if we changed the system to expire all the different kinds of data after some amount of time, we might be able to bound the size, but I am not aware of any work that's been done in this direction.

Offline gamey


Has the idea of pruning the blockchain been abandoned?
I speak for myself and only myself.

Offline vikram

Quote from: Chronos
Bump. No data? Blockchain size seems to be a potential issue for BitShares.

Quote from: xeroc
It is not an issue:
 - only delegates require the full chain .. so 101 ..
 - light wallets are currently under development

No need for everyone to store the full thing unless you WANT to

Quote from: matt608
Is that really the case? I thought everyone was downloading the full blockchain. And the light client won't have a market.

xeroc is speaking theoretically; currently almost everyone processes the entire blockchain. But this does not scale, and in the future it will mostly be delegates processing the full blockchain while most users use light clients.

Offline bytemaster

Quote from: Chronos
Bump. No data? Blockchain size seems to be a potential issue for BitShares.

Quote from: xeroc
It is not an issue:
 - only delegates require the full chain .. so 101 ..
 - light wallets are currently under development

No need for everyone to store the full thing unless you WANT to

Quote from: matt608
Is that really the case? I thought everyone was downloading the full blockchain. And the light client won't have a market.

Light Client will have a market.. just not on first release. 
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline matt608

Quote from: Chronos
Bump. No data? Blockchain size seems to be a potential issue for BitShares.

Quote from: xeroc
It is not an issue:
 - only delegates require the full chain .. so 101 ..
 - light wallets are currently under development

No need for everyone to store the full thing unless you WANT to

Is that really the case? I thought everyone was downloading the full blockchain. And the light client won't have a market.

Offline Chronos

Quote from: Chronos
Bump. No data? Blockchain size seems to be a potential issue for BitShares.

Quote from: xeroc
It is not an issue:
 - only delegates require the full chain .. so 101 ..
 - light wallets are currently under development

No need for everyone to store the full thing unless you WANT to
I hadn't thought that only delegates need to have the entire chain. Good point! Light wallets are very important here.
« Last Edit: February 04, 2015, 09:49:30 pm by Chronos »

Offline Ander

Quote from: svk
I never thought to track this; I could start doing it, though. Here's the current output of disk_usage:

Code: [Select]
(wallet closed) >>> disk_usage
{
  "blockchain": "841 MiB",
  "dac_state": "1 GiB",
  "logs": "19 MiB",
  "mail_client": "71 KiB",
  "mail_server": null,
  "network_peers": "8 MiB"
}

So it seems the blockchain is less than 2 GB, and it is thus growing more slowly than Bitcoin's?

What would happen if the transaction rate increased to match Bitcoin's current rate? Would the chain start growing a lot faster?
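A rough back-of-envelope sketch of that question; all the numbers below are assumptions for illustration, not measurements:

Code: [Select]
# Back-of-envelope sketch; every figure here is a rough assumption.
btc_tx_per_day = 100000   # Bitcoin's approximate rate in early 2015
avg_tx_bytes   = 250      # assumed average transaction size

bytes_per_year = btc_tx_per_day * avg_tx_bytes * 365
print("~%.1f GiB/year at Bitcoin's rate" % (bytes_per_year / 2.0**30))
# -> ~8.5 GiB/year of raw block data: chain growth scales roughly
#    linearly with transaction rate, so yes, much faster.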
https://metaexchange.info | Bitcoin<->Altcoin exchange | Instant | Safe | Low spreads

Offline svk

I never thought to track this; I could start doing it, though. Here's the current output of disk_usage:

Code: [Select]
(wallet closed) >>> disk_usage
{
  "blockchain": "841 MiB",
  "dac_state": "1 GiB",
  "logs": "19 MiB",
  "mail_client": "71 KiB",
  "mail_server": null,
  "network_peers": "8 MiB"
}
Worker: dev.bitsharesblocks

Offline xeroc

Quote from: Chronos
Bump. No data? Blockchain size seems to be a potential issue for BitShares.
It is not an issue:
 - only delegates require the full chain .. so 101 ..
 - light wallets are currently under development

No need for everyone to store the full thing unless you WANT to

Offline Chronos

Bump. No data? Blockchain size seems to be a potential issue for BitShares.

Offline Methodise

A projected graph would be nice, as well.
BTS: methodise

Offline Chronos

Does anyone have a historical graph of size over time?

Offline vikram

For the current size, use the "disk_usage" command or just directly measure the dbs in your data directory.

Offline Chronos

We all know the Bitcoin blockchain continues to grow; it is quickly approaching 30 GB already. How does the BTS chain compare, in terms of longer-term extrapolations of current trends? I'm particularly interested in how the 10-second block timing and the on-chain exchange contribute to the total size.