BitShares Forum

Main => General Discussion => Topic started by: abit on July 30, 2016, 10:34:39 pm

Title: Serious network issue around 2016-07-29T21:20:00
Post by: abit on July 30, 2016, 10:34:39 pm
Wanted to write more about this issue last night, but didn't get time. Still not much time now.

The issue is described in https://github.com/cryptonomex/graphene/issues/645. It has been triggered on the BitShares network and the MUSE network several times recently.

BitShares witnesses please apply this patch, if not already done: https://github.com/abitmore/bitshares-2/commit/2cf8fb62c98befb70fedffe739a05bc4ac5a0de8

MUSE witnesses please check Telegram channel for more info.

Code: [Select]
...
git checkout v2.0.160328

# add these lines after `git checkout v2.0.160328` and before `make`
git remote add abit https://github.com/abitmore/bitshares-2.git
git fetch abit
git merge -m "Merge abit/fix-645-block-timestamp" abit/fix-645-block-timestamp

git submodule update --init --recursive
...
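
For anyone who wants to double-check the merge before running `make`, here is a minimal sketch. It is not part of the steps above. It assumes it is run inside the bitshares-2 working tree, and `patch_is_merged` is a made-up helper name. It relies on `git merge-base --is-ancestor`, which exits with 0 when the given commit is already contained in HEAD.

```shell
# Hypothetical helper: succeeds if commit $1 is already contained in the current HEAD.
patch_is_merged() {
    git merge-base --is-ancestor "$1" HEAD
}

# Commit hash taken from the patch link in the post above.
PATCH_COMMIT=2cf8fb62c98befb70fedffe739a05bc4ac5a0de8

# Any failure (commit unknown, not an ancestor, not a git repo) is treated as "missing".
if patch_is_merged "$PATCH_COMMIT" 2>/dev/null; then
    echo "patch present: safe to run make"
else
    echo "patch missing: repeat the fetch and merge steps"
fi
```

If it reports the patch as missing, the usual cause would be building from a different branch or skipping the `git fetch abit` step.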
Title: Re: Serious network issue around 2016-07-29T21:20:00
Post by: ripplexiaoshan on July 31, 2016, 04:27:22 pm
 +5% +5% +5%
Title: Re: Serious network issue around 2016-07-29T21:20:00
Post by: Thom on July 31, 2016, 08:03:19 pm
This morning I found that 2 of my nodes had crashed. I restarted them without issue, and at the same time began to rebuild the client & witness based on abit's source tree (https://github.com/abitmore/bitshares-2.git). I used exactly the same methodology to build as I always do. There were no errors.

I deployed the binaries to each of my 4 VPS systems and restarted 2 of them. Both failed in exactly the same way at exactly the same point:

Code: [Select]
   3.21%   268000 of 8348922   
   3.23395%   270000 of 8348922   
   3.25791%   272000 of 8348922   
   3.28186%   274000 of 8348922   
   3.30582%   276000 of 8348922   
   3.32977%   278000 of 8348922   
   3.35373%   280000 of 8348922   
   3.37768%   282000 of 8348922   
   3.40164%   284000 of 8348922   
2550774ms th_a       application.cpp:400           startup              ] 10 assert_exception: Assert Exception
is_valid_name( name ):
    {}
    th_a  account.cpp:162 validate

    {"trx":{"ref_block_num":21932,"ref_block_prefix":4190597394,"expiration":"2015-10-23T15:10:51","operations":[[5,{"fee":{"amount":19030078,"asset_id":"1.3.0"},"registrar":"1.2.32495","referrer":"1.2.32495","referrer_percent":0,"name":"bitspace.no","owner":{"weight_threshold":1,"account_auths":[],"key_auths":[["BTS5ybcCjQ9eey9XaDhkBzQnnVTDhqLaAfLKbqYKQ3EpnvXef11jH",1]],"address_auths":[]},"active":{"weight_threshold":1,"account_auths":[],"key_auths":[["BTS89kx8WgD7hmNsFt5nRcVi6LRiotNAEjz2L8MFg8N8ETrsMgrFa",1]],"address_auths":[]},"options":{"memo_key":"BTS52AuTLaMv8CgZrJ6Em7sTZYMa9PB2gVYe9AXrEqC7epsCWBjvT","voting_account":"1.2.5","num_witness":0,"num_committee":0,"votes":[],"extensions":[]},"extensions":[]}]],"extensions":[],"signatures":["1f36a38d3457788cb689d7025b8685ef4f02d654ce88ed8a73898aa02645635df17651e280d89e7ce71b4bf02f1e01fc1fe0eb5406e3c3a130dfc3b927198231ad"]}}
    th_a  db_block.cpp:615 _apply_transaction

    {"next_block.block_num()":284077}
    th_a  db_block.cpp:518 _apply_block

    {"data_dir":"/home/admin/.BitShares2/blockchain"}
    th_a  db_management.cpp:97 reindex
2550777ms th_a       application.cpp:946           startup              ] 10 assert_exception: Assert Exception
is_valid_name( name ):
    {}
    th_a  account.cpp:162 validate

    {"trx":{"ref_block_num":21932,"ref_block_prefix":4190597394,"expiration":"2015-10-23T15:10:51","operations":[[5,{"fee":{"amount":19030078,"asset_id":"1.3.0"},"registrar":"1.2.32495","referrer":"1.2.32495","referrer_percent":0,"name":"bitspace.no","owner":{"weight_threshold":1,"account_auths":[],"key_auths":[["BTS5ybcCjQ9eey9XaDhkBzQnnVTDhqLaAfLKbqYKQ3EpnvXef11jH",1]],"address_auths":[]},"active":{"weight_threshold":1,"account_auths":[],"key_auths":[["BTS89kx8WgD7hmNsFt5nRcVi6LRiotNAEjz2L8MFg8N8ETrsMgrFa",1]],"address_auths":[]},"options":{"memo_key":"BTS52AuTLaMv8CgZrJ6Em7sTZYMa9PB2gVYe9AXrEqC7epsCWBjvT","voting_account":"1.2.5","num_witness":0,"num_committee":0,"votes":[],"extensions":[]},"extensions":[]}]],"extensions":[],"signatures":["1f36a38d3457788cb689d7025b8685ef4f02d654ce88ed8a73898aa02645635df17651e280d89e7ce71b4bf02f1e01fc1fe0eb5406e3c3a130dfc3b927198231ad"]}}
    th_a  db_block.cpp:615 _apply_transaction

    {"next_block.block_num()":284077}

I used the --replay-blockchain arg to launch. I will try --resync-blockchain on one node, but I highly doubt that will resolve the issue.

I reviewed issue 645 on GitHub and also noted the discussion on Telegram. Seems like a window of 5 seconds with a 3 second block time may be marginal? Not sure. There was no mention of a need to resync, so I only replayed.

Also, the above assert is on an "invalid name", which seems unrelated to timing. I assumed that abit's GitHub repository is identical to the bitshares repo except for his quick patch. Is that assumption valid? Why an exception on a name?
Title: Re: Serious network issue around 2016-07-29T21:20:00
Post by: abit on August 01, 2016, 12:39:50 pm
My default source tree is outdated. Please only use the patch. The steps are in OP.
Title: Re: Serious network issue around 2016-07-29T21:20:00
Post by: xeroc on August 02, 2016, 08:50:33 am
I have merged the patch as requested in the pull request by @abit.
Title: Re: Serious network issue around 2016-07-29T21:20:00
Post by: nmywn on August 02, 2016, 04:59:06 pm
Thank you, abit.
Thank you, xeroc.