Author Topic: witness_node completely froze overnight  (Read 7505 times)

0 Members and 1 Guest are viewing this topic.

Offline alt

  • Hero Member
  • *****
  • Posts: 2821
    • View Profile
  • BitShares: baozi
I believe we have identified the cause of this and will roll out a fix next week.
thanks!

Offline bytemaster

I believe we have identified the cause of this and will roll out a fix next week.
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline abit

  • Committee member
  • Hero Member
  • *
  • Posts: 4664
    • View Profile
    • Abit's Hive Blog
  • BitShares: abit
  • GitHub: abitmore
BitShares committee member: abit
BitShares witness: in.abit

Offline abit

  • Committee member
  • Hero Member
  • *
  • Posts: 4664
    • View Profile
    • Abit's Hive Blog
  • BitShares: abit
  • GitHub: abitmore
Code: [Select]
#0  0x00000000021c352c in graphene::chain::price_feed::price_feed (this=0x23d86de8)
    at /app/bts/bts2-2.0.160103b/libraries/chain/include/graphene/chain/protocol/asset.hpp:150
#1  0x00000000024e61c9 in std::pair<fc::time_point_sec, graphene::chain::price_feed>::pair (this=0x23d86de0)
    at /usr/include/c++/4.8/bits/stl_pair.h:127
#2  0x00000000024e6203 in boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >::pair (this=0x23d86dd8, x=...)
    at /app/boost_1_57_0.bin/include/boost/container/detail/pair.hpp:115
#3  0x00000000024e3dc1 in boost::container::allocator_traits<std::allocator<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > > >::priv_construct_dispatch2<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> ><boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >&> > (p=0x23d86dd8) at /app/boost_1_57_0.bin/include/boost/container/allocator_traits.hpp:343
#4  0x00000000024e0984 in boost::container::allocator_traits<std::allocator<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > > >::priv_construct<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> ><boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >&> > (a=..., p=0x23d86dd8) at /app/boost_1_57_0.bin/include/boost/container/allocator_traits.hpp:334
#5  0x00000000024dbec7 in boost::container::allocator_traits<std::allocator<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > > >::construct<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> ><boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >&> > (a=..., p=0x23d86dd8) at /app/boost_1_57_0.bin/include/boost/container/allocator_traits.hpp:288
#6  0x00000000024d4f18 in boost::container::uninitialized_copy_alloc_n<std::allocator<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > >, boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >*, boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >*> (a=..., f=0x23ccfe28, n=1, r=0x23d86dd8) at /app/boost_1_57_0.bin/include/boost/container/detail/utilities.hpp:630
#7  0x00000000024ccf4d in boost::container::vector<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >, std::allocator<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > > >::vector (this=0x7ffff5ce6b70, x=...)
    at /app/boost_1_57_0.bin/include/boost/container/vector.hpp:726
#8  0x00000000024c4129 in boost::container::container_detail::flat_tree<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >, boost::container::container_detail::select1st<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > >, std::less<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object> >, std::allocator<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > > >::Data::Data (this=0x7ffff5ce6b70, d=...) at /app/boost_1_57_0.bin/include/boost/container/detail/flat_tree.hpp:122
#9  0x00000000024bad2f in boost::container::container_detail::flat_tree<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> >, boost::container::container_detail::select1st<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > >, std::less<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object> >, std::allocator<boost::container::container_detail::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > > >::flat_tree (this=0x7ffff5ce6b70, x=...) at /app/boost_1_57_0.bin/include/boost/container/detail/flat_tree.hpp:217
#10 0x00000000024afb43 in boost::container::flat_map<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed>, std::less<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object> >, std::allocator<std::pair<graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, std::pair<fc::time_point_sec, graphene::chain::price_feed> > > >::flat_map (this=0x7ffff5ce6b70, x=...)
    at /app/boost_1_57_0.bin/include/boost/container/flat_map.hpp:268
#11 0x00000000024a27a5 in graphene::chain::asset_bitasset_data_object::asset_bitasset_data_object (this=0x7ffff5ce6b30)
    at /app/bts/bts2-2.0.160103b/libraries/chain/include/graphene/chain/asset_object.hpp:164
#12 0x000000000248439b in graphene::chain::database::clear_expired_orders (this=0x596e278)
    at /app/bts/bts2-2.0.160103b/libraries/chain/db_update.cpp:294
#13 0x000000000246045f in graphene::chain::database::_apply_block (this=0x596e278, next_block=...)
    at /app/bts/bts2-2.0.160103b/libraries/chain/db_block.cpp:519
#14 0x000000000245f90e in graphene::chain::database::__lambda19::operator() (__closure=0x7ffff5ce6ef0)
    at /app/bts/bts2-2.0.160103b/libraries/chain/db_block.cpp:475
#15 0x0000000002486757 in graphene::chain::detail::with_skip_flags<graphene::chain::database::apply_block(const graphene::chain::signed_block&, uint32_t)::__lambda19>(graphene::chain::database &, uint32_t, graphene::chain::database::__lambda19) (db=...,
    skip_flags=0, callback=...) at /app/bts/bts2-2.0.160103b/libraries/chain/include/graphene/chain/db_with.hpp:126
#16 0x000000000245fd25 in graphene::chain::database::apply_block (this=0x596e278, next_block=..., skip=0)
    at /app/bts/bts2-2.0.160103b/libraries/chain/db_block.cpp:476
#17 0x00000000024583d8 in graphene::chain::database::_push_block (this=0x596e278, new_block=...)
    at /app/bts/bts2-2.0.160103b/libraries/chain/db_block.cpp:191
#18 0x0000000002457383 in graphene::chain::database::__lambda15::__lambda16::operator() (__closure=0x7ffff5ce75f0)
    at /app/bts/bts2-2.0.160103b/libraries/chain/db_block.cpp:116
#19 0x0000000002486560 in graphene::chain::detail::without_pending_transactions<graphene::chain::database::push_block(const graphene::chain::signed_block&, uint32_t)::__lambda15::__lambda16>(graphene::chain::database &, <unknown type in /app/bts/bts2-2.0.160103b/programs/witness_node/witness_node, CU 0x1063cb1, DIE 0x12afeb4>, graphene::chain::database::__lambda15::__lambda16) (db=...,
    pending_transactions=<unknown type in /app/bts/bts2-2.0.160103b/programs/witness_node/witness_node, CU 0x1063cb1, DIE 0x12afeb4>, callback=...) at /app/bts/bts2-2.0.160103b/libraries/chain/include/graphene/chain/db_with.hpp:143
#20 0x00000000024573ff in graphene::chain::database::__lambda15::operator() (__closure=0x7ffff5ce7690)
    at /app/bts/bts2-2.0.160103b/libraries/chain/db_block.cpp:117
#21 0x00000000024865db in graphene::chain::detail::with_skip_flags<graphene::chain::database::push_block(const graphene::chain::signed_block&, uint32_t)::__lambda15>(graphene::chain::database &, uint32_t, graphene::chain::database::__lambda15) (db=..., skip_flags=0,
    callback=...) at /app/bts/bts2-2.0.160103b/libraries/chain/include/graphene/chain/db_with.hpp:126
#22 0x0000000002457458 in graphene::chain::database::push_block (this=0x596e278, new_block=..., skip=0)
    at /app/bts/bts2-2.0.160103b/libraries/chain/db_block.cpp:118
#23 0x00000000020c01b4 in graphene::app::detail::application_impl::handle_block (this=0x596e140, blk_msg=..., sync_mode=false,
    contained_transaction_message_ids=...) at /app/bts/bts2-2.0.160103b/libraries/app/application.cpp:505
#24 0x00000000029a4a63 in graphene::net::detail::statistics_gathering_node_delegate_wrapper::__lambda56::operator() (
    __closure=0x7fffe80b9ee8) at /app/bts/bts2-2.0.160103b/libraries/net/node.cpp:5388
#25 0x00000000029b610c in fc::detail::functor_run<graphene::net::detail::statistics_gathering_node_delegate_wrapper::handle_block(const graphene::net::block_message&, bool, std::vector<fc::ripemd160>&)::__lambda56>::run(void *, void *) (functor=0x7fffe80b9ee8,
    prom=0x7fffe80b9fd8) at /app/bts/bts2-2.0.160103b/libraries/fc/include/fc/thread/task.hpp:77
#26 0x00000000027760f9 in fc::task_base::run_impl (this=0x7fffe80b9f10)
    at /app/bts/bts2-2.0.160103b/libraries/fc/src/thread/task.cpp:43
#27 0x000000000277608a in fc::task_base::run (this=0x7fffe80b9f10) at /app/bts/bts2-2.0.160103b/libraries/fc/src/thread/task.cpp:32
#28 0x000000000276a76c in fc::thread_d::run_next_task (this=0x59a4410)
    at /app/bts/bts2-2.0.160103b/libraries/fc/src/thread/thread_d.hpp:495
#29 0x000000000276ac10 in fc::thread_d::process_tasks (this=0x59a4410)
    at /app/bts/bts2-2.0.160103b/libraries/fc/src/thread/thread_d.hpp:544
#30 0x000000000276a283 in fc::thread_d::start_process_tasks (my=93996048)
    at /app/bts/bts2-2.0.160103b/libraries/fc/src/thread/thread_d.hpp:475
#31 0x0000000002aceb41 in make_fcontext () at libs/context/src/asm/make_x86_64_sysv_elf_gas.S:64
#32 0x0000000000000000 in ?? ()

Sure it's related to clear_expired_orders
BitShares committee member: abit
BitShares witness: in.abit

Offline abit

  • Committee member
  • Hero Member
  • *
  • Posts: 4664
    • View Profile
    • Abit's Hive Blog
  • BitShares: abit
  • GitHub: abitmore
@iHashFury some questions:
* did you start it with --replay-blockchain?
* was your witness_node running in gdb? if yes, a backtrace would be helpful.

--replay-blockchain run automatically after the update.

I believe it's different.

I just found that my backup node has been in stuck status for quite some time, and CPU is 100%. It's a debug build of version 2.0.160103b, running in gdb, started without --replay-blockchain, however no automatically replay when it was started (database was considered clean).

Code: [Select]
1701380ms th_a       application.cpp:497           handle_block         ] Got block: #2367989 time: 2016-01-04T12:28:21 latency: 380 ms from: verbaltech2  irreversible: 2367972 (-17)
1702000ms th_a       witness.cpp:194               block_production_loo ] Not producing block because slot has not yet arrived
1703000ms th_a       witness.cpp:194               block_production_loo ] Not producing block because slot has not yet arrived
1704000ms th_a       witness.cpp:191               block_production_loo ] Not producing block because it isn't my turn
1704347ms th_a       application.cpp:497           handle_block         ] Got block: #2367990 time: 2016-01-04T12:28:24 latency: 347 ms from: spectral  irreversible: 2367972 (-18)
1152420ms ntp        ntp.cpp:177                   read_loop            ] ntp_delta_time updated to 88 us
258788ms ntp        ntp.cpp:177                   read_loop            ] ntp_delta_time updated to 36647 us
1164747ms ntp        ntp.cpp:177                   read_loop            ] ntp_delta_time updated to -109792 us
2070280ms ntp        ntp.cpp:177                   read_loop            ] ntp_delta_time updated to -216 us
3570604ms ntp        ntp.cpp:177                   read_loop            ] ntp_delta_time updated to -1283 us
Instructions needed. Wish I can help.

//Edit:
Both my primary witness node and seed node work fine, both were started with --replay-blockchain.
« Last Edit: January 06, 2016, 12:22:59 am by abit »
BitShares committee member: abit
BitShares witness: in.abit

iHashFury

  • Guest
@iHashFury some questions:
* did you start it with --replay-blockchain?
* was your witness_node running in gdb? if yes, a backtrace would be helpful.

--replay-blockchain run automatically after the update.
I was not running in gdb.

Offline abit

  • Committee member
  • Hero Member
  • *
  • Posts: 4664
    • View Profile
    • Abit's Hive Blog
  • BitShares: abit
  • GitHub: abitmore
@iHashFury some questions:
* did you start it with --replay-blockchain?
* was your witness_node running in gdb? if yes, a backtrace would be helpful.
BitShares committee member: abit
BitShares witness: in.abit

iHashFury

  • Guest
My witness node froze and maxed out a CPU - Not sure why but there is alot of network_connect_loop action after 2016-01-05T18:20:35 ?

Code: [Select]
2016-01-05T18:48:33 p2p:dump_node_status_task     dump_node_status ] --------- END MEMORY USAGE ------------ node.cpp:4684
2016-01-05T18:48:35 p2p:p2p_network_connect_loop p2p_network_connect_ ] Starting an iteration of p2p_network_connect_loop(). node.cpp:892
2016-01-05T18:48:35 p2p:p2p_network_connect_loop display_current_conn ] Currently have 19 of [20/200] connections

lost a connection

Code: [Select]
2016-01-05T18:48:35 p2p:p2p_network_connect_loop display_current_conn ]    handshaking: [ip-addres-removed]:1776 with 000000000000000000000000000000000000000000000000000000000000000000  [unknown] node.cpp:1743
2016-01-05T18:48:35   p2p:connect_to_task           connect_to ] fatal: error connecting to peer [ip-addres-removed]:1776: 0 exception: unspecified
Connection refused
    {"message":"Connection refused"}
    asio  asio.cpp:60 error_handler peer_connection.cpp:257

Connection refused

Code: [Select]
2016-01-05T18:48:35   p2p:connect_to_task trigger_p2p_network_ ] Triggering connect loop now node.cpp:985
2016-01-05T18:48:35   p2p:connect_to_task schedule_peer_for_de ] scheduling peer for deletion: [ip-addres-removed]:1776 (this will not block) node.cpp:1637
2016-01-05T18:48:35   p2p:connect_to_task schedule_peer_for_de ] asyncing delayed_peer_deletion_task to delete 1 peers node.cpp:1642
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task delayed_peer_deletio ] beginning an iteration of delayed_peer_deletion_task with 1 in queue node.cpp:1601
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task              destroy ] calling close_connection() peer_connection.cpp:124
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task              destroy ] close_connection completed normally peer_connection.cpp:126
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task              destroy ] canceling _send_queued_messages task peer_connection.cpp:139
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task              destroy ] cancel_and_wait completed normally peer_connection.cpp:141
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task              destroy ] canceling accept_or_connect_task peer_connection.cpp:154
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task              destroy ] Unexpected exception from peer_connection's accept_or_connect_task : {"code":0,"name":"exception","message":"unspecified","stack":[{"context":{"level":"error","file":"asio.cpp","line":60,"method":"error_handler","hostname":"","thread_name":"asio","timestamp":"2016-01-05T18:48:35"},"format":"${message} ","data":{"message":"Connection refused"}}]} peer_connection.cpp:160
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task   destroy_connection ] in destroy_connection() for message_oriented_connection.cpp:283
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task   destroy_connection ] in destroy_connection() for message_oriented_connection.cpp:283
2016-01-05T18:48:35 p2p:delayed_peer_deletion_task delayed_peer_deletio ] leaving delayed_peer_deletion_task node.cpp:1604

Not sure what I'm looking for but have the p2p.log if required.

Offline bytemaster

This is very helpful.  We are looking into it closer.
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline alt

  • Hero Member
  • *****
  • Posts: 2821
    • View Profile
  • BitShares: baozi
I have repeat this bug with a debug version, it run from db_update.cpp:286 in a endless loop,
and the order's id is always 0,
Code: [Select]
(gdb) c
Continuing.

Breakpoint 1, graphene::chain::database::clear_expired_orders (this=0x603fe60) at /home/alt/workspace/dac/bitshares-2/libraries/chain/db_update.cpp:290
290              const force_settlement_object& order = *itr;
(gdb) n
291              auto order_id = order.id;
(gdb) print order
$11 = (const graphene::chain::force_settlement_object &) @0xcc9ea00: {<graphene::db::abstract_object<graphene::chain::force_settlement_object>> = {<graphene::db::object> = {_vptr.object = 0x2cce790 <vtable for graphene::chain::force_settlement_object+16>, static space_id = 0 '\000', static type_id = 0 '\000', id = {
        number = 0}}, <No data fields>}, static space_id = 1 '\001', static type_id = 4 '\004', owner = {static space_id = <optimized out>,
    static type_id = <optimized out>, instance = {value = 32183}}, balance = {amount = {value = 14500000}, asset_id = {static space_id = <optimized out>,
      static type_id = <optimized out>, instance = {value = 113}}}, settlement_date = {utc_seconds = 1451910504}}
(gdb) c
Continuing.

Breakpoint 1, graphene::chain::database::clear_expired_orders (this=0x603fe60) at /home/alt/workspace/dac/bitshares-2/libraries/chain/db_update.cpp:290
290              const force_settlement_object& order = *itr;
(gdb) n
291              auto order_id = order.id;
(gdb) print order
$12 = (const graphene::chain::force_settlement_object &) @0xcc9ea00: {<graphene::db::abstract_object<graphene::chain::force_settlement_object>> = {<graphene::db::object> = {_vptr.object = 0x2cce790 <vtable for graphene::chain::force_settlement_object+16>, static space_id = 0 '\000', static type_id = 0 '\000', id = {
        number = 0}}, <No data fields>}, static space_id = 1 '\001', static type_id = 4 '\004', owner = {static space_id = <optimized out>,
    static type_id = <optimized out>, instance = {value = 32183}}, balance = {amount = {value = 14500000}, asset_id = {static space_id = <optimized out>,
      static type_id = <optimized out>, instance = {value = 113}}}, settlement_date = {utc_seconds = 1451910504}}
(gdb)
I think the order's id shouldn't be 0, or it will never execute
Code: [Select]
342          // Match against the least collateralized short until the settlement is finished or we reach max settlements
343          while( settled < max_settlement_volume && find_object(order_id) )                                                                                         
344          {
345             auto itr = call_index.lower_bound(boost::make_tuple(price::min(mia_object.bitasset_data(*this).options.short_backing_asset,
346                                                                            mia_object.get_id())));
347             // There should always be a call order, since asset exists!
348             assert(itr != call_index.end() && itr->debt_type() == mia_object.get_id());
349             asset max_settlement = max_settlement_volume - settled;

@Bytemаsteг what else infomation do you need?I am keep the gdb env
« Last Edit: January 05, 2016, 02:23:00 pm by alt »

Offline abit

  • Committee member
  • Hero Member
  • *
  • Posts: 4664
    • View Profile
    • Abit's Hive Blog
  • BitShares: abit
  • GitHub: abitmore
I've spent a lot of time investigating this and going over every line of code in the clear_expired_orders().  I can't find the problem from inspecting the code, and I do not have any way to reproduce it reliably.  If someone can provide specific steps or a specific datadir to reproduce it, I might be able to make progress in the investigation.
Today Alt and I did some tests with latest Release build and most recent chain data.
* When witness node is properly shut down, and then start without --replay-blockchain, it will get stuck.
* when start with --replay-blockchain, it won't get stuck.
Maybe something wrong with the 'object_database'.

//Edit: tried some more times but the issue doesn't appear again.
« Last Edit: January 04, 2016, 08:19:40 am by abit »
BitShares committee member: abit
BitShares witness: in.abit

Xeldal

  • Guest
Whats interesting is once the hardfork took place.  It started running smoothly again, no hangs at all since then.

I'm guessing, if they're related, we won't see the witness hanging issue again.

Well, looks like I was wrong.  My windows-cli-tools is back to hanging every 20 seconds.

Code: [Select]
300939ms th_a       fork_database.cpp:60          push_block           ] Pushing
 block to fork database that failed to link: 001d519567d4329c3252cb11f240597c931
23de3, 1921429
300939ms th_a       fork_database.cpp:61          push_block           ] Head: 1
921427, 001d5193c4139878e5ab7863158a9aa7042e44ae
300940ms th_a       application.cpp:524           handle_block         ] Error w
hen pushing block:
3080000 unlinkable_block_exception: unlinkable block
block does not link to known chain
    {}
    th_a  fork_database.cpp:81 graphene::chain::fork_database::_push_block

    {"new_block":{"previous":"001d51948178e5a9af007a5dd9727d2d3c50383b","timesta
mp":"2015-12-19T18:04:48","witness":"1.6.13","transaction_merkle_root":"00000000
00000000000000000000000000000000","extensions":[],"witness_signature":"20310dfe1
8f3fc85f6c8419a43c016a843dacd20a6c0ebbc438a8fc1c7bb76ebb666fba65dedeb732478b1238
9cc5353501f8808be92d28beb08badedb67ecff8c","transactions":[]}}
    th_a  db_block.cpp:201 graphene::chain::database::_push_block
« Last Edit: December 19, 2015, 06:06:46 pm by Xeldal »

Offline pc

  • Hero Member
  • *****
  • Posts: 1530
    • View Profile
    • Bitcoin - Perspektive oder Risiko?
  • BitShares: cyrano
I've spent a lot of time investigating this and going over every line of code in the clear_expired_orders().  I can't find the problem from inspecting the code, and I do not have any way to reproduce it reliably.  If someone can provide specific steps or a specific datadir to reproduce it, I might be able to make progress in the investigation.

I'll pm you a download link for a datadir that reliably causes my node to hang while syncing. It gets stuck around this point in the output:

Code: [Select]
3298620ms th_a       db_market.cpp:468             check_call_orders    ] Feed protected margin call executing (HARDFORK_436_TIME not here yet)
3298620ms th_a       db_market.cpp:469             check_call_orders    ] *call_itr: {"id":"1.8.1069","borrower":"1.2.23568","collateral":"12757500000","debt":180102,"call_price":{"base":{"amount":1215000000,"asset_id":"1.3.0"},"quote":{"amount":30017,"asset_id":"1.3.105"}}}
3298621ms th_a       db_market.cpp:470             check_call_orders    ] *limit_itr: {"id":"1.7.12966","expiration":"2020-12-16T11:08:58","seller":"1.2.89808","for_sale":180100,"sell_price":{"base":{"amount":180100,"asset_id":"1.3.105"},"quote":{"amount":"7292609200","asset_id":"1.3.0"}},"deferred_fee":0}
« Last Edit: December 18, 2015, 03:42:04 pm by pc »
Bitcoin - Perspektive oder Risiko? ISBN 978-3-8442-6568-2 http://bitcoin.quisquis.de

Xeldal

  • Guest
I've spent a lot of time investigating this and going over every line of code in the clear_expired_orders().  I can't find the problem from inspecting the code, and I do not have any way to reproduce it reliably.  If someone can provide specific steps or a specific datadir to reproduce it, I might be able to make progress in the investigation.

I have no idea if this is related. Just guessing.

When the 1209 release was first put out. I noticed the windows-cli-toolsx64.exe I was using(from Nov 11th) started acting strange. It would hang for about 20 seconds and then catch-up, over and over again.  Eventually it would just permanently hang and not respond.   -resync and -replay had no effect.  It would immediately start hanging again.  This behavior was the same with every subsequent release upto and including 1216b.   

Whats interesting is once the hardfork took place.  It started running smoothly again, no hangs at all since then.

I'm guessing, if they're related, we won't see the witness hanging issue again. 

Offline theoretical

I've spent a lot of time investigating this and going over every line of code in the clear_expired_orders().  I can't find the problem from inspecting the code, and I do not have any way to reproduce it reliably.  If someone can provide specific steps or a specific datadir to reproduce it, I might be able to make progress in the investigation.
BTS- theoretical / PTS- PZxpdC8RqWsdU3pVJeobZY7JFKVPfNpy5z / BTC- 1NfGejohzoVGffAD1CnCRgo9vApjCU2viY / the delegate formerly known as drltc / Nothing said on these forums is intended to be legally binding / All opinions are my own unless otherwise noted / Take action due to my posts at your own risk