BitShares Forum

Main => General Discussion => Topic started by: puppies on October 02, 2014, 05:33:34 am

Title: segfault in 0.4.19
Post by: puppies on October 02, 2014, 05:33:34 am
My delegate node just ran into a seg fault and I dropped a block.  Unfortunately I was not running in gdb, and the executable was stripped, so I can't get any debug info.  I am currently rebuilding to see if I can reproduce.  Has anyone else run into any problems?

* Edit by Bytemaster
   - this issue has been fixed in the latest toolkit and DAC Sun Limited has been notified that they should merge the fix and provide an update.
Title: Re: segfault in 0.4.19
Post by: Webber on October 02, 2014, 06:25:07 am
I have the same issue, and some other delegates have the same problem. Looks like a serious bug; only 4600 blocks now.
Title: Re: segfault in 0.4.19
Post by: liondani on October 02, 2014, 07:05:42 am
Participation dropped to 88% (red alert),
so I suggest waiting a bit before upgrading until they give further directions (?)
Title: Re: segfault in 0.4.19
Post by: bitcoinerS on October 02, 2014, 07:29:49 am
my node just crashed on 0.4.19

Code: [Select]
>>>
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff87fff700 (LWP 4915)]
std::_Hashtable<bts::net::item_id, bts::net::item_id, std::allocator<bts::net::item_id>, std::__detail::_Identity, std::equal_to<bts::net::item_id>, std::hash<bts::net::item_id>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::find (
    this=this@entry=0x45944, __k=...) at /usr/include/c++/4.8/bits/hashtable.h:1024
1024          std::size_t __n = _M_bucket_index(__k, __code);
(gdb) bt
#0  std::_Hashtable<bts::net::item_id, bts::net::item_id, std::allocator<bts::net::item_id>, std::__detail::_Identity, std::equal_to<bts::net::item_id>, std::hash<bts::net::item_id>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::find (
    this=this@entry=0x45944, __k=...) at /usr/include/c++/4.8/bits/hashtable.h:1024
#1  0x0000000000a4ede5 in find (__x=..., this=0x45944) at /usr/include/c++/4.8/bits/unordered_set.h:517
#2  bts::net::detail::node_impl::process_block_during_normal_operation (this=this@entry=0x5a7c2920, originating_peer=originating_peer@entry=0x7fff659dd210,
    block_message_to_process=..., message_hash=...) at /root/bitsharesx/libraries/net/node.cpp:2828
#3  0x0000000000a5068b in bts::net::detail::node_impl::process_block_message (this=this@entry=0x5a7c2920, originating_peer=originating_peer@entry=0x7fff659dd210,
    message_to_process=..., message_hash=...) at /root/bitsharesx/libraries/net/node.cpp:2880
#4  0x0000000000a51d03 in bts::net::detail::node_impl::on_message (this=0x5a7c2920, originating_peer=0x7fff659dd210, received_message=...)
    at /root/bitsharesx/libraries/net/node.cpp:1598
#5  0x0000000000ac034a in bts::net::detail::message_oriented_connection_impl::read_loop (this=0x7fff65a19490) at /root/bitsharesx/libraries/net/message_oriented_connection.cpp:157
#6  0x0000000000ac271c in operator() (__closure=<optimized out>) at /root/bitsharesx/libraries/net/message_oriented_connection.cpp:100
#7  fc::detail::void_functor_run<bts::net::detail::message_oriented_connection_impl::accept()::__lambda0>::run(void *, void *) (functor=<optimized out>, prom=0x7fff65ab2120)
    at /root/bitsharesx/libraries/fc/include/fc/thread/task.hpp:83
#8  0x00000000006bf323 in fc::task_base::run_impl (this=this@entry=0x7fff65ab2130) at /root/bitsharesx/libraries/fc/src/thread/task.cpp:43
#9  0x00000000006bf9d5 in fc::task_base::run (this=this@entry=0x7fff65ab2130) at /root/bitsharesx/libraries/fc/src/thread/task.cpp:32
#10 0x00000000006bd95b in run_next_task (this=0x7fff7c0008c0) at /root/bitsharesx/libraries/fc/src/thread/thread_d.hpp:415
#11 fc::thread_d::process_tasks (this=this@entry=0x7fff7c0008c0) at /root/bitsharesx/libraries/fc/src/thread/thread_d.hpp:439
#12 0x00000000006bdbe6 in fc::thread_d::start_process_tasks (my=140735273765056) at /root/bitsharesx/libraries/fc/src/thread/thread_d.hpp:395
#13 0x0000000000f4628e in make_fcontext ()
#14 0x00007fff7c0008c0 in ?? ()
#15 0x00007fff7c068be0 in ?? ()
#16 0x0000000000000000 in ?? ()
Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 07:45:48 am
I wanted to update a price feed and it segfaulted :(
luckily it's just a 'control' node and not the delegate ... but the hard fork will happen today around 19:00 UTC .. :-\

- crashing on debian
- crashing on archlinux
- independent of price feed publishing ..
Title: Re: segfault in 0.4.19
Post by: Harvey on October 02, 2014, 08:13:33 am
Mine crashed on 0.4.19 too.
Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 08:40:20 am
Crash Confirmed!
Title: Re: segfault in 0.4.19
Post by: lakerta06 on October 02, 2014, 08:43:24 am
Should we panic?
Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 08:46:25 am
Should we panic?
na ..

delegates can still run version 0.4.18 ....
actually, if the delegates choose not to update to 0.4.19 (which they shouldn't do currently), there will be no hard fork at 640000 :)
Title: Re: segfault in 0.4.19
Post by: svk on October 02, 2014, 09:17:43 am
Same here, three segfaults so far, one on the delegate machine, two on the bitsharesblocks machine..
Title: Re: segfault in 0.4.19
Post by: bitcoinerS on October 02, 2014, 09:20:13 am
0.4.19 is not safe to run. It keeps crashing.  I am switching back to 0.4.18
Title: Re: segfault in 0.4.19
Post by: liondani on October 02, 2014, 09:47:47 am
Should we panic?

(http://media.fakeposters.com/results/2013/03/15/10jhhke9gf.jpg)
Title: Re: segfault in 0.4.19
Post by: serejandmyself on October 02, 2014, 09:54:48 am
mine is crashing out all the time also
Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 10:06:51 am
Btw .. it seems init delegates also crashed ..
Title: Re: segfault in 0.4.19
Post by: svk on October 02, 2014, 10:07:44 am
I've reduced my number of max connections to 10, running OK so far but monitoring it..
Title: Re: segfault in 0.4.19
Post by: xfund on October 02, 2014, 10:17:54 am
I've reduced my number of max connections to 10, running OK so far but monitoring it..

How do I do that?
Title: Re: segfault in 0.4.19
Post by: spartako on October 02, 2014, 10:21:28 am
I'm running with the --accept-incoming-connections 0 flag and
> network_set_advanced_node_parameters {"desired_number_of_connections":20, "maximum_number_of_connections":20}

No crash on 0.4.19 so far after 1 hour... monitoring
Title: Re: segfault in 0.4.19
Post by: svk on October 02, 2014, 10:21:49 am
I've reduced my number of max connections to 10, running OK so far but monitoring it..

How do I do that?

network_set_advanced_node_parameters {"maximum_number_of_connections":10}
Title: Re: segfault in 0.4.19
Post by: xfund on October 02, 2014, 10:23:46 am
Thanks very much +5%
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 11:30:13 am
I suggest monitoring memory frequently.  I have a delegate that got low on memory just running overnight, so I rebooted it.  It hasn't crashed or missed blocks yet, but I would for sure keep a close eye on it.

Edit: Linux command to check memory: free
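For anyone who wants to automate this rather than running "free" by hand, here is a minimal Python sketch. It only reads /proc/meminfo on Linux; the 15% threshold and 60-second interval are arbitrary examples, not anything from the toolkit.

Code: [Select]
#!/usr/bin/env python3
# Minimal sketch: warn when available memory gets low, instead of running
# `free` by hand. Threshold and interval are arbitrary example values.
import time

def meminfo_kb():
    """Parse /proc/meminfo into a dict of {field: value in kB}."""
    values = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, rest = line.split(":", 1)
            values[key] = int(rest.strip().split()[0])
    return values

if __name__ == "__main__":
    while True:
        info = meminfo_kb()
        # MemAvailable needs kernel >= 3.14; fall back to MemFree otherwise
        avail = info.get("MemAvailable", info["MemFree"])
        pct = 100.0 * avail / info["MemTotal"]
        print("available: {} MB ({:.1f}%)".format(avail // 1024, pct))
        if pct < 15.0:
            print("WARNING: low memory -- check your delegate node!")
        time.sleep(60)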

Title: Re: segfault in 0.4.19
Post by: spartako on October 02, 2014, 11:31:55 am
Just crashed on my "seed" node that accepts incoming connections:

Code: [Select]
(wallet closed) >>>
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb3fff700 (LWP 41682)]
0x0000000000a55777 in std::_Hashtable<bts::net::item_id, bts::net::item_id, std::allocator<bts::net::item_id>, std::__detail::_Identity, std::equal_to<bts::net::item_id>, std::hash<bts::net::item_id>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_find_before_node(unsigned long, bts::net::item_id const&, unsigned long) const () at /usr/include/c++/4.8/bits/hashtable.h:1159
1159          __node_base* __prev_p = _M_buckets[__n];
(gdb) bt
#0  0x0000000000a55777 in std::_Hashtable<bts::net::item_id, bts::net::item_id, std::allocator<bts::net::item_id>, std::__detail::_Identity, std::equal_to<bts::net::item_id>, std::hash<bts::net::item_id>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_find_before_node(unsigned long, bts::net::item_id const&, unsigned long) const () at /usr/include/c++/4.8/bits/hashtable.h:1159
#1  0x0000000000a55840 in std::_Hashtable<bts::net::item_id, bts::net::item_id, std::allocator<bts::net::item_id>, std::__detail::_Identity, std::equal_to<bts::net::item_id>, std::hash<bts::net::item_id>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::find(bts::net::item_id const&) ()
    at /usr/include/c++/4.8/bits/hashtable.h:604
#2  0x0000000000a3dbc3 in bts::net::detail::node_impl::process_block_during_normal_operation(bts::net::peer_connection*, bts::client::block_message const&, fc::ripemd160 const&) ()
    at /usr/include/c++/4.8/bits/unordered_set.h:517
#3  0x0000000000a3f48b in bts::net::detail::node_impl::process_block_message(bts::net::peer_connection*, bts::net::message const&, fc::ripemd160 const&) ()
    at /home/fabiux/ssd/bitsharesx/libraries/net/node.cpp:2880
#4  0x0000000000a40bd3 in bts::net::detail::node_impl::on_message(bts::net::peer_connection*, bts::net::message const&) () at /home/fabiux/ssd/bitsharesx/libraries/net/node.cpp:1598
#5  0x0000000000aaef3a in bts::net::detail::message_oriented_connection_impl::read_loop() () at /home/fabiux/ssd/bitsharesx/libraries/net/message_oriented_connection.cpp:157
#6  0x0000000000ab130c in fc::detail::void_functor_run<bts::net::detail::message_oriented_connection_impl::accept()::{lambda()#1}>::run(void*, fc::detail::void_functor_run<bts::net::detail::message_oriented_connection_impl::accept()::{lambda()#1}>) () at /home/fabiux/ssd/bitsharesx/libraries/net/message_oriented_connection.cpp:100
#7  0x00000000006adc53 in fc::task_base::run_impl() () at /home/fabiux/ssd/bitsharesx/libraries/fc/src/thread/task.cpp:43
#8  0x00000000006ac47b in fc::thread_d::process_tasks() () at /home/fabiux/ssd/bitsharesx/libraries/fc/src/thread/thread_d.hpp:415
#9  0x00000000006ac716 in fc::thread_d::start_process_tasks(long) () at /home/fabiux/ssd/bitsharesx/libraries/fc/src/thread/thread_d.hpp:395
#10 0x0000000000f1f9ce in make_fcontext ()
#11 0x00007fffa80008c0 in ?? ()
#12 0x00007fff8d02a530 in ?? ()
#13 0x0000000000000000 in ?? ()
(gdb)

My delegate node (no incoming connections and a limit of 20 connections) is OK so far.
Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 12:04:09 pm
I run the client with

 --server --accept-incoming-connections 0 --max-connections 10

however .. I eventually lose ALL connections .. :(
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 12:12:33 pm
I'm running under this setting:

Code: [Select]
network_set_advanced_node_parameters { "peer_connection_retry_timeout": 10, "desired_number_of_connections": 10, "maximum_number_of_connections": 10 }
if that helps anyone.

Title: Re: segfault in 0.4.19
Post by: cube on October 02, 2014, 12:39:25 pm
I have been running 0.4.19 for a few hours now.  So far it has been stable. No crash.
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 12:44:03 pm
Don't upgrade to .19.  Clearly it has issues.   
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 12:48:20 pm
Don't upgrade to .19.  Clearly it has issues.

What if you already did and it's working?
Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 12:48:36 pm
Don't upgrade to .19.  Clearly it has issues.
Should we use RC1 ? It seems stable.
Title: Re: segfault in 0.4.19
Post by: serejandmyself on October 02, 2014, 12:49:03 pm
Don't upgrade to .19.  Clearly it has issues.

What if you already did and it's working?

or doesn't?  :)
Title: Re: segfault in 0.4.19
Post by: wackou on October 02, 2014, 12:59:50 pm
Don't upgrade to .19.  Clearly it has issues.

any advice on how to downgrade smoothly? Should we just relaunch 0.4.18 on an empty DB and reimport keys? I imagine that as everything is already re-indexed for 0.4.19 there is no way to downgrade this to 0.4.18 directly without wiping some indices first, right?
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 01:02:27 pm
Don't upgrade to .19.  Clearly it has issues.

any advice on how to downgrade smoothly? Should we just relaunch 0.4.18 on an empty DB and reimport keys? I imagine that as everything is already re-indexed for 0.4.19 there is no way to downgrade this to 0.4.18 directly without wiping some indices first, right?

I would suggest that if it is working for you, you should not have to downgrade but hopefully BM will weigh in.
Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 01:06:20 pm
Don't upgrade to .19.  Clearly it has issues.

any advice on how to downgrade smoothly? Should we just relaunch 0.4.18 on an empty DB and reimport keys? I imagine that as everything is already re-indexed for 0.4.19 there is no way to downgrade this to 0.4.18 directly without wiping some indices first, right?

I would suggest that if it is working for you, you should not have to downgrade but hopefully BM will weigh in.

All delegates should be on the same version. Unless we want forks.
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 01:07:32 pm
Don't upgrade to .19.  Clearly it has issues.

any advice on how to downgrade smoothly? Should we just relaunch 0.4.18 on an empty DB and reimport keys? I imagine that as everything is already re-indexed for 0.4.19 there is no way to downgrade this to 0.4.18 directly without wiping some indices first, right?

I would suggest that if it is working for you, you should not have to downgrade but hopefully BM will weigh in.

All delegates should be on the same version. Unless we want forks.

Can't they delay planned forks?  Not sure if that is what you are referring to or not.  I only notice that when we do upgrades there are days when people are on different versions.
Title: Re: segfault in 0.4.19
Post by: wackou on October 02, 2014, 01:09:54 pm
I would suggest that if it is working for you, you should not have to downgrade but hopefully BM will weigh in.

well, 3 crashes in less than 1 hour, reported here: https://github.com/BitShares/bitshares_toolkit/issues/834

good thing is, it helps me debug my monitoring script  ;)
Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 01:11:42 pm
Don't upgrade to .19.  Clearly it has issues.

any advice on how to downgrade smoothly? Should we just relaunch 0.4.18 on an empty DB and reimport keys? I imagine that as everything is already re-indexed for 0.4.19 there is no way to downgrade this to 0.4.18 directly without wiping some indices first, right?

I would suggest that if it is working for you, you should not have to downgrade but hopefully BM will weigh in.

All delegates should be on the same version. Unless we want forks.

Can't they delay planned forks ?  Not sure if that is what you are referring to or not.  I only notice that when we do upgrades their are days when people are on different versions.
Forks are hardcoded in the code at a specific block number. The current v0.4.19 fork is scheduled for block 640000, so all v0.4.19 nodes will fork at that block. Previous versions might not accept blocks produced by v0.4.19. That is why either all (or most) of the delegates should upgrade by block 640000, or none (or very few) of them should. Note that v0.4.19-RC1 is a different version -> it may not accept blocks produced by either v0.4.19 or v0.4.18.
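To make the mechanism concrete, here is a minimal sketch (purely illustrative, not the toolkit's actual code; the constant and function names are made up) of a rule change gated on a hardcoded block number:

Code: [Select]
# Illustrative sketch only, not BitShares toolkit source. It shows the general
# pattern of a hard fork pinned to a hardcoded block number, as described above.
FORK_BLOCK_NUM = 640000  # height at which the new rules activate

def rules_version_for_block(block_num):
    """Rule set 1 before the fork height, rule set 2 from the fork height on."""
    return 2 if block_num >= FORK_BLOCK_NUM else 1

def node_accepts_block(node_max_rules_version, block_num):
    """An old client that only knows rule set 1 cannot validate rule-set-2
    blocks and rejects them, which is how the chain forks."""
    return rules_version_for_block(block_num) <= node_max_rules_version

# A v0.4.18-style node (rule set 1) vs. blocks around the fork height:
print(node_accepts_block(1, 639999))  # True  - pre-fork block is accepted
print(node_accepts_block(1, 640000))  # False - post-fork block is rejected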
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 01:15:35 pm
Don't upgrade to .19.  Clearly it has issues.

any advice on how to downgrade smoothly? Should we just relaunch 0.4.18 on an empty DB and reimport keys? I imagine that as everything is already re-indexed for 0.4.19 there is no way to downgrade this to 0.4.18 directly without wiping some indices first, right?

I would suggest that if it is working for you, you should not have to downgrade but hopefully BM will weigh in.

All delegates should be on the same version. Unless we want forks.

Can't they delay planned forks ?  Not sure if that is what you are referring to or not.  I only notice that when we do upgrades their are days when people are on different versions.
Forks are hardcoded in the code for block number. Current v0.4.19 fork is scheduled for block 640000. All v0.4.19 will fork at that block. Any previous versions might not accept blocks produced by v0.4.19. That is why either all (most) of the delegates upgrade by block 640000 or none (very few) of them upgrade. Note that v0.4.19-RC1 is different version  -> it should not accept blocks by both v0.4.19 and v0.4.18.

ugh, looks like we're "f--ked"...:( Seems like approx. 68 active delegates are on 0.4.19.
Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 01:17:21 pm
640000 hasn't come yet (current block is ~637725). Currently all delegates are on the same fork and there are no issues (except the segfault).
By block 640000 all(most) delegates should be on the same version.
Title: Re: segfault in 0.4.19
Post by: bitcoinerS on October 02, 2014, 01:18:31 pm
Don't upgrade to .19.  Clearly it has issues.

any advice on how to downgrade smoothly? Should we just relaunch 0.4.18 on an empty DB and reimport keys? I imagine that as everything is already re-indexed for 0.4.19 there is no way to downgrade this to 0.4.18 directly without wiping some indices first, right?

I downgraded to 0.4.18, but had to delete ~/.BitSharesX/chain to get it working again. 
Title: Re: segfault in 0.4.19
Post by: bitder on October 02, 2014, 01:24:07 pm
I've had a single segfault but 0.4.19 has been stable otherwise.
Since the majority has already upgraded to 0.4.19, it's probably easier to commit to it. The segfault is a hassle, but it seems to be harmless otherwise.
The only problem is restarting your delegate after a segfault, but that can be done automatically with HackFisher's expect script (or a slightly modified version of it).
This way we can just wait for patch releases of 0.4.19 as they are made available.
Thoughts?

https://github.com/Bitsuperlab/operation_tools.git
restart/run_wallet.exp
Code: [Select]
#!/usr/bin/expect -f

set timeout -1
set default_port 1776
set port $default_port

### change wallet_name here
set wallet_name "delegate"
send_user "wallet name is: $wallet_name\n"
send_user "wallet passphrase: "
stty -echo
expect_user -re "(.*)\n"
stty echo
set wallet_pass $expect_out(1,string)

proc run_wallet {} {
    global wallet_name wallet_pass default_port port
    ### change command line here
    spawn ./bitshares_client --data-dir=delegate --p2p-port $port --server --httpport 9989 --rpcuser user --rpcpassword pass

    expect -exact "(wallet closed) >>> "
    send -- "about\r"
    expect -exact "(wallet closed) >>> "
    send -- "wallet_open $wallet_name\r"
    expect -exact "$wallet_name (locked) >>> "
    send -- "wallet_unlock 99999999\r"
    expect -exact "passphrase: "
    send -- "$wallet_pass\r"
    expect -exact "$wallet_name (unlocked) >>> "
    send -- "wallet_delegate_set_block_production ALL true\r"
    expect -exact "$wallet_name (unlocked) >>> "
    send -- "info\r"
    expect -exact "$wallet_name (unlocked) >>> "
    send -- "wallet_list_my_accounts\r"
    interact
    wait

    ### alternate the p2p port between default_port and default_port+1 on each restart
    if { $port == $default_port } {
        set port [expr $port+1]
    } else {
        set port [expr $port-1]
    }
}

while true {
  run_wallet
}

Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 01:28:09 pm
Updating to v0.4.19 seems reasonable to me. Segfault by itself isn't that much of a deal.
However bytemaster recommended not to update. Maybe he has other reasons?
There is still plenty of time for an informed decision. I suggest waiting for bytemaster's input and update to v0.4.19 as default option (if he is silent about this).
Title: Re: segfault in 0.4.19
Post by: xfund on October 02, 2014, 01:33:57 pm
I think we should all run:
network_set_advanced_node_parameters {"maximum_number_of_connections":10}
and wait for v0.4.20
Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 01:34:20 pm
BTW:
Code: [Select]
$ python3 timeofblock.py 640000
block 640000 to appear in <= 6:02:20
UTC time: 2014-10-02 19:36:10

6 hours left
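For anyone who doesn't have that script, a rough equivalent is easy to sketch. This is only a guess at what timeofblock.py does, assuming the nominal 10-second block interval and ignoring missed slots; unlike the original, it takes the current block number as a second argument instead of querying the client:

Code: [Select]
# Rough estimate of when a future block will appear, e.g.:
#   python3 eta.py 640000 637725
# Assumes a nominal 10-second block interval and ignores missed slots,
# so treat the result as an estimate only.
import sys
from datetime import datetime, timedelta

BLOCK_INTERVAL_SEC = 10  # assumed nominal interval

def eta_for_block(target_block, current_block):
    remaining = max(target_block - current_block, 0)
    return timedelta(seconds=remaining * BLOCK_INTERVAL_SEC)

if __name__ == "__main__":
    target, current = int(sys.argv[1]), int(sys.argv[2])
    eta = eta_for_block(target, current)
    print("block {} to appear in <= {}".format(target, eta))
    print("UTC time: {}".format((datetime.utcnow() + eta).strftime("%Y-%m-%d %H:%M:%S")))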
Title: Re: segfault in 0.4.19
Post by: liondani on October 02, 2014, 01:43:25 pm
It was clear enough...  everybody back to v0.4.18.  Anybody not doing it will hurt their reliability statistics...  there is enough time for downgrading... delete your hidden directory ".BitSharesX" as well. After all, v0.4.18 is proven stable, at least since we have all been running it for a few days...

Sent from my ALCATEL ONE TOUCH 997D

Title: Re: segfault in 0.4.19
Post by: svk on October 02, 2014, 01:46:51 pm
Mine's been stable since the initial segfault, running v0.4.19 with 8 connections max.

I think it'll be hard to get everyone over on .18 in time, but I suggest those who want to do so re-publish their version so we can monitor the number of delegates on each side. Currently .19 is in the majority, but I'm ready to switch over if necessary.
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 01:47:48 pm
Updating to v0.4.19 seems reasonable to me. Segfault by itself isn't that much of a deal.
However bytemaster recommended not to update. Maybe he has other reasons?
There is still plenty of time for an informed decision. I suggest waiting for bytemaster's input and update to v0.4.19 as default option (if he is silent about this).

I agree. everyone downgrading would be very unpleasant.
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 01:50:15 pm
Average confirmation time is 5 secs.  That's good.
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 01:59:30 pm
Updating to v0.4.19 seems reasonable to me. Segfault by itself isn't that much of a deal.
However bytemaster recommended not to update. Maybe he has other reasons?
There is still plenty of time for an informed decision. I suggest waiting for bytemaster's input and update to v0.4.19 as default option (if he is silent about this).

I agree. everyone downgrading would be very unpleasant.

There are no blockchain-level bugs... we suspect that it may be low-memory machines that are crashing, which is why we did not experience crashes in our own tests. 
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 02:01:45 pm
Because downgrading is so unpleasant and we want the blockchain updates ASAP:

Dan & Eric are looking into the crash, but could those of you who are experiencing it please report the specs of the machine you are running on?
Let's update to 0.4.19...

Title: Re: segfault in 0.4.19
Post by: liondani on October 02, 2014, 02:01:48 pm
That's a good example of why a hard fork should be scheduled more blocks away from the initial announcement time ...

Sent from my ALCATEL ONE TOUCH 997D

Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 02:02:45 pm
hmmm .. 2GB is already low memory .. oha

three machines
XEN DomU - 2GB RAM - debian/archlinux - 64 bit
Title: Re: segfault in 0.4.19
Post by: xfund on October 02, 2014, 02:05:18 pm
0.4.19 ok
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 02:07:15 pm
hmmm .. 2GB is already low memory .. oha

Yes, it is, because we haven't taken the time to optimize memory usage. 

We are keeping large portions of the blockchain database *IN RAM* for performance reasons and will probably continue to keep it in RAM long term.  Long term, lightweight clients will not have the full chain and will therefore have a much smaller memory footprint, while delegates can run machines with 128 GB of memory.  This will allow us to grow transaction volume and keep block latencies low.
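A minimal sketch of the general pattern being described here -- a read-through in-memory cache in front of a disk-backed store -- purely for illustration; the toolkit's actual database layer is different and all names below are made up:

Code: [Select]
# Illustration of keeping hot blockchain state in RAM in front of a slower
# disk-backed store. Not the toolkit's real code; names are invented.
import shelve

class CachedChainDB:
    def __init__(self, path):
        self._disk = shelve.open(path)   # slow, persistent store
        self._ram = {}                   # hot entries kept in memory

    def put(self, key, value):
        self._ram[key] = value
        self._disk[key] = value

    def get(self, key):
        if key in self._ram:             # fast path: served from RAM
            return self._ram[key]
        value = self._disk[key]          # slow path: disk read
        self._ram[key] = value           # promote into the RAM cache
        return value

    def close(self):
        self._disk.close()

# db = CachedChainDB("chain_index.db")
# db.put("block:640000", {"delegate": "init0"})
# print(db.get("block:640000"))

The trade-off, as the post above notes, is memory footprint: the more of the chain is kept in the in-memory map, the more RAM the process needs.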
Title: Re: segfault in 0.4.19
Post by: liondani on October 02, 2014, 02:08:30 pm


Lets update to 0.4.19...

It will be an exciting night in Greece again! Another pizza session will not hurt me, I guess  :)


Sent from my ALCATEL ONE TOUCH 997D

Title: Re: segfault in 0.4.19
Post by: Xeldal on October 02, 2014, 02:11:13 pm
experienced segfault
4GB RAM 
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 02:13:04 pm
experienced segfault
4GB RAM

So much for that theory... 4 GB should be more than enough.  What OS?
Title: Re: segfault in 0.4.19
Post by: svk on October 02, 2014, 02:13:40 pm
Experienced segfaults on two different VPS: 1 GB and 2 GB of memory with 4 GB of swap on both. Ubuntu 14.04
Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 02:14:16 pm
The segfaults I experienced were on the testing machine.
What I observe is about 2 GB of memory consumption with v0.4.19 (RC1 used about 800 MB).
My primary configurations haven't crashed so far.

UPDATE: Increased memory consumption might be related to DB reindexing. If you restart the client after reindexing is completed, memory consumption goes back to normal.
UPDATE2: The main system crashed as well.
Title: Re: segfault in 0.4.19
Post by: Xeldal on October 02, 2014, 02:14:31 pm
Ubuntu Linux 14.04 x64
Title: Re: segfault in 0.4.19
Post by: liondani on October 02, 2014, 02:15:04 pm


" ...and delegates can run machines with 128 GB of memory."

128 GB  ?

Sent from my ALCATEL ONE TOUCH 997D

Title: Re: segfault in 0.4.19
Post by: Xeldal on October 02, 2014, 02:18:25 pm
Code: [Select]
# free
             total       used       free     shared    buffers     cached
Mem:       4048312    3068232     980080        332     152304    2587972
-/+ buffers/cache:     327956    3720356
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 02:18:39 pm


" ...and delegates can run machines with 128 GB of memory."

128 GB  ?

Sent from my ALCATEL ONE TOUCH 997D

This is years from now... when we are processing 1000 transactions per second.
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 02:19:00 pm
It is not a memory issue, we have fixed the crash and notified DAC Sun.

Title: Re: segfault in 0.4.19
Post by: crazybit on October 02, 2014, 02:19:39 pm
Because downgrading is so unpleasant and we want the blockchain updates asap 

Dan & Eric are looking into the crash, but those that are experiencing it could you please report the specs of the machine you are running on?
Lets update to 0.4.19...

My delegate server is under real-time monitoring. The RAM had NOT reached 80% before the client crashed, so it may not be due to the RAM size.
Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 02:21:18 pm
My delegate server is under real-time monitoring. The RAM had NOT reached 80% before the client crashed, so it may not be due to the RAM size.
What software do you use for this? I haven't gotten around to monitoring my delegate machine(s) in more detail.
Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 02:21:48 pm
It is not a memory issue, we have fixed the crash and notified DAC Sun.
That is good news .. I was almost dreading my weekend hiking tour for fear of segfaults :)
Title: Re: segfault in 0.4.19
Post by: liondani on October 02, 2014, 02:22:32 pm
It is not a memory issue, we have fixed the crash and notified DAC Sun.
read again folks  :)

Sent from my ALCATEL ONE TOUCH 997D

Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 02:24:12 pm
UPDATE: Increased memory consumption might be related to DB reindexing. If you restart the client after reindexing is completed, memory consumption goes back to normal.
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 02:28:27 pm
UPDATE: Increased memory consumption might be related to DB reindexing. If you restart the client after reindexing is completed, memory consumption goes back to normal.

Yes... this is true.  Reindexing fills all of the database caches.
Title: Re: segfault in 0.4.19
Post by: GaltReport on October 02, 2014, 02:29:40 pm
UPDATE: Increased memory consumption might be related to DB reindexing. If you restart the client after reindexing is completed, memory consumption goes back to normal.

 +5% - matches my experience. After the upgrade (I kept it running), free memory was low the next morning.  I rebooted today and memory usage has not been high since.
Title: Re: segfault in 0.4.19
Post by: crazybit on October 02, 2014, 02:30:15 pm
My delegate server is under real-time monitoring. The RAM had NOT reached 80% before the client crashed, so it may not be due to the RAM size.
what software do you use for this? Couldn't get started monitoring my delegate machine(s) in more details

The Alibaba VPS provides a monitoring service; I can monitor CPU, RAM, disk, P2P connections, etc. in real time, and I will be notified if any of the statistics is abnormal.
Title: Re: segfault in 0.4.19
Post by: bitrose on October 02, 2014, 02:32:50 pm
my cloud is fine
Title: Re: segfault in 0.4.19
Post by: emski on October 02, 2014, 02:35:48 pm
Do we expect increased CPU time consumption?
I see some higher values.
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 02:44:36 pm
Do we expect increased CPU time consumption?
I see some higher values.

We are auditing all resource usage right now.
Title: Re: segfault in 0.4.19
Post by: wackou on October 02, 2014, 02:49:20 pm
It is not a memory issue, we have fixed the crash and notified DAC Sun.

Is it wise to upgrade already to master or should we wait for an announcement from DAC Sun?
Title: Re: segfault in 0.4.19
Post by: sfinder on October 02, 2014, 02:52:10 pm
Looks like mine is fine on a cloud machine with 4 GB RAM.

Do we expect increased CPU time consumption?
I see some higher values.

We are auditing all resource usage right now.
Title: Re: segfault in 0.4.19
Post by: bytemaster on October 02, 2014, 02:55:04 pm
It is not a memory issue, we have fixed the crash and notified DAC Sun.

Is it wise to upgrade already to master or should we wait for an announcement from DAC Sun?

It hasn't been pushed to master yet.
Title: Re: segfault in 0.4.19
Post by: serejandmyself on October 02, 2014, 03:47:13 pm
seems to work now....
Title: Re: segfault in 0.4.19
Post by: vikram on October 02, 2014, 03:54:02 pm
0.4.20 should hopefully fix the problem: https://bitsharestalk.org/index.php?topic=7067.msg124598#msg124598
Title: Re: segfault in 0.4.19
Post by: CoinHoarder on October 02, 2014, 05:33:14 pm
I have a job interview in a bit, and I just saw we need to upgrade. I will do so ASAP when I get home.

Just FYI in case I can't make it.
Title: Re: segfault in 0.4.19
Post by: xeroc on October 02, 2014, 05:40:31 pm
no issues here yet ...
Title: Re: segfault in 0.4.19
Post by: wackou on October 02, 2014, 06:15:33 pm
0.4.20 seems to hold up fine. Great job on the quick fix, thanks! I will be able to sleep tight tonight  :P
Title: Re: segfault in 0.4.19
Post by: cube on October 02, 2014, 07:23:08 pm
0.4.20 seems to hold up fine. Great job on the quick fix, thanks! I will be able to sleep tight tonight  :P

Just updated to 0.4.20.  It has been stable.  Thanks for the fast response.
Title: Re: segfault in 0.4.19
Post by: CoinHoarder on October 02, 2014, 10:49:20 pm
I have a job interview in a bit, and I just saw we need to upgrade. I will do so ASAP when I get home.

Just FYI in case I can't make it.

Ok updated to 0.4.20