Author Topic: connect hang up again  (Read 2514 times)


Offline xeroc

  • Board Moderator
  • Hero Member
  • *****
  • Posts: 12922
  • ChainSquad GmbH
    • View Profile
    • ChainSquad GmbH
  • BitShares: xeroc
  • GitHub: xeroc

Offline bytemaster

Boost ASIO is handling that level for us so we don't directly manipulate the socket state like that.
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline alt

  • Hero Member
  • *****
  • Posts: 2821
    • View Profile
  • BitShares: baozi
I found something that may help:
http://stackoverflow.com/questions/17009280/why-always-5-connections-with-no-program-attached
Quote
Update 2013-06-08: After upgrading the system to CentOS 6.4, the same problem occurs. Finally I returned to epoll, and found this page saying that set listen fd to be non-blocking and accept till EAGAIN or EWOULDBLOCK error returns. And yes, it works. No more connections are pending. But why is that? The Unix Network Programming Volume 1 says

accept is called by a TCP server to return the next completed connection from the
front of the completed connection queue. If the completed connection queue is empty,
the process is put to sleep (assuming the default of a blocking socket).

So if there are still some completed connections in the queue, why the process is put to sleep?

Update 2013-7-1: I use EPOLLET when adding the listening socket, so I can't accept all if not keeping accept till EAGAIN encountered. I just realized this problem. My fault. Remember: always read or accept till EAGAIN comes out if using EPOLLET, even if it is listening socket. Thanks again to Matthew for proving me with a testing program.
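
To illustrate the rule in the quoted update, here is a minimal Python sketch of "accept until EAGAIN" on an edge-triggered listening socket. It is not the BitShares or Boost.Asio code; the helper names (make_listener, drain_accept_queue, serve_once) are made up for this example, and it assumes Linux, since select.epoll only exists there.

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0, backlog=16):
    """Create a non-blocking TCP listening socket (port 0 = any free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(backlog)
    listener.setblocking(False)  # required so accept() can report EAGAIN
    return listener


def drain_accept_queue(listener):
    """Accept every completed connection until the kernel reports EAGAIN."""
    accepted = []
    while True:
        try:
            conn, _addr = listener.accept()
        except BlockingIOError:  # EAGAIN / EWOULDBLOCK: the queue is empty
            return accepted
        accepted.append(conn)


def serve_once(listener, timeout=1.0):
    """Wait for one edge-triggered event, then drain the whole queue."""
    ep = select.epoll()
    try:
        # EPOLLET: one notification may stand for several queued connections,
        # so a single accept() per event can leave connections pending.
        ep.register(listener.fileno(), select.EPOLLIN | select.EPOLLET)
        if ep.poll(timeout):
            return drain_accept_queue(listener)
        return []
    finally:
        ep.close()
```

With level-triggered epoll (no EPOLLET) a single accept() per event would also work, because the event is reported again while the queue is non-empty.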

Offline bytemaster

It seems to be locked on a mutex here:
Code: [Select]
(gdb) thread 4
[Switching to thread 4 (Thread 0x7f132effd700 (LWP 10132))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x000000000068b38c in wait<boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex> > (lock=..., this=0x7f132effce30)
    at /usr/include/boost/asio/detail/posix_event.hpp:80
#2  boost::asio::detail::task_io_service::do_run_one (this=this@entry=0x7f13200012d0, lock=..., this_thread=..., ec=...)
    at /usr/include/boost/asio/detail/impl/task_io_service.ipp:395
#3  0x000000000068c621 in boost::asio::detail::task_io_service::run (this=0x7f13200012d0, ec=...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#4  0x000000000068c816 in run (this=0x7f1320000e40) at /usr/include/boost/asio/impl/io_service.ipp:59
#5  operator() (this=<optimized out>) at /usr/include/boost/asio/detail/impl/resolver_service_base.ipp:32
#6  boost::asio::detail::posix_thread::func<boost::asio::detail::resolver_service_base::work_io_service_runner>::run (this=<optimized out>)
    at /usr/include/boost/asio/detail/posix_thread.hpp:82
#7  0x0000000000689f42 in boost::asio::detail::boost_asio_detail_posix_thread_function (arg=0x7f1320000ee0) at /usr/include/boost/asio/detail/impl/posix_thread.ipp:64
#8  0x00007f134dc56182 in start_thread (arg=0x7f132effd700) at pthread_create.c:312
#9  0x00007f134cd5730d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

That thread is blocked on a condition wait, waiting for a notification from the OS.  That is a normal state when the system is idle.
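
As a rough analogy only (this is not how Boost.Asio is implemented, and the IdleWorker class is invented for this example), here is a small Python sketch of a worker thread that parks on a condition variable whenever it has nothing to do. A debugger attached while the queue is empty would show the thread sitting in the wait call, much like the pthread_cond_wait frame above.

```python
import threading
from collections import deque


class IdleWorker:
    """A single worker thread that sleeps on a condition variable when idle."""

    def __init__(self):
        self._cond = threading.Condition()
        self._queue = deque()
        self._closed = False
        self.results = []
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        while True:
            with self._cond:
                # With nothing queued the thread blocks here until notified.
                while not self._queue and not self._closed:
                    self._cond.wait()
                if not self._queue:
                    return  # closed and fully drained
                job = self._queue.popleft()
            # Run the job outside the lock so post() is never blocked by it.
            self.results.append(job())

    def post(self, job):
        """Queue a callable and wake the worker."""
        with self._cond:
            self._queue.append(job)
            self._cond.notify()

    def close(self):
        """Let the worker finish queued jobs, then stop and join it."""
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._thread.join()
```

An idle wait like this only says that the thread has no work. It does not by itself show whether work should have been delivered to it.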

Offline alt

It seems to be locked on a mutex here:
Code: [Select]
(gdb) thread 4
[Switching to thread 4 (Thread 0x7f132effd700 (LWP 10132))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x000000000068b38c in wait<boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex> > (lock=..., this=0x7f132effce30)
    at /usr/include/boost/asio/detail/posix_event.hpp:80
#2  boost::asio::detail::task_io_service::do_run_one (this=this@entry=0x7f13200012d0, lock=..., this_thread=..., ec=...)
    at /usr/include/boost/asio/detail/impl/task_io_service.ipp:395
#3  0x000000000068c621 in boost::asio::detail::task_io_service::run (this=0x7f13200012d0, ec=...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#4  0x000000000068c816 in run (this=0x7f1320000e40) at /usr/include/boost/asio/impl/io_service.ipp:59
#5  operator() (this=<optimized out>) at /usr/include/boost/asio/detail/impl/resolver_service_base.ipp:32
#6  boost::asio::detail::posix_thread::func<boost::asio::detail::resolver_service_base::work_io_service_runner>::run (this=<optimized out>)
    at /usr/include/boost/asio/detail/posix_thread.hpp:82
#7  0x0000000000689f42 in boost::asio::detail::boost_asio_detail_posix_thread_function (arg=0x7f1320000ee0) at /usr/include/boost/asio/detail/impl/posix_thread.ipp:64
#8  0x00007f134dc56182 in start_thread (arg=0x7f132effd700) at pthread_create.c:312
#9  0x00007f134cd5730d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Offline bytemaster

Dan N & Eric are actively working on this particular issue.  I have sent them your post for review.

Offline alt

Sometimes the client simply stops accepting connections from the p2p network,
and the connection count drops to 0.
I have posted about this here: https://bitsharestalk.org/index.php?topic=5523.msg75899#msg75899
Today I ran into this again. No new connection requests are being serviced; their state is always SYN_RECV.
Code: [Select]
tcp        0      0 106.185.26.162:1982     71.202.130.94:65418     SYN_RECV    -               
tcp        0      0 106.185.26.162:1982     84.227.113.44:50339     SYN_RECV    -               
tcp        0      0 106.185.26.162:1982     54.77.78.162:1776       SYN_RECV    -               
....
tcp       33      0 106.185.26.162:1982     123.6.148.246:51793     ESTABLISHED -               
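
For anyone who wants to watch for this condition, here is a small Python sketch that tallies the State column of netstat-style lines like the ones above. The function name count_tcp_states is made up, and the column layout (Proto, Recv-Q, Send-Q, local address, foreign address, State) is assumed from the paste.

```python
from collections import Counter


def count_tcp_states(netstat_text):
    """Tally the State column of netstat-style TCP lines."""
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local Foreign State [PID/Program]
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# The lines pasted above; the elided "...." row is skipped by the parser.
SAMPLE = """\
tcp        0      0 106.185.26.162:1982     71.202.130.94:65418     SYN_RECV    -
tcp        0      0 106.185.26.162:1982     84.227.113.44:50339     SYN_RECV    -
tcp        0      0 106.185.26.162:1982     54.77.78.162:1776       SYN_RECV    -
....
tcp       33      0 106.185.26.162:1982     123.6.148.246:51793     ESTABLISHED -
"""

if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE).items():
        print(state, count)
```

A count of SYN_RECV entries that keeps growing while ESTABLISHED stays flat would match the symptom described here, though it does not say where the fault is.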
I am keeping the process running for now. Here is some output from gdb.
Let me know if you want more information.
Code: [Select]
(gdb) info threads
  Id   Target Id         Frame
  12   Thread 0x7f134ca32700 (LWP 10124) "bitshares_clien" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  11   Thread 0x7f1347b93700 (LWP 10125) "bitshares_clien" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  10   Thread 0x7f1347392700 (LWP 10126) "bitshares_clien" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  9    Thread 0x7f1346b91700 (LWP 10127) "bitshares_clien" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  8    Thread 0x7f1346042700 (LWP 10128) "bitshares_clien" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  7    Thread 0x7f1344f20700 (LWP 10129) "bitshares_clien" 0x00007f134cd579a3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
  6    Thread 0x7f132ffff700 (LWP 10130) "bitshares_clien" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  5    Thread 0x7f132f7fe700 (LWP 10131) "bitshares_clien" pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  4    Thread 0x7f132effd700 (LWP 10132) "bitshares_clien" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  3    Thread 0x7f132e7fc700 (LWP 10157) "bitshares_clien" 0x00007f134dc5d3bd in read () at ../sysdeps/unix/syscall-template.S:81
  2    Thread 0x7f131ffff700 (LWP 10394) "bitshares_clien" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
* 1    Thread 0x7f134e686780 (LWP 10120) "bitshares_clien" pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f1344f20700 (LWP 10129))]
#0  0x00007f134cd579a3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  0x00007f134cd579a3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000000068abf5 in boost::asio::detail::epoll_reactor::run (this=0x2416ee0, block=<optimized out>, ops=...)
    at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:392
#2  0x000000000068b2dd in boost::asio::detail::task_io_service::do_run_one (this=this@entry=0x264c9a0, lock=..., this_thread=..., ec=...)
    at /usr/include/boost/asio/detail/impl/task_io_service.ipp:368
#3  0x000000000068c621 in boost::asio::detail::task_io_service::run (this=0x264c9a0, ec=...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#4  0x000000000068c7ab in run (this=0x25f1000) at /usr/include/boost/asio/impl/io_service.ipp:59
#5  operator() (__closure=<optimized out>) at /home/alt/workspace/bitsharesx/libraries/fc/src/asio.cpp:102
#6  boost::detail::thread_data<fc::asio::default_io_service_scope::default_io_service_scope()::{lambda()#1}>::run() (this=<optimized out>)
    at /usr/include/boost/thread/detail/thread.hpp:117
#7  0x0000000000c7c4ba in thread_proxy ()
#8  0x00007f134dc56182 in start_thread (arg=0x7f1344f20700) at pthread_create.c:312
#9  0x00007f134cd5730d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f132e7fc700 (LWP 10157))]
#0  0x00007f134dc5d3bd in read () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  0x00007f134dc5d3bd in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f134e25aa7d in rl_getc () from /lib/x86_64-linux-gnu/libreadline.so.6
#2  0x00000000006ad80c in operator() (__closure=<optimized out>) at /home/alt/workspace/bitsharesx/libraries/cli/cli.cpp:1409
#3  fc::detail::functor_run<bts::cli::detail::get_character(FILE*)::__lambda5>::run(void *, void *) (functor=<optimized out>, prom=0x483f920)
    at /home/alt/workspace/bitsharesx/libraries/fc/include/fc/thread/task.hpp:48
#4  0x0000000000643cd3 in fc::task_base::run_impl (this=this@entry=0x483f870) at /home/alt/workspace/bitsharesx/libraries/fc/src/thread/task.cpp:39
#5  0x0000000000644385 in fc::task_base::run (this=this@entry=0x483f870) at /home/alt/workspace/bitsharesx/libraries/fc/src/thread/task.cpp:29
#6  0x000000000064253b in run_next_task (this=0x7f13180008c0) at /home/alt/workspace/bitsharesx/libraries/fc/src/thread/thread_d.hpp:372
#7  fc::thread_d::process_tasks (this=0x7f13180008c0) at /home/alt/workspace/bitsharesx/libraries/fc/src/thread/thread_d.hpp:395
#8  0x000000000063dd44 in fc::thread::exec (this=0x2677cb0) at /home/alt/workspace/bitsharesx/libraries/fc/src/thread/thread.cpp:201
#9  0x000000000063e11c in fc::thread::thread(std::string const&)::{lambda()#1}::operator()() const ()
    at /home/alt/workspace/bitsharesx/libraries/fc/src/thread/thread.cpp:81
#10 0x0000000000c7c4ba in thread_proxy ()
#11 0x00007f134dc56182 in start_thread (arg=0x7f132e7fc700) at pthread_create.c:312
#12 0x00007f134cd5730d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
« Last Edit: July 22, 2014, 04:42:27 pm by alt »