By: Ramdev
May 9, 2011
The BOUND state is what a socket shows after it has been created and ‘bind()’ has been called, but before any of ‘listen()’, ‘accept()’, ‘connect()’ or ‘close()’ has been called. The confusion arises because BOUND is a socket state, not a TCP state, yet it appears in the field that ‘netstat’ normally uses for TCP state information.
BOUND is a transitory state: the application should call ‘listen()’ right after ‘bind()’ succeeds, then wait in ‘accept()’ for incoming connections.
This applies to a listening server process. A client would typically just call ‘connect()’, which performs the ‘bind()’ automatically.
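The call sequence described above can be sketched with Python's ‘socket’ module, which wraps the same system calls. This is only an illustration, not the code of any particular application; the loopback address and the kernel-chosen port (port 0) are assumptions made so the example is self-contained.

```python
import socket

# Server side: after bind() alone the socket is bound but not yet listening,
# which is the window in which Solaris netstat would report BOUND.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
port = server.getsockname()[1]
server.listen(5)                   # call listen() right after bind() succeeds

# Client side: no explicit bind(); connect() binds a local port implicitly.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
client_port = client.getsockname()[1]

# accept() returns the connection that is already queued on the listener.
conn, peer = server.accept()
client.sendall(b"ping")
data = conn.recv(4)

# Close every socket so that nothing is left bound.
for s in (conn, client, server):
    s.close()
```

The client never calls ‘bind()’, yet ‘getsockname()’ reports a local port once ‘connect()’ returns, which is the implicit bind mentioned above.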
Sometimes this state is shown after closing or killing an application. Either situation usually points to a problem in the way the application was implemented, and fixing it requires changes to the application's socket programming code.
If the application holds this socket open, it prevents any other application from binding to the same TCP port number. This can cause services to hang. A socket that stays in the BOUND state for a long time may make the application appear hung or unresponsive.
Depending on how the application works, a server could make multiple attempts to contact unavailable clients, and this could leave many sockets in the BOUND state, eventually exhausting the supply of available sockets.
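One way to avoid leaking a socket on every failed attempt is to close it on all code paths. The following is a minimal sketch under that assumption; the function name ‘try_contact’ and the timeout value are hypothetical and are not taken from any specific application.

```python
import socket

def try_contact(host, port, timeout=2.0):
    """Attempt one TCP connection; return True on success, False on failure.

    The with-block closes the socket on every path, so a failed attempt
    does not leave a descriptor (and its local port) behind.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
        except OSError:            # refused, unreachable, timed out, ...
            return False
        return True
```

A caller that retries unavailable clients in a loop can then call ‘try_contact()’ repeatedly without accumulating open sockets.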
Problems rebinding to ports may be reported periodically, either when a daemon is killed or when the daemon closes a bound socket and then creates a new one: the new socket cannot be bound to the same port. The process reports the following error:
bind: Address already in use
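The error can be reproduced in isolation. The sketch below assumes the loopback address and a kernel-chosen port: it binds one socket without ever listening on it, then tries to bind a second socket to the same port.

```python
import errno
import socket

# First socket: bound, but never listened on or closed.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
port = first.getsockname()[1]

# Second socket: binding to the same address and port should be refused.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
bind_errno = None
try:
    second.bind(("127.0.0.1", port))
except OSError as exc:
    bind_errno = exc.errno         # EADDRINUSE: "Address already in use"
finally:
    second.close()

# Once the first socket is closed, the port can be bound again.
first.close()
third = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    third.bind(("127.0.0.1", port))
    rebound = True
except OSError:
    rebound = False
finally:
    third.close()
```

As long as the first socket stays open, the port is unavailable to everyone else; closing it releases the port.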
In most cases, killing the daemon clears the BOUND state (kill -15 <pid>, kill -11 <pid>, kill -9 <pid>). In some cases, however, the socket remains bound even though no process is running. A bound port can only be freed by killing the process that bound the socket, and that does not always work; a reboot is sometimes required.
Note that once an application opens a socket connection, it has complete control of the socket until it releases it, at which point the connection shows TIME_WAIT in the ‘netstat -an’ output. To try to identify the process, use the ‘pfiles’ command (Solaris 8 and above). On releases prior to Solaris 8, the freely available ‘lsof’ utility may be used instead.
A possible workaround while troubleshooting is to define an alternate port for the service in /etc/services. For example:
service-name1 5010/tcp
service-name2 5011/tcp
When appropriate, ‘truss’ the application while issuing the ‘kill’ or ‘kill -9’ commands to get more information on why it is not closing correctly.
Example of a successful identification and resolution:
% netstat -an | grep BOUND
*.33330 *.* 0 0 24576 0 BOUND
*.33330 *.* 0 0 24576 0 BOUND
%
% su ( type root password )
# cd /proc ; pfiles * | egrep "^[0-9]|sockname" > /var/tmp/pfiles1.txt
# vi /var/tmp/pfiles1.txt
<SNIP>
814: /bin/sh -c dtfile -noview
815: dtfile -noview
819: cachefsd
856: java_vm
sockname: AF_UNIX
sockname: AF_UNIX
sockname: AF_UNIX
sockname: AF_UNIX
sockname: AF_UNIX
sockname: AF_UNIX
sockname: AF_INET6 :: port: 33330
# ps -ef | grep 856
demo 856 811 0 Jan 25 pts/12 2:46 java_vm
root 6619 6400 0 13:58:10 pts/12 0:00 grep 856
#
# kill -15 856
#
# ps -ef | grep 856
root 6630 6400 0 13:59:13 pts/12 0:00 grep 856
#
# netstat -an | grep BOUND
#
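Searching the filtered ‘pfiles’ output by hand can be tedious on a busy system. The following sketch automates the lookup, assuming the file has the layout shown above: a "PID: command" line at column 0, followed by that process's "sockname:" lines. The function name ‘owners_of_port’ is hypothetical.

```python
import re

def owners_of_port(pfiles_text, port):
    """Return [(pid, command), ...] for processes with a socket on `port`.

    Expects the egrep-filtered output, where only process header lines
    start with a digit at column 0.
    """
    header = re.compile(r"^(\d+):\s+(.*)$")
    sockname = re.compile(r"sockname:.*\bport:\s*(\d+)")
    owners = []
    current = None                       # (pid, command) of the last header seen
    for line in pfiles_text.splitlines():
        m = header.match(line)
        if m:
            current = (int(m.group(1)), m.group(2))
            continue
        m = sockname.search(line)
        if m and current is not None and int(m.group(1)) == port:
            if current not in owners:    # report each process only once
                owners.append(current)
    return owners

# Excerpt in the same layout as /var/tmp/pfiles1.txt above.
sample = """\
819: cachefsd
856: java_vm
sockname: AF_UNIX
sockname: AF_INET6 :: port: 33330
"""
print(owners_of_port(sample, 33330))
```

On a real system the text would come from reading /var/tmp/pfiles1.txt instead of the inline sample.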