LinuxQuestions.org - in need of a 'select' break with a closed socket

in need of a 'select' break with a closed socket

I'm working on a server-like program which calls select in a loop using multiple stream sockets. All descriptors in the program are non-blocking, so select and a pthread_cond_wait somewhere else are the only blocking points I have.

I'm having a "problem" when a connected client exits and the corresponding socket descriptor on the server isn't the first in the select list. The select call doesn't break until another descriptor has data ready or until the first descriptor's other end closes. In other words, the only break for an unreadable socket happens when that socket is the first in the list. This keeps the descriptor hanging, but more importantly, keeps the program from accurately updating the status of what's connected.

As an example:

client X connects to the server (placed 1st with FD_SET)
client Y connects to the server (placed 2nd with FD_SET)
client Z connects to the server (placed 3rd with FD_SET)
...
client Y exits and implicitly closes its connections
the socket for Y is still a part of the select list until either X or Z send data or X exits

I experimented with a timeout for select, but I'd be better off using while (read(...) < 0); or something similarly atrocious because of its processor intensity.

I get the feeling I'm partially lucky to get what I am getting due to undefined behavior. The main goal is to have a blocked server except for when something is happening. Thanks.
ta0kira

PS The exit happens during the blocked select call. I haven't been able to create a situation where the exit happens between the select calls since.

Which set do you check when waiting for connection close?

Quote:

Originally Posted by ta0kira (Post 3013776)

... I experimented with a timeout for select, but I'd be better off using while (read(...) < 0); or something similarly atrocious because of its processor intensity.

I don't quite understand what is atrocious about using a timeout with select(). Doesn't your server thread have to check if there are other clients connecting? How will it do so if it is suspended while waiting for activity from an existing client?

When a client disconnects, select() reports activity with the client's socket. At this point, the server will not know what type of activity has occurred until a read() is performed. If the read() operation returns a value less than zero, then you know that the client has disconnected.

Here's a code snippet of a multi-threaded server I worked on many years ago that handles data incoming from multiple clients. Note that client connections were handled in a separate thread.

Code:

  while (!stop_requested()) 
  { 
    //   Setup file descriptor list for select call 
    // 
    struct timeval timeout = {1, 0};  // one second 
    fd_set         readSet; 
    const uint32_t nfds = m_clientList.getClientSet(readSet);
 
 
    //   Check if there is any data traffic to be read on any of the 
    //   file descriptors. 
    // 
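    const int ready = select(nfds+1, &readSet, NULL, NULL, &timeout);

    if (ready < 0)
    {
      // select() failed (EINTR, for example), so readSet can't be trusted
      dbg[*this] << "select failed" << el;
      continue;
    }

    if (ready == 0)
    {
      dbg[*this] << "select timed out" << el;
      continue;
    }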
 
    // The code below to determine the client was actually done in a 
    // a mutex-protected method within the m_clientList object.  I've 
    // shown it here just to show the gist of what needed to get the 
    // handle to the client object (if any). 
    // 
    Client *client = NULL; 
 
    for (Iterator it = m_clientList.begin(); it != m_clientList.end(); ++it) 
    { 
      if (FD_ISSET(it->first, &readSet)) 
      { 
        client = it->second; 
      } 
    } 
 
    if (client == NULL) 
    { 
      dbg[*this] << "Client object could not be found" << el; 
      continue; 
    } 
 
    // Client is known. 
    // 
    bool newMsgAvailable = true; 
 
    while (newMsgAvailable == true) 
    { 
      char    msg[MsgAPI::MAX_MSGLEN] = {0};
      size_t  msgSize = client->getNextMsg(msg, sizeof msg);  // read() done in here 
 
      switch (msgSize) 
      { 
        case Client::NO_NEW_MSG: 
            dbg[*this] << "No new message received" << el; 
            newMsgAvailable = false; 
            break; 
 
        case Client::MSG_TOO_LARGE: 
        case Client::CLIENT_DISCONNECTED: 
            dbg[*this] << "Msg Too Large or Client Disconnected" << el; 
            m_clientList.removeClient(client); 
            newMsgAvailable = false; 
            break; 
 
        default: 
            dbg[*this] << "Got a message" << el; 
            ... 
            break; 
      } 
    }
  }

Let me know if you have any further questions or comments.

When I used a timeout with select my processor activity jumped to over 90%, whereas keeping it indefinite allowed it to stay near idle.

shutdown on client exit seems to be working. I've been checking the read set. Thanks.
ta0kira

Your processor shouldn't have jumped to 90% because of the select(). It was probably something else on your system (or your application, if it is multithreaded) that is causing the issue.

Compile and run the following program, and verify if indeed the process is taking up a substantial amount of CPU time.

Code:

#include <ctime>
#include <cstdlib>
#include <sys/select.h>  // select() and struct timeval
#include <unistd.h>
 
int main( int argc, char **argv ) 
{ 
  int seconds = (argc > 1 ? atoi(argv[1]) : 30); 
  int usecs   = (argc > 2 ? atoi(argv[2]) : 0 ); 
 
  struct timeval tv; 
  tv.tv_sec  = seconds; 
  tv.tv_usec = usecs; 
 
  // perform delay 
  select( 0, 0, 0, 0, &tv ); 
 
  return 0; 
}
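
For what it's worth, one thing that can make a select() timeout loop spin is reusing the same struct timeval across iterations. This is only a guess at the cause; nothing in this thread confirms it. On Linux, select() overwrites the struct with the time left, so after the first timeout it holds zero and every later call returns immediately. Here is a minimal sketch that rebuilds the timeout for each call. The names wait_ms and total_wait_ms are made up for the illustration.

```cpp
#include <cassert>
#include <sys/select.h>
#include <time.h>

// Hypothetical helper: sleep up to `ms` milliseconds using select() with no
// descriptors. The timeval is built fresh on every call because select() may
// overwrite it with the remaining time (Linux does).
// Returns select()'s result: 0 on timeout, -1 on error.
int wait_ms(long ms)
{
  assert(ms >= 0);

  struct timeval tv;
  tv.tv_sec  = ms / 1000;
  tv.tv_usec = (ms % 1000) * 1000;

  return select(0, NULL, NULL, NULL, &tv);
}

// Hypothetical helper: perform `rounds` waits of `ms` milliseconds each and
// return the total elapsed wall time in milliseconds (monotonic clock).
long total_wait_ms(int rounds, long ms)
{
  struct timespec start, end;
  clock_gettime(CLOCK_MONOTONIC, &start);

  for (int i = 0; i < rounds; ++i)
  {
    wait_ms(ms);  // fresh timeout each iteration, so each call really blocks
  }

  clock_gettime(CLOCK_MONOTONIC, &end);
  return (end.tv_sec - start.tv_sec) * 1000L +
         (end.tv_nsec - start.tv_nsec) / 1000000L;
}
```

If the loop really blocks, total_wait_ms(5, 20) should come out at roughly 100 ms or more. A loop that declares the timeval once outside the loop would finish almost instantly after the first pass on Linux.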

In fact, the connection is closed if read returns less than or EQUAL to zero. Quite an important condition.

I guess the high processor usage was because the closure was not handled, so select was not blocking at all, just returning immediately.
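
To make the read() cases concrete, here is a rough sketch; it is not taken from anyone's code in this thread. It uses socketpair() in place of real client connections, and classify_read and second_socket_wakes_select are made-up names. It closes the peer of the second of three sockets and checks that select() reports that socket and that read() returns 0 for it. On a non-blocking socket a negative return with EAGAIN only means "no data yet", so the sketch treats that separately from a real error.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

enum ReadStatus { GOT_DATA, NO_DATA_YET, PEER_CLOSED, READ_ERROR };

// Hypothetical helper: classify the result of a single read() on a socket.
ReadStatus classify_read(int fd)
{
  assert(fd >= 0);

  char buf[256];
  const ssize_t n = read(fd, buf, sizeof buf);

  if (n > 0)  return GOT_DATA;      // payload (discarded in this sketch)
  if (n == 0) return PEER_CLOSED;   // orderly close by the other end
  if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
    return NO_DATA_YET;             // not a disconnect, just nothing to read
  return READ_ERROR;                // e.g. ECONNRESET
}

// Hypothetical demo: three connected pairs stand in for clients X, Y and Z.
// Returns true if select() wakes for Y (the second one) after its peer closes.
bool second_socket_wakes_select()
{
  int sv[3][2];
  for (int i = 0; i < 3; ++i)
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv[i]) != 0)
      return false;

  close(sv[1][1]);                  // "client Y exits"

  fd_set readSet;
  FD_ZERO(&readSet);
  int maxfd = -1;
  for (int i = 0; i < 3; ++i)
  {
    FD_SET(sv[i][0], &readSet);
    if (sv[i][0] > maxfd) maxfd = sv[i][0];
  }

  struct timeval tv = {1, 0};       // safety net so the demo cannot hang
  const int ready = select(maxfd + 1, &readSet, NULL, NULL, &tv);

  const bool ok = ready == 1 &&
                  FD_ISSET(sv[1][0], &readSet) &&
                  !FD_ISSET(sv[0][0], &readSet) &&
                  !FD_ISSET(sv[2][0], &readSet) &&
                  classify_read(sv[1][0]) == PEER_CLOSED;

  close(sv[0][0]); close(sv[0][1]);
  close(sv[1][0]);
  close(sv[2][0]); close(sv[2][1]);
  return ok;
}
```

With local socket pairs at least, the position of the descriptor in the set makes no difference. A real TCP client that dies without its close reaching the server is a different situation, and this sketch does not cover it.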

Actually, the place where I used select with a timeout was for a bound local socket. The socket was open the entire time, and literally the only change was the addition of a 100ms timeout. Non-blocking accept with a 100ms retry in its place kept processor usage low, so it wasn't the loop code. The matter of having a closed socket is a different one, though. It did take place within a pthread, so that might be the problem. I'll try to duplicate both conditions: the timeout and the non-first descriptor.

The way I do test for a closed socket is to take the result of select (as long as errno isn't EINTR) and try to read each descriptor (non-blocking.) If I get a 0 read then I close it and remove it from the list. shutdown does seem to force select to break, though.
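
The test described above could be sketched roughly like this. It is an approximation under several assumptions, not the actual program: descriptors live in a std::vector, recv() with MSG_DONTWAIT stands in for sockets that are already non-blocking, only descriptors flagged by select() are read, and any data read is thrown away. reap_closed and shutdown_wakes_select are invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: one pass of select, read, close. Closes and erases
// every descriptor whose peer has closed; returns how many were removed.
int reap_closed(std::vector<int> &fds, long timeout_ms)
{
  fd_set readSet;
  FD_ZERO(&readSet);
  int maxfd = -1;
  for (std::size_t i = 0; i < fds.size(); ++i)
  {
    assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
    FD_SET(fds[i], &readSet);
    if (fds[i] > maxfd) maxfd = fds[i];
  }

  struct timeval tv;
  tv.tv_sec  = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;

  // Timeout, EINTR or any other error: nothing to reap on this pass.
  if (select(maxfd + 1, &readSet, NULL, NULL, &tv) <= 0)
    return 0;

  int reaped = 0;
  for (std::size_t i = 0; i < fds.size(); )
  {
    char buf[256];
    if (FD_ISSET(fds[i], &readSet) &&
        recv(fds[i], buf, sizeof buf, MSG_DONTWAIT) == 0)  // 0 means EOF
    {
      close(fds[i]);
      fds.erase(fds.begin() + i);
      ++reaped;
    }
    else
    {
      ++i;
    }
  }
  return reaped;
}

// Hypothetical demo: the middle "client" calls shutdown() and the server
// side should drop exactly that descriptor.
bool shutdown_wakes_select()
{
  int a[2], b[2], c[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, a) != 0 ||
      socketpair(AF_UNIX, SOCK_STREAM, 0, b) != 0 ||
      socketpair(AF_UNIX, SOCK_STREAM, 0, c) != 0)
    return false;

  std::vector<int> fds;
  fds.push_back(a[0]);
  fds.push_back(b[0]);
  fds.push_back(c[0]);

  shutdown(b[1], SHUT_RDWR);        // client side of the second connection

  const bool ok = reap_closed(fds, 1000) == 1 &&
                  fds.size() == 2 && fds[0] == a[0] && fds[1] == c[0];

  close(a[0]); close(a[1]); close(b[1]); close(c[0]); close(c[1]);
  return ok;
}
```

A real server would hand the received bytes to the client's handler instead of discarding them, and would probably want to tell EINTR apart from other select() errors.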
ta0kira