LinuxQuestions.org


sanushchacko 05-11-2011 03:54 AM

Partial Write for sockets in Linux
 
Our application uses server-client communication over sockets. We are using AF_INET sockets with SOCK_STREAM (TCP/IP), and these sockets are in non-blocking mode (O_NONBLOCK). The application is written in C++ on UNIX.

In our system the server writes to a socket and the client reads from it. We have written code to handle partial writes: if a partial write happens, we try up to 30 more times to write the entire data.

Our server tries to write 2464 bytes to the socket. In some cases it cannot write the entire data, so it tries up to 30 more times to transfer the rest. Most of the time the entire data is written within 30 tries. But sometimes, even after 30 retries, the server is not able to write the entire data, and the write fails with EAGAIN. The problem then shows up on the client side, when it tries to read this partially written data.

Suppose the server tried to write 2464 bytes, but after the 30 repeated attempts it could write only 1080 bytes. The server raises EAGAIN at this point. The client tries to read 2464 bytes. The read call returns 2464, so the read itself looks OK, but the data we receive is corrupted (it contains only the partially written message). So the client crashes.

Can anyone please advise on the following:

1) Is it possible for the server itself to remove only the partially written data, so that the client does not receive corrupted, incomplete data? (We cannot use a read() call on the server to remove it. Suppose the server has successfully written n messages to the socket, and the client is busy and not able to read them. The server then tries to write the (n+1)th message and a partial write occurs. If we use a read call from the server, all n successful messages also get removed. We need to remove only the partially written (n+1)th message.)

2) Is there any way to identify on the client side that we have read a partially written message?

Please note that we are facing the partial write issue on Linux (Red Hat 5.4) only. The system works fine on Solaris (on Solaris, either the entire data or no data at all is written within the 30 write attempts).

Thanks in advance.

ambrop7 05-11-2011 04:54 AM

I suggest you read more on non-blocking sockets and event-driven programming. Basically, you're only supposed to use non-blocking sockets within an application that has an event loop, which uses poll() or similar to wait until sockets are ready. And *NEVER* do crap like retrying in timed intervals or whatever you're doing.

When you determine that you need to send N bytes to the client, you write those N bytes into a buffer of yours. You then do send() (in non-blocking mode) to send data from the buffer until you get EAGAIN/EWOULDBLOCK (or a real error, in which case you remove the client, closing the socket). If you do get EAGAIN/EWOULDBLOCK before you have sent everything, you request the event loop to notify you when this socket is again writable. When it is writable (i.e. event loop calls your event handler), you tell the event loop that you're no longer interested in this event, and you repeat the send()-ing to send more data from the buffer. And you keep doing this send-wait cycle until all of the data from the buffer is sent.
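
As a rough illustration of the "send until EAGAIN" step in the cycle above, here is a minimal sketch. It is not taken from badvpn or any other library: OutBuffer, FlushResult, set_nonblocking and flush_output are hypothetical names made up for this example, and fd is assumed to be a connected stream socket.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical per-client output state: the bytes still owed to the client
// and how many of them the kernel has already accepted.
struct OutBuffer {
    std::string data;
    std::size_t sent = 0;
};

enum class FlushResult { Done, WouldBlock, Error };

// Put fd into non-blocking mode; returns false on failure.
bool set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Send from the buffer until it is empty or the socket stops accepting data.
// Assumes fd is already in O_NONBLOCK mode, so send() never waits.
FlushResult flush_output(int fd, OutBuffer &out) {
    assert(out.sent <= out.data.size());
    while (out.sent < out.data.size()) {
        ssize_t n = send(fd, out.data.data() + out.sent,
                         out.data.size() - out.sent, MSG_NOSIGNAL);
        if (n > 0) {
            out.sent += static_cast<std::size_t>(n);  // keep partial progress
        } else if (n < 0 && errno == EINTR) {
            continue;                                  // interrupted, retry
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return FlushResult::WouldBlock;            // wait for "writable" event
        } else {
            return FlushResult::Error;                 // real error: drop client
        }
    }
    out.data.clear();
    out.sent = 0;
    return FlushResult::Done;
}
```

On WouldBlock the caller would register interest in writability with its event loop and call flush_output() again from the handler; the offset in OutBuffer means the next call resumes exactly where the last one stopped.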

I am developing an open-source project which includes a framework for event-driven network applications. It does however make extensive use of flow-based programming and includes a new programming paradigm to make that easier. See http://code.google.com/p/badvpn/
If you're interested, a simple example of an event-driven program using my framework is in examples/stdin_input.c. This program reads stdin and prints data to the terminal as it receives it, and it also exits when a signal is received.

ambrop7 05-11-2011 05:46 AM

Additionally, if your server needs to send more messages to the client, then make it write those messages too into a buffer (or several buffers), and, in parallel (in a conceptual sense, NOT threads), keep sending buffered messages out to the client. If you run out of buffer, the connection is probably broken, and you should probably disconnect the client, and the client will hopefully reconnect. On the other hand, discarding messages when you run out of buffer may compromise the communication. But this really depends on the type of communication you're doing.
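
A bounded per-client message queue along those lines could look roughly like the sketch below. OutQueue and its methods are hypothetical names for illustration only, and the byte limit is an assumption; what to do when push() fails (disconnect or discard) is the application's decision, as noted above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical bounded queue of outgoing messages for one client.
class OutQueue {
public:
    explicit OutQueue(std::size_t max_bytes)
        : max_bytes_(max_bytes), queued_bytes_(0), front_sent_(0) {}

    // Queue a whole message. Returns false if it does not fit, in which
    // case nothing is queued and the caller decides what to do.
    bool push(const std::string &msg) {
        if (msg.empty()) return true;  // nothing to send
        if (msg.size() > max_bytes_ - queued_bytes_) return false;
        messages_.push_back(msg);
        queued_bytes_ += msg.size();
        return true;
    }

    bool empty() const { return messages_.empty(); }

    // Unsent tail of the oldest message: pass these to send().
    const char *front_data() const {
        assert(!messages_.empty());
        return messages_.front().data() + front_sent_;
    }
    std::size_t front_size() const {
        assert(!messages_.empty());
        return messages_.front().size() - front_sent_;
    }

    // Record that send() accepted n bytes of the oldest message.
    void consume(std::size_t n) {
        assert(!messages_.empty() && n <= front_size());
        front_sent_ += n;
        queued_bytes_ -= n;
        if (front_sent_ == messages_.front().size()) {
            messages_.pop_front();  // message fully sent
            front_sent_ = 0;
        }
    }

private:
    std::size_t max_bytes_;
    std::size_t queued_bytes_;
    std::size_t front_sent_;
    std::deque<std::string> messages_;
};
```

Because a message is either queued whole or rejected whole, a partially sent message is always completed before the next one starts, so the byte stream the client sees never has a message cut off in the middle.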

sanushchacko 05-18-2011 10:50 PM

Thank you very much for your answers. The real problem is that I cannot perform the write infinitely, since there are a lot of other clients waiting for the response from the server. Let me try some code changes with select and poll as well.

tushs 05-19-2011 01:49 AM

With a non-blocking write, keep track of the number of bytes written, and issue the next write only once the next poll for writability succeeds, starting from the last count (with the new, remaining length). Retrying 30 times is a bad idea; instead, use a timeout on the poll, say 2-3 seconds. If there is still no ready-to-write event after that, you may need to break the connection.
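
A minimal sketch of that idea, assuming a connected socket already in non-blocking mode. send_with_timeout is a hypothetical helper name, and the timeout value is whatever the caller picks. Note that this version waits inside the helper, so a single-threaded server cannot service other clients while it waits.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: send buf[sent..len) on a non-blocking socket, waiting
// up to timeout_ms for writability each time the socket would block.
// `sent` is the running byte count, so writing always resumes from it.
// Returns true when everything was sent, false on error or timeout.
bool send_with_timeout(int fd, const char *buf, std::size_t len,
                       std::size_t &sent, int timeout_ms) {
    assert(sent <= len);
    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, MSG_NOSIGNAL);
        if (n > 0) {
            sent += static_cast<std::size_t>(n);  // continue from the new count
            continue;
        }
        if (n < 0 && errno == EINTR) continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            struct pollfd pfd;
            pfd.fd = fd;
            pfd.events = POLLOUT;
            pfd.revents = 0;
            int r = poll(&pfd, 1, timeout_ms);
            if (r < 0 && errno == EINTR) continue;           // restarts the wait
            if (r > 0 && (pfd.revents & POLLOUT)) continue;  // writable again
            return false;  // timed out, hung up, or poll() failed
        }
        return false;      // real send error
    }
    return true;
}
```

If the function returns false with sent between 0 and len, the peer has received only part of the message, so the connection should be closed rather than reused for the next message.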

ambrop7 05-19-2011 04:05 AM

Quote:

Originally Posted by sanushchacko (Post 4360542)
The real problem is that I cannot perform the write infinitely, since there are a lot of other clients waiting for the response from the server.

Have you even read what I have written? You *CAN* perform the write infinitely - since most of the time the event loop is blocked in poll() or whatever, so if some other client sends a request, or a new client connects, poll() will wake up and you can process that event even though the response to some client is only partially sent.
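
To make that concrete, here is a minimal sketch of one event loop iteration over several clients. Client, try_send, try_recv and run_once are hypothetical names for this example, all sockets are assumed to be in non-blocking mode, and request parsing and accepting new connections are left out.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <vector>

// Hypothetical per-client state kept by the server.
struct Client {
    int fd;               // connected socket, assumed to be O_NONBLOCK
    std::string pending;  // response bytes the kernel has not accepted yet
    std::string inbox;    // request bytes received so far
    bool broken;          // set on error or when the peer closes
};

// Send pending bytes until the socket would block; never waits.
static void try_send(Client &c) {
    while (!c.pending.empty()) {
        ssize_t n = send(c.fd, c.pending.data(), c.pending.size(), MSG_NOSIGNAL);
        if (n > 0) c.pending.erase(0, static_cast<std::size_t>(n));
        else if (n < 0 && errno == EINTR) continue;
        else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) return;
        else { c.broken = true; return; }
    }
}

// Read whatever has arrived; never waits.
static void try_recv(Client &c) {
    char buf[4096];
    for (;;) {
        ssize_t n = recv(c.fd, buf, sizeof buf, 0);
        if (n > 0) c.inbox.append(buf, static_cast<std::size_t>(n));
        else if (n < 0 && errno == EINTR) continue;
        else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) return;
        else { c.broken = true; return; }  // n == 0 means the peer closed
    }
}

// One loop iteration: the only place the program waits is poll().
// Returns poll()'s result (number of ready sockets, 0 on timeout, -1 on error).
int run_once(std::vector<Client> &clients, int timeout_ms) {
    std::vector<pollfd> fds(clients.size());
    for (std::size_t i = 0; i < clients.size(); ++i) {
        fds[i].fd = clients[i].broken ? -1 : clients[i].fd;  // poll skips fd < 0
        fds[i].events = POLLIN;
        if (!clients[i].pending.empty()) fds[i].events |= POLLOUT;
        fds[i].revents = 0;
    }
    int ready = poll(fds.data(), fds.size(), timeout_ms);
    if (ready <= 0) return ready;
    assert(fds.size() == clients.size());
    for (std::size_t i = 0; i < fds.size(); ++i) {
        if (fds[i].revents & (POLLERR | POLLNVAL)) { clients[i].broken = true; continue; }
        if (fds[i].revents & (POLLIN | POLLHUP)) try_recv(clients[i]);
        if (!clients[i].broken && (fds[i].revents & POLLOUT)) try_send(clients[i]);
    }
    return ready;
}
```

A client whose response is stuck half-sent simply keeps its bytes in pending and is asked about again via POLLOUT on the next iteration, while requests from other clients are still picked up as soon as they arrive.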

And there is no need to add any timeout to poll() unless your application needs time-related logic; for example, if you want to disconnect a client after N seconds of having sent no request. If you do need timeouts, then you should not fiddle with the timeout argument by hand; instead the event loop should keep an ordered list of active timers, each containing its expiry time, and before calling poll(), it would dispatch any already-expired timers, and after there are no more expired timers, call poll(), providing it the timeout until the first timer expires.
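
The timer bookkeeping described above might be sketched as follows. TimerList and its methods are hypothetical names, the use of std::multimap as the ordered list is my own choice for the example, and timer cancellation is left out.

```cpp
#include <cassert>
#include <chrono>
#include <climits>
#include <functional>
#include <map>
#include <utility>

typedef std::chrono::steady_clock Clock;

// Hypothetical ordered list of active timers, keyed by expiry time.
class TimerList {
public:
    void add(Clock::time_point expires, std::function<void()> callback) {
        assert(callback != nullptr);
        timers_.emplace(expires, std::move(callback));
    }

    // Run every timer whose expiry time has been reached, earliest first.
    void dispatch_expired(Clock::time_point now) {
        while (!timers_.empty() && timers_.begin()->first <= now) {
            std::function<void()> cb = std::move(timers_.begin()->second);
            timers_.erase(timers_.begin());
            cb();  // may add new timers
        }
    }

    // Value for poll()'s timeout argument: -1 (wait forever) if there are
    // no timers, otherwise the milliseconds until the first one expires.
    int poll_timeout_ms(Clock::time_point now) const {
        if (timers_.empty()) return -1;
        Clock::time_point first = timers_.begin()->first;
        if (first <= now) return 0;
        Clock::duration left = first - now;
        std::chrono::milliseconds ms =
            std::chrono::duration_cast<std::chrono::milliseconds>(left);
        if (ms < left) ++ms;  // round up so poll() does not wake too early
        return ms.count() > INT_MAX ? INT_MAX : static_cast<int>(ms.count());
    }

private:
    std::multimap<Clock::time_point, std::function<void()> > timers_;
};
```

Each loop iteration would then call dispatch_expired(Clock::now()) followed by poll(fds, nfds, timers.poll_timeout_ms(Clock::now())).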


All times are GMT -5.