LinuxQuestions.org


akonchada 06-12-2014 01:36 AM

Strange behavior from epoll in RHEL 6.5
 
Hello,

I am seeing some strange behavior from epoll on the RHEL 6.5 kernel (2.6.32-431.el6.x86_64).
I have created two threads: one reads data from the socket and the other sends data on the socket. During initialization I create my epoll FD using "epoll_create1(0)", and after that I start adding my socket descriptors to it for polling.

When adding a socket to the poll list, I call
epoll_ctl(m_epoll_fd, operation, socket_fd, &event)
m_epoll_fd : the one I created using "epoll_create1(0)"
operation : EPOLL_CTL_ADD or EPOLL_CTL_DEL
socket_fd : the socket FD I get after the socket connection is established
event : a "struct epoll_event" with my own data in epoll_event.data.ptr and epoll_event.events set to "EPOLLIN".
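
For readers following along, the add/remove calls described above might look roughly like the sketch below. This is not the poster's actual code: `struct conn`, `watch_conn` and `unwatch_conn` are hypothetical names invented for illustration, and the only thing taken from the post is the use of epoll_ctl() with EPOLL_CTL_ADD / EPOLL_CTL_DEL, EPOLLIN, and data.ptr.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection state. The post only says that
 * epoll_event.data.ptr carries "my own data". */
struct conn {
    int fd;
};

/* Register c->fd for EPOLLIN on epoll_fd.
 * Returns 0 on success, -1 on error (errno is set by epoll_ctl). */
static int watch_conn(int epoll_fd, struct conn *c)
{
    struct epoll_event ev = {0};

    assert(c != NULL);
    ev.events = EPOLLIN;
    ev.data.ptr = c;            /* handed back unchanged by epoll_wait() */
    return epoll_ctl(epoll_fd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Deregister the fd first, then close it.
 * Returns the result of the EPOLL_CTL_DEL call. */
static int unwatch_conn(int epoll_fd, struct conn *c)
{
    /* The event argument is ignored for EPOLL_CTL_DEL, but kernels
     * before 2.6.9 required it to be non-NULL, so pass one anyway. */
    struct epoll_event ev = {0};
    int rc;

    assert(c != NULL);
    rc = epoll_ctl(epoll_fd, EPOLL_CTL_DEL, c->fd, &ev);
    close(c->fd);
    c->fd = -1;                 /* make accidental reuse easy to spot */
    return rc;
}
```

Doing the EPOLL_CTL_DEL before close() is a deliberate choice in this sketch: once the descriptor is closed, its number can be reused by a new connection, and a later EPOLL_CTL_DEL on that number would hit the wrong socket.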

What I am seeing is that after closing the connection and removing the socket FD from the poll list, I am still getting "EPOLLIN" events for this socket. It does not happen just once, and the count varies; so far the highest number of "EPOLLIN" events I have received after closing and removing the FD from the poll list is 14.

My question is: is there any known issue with epoll in RHEL 6.5?
Or do I need to do something extra, or am I doing something wrong?

Regards,
Ananth

nini09 06-12-2014 02:23 PM

The event cache could cause the behavior.

If you use an event cache or store all the fds returned from epoll_wait(2), then make sure to provide a way to mark an fd's closure dynamically (i.e., caused by a previous event's processing). Suppose you receive 100 events from epoll_wait(2), and in event #47 a condition causes event #13 to be closed. If you remove the structure and close() the fd for event #13, then your event cache might still say there are events waiting for that fd, causing confusion.

One solution for this is to call, during the processing of event 47, epoll_ctl(EPOLL_CTL_DEL) to delete fd 13 and close(), then mark its associated data structure as removed and link it to a cleanup list. If you find another event for fd 13 in your batch processing, you will discover the fd had been previously removed and there will be no confusion.
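
A minimal sketch of that approach is below. It assumes each connection is a malloc()ed structure reachable through data.ptr; `struct conn`, `retire_conn`, `process_batch`, `reap_dead` and the `handler` callback are all hypothetical names for illustration, not part of epoll or of the original poster's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection state, assumed to be heap-allocated. */
struct conn {
    int fd;
    int removed;             /* set once the fd is deregistered and closed */
    struct conn *next_dead;  /* link in the cleanup list */
};

static struct conn *dead_list = NULL;

/* Deregister and close the fd, mark the structure as removed, and link
 * it to the cleanup list. The structure itself is NOT freed here, so
 * later events in the same batch can still inspect c->removed. */
static void retire_conn(int epoll_fd, struct conn *c)
{
    struct epoll_event ev = {0};   /* ignored by EPOLL_CTL_DEL */

    assert(c != NULL);
    if (c->removed)
        return;
    epoll_ctl(epoll_fd, EPOLL_CTL_DEL, c->fd, &ev);
    close(c->fd);
    c->fd = -1;
    c->removed = 1;
    c->next_dead = dead_list;
    dead_list = c;
}

/* Walk one batch returned by epoll_wait(). The handler may retire any
 * connection, including one that appears later in the same batch.
 * Returns the number of events actually handled (stale ones skipped). */
static int process_batch(int epoll_fd, struct epoll_event *events, int n,
                         void (*handler)(int epoll_fd, struct conn *c))
{
    int handled = 0;

    for (int i = 0; i < n; i++) {
        struct conn *c = events[i].data.ptr;
        if (c->removed)
            continue;              /* stale event for a retired fd */
        handler(epoll_fd, c);
        handled++;
    }
    return handled;
}

/* Free retired connections. Call only after the whole batch is done.
 * Returns how many structures were freed. */
static int reap_dead(void)
{
    int freed = 0;

    while (dead_list != NULL) {
        struct conn *c = dead_list;
        dead_list = c->next_dead;
        free(c);
        freed++;
    }
    return freed;
}
```

The typical loop would be epoll_wait(), then process_batch(), then reap_dead(). Note that this sketch is single-threaded: if one thread retires a connection while another thread is still working through a batch, the flag and the cleanup list would need locking or some other handoff, which is not shown here.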

akonchada 06-16-2014 12:12 AM

Thanks Nini, let me try this out.


All times are GMT -5.