[SOLVED] process blocks on recv() every 10 seconds
I have a multi-process application in which one of the processes blocks on a UDP socket using recv(). When a message is received, the process handles it, then loops back to recv() to block again for the next message. This process runs at maximum RT priority on a dedicated core (single quad-core CPU, non-hyperthreaded). Often my application gets into a state where, every 10 seconds (exactly), the process waiting on recv() blocks for ~100 ms even though there should be a message waiting to be read (according to Wireshark). The messages arrive every 16.667 ms, so several are queued by the time the (apparently) blocked process reads them.
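For anyone trying to reproduce this kind of measurement, here is a minimal sketch (not the original application's code) of timing a blocking recv() loop and flagging gaps that are much longer than the expected message period. The function names, the port argument, the buffer size, and the 3x threshold are all assumptions made for illustration.

```python
import socket
import time


def find_stalls(timestamps, expected_period, factor=3.0):
    """Return (index, gap) pairs where the gap between consecutive
    receive timestamps exceeds factor * expected_period.

    The factor of 3 is an arbitrary illustrative threshold.
    """
    stalls = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        if gap > factor * expected_period:
            stalls.append((i, gap))
    return stalls


def receive_and_time(port, count, bufsize=2048):
    """Block on recv() `count` times on a UDP socket and record a
    monotonic timestamp immediately after each call returns."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    stamps = []
    try:
        for _ in range(count):
            sock.recv(bufsize)               # blocks until a datagram arrives
            stamps.append(time.monotonic())  # unaffected by wall-clock changes
    finally:
        sock.close()
    return stamps
```

With messages arriving every 16.667 ms, something like `find_stalls(receive_and_time(5000, 2000), 0.016667)` (port 5000 being a made-up value) would list the receives that returned late, which can then be compared against the Wireshark capture.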
Another symptom is that the entire processor is blocked, as if it were busy servicing a higher-priority process or ISR. I just have no idea what it could be.
I have searched extensively based on the 10-second symptom. The most promising result was the watchdog processes, but I've removed those without any effect. Any suggestions as to what else might occur at that frequency? Note that occasionally the symptom is separated by more than 10 seconds, but when this occurs the interval is always a multiple of 10 (e.g., every 20 seconds, every 40 seconds).
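The "always a multiple of 10" observation can be checked mechanically from a list of stall times. This is only a sketch: the function name and the 0.2-second tolerance are assumptions, and the stall times are expected in seconds from whatever clock was used to record them.

```python
def intervals_are_multiples(stall_times, period=10.0, tolerance=0.2):
    """Return True if every interval between consecutive stall times is
    within `tolerance` seconds of a positive whole multiple of `period`."""
    for earlier, later in zip(stall_times, stall_times[1:]):
        interval = later - earlier
        multiple = round(interval / period)
        # An interval that rounds to zero periods is not a multiple at all.
        if multiple < 1 or abs(interval - multiple * period) > tolerance:
            return False
    return True
```

If this returns True over a long run, the stalls are locked to a 10-second timer of some kind rather than to load or traffic.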
It does not appear to be related to the NIC (I tried a newer driver, and also tried the on-board ports instead). It occurs when running in either X or console mode.
Okay, for anyone who is curious, I have solved this problem (though I still don't fully understand what was going on). After closely monitoring all the process activity for some time, I noticed that the symptom would only occur when the process kslowd000 or kslowd001 was running on the same core. Even though I had noticed those processes were using a lot of time, I never paid much attention to them because their priority was much lower than that of the process being affected, but as soon as I bound kslowd* to core 0 the symptom would disappear. So my solution was to permanently bind those processes to core 0. Not perfect, as I still don't know exactly what was going on, but it works.
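The post does not say which tool was used for the binding, so the following is just one possible way to do it, sketched in Python. It assumes a Linux kernel that exposes /proc/<pid>/comm, needs root to change the affinity of other processes, and the helper names are made up for this example. Some kernel threads refuse affinity changes, in which case `os.sched_setaffinity` raises an error.

```python
import os


def find_pids_by_comm_prefix(prefix, proc_root="/proc"):
    """Return the sorted PIDs whose command name (the `comm` file)
    starts with `prefix`. `proc_root` is a parameter only so the
    function can be tried against a fake directory tree."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "uptime"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                name = f.read().strip()
        except OSError:
            continue  # process exited in the meantime, or is unreadable
        if name.startswith(prefix):
            pids.append(int(entry))
    return sorted(pids)


def pin_to_core(pids, core=0):
    """Restrict each PID to a single CPU core (Linux only)."""
    for pid in pids:
        os.sched_setaffinity(pid, {core})
```

Calling `pin_to_core(find_pids_by_comm_prefix("kslowd"))` as root would apply the binding once; it does not survive a reboot, so to make it permanent it would have to be run from a boot script.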
Thanks to all who read this and at least scratched their heads.