LinuxQuestions.org


bigearsbilly 02-20-2013 03:37 AM

gprof strace blows up on recv
 
I am reading data from a remote site, lots of data.

I peek at the data to see how big the frame is (a 16-byte peek):

Code:

recv(g.socket, buffer, bite, MSG_PEEK | MSG_WAITALL);

I now know the frame size, so I can slurp it in (~200 bytes):

Code:

recv(g.socket, buffer, frame_sz, MSG_WAITALL);

Now this works lovely, it runs overnight, 500+ frames a second. No problem.


If I strace it or turn on profiling for gprof, it messes the recv up:
I don't read the wire correctly and I get invalid frames.

Has anyone encountered this before?

linosaurusroot 02-20-2013 04:07 AM

I don't know what this is, but it might help to state what software you are using, including versions.
Was your program compiled on the same system where you run it?
Do all forms of strace have this effect; even if you exclude read() and recv()?

Also, before getting excited about possible kernel or library bugs, I suggest a further look over the bounds checking of your code, in case it contains a problem that behaves differently depending on how the program is run.

bigearsbilly 02-20-2013 05:36 AM

Weirdly it turned out to be this:

Code:

    setsockopt(g.socket, SOL_SOCKET, SO_RCVBUF, &oval, sizeof oval);

Double-checked the arguments; they look OK.

Commented it out and it's all fine and dandy!?
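
For anyone chasing the same thing, a minimal sketch of setting SO_RCVBUF and reading back what the kernel actually applied. The type and value of `oval` are not shown in the post; the option expects an `int`, and on Linux socket(7) documents that the kernel doubles the requested value and clamps it to system limits. The helper name `set_rcvbuf_checked` is made up for illustration.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: request a receive buffer size and report what the
 * kernel actually applied. Returns 0 on success, -1 on error (errno set). */
int set_rcvbuf_checked(int fd, int requested, int *effective)
{
    socklen_t len = sizeof *effective;

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof requested) == -1)
        return -1;

    /* Read the value back: the kernel may clamp or adjust the request. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, effective, &len) == -1)
        return -1;

    fprintf(stderr, "SO_RCVBUF: requested %d, effective %d\n",
            requested, *effective);
    return 0;
}
```

Logging the effective size next to the requested one at least shows whether the buffer ended up much smaller or larger than intended.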



This is pre-production: all my lib functions are wrapped, return values are checked, logging is in place,
and the buffers are all static and massive, guy.

Code:

[billy@sierra:0]$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.4 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-8)



Doesn't seem to go wrong on FreeBSD.

Code:

[billy@elmer:130]$ gcc -v
Using built-in specs.
Target: amd64-undermydesk-freebsd
Configured with: FreeBSD/amd64 system compiler
Thread model: posix
gcc version 4.2.2 20070831 prerelease [FreeBSD]


NevemTeve 02-20-2013 06:59 AM

You seem to ignore the return value of 'recv', which is a very bad idea.

bigearsbilly 02-20-2013 07:26 AM

Quote:

Originally Posted by NevemTeve (Post 4895894)
You seem to ignore the return value of 'recv', which is a very bad idea.

No I don't actually, I just edited for this post.

NevemTeve 02-20-2013 07:58 AM

Then paste in the relevant lines -- the handling of the 'returned_len < requested_len' case might be wrong.

(A note: this could be easily implemented without MSG_PEEK)
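
A rough sketch of what reading a frame without MSG_PEEK could look like: consume the 16-byte header straight into the frame buffer, then read the remainder behind it. The header layout is not shown in the thread, so the length field here (first 4 bytes, network byte order, total frame length including the header) is purely an assumption, and `read_frame` and `HDR_SZ` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/types.h>

#define HDR_SZ 16  /* header length, taken from the 16-byte peek above */

/* Read one frame into buf (capacity cap) without MSG_PEEK.
 * ASSUMPTION: the first 4 header bytes hold the total frame length
 * (header included) in network byte order; the real layout is not shown.
 * Returns the frame length, or -1 on a short read, error, or bad length. */
ssize_t read_frame(int fd, unsigned char *buf, size_t cap)
{
    uint32_t be_len;
    size_t frame_sz, rest;
    ssize_t n;

    if (cap < HDR_SZ)
        return -1;

    /* Consume the header directly into the start of the frame buffer. */
    n = recv(fd, buf, HDR_SZ, MSG_WAITALL);
    if (n != HDR_SZ)
        return -1;

    memcpy(&be_len, buf, sizeof be_len);
    frame_sz = ntohl(be_len);
    if (frame_sz < HDR_SZ || frame_sz > cap)
        return -1;

    /* Read the remainder of the frame right behind the header. */
    rest = frame_sz - HDR_SZ;
    if (rest > 0) {
        n = recv(fd, buf + HDR_SZ, rest, MSG_WAITALL);
        if (n < 0 || (size_t)n != rest)
            return -1;
    }
    return (ssize_t)frame_sz;
}
```

Any return other than the expected length is treated as a failed frame, so the caller can drop the connection and reconnect.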

bigearsbilly 02-20-2013 12:25 PM

Quote:

Originally Posted by NevemTeve (Post 4895927)
-- the handling of the 'returned_len < requested_len' case might be wrong.

(A note: this could be easily implemented without MSG_PEEK)

There is no such case: MSG_WAITALL ensures you get it all.
And not as easily as my way.
IMHO ;-)

NevemTeve 02-20-2013 12:42 PM

From the manual:

This flag requests that the operation block until the full request is satisfied. However, the call may still return less data than requested if a signal is caught, an error or disconnect occurs.

Plus: it makes your code Linux-specific.
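
The short-read case the manual describes can be handled with a small retry loop. This is a sketch only; `recv_full` is a hypothetical helper, not something from the original program. As far as I know, a binary built with -pg gets a periodic SIGPROF from the profiling timer, so a caught signal is one plausible way for a profiled run to see short reads, but the thread does not confirm that this is what happens here.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep calling recv() until exactly len bytes arrived.
 * Returns len on success, a smaller count if the peer closed the
 * connection early, or -1 on a real error (errno set). */
ssize_t recv_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, MSG_WAITALL);
        if (n > 0)
            got += (size_t)n;   /* possibly a short read: loop again */
        else if (n == 0)
            break;              /* orderly shutdown by the peer */
        else if (errno != EINTR)
            return -1;          /* genuine error */
        /* n == -1 with EINTR: interrupted before any data, just retry */
    }
    return (ssize_t)got;
}
```

A caller would compare the result against the requested length and treat anything else as a dead connection.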

bigearsbilly 02-20-2013 01:51 PM

No, MSG_WAITALL and MSG_PEEK are not Linux-specific; I have run it on FreeBSD too.
http://pubs.opengroup.org/onlinepubs...ions/recv.html
They are even mentioned in Stevens' UNP.

I don't care (in this specific case) about disconnects, signals or errors; I elect to simply bomb out and attempt to reconnect.

In normal circumstances it stays on 24 hours a day. I am reading about 500 frames a second,
each being written to file, about 800 files. I am using 3% CPU on a 500 MHz, 512 MB virtual Debian.
So I reckon I can program a bit ;)

I only get weird reads under gprof on Linux; without profiling it's fine.
I think it's likely a static/shared library issue with gprof.


All times are GMT -5.