Hello!
In the process of writing a lightweight media server, I'm investigating ways of reducing context switching and memory copies on Linux.
Here is the kind of code that is consuming a lot of CPU. Basically, it's multicasting data from one TCP socket to many others (thousands).
Code:
n = recv(src_fd, buf, len, 0);      // n may be smaller than len
mangle_my_data(buf, n);
for (i = 0; i < dst_count; i++) {
    send(dst[i], buf, n, 0);        // one syscall and one copy per client
}
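To make the baseline concrete, here is a minimal sketch of that loop with short writes handled. It assumes blocking sockets (non-blocking ones would also need EAGAIN handling), and `send_all` and `fan_out` are hypothetical helper names, not part of any library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send exactly len bytes, retrying on short writes and EINTR.
 * Assumes a blocking socket. Returns 0 on success, -1 on error. */
static int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        /* MSG_NOSIGNAL (Linux): report a dead peer as an error, not SIGPIPE. */
        ssize_t n = send(fd, buf, len, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy one already-mangled chunk to every destination socket.
 * Returns the number of destinations that failed. */
static int fan_out(const int *dst, int dst_count, const char *buf, size_t len)
{
    int failed = 0;

    assert(dst_count >= 0);
    for (int i = 0; i < dst_count; i++)
        if (send_all(dst[i], buf, len) < 0)
            failed++;
    return failed;
}
```

Each client still costs at least one syscall and one kernel copy of the buffer, which is the overhead in question.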
One important point is that the buffer is modified before the data is written out. As far as I understand splice(), that rules it out here: it only moves data between file descriptors (one of which must be a pipe), so the data never passes through a user-space buffer that I could modify.
So this could take advantage of zero-copy syscalls, but I can't see the best way to implement it, and I feel like there could be other useful, less documented features in splice() and tee().
Unfortunately, vectored AIO does not work on sockets...
Currently, my best option is to replace the buffer with a pipe and let tee() feed the clients. It still makes many syscalls, which I hope can be vectored by some means, but it saves memory!
Code:
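n = recv(src_fd, buf, len, 0);
mangle_my_data(buf, n);
write(pipe_fd[1], buf, n);          // write end of the pipe
for (i = 0; i < dst_count; i++) {
    // caveat: tee() only works pipe-to-pipe, so dst[i] cannot be a socket as-is
    tee(pipe_fd[0], dst[i], n, 0);
}
// and ultimately pop the data from the pipe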
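Since tee() requires a pipe on both ends, one way to get this idea working is to tee() into a spare pipe and then splice() from that pipe into each socket. The following is only a sketch under assumptions: blocking descriptors, a chunk that fits in the pipe, and hypothetical helper names (`tee_to_socket`, `fan_out_tee`).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read and discard up to len bytes from a pipe. Returns 0 or -1. */
static int drain(int rd, size_t len)
{
    char trash[4096];

    while (len > 0) {
        ssize_t r = read(rd, trash, len < sizeof trash ? len : sizeof trash);
        if (r <= 0)
            return -1;
        len -= (size_t)r;
    }
    return 0;
}

/* Forward up to len bytes queued in the pipe src_rd to sock without
 * consuming them from src_rd. scratch[] is a spare, empty pipe.
 * Returns the number of bytes forwarded, or -1 on error. */
static ssize_t tee_to_socket(int src_rd, const int scratch[2], int sock, size_t len)
{
    /* tee() duplicates the pipe buffers by reference; src_rd keeps its data. */
    ssize_t queued = tee(src_rd, scratch[1], len, 0);
    if (queued < 0)
        return -1;

    size_t left = (size_t)queued;
    while (left > 0) {
        /* splice() moves the duplicated buffers from the pipe to the socket. */
        ssize_t n = splice(scratch[0], NULL, sock, NULL, left, 0);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {
            drain(scratch[0], left);   /* keep scratch empty for the next client */
            return -1;
        }
        left -= (size_t)n;
    }
    return queued;
}

/* Queue one mangled chunk in the pipe src[], forward it to every client,
 * then pop it. Returns the number of failed clients, or -1 on pipe error. */
static int fan_out_tee(const int src[2], const int scratch[2],
                       const int *dst, int dst_count,
                       const char *buf, size_t len)
{
    int failed = 0;

    assert(dst_count >= 0);
    /* Refuse chunks larger than the pipe, which would block write() forever. */
    int cap = fcntl(src[1], F_GETPIPE_SZ);
    if (cap < 0 || len > (size_t)cap)
        return -1;
    if (write(src[1], buf, len) != (ssize_t)len)
        return -1;

    for (int i = 0; i < dst_count; i++)
        if (tee_to_socket(src[0], scratch, dst[i], len) != (ssize_t)len)
            failed++;

    /* Every client has been served: pop the chunk from the source pipe. */
    if (drain(src[0], len) < 0)
        return -1;
    return failed;
}
```

This avoids re-copying the user buffer for each client, but it costs two syscalls per client (tee() plus splice()) instead of one send(), so it does not address the syscall count.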
So do you guys know the best way to write the same data to 10k sockets using zero-copy and vectored I/O?
Thanks