LinuxQuestions.org - tcpdump - how to get the PID in the communication?

tcpdump - how to get the PID in the communication?

Is there a way with tcpdump or similar tools to get the process ID that sent or received data? Alternatively, is there a way to have the packets filtered by the process ID(s) involved (e.g. so that I could start a separate tcpdump for each PID whose network traffic I am interested in watching)?

Quote:

Originally Posted by Skaperen (Post 4526960)

While this and this are posts from other threads on other (related) topics, they may hold clues for you, depending on what exactly you're aiming at. And that's unclear. Explain verbosely?

I have several processes (typically 6 to 12 of them) running on each of a few systems. I don't have buildable source code, or else I would modify them. They make network connections periodically. What I want to do is track what kinds of connections they are making, as well as their DNS queries, and work out which connections are the result of DNS queries and which are not (e.g. IP addresses probably obtained from other connections). The traffic is to be observed, not altered. The end result is a timestamped list of DNS queries, DNS answers, connections made (each marked as having or not having a corresponding DNS query), connections closed, and the volume of traffic per connection. A system-wide tcpdump would be meaningful only if one instance were running and nothing else on the system used the network. But that's not going to be possible. And virtual machines don't work for this (they just didn't run).
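The correlation step described above could be sketched roughly as follows. This is only an illustration in Python, assuming the DNS answers and the connection attempts have already been pulled out of a capture into timestamped records. The names DnsAnswer, Connection and correlate, and the 300-second max_age window, are hypothetical and do not come from any tool mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class DnsAnswer:
    ts: float      # capture timestamp of the DNS answer
    name: str      # queried host name
    ip: str        # address returned in the answer


@dataclass
class Connection:
    ts: float      # capture timestamp of the connection attempt
    dst_ip: str
    dst_port: int


def correlate(answers, connections, max_age=300.0):
    """Pair each connection with the most recent earlier DNS answer
    for its destination address, or None if there is no such answer.

    max_age is an assumed cut-off (seconds) after which an answer is
    no longer considered the cause of a connection.
    """
    by_ip = {}
    for a in sorted(answers, key=lambda a: a.ts):
        by_ip.setdefault(a.ip, []).append(a)

    result = []
    for c in sorted(connections, key=lambda c: c.ts):
        name = None
        for a in by_ip.get(c.dst_ip, []):
            # Answers are in time order, so the last match wins.
            if a.ts <= c.ts and c.ts - a.ts <= max_age:
                name = a.name
        result.append((c, name))
    return result
```

A connection whose second element comes back as None is one with no corresponding DNS query in the capture. Note that this works on a system-wide capture only; it does nothing about telling one process's queries from another's.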

The strace program may help. I think it may be OK to strace these because I did that once, though I am not 100% sure it will do everything right. Right now, I just want to see whether any PID association can be done, so that I'm not mixing up one process's DNS queries with another's. I think checking the socket tables (like netstat does) fast enough could make that association. But that has to be done at the instant of packet capture, not later. So for it to be meaningful, I think it would have to be integrated into tcpdump. If this isn't done until after the DNS answer arrives, the socket would already be closed and there would be no clue as to which process made the query.
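For what it's worth, the socket-table check mentioned above can be sketched in Python as below. It assumes a Linux /proc layout, IPv4 only, a little-endian host (the address decoding depends on that), and enough privilege to read other processes' fd directories. The function names are made up for illustration.

```python
import os
import socket
import struct


def _decode(hex_addr):
    """Turn '0100007F:0035' into ('127.0.0.1', 53).

    Assumes a little-endian host, where the kernel prints the
    address bytes reversed.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net(text):
    """Parse the text of /proc/net/tcp or /proc/net/udp into
    {inode: (local_endpoint, remote_endpoint)}."""
    table = {}
    for line in text.splitlines()[1:]:      # first line is the header
        fields = line.split()
        if len(fields) < 10:
            continue
        table[fields[9]] = (_decode(fields[1]), _decode(fields[2]))
    return table


def socket_inode_owners():
    """Scan /proc/<pid>/fd and return {inode: {pid, ...}} for sockets."""
    owners = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = os.path.join("/proc", pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:                     # process gone or no permission
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target.startswith("socket:["):
                owners.setdefault(target[8:-1], set()).add(int(pid))
    return owners
```

Joining the two dictionaries on the inode gives endpoint-to-PID pairs. It is a best effort only: a short-lived UDP socket used for a single DNS query can be closed before the scan gets to it, which is exactly the timing problem described above.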

It would help to know if 0) these processes run under a single or a set of UIDs so you can narrow down what to filter for (-m owner) and redirect and store packets and 1) if the distribution you run can run the auditd service because that would give you access to logging socket calls. IMO strace will give you too much unless you filter out certain sets of syscalls and then still you would have to have a pcap to correlate it with.

Quote:

Originally Posted by unSpawn (Post 4527605)

They are under the same UID and GID. If tcpdump can filter by UID, I'll look into running them under separate UIDs. I don't think there is a real need to keep them the same other than maintenance convenience.

Quote:

Originally Posted by Skaperen (Post 4527888)

If tcpdump can filter by UID

A process UID is not a concept that remote hosts, the network layer or the packet buffer concern themselves with: it is a property of the local kernel. With a Netfilter "owner" match, though, you should be able to send only the packets belonging to a given GID or UID to a queue for storage, and whatever massaging and correlation needs to be done should be done after storage. Efficiency-wise, thanks for responding to half my question.
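As a rough sketch of the "owner" match idea, the Python below only builds the command lines; it does not run them. It assumes the NFLOG target is available (the later post in this thread uses ULOG, so NFLOG is a swapped-in choice here) and that tcpdump is built against a libpcap with nflog support. The UID 1001, group number 5, the output file name and the helper names are all hypothetical.

```python
import shlex


def owner_log_rule(uid, group, chain="OUTPUT"):
    """iptables rule that copies packets sent by `uid` to an NFLOG group.

    NFLOG is a non-terminating target, so the packets themselves
    continue through the rules unaltered.
    """
    return [
        "iptables", "-A", chain,
        "-m", "owner", "--uid-owner", str(uid),
        "-j", "NFLOG", "--nflog-group", str(group),
    ]


def capture_command(group, outfile="uid.pcap"):
    """tcpdump invocation that reads the NFLOG group and stores it."""
    return ["tcpdump", "-i", f"nflog:{group}", "-w", outfile]


if __name__ == "__main__":
    # Print the commands for inspection; running them needs root.
    print(shlex.join(owner_log_rule(1001, 5)))
    print(shlex.join(capture_command(5)))
```

The owner match only sees locally generated packets, so a rule like this stores the outgoing side only; replies would still have to be matched up afterwards from a separate capture.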

Quote:

Originally Posted by unSpawn (Post 4527899)

Yes, only the kernel knows what the UID is. But I'm doing the tcpdump on the same host the programs are running on. So it should be possible to collect the UID data and associate it with the packets. It might not be easy, because the packet capture might take place too far from the point where the packet was associated with a socket. Still, a program could make a best effort to look for a matching socket and see which PID owns it. I need the PID, not the UID, but if the UID can be a means to get to the PID some other way, fine.

I still don't get this suggestion of Netfilter. I don't want to filter any packets. But does netfilter also provide a means to duplicate the packets and send the dups over to somewhere else where they can be captured? All of the traffic between every process and the interfaces the hosts are using must be unaltered. Can you be more specific about netfilter? Because my understanding of netfilter doesn't include any capture capability. Does it have that?

I only see the ability to match UID, GID, PID for OUTPUT packets in iptables. So that doesn't appear to be a complete solution. Packets coming IN are of interest, especially the DNS ones, which provide the answers to queries the process sent out. This could be handled by some means of correlation. Netfilter appears to lack a hook at the right place to do this very easily (e.g. after a packet has been narrowed down to a specific socket). I'm still lost on how we get any of the UID or PID info over to tcpdump if it doesn't have its own means to do that. Can netfilter generate new packets to provide supplementary info?

Traffic and Process Id correlation with audit and ULOG on IA-32 Centos-5.7

What you posted about was addressed already in the links I provided. Besides, I never put forward Netfilter as a complete solution. Best not interpret but actually read what it says. Luckily I just finished "Traffic and Process Id correlation with audit and ULOG on IA-32 Centos-5.7", so I don't have to explain things further, and I suggest you try it before you post any further questions.