ifconfig reports packet drop
Hi, I'm seeing packet drops on the receiving side of a machine. Looking at a guide here, I have increased the RX ring buffer size. However, even then it's still reporting packet drops.
Two questions: 1) What do I do to troubleshoot or solve this? 2) The sender, on the other hand, isn't reporting any packet drops. Shouldn't the drops reported by the recipient also be reported by the sender as dropped sent packets? Code:
[~] # ifconfig |
Routers will drop packets big time if they have no route for them. I'm seeing no errors, no overruns. So it looks like valid packets are going nowhere. Are you being attacked? Is there a firewall busy? Remember things like nmap scanning your open ports will send one packet per port number. Has anyone reported data loss?
Once you get answers to those questions (usually in logs, but you might find your logging inadequate for your curiosity) you will be able to tell us. |
Two thoughts:
* Aren't ifconfig "dropped" packets referring to dropped frames on the link level? (Routing is higher up -- you can have, for example, loss of ping packets without increasing the drop count.) So, I would be inclined to look at the congestion issue, and then local ethernet cabling, cards, and drivers. I noticed this interesting article: http://datacenteroverlords.com/2013/...t-or-pause-it/
* I'd personally avoid enlarging network buffers... this attempt to mask the problem can cause more problems by confusing tcp flow control. |
Hi, thanks for the answer. Here are some additional details ... The connection is PC -> Firewall -> NAS and the ifconfig was captured on the NAS while a transfer was happening from the PC to the NAS. Both links are 1Gbps connections and the whole thing is on a private network so no attacks here and no data loss has occurred. If the file is being transferred, just a lot slower than expected, does the routing issue still apply?
I checked the /var/log/messages log on the NAS but it's empty. Quote:
|
Stateless is probably right about dropped packets being at link level --> congestion or other link issues.
Divide and conquer. There are a number of situations that give rise to lag: each rule in the firewall, load on the PCs, or some part of the network seeing a comms error and dropping back to 10 or 100 Mbit/s. As that's a 3-machine link, you should be able to check which one is seeing lost packets. Have you tried a transfer the other way on that link? |
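A quick way to see whether a Linux box has fallen back to 10 or 100 Mbit/s is to read the negotiated speed from sysfs. This is only a sketch: the interface name eth0 is a placeholder, and the second argument is a made-up override so the function can be pointed at a test directory instead of /sys/class/net.

```sh
# link_info IFACE [SYSFS_ROOT]
# Prints the negotiated speed (in Mbit/s) and duplex mode of IFACE.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys/class/net.
link_info() (
    iface=$1
    root=${2:-/sys/class/net}
    speed=$(cat "$root/$iface/speed")
    duplex=$(cat "$root/$iface/duplex")
    echo "$iface: ${speed} Mbit/s, ${duplex} duplex"
)

# Example (assumes the interface is called eth0):
#   link_info eth0
```

Run it on each Linux hop in the chain. A 1Gbps link should report 1000; anything lower points at cabling or negotiation on that segment.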
NAS access can generate bursty traffic that keeps one CPU core very busy, even on a 1G link.
|
Quote:
Assuming there are dropped packets on the NAS, the PC should be doing re-transmits? If yes, how can I check on the PC end to verify? |
To check for retransmits, send data and check the integrity with diff, sha1sum or md5sum. I would start with ifconfig, note dropped packets, make the transfer, run ifconfig again, and check data integrity.
Congestion you will find with tcpdump, or wireshark. Both are packet sniffers (wireshark needs wine iirc). Both are messy things to use, but they form large logs and you can check these for errors. You are doing stuff that requires a good deal of network knowledge, and reading is in order if you are going to solve anything. Google is your friend. |
3 Attachment(s)
Quote:
I ran ifconfig and noted the drop count, then kept running ifconfig during the transfer and saw the count increasing. Quote:
You can see the following from the attached screenshots:
* Windows Explorer reported 10.9 MByte/s
* Wireshark reported 71.1 Mbit/s
* Wireshark reported no packet drops |
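To compare the two throughput figures they need to be in the same unit. A small sketch, assuming 1 MByte means 10^6 bytes (if Explorer means 2^20 bytes the result is a little higher, roughly 91 Mbit/s); the function name is made up for illustration.

```sh
# mbyte_to_mbit VALUE
# Converts a rate in MByte/s to Mbit/s (assumes 1 MByte = 10^6 bytes).
mbyte_to_mbit() {
    awk -v mb="$1" 'BEGIN { printf "%.1f\n", mb * 8 }'
}

mbyte_to_mbit 10.9   # prints 87.2
```

Either way, both numbers are far below what a 1Gbps link can carry and close to what a 100 Mbit/s link would give, so the negotiated link speed on each hop is worth a look.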
Huh, that was actually dumb of me. Wireshark only sees received packets. It would love to see the ones that don't make it, but can't.
|
The dropped packets in ifconfig happen when you have a connectivity problem with the cable. Every time I had this problem it was an imperfect connection at the RJ-45, or a cable fault.
You should check this first. There could be CRC (cyclic redundancy check) errors as well. Try to find out if there are CRC errors. Hope it helps. |
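On Linux, the standard per-interface error counters, including CRC errors, can be read from sysfs. A minimal sketch, assuming the interface is called eth0 (a placeholder); the second argument is a made-up override for testing. Many drivers expose more detailed counters through ethtool -S, but those names vary by driver, so they are not used here.

```sh
# rx_error_counters IFACE [SYSFS_ROOT]
# Prints the receive error counters the kernel keeps for IFACE.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys/class/net.
rx_error_counters() (
    iface=$1
    stats=${2:-/sys/class/net}/$iface/statistics
    for name in rx_errors rx_crc_errors rx_frame_errors rx_dropped; do
        echo "$name=$(cat "$stats/$name")"
    done
)

# Example (assumes the interface is called eth0):
#   rx_error_counters eth0
```

If rx_crc_errors stays at zero while rx_dropped keeps climbing, a bad cable or connector becomes a less likely explanation.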
Check the following.
Beginning with kernel 2.6.37, the meaning of the dropped packet count has changed. Before, dropped packets were most likely due to an error. Now, the rx_dropped counter shows statistics for frames dropped because of:
* Softnet backlog full
* Bad / unintended VLAN tags
* Unknown / unregistered protocols
* IPv6 frames when the server is not configured for IPv6
If any frames meet those conditions, they are dropped before the protocol stack and the rx_dropped counter is incremented. Care should be taken to confirm that frames are not being legitimately dropped. A quick way to test this (WARNING: this test does not work for bonding interfaces) is to start a packet capture:
host:~# tcpdump
and then watch the rx_dropped counter. If it stops incrementing while tcpdump is running, then it is more than likely showing drops because of the reasons listed earlier. If frames continue to be dropped while running tcpdump, investigation should take place to determine the root cause. |
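The tcpdump test can be made a bit more systematic by measuring how much rx_dropped grows over a fixed interval, once without and once with tcpdump running in another terminal. A rough sketch, assuming the interface is called eth0 (a placeholder); the third argument is a made-up override for testing.

```sh
# rx_dropped_delta IFACE [SECONDS] [SYSFS_ROOT]
# Prints how many frames were counted as dropped on IFACE over SECONDS seconds.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys/class/net.
rx_dropped_delta() (
    iface=$1
    seconds=${2:-10}
    counter=${3:-/sys/class/net}/$iface/statistics/rx_dropped
    before=$(cat "$counter")
    sleep "$seconds"
    after=$(cat "$counter")
    echo $((after - before))
)

# Example (assumes the interface is called eth0, sampled for 30 seconds):
#   rx_dropped_delta eth0 30
```

A non-zero result during a transfer that falls to zero once tcpdump is capturing matches the behaviour described above.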
Quote:
Also what is Softnet backlog full? |
tcpdump changes interface settings, such as enabling promiscuous mode. In promiscuous mode, some checks are disabled.
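On the softnet backlog question: as far as I know, each CPU has an input queue for received packets, capped by the net.core.netdev_max_backlog sysctl, and packets that arrive while that queue is full are dropped. The kernel exposes per-CPU counters in /proc/net/softnet_stat as hexadecimal columns, one line per CPU, where the first column is packets processed and the second is packets dropped because the backlog was full. A small sketch to print them in decimal (the function name and the optional file argument are made up for illustration):

```sh
# softnet_drops [FILE]
# Prints processed and dropped packet counts per CPU in decimal.
# FILE defaults to /proc/net/softnet_stat; the override exists only for testing.
softnet_drops() (
    file=${1:-/proc/net/softnet_stat}
    cpu=0
    # Column 1 = processed, column 2 = dropped (backlog full); the rest is ignored.
    while read -r processed dropped rest; do
        echo "cpu$cpu processed=$(printf '%d' "0x$processed") dropped=$(printf '%d' "0x$dropped")"
        cpu=$((cpu + 1))
    done < "$file"
)

# Example:
#   softnet_drops
```

If the dropped column stays at zero on every line while rx_dropped grows, the backlog is probably not the cause and one of the other listed reasons is more likely.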
|
Hi, okay, got it. I have done the test and watched the counter via ifconfig. As you suggested, no additional RX packet drops occurred. So it looks like it is due to one of the following reasons? So does this mean it's a network error or what? How can I determine what is causing this? And where can I read more about this change in behavior? I'd imagine a lot of network monitoring tools would suddenly report packet drops out of nowhere after a Linux OS upgrade, wouldn't they?
* Softnet backlog full
* Bad / Unintended VLAN tags
* Unknown / Unregistered protocols
* IPv6 frames when the server is not configured for IPv6
1) Started iperf in server mode on the NAS. Code:
[~] # /opt/bin/iperf -s -m
Code:
[~] # /opt/sbin/tcpdump src 192.168.60.2 -w tcpdump.log
Code:
RX packets:1066620 errors:0 dropped:71253 overruns:0 frame:0
Code:
iperf.exe -c 192.168.60.5
Code:
[~] # /opt/sbin/tcpdump src 192.168.60.2 -r tcpdump.log
Code:
RX packets:1106075 errors:0 dropped:71253 overruns:0 frame:0 |
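The before/after comparison of the two RX lines can be scripted instead of read by eye. A minimal sketch that pulls a named counter out of an old-style ifconfig RX line and prints the differences; the helper name rx_field is made up, and the two sample lines are the ones posted above.

```sh
# rx_field NAME LINE
# Extracts the numeric value following "NAME:" from an ifconfig RX line.
rx_field() {
    printf '%s\n' "$2" | sed -n "s/.*$1:\([0-9][0-9]*\).*/\1/p"
}

before='RX packets:1066620 errors:0 dropped:71253 overruns:0 frame:0'
after='RX packets:1106075 errors:0 dropped:71253 overruns:0 frame:0'

echo "packets received during test: $(( $(rx_field packets "$after") - $(rx_field packets "$before") ))"
echo "packets dropped during test: $(( $(rx_field dropped "$after") - $(rx_field dropped "$before") ))"
```

For the figures above this gives 39455 packets received and 0 dropped while tcpdump was capturing.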