LinuxQuestions.org - eth0: Memory squeeze, dropping packet

eth0: Memory squeeze, dropping packet

Hi,

I installed Debian testing on my old Pentium 166. It has a 20 GB disk that serves only for backups. I compiled the latest 2.6.7 kernel for it and use NFS to transfer files to this disk (ReiserFS). Now the problem: when I copy files to this machine over a crossover cable (so there's a 100 Mbit connection), I get these weird errors on the console and in the dmesg log:

eth0: Memory squeeze, dropping packet.
nfsd: page allocation failure. order:0, mode:0x20
[<c0129d83>] __alloc_pages+0x233/0x320
[<c0129e8f>] __get_free_pages+0x1f/0x40
[<c012cd3f>] kmem_getpages+0x1f/0xd0
[<c012d779>] cache_grow+0x79/0x1c0
[<c012da0c>] cache_alloc_refill+0x14c/0x1f0
[<c012dedf>] __kmalloc+0x5f/0x70
[<c032148d>] alloc_skb+0x3d/0xe0
[<c02c7879>] rtl8139_rx+0xc9/0x2d0
[<c02c7c28>] rtl8139_poll+0x38/0xc0
[<c0325e5f>] net_rx_action+0x5f/0xf0
[<c0115763>] __do_softirq+0x83/0x90
[<c0115796>] do_softirq+0x26/0x30
[<c0106225>] do_IRQ+0xc5/0xe0
[<c0104a08>] common_interrupt+0x18/0x20
[<c01ddbb0>] nfsd_open+0x0/0x170
[<c01de578>] nfsd_commit+0x68/0xb0
[<c02c7c28>] rtl8139_poll+0x38/0xc0
[<c0115763>] __do_softirq+0x83/0x90
[<c0106225>] do_IRQ+0xc5/0xe0
[<c0104a08>] common_interrupt+0x18/0x20
[<c01e58e1>] nfsd3_proc_commit+0x81/0xd0
[<c01dab67>] nfsd_dispatch+0xb7/0x190
[<c038d75e>] svc_process+0x4ce/0x630
[<c01da93f>] nfsd+0x16f/0x2e0
[<c01da7d0>] nfsd+0x0/0x2e0
[<c01020a5>] kernel_thread_helper+0x5/0x10
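For readers unfamiliar with the "page allocation failure" line: an order-N request asks the kernel for 2^N contiguous pages, and a page on 32-bit x86 is 4 KiB, so order:0 means that even a single 4 KiB page could not be found at that moment. The following is a minimal sketch that pulls those fields out of the log line; the function name `parse_alloc_failure` is made up for illustration and is not part of any kernel tool.

```python
import re

PAGE_SIZE = 4096  # bytes per page on 32-bit x86


def parse_alloc_failure(line):
    """Return (process, order, mode, bytes_requested), or None if the line does not match."""
    m = re.search(
        r"(\S+): page allocation failure\. order:(\d+), mode:(0x[0-9a-fA-F]+)",
        line,
    )
    if m is None:
        return None
    order = int(m.group(2))
    mode = int(m.group(3), 16)
    # An order-N allocation is 2**N contiguous pages.
    return m.group(1), order, mode, (1 << order) * PAGE_SIZE


print(parse_alloc_failure("nfsd: page allocation failure. order:0, mode:0x20"))
```

The mode value is returned as a plain integer; decoding it into allocation flags depends on the kernel version and is left out here.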

The copy does succeed, so there doesn't seem to be a problem, but since I want to trust all my data to this machine I don't like the look of these errors. I was also wondering what speed (other than the theoretical) I can expect from this config. Using nfsmount with optimal (= found by trial and error) wsize and rsize of 32768, I transfer a 256 MB file in 2 minutes and 8 seconds. Shouldn't this be a lot faster (theoretically about 21 seconds)?
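The "about 21 secs" figure can be checked with a few lines of arithmetic. This sketch assumes the 256 MB file is 256 MiB and that the link carries a raw 100,000,000 bits per second with no protocol overhead, so the ideal time is a lower bound rather than something NFS would actually reach. The function names are illustrative only.

```python
def ideal_transfer_seconds(size_mib, link_mbit_per_s):
    """Time to push size_mib over the wire at the raw link rate (no overhead)."""
    bits = size_mib * 1024 * 1024 * 8
    return bits / (link_mbit_per_s * 1_000_000)


def observed_mib_per_s(size_mib, seconds):
    """Average throughput for a transfer that took `seconds`."""
    return size_mib / seconds


ideal = ideal_transfer_seconds(256, 100)        # 100 Mbit link
observed = observed_mib_per_s(256, 2 * 60 + 8)  # 2 min 8 s as reported
print(f"ideal: {ideal:.1f} s, observed: {observed:.2f} MiB/s")
```

The ideal time comes out at roughly 21.5 seconds, while the measured transfer works out to 2 MiB/s, about a sixth of what the link could carry.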

thank you for your help,

Steven

You are running out of memory. How much is in the machine? Can you stop anything that is running? Is swap configured? You may have to run fewer nfsd processes.

Remember that you are still limited by the speed of your hard drives as well. How fast is the file copy locally?

It does not have a lot of memory to spare, only 32 MB, but it has loads of swap (512 MB). I had hoped to reduce memory usage by not running X, but I guess 32 MB is just too little. I seem to have 8 nfsd processes running. Can you tell me how I can reduce that number (and what the consequences of this reduction are)?

I tried the local copy and you were absolutely right. The dd if=/dev/zero ... took 2 min and 21 secs to finish. That's mighty disappointing
considering the hard drive brochure says: 'Fast data transfer speed, up to 100 MB/sec' (not sure if those are bits or bytes).
The motherboard can't dance the ATA/100, but I can hardly believe ATA/66 would cause a speed reduction of about 670 %.

anyway, thank you for your advice,

Steven

Swap won't help here, because nfsd is a kernel mode program, and kernel memory can't be swapped.

You are going to have to take a look at your kernel and see what you can take out. No extra modules, no extra filesystems. No tmpfs or anything. 32 megs should be possible, but I haven't tried since my 486 died :) What other programs are running?

I believe the number of nfsd instances controls how many concurrent I/O operations can be in flight. If you're not serving a bunch of clients, you should be able to reduce it. Try reducing it to 4 or 2 and see if anything changes. You'll have to edit the init script that starts the nfsd processes. Its argument is how many threads to start. Check /etc/init.d/nfsd or something akin to that.

Less than 2 MB/s, eh? Your drive is in PIO mode, not UDMA. Check your BIOS to make sure it is setting up Ultra DMA for that drive. 'dmesg|grep ide0' on a freshly booted system ought to tell you if it is configured for DMA. 'hdparm /dev/hda' will show you the settings for your IDE primary/master.
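If you want to check the DMA setting from a script rather than by eye, something like the following could work. It is only a sketch: it assumes that 'hdparm /dev/hda' prints a line of the form "using_dma = 1 (on)" for IDE drives, which may differ between hdparm versions, and the helper name `dma_enabled` is made up for this example.

```python
import re


def dma_enabled(hdparm_output):
    """Return True/False from a 'using_dma = N' line, or None if no such line exists."""
    m = re.search(r"^\s*using_dma\s*=\s*(\d+)", hdparm_output, re.MULTILINE)
    if m is None:
        return None  # the setting was not reported at all
    return m.group(1) != "0"
```

Feed it the text that hdparm prints (for instance captured with `subprocess.run(["hdparm", "/dev/hda"], capture_output=True, text=True).stdout`, run as root). A result of False would match the PIO-mode diagnosis above.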

Good Luck,
chris

edit: Good Luck, god's got nothing to do with this. It ain't that bad, yet.

Well, I followed your advice. I tried activating DMA in the BIOS but that wasn't possible (only PIO modes available). Compiling the generic chipset and bus master support into the kernel enabled the maximum for this motherboard, which is DMA (not UDMA) mode 2. hdparm -t gave me 10 MB/s; needless to say, there was much joy in my heart. Immediately testing the local transfer using dd gave me a lousy 1.8 MB/sec, end of joy. I took the opportunity to slim down the kernel as much as possible, and using 2 nfsd threads I can avoid the error (with 4 the error remains). Unfortunately it now takes 2 min 49 secs to transfer 256 MB over the 100 Mbit connection. Do you have any other suggestions that could speed things up a little more? Here is the list of processes running on the machine; maybe something can be closed.

thank you for your help,

root 1 0 0 19:58 ? 00:00:00 init [2]
root 2 1 0 19:58 ? 00:00:00 [ksoftirqd/0]
root 3 1 0 19:58 ? 00:00:00 [events/0]
root 4 3 0 19:58 ? 00:00:00 [khelper]
root 19 3 0 19:58 ? 00:00:00 [kblockd/0]
root 38 3 0 19:58 ? 00:00:00 [aio/0]
root 37 1 0 19:58 ? 00:00:01 [kswapd0]
root 145 1 0 19:58 ? 00:00:00 [kseriod]
root 291 3 0 19:58 ? 00:00:00 [reiserfs/0]
daemon 356 1 0 19:58 ? 00:00:00 /sbin/portmap
root 477 1 0 19:58 ? 00:00:00 /sbin/syslogd
root 480 1 0 19:58 ? 00:00:02 /sbin/klogd
root 489 1 0 19:58 ttyS0 00:00:00 /usr/sbin/gpm -m /dev/ttyS0 -t ms -Rms3
root 494 1 0 19:58 ? 00:00:00 /usr/sbin/inetd
root 513 1 0 19:58 ? 00:00:00 /sbin/rpc.statd
root 517 1 0 19:58 ? 00:00:00 /usr/sbin/cron
root 525 1 0 19:58 tty1 00:00:00 -bash
root 527 1 0 19:58 tty2 00:00:00 -bash
root 530 1 0 19:58 tty3 00:00:00 /sbin/getty 38400 tty3
root 532 1 0 19:58 tty4 00:00:00 /sbin/getty 38400 tty4
root 534 1 0 19:58 tty5 00:00:00 /sbin/getty 38400 tty5
root 536 1 0 19:58 tty6 00:00:00 /sbin/getty 38400 tty6
root 550 527 1 20:02 tty2 00:00:18 top
root 577 1 4 20:10 ? 00:00:36 [nfsd]
root 578 1 3 20:10 ? 00:00:34 [nfsd]
root 580 1 0 20:10 ? 00:00:00 [lockd]
root 581 1 0 20:10 ? 00:00:00 [rpciod]
root 584 1 0 20:10 ? 00:00:00 /usr/sbin/rpc.mountd
root 595 3 0 20:13 ? 00:00:00 [pdflush]
root 599 3 0 20:14 ? 00:00:00 [pdflush]
telnetd 605 494 0 20:16 ? 00:00:00 in.telnetd: RedMaster
smaenho 606 605 0 20:16 pts/1 00:00:00 -bash
smaenho 611 606 0 20:25 pts/1 00:00:00 ps -ef

hdparm -t gave me 10 MB/s ... local transfer using dd gave me a lousy 1.8 MB/sec

I'm not sure what's going on here. If anyone has experience with low mem configs, please pipe in.

hdparm -t /dev/hda says 10 and dd says 2, hmm... Is that reading, writing or both? Usually dd would be at least 50% of the hdparm numbers. Maybe hdparm isn't flushing the buffer cache properly. Try reading a big file off one partition, and then testing another? Try hdparm -T -t /dev/hda (to test the buffer cache as well).

Are there any other errors in your log?
What filesystem are you using? Can you try it with ext2?

What's the memory situation looking like now (free)?

While you are running dd, run vmstat alongside it, e.g. "vmstat 1 > vmstat.txt & dd if=/dev/zero of=zerofile bs=4k count=16k" to test write performance. This will show swapping activity and how many blocks are going to disk (plus CPU usage), printing a line every second; use "vmstat 10" instead for ten-second averages. There shouldn't be any swapping here.
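For anyone who prefers to time the write from a script, here is a rough Python counterpart to the dd write test. It is a sketch and not a replacement for dd: the helper name `timed_write` is made up, and unlike a plain dd it calls fsync before stopping the clock, so the figure includes the time to flush the data to disk. The demo at the bottom writes only 1 MiB to a temporary directory to show the call; a real test would use a file larger than RAM on the filesystem under test.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, block_size=4096):
    """Write total_bytes of zeros to path in block_size chunks; return MiB/s."""
    block = bytes(block_size)  # a block of zero bytes, like /dev/zero
    start = time.perf_counter()
    with open(path, "wb") as f:
        written = 0
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # wait until the data has left the page cache
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (total_bytes / (1024 * 1024)) / elapsed


# Tiny demonstration run; far too small to say anything about the disk.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "zerofile")
    rate = timed_write(target, 1024 * 1024)
    size_on_disk = os.path.getsize(target)

print(f"wrote {size_on_disk} bytes at {rate:.1f} MiB/s")
```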

Your dd test only needs to be bigger than RAM, so you should be able to just do 32MB-64MB (rather than waiting two and a half minutes). Keeping this rule of thumb for NFS would mean testing files greater than the memory of either machine. (Use vmstat 1 to see the system averages.)
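The "bigger than RAM" rule of thumb translates into a dd block count with a little arithmetic. In this sketch the `factor` parameter (how many times the RAM size to write) is an assumption added for illustration; with 32 MB of RAM, 4k blocks and a factor of 2 it gives the same count as the "bs=4k count=16k" example.

```python
def dd_count(ram_mib, block_kib=4, factor=2):
    """Number of dd blocks so the test file is `factor` times the RAM size."""
    return (ram_mib * factor * 1024) // block_kib


print(dd_count(32))            # 32 MB of RAM, 4k blocks, file twice the size of RAM
print(dd_count(32, factor=1))  # file exactly the size of RAM
```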

Don't worry about network performance until disc performance is better.

I notice that you are using 2.6. Could you try a 2.4 kernel? (This is my equivalent of a MS reboot. I have no idea about the memory or hardware support differences between 2.4 and 2.6.)

What's the intended use of this box?

Hope this helps,
chris

I guess you found the problem: the filesystem. When I use ext2, the dd command gives me 14.37 MB/sec (writing). When I use ext3 I get 9.687 MB/sec, and ReiserFS is the crappiest (in terms of speed) with 1.8 MB/sec. Memory is looking good, by the way: 28780k total, 9452k used, 19328k free; a copy over NFS, on the other hand, uses all available memory. Speed over NFS is now 7.32 MB/sec (using ext2), which I consider acceptable. Unfortunately, worse problems are manifesting themselves.

During a copy I get messages on the console and in dmesg like this:

EXT2-fs error (device hda1): ext2_free_blocks: Freeing blocks not in datazone - block = 2899116032, count = 1
Remounting filesystem read-only
EXT2-fs error (device hda1): ext2_free_blocks: Freeing blocks not in datazone - block = 2902149509, count = 1
Remounting filesystem read-only

The spicy detail here is that I was copying to partition /dev/hda3, while /dev/hda1 is my root partition. I had similar problems with this machine before (the reason why I wanted to use ReiserFS) and used smartctl to monitor the hard drive. When I posted the errors on the smartctl mailing list, the response was that the problem was due to bad memory chips and that the hard drive is OK (it only has 69 hours of uptime). I replaced the memory and the error went away. Now it's back... I tested the current memory chips using memtest86 and no problems were reported. I guess I'm just tumbling into another problem here...
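One thing that can be read off the EXT2-fs errors without any extra tools: the block numbers in them are far too large to exist on this disk. The sketch below compares them against the largest possible block count, taking the whole 20 GB disk as an upper bound for hda1 and trying the usual ext2 block sizes of 1, 2 and 4 KiB (the actual block size of the filesystem is not stated in the thread, so all three are checked). The helper name is illustrative.

```python
def max_block_count(device_bytes, block_size):
    """Upper bound on the number of filesystem blocks a device can hold."""
    return device_bytes // block_size


DISK_BYTES = 20 * 1000**3                # whole 20 GB disk, an upper bound for hda1
reported = [2899116032, 2902149509]      # block numbers from the EXT2-fs errors

for block_size in (1024, 2048, 4096):
    limit = max_block_count(DISK_BYTES, block_size)
    for block in reported:
        # True would mean the block number could exist on this disk
        print(block_size, block, block < limit)
```

Every comparison comes out False: even with 1 KiB blocks the disk holds fewer than 20 million blocks, while the reported numbers are around 2.9 billion. That is consistent with corrupted block pointers rather than a genuine attempt to free real blocks, though it does not by itself say whether the corruption happened in memory or on disk.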