LinuxQuestions.org


Mavman 04-08-2014 01:13 PM

Debian - Networking doesn't work after upgrading RAM to more than 2GB?
 
I'm running 64-bit Debian Squeeze (6.0.8). It had 2x 1GB sticks in it previously, and I tried to upgrade to 2 different sets of 2x 2GB sticks, totaling 8GB RAM. The only difference between the two sets is one has slightly different latency (5x5x5x15 vs 6x6x6x16).

Anyways, if I put all 8GB in, the BIOS detects it, the OS detects it, but networking doesn't work. Through much troubleshooting, I found that regardless of setup, the only time networking actually responds is with no more than 2GB RAM. I can choose any individual 2GB RAM DIMM and it works fine. With all 8GB RAM, things seem to run fine, but I have no networking. Sometimes I can DHCP an address with more than 2GB, the link auto-negotiates speed/duplex perfectly, but I can't ping anything on my local network. Same thing if I set it statically.

All 8GB of RAM were in a previously known good machine, none bad.

This is driving me crazy. Can anyone tell me why this is?

jefro 04-08-2014 02:57 PM

Put the 8 gig in and run memtest for a day or so.

I get the feeling that maybe the ram is bad or worse, could be phoney.
Could be your board doesn't support it.
Not sure but could be a pae issue.

Not sure why memory would affect network adapter. Does it still show up in lspci/lsusb or such?

cizzi 04-08-2014 03:07 PM

Networking shouldn't have anything to do with how RAM is installed. As the previous poster suggested, try running memtest86 overnight with all the RAM installed; that will tell you if any errors are detected (you'll know by a RED status bar on your screen the next morning). Have you installed the right network drivers for your hardware? lspci/lsusb will tell you what network hardware you are using. Also, some hardware requires firmware from your Linux distribution to work correctly. Do some Google searches on this. What puzzles me is that it works with 2GB or less. Keep me posted.

Mavman 04-08-2014 03:23 PM

Yeah, I know network software is independent of non-network hardware, I'm just totally baffled as to why it won't work with over 2GB. Like I said, all this RAM came out of a known good PC, I can throw it back in that PC and it works fine. I can use any individual 2GB stick and it all works, but if I add any more then it just fails. I wouldn't be surprised if more than networking is actually affected, but most of my services run over networking so trying to test them is irrelevant anyways.

I did consider flashing my BIOS, it's just a bit of a pain to do it because of my physical setup (hooking up monitors & keyboard, shuffling around my little space, etc). But that's probably the best option at this point. I'll give that a shot tonight.

johnsfine 04-08-2014 03:42 PM

The whole thing sounds very unlikely. But within the unlikely, it sounds like a bug in a network driver.

That would be almost plausible if this were some very new or obscure network driver not likely used by enough other people for someone else to have seen the bug already.

What info do you have on exactly what network driver you are using?

I think BIOS is very very unlikely. I would not bother trying something as unlikely as flashing the BIOS.

Some bus issue (such as capacitance) with multiple ram sticks is a possibility, very unlikely to somehow affect just the network, but still worth investigating with memtest as others suggested.

If it is a hardware problem (rather than a driver bug) it would probably go away if the RAM were underclocked (given a slower clock and/or more wait states than it officially needs). Some BIOSes have the flexibility to let you slow things down (a lower clock for the CPU forcing a lower RAM clock, or other methods of slowing the RAM without changing the CPU clock). See what choices you have. If that fixes things, you probably don't want to keep it that way. You would rather fix it right. But it is still diagnostic. If it is fixed by slowing down the RAM, a driver bug is far less likely and a hardware problem more likely. Given what we already know (any 2GB works but more does not), if slowing the RAM does not fix it, that makes a hardware problem much less likely and a driver bug almost certain.
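
To make that reasoning explicit, here is a minimal sketch of the decision as a function. The function name and the returned labels are made up for illustration, and it assumes the symptom already described (any 2GB works, more does not). It is only a restatement of the argument above, not a real diagnostic tool.

```python
def likely_cause(slowing_ram_fixes_it):
    """Rough triage, assuming any 2GB of RAM works but more than 2GB does not.

    slowing_ram_fixes_it: True if underclocking the RAM in the BIOS
    makes networking work again with all the RAM installed.
    """
    if slowing_ram_fixes_it:
        # Timing-sensitive behaviour points at the memory bus or DIMMs.
        return "hardware problem more likely"
    # Timing made no difference, so the size of the address space is the suspect.
    return "driver bug more likely"


print(likely_cause(slowing_ram_fixes_it=False))
```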

Mavman 04-08-2014 04:23 PM

Hmm. That's some food for thought anyways. Prior to the upgrade it was running 2GB RAM in the form of 2x 1GB sticks, so I don't think it's a multiple stick issue.

Forgot to include the lspci entry:
Code:

06:0c.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13)

I had already tried setting the RAM to slower speeds in BIOS. It did detect all 8GB, but still no network.

cizzi 04-08-2014 05:24 PM

Based on Google searches, the driver you need appears to be "sk98lin". Can you confirm this is the driver you are using, either with the lsmod command, by looking in your kernel configuration (.config) file, or via make menuconfig?

Mavman 04-08-2014 05:36 PM

Quote:

Originally Posted by cizzi (Post 5149028)
Based on Google searches, the driver you need appears to be "sk98lin". Can you confirm this is the driver you are using, either with the lsmod command, by looking in your kernel configuration (.config) file, or via make menuconfig?

I did an "lsmod | grep sk98lin" and it didn't return anything. Forgive me, I'm not fluent by any stretch of the imagination with lsmod.

Here's my lsmod output altogether.
Code:

Module                    Size  Used by
iptable_filter            2258  0
ip_tables                13915  1 iptable_filter
x_tables                 12845  1 ip_tables
loop                     11767  0
snd_hda_codec_atihdmi     2251  1
snd_hda_codec_analog     64562  1
radeon                  574380  0
snd_hda_intel            19827  0
ttm                      39986  1 radeon
snd_hda_codec            54308  3 snd_hda_codec_atihdmi,snd_hda_codec_analog,snd_hda_intel
drm_kms_helper           20337  1 radeon
snd_hwdep                 5380  1 snd_hda_codec
drm                     142928  3 radeon,ttm,drm_kms_helper
snd_pcm                  60487  2 snd_hda_intel,snd_hda_codec
snd_timer                15598  1 snd_pcm
i2c_algo_bit              4209  1 radeon
i2c_nforce2               5280  0
snd                      46510  6 snd_hda_codec_analog,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_timer
soundcore                 4598  1 snd
parport_pc               18855  0
psmouse                  49937  0
i2c_core                 15803  5 radeon,drm_kms_helper,drm,i2c_algo_bit,i2c_nforce2
snd_page_alloc            6201  2 snd_hda_intel,snd_pcm
parport                  27954  1 parport_pc
evdev                     7352  2
serio_raw                 3752  0
processor                29903  0
asus_atk0110              7686  0
pcspkr                    1699  0
button                    4650  0
ext3                    106502  1
jbd                      37157  1 ext3
mbcache                   5050  1 ext3
sg                       24053  0
usbhid                   33196  0
hid                      63241  1 usbhid
sd_mod                   29889  3
crc_t10dif                1276  1 sd_mod
sr_mod                   12602  0
cdrom                    29351  1 sr_mod
ohci_hcd                 19279  0
ata_generic               3239  0
fan                       3346  0
sata_nv                  19150  0
pata_amd                  9869  2
skge                     34212  0
floppy                   49087  0
thermal                  11674  0
thermal_sys              11942  3 processor,fan,thermal
ehci_hcd                 31937  0
libata                  133616  3 ata_generic,sata_nv,pata_amd
scsi_mod                126677  4 sg,sd_mod,sr_mod,libata
usbcore                 123175  4 usbhid,ohci_hcd,ehci_hcd
nls_base                  6567  1 usbcore


cizzi 04-08-2014 05:44 PM

In your case your kernel module is loaded correctly. It's called "skge" and I see it in your lsmod output. I was misled by the first driver I mentioned earlier. So right now you are running on 2GB RAM and your Marvell-based networking is working correctly?
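
If you'd rather not eyeball a long lsmod listing, a rough sketch like the one below can pick out the likely NIC driver. It only parses text in the lsmod format; the candidate list (skge, sky2, sk98lin) is my assumption of drivers commonly associated with Marvell/SysKonnect gigabit chips, and the function names are made up for illustration.

```python
# Candidate driver names are an assumption for illustration, not an exhaustive list.
CANDIDATES = ("skge", "sky2", "sk98lin")


def loaded_modules(lsmod_text):
    """Return {module_name: size} parsed from text in `lsmod` output format."""
    modules = {}
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header row.
        if len(fields) < 2 or fields[0] == "Module":
            continue
        modules[fields[0]] = int(fields[1])
    return modules


def find_nic_driver(lsmod_text, candidates=CANDIDATES):
    """Return the candidate driver names that appear as loaded modules."""
    mods = loaded_modules(lsmod_text)
    return [name for name in candidates if name in mods]


# A few rows taken from the lsmod output posted earlier in the thread.
sample = """Module                  Size  Used by
pata_amd                9869  2
skge                  34212  0
floppy                49087  0
"""

print(find_nic_driver(sample))
```

To run it against the live system you could feed it the output of lsmod instead of the sample string.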

johnsfine 04-08-2014 05:53 PM

When I googled skge to discover it was your Marvell ethernet driver (because I hadn't yet seen cizzi's reply) I saw discussions of driver (maybe firmware) bugs. But I didn't really understand those discussions and don't know if they are relevant.

Mavman 04-08-2014 05:54 PM

Quote:

Originally Posted by cizzi (Post 5149039)
In your case your kernel module is loaded correctly. It's called "skge" and I see it in your lsmod output. I was misled by the first driver I mentioned earlier. So right now you are running on 2GB RAM and your Marvell-based networking is working correctly?

Correct.

cizzi 04-08-2014 06:00 PM

Your BIOS flash update idea doesn't sound like a bad option at this point.

johnsfine 04-08-2014 06:09 PM

One item google found was:

https://bugzilla.kernel.org/show_bug.cgi?id=61291

I don't begin to understand the bugzilla bookkeeping details of that message, but I do understand the diff of the C code. It is entirely about dma addressing and exactly the kind of difference one would expect for a bug causing DMA to fail (in just the skge driver) on addresses larger than 2GB.

I'm far from certain, but I think this means there was a bug in skge that fits your described symptoms and it was fixed.
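
To illustrate the general idea (this is not the skge code, and not the patch from that bug report), here is a small sketch of why a DMA addressing limit would only bite once more RAM is installed. The 31-bit limit is an assumption chosen to match the "over 2GB" symptom, and the function name is made up.

```python
GIB = 1 << 30


def fits_dma_mask(addr, length, mask_bits):
    """True if the buffer [addr, addr + length) lies entirely below 2**mask_bits."""
    if length <= 0:
        raise ValueError("length must be positive")
    return addr >= 0 and addr + length - 1 < (1 << mask_bits)


# Hypothetical 2 KiB packet buffers at different physical addresses.
BUF = 2048

# With only 2GB installed, every buffer sits below the 2 GiB line.
print(fits_dma_mask(1 * GIB, BUF, mask_bits=31))   # True

# With 8GB installed, the kernel may hand out buffers above that line,
# which a device (or a buggy driver) limited to 31 bits cannot reach.
print(fits_dma_mask(3 * GIB, BUF, mask_bits=31))   # False
```

A driver normally declares its addressing limit to the kernel so that buffers are placed or bounced accordingly; a bug in that handling could plausibly give symptoms like the ones described, where the link comes up but no traffic gets through.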

I don't know the Linux kernel versioning and terminology well enough to tell you how to determine if your kernel version is before, during, or after that bug was created, found, fixed.
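
For what it's worth, comparing kernel versions is mostly a matter of parsing the release string (what `uname -r` prints) into numbers. A minimal sketch follows; the sample string "2.6.32-5-amd64" is what I'd assume a stock Squeeze amd64 kernel reports, and the threshold (3, 0, 0) is purely hypothetical, since I don't know which release actually contains the fix. Distro kernels also backport fixes, so treat any such comparison as a rough guide only.

```python
import re


def parse_kernel_release(release):
    """Turn a release string like '2.6.32-5-amd64' into a tuple like (2, 6, 32)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError("unrecognised kernel release: %r" % release)
    # A missing third component (e.g. '3.13-1-amd64') is treated as 0.
    return tuple(int(part) if part else 0 for part in m.groups())


def is_older_than(release, threshold):
    """True if the running kernel's version sorts before the threshold tuple."""
    return parse_kernel_release(release) < threshold


# Hypothetical threshold, NOT the real version containing the skge fix.
HYPOTHETICAL_FIXED_IN = (3, 0, 0)

print(parse_kernel_release("2.6.32-5-amd64"))
print(is_older_than("2.6.32-5-amd64", HYPOTHETICAL_FIXED_IN))
```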

But I think a different kernel version (implying a different skge version) is a better shot than flashing the BIOS.

Do you happen to have a liveCD (or DVD or USB) copy of any MUCH newer Linux? See if that can use the network with 8GB ram in that system. That should confirm the issue is a driver bug that is fixed in newer Linux.

cizzi 04-08-2014 06:14 PM

Good searching. Use the latest kernel you can find for your distro, or go to kernel.org, get it there, and compile it yourself if you know how.

Mavman 04-08-2014 06:18 PM

1 Attachment(s)
Quote:

Originally Posted by cizzi (Post 5149049)
Your BIOS flash update idea doesn't sound like a bad option at this point.

Quote:

Originally Posted by johnsfine (Post 5149052)
One item google found was:

https://bugzilla.kernel.org/show_bug.cgi?id=61291

I don't begin to understand the bugzilla bookkeeping details of that message, but I do understand the diff of the C code. It is entirely about dma addressing and exactly the kind of difference one would expect for a bug causing DMA to fail (in just the skge driver) on addresses larger than 2GB.

I'm far from certain, but I think this means there was a bug in skge that fits your described symptoms and it was fixed.

I don't know the Linux kernel versioning and terminology well enough to tell you how to determine if your kernel version is before, during, or after that bug was created, found, fixed.

But I think a different kernel version (implying a different skge version) is a better shot than flashing the BIOS.

Do you happen to have a liveCD (or DVD or USB) copy of any MUCH newer Linux? See if that can use the network with 8GB ram in that system. That should confirm the issue is a driver bug that is fixed in newer Linux.

See attachment.

Quote:

Originally Posted by cizzi (Post 5149060)
Good searching. Use the latest kernel you can find for your distro, or go to kernel.org, get it there, and compile it yourself if you know how.

I'm willing to learn.


All times are GMT -5.