FC7: eth2 & eth3 come up as __tmp# 60% of the time
I have been troubleshooting this problem for hours now, and the best I can come up with is a workaround.
I am trying to understand why 3 out of 5 reboots result in eth2 & eth3 being named something like "__tmp509338517" (and consequently being unusable by my software).
To give you an example, just so you can also get an idea of the networking hardware specs, etc...
Now, I have come across various workarounds that mostly involve renaming the interface or statically defining the MAC address in the ifcfg-eth? file... unfortunately I'm a bit more curious than the rest of the world, and want to identify the actual problem.
From what I can tell, the only place that an interface can get a name like "__tmp509338517" from is in the source for "rename_device" (this assumption could be my first problem). The problem is that I cannot find where, after modprobe (in rc.sysinit), and before ifup, this change can occur.
And to make matters worse, I have (as of yet) been unable to follow the code-path that allows the workaround "put MAC address in ifcfg-eth?" to even work... as it is suggesting that either the __tmp-like name exists for every interface (to begin with) and successfully changes to eth0/eth1 OR it is suggesting that a good eth2/eth3 are renamed to __tmp *because* they didn't statically define the MAC address...
I have (through use of a superfluous number of echoes) determined that prior to calling "/sbin/start_udev" the only interface present is "lo". This is actually completely expected...
Narrowing my focus, I have tracked the interface recognition to the following (part of /sbin/start_udev):
The wait_for_queue() function allows, from what I can tell, the devices to run through the udev rules in whatever order they are queued.
Code:
wait_for_queue() {
    local timeout=${1:-0}
    local ret=0
    if [ $timeout -gt 0 ]; then
        /sbin/udevsettle --timeout=$timeout
    else
        /sbin/udevsettle
    fi
    ret=$?
    if [ $ret -ne 0 ]; then
        echo -n "Wait timeout. Will continue in the background."
    fi
    return $ret;
}
Unfortunately there is the little mystery of udevsettle, and why it is not idempotently naming my Ethernet interfaces. I do notice though (a reference back to my original post) that the "/etc/udev/rules.d/60-net.rules" file specifically references the "/lib/udev/rename_device"... although as of yet, I am not an expert on udev-rules...
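For anyone following along, one quick way to see which rules files hand interfaces to rename_device is to grep the rules directory. This is only a rough sketch: the function name and the optional directory argument are my own additions (the argument is there so it can be pointed at a copy of the rules), and it defaults to the /etc/udev/rules.d path mentioned above.

```shell
#!/bin/sh
# List the udev rules files that mention rename_device.
# $1 (optional, hypothetical convenience): rules directory to search;
#    defaults to /etc/udev/rules.d as named in the post.
rules_using_rename_device() {
    rulesdir=${1:-/etc/udev/rules.d}
    # -l prints only the names of files that contain a match.
    grep -l 'rename_device' "$rulesdir"/*.rules 2>/dev/null
}
```

Calling rules_using_rename_device with no argument should print the path of 60-net.rules on a box like the one described, assuming the rule is still in place.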
Support for F7 was dropped quite a while ago (a couple of years?). You might be better off using a supported version (F10 and F11 right now). The newer versions of the software you are looking at MAY be better able to handle your issues.
Thanks for the input, but I think I would have skipped asking the question had I the option to just upgrade... but anyway
In any case, I think I have determined the problem and, best of all, the reason WHY the solution (putting HWADDR lines in ifcfg files) is appropriate.
First a good run:
1 – The PCI bus sees the two interface cards in the order Broadcom, and then Intel
2 – udev detects, loads drivers, & facilitates the creation of eth0/eth1 for Broadcom (because it got there first) and eth2/eth3 for Intel (because it got there second)
3 – because no other device already held the names eth2/eth3 when those interfaces were detected, the names are intact & everybody is happy
Now the bad run:
1 – The PCI bus sees the two interface cards in the order Intel, and then Broadcom
2 – udev detects, loads drivers, & facilitates the creation of eth0/eth1 for Intel (because it got there first) and eth2/eth3 for Broadcom (because it got there second)
3 – upon creation of the Intel interfaces, the application “rename_device” is unable to match a SysFS Hardware Address on “eth0” to the “eth0” MAC address listed in the ifcfg-eth0 file (Note that the MAC address in that file is actually for the Broadcom device)
4 – assuming that the ifcfg-eth* files are Law (which it does), the “rename_device” application promptly renames the conflicting device to “__tmp*”, allowing the first Broadcom interface to take up position as eth0
5 – repeat 3 & 4 for eth1, changing Intel eth1 to “__tmp*” and allowing Broadcom to take its place as eth1
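The two runs above can be played out with a small toy model. To be clear, this is NOT the rename_device source, just a sketch of the behaviour as I understand it: the kernel hands out eth0, eth1, ... in probe order, and any device whose MAC is claimed by an ifcfg file takes that name, bumping the current holder to a "__tmp" name. The function name, the argument format, and the "__tmp0"-style counter are all made up for illustration.

```shell
#!/bin/sh
# Toy model of the renaming described above; NOT the real rename_device logic.
# $1: MACs in the order the kernel probes them (space-separated); the kernel
#     is assumed to name them eth0, eth1, ... in that order.
# $2: "name=MAC" pairs standing in for HWADDR lines in ifcfg-<name> files.
simulate_rename() {
    awk -v probe="$1" -v cfg="$2" 'BEGIN {
        tmp = 0
        n = split(probe, mac, " ")
        for (i = 1; i <= n; i++) name[i] = "eth" (i - 1)
        m = split(cfg, pair, " ")
        for (j = 1; j <= m; j++) { split(pair[j], kv, "="); want[kv[2]] = kv[1] }
        for (i = 1; i <= n; i++) {
            # No HWADDR claims this MAC, or it already has the right name.
            if (!(mac[i] in want) || want[mac[i]] == name[i]) continue
            # Bump whoever currently holds the wanted name to a temporary name.
            for (k = 1; k <= n; k++)
                if (k != i && name[k] == want[mac[i]]) { name[k] = "__tmp" tmp; tmp++ }
            name[i] = want[mac[i]]
        }
        for (i = 1; i <= n; i++) print mac[i], name[i]
    }'
}
```

With made-up MACs, simulate_rename "I1 I2 B1 B2" "eth0=B1 eth1=B2" (Intel probed first, HWADDR only in ifcfg-eth0/eth1) leaves I1 and I2 as __tmp0 and __tmp1 while B1 and B2 become eth0 and eth1. Adding "eth2=I1 eth3=I2" to the second argument gives every device its intended name regardless of probe order.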
The issue (as far as I can see) is simply a hardware race condition. Also, the “rename_device” code is far from robust (in FC7 at least), and does not attempt to make better names for the “__tmp*” interfaces. From this point of view, the solution of putting the MAC addresses in the ifcfg-eth* files is actually the correct approach because it simply says “regardless of what order your device arrives in on the PCI bus, you should be named X if you have MAC address Y”…
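If it saves anyone some typing, here is a rough sketch for collecting the HWADDR values to paste into each ifcfg-eth* file. It assumes the MAC of each interface is readable from the "address" file under /sys/class/net/<name>; the function name and the optional directory argument are hypothetical, added so it can be tried against a scratch tree. Run it on a boot where the names came up correctly, otherwise the MACs will be paired with the wrong names.

```shell
#!/bin/sh
# Print a suggested HWADDR line for each interface found under a sysfs-style
# directory. $1 (optional, hypothetical): directory to scan, default /sys/class/net.
print_hwaddr_lines() {
    netdir=${1:-/sys/class/net}
    for dev in "$netdir"/*; do
        # Skip entries without a readable MAC address file.
        [ -r "$dev/address" ] || continue
        name=${dev##*/}
        # The loopback interface needs no HWADDR line.
        [ "$name" = "lo" ] && continue
        # Upper-case the hex digits of the MAC address.
        mac=$(tr 'a-f' 'A-F' < "$dev/address")
        echo "ifcfg-$name: HWADDR=$mac"
    done
}
```

Each output line names the ifcfg file and the HWADDR=... line that would go into it; nothing is written to disk.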
given the sheer number of changes from fedora 7 to fedora 11,
even the fedora devs state that a FRESH install is the BEST way
running preupgrade from yum to upgrade fedora 10 to fedora 11 does not always work
so going from 7 to 11 will not work.
if you want to spend a few days straightening out a busted system, then go ahead and try. But don't expect it to run or boot.
The upgrade is out... either because of a really long explanation, or a really short one... let's try the short one.
This is part of a product that my company ships. Way newer kernel, mostly updated networking stuff like DHCP, but almost all other tidbits are FC7 stock. A Dell hardware change alerted us to this quirk, but as Dell has hard EOLs that do not necessarily follow other companies' software release schedules... Hey, c'est la vie!
If you are running a current kernel (and networking stuff) on F7, you are not running F7. There is no way to predict what kind of strange interactions you have going on.
Any idea why they did not choose a long term support distro (Centos/RHEL come to mind) to base the product on? It would have greatly simplified your life.
Without knowing the details, I can only speculate. I believe the newer products all standardize around RHEL, but this whole problem only surfaced when trying to support old code on new boxes anyway...
i suspect that it is new hardware that came out after fedora 7
your only answer is this:
if you stay with fedora 7 then do a full rewrite of the fedora 7 code base to get it compatible with the new hardware
this is the MAIN problem of the VERY POOR business idea of using fedora in the first place WITHOUT upgrading it every 6 months to the new version and installing the hundreds of "updates" that every fedora release has.
I just wanted to thank you for your troubleshooting and follow-through. I think most folks don't go back to a place they'd posted a question and document the solution they found. After reading your solution, I was able to track it down on my system here. Prior to reading your post, the only information I'd found related to wireless cards and reloading modules, something we don't have in our datacenter (wireless, that is).
With regard to those that promote the upgrade route, well, that's something that's easy to do in a home or smaller environment but when you're running a Production Spec machine, any upgrade involves QA and Development testing and then the upgrading itself. In your environment, you have customer machines to take into account as well.
In our environment, we're roughly 50/50 Solaris/Linux. Our Linux machines numbered 574 at last count, and there are probably a couple hundred machines in one of our new datacenters that aren't in production yet so aren't in that count. We don't have the manpower to devote to upgrading/testing the Dev/QA machines (194 of 'em), much less the existing Production servers and then many, many new builds. I'm only counting machines that my own department manages...we've got many systems in datacenters around the world. Just keeping the systems up to date with the latest versions of *everything* could be a full-time job for a not-too-small department, with on-hand employees around the world.
In a perfect world, we'd all be able to keep our firmware (something most don't think about until there's a problem) and software up to date, but when the manpower doesn't exist for a project of that scale, it's hard to convince Management to spend the funds until there's a major issue which *requires* an upgrade. Patching, yes. OS reloads, not as easy to do.
</SOAPBOX>
Thank you for your follow-through and thank you for the information.