Linux - Networking This forum is for any issue related to networks or networking.
Routing, network cards, OSI, etc. Anything is fair game.


Old 05-06-2007, 11:10 PM   #1
LQ Newbie
Registered: May 2007
Distribution: CentOS, Ubuntu
Posts: 7

Question I wish "dead gateway" would honour traffic it is routing.


I am 99% there but can't quite get the last bit to work. Please help!

The problem:

A client has two networks. For High Availability (HA) reasons they have two routers/firewalls between the networks, both active with different IP addresses, and they rely on "dead gateway" detection by the PCs and servers on each side to work out when the primary router is down and switch over. This more or less works (Linux boxes are quick but Windows takes a minute or two).

However, the client has some devices with a very limited/closed TCP/IP stack that supports only a single default gateway (and no additions to the routing table). Unfortunately they are the most important devices to the business.

To work around this I put a new Linux box on the same side of the routers as the problematic devices (and applied a whole bunch of HA to it). I then pointed the problematic devices' default gateway at the new Linux box. The new Linux box has both routers set up as gateways. The idea is that this effectively gives these single-gateway devices dual default gateways and a way to participate in the client's existing dead gateway detection failover.

The new Linux box has a single NIC with ip_forward=1 (on) and all the "send_redirects"=0 (off). In the past I have had a lot of success with this router-on-a-stick approach (but then I have not relied on dead gateway detection before).
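
For reference, the forwarding settings described here would look roughly like this in /etc/sysctl.conf. This is only a sketch of the settings named in this post, assuming IPv4 and the interface name eth0:

```
# /etc/sysctl.conf (fragment)
# Forward IPv4 packets between hosts (router on a stick).
net.ipv4.ip_forward = 1
# Do not send ICMP redirects, so clients keep sending traffic via this box.
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.eth0.send_redirects = 0
```

Running "sysctl -p" as root reloads the file without a reboot.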

While everything is up this works. According to "iptraf" traffic from the "devices" goes to the new Linux box and then gets routed through the default router/firewall to servers in the other network.

However, when the default router fails, all the other PCs and servers eventually detect the dead gateway and switch over to the secondary router/firewall, and so does traffic to and from the new Linux box itself. The single-gateway problematic devices still don't work, though. Their traffic gets sent to the new Linux box, which has detected the dead gateway and switched over for its own traffic, but it will not do the same for traffic being routed through it: that traffic is not passed on to the secondary (active) router/firewall (see below for detail).

The "why are you doing it this way" question:

Yes, it is a patch to a bad (legacy) design. In the short term I can't change the basic setup. The two firewalls/routers have been in for years and have hundreds of odd-looking rules that would take months to untangle. The client is a genuine 24/7 operation. I was hoping that this way I could leave the system as is, as much as possible, and just fix this one issue with minimum impact/change to the business. I am open to other suggestions, but there are politics and other issues that would make this posting way longer if I left it that open.

I could script something, but I want Linux's inbuilt functionality, as is, to work (so it is not on my neck if a script causes issues).

The config:

Example Problematic device: with default gateway

New Linux Box: eth0 with default gateways and (both metric 0). Sysctl.conf also sets ip_forward=1 and send_redirects=0 for all, default and eth0.

Primary Router/Firewall : eth1, eth0 (for testing in lab these are un-firewall-ed routers)
Secondary router/Firewall : eth1, eth0

Example Server : with dual gateways and (W2K3 and metric both 0).

What else have I tried:

The testing has all been done in a lab with clean built machines to first show it is possible (or not in this case).

I tried two NICs on the Linux box, with both NICs still in the same subnet. The problematic devices used one NIC as their default gateway, and the other NIC had the routes to the two firewalls/gateways. Same result.

For the new Linux box I have tried both Ubuntu 7.02 (server) and CentOS 4.4 (minimum install). Same result.

I know ping is not enough for "dead gateway detection", so I have automated ftp scripts forcing TCP traffic to help trigger the gateway switch.

"ip route flush table cache" does not clear the problem.

The errors:

First up, everything works if the primary router/firewall is active. Secondly, the new Linux box itself does detect the dead gateway and switches over to the secondary router/firewall for any of its own traffic. Thirdly, everything works if I fail the primary router AND manually delete the route via the primary router on the new Linux box (but of course I want an automatic failover).
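
For comparison, the manual fix in the third point (fail the primary, then delete its route by hand) could be automated roughly as follows. This is only a sketch of the scripted fallback, not the inbuilt behaviour this thread is asking about. The gateway address is a hypothetical placeholder from the documentation range, the names PRIMARY_GW, FAILS_NEEDED and DRY_RUN are made up for illustration, and it assumes the primary router answers ICMP echo, which a firewall may not.

```shell
#!/bin/sh
# Sketch only. The address is hypothetical (documentation range), not the thread's.
PRIMARY_GW="${PRIMARY_GW:-192.0.2.1}"
DEV="${DEV:-eth0}"
FAILS_NEEDED="${FAILS_NEEDED:-3}"   # consecutive failed probes before acting

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Succeeds if the gateway answers one ICMP echo within one second.
gateway_alive() {
    ping -c 1 -W 1 "$1" >/dev/null 2>&1
}

# The manual fix from the post: delete the dead default route.
# The cache flush afterwards is an extra precaution, not something the post tested.
withdraw_primary() {
    run ip route del default via "$PRIMARY_GW" dev "$DEV"
    run ip route flush cache
}

# Poll until the primary has missed FAILS_NEEDED probes in a row, then withdraw it.
watch_primary() {
    fails=0
    while [ "$fails" -lt "$FAILS_NEEDED" ]; do
        if gateway_alive "$PRIMARY_GW"; then
            fails=0
        else
            fails=$((fails + 1))
        fi
        sleep 2
    done
    withdraw_primary
}

# Only start polling when explicitly asked, e.g. "sh gw-watch.sh watch"
# (gw-watch.sh is a hypothetical file name).
if [ "${1:-}" = "watch" ]; then
    watch_primary
fi
```

With DRY_RUN=1 set, withdraw_primary only prints the two ip commands, so the logic can be checked on a lab box before it is allowed to touch the routing table. Re-adding the route once the primary comes back is deliberately left out of this sketch.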

However, once the primary router/firewall is down, ftp (or ping, telnet etc.) from the problematic devices to a server in the other network reports:

"From icmp_seq=x Destination Host Unreachable" (even though a ping from the new Linux box itself works).

On the new Linux box "ip route show table cache" shows:

" from via dev eth0 src cache <src-direct> expires -xxsec mtu..." (i.e still holding the old route).

but the cache table also shows:

" from via dev eth0 ..." (i.e. for its own traffic it has changed gateways so why not for other devices routing through?).


I could be barking up the wrong tree, but reading the "ip route show..." results it looks to me like "dead gateway detection" spots that the primary gateway is down and switches gateway for the box's own traffic, but does not do so for any traffic routed through it. Is there a way to get "dead gateway detection" to fail over all traffic, or am I asking too much of "dead gateway detection" with a single NIC?
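
The difference between the two cache entries quoted above can also be checked directly with "ip route get", which asks the kernel which route it would pick for one specific packet. The sketch below compares locally originated traffic with a packet forwarded on behalf of a device. Both addresses are hypothetical placeholders from the documentation ranges, and DEVICE_IP, SERVER_IP and DRY_RUN are names made up for illustration.

```shell
#!/bin/sh
# Hypothetical placeholder addresses; substitute the real ones.
DEVICE_IP="${DEVICE_IP:-192.0.2.50}"      # a single-gateway device
SERVER_IP="${SERVER_IP:-198.51.100.10}"   # a server on the far network
DEV="${DEV:-eth0}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

compare_routes() {
    # Route the box would use for traffic it originates itself.
    run ip route get "$SERVER_IP"
    # Route it would use for a packet from the device arriving on $DEV.
    run ip route get "$SERVER_IP" from "$DEVICE_IP" iif "$DEV"
}
```

Calling compare_routes after failing the primary router should show whether the two lookups name different gateways in their "via" fields, which is the behaviour described in this post.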
Old 05-07-2007, 01:12 AM   #2
LQ Newbie
Registered: May 2007
Distribution: CentOS, Ubuntu
Posts: 7

Original Poster

...continuing with my R&D I built a Windows 2003 box with RRAS activated and it works. It fails over (slowly) and routes traffic that is not its own through the new route.

So now I at least have an option that works; however, I would still prefer a working Linux solution (Windows makes an expensive and resource-hungry router).

Last edited by RobynWoodall; 05-08-2007 at 05:25 PM.
Old 05-08-2007, 05:25 PM   #3
LQ Newbie
Registered: May 2007
Distribution: CentOS, Ubuntu
Posts: 7

Original Poster
... and continuing with more R&D: CentOS 5 does not work either.

