LinuxQuestions.org
Old 07-14-2016, 08:15 AM   #1
edmonstone
LQ Newbie
 
Registered: Jul 2016
Location: New England, north of the border
Posts: 11

Rep: Reputation: Disabled
Newb's first post. Simple bridge works great, but IP routing to bridge is not stable, needs pings.


Hello all,

I'm a long time lurker, first time poster. I've usually found what I need with searches but I'm in a situation that is still defying me so I've come to ask the forum for help.

My background is more in the board support, drivers, bring up, etc. I have done layer 2 content filtering, vlans, access control, etc. I'm a jack of all trades and master of almost none.

Here is background on the problem I'm having.

Linux kernel is 2.6.34.8. The processor is a Freescale i.MX287 ARM. I have configured and built the kernel with 802.1d Ethernet Bridging enabled. I have downloaded, cross-compiled and installed brctl.

The hardware configuration is two processor boards, one a carrier board and one a daughter board. Both boards are running the same kernel on the same kind of processor. In this stack configuration, both processors' eth1 are connected together.

Due to a design error in one application the daughter board has no connection to the network. So my task is to create a bridge across eth0 and eth1 of the carrier and allow the daughter board to access the internet. The carrier eth0 is the network path. You know, the old "we'll fix it with software".

Here is my bridge configuration on the carrier processor:

ifconfig eth0 down
ifconfig eth1 down

brctl addbr br0

brctl addif br0 eth0
brctl addif br0 eth1

brctl stp br0 off

ifconfig eth0 0.0.0.0 up
ifconfig eth1 0.0.0.0 up

echo 1 > /proc/sys/net/ipv4/ip_forward

udhcpc -i br0

Then, on the daughter board I just bring up eth1:

udhcpc -i eth1

The bridge works as it should for the daughter board. I can telnet or ssh into it from the network and these sessions have stayed up and active over 24 hours.

The problem I have is that ssh or telnet will not connect to the carrier processor, where the bridge is configured, unless I ping the gateway from both the daughter and the carrier. If the pings work I can then establish an ssh session to the carrier's eth0, but it gets dropped quickly and then no longer works, even if I ping around again.

I did find a thread that is almost exactly this same issue. In that thread the solution was to add the following to the route table

ip route add default via 172.20.213.1 dev br0

In my configuration this returned "ip: RTNETLINK answers: File exists", because the DHCP client had already installed the default gateway route.

Here is the output of route:
# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
172.20.213.0    *               255.255.255.0   U     0      0        0 br0
default         172.20.213.1    0.0.0.0         UG    0      0        0 br0
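
The "File exists" reply means a default route is already installed. A guarded add avoids the error. This is a minimal sketch: the helper name `default_gw` is made up for illustration, and it assumes an `awk` applet is present on the target and that `ip route` prints the usual "default via <gateway> dev <iface>" line.

```shell
#!/bin/sh
# Hypothetical helper: print the current default gateway.
# Reads `ip route` output on stdin; prints nothing if no default route exists.
default_gw() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Usage: add the route only when none is installed yet
#   [ -n "$(ip route | default_gw)" ] || ip route add default via 172.20.213.1 dev br0
```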

I have also checked aging and learning by watching brctl showmacs br0. The two Ethernet ports on the carrier each have a static mac entry. I can see the mac from the daughter board's eth1 age out and get re-learned.
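
For anyone repeating this check, here is a small sketch that pulls only the learned (non-local) entries out of the showmacs output. The function name `learned_macs` is hypothetical, and it assumes the four-column layout that brctl prints in this thread (port, mac, is local?, ageing timer) plus an `awk` applet on the board.

```shell
#!/bin/sh
# Hypothetical helper: print "port mac age" for learned (non-local) entries.
# Reads `brctl showmacs <bridge>` output on stdin and skips the header line.
learned_macs() {
    awk 'NR > 1 && $3 == "no" { print $1, $2, $4 }'
}

# Usage on the carrier:
#   brctl showmacs br0 | learned_macs
```

Running it once a second makes it easy to see an entry's ageing timer climb and reset when the address is re-learned.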

I don't know what else to check/configure. I would be most appreciative for any help people can offer.
 
Old 07-15-2016, 01:03 AM   #2
jnihil
Member
 
Registered: Dec 2012
Location: inside the matrix
Distribution: Debian, Xubuntu, Gentoo, Antergos
Posts: 90

Rep: Reputation: 27
If the mac table for br0 looks okay, then how is 'arp -a' looking?
Also, have you tried capturing what is reaching the carrier eth0 when the ssh/telnet connection fails using tcpdump/tshark?
 
Old 07-15-2016, 08:09 AM   #3
edmonstone
LQ Newbie
 
Registered: Jul 2016
Location: New England, north of the border
Posts: 11

Original Poster
Rep: Reputation: Disabled
Hello jnihil,

Thanks for the input. I am planning to get Wireshark running today to look at the traffic. I don't think I mentioned this before, but I do have an RS-232 console to both processors so I can poke about. This is a very stripped down file system in an embedded application, so certain utilities such as tcpdump are not there.

This morning I checked the state of the bridge and ssh session to the processor on the other side of the bridge. The ssh session was still up and very responsive when I did a ps. As usual, I could not open ssh to the bridge processor.

Here is arp -a at that time:
# arp -a
? (172.20.213.1) at 00:14:1b:4e:90:00 [ether] on br0

I pinged the gateway from both processors and then could establish ssh to the bridge processor. Here is arp -a at that time with the ssh session still active. The first entry is the laptop I'm working on.
# arp -a
? (172.20.213.63) at 5c:26:0a:68:b0:d2 [ether] on br0
? (172.20.213.1) at 00:14:1b:4e:90:00 [ether] on br0

The ssh session to the bridge processor stopped working after 10-15. It has lost the gateway entry.
# arp -a
? (172.20.213.63) at 5c:26:0a:68:b0:d2 [ether] on br0

I can still ping the laptop at this point, but it took several seconds to start working. After this there is still only the laptop in the arp table.

Then I ping the gateway, which restores its entry in the arp table. At that point I can open a new ssh session with the bridge processor. The gateway arp entry comes back in the table, but I cannot establish ssh to the bridge processor. Then I ping the laptop, which is still in the arp table, and then I can establish the ssh session and log in for a few seconds.
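
The disappearing gateway entry described above can be watched for with a one-line test. This is only a sketch: the name `arp_has` is made up, and it assumes `arp -a` prints entries in the "? (ip) at mac [ether] on br0" form shown in this post, with unresolved entries marked `<incomplete>`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the `arp -a` output on stdin contains a
# resolved entry for the IP address given as $1.
arp_has() {
    # Match the exact "(ip) at " form, then drop unresolved entries.
    grep -F "($1) at " | grep -v -q '<incomplete>'
}

# Usage:
#   arp -a | arp_has 172.20.213.1 || echo "gateway entry missing"
```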

edit: The gateway mac is still aging in the mac table at this point.
# brctl showmacs br0
port no  mac addr           is local?  ageing timer
1        00:14:1b:4e:90:00  no         0.77


Here is the error message on the laptop.
[Attached image: sshFail.jpg]

Last edited by edmonstone; 07-15-2016 at 08:13 AM.
 
Old 07-21-2016, 10:39 AM   #4
edmonstone
LQ Newbie
 
Registered: Jul 2016
Location: New England, north of the border
Posts: 11

Original Poster
Rep: Reputation: Disabled
Sorry to drop off for a while, other fires to fight.

Today I was able to get my two boards and laptop on a private subnet. The short version is that broadcast macs are not getting to the bridged processor and so it never replies to arps. The broadcasts do make it across the bridge just fine. So now I know what isn't working but I'm still looking for a solution for this.
 
Old 07-22-2016, 07:39 AM   #5
edmonstone
LQ Newbie
 
Registered: Jul 2016
Location: New England, north of the border
Posts: 11

Original Poster
Rep: Reputation: Disabled
It seems to me that this bridge is operating as an unmanaged switch, and it is a switch with only two ports. So it appears reasonable that broadcast packets don't automatically go to the IP stack on the bridged processor. Based on this output there is nothing to tell the bridge to forward broadcasts to the stack.

# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.16.0    *               255.255.255.0   U     0      0        0 br0
# brctl showmacs br0
port no  mac addr           is local?  ageing timer
1        00:14:1b:4e:90:00  no         3.94
1        00:14:22:44:7c:8c  no         58.60
1        00:1e:c9:5d:c0:ae  no         21.44
1        00:20:9f:12:60:4b  yes        0.00
2        00:20:9f:12:60:4c  yes        0.00
1        00:21:9b:4e:84:79  no         23.39
1        00:22:90:e9:20:77  no         1.08

Am I close, or getting colder?
 
Old 07-22-2016, 10:39 AM   #6
jnihil
Member
 
Registered: Dec 2012
Location: inside the matrix
Distribution: Debian, Xubuntu, Gentoo, Antergos
Posts: 90

Rep: Reputation: 27
I'm not sure what you mean by 'broadcast packets don't automatically go to the IP stack'.
Both the bridge and the IP routing/FIB are in the kernel. If the broadcast is an ARP, or is addressed to an IP multicast group to which the host belongs, it *will* be processed by the host.
So what was the conclusion of your investigation with packet capture? Did the host respond to the ARP when you ssh/telnet to the host?
 
Old 07-22-2016, 12:08 PM   #7
edmonstone
LQ Newbie
 
Registered: Jul 2016
Location: New England, north of the border
Posts: 11

Original Poster
Rep: Reputation: Disabled
Hello jnihil,

Quote:
Originally Posted by jnihil View Post
Did the host respond to the ARP when you ssh/telnet to the host?
No, it doesn't, and I believe this is the problem. The bridged processor never responds to the broadcast arp packet for the br0 IP address. This is also why it doesn't respond to pings on its own.

The only way to get the bridged processor to respond to pings is if the other end is also pinging back. For example, from a fresh boot I configure the bridge. I ping the laptop from the bridged processor and get no replies. Then I ping the processor on the other side of the bridge and get no replies.

Now, if I go back and ping the laptop from the bridged processor AND ping the bridged processor from the laptop, I will then begin to get replies in both directions. Once these pings are successful I can open an ssh session with the bridged processor if I act quickly. But this connection doesn't stay up very long, less than a minute, about enough time to do ps -ax a couple of times.

So in the traces I see many arps for 16.18 but never a response without the bi-directional pings.

4 3.019610 Dell_68:b0:d2 Broadcast ARP 42 Who has 192.168.16.18? Tell 192.168.16.5

I'm also wondering if the bridge could be getting the broadcast arp but confused as to which MAC it should respond with, eth0 or eth1, and so it just doesn't respond.
 
Old 07-22-2016, 07:22 PM   #8
jnihil
Member
 
Registered: Dec 2012
Location: inside the matrix
Distribution: Debian, Xubuntu, Gentoo, Antergos
Posts: 90

Rep: Reputation: 27
Quote:
Once these pings are successful I can open an ssh session with the bridged processor if I act quickly. But this connection doesn't stay up very long, less than a minute
Sounds like something is aging out and you can't quite catch it.
How about setting up a bunch of commands in a loop so that you can see what is happening, like so:

Code:
while :
do 
  ip route
  brctl showmacs br0
  arp -a
  sleep 1
  echo ""
done
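
Since the state changes within seconds, a timestamp on each pass helps when lining the output up against a packet trace. A possible variant of the loop is sketched below. The function name `snapshot` and the log path are only examples, and it assumes `date`, `ip`, `brctl` and `arp` are all present on the board.

```shell
#!/bin/sh
# One timestamped snapshot of the routes, the bridge MAC table and the ARP cache.
# $1 is the bridge name; it defaults to br0, the name used in this thread.
snapshot() {
    bridge_if=${1:-br0}
    echo "=== $(date '+%H:%M:%S') ==="
    ip route
    brctl showmacs "$bridge_if"
    arp -a
}

# Usage (log path is only an example):
#   while :; do snapshot br0; sleep 1; done >> /tmp/bridge.log
```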
 
Old 09-22-2016, 09:42 AM   #9
edmonstone
LQ Newbie
 
Registered: Jul 2016
Location: New England, north of the border
Posts: 11

Original Poster
Rep: Reputation: Disabled
I didn't want to leave this thread hanging, so I'll post what the final solution was, and it wasn't bridging.

I finally convinced the client that a bridge wasn't going to give them full access to the hidden board like they wanted. The situation is very similar to a home setup with a gateway and multiple PCs behind it. On the eth0 side of the carrier board is the network. On the eth1 side of the carrier board is the private subnet. So it becomes a simple network address translation exercise. The iptables package for ARM has been installed in the file system and the kernel was rebuilt with Netfilter and xtables support.

With an IP solution everything just works, pings, telnet, ssh, etc.

At run time on the carrier:
1. enable ip forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward

2. bring up two interfaces on eth0
ifconfig eth0 192.168.16.20
ifconfig eth0:1 192.168.16.30 up

3. bring up an interface on eth1 that is the gateway for the hidden board.
ifconfig eth1 192.168.10.1 up

4. set up the iptables filter and actions
iptables -t nat -A PREROUTING -d 192.168.16.30 -j DNAT --to-destination 192.168.10.2
iptables -t nat -A POSTROUTING -s 192.168.10.2 -j SNAT --to-source 192.168.16.30

On the hidden board:
1. ifconfig eth0 192.168.10.2 up

2. route add default gw 192.168.10.1
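
The carrier-side steps above can be collected into one script. This is a sketch under assumptions rather than the script actually deployed: the `run` wrapper, the `DRY_RUN` variable and the function name `setup_carrier_nat` are made up for illustration, and the addresses are the ones listed in the steps.

```shell
#!/bin/sh
# Carrier-side NAT setup from the steps above, wrapped so it can be dry-run.
# With DRY_RUN set, commands are printed instead of executed.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_carrier_nat() {
    pub_ip=192.168.16.30     # second eth0 address, stands in for the hidden board
    hidden_ip=192.168.10.2   # hidden board's address on the private subnet

    # 1. enable IP forwarding
    run sh -c 'echo 1 > /proc/sys/net/ipv4/ip_forward'
    # 2. two addresses on eth0: the carrier's own and the hidden board's alias
    run ifconfig eth0 192.168.16.20
    run ifconfig eth0:1 "$pub_ip" up
    # 3. gateway address for the hidden board on eth1
    run ifconfig eth1 192.168.10.1 up
    # 4. one-to-one NAT between the alias and the hidden board
    run iptables -t nat -A PREROUTING -d "$pub_ip" -j DNAT --to-destination "$hidden_ip"
    run iptables -t nat -A POSTROUTING -s "$hidden_ip" -j SNAT --to-source "$pub_ip"
}

# Usage:
#   DRY_RUN=1; setup_carrier_nat    # print the commands only
#   setup_carrier_nat               # run them (needs root)
```

Printing the commands first is a cheap way to check the addresses before touching a board that is only reachable over the network.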

I'm no expert and I have to wonder if there was a way to do this with ip routes instead of iptables. For an embedded application, there was a lot of stuff added to the file system for just a few lines of filter. :-) But, it does work well.
 
  


All times are GMT -5.