Hi all,
This is a copy of a post I made on the OpenWrt forum. Sorry for the cross-post, but it is quite a generic question. Hope you can help!
I'm running OpenWrt Backfire 10.03.1 (Linux kernel 2.6.32.27) on a Netgear WNDR3800.
I'm building an analysis tool so that I can affect network traffic between WAN and wifi. I've unbridged wifi and LAN, and I'm running a TBF followed by netem on the wlan interface, which gives me bandwidth limiting and all sorts of other goodies.
I have a bad problem, though:
If I change the bandwidth (rate) parameter on the TBF in situ, the netem component is destroyed.
This causes a hole in the traffic, as all the netem buffered packets are lost, too (and I have to re-instate the netem). Surely this shouldn't happen on just a parameter change?
Has anyone else had this problem and found a workaround?
Thanks!
Oh: here's an illustration if you're very interested
Here's the initial state:
root@OpenWrt:~# tc qdisc show
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc mq 0: dev wlan1 root
qdisc mq 0: dev mon.wlan1 root
1. I add a token bucket filter. This is my root qdisc. I am replacing the default qdisc (mq):
tc qdisc replace dev wlan1 root handle 1:0 tbf rate 1024kbit latency 50ms burst 5120
Here's what it looks like:
root@OpenWrt:~# tc qdisc show
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc tbf 1: dev wlan1 root refcnt 5 rate 1024Kbit burst 5Kb lat 50.0ms
qdisc mq 0: dev mon.wlan1 root
2. I add netem, the network emulator:
tc qdisc add dev wlan1 parent 1:1 handle 10 netem delay 80ms
Here's what it looks like now. netem is attached under tbf (parent 1:1), so traffic passes through tbf on its way to netem and out to wlan1:
root@OpenWrt:~# tc qdisc show
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc tbf 1: dev wlan1 root refcnt 5 rate 1024Kbit burst 5Kb lat 50.0ms
qdisc netem 10: dev wlan1 parent 1:1 limit 1000 delay 80.0ms
qdisc mq 0: dev mon.wlan1 root
3. I make a change to the existing token bucket filter, changing the bandwidth to 500 kbit/s:
tc qdisc change dev wlan1 root handle 1:0 tbf rate 500kbit latency 50ms burst 5120
And here's what it looks like:
root@OpenWrt:~# tc qdisc show
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc tbf 1: dev wlan1 root refcnt 5 rate 500000bit burst 5Kb lat 50.0ms
qdisc mq 0: dev mon.wlan1 root
WHERE'S NETEM ???!
Conclusion:
The 'change' has caused the netem component to be torn down.
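To make the check repeatable rather than eyeballing the output, here is a minimal sketch of a test for whether netem is still attached. The function name has_netem is hypothetical; it only looks at `tc qdisc show` text on stdin, so it can be tried against the pasted outputs above as well as on the router.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the `tc qdisc show` text on stdin
# contains a netem qdisc attached to the given device (e.g. wlan1).
has_netem() {
    dev="$1"
    # Matches lines such as:
    #   qdisc netem 10: dev wlan1 parent 1:1 limit 1000 delay 80.0ms
    grep -q "^qdisc netem .* dev $dev "
}

# Usage on the router (assumes the wlan1 device from this post):
#   tc qdisc show | has_netem wlan1 && echo "netem present" || echo "netem missing"
```

Run after step 2 it should report netem as present, and after step 3 as missing.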
Why is this a problem?
All of the traffic shaping stuff is based on buffering in order to hold on to data packets as they move through the system.
The change is meant to happen as the user moves a slider to regulate bandwidth. It should only alter one parameter of an existing qdisc, so that tbf simply holds on to the packets for a longer or shorter time. We should simply see a change in the rate at which packets are released into the netem component.
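For context, this is roughly how a slider position could be turned into the tc command from step 3. It is only a sketch: the 0..100 slider range, the MIN_KBIT/MAX_KBIT limits, the linear mapping and both function names are assumptions for illustration. The command is printed rather than executed.

```shell
#!/bin/sh
# Assumed limits for the slider; not taken from the real tool.
MIN_KBIT=64
MAX_KBIT=1024

# Map a slider position (0..100, clamped) linearly to a rate in kbit.
slider_to_kbit() {
    pos="$1"
    if [ "$pos" -lt 0 ]; then pos=0; fi
    if [ "$pos" -gt 100 ]; then pos=100; fi
    echo $(( MIN_KBIT + (MAX_KBIT - MIN_KBIT) * pos / 100 ))
}

# Print the tbf change command for a slider position.
# latency and burst are the values used elsewhere in this post.
tbf_change_cmd() {
    echo "tc qdisc change dev wlan1 root handle 1:0 tbf rate $(slider_to_kbit "$1")kbit latency 50ms burst 5120"
}
```

With these assumed limits, `tbf_change_cmd 100` prints the same parameters as the command in step 1, just with `change` instead of `replace`.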
What is actually happening is that the netem component is being torn away. I have no idea what has happened to it, or the data which it is buffering.
So, instead of a dip in the rate at which packets are arriving, connected clients are probably seeing a large hole as a buffer full of packets suddenly disappears.
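The stopgap of re-instating netem after each change can be sketched as below. This is not a fix: whatever netem was buffering at the moment of the change is still lost. The TC and DEV variables and the function name are hypothetical; the handle, delay, latency and burst values are the ones from the steps above.

```shell
#!/bin/sh
# TC is the command to run; set TC=echo for a dry run that only prints.
TC="${TC:-tc}"
DEV="${DEV:-wlan1}"

# Change the tbf rate, then re-add netem if the change removed it.
# Packets netem was holding at the time of the change are not recovered.
set_rate_keep_netem() {
    rate="$1"
    $TC qdisc change dev "$DEV" root handle 1:0 tbf \
        rate "$rate" latency 50ms burst 5120 || return 1
    if ! $TC qdisc show dev "$DEV" | grep -q "^qdisc netem "; then
        $TC qdisc add dev "$DEV" parent 1:1 handle 10 netem delay 80ms
    fi
}

# Usage on the router:
#   set_rate_keep_netem 500kbit
```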
What is affected?
The combination of bandwidth limiting with any other emulation.
Things I've tried:
1. Swapping the order, so that tbf comes after netem. I get 'unsupported operation' when I try to do this.
2. Changing only the bandwidth parameter of tbf (in case the burst or latency parameters cause a buffer reset, etc). I get a 'you must specify burst' error.
3. Changing to a different qdisc instead of tbf. I don't know whether this will work: the parameters and modes of operation are different, and I would have to change a lot of infrastructure to accommodate it (long story).
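Related to attempt 2: since tc insists on burst being given, one option is to read the current burst and latency back from `tc qdisc show` and pass them again unchanged alongside the new rate. A minimal sketch of the parsing step, with a hypothetical function name; whether tc accepts the printed values (such as 5Kb and 50.0ms) back verbatim is an assumption worth checking on the target build.

```shell
#!/bin/sh
# Hypothetical helper: read `tc qdisc show` text on stdin and print the
# burst and latency of the tbf qdisc on the given device, e.g. "5Kb 50.0ms".
tbf_burst_lat() {
    awk -v dev="$1" '
        $1 == "qdisc" && $2 == "tbf" {
            d = ""; b = ""; l = ""
            for (i = 3; i < NF; i++) {
                if ($i == "dev")   d = $(i + 1)
                if ($i == "burst") b = $(i + 1)
                if ($i == "lat")   l = $(i + 1)
            }
            if (d == dev) { print b, l; exit }
        }'
}

# Usage on the router:
#   set -- $(tc qdisc show | tbf_burst_lat wlan1)
#   tc qdisc change dev wlan1 root handle 1:0 tbf rate 500kbit burst "$1" latency "$2"
```

This only avoids hard-coding the two values; it does not by itself stop netem from being removed.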
Thanks for reading!