I am looking for ideas and suggestions regarding a bandwidth issue I have. I started at a company in January of this year, and in May our T-1 utilization shot up to maximum usage for the entire business day. Prior to this we would hit maximum usage a few times throughout the day, but not all day long.
So basically the usage graphs look like this:
[ASCII sketch of the T-1 usage graph]
So anyway, the data usage plateaus all day long. It picks up in the morning and drops off as people leave. Additionally, average daily utilization is double what it had been: we used to average maybe 50% of the bandwidth daily, and now we use 97%-99%. The rub is that this started before we moved to a new SAP portal, so operations, in terms of Internet usage, were the same in May when this started as they had been in January. It was not until mid-June that our new Web-based vendor portal was implemented, and this did not have any effect on the already saturated T-1 line.
I worked with our T-1 provider, ATT, who were able to demonstrate to me that unplugging the LAN made the data stop, and really nothing else. I have worked with our router vendor to try to see what data was coming in, but our Juniper SSG 140 (NetScreen OS) does not provide that kind of real-time monitoring.
So I have put some data collection servers in place. I have implemented bandwidthD, netop, cacti, mrtg, ipband, snort, and iftop, and set up a squid proxy server, then used Active Directory policy to point all users at the proxy, with a couple of exceptions I have made myself.
What has all this data told me? It looks like it is legitimate traffic. The stats collected by cacti and mrtg match the ATT usage stats, and the firewall logs and dropped-packet counts from the router do not indicate any foul play. bandwidthD does not show the same amount of traffic, but does show high utilization. ipband shows me the top data users when a threshold I set is exceeded for more than five minutes, and that data all seems to be legitimate. Snort shows no suspicious activity that is not explained by my own activities, and almost none of the activity involves external IP addresses. netop shows high usage and is relatively consistent with the ATT and cacti graphs. The iftop numbers do not match general utilization: the two-second average is often 500-600 kb out of a possible 1500, it spikes from time to time, and occasionally it will spike up to 9-10 Mb/s for a very brief moment before returning to normal. Finally, the proxy: I have put that in place, and it now eats most of the bandwidth itself since it is proxying for most machines, but the sites being visited are normal, with the highest ones being, as they should be, the vendor portal sites we access. The bandwidth is still near max capacity, though just a hair less than it had been before the proxy.
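Since most machines now go through the squid proxy, its access log is one place where per-client and per-site byte counts can be totalled. What follows is only a rough sketch, assuming squid is writing its default "native" access.log format (timestamp, elapsed ms, client IP, result/status, bytes, method, URL, ...). The function name `top_talkers` and the sample lines (with made-up addresses and hostnames) are hypothetical, not taken from the actual network.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical sample lines in squid's native access.log layout:
# time elapsed client result/status bytes method URL ident hierarchy/peer type
SAMPLE = """\
1246982400.123 250 192.168.1.10 TCP_MISS/200 5120 GET http://portal.example.com/index.html - DIRECT/203.0.113.5 text/html
1246982401.456 120 192.168.1.10 TCP_HIT/200 1000 GET http://portal.example.com/logo.png - NONE/- image/png
1246982402.000 900 192.168.1.11 TCP_MISS/200 2048 CONNECT mail.example.com:443 - DIRECT/203.0.113.6 -
"""

def top_talkers(lines, n=5):
    """Sum logged bytes per client IP and per destination host."""
    by_client = Counter()
    by_host = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        client, size, method, url = fields[2], fields[4], fields[5], fields[6]
        if not size.isdigit():
            continue  # skip lines where the byte count is not a plain number
        if method == "CONNECT":
            # CONNECT requests are logged as "host:port", not a full URL
            host = url.rsplit(":", 1)[0]
        else:
            host = urlsplit(url).hostname or url
        by_client[client] += int(size)
        by_host[host] += int(size)
    return by_client.most_common(n), by_host.most_common(n)

if __name__ == "__main__":
    clients, hosts = top_talkers(SAMPLE.splitlines())
    print("Top clients:", clients)
    print("Top hosts:  ", hosts)
```

Against a real log, the same function could be fed an open file handle instead of the sample string. Traffic from the machines excluded from the proxy would not appear in these totals, so it only accounts for the proxied share of the line.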
I have come to the conclusion that we are just eating up the bandwidth; I mean, a T-1 is not that high capacity. What exactly caused the jump, though, I cannot pinpoint. This does not bother me so much, as I have no other explanation. My co-worker, who works in IT with me, has been here longer, and is in a supervisory capacity to me, will not settle for that. He wants an answer, and I cannot provide him with one, even though I have spent weeks and weeks on this issue. Unfortunately I have no data from before the issue started, so I cannot do a comparison to see the differences. He is positive something is wrong, in that there has to be a specific reason our bandwidth maxed out all of a sudden, and he will not settle for no answer as to what that cause is. He is also a bit incredulous at the idea that we have maxed out our T-1, period; to paraphrase, he says that we don't use that much data on the Internet.
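For a sense of scale on the "we don't use that much data" point, here is a back-of-the-envelope sketch of what the 50% and 97%-99% figures mean in absolute terms. It assumes the nominal T-1 line rate of 1.544 Mbit/s (usable payload is slightly lower) and an eight-hour business day. The function names and the user count of 30 are hypothetical placeholders, not figures from this network.

```python
T1_BITS_PER_SEC = 1_544_000  # nominal T-1 line rate; usable payload is a bit lower

def megabytes_moved(utilization, hours, line_rate_bps=T1_BITS_PER_SEC):
    """Approximate megabytes carried in one direction over `hours`
    at a given average utilization (0.0 to 1.0)."""
    bytes_total = line_rate_bps / 8 * utilization * hours * 3600
    return bytes_total / 1_000_000

def per_user_kbps(active_users, line_rate_bps=T1_BITS_PER_SEC):
    """Even share of the line per simultaneously active user, in kbit/s."""
    return line_rate_bps / active_users / 1000

if __name__ == "__main__":
    day = 8  # assumed length of the business day in hours
    print(f"50% for {day} h: {megabytes_moved(0.50, day):.0f} MB")
    print(f"98% for {day} h: {megabytes_moved(0.98, day):.0f} MB")
    # 30 is a made-up head count, used only to show the arithmetic
    print(f"Share per user with 30 active users: {per_user_kbps(30):.1f} kbit/s")
```

At full rate the line moves roughly 695 MB per hour in one direction, so the difference between the old and new averages over a business day is on the order of a few gigabytes, which a fairly small number of steady transfers could account for.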
So thank you if you have stayed to read this far. My question to all of you is: what should I do from here? Is there really any way to get what he wants? Do you have any ideas as to what the issue may be, other than legitimate maximum utilization of the circuit? Perhaps you have an idea of where I can look in the logs to find something I missed. I really need an answer, as this is causing friction here. I am suggesting adding a low-cost, high-speed DSL line to the router, then routing general traffic over it and sending email, HTTP, FTP, and VPN traffic over the T-1. He almost flat-out refuses to OK the DSL with our boss if I cannot find the cause. He already thinks we should dump ATT as the T-1 provider because they are not doing more to assist us, though my opinion is that there is little else they can do.
Well, thank you for your time and any ideas or suggestions you may have. Sorry to drone on so long.