How do I detect/diagnose a NIC suspected of "coming and going"?
I have a machine that I suspect may have a hardware problem. Do you have any suggestions on how to detect that the card is losing its connection for tiny fractions of a second? I don't know whether the switch, the NIC, the cable, or either of the RJ45 connectors is bad. I'd love to hear something like, "That's easy, just turn this logging on and every time the NIC does a handshake it will get logged."
Please no suggestions like, "well, why not try to ping it continuously".
Let me be clear: this server (MySQL) is under HIGH network traffic, and a few times a day the app servers log that the connection was lost. It affects only a few connections at a time, and the loss lasts milliseconds, NOT minutes or even seconds.
I will reply with an lspci & uname in the AM.
Thanks in advance!
Last edited by RichardBronosky; 06-25-2008 at 08:48 AM.
[QUOTE]Please no suggestions like, "well, why not try to ping it continuously".[/QUOTE]
Well, that's the basic part. Do it. If there's a timeout, check whether there are collisions on the switch; on your server you can check the interface for errors on RX and TX packets.
As your MySQL error says "connection lost", this calls for a diagnosis of all the TCP/IP layers, not just the hardware:
1) Check the MySQL permissions from server to client.
2) Check whether unauthorised users are trying to connect to MySQL. If I type "mysql -h your_mysql_host -u root -p" I will be able to establish a connection, but unless I authenticate I will not have access to the DB. Such attempts are logged in the server logs.
3) The version of the DB connectivity drivers used on the app servers (which you have not mentioned) might be an issue.
4) Check the maximum number of connections that MySQL is able to bear (if it's unable to handle more client requests, try adding one more NIC to the server box).
If there's a problem with the NIC, or even in the physical layer, it will show up in the errors section of the ifconfig output.
If the carrier number is climbing, that means the interface is bouncing: the link pulse has been lost and restored, similar to unplugging the cable and plugging it back in. These counters are on by default; there is no need to enable logging to see them.
On a decent managed switch you can also view the switchport statistics to gather similar information.
If they are all ZERO, as in my output, then you need to look elsewhere for your connection issues, possibly at MySQL as suggested.
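For anyone who would rather script the check than eyeball ifconfig, here is a minimal Python sketch. It assumes a Linux box that exposes per-interface counters as plain files under /sys/class/net/<iface>/statistics; the interface name "eth0" and the helper names `read_counters` and `nonzero` are made up for illustration. It only reads the counters and reports which ones are above zero.

```python
from pathlib import Path

# Counters that tend to point at link or physical-layer trouble.
# Assumption: these files exist under /sys/class/net/<iface>/statistics.
COUNTERS = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped",
            "collisions", "rx_crc_errors", "tx_carrier_errors")


def read_counters(stats_dir, names=COUNTERS):
    """Return {counter: value} for every counter file that can be read."""
    result = {}
    for name in names:
        path = Path(stats_dir) / name
        try:
            result[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Missing or unparsable counter: skip it rather than guess.
            continue
    return result


def nonzero(counters):
    """Keep only the counters whose value is above zero."""
    return {name: value for name, value in counters.items() if value > 0}


if __name__ == "__main__":
    iface = "eth0"  # assumption: replace with your interface name
    stats = read_counters(f"/sys/class/net/{iface}/statistics")
    if not stats:
        print(f"no counters found for {iface}")
    else:
        bad = nonzero(stats)
        print(bad if bad else "all error counters are zero")
```

Because the counters are cumulative since boot, comparing two runs taken some hours apart shows whether anything is climbing.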
[QUOTE]
if it's the NIC you wish to look at then just look at ifconfig
...
If they are all ZERO as in my output then you need to look elsewhere for your connection issues, possibly at mysql as suggested.
[/QUOTE]
That's the kind of thing I needed, thanks! Unfortunately, it does not seem to catch issues even when I create them artificially. I tried "watch -n1 ifconfig" while pulling the Ethernet cable several times on my sandbox. The only things that changed were the /[RT]X/ numbers.
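One way to get a timestamped record of link flaps, sketched in Python under some loud assumptions: the kernel exposes a cumulative link-transition counter at /sys/class/net/<iface>/carrier_changes (newer kernels do; older ones may not), and the interface is called eth0. The function names `read_int`, `changed` and `watch` are made up for this example. It polls one counter file and logs a line each time the value moves.

```python
import time
from pathlib import Path


def read_int(path):
    """Read one integer from a sysfs-style file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def changed(previous, current):
    """True when both readings exist and differ."""
    return previous is not None and current is not None and current != previous


def watch(path, interval=0.05, rounds=None, log=print, sleep=time.sleep):
    """Poll `path`; log a timestamped line whenever its value changes.

    rounds=None polls until interrupted. Returns the number of changes seen.
    """
    previous = read_int(path)
    events = 0
    done = 0
    while rounds is None or done < rounds:
        sleep(interval)
        current = read_int(path)
        if changed(previous, current):
            events += 1
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log(f"{stamp} {path}: {previous} -> {current}")
        if current is not None:
            previous = current
        done += 1
    return events
```

Calling `watch("/sys/class/net/eth0/carrier_changes")` would run until interrupted. Since the counter is cumulative, the poll does not have to catch the flap while it is happening; a change between two readings is enough, and the timestamp is accurate to within the polling interval.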