Linux - Newbie (LinuxQuestions.org). This Linux forum is for members that are new to Linux.
I can't understand what's wrong. I administer four RHEL 5 servers. When I ran tracepath on Server 1, it gave me this error:
Code:
tracepath: error while loading shared libraries: libresolv.so.2: cannot open shared object file: No such file or directory
On Server 2, tracepath works fine. On both servers, libresolv.so.2 is located in /lib and is a symlink to /lib/libresolv-2.5.so.
Both .so files have the same owners and privileges. Both have the same md5 hash. Both servers' /etc/ld.so.conf files are identical. Both have identical running processes. Why can't Server 1 run tracepath / load libresolv.so.2?
Do all servers:
1) have the same hardware, CPU, etc.? E.g., are some 64-bit while others are 32-bit, or is there some other difference?
2) run the same point release of RHEL 5?
3) have an identical collection of packages installed?
All servers are running as virtual machines on a cluster of four physical servers managed through vSphere.
They are all running the same version of RHEL 5.
I assume they have an identical collection of installed packages... how can I check?
Pardon me if I go into too much detail; I don't know how much of this you already know.
My general suspicion would be one of the following: a package is missing, a package deployment wasn't completely successful, the virtual machines have different architectures, or there is some esoteric problem with access permissions. All too often, when I try to run a program and get a message like yours, it doesn't say which form of the library the loader tried to load. E.g., if I'm running a 64-bit operating system that can also run 32-bit programs, was the attempt made to load a 64-bit library or a 32-bit one? If, as is often the case, most of what I'm running is 64-bit, I would first expect the library to be 64-bit. But if some programs are distributed only as 32-bit, a 32-bit version of the library is needed as well. This command:
Code:
uname -a
run on each virtual machine, should show you the architecture each machine is configured to implement. Since the servers are virtual machines, the question becomes whether they are supposed to be configured the same way: the same amount of memory allocated to each, the same number of CPUs or CPU usage limits, the same architecture, and the same collection of packages. If they are supposed to be conceptually identical, and if you are using standard Red Hat package management, then on server1 for example:
Code:
rpm -qa | sort > server1_package_list.txt
will get you a useful list of packages. You can get such a list from each server, gather all the lists together on a single machine, and compare them, for example with diff. The comparison should not produce any differences if the machines are the same architecture and are supposed to have the same packages. I've worked at places where load was balanced across multiple machines that were supposed to be conceptually identical and provide the same services to customers; instead, a deployment of an update to one machine had somehow failed and wasn't caught. Customers would complain, and tests run through the load balancer would seem successful, because the load balancer happened to connect to a machine that had the correct collection of packages. I used this method to check packages and so found the deployment failure; when a test was run directly against the one machine with the wrong collection of packages, the error was isolated.
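As a rough sketch of that comparison: the helper name compare_package_lists and the file names in the usage comment are only assumptions for illustration. It relies on diff exiting with 0 when the files match and 1 when they differ.

```shell
#!/bin/sh
# Hypothetical helper: compare two package lists, each produced on a
# different server with something like:
#   rpm -qa | sort > server1_package_list.txt
# Assumption: on RHEL 5 the default rpm -qa output may omit the
# architecture; a query format such as
#   rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort
# would make 32-bit vs 64-bit packages visible in the lists.
compare_package_lists() {
    status=0
    # diff prints the differing lines; exit status 0 = same, 1 = different.
    diff "$1" "$2" || status=$?
    case $status in
        0) echo "package lists are identical" ;;
        1) echo "package lists differ (see diff output above)" ;;
        *) echo "could not compare the lists" >&2; return 2 ;;
    esac
}

# Example usage (file names are assumptions):
# compare_package_lists server1_package_list.txt server2_package_list.txt
```

Lines starting with "<" exist only in the first list, and lines starting with ">" only in the second.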
Additional info: Oops! Sorry, I missed that the md5 sum was the same on both servers. If you run these two commands:
Code:
file /bin/tracepath
ldd /bin/tracepath
on Server 1 and Server 2, are the results on both Server 1 and Server 2 the same?
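To make the ldd comparison quicker to read, something like the following sketch could pull out only the libraries the loader fails to resolve. The function names are made up for illustration, and it assumes ldd marks unresolved libraries with "=> not found", as glibc's ldd does.

```shell
#!/bin/sh
# Hypothetical filter: read ldd output on stdin and print only the
# names of libraries reported as "not found".
filter_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical wrapper: run ldd on a binary and list unresolved libraries.
missing_libs() {
    ldd "$1" | filter_missing_libs
}

# Example usage, to be run on both servers:
# missing_libs /bin/tracepath
```

If Server 1 prints libresolv.so.2 here and Server 2 prints nothing, the loader's search is going wrong on Server 1 even though the file itself is present.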
HTH.
Last edited by rigor; 08-30-2015 at 08:35 PM.
Reason: more complete explanation
Wow, that's kind of an attack, isn't it? No, I didn't hide a single thing from you except for the server's address. If you want something that's not here, ask me for it.
"hashed" only means that the shell has already found the command and remembered its location. From this point of view, that is meaningless.
You are administering that host, so probably you can install strace on it.
What does locate libresolv return (on both hosts)?
How was LD_LIBRARY_PATH set?
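A sketch of how those checks might be run on each host: the function names are hypothetical, and the strace helper assumes strace is installed and that tracing file-related system calls is enough to see where the loader looks for the library.

```shell
#!/bin/sh
# Hypothetical helper: show the exported loader-related variables that
# child processes such as tracepath would inherit.
show_loader_env() {
    env | grep -E '^LD_(LIBRARY_PATH|PRELOAD)=' \
        || echo "LD_LIBRARY_PATH and LD_PRELOAD are not set"
}

# Hypothetical helper: trace file-related system calls of a command and
# keep only the lines that mention the given library name.
# Usage: trace_lib_lookups LIBNAME COMMAND [ARGS...]
trace_lib_lookups() {
    lib=$1
    shift
    strace -f -e trace=file "$@" 2>&1 | grep -- "$lib"
}

# Example usage (run on both hosts and compare):
# show_loader_env
# locate libresolv
# ldconfig -p | grep libresolv    # what the loader cache knows about
# trace_lib_lookups libresolv tracepath localhost
```

The strace output should show each path the loader tried for libresolv.so.2 and the error it got back, which is usually more specific than the message tracepath prints.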