Linux - Server
This forum is for the discussion of Linux software used in a server-related context.
We have a server environment with the following configuration:
Linux : Red Hat Enterprise Linux AS release 3 (Taroon Update 5) [ 2.4.21-32.ELsmp #1 SMP ]
WAS : 6.0.2.11
CPU : 2 x 3.0 GHz
Memory : 6 GB RAM
The server also hosts a file system holding 800 GB of data. We run into trouble whenever a cron job that backs up these files is running: there is contention between the backup process and the JVM process, which leads to the JVM crashing (which means the application server goes down). My questions in this regard are:
1. How can we see a cron job's resource usage, basically when it started, when it ended, and how long it took? Is this recorded in any log file?
2. What priority does the cron job have relative to the JVM process? Assume the cron job is configured to run not as root but as a normal user on the system.
3. How can we monitor whether there is contention between the Linux process and the Java process? Basically I am looking for anything that shows the status of the Java process and the Linux process while they are contending.
4. Is there any ideal configuration or other suggestion to avoid the above situation?
Quote:
1. How can we see a cron job's resource usage, basically when it started, when it ended, and how long it took? Is this recorded in any log file?
I've had a similar situation, and I found that the only way I could get that information (start time and end time) was to record it myself. That is, in the script or binary that cron calls to actually do your backup, add some record-keeping code. For a cron job of mine that I was struggling with (it was also a backup job), I inserted code that emailed me when it started and again when it finished. That left me with two emails: the first with the job's start time, the second with its end time. Subtract the two and you have the job's runtime.
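For what it's worth, on Red Hat systems vixie-cron itself logs each job's start time (via syslog) to /var/log/cron, but not its end time or duration. A minimal wrapper in the spirit of the above, writing to a log file instead of email, might look like this; the log path and backup command are invented placeholders, so substitute your own:

```shell
#!/bin/sh
# Hypothetical wrapper for the backup cron job: record start time, end
# time, and exit status, so the runtime can be computed from the log.
# LOG and BACKUP_CMD are illustrative defaults -- replace them.
LOG=${LOG:-/tmp/backup-job.log}
BACKUP_CMD=${BACKUP_CMD:-true}    # stand-in for the real backup command

echo "START $(date '+%Y-%m-%d %H:%M:%S')" >> "$LOG"
$BACKUP_CMD
STATUS=$?
echo "END   $(date '+%Y-%m-%d %H:%M:%S') exit=$STATUS" >> "$LOG"
```

Point the crontab entry at a wrapper like this rather than at the backup command directly; grepping the log for START/END pairs then gives the start, end, and exit status of every run.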
Quote:
2. What priority does the cron job have relative to the JVM process? Assume the cron job is configured to run not as root but as a normal user on the system.
Hmm... you are correct: unless you change something, the crontab runs as the user who ran "crontab -e" to create it. The thing is that the Sun JVM is slow, and running Java code is usually demanding on any system due to the virtualisation overhead. Are you sure your JVM is not being run at a high priority to make it as fast as possible? That by itself should not cause your problem, though: if the backup runs at a higher nice value (lower priority), it will simply wait longer for the CPU, i.e. the JVM should still run fine and the backup will just go slower.
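If anything, you can push the contention the other way: run the backup with nice so the JVM wins the CPU scheduling. A hedged sketch (note that per-process I/O priorities via ionice need a 2.6 kernel with the CFQ scheduler, so on a 2.4 box only CPU niceness is available; the crontab path is invented):

```shell
# Hypothetical crontab entry: run the nightly backup at the lowest CPU
# priority (niceness 19) so the JVM's threads get scheduled first.
#   0 1 * * *  nice -n 19 /usr/local/bin/do_backup.sh
#
# Demonstration that nice takes effect: running "nice" with no arguments
# under "nice -n 19" prints the child's resulting niceness.
nice -n 19 nice
```

A niced backup will take longer, but the application server stays responsive while it runs.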
Quote:
3. How can we monitor whether there is contention between the Linux process and the Java process? Basically I am looking for anything that shows the status of the Java process and the Linux process while they are contending.
Hmm, well, you can find out whether a file is being contended. For example, on my system, to find all the files being held open by Apache I can do:
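(The command itself appears to have been lost from the post; it was most likely lsof. A hedged reconstruction, with the process names as examples:)

```shell
# Most likely the elided command: list all files held open by Apache
# processes, selected by command name with -c.
#   lsof -c httpd
# For the application server, substitute the java process name:
#   lsof -c java
# Where lsof is not installed, /proc gives the same per-process view;
# here, the open file descriptors of the current shell:
ls -l /proc/$$/fd
```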
Not sure if it will help. Substitute the "java" process on your system. Does it generally keep data files open for long periods? As you can see above, Apache mostly has libraries and its own binary open at any given time. It depends on what your Java application does with the data files your backup process is after...
Quote:
4. Is there any ideal configuration or other suggestion to avoid the above situation?
Well, my situation is that I have a public webserver taking lots of file uploads every 24 hours. I wrote a cron job that uses the time of least activity (around 00:00 each day) to back those files up to an offsite backup server. This works and does limit contention, since very few people (if anybody at all!) are using the primary upload site at 00:00 local time... not sure if this helps or is applicable to your situation.
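For reference, the scheduling itself is a one-line crontab entry; the script path and log file here are invented for illustration:

```shell
# Run the upload backup at 00:00 every day, appending all output to a log:
#   0 0 * * *  /usr/local/bin/backup_uploads.sh >> /var/log/backup_uploads.log 2>&1
```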
Additionally, you seem to be running a very old kernel. Have you considered upgrading to the 2.6 series? I'd guess it has improved file locking and mutex/semaphore handling that would limit the kind of contention and concurrency issues you seem to be experiencing.
Besides that, how exactly is your Java application written? Does it tend to keep files open for long periods? A workaround might be to have it open a file, work with it, and close it again. This might be slower (you'll constantly be opening and closing files, and it might even be bad practice and make the JVM unstable), but it could help avoid holding a data file open for writing all the time when it only needs to be open for writing in certain situations.
Quote:
4. Is there any ideal configuration or other suggestion to avoid the above situation?
Clients access the application(s) from around the world, so there is no ideal time when the load on the server is zero or even very low. But on weekends the load is minimal. Yes, we are also running an old kernel version; we plan to upgrade to the latest, but that is not possible within this year's budget. This version of the OS has a known memory-management issue. But one more question remains:
Consider the scenario in which the job backs up the data and dumps it to offline storage such as tape. Since the amount of data is huge (800 GB), this process takes a long time, and if JVM processes are running in the meantime and the load on the server increases, that has a drastic impact on both. Since both are critical from a business point of view, you can't mess with the backup process, nor can you slow down the users' access.
Quote:
Clients access the application(s) from around the world, so there is no ideal time when the load on the server is zero or even very low.
I see your point. However, even mega-sites like Facebook are occasionally slow or unavailable (I've experienced it myself); surely five or ten minutes of downtime is acceptable? Or half an hour of slow performance on a Sunday morning, for example?
Quote:
But on weekends the load is minimal. Yes, we are also running an old kernel version; we plan to upgrade to the latest, but that is not possible within this year's budget. This version of the OS has a known memory-management issue. But one more question remains:
OK, I did not know that particular fact about your kernel... this could explain a lot. What form does the memory-management issue take? If it can lead to excessive virtual memory use, the slowdown is self-explanatory: as the kernel runs out of RAM it starts using swap, and a hard disk is orders of magnitude slower than main RAM. Hence the slow performance?
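To check whether the slowdown during the backup window really is swapping, watch the swap counters while the job runs. A hedged diagnostic using interfaces that exist on both 2.4 and 2.6 kernels:

```shell
# Snapshot of memory and swap state straight from the kernel:
grep -E '^(MemTotal|MemFree|SwapTotal|SwapFree)' /proc/meminfo
# While the backup runs, vmstat's si/so columns show pages swapped
# in/out per second; sustained non-zero values mean real swapping:
#   vmstat 5
```

If SwapFree drops sharply (or si/so stay non-zero) only while the cron job is active, memory pressure from the backup is the likely culprit rather than file contention as such.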
Quote:
Consider the scenario in which the job backs up the data and dumps it to offline storage such as tape. Since the amount of data is huge (800 GB), this process takes a long time, and if JVM processes are running in the meantime and the load on the server increases, that has a drastic impact on both. Since both are critical from a business point of view, you can't mess with the backup process, nor can you slow down the users' access.
I see your problem... This is an unavoidable consequence of your backup strategy, i.e. accumulate/back up, accumulate/back up, in 24-hour (or whatever) cycles. It is inevitable that the system will be slower for the period the backup is running.
However, we took a different approach at a government concern I once worked at (the local fire & rescue services' "911-like" control centre). There were two servers, with the secondary "watching" the primary for transactions and immediately copying any that occurred. This of course imposed a performance hit on every single transaction on the system, but it gave us instant recoverability (versus "only up to the moment of the last backup" in your kind of situation). It also shaped user expectations: users were accustomed to the system having a certain speed, and it always ran at that speed, never slower or faster, while every transaction was being backed up instantly.
Thus the user reaction of "what the !@#$ are the idiots up to now?" when a backup is running and the system is slow never occurred, because everybody was already used to the system's speed, and it was already backing up every transaction...
Shaping user expectations in this way might help give you a good reputation with your users.
However, it seems unlikely that you will be getting a second backup server, so you'll have to put up with your current strategy. But if you plan future expansion or purchases, consider the above approach. If I'm not mistaken, Facebook (for example) uses a much more involved version of the same basic concept in their server farms, with the same effect: they are almost never down (99.9% uptime seems about right for them), and most of the time their performance is consistent instead of dipping badly while an apparent backup is running.