LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

10-26-2009, 12:53 AM   #1
sanu_sasidharan
LQ Newbie
 
Registered: Aug 2009
Location: Bangalore India
Posts: 3

Rep: Reputation: 0
Java process VS Linux process.


We have a server environment with the following configuration:

Linux : Red Hat Enterprise Linux AS release 3 (Taroon Update 5)[ 2.4.21-32.ELsmp #1 SMP ]
WAS : 6.0.2.11
CPU : 2 x 3.0 GHz
Memory: 6 GB Ram

The server also hosts a file system holding 800 GB of data. We run into problems whenever a cron job that backs up these files is running. There is contention between the Linux backup process and the JVM process, which leads to a crash of the JVM (that is, the application server goes down). My questions are as follows.

1. How can we see the usage of a cron job, that is, when it started, when it ended, and how long it took to complete? Is this data persisted in any log file?
2. What is the priority of the cron job relative to the JVM process? Assume the cron job is configured to run not as root but as one of the ordinary users on the system.
3. How can we monitor whether there is contention between the Linux process and the Java process? Basically, I am looking for anything that shows the status of the Java process and the Linux process while they are contending.
4. Is there any ideal configuration, or any suggestion, to avoid the above situation?
 
10-26-2009, 03:38 AM   #2
rylan76
Senior Member
 
Registered: Apr 2004
Location: Potchefstroom, South Africa
Distribution: Fedora 17 - 3.3.4-5.fc17.x86_64
Posts: 1,552

Rep: Reputation: 103
Quote:
Originally Posted by sanu_sasidharan
We have a server environment with the following configuration
1. How can we see the usage of a cron job, that is, when it started, when it ended, and how long it took to complete? Is this data persisted in any log file?
I've had a similar situation, and I found that the only way to get the above info (start time and end time) was to record it myself, i.e. to put some kind of record-keeping code in the script or binary that cron calls to do your backup. When I wanted start and end times for a cron job I was struggling with (it was also meant to make backups), I inserted code that emailed me when it started and when it finished. I ended up with two emails, the first with the start time of the cron job and the second with its end time. Subtract the two and you have the runtime of the job.
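
A minimal sketch of that kind of record keeping, assuming you would rather append to a log file than send emails. The function name `run_timed` is made up for illustration and is not part of cron or any existing tool; it simply writes a START and an END line around whatever command it is given. `date +%s` is assumed to be available (it is on GNU systems).

```shell
#!/bin/sh
# Hypothetical helper: run_timed LOGFILE COMMAND [ARGS...]
# Appends a START line and an END line (with exit status and elapsed
# seconds) to LOGFILE around the given command.
run_timed() {
    rt_log=$1
    shift

    rt_start=$(date +%s)
    echo "$(date '+%Y-%m-%d %H:%M:%S') START $*" >> "$rt_log"

    # Run the real job and remember its exit status
    rt_status=0
    "$@" || rt_status=$?

    rt_end=$(date +%s)
    echo "$(date '+%Y-%m-%d %H:%M:%S') END $* status=$rt_status elapsed=$((rt_end - rt_start))s" >> "$rt_log"

    # Pass the job's exit status back to the caller (e.g. cron)
    return "$rt_status"
}
```

Saved in a small wrapper script that ends with `run_timed "$@"`, this could be called from the crontab entry in place of the backup command, with a log file path and the backup command as arguments (both of which you would choose yourself).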

Quote:
2. What is the priority of the cron job relative to the JVM process? Assume the cron job is configured to run not as root but as one of the ordinary users on the system.
Hmm... you are correct: unless you change something, the crontab runs as the user who created it with "crontab -e". The thing is that the Sun JVM is SLOW, and running Java code is usually demanding on any system because of Java's virtualisation overhead. Are you sure that your JVM is not automatically running at high priority so that it runs as fast as possible? Of course, that by itself should not cause your problem: if that is the case, the process with the higher nice value (i.e. the lower priority) will just wait longer, so the JVM should still run fine and the backup will just go slower.
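
To check this rather than guess, the nice value of a running process can be read with `ps`, and the backup can be started at the lowest CPU priority with `nice`. A rough sketch; the function names are made up for illustration, and note that this only influences CPU scheduling, not disk I/O.

```shell
#!/bin/sh
# Print the nice value (NI column) of a PID.
# 0 is the default, 19 is the lowest priority, negative values are
# higher priority (and can only be set by root).
nice_of() {
    ps -o ni= -p "$1" | tr -d ' '
}

# Run a command at the lowest CPU priority
run_low_priority() {
    nice -n 19 "$@"
}
```

For example, `nice_of` with the PID of the java process shows whether the JVM really runs at a raised priority, and prefixing the backup command in the cron job with `run_low_priority` (or simply `nice -n 19`) makes the backup yield the CPU to the JVM.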

Quote:
3. How can we monitor whether there is contention between the Linux process and the Java process? Basically, I am looking for anything that shows the status of the Java process and the Linux process while they are contending.
Hmm, well, you can find out whether a file is being contended. For example, on my system, to find all the files being used by Apache I can do:

1. Get the httpd process's PID:

Code:
[rylan@development generic]$ ps -A | grep httpd
 1977 ?        00:00:00 httpd
 1984 ?        00:00:02 httpd
 1986 ?        00:00:00 httpd
 1987 ?        00:00:03 httpd
 1988 ?        00:00:13 httpd
 2322 ?        00:00:01 httpd
 3897 ?        00:00:03 httpd
 3900 ?        00:00:00 httpd
 3986 ?        00:00:00 httpd
 3995 ?        00:00:03 httpd
 4392 ?        00:00:00 httpd
[rylan@development generic]$
2. See the files being used by httpd:

Code:
[rylan@development generic]$ lsof | grep 1977
httpd     1977      root  cwd   unknown                               /proc/1977/cwd (readlink: Permission denied)
httpd     1977      root  rtd   unknown                               /proc/1977/root (readlink: Permission denied)
httpd     1977      root  txt   unknown                               /proc/1977/exe (readlink: Permission denied)
httpd     1977      root  mem       REG       3,66   198840    385821 /usr/lib/libidn.so.11.5.19
httpd     1977      root  mem       REG       3,66     7944   8644036 /lib/libcom_err.so.2.1
httpd     1977      root  mem       REG       3,66   174252    385829 /usr/lib/libgssapi_krb5.so.2.2
httpd     1977      root  mem       REG       3,66    30788    385826 /usr/lib/libkrb5support.so.0.1
httpd     1977      root  mem       REG       3,66   157196    385827 /usr/lib/libk5crypto.so.3.0
httpd     1977      root  mem       REG       3,66   100992   8644033 /lib/libnsl-2.5.so
httpd     1977      root  mem       REG       3,66    76392   8644035 /lib/libresolv-2.5.so
httpd     1977      root  mem       REG       3,66   121652   8644021 /lib/ld-2.5.so
httpd     1977      root  mem       REG       3,66  1577052   8644022 /lib/libc-2.5.so
httpd     1977      root  mem       REG       3,66    99252    366230 /usr/lib/libsasl2.so.2.0.22
httpd     1977      root  mem       REG       3,66    25420    385864 /usr/lib/libgdbm.so.2.0.0
httpd     1977      root  mem       REG       3,66    16528   8644023 /lib/libdl-2.5.so
httpd     1977      root  mem       REG       3,66   125564   8644024 /lib/libpthread-2.5.so
httpd     1977      root  mem       REG       3,66    75284    368726 /usr/lib/libz.so.1.2.3
httpd     1977      root  mem       REG       3,66   240476    385869 /usr/lib/libldap-2.3.so.0.2.15
httpd     1977      root  mem       REG       3,66   247920    366320 /usr/lib/libcurl.so.3.0.0
httpd     1977      root  mem       REG       3,66   133120   8644029 /lib/libexpat.so.0.5.0
httpd     1977      root  mem       REG       3,66   153224    385802 /usr/lib/libpng12.so.0.10.0
httpd     1977      root  mem       REG       3,66    44088   8644025 /lib/librt-2.5.so
httpd     1977      root  mem       REG       3,66   136036    369551 /usr/lib/libjpeg.so.62.0.0
httpd     1977      root  mem       REG       3,66  1238928   8644037 /lib/libcrypto.so.0.9.8b
httpd     1977      root  mem       REG       3,66   551468    385828 /usr/lib/libkrb5.so.3.2
httpd     1977      root  mem       REG       3,66   280688   8644038 /lib/libssl.so.0.9.8b
httpd     1977      root  mem       REG       3,66    27836   8644046 /lib/libcrypt-2.5.so
httpd     1977      root  mem       REG       3,66  1011024   8644048 /lib/libdb-4.3.so
httpd     1977      root  mem       REG       3,66   672838   1737457 /usr/local/apache2/bin/httpd
httpd     1977      root  mem       REG       3,66 55517136    363908 /usr/lib/locale/locale-archive
httpd     1977      root  mem       REG       3,66    46740   8642345 /lib/libnss_files-2.5.so
httpd     1977      root  mem       REG       3,66  1750855    365616 /usr/local/lib/libxml2.so.2.6.19
httpd     1977      root  mem       REG       3,66   708002    365041 /usr/local/lib/libfreetype.so.6.3.8
httpd     1977      root  mem       REG       3,66    53988    375569 /usr/lib/liblber-2.3.so.0.2.15
httpd     1977      root  mem       REG       3,66  7123909   1737464 /usr/local/apache2/modules/libphp5.so
httpd     1977      root  mem       REG       3,66   208344   8644028 /lib/libm-2.5.so
httpd     1977      root  mem       REG       3,66   189993   1735768 /usr/local/apache2/lib/libapr-0.so.0.9.6
httpd     1977      root  DEL       REG        0,8               6620 /dev/zero
httpd     1977      root  mem       REG       3,66   128900   1735805 /usr/local/apache2/lib/libaprutil-0.so.0.9.6
httpd     1977      root NOFD                                         /proc/1977/fd (opendir: Permission denied)
[rylan@development generic]$
Not sure if it will help. Substitute the "java" process on your system for httpd. Does it generally keep -data- files open for long periods? As you can see above, Apache mainly has libraries and its own binary open at any given time. It depends on what your Java application does with the data files that your backup process is after...
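
If `lsof` is not installed, much the same information can be read from `/proc`. A small sketch, assuming a Linux `/proc` file system and permission to read the target process's `fd` directory (your own processes, or run it as root); `open_files` is a hypothetical helper name. Unlike the `lsof` listing above, it only shows open file descriptors, not memory-mapped libraries.

```shell
#!/bin/sh
# List the regular paths a process currently holds open, one per line.
# Usage: open_files PID
open_files() {
    for of_fd in /proc/"$1"/fd/*; do
        # Each entry is a symlink to the open file; skip unreadable ones
        of_target=$(readlink "$of_fd" 2>/dev/null) || continue
        case $of_target in
            # Keep real paths, drop sockets and pipes ("socket:[...]")
            /*) printf '%s\n' "$of_target" ;;
        esac
    done | sort -u
}
```

Running `open_files` against the PID of the java process while the backup is active would show which data files, if any, the JVM is holding open at that moment.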

Quote:
4. Is there any ideal configuration, or any suggestion, to avoid the above situation?
Well, my situation is that I have an online webserver that takes lots of file uploads every 24 hours. I wrote a cron job that uses the time of LEAST activity (around 00:00 each day) to back these files up to an offsite backup webserver. This seems to work, and it does limit contention, since very few people (if anybody at all!) are using the primary file upload site at 00:00 local time... Not sure if this will help or is applicable to your situation.

Additionally, you seem to be running a very old kernel version. Have you considered upgrading to the 2.6 kernel series? I'm guessing it has improved file locking, mutexes / semaphores and so on, which would limit contention or concurrency issues like the ones you seem to be experiencing.

Besides that, how exactly is your Java application written? Does it tend to keep files open for long periods? A workaround might be to have it open a file, work with it, and then close it. This might be slower (you'll constantly be opening and closing files; it might even be bad practice and make the JVM unstable), but it might avoid holding a data file locked for writing all the time when it only needs to be open for writing in certain situations...?

Last edited by rylan76; 10-26-2009 at 03:40 AM.
 
10-27-2009, 04:14 AM   #3
sanu_sasidharan
LQ Newbie
 
Registered: Aug 2009
Location: Bangalore India
Posts: 3

Original Poster
Rep: Reputation: 0
Quote:
4. Is there any ideal configuration, or any suggestion, to avoid the above situation?
The clients access the application(s) from around the world, so there is no ideal time when the load on the server is zero or very low. On weekends, though, the load is minimal. Yes, we are also running an old kernel version; we plan to upgrade to the latest version, but that is not possible within this year's budget. This version of the OS has a known issue in memory management. But one more question remains.

Consider the scenario in which the job backs up the data and dumps it to offline storage such as tape. Since the amount of data is huge (800 GB), this process takes a long time, and if any JVM process is running in the meantime and the load on the server increases, that will have a drastic impact on the process. Both are critical from a business point of view. So how do we handle this when we can neither interfere with the backup process nor slow down the users accessing the application?
 
10-27-2009, 07:47 AM   #4
rylan76
Senior Member
 
Registered: Apr 2004
Location: Potchefstroom, South Africa
Distribution: Fedora 17 - 3.3.4-5.fc17.x86_64
Posts: 1,552

Rep: Reputation: 103
Quote:
Originally Posted by sanu_sasidharan
The clients access the application(s) from around the world, so there is no ideal time when the load on the server is zero or very low.
I see your point. However, even mega-sites like Facebook are occasionally slow or unavailable (I've experienced it myself), so surely five or ten minutes of downtime is acceptable? Or half an hour of slow performance, on a Sunday morning for example, can surely be acceptable.

Quote:
On weekends, though, the load is minimal. Yes, we are also running an old kernel version; we plan to upgrade to the latest version, but that is not possible within this year's budget. This version of the OS has a known issue in memory management. But one more question remains.
OK, I did not know that particular fact about your kernel... this could explain a lot. What form does the memory management issue take? If it can lead to excessive virtual memory use, the slowdown is self-explanatory: as the kernel runs out of RAM it will start using swap (virtual memory), and an HDD is often hundreds of thousands if not millions of times slower than main RAM, hence the slow performance?
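
One way to test the swap theory is to record swap usage while the backup runs. A minimal sketch, assuming `/proc/meminfo` exposes `SwapTotal:` and `SwapFree:` lines in kB; the function names are hypothetical.

```shell
#!/bin/sh
# Print swap in use, in kB, from a meminfo-style file
# (defaults to /proc/meminfo).
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END           { print total - free }' "${1:-/proc/meminfo}"
}

# Append one timestamped sample to the log file given as $1
log_swap_sample() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') swap_used_kb=$(swap_used_kb)" >> "$1"
}
```

Calling `log_swap_sample` every minute or so from a separate cron entry would show whether swap usage climbs while the backup is running. Watching the `si` and `so` columns of `vmstat` during the backup gives a similar picture interactively.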

Quote:
Consider the scenario in which the job backs up the data and dumps it to offline storage such as tape. Since the amount of data is huge (800 GB), this process takes a long time, and if any JVM process is running in the meantime and the load on the server increases, that will have a drastic impact on the process. Both are critical from a business point of view. So how do we handle this when we can neither interfere with the backup process nor slow down the users accessing the application?
I see your problem... This is an unavoidable consequence of your backup strategy, i.e. accumulate / back up, accumulate / back up in 24-hour (or whatever) cycles. It is inevitable that the system will be slower for as long as the backup is running.

However, we took a different approach at a government concern I once worked at (the local fire & rescue services' "911-like" control center). There were two servers, with the secondary "watching" the primary for any transactions and immediately copying any that occurred. This, of course, implied a performance hit for every single transaction on the entire system, but it gave us instant recoverability (vs. "only up to the moment of the last backup" in your kind of situation). BUT it also shaped a user expectation: users of the system were used to it having a certain speed, and it would -always- run at that speed, never slower or faster, while instant backups were taking place on every single transaction.

Thus the user reaction of "what the !@#$!@ are the idiots up to now" whenever a backup was running and the system was slow never occurred, because everybody was ALREADY used to the system speed, and it was ALREADY backing up each transaction...

Shaping user expectations might help to give you a good reputation with your users.

However, it seems unlikely that you will be getting a second (backup) server any time soon, so you'll have to put up with the strategy you have now. But if you plan future expansion or purchases, try the above approach. If I'm not mistaken, Facebook (for example) uses a much more involved version of the same basic concept in its server farms, but the effect is the same. They are almost never down (99.9% uptime seems true for them), and most of the time (apart from the occasions I mentioned above) their performance is steady instead of dipping badly while an apparent backup is running.

Last edited by rylan76; 10-27-2009 at 07:50 AM.
 
  


All times are GMT -5. The time now is 07:44 PM.
