LinuxQuestions.org


premkumar.st 02-07-2013 04:32 AM

How to see the History of processes that has been completed/killed?
 
Hi,

Can someone please advise me on how to get the history of processes that have completed or been killed? We need this for tracking purposes, and it would also help us check system performance by showing which application takes the most resources on the server.
I know we could set up a basic script run from cron to track this, but I'm looking for such a script too.

OS: RHEL 5.8 64 bit
System type: Virtual
Processes required: ALL application related.

Thanks in Advance!

Prem

jv2112 02-07-2013 04:53 AM

You could try running top in batch mode, sending the output to a log, and then sorting by process so you can see when each one stops.


Code:


top -b > log.txt


premkumar.st 02-07-2013 07:07 AM

Thanks for your time.

But this will only capture details of current and future processes. Actually I'm looking for a way to get details of past processes (I know this should have been set up at the time of application deployment/need).

OK, I'll explain the actual need. We noticed the /tmp filesystem was almost full (100%) and were tracking down the process holding the most space under /tmp. Unfortunately, the problem resolved itself (meaning /tmp went back to normal; we're not sure how). But the application team suspects that we deleted files or killed processes to bring /tmp back to normal. So we are looking for the best way to prove that /tmp usage depends PURELY on the application's behavior: the application uses /tmp whenever one of its processes needs temporary space and releases that space automatically.

So we are seeking your help to get the history of processes that used the most space on the /tmp filesystem.

Thanks in Advance!

Prem

unSpawn 02-07-2013 09:13 AM

Quote:

Originally Posted by premkumar.st (Post 4886200)
Actually I'm looking for a way to get details of past processes (..) OK, I'll explain the actual need,

Some advice for next time: make it a habit to present information in a detailed and complete way. I hope you understand the efficiency of that.


Quote:

Originally Posted by premkumar.st (Post 4886200)
(I know it should be made at the time of application deployment/need).

That indicates you should review requirements wrt testing and roll-out procedures. And does it really make sense to ask for historical data anyway if you know you haven't implemented data collection beforehand?


Quote:

Originally Posted by premkumar.st (Post 4886200)
But unfortunately, the problem resolved itself (meaning /tmp went back to normal; we're not sure how).

Then find out.


Quote:

Originally Posted by premkumar.st (Post 4886200)
we noticed the /tmp filesystem was almost full (100%) and were tracking down the process holding the most space under /tmp. (..) we are looking for the best way to prove that /tmp usage depends PURELY on the application's behavior: the application uses /tmp whenever one of its processes needs temporary space and releases that space automatically.

Since you need detailed process statistics, a generic SAR (Atsar, Atop, Collectl, Dstat, sar) probably won't do. There are several ways to trigger logging, including inotify, FUSE LoggedFS, the audit service, SystemTap or (ancient) dnotify. What you use will be a trade-off between what the system already offers, the invasiveness of the alternatives and the granularity with which stats must be logged. You could start with, say, inotifywait triggering a shell script that collects information using 'lsof'.
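
As a rough sketch of that last idea, assuming the inotify-tools package is installed (it is not part of a stock RHEL 5 install and typically comes from an add-on repository). The variable names, function names and log path are made up for illustration:

```shell
#!/bin/sh
# Sketch only: log create/delete/modify events under a directory and, for
# each event, record which processes currently have the file open.
# WATCH_DIR and LOG_FILE are illustrative names, not from this thread.
WATCH_DIR=${WATCH_DIR:-/tmp}
LOG_FILE=${LOG_FILE:-/var/log/tmp-watch.log}

# Print one event line plus any lsof output for the affected path.
log_event() {
    # $1 = timestamp, $2 = event names, $3 = full path
    printf '%s %s %s\n' "$1" "$2" "$3"
    # A deleted or very short-lived file will no longer exist here,
    # in which case only the event line is recorded.
    if command -v lsof >/dev/null 2>&1 && [ -e "$3" ]; then
        lsof -- "$3" 2>/dev/null
    fi
}

# Follow events until interrupted; requires inotifywait from inotify-tools.
# Fields are separated with '|', so file names containing '|' will be
# split incorrectly.
watch_dir() {
    inotifywait -m -r -e create -e delete -e modify \
        --timefmt '%F %T' --format '%T|%e|%w%f' "$WATCH_DIR" |
    while IFS='|' read -r ts events path; do
        log_event "$ts" "$events" "$path"
    done >>"$LOG_FILE"
}

# Start watching only when explicitly asked, e.g.: sh tmp-watch.sh run
if [ "${1:-}" = "run" ]; then
    watch_dir
fi
```

Keep in mind that lsof runs after the event has fired, so a process that writes and closes a file quickly may already be gone by the time it is queried.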

shivaa 02-07-2013 09:28 AM

Which processes: all system plus user processes, or only processes generated by commands invoked by users?

If your purpose is just to check history of commands/processes for a single user, you can:-
Code:

~$ history
Or check the corresponding user's .bash_history file for all commands run by that user:
Code:

~$ cat .bash_history
Otherwise you will need to set up a log file, as jv2112 already said above. It will keep updating.
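
For the /tmp case specifically, a log file like that could be fed from cron. This is a minimal sketch only; the function name, script path, log path and schedule are all assumptions chosen for illustration:

```shell
#!/bin/sh
# Sketch only: append a timestamped usage snapshot of a directory to a log.
# The function name, paths and cron schedule are illustrative assumptions.

log_dir_usage() {
    # $1 = directory to inspect, $2 = log file to append to
    dir=$1
    log=$2
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # Filesystem-level view: size, used, available, use%.
        df -P "$dir"
        # Largest visible entries directly under the directory, in KiB,
        # biggest first (dot files are not matched by the glob).
        du -sk "$dir"/* 2>/dev/null | sort -rn | head -n 10
    } >>"$log"
}

# Take one snapshot when called as: sh tmp-usage.sh run
if [ "${1:-}" = "run" ]; then
    log_dir_usage /tmp /var/log/tmp-usage.log
fi

# Example crontab entry (every 5 minutes), assuming the script is saved
# as /usr/local/bin/tmp-usage.sh:
#   */5 * * * * /bin/sh /usr/local/bin/tmp-usage.sh run
```

This shows how full /tmp was and which files were large at each sample, but not which process wrote them, and anything that happens between two samples is missed.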

chrism01 02-07-2013 07:09 PM

history only shows processes initiated; OP wants to know completion times.

As per unSpawn, probably inotify/inotifywait, although adding instrumentation to the app, if you can pin down which one(s) it is, would be good.

jpollard 02-07-2013 09:27 PM

You want "process accounting", which records information about each process as it runs and terminates. Enabling it causes the kernel to write pacct (process accounting) records.
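
A minimal sketch of how that might look on a RHEL-style box, to be run as root. The package and service name psacct is the RHEL/CentOS one (Debian-style systems call it acct); the function names and the dispatch at the bottom are just illustrative:

```shell
#!/bin/sh
# Sketch only: enable process accounting on a RHEL-style system and query it.
# Run as root. Package/service name "psacct" assumes RHEL/CentOS.

enable_accounting() {
    yum -y install psacct      # provides accton, lastcomm, sa
    chkconfig psacct on        # enable at boot
    service psacct start       # begin writing accounting records now
}

# Show accounting records for the given command names, e.g. rm and unlink.
# Only processes that exited after accounting was enabled will appear.
show_commands() {
    for cmd in "$@"; do
        lastcomm --command "$cmd"
    done
}

# Per-user totals from the accounting file.
summarize_by_user() {
    sa -m
}

# Nothing runs unless an action is named on the command line.
case "${1:-}" in
    enable)  enable_accounting ;;
    rm)      show_commands rm unlink ;;
    summary) summarize_by_user ;;
esac
```

Records are written when a process exits, so nothing from before accounting was switched on can be recovered this way.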

chrism01 02-07-2013 10:35 PM

Darn, should have thought of that ;)
Make sure you have enough disk space and/or log rotation set up.

unSpawn 02-08-2013 05:40 AM

Quote:

Originally Posted by jpollard (Post 4886719)
You want "process accounting", which records all the information about a process as it runs, and terminates.

What 'top', generic SAR and pacct have in common is that they only record and display commands executed by users. But as far as I am aware none of them has, or ever had, the granularity to link up or differentiate between command execution on the command line and command execution via, say, an 'at' script, and neither are they capable of exposing application internals like, say, "give me (all unlink syscalls where user is nobody and mount point is /tmp)".

So, if I read the OP's question right, then pacct could help answer the part where he needs "evidence", or proof by omission, that neither he nor any other logged-in user deleted files using commands executed solely in their shell session (/bin/rm, /bin/unlink, /sbin/service, etc.).

But none of 'top', SAR or pacct would show /tmp usage specifically for that application if the application is a binary. strace can (except it's cumbersome having to attach it each time a process starts), the audit service can (example rule and output here), inotify-tools can and FUSE LoggedFS can (example rule and output here).
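
To give an idea of the audit and strace options, here is a hedged sketch to be run as root. The key name tmp_watch and the function names are invented for illustration, and how deep a directory watch reaches depends on the kernel and audit version, so verify it on the target system:

```shell
#!/bin/sh
# Sketch only: two ways to see file activity under /tmp. Run as root.
# The key name "tmp_watch" and the function names are illustrative.

# 1. Audit watch: log writes and attribute changes on the watched path.
add_tmp_watch() {
    auditctl -w /tmp -p wa -k tmp_watch
}

# Report the matching audit records with numeric fields interpreted.
show_tmp_events() {
    ausearch -k tmp_watch -i
}

# 2. strace: attach to one running process (and its children) and log
#    syscalls that take a file name, with timestamps.
trace_pid() {
    # $1 = PID to attach to, $2 = output file
    strace -f -tt -e trace=file -p "$1" -o "$2"
}

# Nothing runs unless an action is named on the command line.
case "${1:-}" in
    watch)  add_tmp_watch ;;
    report) show_tmp_events ;;
    trace)  trace_pid "$2" "$3" ;;
esac
```

Audit records carry the pid, uid and executable of the caller, which is the part that process accounting alone cannot supply.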

jpollard 02-08-2013 07:05 AM

Top itself doesn't record. It takes repeated snapshots, and hence will miss any short lived process. SAR records every process termination.

The records should include (though I haven't looked at them for quite a while) recording the "control tty" (if attached) and process group lead, that should allow reconstruction of entire sessions (though it is very slow at doing so). Some implementations used this to do chargeback accounting for "jobs" (the reconstructed session).

The advantage SAR has is that it records accumulated usage... The disadvantage SAR has is that this usage is counted/averaged over the lifetime of the process - and that doesn't record the actual usage where peaks (such as temporary high memory usage) can be recognized, and used for system tuning.

unSpawn 02-08-2013 07:21 AM

If I read the OP's question right and assume that "application" here means a binary then, with all due respect, you're still missing the point IMHO. None of history, top or psacct is capable of recording and displaying /tmp usage by an application as the OP says he wants, nor would reconstructing sessions help with that in any way, simply because these tools do not record bare system call usage the way, for example, strace or audit would. The only other way, as chrism01 already suggested, would be to add instrumentation to the application itself and make it emit debug info.

jpollard 02-08-2013 07:43 AM

Disk usage would be handled by the disk quota subsystem. Normally that is attributed to the user rather than to the process. Also, the inotify method has a documented race condition that prevents it from being fully reliable: it can miss file activity in a directory if that activity occurs immediately after the directory is created, because the inotify process has no time to add a watch on the new directory before the other file activity happens. The FUSE logging approach covers that case, but at a severe performance penalty (the basic FUSE overhead plus the logging activity).

One of the problems with attempting such a level of tracking is that a file may be deleted from the file system but continue to be used (the file is still open). This allows data to be written with the explicit plan that if the process terminates, the space will be discarded.
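
Such deleted-but-still-open files can at least be spotted while the process is alive. The usual tool is `lsof +L1`; the sketch below does roughly the same by scanning /proc, and is Linux-specific. The function name is made up for illustration, and it needs root to see other users' processes:

```shell
#!/bin/sh
# Sketch only: list open-but-deleted files under a directory by scanning
# /proc. Linux-specific; the function name is illustrative.

deleted_open_files() {
    # $1 = directory prefix to filter on, e.g. /tmp
    prefix=${1%/}
    for fd in /proc/[0-9]*/fd/*; do
        # Each fd entry is a symlink; unlinked targets end in " (deleted)".
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$prefix"/*" (deleted)")
                # Extract the PID from /proc/<pid>/fd/<n>.
                pid=${fd#/proc/}
                pid=${pid%%/*}
                printf '%s %s\n' "$pid" "$target"
                ;;
        esac
    done
}
```

Calling `deleted_open_files /tmp` prints one "PID path (deleted)" line per open descriptor. Space held this way shows up in df but not in du, which is one way /tmp can look full and then "fix itself" when the process exits.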

Disk quotas will report the usage and correctly attribute it to the user responsible. I'm not so sure about the inotify method, as once a file is deleted it cannot be tracked... And I'm not sure about the FUSE logging in that situation either, as there is no path associated with the file.
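
Where quotas are not enabled, a per-user total can be approximated by hand. This is a stand-in for a quota report (with quotas on, `repquota -a` is the real thing), it assumes GNU find, and the function name is only illustrative:

```shell
#!/bin/sh
# Sketch only: total bytes of regular files per owner under a directory.
# A stand-in for quota reports, not the quota subsystem itself.
# Requires GNU find (-printf); the function name is illustrative.

usage_by_owner() {
    # $1 = directory to scan (stays on one filesystem because of -xdev)
    find "$1" -xdev -type f -printf '%u %s\n' 2>/dev/null |
    awk '{ total[$1] += $2 } END { for (u in total) print u, total[u] }'
}
```

For example, `usage_by_owner /tmp` prints one "owner bytes" line per file owner. It sums apparent file sizes rather than allocated blocks, and like du it cannot see files that were deleted while still open.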

unSpawn 02-08-2013 07:59 AM

Quote:

Originally Posted by jpollard (Post 4886999)
Disk usage would be handled by the disk quota subsystem. Normally that is attributed to the user rather than to the process.

I'm not going to ask you to explain how quota data would show when (time stamp) or in what way (file name records, file access) an application uses say /tmp because it simply can't.


Quote:

Originally Posted by jpollard (Post 4886999)
Also, the inotify method has a documented race condition that prevents it from being fully reliable: it can miss file activity in a directory if that activity occurs immediately after the directory is created, because the inotify process has no time to add a watch on the new directory before the other file activity happens. The FUSE logging approach covers that case, but at a severe performance penalty (the basic FUSE overhead plus the logging activity).

While it's sure nice to know these intricacies, none of that nitpicking detracts from the fact that psacct can't provide this while auditd, inotify or FUSE LoggedFS can.

jpollard 02-08-2013 08:09 AM

Didn't say quota can - it can only attribute usage to a user.

And I identified a case where inotify can fail, and another where the FUSE loggedFS may also fail.

Now the auditd approach looks interesting... though you might not be able to track usage to a specific file (especially after it is deleted). It does look like the most reliable.


All times are GMT -5.