
DKSL 09-28-2012 11:52 PM

Understanding /proc/pid/sched
I am trying to understand the output of /proc/pid/sched. Does anyone have a good idea of what these fields mean?

se.exec_start : 682057.318915
se.vruntime : 561276.188450
se.sum_exec_runtime : 238.118766
nr_switches : 393
nr_voluntary_switches : 44
nr_involuntary_switches : 349
se.load.weight : 1024
policy : 0
prio : 120
clock-delta : 1291

Specifically, I would like to know about se.sum_exec_runtime and clock-delta.

suttiwit 10-01-2012 10:43 AM

What are you trying to achieve?

DKSL 10-01-2012 11:59 AM

I want to measure a process's CPU usage time, and I am wondering if se.sum_exec_runtime gives that value?
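For what it's worth, here is a minimal sketch (my own illustration, not anything official) of pulling values out of that file, assuming the colon-separated layout shown in the first post; the function name and error handling are my own:

```python
def parse_sched(text):
    """Return a dict of field name -> number from /proc/<pid>/sched-style text."""
    fields = {}
    for line in text.splitlines():
        if ':' not in line:
            continue  # skip header lines that have no "name : value" shape
        name, _, value = line.partition(':')
        name, value = name.strip(), value.strip()
        try:
            fields[name] = float(value) if '.' in value else int(value)
        except ValueError:
            pass  # ignore non-numeric fields some kernels print
    return fields

# Example using the values quoted in the first post:
sample = """se.exec_start : 682057.318915
se.sum_exec_runtime : 238.118766
nr_switches : 393"""
print(parse_sched(sample)['se.sum_exec_runtime'])  # 238.118766
```

On a real system you would read the text with `open('/proc/%d/sched' % pid).read()` (Linux only, and the exact field set varies by kernel version).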

suttiwit 10-01-2012 10:31 PM

What is "CPU usage time"?

DKSL 10-02-2012 12:18 AM

The time a process spends executing on the CPU; maybe it's called the CPU time slice. I periodically take a record of the /proc/pid/sched file at different priority levels. The ultimate goal is to see whether a higher priority results in a higher se.sum_exec_runtime value. Will se.sum_exec_runtime give the CPU usage time?
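Since se.sum_exec_runtime is a cumulative counter, one way to do the periodic measurement is to take two snapshots and diff them. A sketch of that idea (the helper name and the single-field parsing are my own assumptions based on the layout quoted above):

```python
def read_sum_exec_runtime(sched_text):
    """Extract se.sum_exec_runtime from /proc/<pid>/sched-style text."""
    for line in sched_text.splitlines():
        name, _, value = line.partition(':')
        if name.strip() == 'se.sum_exec_runtime':
            return float(value)
    raise KeyError('se.sum_exec_runtime not found')

# Two snapshots taken some interval apart; the difference is the CPU
# time the task consumed in between (same units as the file itself).
before = read_sum_exec_runtime("se.sum_exec_runtime : 238.118766")
after  = read_sum_exec_runtime("se.sum_exec_runtime : 250.500000")
delta = after - before
```

In a real experiment you would read the file, sleep, read it again, and record the delta alongside the nice value you set for that run.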

DKSL 10-02-2012 01:01 AM

What are the units of this field (se.sum_exec_runtime)? Is it nanoseconds?

suttiwit 10-02-2012 04:00 AM


Originally Posted by DKSL (Post 4794614)
What are the units of this field (se.sum_exec_runtime)? Is it nanoseconds?

Well, if you mean the amount of time the CPU spends processing something, then it is possibly in nanoseconds.
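If I read kernel/sched/debug.c correctly, the counter is kept in nanoseconds internally but printed split at NSEC_PER_MSEC, i.e. as milliseconds with a six-digit nanosecond remainder. Under that assumption, converting the displayed value back to nanoseconds is just:

```python
def sched_ms_to_ns(displayed_value):
    # Assumption: /proc/<pid>/sched prints nanosecond counters as
    # "<ms>.<6-digit ns remainder>", so multiplying by 1e6 (and rounding
    # away float noise) recovers the raw nanosecond count.
    return round(displayed_value * 1_000_000)

print(sched_ms_to_ns(238.118766))  # 238118766
```

So the 238.118766 quoted in the first post would be about 238 ms of total CPU time.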

DKSL 10-02-2012 09:30 AM

Any idea how to calculate this value theoretically when the priority (nice) value is changed, so that I can compare it against the experimental value? Also, to begin with, I found in another link that se.sum_exec_runtime can be calculated using utime (time spent in user space) and stime (time spent in kernel space). Any idea how to do this?
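On the utime/stime side: those are fields 14 and 15 of /proc/pid/stat, counted in clock ticks. A hedged sketch of reading them (the function name and the sample line are mine; on a real system you would query the tick rate with os.sysconf('SC_CLK_TCK') rather than assume 100):

```python
def cpu_seconds_from_stat(stat_line, hz=100):
    """Return utime+stime (fields 14 and 15 of /proc/<pid>/stat) in seconds.

    hz is the kernel tick rate; 100 is a common default but should be
    queried with os.sysconf('SC_CLK_TCK') on a live system.
    """
    # The comm field (field 2) can contain spaces, so split after the
    # closing paren; rest[0] is then field 3 (state), rest[11] is utime,
    # rest[12] is stime.
    rest = stat_line.rsplit(')', 1)[1].split()
    utime, stime = int(rest[11]), int(rest[12])
    return (utime + stime) / hz

# Made-up example line with utime=250 ticks, stime=50 ticks:
line = "4321 (my proc) S 1 4321 4321 0 -1 4194304 120 0 0 0 250 50 0 0 20 0 1 0 100 0 0"
print(cpu_seconds_from_stat(line))  # 3.0
```

Note that utime+stime and se.sum_exec_runtime are kept by different mechanisms (tick-based sampling vs. the scheduler's nanosecond clock), so expect them to agree only approximately.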

sundialsvcs 10-02-2012 02:41 PM

If you really want to get to the bottom of these things, you simply have to turn to the one truly authoritative source: the kernel source code, which can easily be browsed online.

In this case, the operative routine is proc_sched_show_task, which is found in kernel/sched/debug.c. (It is called from fs/proc/base.c.)

The Achilles' heel of your current approach is that "the presence of the experimenter changes the outcome of the experiment." Your (of course, very active...) monitoring program is consuming CPU cycles, too. In fact, it is competing with the very thing that you are attempting to measure, thereby invalidating any results obtained.

The actual CPU-allocation performance of any process on the system is, from nanosecond to nanosecond, completely affected by every other process that surrounds it; by their exact (and unpredictable) execution-time demands, and by everything ("else") that the system may be doing at the time.

The "nice" value has at best only an indirect influence upon the scheduler's behavior ... as does every other factor that the scheduler is programmed to consider. In fact, the scheduler is programmed to strike a balance between several conflicting goals ... "We want to favor this task but we must not starve that one," and so on.

If you want to fairly gauge the progress of a task, you should build that instrumentation into that task, and you should do so in terms of what 'that task' is doing, i.e. "for the business" and in terms of "business-oriented bright line rules." For example, many decades ago(!) I had to assess the performance of a batch-processing system in order to determine whether new hardware needed to be purchased to support it. To do so, I postulated that "90% of class-A jobs must begin execution within 60 seconds of being submitted, and must complete within 120 seconds thereafter." (Two separate rules.) I didn't care about dispatching algorithms or time slices or any of that ... I framed my inquiry in terms of "out-the-door, important-to-clients results (or lack thereof)." And I was able to base my assessment entirely upon historical data. I plotted whether-or-not these jobs actually met those standards. "Yes," or "No" ... "and don't bother to ask for excuses."

The notion here is the same: "What you really want to know is whether-or-not the system 'passed' or 'failed' your test, whatever 'your test' may happen to be." You set up an experiment and gather data. Then, you change one experimental parameter and repeat the experiment. You gather data in a way that doesn't influence the outcome because it remains constant throughout. And, you structure the experiment, not in terms of the ("who cares, anyway ...?") Linux scheduling algorithms ... but strictly in terms of what you need to know in order to address the underlying business problem ... and in such a way that it clearly points the way to a strategy that you can implement (and test!) right now.
