If you really want to get to the bottom of these things, you simply have to turn to the one truly authoritative source: the kernel source code, which can easily be browsed online. In this case, the operative routine is proc_sched_show_task, which is found in /kernel/sched/debug.c. (It is called from /fs/proc/base.c.)
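For orientation, proc_sched_show_task is the routine that formats the per-process text file /proc/<pid>/sched. The following is a minimal sketch, not anything taken from the kernel tree, of reading that file from Python. It assumes a Linux kernel that exposes the file (this depends on how the kernel was built), and the helper names parse_sched and read_sched are made up for illustration. Field names differ between kernel versions, so the parser makes no assumption about which ones are present.

```python
from pathlib import Path
from typing import Dict, Union


def parse_sched(text: str) -> Dict[str, float]:
    """Collect the numeric "name : value" lines from /proc/<pid>/sched text."""
    fields: Dict[str, float] = {}
    for line in text.splitlines():
        # Split on the LAST colon, so names that contain colons stay intact.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # separator rows and other lines without a colon
        try:
            fields[name.strip()] = float(value)
        except ValueError:
            continue  # e.g. the header line, whose tail is not a plain number
    return fields


def read_sched(pid: Union[int, str] = "self") -> Dict[str, float]:
    """Read and parse /proc/<pid>/sched; raises OSError if it is unavailable."""
    return parse_sched(Path(f"/proc/{pid}/sched").read_text())
```

Calling read_sched() from inside the process being studied returns that process's own figures. These are raw scheduler counters; which ones appear, and what they mean, is defined by the kernel source for your version.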
The Achilles heel of your current approach is that "the presence of the experimenter changes the outcome of the experiment." Your (of course, very active) monitoring program is consuming CPU cycles, too. In fact, it is competing with the very thing that you are attempting to measure, thereby invalidating any results obtained.
The actual CPU allocation that any process on the system receives is affected, from nanosecond to nanosecond, by every other process that surrounds it: by their exact (and unpredictable) execution-time demands, and by everything else that the system may be doing at the time.
The "nice" value has, at best, only an indirect influence upon the scheduler's behavior ... as does every other factor that the scheduler is programmed to consider. In fact, the scheduler is programmed to strike a balance between several conflicting goals: "We want to favor this task, but we must not starve that one," and so on.
If you want to fairly gauge the progress of a task, you should build that instrumentation into that task, and you should do so in terms of what that task is doing "for the business," and in terms of "business-oriented bright-line rules." For example, many decades ago(!) I had to assess the performance of a batch-processing system in order to determine whether new hardware needed to be purchased to support it. To do so, I postulated that "90% of class-A jobs must begin execution within 60 seconds of being submitted, and must complete within 120 seconds thereafter."
(Other job classes had separate rules.) I didn't care about dispatching algorithms or time slices or any of that ... I framed my inquiry in terms of "out-the-door, important-to-clients results (or lack thereof)." And I was able to base my assessment entirely upon historical data. I plotted whether or not these jobs actually met those standards. "Yes," or "No" ... "and don't bother to ask for excuses."
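A rule of that kind is easy to check mechanically against job history. The sketch below is illustrative only: the JobRecord type, its field names, and the function names are invented here, and it reads "within 120 seconds thereafter" as 120 seconds measured from the start of execution.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class JobRecord:
    """One historical job; all times in seconds on a common clock (hypothetical layout)."""
    submitted: float
    started: float
    finished: float


def meets_rule(job: JobRecord, max_wait: float = 60.0, max_run: float = 120.0) -> bool:
    """True if the job began within max_wait of submission and then ran for at most max_run."""
    waited = job.started - job.submitted
    ran = job.finished - job.started
    return waited <= max_wait and ran <= max_run


def pass_rate(jobs: Sequence[JobRecord]) -> float:
    """Fraction of jobs that met the rule: a plain yes/no tally, no excuses recorded."""
    if not jobs:
        raise ValueError("no job records to assess")
    return sum(meets_rule(job) for job in jobs) / len(jobs)


def rule_satisfied(jobs: Sequence[JobRecord], required_fraction: float = 0.90) -> bool:
    """True if at least required_fraction of the jobs met the rule."""
    return pass_rate(jobs) >= required_fraction
```

Feeding it one list of records per day or per week gives a series of pass rates that can be plotted directly against the 90% line.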
The notion here is the same: "What you really want to know is whether or not the system 'passed' or 'failed' your test, whatever 'your test' may happen to be." You set up an experiment and gather data. Then, you change one experimental parameter and repeat the experiment. You gather the data in the same way every time, so that whatever overhead the measurement adds remains constant throughout and does not skew the comparison. And, you structure the experiment, not in terms of the ("who cares, anyway ...?") Linux scheduling algorithms, but strictly in terms of what you need to know in order to address the underlying business problem ... and in such a way that it clearly points the way to a strategy that you can implement (and test!) right now.
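As a minimal sketch of such an experiment: the task times its own units of work, and each trial is reduced to a pass/fail verdict against a deadline. Everything named here (run_trial, run_experiment, the deadline, the parameter being varied) is hypothetical, and the clock is injectable only so that the logic can be checked without real delays.

```python
import time
from typing import Callable, Dict, Iterable


def run_trial(work: Callable[[], None], units: int, deadline_s: float,
              clock: Callable[[], float] = time.perf_counter) -> Dict[str, float]:
    """Run `work` `units` times, timing each unit from inside the task itself."""
    if units <= 0:
        raise ValueError("units must be positive")
    met = 0
    for _ in range(units):
        started = clock()
        work()
        if clock() - started <= deadline_s:
            met += 1  # this unit of work met the bright-line rule
    return {"units": units, "met": met, "fraction_met": met / units}


def run_experiment(make_work: Callable[[int], Callable[[], None]],
                   settings: Iterable[int], units: int, deadline_s: float,
                   required_fraction: float = 0.90,
                   clock: Callable[[], float] = time.perf_counter) -> Dict[int, bool]:
    """Repeat the same trial once per setting, changing only that one parameter."""
    verdicts: Dict[int, bool] = {}
    for setting in settings:
        trial = run_trial(make_work(setting), units, deadline_s, clock)
        verdicts[setting] = trial["fraction_met"] >= required_fraction
    return verdicts
```

Here make_work would build the real workload for a given setting (a batch size, a worker count, or whatever is being varied). Because the timing code is identical in every trial, its overhead is a constant rather than a variable.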