01-19-2012, 05:40 AM   #1
ankit,garg
LQ Newbie
 
Registered: Jan 2012
Location: Noida,India
Posts: 20

Rep: Reputation: Disabled
Threads, CPU and Memory


Hello,

I have a multithreaded application and I want to see the CPU usage and memory consumed by a single thread during program execution.

Is it possible to do it by using any system command or embedding code in my program?

Thanks
Ankit
 
01-19-2012, 08:21 AM   #2
johnsfine
LQ Guru

Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
What do you mean by "memory consumed by a single thread"?

All the threads in a process share the same address space. Memory allocated by any thread is usable by any other thread.
 
01-19-2012, 11:54 PM   #3
ankit,garg
LQ Newbie
 
Registered: Jan 2012
Location: Noida,India
Posts: 20

Original Poster
Rep: Reputation: Disabled
Suppose my process is consuming 3176 KB and there are 10 threads running inside it. I want to know how much of that 3176 KB is consumed by a single thread.

Same for CPU Usage.
 
01-20-2012, 03:00 AM   #4
Nominal Animal
Senior Member

Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
Read /proc/self/task/tid/statm where tid is the thread ID (use gettid() as pthreads uses a different thread identifier). See man 5 proc for the descriptions of the fields in statm (and the other files, in case you decide you need further info). To get the values in bytes, you'll need to multiply by sysconf(_SC_PAGE_SIZE) as most values are in pages. You'll get much more data if you parse /proc/self/task/tid/stat instead.

Note, however, that the stack size is fixed for each thread you start, and the above will not tell you how much of the stack each thread is using. For that, you need to measure it yourself, perhaps using something like I described in my post in your previous thread.
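
As a rough sketch of the statm approach described above: the helper name thread_statm_bytes is made up for illustration, and it reads only the first two fields (total program size and resident set size, both in pages). Because all threads share one address space, expect every thread of the process to report the same values.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>

/* Hypothetical helper (not part of any library): reads the first two
 * fields of /proc/self/task/<tid>/statm for the calling thread and
 * converts them from pages to bytes.
 * Returns 0 on success, -1 on failure. */
int thread_statm_bytes(unsigned long *size_bytes, unsigned long *resident_bytes)
{
    char           path[64];
    unsigned long  size_pages, resident_pages;
    long           page = sysconf(_SC_PAGE_SIZE);
    pid_t          tid  = (pid_t)syscall(SYS_gettid);  /* Linux task ID */
    FILE          *in;

    if (page <= 0)
        return -1;

    snprintf(path, sizeof path, "/proc/self/task/%d/statm", (int)tid);
    in = fopen(path, "r");
    if (!in)
        return -1;

    if (fscanf(in, "%lu %lu", &size_pages, &resident_pages) != 2) {
        fclose(in);
        return -1;
    }
    fclose(in);

    *size_bytes     = size_pages     * (unsigned long)page;
    *resident_bytes = resident_pages * (unsigned long)page;
    return 0;
}
```

Call it from the thread you are interested in, for example thread_statm_bytes(&size, &resident), and print the two values.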
 
01-20-2012, 03:32 AM   #5
ankit,garg
LQ Newbie
 
Registered: Jan 2012
Location: Noida,India
Posts: 20

Original Poster
Rep: Reputation: Disabled
Thanks for the help. I will check it as you described and come back if I have anything else related to this.
 
01-20-2012, 09:38 AM   #6
sundialsvcs
LQ Guru

Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3939
The essential difference between a thread and a process is that ... a process is the thing that "owns" resources (such as files and memory), and multiple threads can run within the context of a single process.

Therefore, the notion "how much memory is used by a single thread" has no meaning at all.

"I can see from the electric meter that the occupants of that building consumed 21 kWh of electricity today ... but I have no way to know which one of you turned on which electrical appliance."

Last edited by sundialsvcs; 01-20-2012 at 09:39 AM.
 
01-20-2012, 10:52 AM   #7
johnsfine
LQ Guru

Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by sundialsvcs
the notion "how much memory is used by a single thread" has no meaning at all.
I tried to explain that, but post #3 makes it clear that the attempt merely bounced off the OP's preconceptions.

Quote:
"I can see from the electric meter that the occupants of that building consumed 21 kWh of electricity today ... but I have no way to know which one of you turned on which electrical appliance."
That is a bad analogy, because it describes a measurement difficulty, not a lack of definition of what you would like to measure.

Two people are watching a TV show while a third is sort of watching while doing something else in the same room. Who is using the electricity consumed by that TV? The person who last turned the TV on, (but left the room when someone else changed the channel)? The person who last selected the channel? Some apportioned value among those in the room depending on whether they are really watching?

Attributing memory use within a process by thread is less well defined than who is using the electricity in my TV example. It is a definition question more than a measurement question.

The use of stack memory is kind of a special case. Threads might interact in such a way that data on one thread's stack is also used by other threads, but that is rare and doesn't necessarily invalidate the idea that each thread is solely responsible for memory use on its stack. But the stacks are typically a small part of the total memory use of a multithreaded process. So attributing the stack use still leaves most of the memory use not attributed to specific threads.

Last edited by johnsfine; 01-20-2012 at 10:55 AM.
 
01-20-2012, 01:40 PM   #8
Nominal Animal
Senior Member

Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
Quote:
Originally Posted by sundialsvcs
Therefore, the notion "how much memory is used by a single thread" has no meaning at all.
I disagree.

I agree that the notion "how much memory is allocated by a single thread" has no meaning. /proc/self/task/*/statm and the other per-task status files are essentially identical for all threads. From the kernel's point of view, all allocations are done by the process; it does not care a whit which thread does it.

However, it is possible to track "how much memory was first accessed by each thread". Perhaps not exactly, but approximately.

When a process (whichever thread, does not matter) allocates memory, the kernel usually sets up only the virtual memory, not the actual RAM. You can request the kernel to populate the pages, too, but it is counterproductive in most situations.

When the process first accesses a page, a page fault is generated. If the page fault is on a page that the kernel has already set up, it will map an actual page in RAM, filled with zeroes, there, and let the process continue. There are other types of mappings, like file-backed mappings, in which case the kernel may e.g. load the file contents there. If there is no mapping at all, you'll get a segmentation fault or a bus error.

These page faults are counted for each thread separately on Linux. The tenth field in /proc/self/stat does describe the number of minor page faults for the entire process, but /proc/self/task/tid/stat fields describe them for each thread separately.

This means that you can estimate the amount of memory first accessed by each thread by checking how many minor page faults the thread has caused.

Obviously, this is very imprecise. Library functions use temporary allocations, so those affect the counts. Allocations are done in larger chunks, and initially there is almost always some available in an already obtained page. Small allocations do not therefore show up. The GNU C library at least does not release allocations back to the kernel immediately. Many times the released memory is used to satisfy a later allocation instead. Because those pages are already faulted in, they're not accounted for in the minor page faults. Also, if you are tight enough on memory that some of the pages are swapped out, the fault counts will probably get a bit haywire.

On the other hand, if you use mmap(NULL,size,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,(off_t)0)/munmap() instead of malloc()/free() to allocate memory, those will be accounted very reliably in the minor page fault counts. (Swapping will mess with those fault counts too, though. Avoid swapping.)
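
A minimal sketch of that mmap()/munmap() claim, assuming Linux 2.6.26 or later so that getrusage(RUSAGE_THREAD, ...) is available for reading the calling thread's counter. The helper names thread_minor_faults and faults_for_mapping are hypothetical, chosen only for this illustration.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Fallback in case the system headers were already included before
 * _GNU_SOURCE took effect; on Linux, RUSAGE_THREAD has the value 1. */
#ifndef RUSAGE_THREAD
#define RUSAGE_THREAD 1
#endif

/* Hypothetical helper: minor page faults charged to the calling
 * thread so far, or -1 on failure. */
long thread_minor_faults(void)
{
    struct rusage u;

    if (getrusage(RUSAGE_THREAD, &u) == -1)
        return -1;
    return u.ru_minflt;
}

/* Hypothetical helper: map size bytes of anonymous memory, write to
 * every byte so that each page is faulted in, and return the number of
 * minor faults the writes caused. Returns -1 on failure. */
long faults_for_mapping(size_t size)
{
    long  before, after;
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, (off_t)0);

    if (p == MAP_FAILED)
        return -1;

    before = thread_minor_faults();
    memset(p, 1, size);             /* first access: faults the pages in */
    after  = thread_minor_faults();

    munmap(p, size);

    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

With ordinary 4096-byte pages, faults_for_mapping(64 * 4096) should come out close to 64, but the exact number depends on the page size and on kernel settings such as transparent huge pages.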

Assuming you have a Linux kernel 2.6.26 or later, this data can also be obtained using getrusage(RUSAGE_THREAD,ptr); for the current thread. Here is some example code I used to verify (at least on my machines) my opinions above:
Code:
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <stdlib.h>     /* malloc() */
#include <string.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#include <stdio.h>

size_t first_accessed(void)
{
    struct rusage   u;
    size_t          page;
    int             result, saved_errno;

    saved_errno = errno;

    do {
        result = getrusage(RUSAGE_THREAD, &u);
    } while (result == -1 && errno == EINTR);
    if (result == -1)
        return (size_t)0;

    page = sysconf(_SC_PAGE_SIZE);

    errno = saved_errno;
    return page * u.ru_minflt;
}

sem_t worker_semaphore;

void *worker(void *payload)
{
    const long   bytes = (long)payload;
    char        *data = NULL;
    int          result;

    if (bytes > 0) {
        data = malloc(bytes);
        if (data)
            memset(data, 0, bytes);
    }

    result = sem_wait(&worker_semaphore);
    if (result)
        return (void *)( (long)errno );

    return (void *)( (long)first_accessed() );
}


int main(int argc, char *argv[])
{
    pthread_t       *thread_id     = NULL;
    pthread_attr_t  *thread_attr   = NULL;
    long            *thread_arg    = NULL;
    int              threads       = 0;

    long             value;
    char             dummy;
    void            *retval;
    int              arg, result;

    if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        fprintf(stderr, "Usage: %s bytes [ bytes ... ]\n", argv[0]);
        return 0;
    }

    thread_id     = malloc((size_t)argc * sizeof (pthread_t));
    thread_arg    = malloc((size_t)argc * sizeof (long));
    thread_attr   = malloc((size_t)argc * sizeof (pthread_attr_t));
    if (!thread_id || !thread_arg || !thread_attr) {
        fprintf(stderr, "Not enough memory.\n");
        return 1;
    }

    result = sem_init(&worker_semaphore, 0, 0);
    if (result == -1) {
        fprintf(stderr, "Cannot initialize worker semaphore: %s.\n", strerror(errno));
        return 1;
    }

    for (arg = 1; arg < argc; arg++) {

        if (sscanf(argv[arg], "%ld %c", &value, &dummy) != 1) {
            fprintf(stderr, "%s: Invalid number of bytes.\n", argv[arg]);
            return 1;
        }
        if (value < 0L) {
            fprintf(stderr, "%s: Invalid number of bytes.\n", argv[arg]);
            return 1;
        }

        thread_arg[threads] = value;

        result = pthread_attr_init(&(thread_attr[threads]));
        if (result) {
            fprintf(stderr, "Cannot initialize thread attributes: %s.\n", strerror(result));
            return 1;
        }

        result = pthread_attr_setstacksize(&(thread_attr[threads]), (size_t)65536);
        if (result) {
            fprintf(stderr, "Cannot set thread stack size attribute: %s.\n", strerror(result));
            return 1;
        }

        result = pthread_create(&(thread_id[threads]), &(thread_attr[threads]), worker, (void *)thread_arg[threads]);
        if (result) {
            fprintf(stderr, "Cannot create thread: %s.\n", strerror(result));
            return 1;
        }

        threads++;
    }

    if (threads < 1) {
        fprintf(stderr, "Nothing to do.\n");
        return 1;
    }

    for (arg = 0; arg < threads; arg++)
        if (sem_post(&worker_semaphore) == -1) {
            fprintf(stderr, "Cannot post worker semaphore: %s.\n", strerror(errno));
            return 1;
        }

    fflush(stderr);

    for (arg = 0; arg < threads; arg++) {

        result = pthread_join(thread_id[arg], &retval);
        if (!result) {
            printf("Thread %d of %d: %ld bytes allocated, %ld bytes used (minor page faults).\n",
                   arg + 1, threads, thread_arg[arg], (long)retval);
            fflush(stdout);
        } else {
            fprintf(stderr, "Failed to join thread %d of %d: %s.\n", arg + 1, threads, strerror(result));
            fflush(stderr);
        }
    }

    if (sem_destroy(&worker_semaphore) == -1) {
        fprintf(stderr, "Cannot destroy worker semaphore: %s.\n", strerror(errno));
        return 1;
    }

    return 0;
}
If you save the above code as minorfaults.c you can compile and run a couple of tests using
Code:
gcc minorfaults.c -Wall -O3 -lpthread -o minorfaults

./minorfaults 1 1000 1000000
    Thread 1 of 3: 1 bytes allocated, 12288 bytes used (minor page faults).
    Thread 2 of 3: 1000 bytes allocated, 4096 bytes used (minor page faults).
    Thread 3 of 3: 1000000 bytes allocated, 1003520 bytes used (minor page faults).

./minorfaults 1000000 1000 1
    Thread 1 of 3: 1000000 bytes allocated, 1015808 bytes used (minor page faults).
    Thread 2 of 3: 1000 bytes allocated, 4096 bytes used (minor page faults).
    Thread 3 of 3: 1 bytes allocated, 0 bytes used (minor page faults).

./minorfaults 50000 200 40000 800000 40
    Thread 1 of 5: 50000 bytes allocated, 61440 bytes used (minor page faults).
    Thread 2 of 5: 200 bytes allocated, 8192 bytes used (minor page faults).
    Thread 3 of 5: 40000 bytes allocated, 40960 bytes used (minor page faults).
    Thread 4 of 5: 800000 bytes allocated, 802816 bytes used (minor page faults).
    Thread 5 of 5: 40 bytes allocated, 0 bytes used (minor page faults).
To get the corresponding information on any thread in any process you have access to, read the tenth field in /proc/pid/task/tid/stat and multiply by sysconf(_SC_PAGE_SIZE). Note that tid is the Linux task ID, not the POSIX threads ID; you need to use gettid() or e.g. ps -o tid ... to find it. I don't know of any way to derive the tid from a pthread_t variable.
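
Reading that tenth field could look roughly like the following sketch. The function names task_minor_faults and main_thread_minor_faults are invented for this example. The one subtlety is that the second field (the command name) is wrapped in parentheses and may itself contain spaces, so the sketch counts fields from the last ')' on the line.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

/* Hypothetical helper: returns the minor fault count (field 10) from
 * /proc/<pid>/task/<tid>/stat, or -1 on failure. */
long task_minor_faults(pid_t pid, pid_t tid)
{
    char           path[64], line[1024];
    char          *p;
    unsigned long  minflt;
    FILE          *in;

    snprintf(path, sizeof path, "/proc/%d/task/%d/stat", (int)pid, (int)tid);
    in = fopen(path, "r");
    if (!in)
        return -1;
    p = fgets(line, sizeof line, in);
    fclose(in);
    if (!p)
        return -1;

    /* Field 2 is "(command name)" and may contain spaces or
     * parentheses, so continue parsing after the last ')'. */
    p = strrchr(line, ')');
    if (!p)
        return -1;

    /* Skip fields 3 to 9 (state, ppid, pgrp, session, tty_nr, tpgid,
     * flags), then read field 10 (minflt). */
    if (sscanf(p + 1, " %*c %*d %*d %*d %*d %*d %*u %lu", &minflt) != 1)
        return -1;

    return (long)minflt;
}

/* On Linux, the main thread's task ID equals the process ID. */
long main_thread_minor_faults(void)
{
    pid_t pid = getpid();
    return task_minor_faults(pid, pid);
}
```

Multiply the result by sysconf(_SC_PAGE_SIZE) to get an approximate byte count, as in first_accessed() above.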

If your library does not provide gettid(), use
Code:
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

pid_t gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
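
The original question also asked about CPU usage per thread. The same getrusage(RUSAGE_THREAD, ...) call fills in ru_utime and ru_stime for the calling thread, so a sketch along these lines should work on Linux 2.6.26 or later. The helper name thread_cpu_seconds is made up for this example.

```c
#define _GNU_SOURCE
#include <sys/time.h>
#include <sys/resource.h>

/* Fallback in case the system headers were already included before
 * _GNU_SOURCE took effect; on Linux, RUSAGE_THREAD has the value 1. */
#ifndef RUSAGE_THREAD
#define RUSAGE_THREAD 1
#endif

/* Hypothetical helper: CPU time (user + system) consumed so far by
 * the calling thread, in seconds. Returns a negative value on failure. */
double thread_cpu_seconds(void)
{
    struct rusage u;

    if (getrusage(RUSAGE_THREAD, &u) == -1)
        return -1.0;

    return (double)u.ru_utime.tv_sec + (double)u.ru_stime.tv_sec
         + ((double)u.ru_utime.tv_usec + (double)u.ru_stime.tv_usec) / 1e6;
}
```

Call it at the start and end of a thread's work and subtract the two values to see how much CPU time that thread used in between.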
I hope you found this stuff as interesting as I did.

@sundialsvcs and johnsfine: I did not know all the details above before this thread. I did have a fuzzy notion, but nothing specific. Because of this thread, I checked -- and I'm glad I did. If I ever need to check if my worker threads have more or less balanced memory use, I know how to do it now.
 