LinuxQuestions.org - Thread-safe global variables in C

I'm making a C library, and in it, I want to be able to create a global variable (probably more later, too) that can be used in multiple programs and multiple threads of those programs safely. Of course, the naïve way to do it would be to put this in a header file:

extern int Error;

and this in a C file of the library:

int Error;

The problem with doing this: what if one thread calls a function that sets an error code, then another thread calls a function that also sets an error code? One will interfere with the other. I'd rather not do passing-by-pointer since I think code would look cleaner in the global form. What would be a good way of going about this?

Also, I guess it's okay to use errno for my own function errors, right? Either way, I might want to make my own global variables in the future, and it'd be neat to know how to do them in a thread-safe manner. Thanks!

Quote:

Originally Posted by JoeyAdams (Post 2871335)

Either way, I might want to make my own global variables in the future, and it'd be neat to know how to do them in a thread-safe manner. Thanks!

There is no way that a variable can contain two different values at the same time. You cannot use Error to contain more than one error code at a time. So you will need a separate memory location, global or otherwise, for each thread whose error code you want to save.

--------------
Steve Stites

I was thinking of having macros that refer to functions that access copies of the "global variables" stored either in a buffer provided by the library or in the program's address space. Perhaps it'd be something like this (bear with the made-up functions and oversimplified array accesses):

Code:

typedef struct LibraryGlobals
{
  int V_Error;
} LibraryGlobals;

LibraryGlobals *GlobalValues; //initialized by the library
void **threadIdentifiers; //initialized by the library
//a thread identifier is a unique value of some sort for each thread

LibraryGlobals *GetPtrToGlobals(void)
{
  void *thread_ident=GetCallingThreadIdentifier();
  unsigned int i;
  lock(); //don't let other threads into this section until this call is done
  for (i=0;i<dim(threadIdentifiers);i++)
  {
    if (thread_ident == threadIdentifiers[i])
      break;
  }
  if (i>=dim(threadIdentifiers))
  { //if this is a new thread, create a space for its globals
    redim(threadIdentifiers,i+1);
    redim(GlobalValues,i+1); //the globals array needs a slot for the new thread too
    threadIdentifiers[i]=thread_ident;
  }
  unlock(); //allow other threads in again
  return GlobalValues+i;
}

#define Error (GetPtrToGlobals()->V_Error)

The problem is, I don't know if there is a way to get a value of some sort (be it a pointer or a number) that distinguishes it from another thread, and I don't know a good way to lock access to GlobalValues and threadIdentifiers during a call to GetPtrToGlobals. Plus, doing this gets slower and slower as more threads use the library, and the memory isn't released at the end.
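For what it's worth, the two missing pieces do exist in POSIX threads: pthread_self() returns a value identifying the calling thread (compared with pthread_equal()), and a pthread_mutex_t can do the locking. Here is a rough sketch of the same lookup-table idea using them. MAX_THREADS, table_lock and thread_count are names I made up, and the fixed-size table is an assumption to keep it short. It still has the drawbacks mentioned above: the search is linear, slots are never released, and a thread ID may be reused after a thread exits, so a new thread could inherit an old slot.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64 /* hypothetical fixed capacity; keeps the sketch short */

typedef struct LibraryGlobals {
    int V_Error;
} LibraryGlobals;

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_t threadIdentifiers[MAX_THREADS]; /* one entry per known thread */
static LibraryGlobals GlobalValues[MAX_THREADS]; /* zero-initialized slots */
static size_t thread_count;                      /* entries in use */

/* Returns the calling thread's slot, or NULL if the table is full. */
LibraryGlobals *GetPtrToGlobals(void)
{
    pthread_t self = pthread_self(); /* identifies the calling thread */
    LibraryGlobals *result = NULL;
    size_t i;

    pthread_mutex_lock(&table_lock); /* one thread at a time in the table */
    for (i = 0; i < thread_count; i++) {
        if (pthread_equal(self, threadIdentifiers[i]))
            break;
    }
    if (i == thread_count && thread_count < MAX_THREADS) {
        threadIdentifiers[i] = self; /* first call from this thread */
        thread_count++;
    }
    if (i < thread_count)
        result = &GlobalValues[i];
    pthread_mutex_unlock(&table_lock);
    return result;
}

/* Dereferences NULL if the table is full; a real library would check. */
#define Error (GetPtrToGlobals()->V_Error)
```

Because the table never moves in memory, the returned pointer stays valid after the mutex is released, which would not be true of a version that reallocates the arrays.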

It turns out that the folks who developed the POSIX threads standard have already thought of this requirement. Surprise, surprise.

In the following discussion, the made-up names (wilma, fred, barney, betty, and once_function) are arbitrary, and you can use almost any names you wish.

Fasten your seatbelts. Here we go.

Before starting to code these steps, begin to think of all the "global" variables you'll need for which you want one instance per thread. (In your example, there'll be only one such variable: Error.) Imagine these variables all in a struct, because that's where you'll be putting them.

The steps are simple. If you have only one function in your library, these steps should be done in that function; if you have more than one function in your library, these steps should be done in each library function you write; all the associated library functions in the same library should refer to the same pair of globals wilma and fred I discuss below.

Step 1. At the beginning of each function, call pthread_once() (and check the returned status, of course). As the name implies, only the first attempt to call this function in your program, in any function and in any thread, will actually do anything significant; the other attempts will wait if necessary until the first call is done, and then skip the call (but return a 0 status, which means "ok"). The first parameter to pthread_once() should be a pointer to a global (yes, really global) variable of type pthread_once_t, which is initialized at compile time thus:

Code:

pthread_once_t wilma=PTHREAD_ONCE_INIT;

The second parameter to pthread_once() is the name of a function you have written. That function's declaration would be something like this:

Code:

void once_function(void);

As you can see, the function takes no parameters and returns nothing. Its only job is to deal with a second global variable. Your "once function" will be called only once, by the first thread that calls pthread_once() and specifies wilma in the call.

What should you put in this function? All it should do (for these purposes) is call pthread_key_create(), and check for an error return. The first parameter to pthread_key_create() is a pointer to a global (yes, really global) variable of type pthread_key_t. We'll call this variable fred. Unlike wilma, you don't initialize fred at compile time. The second parameter to pthread_key_create() is the name of a function which you will write. It will act as a destructor. Let's call that destructor betty(); we'll discuss it after step 2.

And that's all that your once function does.

Step 2. After each of your library functions calls pthread_once() (and checks the return code, of course), it should call pthread_getspecific(). The parameter is fred. Let's call the returned pointer barney. If barney is not NULL, then barney is a pointer to the struct you designed which contains all the "global" variables for this thread for this library, and the function can then use *barney freely, and you should ignore the rest of step 2 in that case.

But if barney is indeed NULL, then your library code (in all of its functions, if there is more than one function in your library) has not allocated the "global" struct for this thread. Now is the time to do so. Call malloc() for the struct (and check for the possibility of getting NULL as the returned value, of course). Don't worry; malloc() is thread safe. Store the returned pointer in barney, and give the struct's fields their starting values, because malloc() does not zero the memory.

Then call pthread_setspecific(). Its first parameter is fred. Its second parameter is barney.

After checking that the return value from the call to pthread_setspecific() is 0, the function can then use all the content of *barney, knowing that later calls in this thread to this library function or any related library function will continue to use the same data.

Now, what about the destructor we discussed at the end of step 1?

When a thread terminates, the destructor that you write should be used to deallocate the struct that contains the data specific to this thread. The destructor should be defined something like this:

Code:

void
betty(void *stuff_to_be_destroyed)
{
  free(stuff_to_be_destroyed);
} /* betty() */

That's really all there is to it.
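Putting steps 1 and 2 and the destructor together, a minimal sketch might look like the following. It keeps the names wilma, fred, barney and betty from the discussion. The struct layout, fred_status, worker() and globals_are_per_thread() are my own illustrative names, and I use calloc() in place of malloc() so the struct starts out zeroed. Everything is static here on the assumption that the whole library lives in one source file; with several source files, wilma and fred would have to be real globals.

```c
#include <pthread.h>
#include <stdlib.h>

/* The per-thread "globals" for the library (hypothetical layout). */
typedef struct LibraryGlobals {
    int V_Error;
} LibraryGlobals;

static pthread_once_t wilma = PTHREAD_ONCE_INIT; /* guards key creation */
static pthread_key_t fred;                       /* created by once_function() */
static int fred_status;                          /* result of pthread_key_create() */

/* Destructor: called at thread exit with that thread's non-NULL value. */
static void betty(void *stuff_to_be_destroyed)
{
    free(stuff_to_be_destroyed);
}

/* Runs a single time, in whichever thread calls pthread_once() first. */
static void once_function(void)
{
    fred_status = pthread_key_create(&fred, betty);
}

/* Returns this thread's struct, allocating it on first use; NULL on failure. */
LibraryGlobals *GetPtrToGlobals(void)
{
    LibraryGlobals *barney;

    if (pthread_once(&wilma, once_function) != 0 || fred_status != 0)
        return NULL;
    barney = pthread_getspecific(fred);
    if (barney == NULL) {                    /* first call in this thread */
        barney = calloc(1, sizeof *barney);  /* zeroed starting values */
        if (barney == NULL)
            return NULL;
        if (pthread_setspecific(fred, barney) != 0) {
            free(barney);
            return NULL;
        }
    }
    return barney;
}

/* Demonstration thread: record what it sees, then change its own copy. */
static void *worker(void *arg)
{
    int *seen = arg;
    LibraryGlobals *g = GetPtrToGlobals();

    if (g == NULL)
        return NULL;      /* *seen keeps its failure marker */
    *seen = g->V_Error;   /* fresh struct for this thread, so 0 */
    g->V_Error = 2;       /* does not touch the caller's copy */
    return NULL;
}

/* Returns 1 if the caller and a second thread had separate copies,
   0 if they shared one, -1 if setup failed. */
int globals_are_per_thread(void)
{
    pthread_t tid;
    int seen = -1;
    LibraryGlobals *mine = GetPtrToGlobals();

    if (mine == NULL)
        return -1;
    mine->V_Error = 1;
    if (pthread_create(&tid, NULL, worker, &seen) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return seen == 0 && mine->V_Error == 1;
}
```

With this in place, a macro such as #define Error (GetPtrToGlobals()->V_Error) gives the "global" look asked for in the first post, as long as you accept that it dereferences NULL when allocation fails.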

To cover the all-important fine print, be sure to read all the apropos man pages.

And if you're going to do any serious work with POSIX threads, be sure to buy David R. Butenhof's excellent book Programming with POSIX Threads. Accept no substitutes. My copy of the O'Reilly book on the same subject does not begin to cover the material with the same depth of insight.

Hope this helps.

Hi, JoeyAdams -

You might also want to look at "thread local storage" (TLS):

http://linux.web.cern.ch/linux/scien...ead-local.html
http://linux.die.net/man/2/set_thread_area
http://linux.die.net/man/2/get_thread_area

Here's a discussion for TLS (and threading in general) under Windows:

http://courses.washington.edu/css443...gWithWin32.pdf

Wow, thanks for the awesome replies. I wonder what would be smarter to use, since the first link about TLS suggests that it might not be widely available because of the extensive support it needs from the linker and libraries (and it's probably only on GCC), but pthread isn't present on Windows (or is it?).

I have a simple question about the pthread stuff you posted above. Where should

Code:

pthread_once_t wilma=PTHREAD_ONCE_INIT;

appear? Should it be in a C source file of the library and an extern pthread_once_t... in the header? Also, pthread_once needs to be called only once at the entrance of all threads, but it can be safely called multiple times, right?

Four things.

Thing one:

errno is one of those weird animals, a per-thread global. Each thread gets its own.
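If you'd like to see this for yourself, here is a small sketch that has a second thread compare the address of its own errno with the caller's. The names probe, compare_errno_address and errno_is_per_thread are made up for the example.

```c
#include <errno.h>
#include <pthread.h>

/* Data handed to the second thread (hypothetical helper type). */
struct probe {
    int *callers_errno; /* address of errno in the calling thread */
    int differs;        /* set to 1 if the new thread's errno is elsewhere */
};

/* Thread body: compare this thread's errno address with the caller's. */
static void *compare_errno_address(void *arg)
{
    struct probe *p = arg;

    p->differs = (&errno != p->callers_errno);
    return NULL;
}

/* Returns 1 if a second thread has its own errno, 0 if the two threads
   share one, -1 if the thread could not be run. */
int errno_is_per_thread(void)
{
    pthread_t tid;
    struct probe p = { &errno, 0 };

    if (pthread_create(&tid, NULL, compare_errno_address, &p) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return p.differs;
}
```

On a POSIX system with threads enabled I'd expect errno_is_per_thread() to return 1, since errno there expands to something that finds the calling thread's own copy.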

Thing two:

Code:

pthread_once_t wilma=PTHREAD_ONCE_INIT;

should be compiled only once while building your library. But each compiled module in the library should see at least:

Code:

pthread_once_t wilma;

Some environments relax this if you initialize every instance to the same thing. For maximum compatibility, don't rely on any such relaxation.

If you want to relax this requirement, though, keep the following things in mind:

There is nothing different between types pthread_once_t and, say, int in this regard. If you're going to be building an industrial-strength library, it would be good to experiment with how your compiling environment handles multiple definitions of global

Code:

int foo=5;

in different modules in the same library, when both modules are linked into a runnable program which prints out the value of foo. Then try the same thing with conflicting definitions to see what happens. If the environment lets you get away with building that, then switch the definitions around to see what gets output this time.

Then take careful notes as to what you did, and what the results were, so you can replicate the experiment on each platform for which you ship your library.

Much easier to just obey the rules. (grin)

Thing three:

pthreads is, and is not, available for Windows. For a gloriously complex answer, google this:

Code:

pthread windows

Thing four:

The whole point of pthread_once() is that you can call it as many times in each thread as you want, and it calls your once-function exactly once in the whole process, in whichever thread gets there first. So you can safely put all that stuff at the beginning of each of your library functions (or perhaps put it in a common function that each of your top-level functions calls).
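Here is a small sketch of that behavior: several threads each call pthread_once() many times with the same control variable, and the once-function still runs a single time for the whole process. The names once, init_calls, init_once, worker and count_once_calls are made up for the example, and the limit of 16 threads is arbitrary.

```c
#include <pthread.h>

static pthread_once_t once = PTHREAD_ONCE_INIT;
static int init_calls; /* written only inside init_once() */

/* The once-function: counts how many times it has been run. */
static void init_once(void)
{
    init_calls++;
}

/* Each thread asks for initialization over and over. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++)
        pthread_once(&once, init_once);
    return NULL;
}

/* Runs nthreads workers (1 to 16) and returns how many times init_once()
   has been called in total, or -1 if the threads could not be started. */
int count_once_calls(int nthreads)
{
    pthread_t tids[16];
    int started = 0;

    if (nthreads < 1 || nthreads > 16)
        return -1;
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, worker, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == nthreads ? init_calls : -1;
}
```

Calling count_once_calls() a second time should still report one call, because the control variable remembers that initialization has already happened.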

Thread-local storage (TLS) is the way to go. Many languages provide a built-in facility for giving easy access to it, so don't overlook all possibilities: make things easy on yourself.

The essential idea behind TLS is simple: every thread is defined, in the system, by a so-called "thread control block" (TCB) and within that TCB there are a handful of otherwise-unused slots... TLS. And what you (or your language) do with them is to use one of 'em as a pointer to whatever you want to store. Just be sure that, when the thread terminates, it cleans up whatever it has allocated.

If what you need is a "thread-safe global pool" of (anything), then you have to use mutexes of some kind to protect it.
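As a rough illustration of the mutex case, here is a sketch where the shared "pool" is just a running total that any thread may add to. All of the names (pool_lock, pool_total, pool_add, pool_read, adder, run_adders) are invented for the example, and the limit of 16 threads is arbitrary.

```c
#include <pthread.h>

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static long pool_total; /* shared by all threads; guarded by pool_lock */

/* Adds to the shared total while holding the lock. */
void pool_add(long amount)
{
    pthread_mutex_lock(&pool_lock);
    pool_total += amount;
    pthread_mutex_unlock(&pool_lock);
}

/* Reads the shared total while holding the lock. */
long pool_read(void)
{
    long value;

    pthread_mutex_lock(&pool_lock);
    value = pool_total;
    pthread_mutex_unlock(&pool_lock);
    return value;
}

/* Thread body: 10000 small additions to the shared total. */
static void *adder(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++)
        pool_add(1);
    return NULL;
}

/* Runs nthreads adders (1 to 16) and returns how much the total grew,
   or -1 if the threads could not be started. */
long run_adders(int nthreads)
{
    pthread_t tids[16];
    int started = 0;
    long before = pool_read();

    if (nthreads < 1 || nthreads > 16)
        return -1;
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, adder, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == nthreads ? pool_read() - before : -1;
}
```

Without the lock around pool_total += amount, concurrent additions could be lost; with it, four adders always add up to 40000.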

Thanks. I probably won't be making thread-safe globals in my library any time soon, but when I do, this will be a lot of help. I think I have just one more question though:

Quote:

But each compiled module in the library should see at least:

Code:

pthread_once_t wilma;

Couldn't I just use "extern pthread_once_t wilma;" in the header and have pthread_once_t wilma; in one C source file of the library?

Yes, that would work, because each compiled module in the library will (I hope) "see" that line of code, since it's in the header.

It will not raise a problem in the one C source file that contains the PTHREAD_ONCE_INIT, even though that file also sees the extern line from the header. C allows a declaration of a variable to be followed by its initialized definition in the same file, as long as the types match.
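A single-file sketch of what that one C source file ends up looking like once the header has been included. The extern line stands in for the header's contents; setup, setup_runs and use_wilma are made-up names for the example.

```c
#include <pthread.h>

/* What the header would contain: a declaration only, allocating nothing. */
extern pthread_once_t wilma;

/* What exactly one .c file of the library contains: the single definition.
   Following the extern declaration above with this is valid C. */
pthread_once_t wilma = PTHREAD_ONCE_INIT;

static int setup_runs; /* counts calls to setup() */

/* Hypothetical once-function for the demonstration. */
static void setup(void)
{
    setup_runs++;
}

/* Returns how many times setup() has run so far, or -1 on error. */
int use_wilma(void)
{
    if (pthread_once(&wilma, setup) != 0)
        return -1;
    return setup_runs;
}
```

Every other source file in the library sees only the extern line, so they all refer to the same wilma.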

POSIX threads (pthreads) can keep a separate value of a variable for each thread. The technique is called thread-specific data; you can google for: thread-specific data pthread C

Because the C library uses global variables such as "errno", the only way to truly have a thread-safe C library is if the underlying operating system keeps a separate memory address space for each thread. The address of "errno" would then be the same for each thread, but refer to a different physical memory location. One can play tricks such as defining C macros to replace global variables like "errno", but then the global variables will not behave as normal variables.

EDIT
I obviously misunderstood the question. The question is about writing a library using the C language. I thought it was asking how to write "the" standard C library.

All the other answers are good suggestions. Which one to use will depend on the operating systems and C compilers used with the library.

Using current Linux kernels and development tools, one can use thread-local storage simply by adding the __thread keyword to the variable definition. It, or a variant of it, is available in practically all current C environments, including Windows, although the details vary a bit.

In Linux, if you want to have your own variable with errno-like semantics, just use

Code:

extern __thread int my_errno;

in the header file, and

Code:

__thread int my_errno = 0;

in the library implementation. Each thread will see its own private copy.
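A small sketch to check that claim, assuming GCC or Clang (C11 spells the same thing _Thread_local). The names worker and my_errno_is_private are made up for the example.

```c
#include <pthread.h>

__thread int my_errno = 0; /* one copy per thread (GCC/Clang extension) */

/* Thread body: record the value this thread starts with, then change it. */
static void *worker(void *arg)
{
    int *seen = arg;

    *seen = my_errno; /* a new thread starts from the initializer, 0 */
    my_errno = 42;    /* changes only this thread's copy */
    return NULL;
}

/* Returns 1 if a second thread had its own my_errno, 0 if it shared the
   caller's, -1 if the thread could not be run. */
int my_errno_is_private(void)
{
    pthread_t tid;
    int seen = -1;

    my_errno = 7; /* the caller's copy */
    if (pthread_create(&tid, NULL, worker, &seen) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return seen == 0 && my_errno == 7;
}
```

The second thread sees 0 and not 7, and its write of 42 never shows up in the caller, which is the errno-like behavior asked for at the start of the thread.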