Thread-local globals

2023 June 7

Global variables get a bad rap. They have one killer benefit: easy access to shared state from any context without having to drill that state down the call stack through parameters. This one benefit is the reason programmers feel an urge to use globals at all.

In reality, we're already stuck with global state. The operating system and everything it controls is global:

memory (malloc and free)
the filesystem (open, close, stat, mkdir, ...)
sockets (socket, bind, connect, ...)
standard streams (stdin, stdout, stderr)
...

Why then are we depriving ourselves of the amazing benefit of global variables for everything else? What's the objection? The only one I find convincing [1] is that global variables make it impossible to parallelize tests with threads, because different tests should not share the same global state. Not everyone will care about this problem, since not everyone parallelizes tests with threads, but I don't like closing off that possibility as a cost of using globals.

Can we get the benefit of globals without this cost? I have an idea: thread-local globals.

All global state is reached through a single thread-local pointer. A scheduler controls access to a thread pool. Before the scheduler runs a function (i.e. starts a job) on a thread, it sets that thread's global state pointer to the one the function should use. Callers that schedule a new function pass along a pointer to the global state for that function; by default, it is the caller's own global state pointer.
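To make the mechanics concrete, here is a minimal sketch of such a scheduler. It is only one way to do it, under several assumptions that are not part of the design above: Callable is taken to be std::function<void()>, the pool has a fixed number of workers, the destructor drains the queue before joining, and Counter and demo are hypothetical names used purely for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// The single thread-local pointer to the current job's global state.
thread_local void* globals_ = nullptr;

// Assumption: the job type is a plain std::function.
using Callable = std::function<void()>;

class Scheduler {
public:
    explicit Scheduler(unsigned workers = 2) {
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }
    // Lets queued jobs finish, then joins the workers.
    ~Scheduler() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_all();
        for (auto& t : threads_) t.join();
    }
    // Schedule a job together with the globals it should see.
    void push(Callable job, void* globals) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.emplace(std::move(job), globals);
        }
        ready_.notify_one();
    }
    // Default: the new job inherits the calling job's globals.
    void push(Callable job) { push(std::move(job), globals_); }

private:
    void run() {
        for (;;) {
            std::pair<Callable, void*> item;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // shutting down and drained
                item = std::move(jobs_.front());
                jobs_.pop();
            }
            globals_ = item.second;  // install the job's globals on this thread
            item.first();
            globals_ = nullptr;      // sketch choice: clear between jobs
        }
    }
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::pair<Callable, void*>> jobs_;
    bool done_ = false;
    std::vector<std::thread> threads_;
};

struct Counter { int value = 0; };  // hypothetical global state

// Runs three jobs and returns the final values of both counters.
std::pair<int, int> demo() {
    Counter parent, other;
    {
        Scheduler sch;
        sch.push([&sch] {
            static_cast<Counter*>(globals_)->value += 1;
            // No pointer given: the child inherits this job's globals.
            sch.push([] { static_cast<Counter*>(globals_)->value += 10; });
        }, &parent);
        sch.push([] { static_cast<Counter*>(globals_)->value += 100; }, &other);
    }  // the Scheduler destructor waits for all jobs here
    return {parent.value, other.value};
}
```

The nested job never names a state pointer, yet it updates the same Counter as its parent, while the third job only ever touches the other one. The scheduler itself never learns what a Counter is.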

The scheduler doesn't need to know the type of the global state; it can just deal in void pointers. This way, the scheduler can be a global singleton shared by all contexts (just like the operating system), even if they use different thread-local globals. As a bonus, callers can swap out the global state for their callees easily by just changing the thread-local pointer, which requires no synchronization.
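Swapping the global state for callees can be sketched with a small scope guard. This is an illustration rather than part of the design above: GlobalsScope, Config, current_verbosity, and verbosity_with_override are hypothetical names, and restoring the previous pointer on scope exit is an assumption about what a caller would usually want.

```cpp
#include <cassert>

thread_local void* globals_ = nullptr;

// Installs a replacement globals pointer for the current thread and
// restores the previous one when the scope ends.
class GlobalsScope {
public:
    explicit GlobalsScope(void* replacement) : saved_(globals_) {
        globals_ = replacement;
    }
    ~GlobalsScope() { globals_ = saved_; }
    GlobalsScope(const GlobalsScope&) = delete;
    GlobalsScope& operator=(const GlobalsScope&) = delete;

private:
    void* saved_;
};

struct Config { int verbosity = 0; };  // hypothetical global state

// A callee that reads whatever globals are currently installed.
int current_verbosity() {
    return static_cast<Config*>(globals_)->verbosity;
}

// Calls the callee once with `inner` installed and once after the
// guard has restored `outer`; returns both observations combined.
int verbosity_with_override(Config& outer, Config& inner) {
    globals_ = &outer;
    int seen_inside = 0;
    {
        GlobalsScope scope(&inner);  // callees now see `inner`
        seen_inside = current_verbosity();
    }                                // back to `outer`
    int seen_after = current_verbosity();
    return seen_inside * 10 + seen_after;
}
```

Because globals_ is thread-local, the swap touches only the current thread's pointer, so no lock or atomic is involved.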

In the example below, A and B are the types of two pieces of global state, combined in a single Globals struct, that are used in the contexts of functions f and g.

// a.h
struct A { ... };

// b.h
struct B { ... };

// globals.h
#include <a.h>
#include <b.h>
struct Globals {
    A a;
    B b;
};

// scheduler.h
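#include <utility>
extern thread_local void* globals_;
struct Scheduler {
    void push(Callable&& job, void* globals);
    // Default to the calling job's globals.
    void push(Callable&& job) {
        // job is an lvalue here; std::move is needed to bind it to Callable&&.
        push(std::move(job), globals_);
    }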
    ...
};

// f.cpp
#include <globals.h>
#include <scheduler.h>
A& f() {
    return static_cast<Globals*>(globals_)->a;
}

// g.cpp
#include <globals.h>
#include <scheduler.h>
B& g() {
    return static_cast<Globals*>(globals_)->b;
}

// main.cpp
#include <globals.h>
#include <scheduler.h>
#include <f.h>
#include <g.h>
int main() {
    Globals g1, g2;
    Scheduler sch;
    sch.push([]{ f(); }, &g1);
    sch.push([]{ g(); }, &g2);
    ...
    return 0;
}

The only problem now is that every context that wants to use a global must declare a dependency on all globals. See how f.cpp includes globals.h which includes b.h, even though f uses only A. This can create circular dependencies in modules: imagine if f() is actually part of the module that defines A. In the worst case, the globals module includes every other module, which all include the globals module.

The fix is for each module to declare but not define a global function that returns that module's shared state. Module dependents use these global functions with no knowledge of the composed Globals type. Then the globals module defines these functions as getters on the Globals type.

// a.h
struct A { ... };
A& a();

// b.h
struct B { ... };
B& b();

// globals.h
#include <a.h>
#include <b.h>
struct Globals {
    A a;
    B b;
};

// scheduler.h
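#include <utility>
extern thread_local void* globals_;
struct Scheduler {
    void push(Callable&& job, void* globals);
    // Default to the calling job's globals.
    void push(Callable&& job) {
        // job is an lvalue here; std::move is needed to bind it to Callable&&.
        push(std::move(job), globals_);
    }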
    ...
};

// f.cpp
#include <a.h>
A& f() {
    return a();
}

// g.cpp
#include <b.h>
B& g() {
    return b();
}

// globals.cpp
#include <globals.h>
#include <scheduler.h>
A& a() {
    return static_cast<Globals*>(globals_)->a;
}
B& b() {
    return static_cast<Globals*>(globals_)->b;
}

// main.cpp
#include <globals.h>
#include <scheduler.h>
#include <f.h>
#include <g.h>
int main() {
    Globals g1, g2;
    Scheduler sch;
    sch.push([]{ f(); }, &g1);
    sch.push([]{ g(); }, &g2);
    ...
    return 0;
}
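To tie this back to the motivating problem, here is a rough single-file sketch of two tests running in parallel, each against its own Globals. It is a simplification under stated assumptions: plain std::thread stands in for the scheduler, and the fields hits and misses along with record_hit, record_miss, run_test, and tests_are_isolated are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <thread>

thread_local void* globals_ = nullptr;

struct A { int hits = 0; };    // hypothetical contents
struct B { int misses = 0; };  // hypothetical contents
struct Globals {
    A a;
    B b;
};

// Getters as in the globals module: callers never see Globals.
A& a() { return static_cast<Globals*>(globals_)->a; }
B& b() { return static_cast<Globals*>(globals_)->b; }

// Code under test: uses "globals" with no state passed in.
void record_hit() { ++a().hits; }
void record_miss() { ++b().misses; }

// One test body: install this test's globals, then exercise the code.
void run_test(Globals* state, int hits, int misses) {
    globals_ = state;
    for (int i = 0; i < hits; ++i) record_hit();
    for (int i = 0; i < misses; ++i) record_miss();
}

// Runs two tests concurrently and checks that neither saw the
// other's state.
bool tests_are_isolated() {
    Globals g1, g2;
    std::thread t1(run_test, &g1, 3, 0);
    std::thread t2(run_test, &g2, 0, 5);
    t1.join();
    t2.join();
    return g1.a.hits == 3 && g1.b.misses == 0 &&
           g2.a.hits == 0 && g2.b.misses == 5;
}
```

Each thread writes only through its own pointer, so the two tests need no synchronization with each other, and the results are read only after both threads have been joined.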

Footnotes

[1] If you have other good reasons not to use globals, please share them in the comments.