Toward standardization of dynamic libraries

Matt Austern <austern@apple.com>
25 Sep 2002
N1400=02-0058

Motivation

Essentially all modern operating systems support dynamic shared libraries. Programmers rely on this feature. The C++ Standard has nothing to say about it, and OS vendors who have implemented dynamic libraries tend to have little to say about C++. There is wide variation between systems.

Vendors tend to justify this variation by saying that any program that uses dynamic libraries is outside the scope of the C++ Standard. Unfortunately, this appears to be true. The Standard says what a program is (section 3.5) and how it's put together (section 2.1). These definitions are inapplicable to a modern environment that includes dynamic linking. We run the risk of having a standard that doesn't apply to any of the programs people write.

My goal is to make it possible for users to write at least some portable programs that use dynamic libraries. We don't have to solve all problems, and we can't: there's too much variation between OSs. What we can hope to do is standardize a large enough subset of dynamic library semantics so that it's possible to write programs that are both interesting and portable, and to describe which aspects may vary from system to system. I do not intend to address component models (e.g. CORBA) or cross-language issues.

This is an informal document. The next step is an issues list, and the step after that, if we can come to an agreement on basic directions, is formal standardese.

Terminology

I'd like to avoid the word "program", since, in the context of dynamic libraries, it can mean several things. (None of which is quite what it means in the C++ standard.) Instead, some new terms:

A load unit is a series of translation units linked together by a static linker. A load unit may be an executable or a dynamic library. A load unit may contain unresolved symbols: it may depend on those symbols being defined in some other load unit.

A load unit's direct dependencies are the load units that the static linker sees when the load unit is being built, and that it explicitly depends on. A load unit's set of dependencies is the transitive closure of the direct dependencies. It can be defined recursively as follows: for a load unit with no direct dependencies its set of dependencies is the empty set, and otherwise its set of dependencies is the union of its direct dependencies and each of their sets of dependencies. A load unit's indirect dependencies are its dependencies that are not direct dependencies.
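The recursive definition above can be sketched in a few lines of C++. This is only an illustrative model: the DependencyGraph type and the function names are assumptions made for the example, not part of any proposal.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each load unit name maps to its direct dependencies.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Adds every dependency of `unit` (direct and indirect) to `out`.
// `out` doubles as the visited set, so cyclic graphs still terminate.
void collect_dependencies(const DependencyGraph& graph,
                          const std::string& unit,
                          std::set<std::string>& out) {
    auto it = graph.find(unit);
    if (it == graph.end()) return;  // no recorded direct dependencies
    for (const std::string& dep : it->second) {
        if (out.insert(dep).second) {
            collect_dependencies(graph, dep, out);
        }
    }
}

// The set of dependencies: the transitive closure of the direct ones.
std::set<std::string> dependencies(const DependencyGraph& graph,
                                   const std::string& unit) {
    std::set<std::string> out;
    collect_dependencies(graph, unit, out);
    return out;
}

// Indirect dependencies: dependencies that are not direct dependencies.
std::set<std::string> indirect_dependencies(const DependencyGraph& graph,
                                            const std::string& unit) {
    std::set<std::string> result = dependencies(graph, unit);
    auto it = graph.find(unit);
    if (it != graph.end()) {
        for (const std::string& dep : it->second) result.erase(dep);
    }
    return result;
}

// Usage: an executable linked against lib1, which was linked against lib2.
void example_dependencies() {
    DependencyGraph g;
    g["app"].push_back("lib1");
    g["lib1"].push_back("lib2");
    assert(dependencies(g, "app").size() == 2);
    assert(indirect_dependencies(g, "app").count("lib2") == 1);
}
```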

A load set is an executable, and zero or more dynamic libraries, linked together. The load units that make up a load set are linked together by a dynamic linker, typically at, or even after, program start time. The dynamic linker loads a load set by starting with an executable and then automatically finding all of the executable's dependencies, direct and indirect.

A loadable library is a dynamic library that is part of a load set but that is not a dependency of the executable. It is loaded into the load set by an explicit function call under the control of the programmer. These kinds of libraries are often called "plugins" or "bundles". Some OSs, but not all, enforce a rigorous distinction between loadable libraries and libraries used as a load unit's dependencies.

Usage models

We can identify several different ways in which dynamic libraries may be used:

Model 0: The user doesn't ask for dynamic libraries at all. (The implementation may choose to use them, but that's an invisible detail.)
Model 1: The static linker that creates an executable sees all of the dynamic libraries that the executable depends on. When the dynamic linker links this executable against the libraries to create a load set, it doesn't see anything more than the static linker did.
Model 2: The static linker that creates an executable sees libraries that could act as the executable's dependencies. However, at load set launch time, the libraries that get loaded may be different. Example: the static linker links against a third-party library that provides some feature. When a user starts up the executable five years later, the dynamic linker sees a newer version of the library. The source and object files that the static linker used to create the executable are long gone.
Model 3: Loadable libraries. The load set includes at least one dynamic library that the static linker never saw. Example: the code for some rarely performed action is in a loadable library instead of in the main executable. When a user requests that action, perhaps hours after the load set first started running, that library is explicitly loaded. It might be explicitly unloaded long before load set termination.

General principles:

Model 0 is what we've got now, so, obviously, the existing guarantees in the Standard apply.
There should be few surprises going from model 0 to model 1.
There should be few surprises going from model 1 to model 2.
It should be possible for a C++ program to take advantage of features that are specific to dynamic libraries.

It won't always be possible to satisfy these goals simultaneously.

Issues

Linkage

On all systems I know of that support dynamic libraries, a load unit can serve as an extra layer of scope, intermediate between a translation unit and the load set as a whole. So, for example, a symbol can be external in that it's shared between all of the translation units in a load unit, but private in that it can't be seen from other load units. I believe the Standard must take this feature into account. First, it's universally available. Second, it's a major reason that programmers organize their code into dynamic libraries.

The first issue is one of vocabulary: what words do we use to make this distinction? The three most obvious choices, public, external, and exported, are already used for other purposes! I suggest that we call a symbol global if it is intended to be visible to load units other than the one where it is defined. A symbol can only be global if it has external linkage.

(One reason "global" is a poor choice is that in some systems there's an asymmetry between making a symbol available for use in other load units, and using a symbol from a different load unit. Names like "import" and "export" reflect that asymmetry better.)

Second: how do programmers control whether or not symbols are global? The answers on existing systems are language extensions (pragmas, attributes, or new keywords), extralinguistic mechanisms (linker flags, export files), or both. For the Standard, my opinion is that the only sensible answer is a new keyword that can be used in some kinds of definitions. I propose global. I propose that symbols are always non-global by default.

This is a somewhat controversial issue, since existing practice varies widely. Either way, we need a source construct to control this on a symbol-by-symbol level. I suggest that non-globals be the default for two reasons. First, a load unit's set of global symbols is part of its interface and its set of non-global symbols is part of its implementation, and implementations are usually larger than interfaces. Second, there are existing implementations with source constructs to make symbols global selectively.

I propose that a definition must use the global keyword to allow the entity that's being defined to be accessed from other load units, but that there's no special syntax to use an entity defined in another load unit.

Third: how do programmers specify that a load unit should expect to find a symbol's definition in a different load unit? There are two general classes of answers: either by leaving the symbol undefined, or by using a source construct along the same lines as extern.

Fourth, granularity of control: what kinds of definitions can be defined as global? I suggest the following:

A variable with external linkage can be defined as global.
A function with external linkage can be defined as global.
A class can be defined as global. This means that all of its static and nonstatic member functions, along with whatever internal machinery the compiler introduces (typeinfo information, vtables, thunks,...) are global.
A class template or function template can be defined as global. This means that all specializations will be global.
An explicit instantiation, or a (partial or complete) specialization, can be defined as global.
There is no mechanism for declaring some member functions of a nonglobal class to be global, or for declaring some member functions of a global class to be nonglobal. (Rationale: the semantics would be too complicated to be useful.)
There is no mechanism for declaring some specializations of a global template to be nonglobal. (Rationale: if you want some specializations to be global and some to be nonglobal, then just declare the template to be nonglobal and globalize some specializations selectively.)
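The granularity rules above can be approximated today with vendor extensions. The following is a minimal sketch, not the proposed syntax: GLOBAL is a hypothetical macro standing in for the proposed global keyword, and all other names are invented for illustration. It assumes that on GCC or Clang the load unit is built with -fvisibility=hidden, so that unmarked symbols stay non-global by default.

```cpp
#include <cassert>

// GLOBAL is a hypothetical stand-in for the proposed `global` keyword,
// mapped onto existing vendor extensions where they are available.
#if defined(_WIN32)
#  define GLOBAL __declspec(dllexport)
#elif defined(__GNUC__)
#  define GLOBAL __attribute__((visibility("default")))
#else
#  define GLOBAL
#endif

// A variable and a function with external linkage, both marked global.
GLOBAL int widget_count = 0;
GLOBAL int next_widget_id() { return ++widget_count; }

// A global class: its member functions and the compiler-generated
// machinery (vtable, typeinfo) are marked along with it.
class GLOBAL Widget {
public:
    Widget() : id_(next_widget_id()) {}
    virtual ~Widget() {}
    int id() const { return id_; }
private:
    int id_;
};

// A global class template: every specialization is meant to be global.
template <class T>
class GLOBAL Box {
public:
    explicit Box(const T& value) : value_(value) {}
    const T& get() const { return value_; }
private:
    T value_;
};

// No marker: external linkage, but intended to stay private to this
// load unit (assuming hidden visibility is the build default).
int internal_helper(int x) { return x * 2; }

// Usage inside the defining load unit looks like ordinary C++.
void example_granularity() {
    Widget w;
    Box<int> b(internal_helper(21));
    assert(w.id() == widget_count);
    assert(b.get() == 42);
}
```

Note that the marker appears only on whole classes and whole templates, never on individual members, which matches the restrictions suggested in the list.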

Fifth, what exactly does this mean for things like class or template definitions, which typically show up in headers? Do we require textual differences in the header depending on which load unit it appears in, perhaps by macro hackery, or do we require that the definition use the global keyword in all load units? I propose the latter. (Rationale: it's already the case that a class definition may appear in multiple translation units, so allowing it to appear in multiple load units doesn't seem like much of a stretch. I'm uncomfortable with the ODR implications of saying that a class definition must be textually different in two different contexts.)
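For comparison, here is a sketch of the kind of macro hackery that the first alternative implies, as commonly seen on systems that distinguish exporting from importing. SHAPES_API, SHAPES_BUILDING, and the Square class are hypothetical names. The header and source file are shown together so that the sketch compiles as a single file.

```cpp
#include <cassert>

// --- shapes.h (hypothetical header shared by the library and its clients) ---
// SHAPES_BUILDING is a hypothetical macro that only the library's own build
// would define. It is defined here so the sketch compiles as one file.
#define SHAPES_BUILDING 1

#if defined(_WIN32)
#  if defined(SHAPES_BUILDING)
#    define SHAPES_API __declspec(dllexport)  // text seen by the defining load unit
#  else
#    define SHAPES_API __declspec(dllimport)  // text seen by client load units
#  endif
#elif defined(__GNUC__)
#  define SHAPES_API __attribute__((visibility("default")))
#else
#  define SHAPES_API
#endif

class SHAPES_API Square {
public:
    explicit Square(int side) : side_(side) {}
    int area() const;
private:
    int side_;
};

// --- shapes.cpp (compiled only into the library) ---
int Square::area() const { return side_ * side_; }

// Usage is the same in every load unit; only the macro expansion differs.
void example_shapes() {
    assert(Square(3).area() == 9);
}
```

After preprocessing, the class definition is textually different in the library and in its clients on some platforms, which is exactly the ODR concern raised above.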

Sixth: how the global/nonglobal distinction interacts with the type system. My proposed answer: it doesn't. There's no distinction between a pointer to a global object and a pointer to a nonglobal object, it's impossible to overload on global/nonglobal, etc. (Rationale: this reflects the reality of existing practice.)
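Existing practice already behaves this way, as a short sketch can show. The EXPORTED and PRIVATE_TO_UNIT macros are hypothetical names wrapping the GCC/Clang visibility attribute. On other compilers they expand to nothing, and the point about types still holds.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical markers built on the GCC/Clang visibility attribute.
#if defined(__GNUC__) && !defined(_WIN32)
#  define EXPORTED __attribute__((visibility("default")))
#  define PRIVATE_TO_UNIT __attribute__((visibility("hidden")))
#else
#  define EXPORTED
#  define PRIVATE_TO_UNIT
#endif

EXPORTED int shared_counter = 1;
PRIVATE_TO_UNIT int local_counter = 2;

// Visibility is not part of the type: both addresses are plain int*.
static_assert(std::is_same<decltype(&shared_counter),
                           decltype(&local_counter)>::value,
              "visibility does not change the pointer type");

// One function accepts either pointer; there is nothing to overload on.
int read_counter(const int* p) { return *p; }

void example_type_system() {
    assert(read_counter(&shared_counter) == 1);
    assert(read_counter(&local_counter) == 2);
}
```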

Seventh: how the global/nonglobal distinction interacts with other, similar facilities that we have in the language already. The three that I can think of are linkage, namespaces, and exported templates. Linkage: I think what we're doing is introducing a new category of linkage, in addition to the three that we have now (no linkage, internal linkage, and external linkage). Within a single load unit, something with global linkage looks just like something that has non-global external linkage. Namespaces: I suggest that the two concepts should be orthogonal. There's nothing wrong with having a namespace that's opened in multiple load units. (Rationale: this is pretty much unavoidable, since we can expect that most load units will use namespace std. It also reflects existing practice.) Export: I don't know.

Symbol resolution

This is closely related to the issue of linkage, but not quite identical. By symbol resolution, I mean how a symbol that is undefined in one load unit can be bound to a global symbol in another load unit. By definition, this is something that happens in the dynamic linker. Issues dealing with symbol resolution include:

Do symbols that the static linker didn't see participate in symbol resolution? Example: the static linker links an executable against a dynamic library lib1. A variable x is undefined both in the executable and in the version of lib1 that the static linker sees. However, a different version of lib1, which is available at load set launch time, does have a global symbol x. Is this a well-formed program? (My suggestion: no. The Standard should require that all symbols can be resolved at static link time.)
What happens if there is a symbol that disappears between static and dynamic link time? That is: the static linker expects a symbol x to be defined in a dynamic library lib1, but, at load set launch time, the version of lib1 that gets loaded has no such global symbol. (Again, my suggestion: undefined behavior.)
If the static linker expects a symbol that's undefined in the executable to be provided by a dynamic library lib1, are there any circumstances in which, at load set launch time, we can get that symbol from a different library lib2? If the answer should be "no", what restrictions do we need to prevent it from happening?
Are there any circumstances in which it's legal to have an ordinary global function or variable that's defined in more than one load unit? Suppose, for example, that x is an undefined symbol in an executable, that lib1 and lib2 are direct dependencies of the executable, and that each of them provides a global symbol x. Is this behavior defined? If so, which symbol satisfies the executable's dependency?
A special case: suppose that x is an undefined symbol in an executable, that at static link time we find a global x in a single library lib1, and that at dynamic link time we find a global x in two different libraries lib1 and lib2. Which one do we get, or does this result in undefined runtime behavior? (Note that both of them might be third-party libraries from different vendors, and that they may evolve independently.)
If we have something that may properly be defined in more than one translation unit (e.g. a global template specialization), which one do we get? This is a more serious issue than it might seem at first sight, when combined with the previous questions: the general principle I think we'd all like is that a library's interface should not change between static link time and dynamic link time but its implementation might, so we need a consistent view of whether the set of templates instantiated by a load unit is part of the load unit's interface or its implementation.
Direct versus indirect dependencies. Consider two cases. First, a load set consists of an executable that's linked against two dynamic libraries, lib1 and lib2. Second, a load set consists of an executable that's linked against lib1, and lib1 was linked in turn against lib2. In the second case, the executable has a direct dependency and an indirect dependency. The static linker that builds the executable doesn't necessarily see lib2. Is symbol resolution different for direct and indirect dependencies? How about the case where an executable has both a direct and an indirect dependency on the same dynamic library? What about the case where an executable has a dependency on a dynamic library, and also loads a loadable library with a dependency on the same library?
A general question: which load units are candidates for symbol resolution? (One simple rule, perhaps too simple, is that an unresolved symbol x in some load unit M may only be satisfied by a global symbol x in a load unit M' if M' is a direct dependency of M, and if x is found in M' both at static and at dynamic link time.)
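To make the simple rule in the last item concrete, here is a small model of it in C++. It is a sketch only: the LoadUnit structure and the candidates function are hypothetical, and a real dynamic linker tracks far more than symbol names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one load unit as seen at both link times.
struct LoadUnit {
    std::vector<std::string> direct_deps;   // names of direct dependencies
    std::set<std::string> globals_static;   // globals the static linker saw
    std::set<std::string> globals_dynamic;  // globals present at load time
};

using LoadSet = std::map<std::string, LoadUnit>;

// Returns the load units allowed to satisfy `symbol` for `unit` under the
// simple rule: the provider must be a direct dependency, and the symbol
// must be global in it both at static and at dynamic link time.
std::vector<std::string> candidates(const LoadSet& load_set,
                                    const std::string& unit,
                                    const std::string& symbol) {
    std::vector<std::string> result;
    auto it = load_set.find(unit);
    if (it == load_set.end()) return result;
    for (const std::string& dep : it->second.direct_deps) {
        auto d = load_set.find(dep);
        if (d == load_set.end()) continue;
        if (d->second.globals_static.count(symbol) != 0 &&
            d->second.globals_dynamic.count(symbol) != 0) {
            result.push_back(dep);
        }
    }
    return result;
}

// Usage: lib2 gains a global x only after the executable was built,
// so under the rule it cannot hijack the executable's reference to x.
void example_resolution() {
    LoadSet s;
    s["app"].direct_deps = {"lib1", "lib2"};
    s["lib1"].globals_static = {"x"};
    s["lib1"].globals_dynamic = {"x"};
    s["lib2"].globals_dynamic = {"x"};
    std::vector<std::string> found = candidates(s, "app", "x");
    assert(found.size() == 1 && found[0] == "lib1");
}
```

An empty result models an unresolved symbol. More than one candidate corresponds to the multiple-definition cases above, which the simple rule by itself does not settle.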

I regard symbol resolution rules as our nastiest problem.

Definitions in the Standard

Are any modifications needed to the following parts of the Standard?

The definition of a program (section 3.5)
Phases of translation (section 2.1)
The one definition rule (section 3.2)

Some possible choices of symbol resolution rules will require ODR changes. We might reasonably end up with a rule where a global symbol x means different things in different load units. That may be unavoidable; I suggest, however, that we should at least make sure that a symbol always means the same thing within a single load unit.

Loadable libraries

What is the standard interface for loading a library by name after a load set has already started running? What happens after the library has been loaded? How do its global symbols participate in the load set's symbol resolution? How can one access a global symbol from that library?

Some incomplete suggestions:

We need to develop an API for loading and unloading dynamic libraries and for accessing symbols from them. Existing practice varies widely: dlopen on Linux and Solaris, shl_load on HP-UX, the dyld API on OS X, etc. To access a global variable x from a library, we might load the library and then write something like lib_handle->fetch_symbol<int>("x"). To call a global function from a library, we might do a symbol lookup and then assign the result to a function pointer. We'll need to think about query interfaces: how do we fetch an overloaded name? Naturally, the query interface must be implemented and tested before we think about standardizing it.
The global symbols defined in a loadable library do not participate in symbol resolution in the load set as a whole; symbol resolution between a loadable library and its dependencies works the same way as symbol resolution between an executable and its dependencies. The only way for a load set to access symbols from a loadable library is through the query API.
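As a rough idea of what such a query API could look like on top of existing practice, here is a sketch built on the POSIX dlopen, dlsym, and dlclose functions. The library_handle class and its fetch_symbol member are hypothetical and only loosely follow the spelling used above. The sketch assumes a POSIX system with a dynamically linked C library, and on some systems the program must also be linked with -ldl.

```cpp
#include <cassert>
#include <dlfcn.h>    // POSIX: dlopen, dlsym, dlclose, dlerror
#include <stdexcept>

// Hypothetical wrapper; the name and interface are illustrative only.
class library_handle {
public:
    // A null path asks dlopen for a handle to the running program itself.
    explicit library_handle(const char* path)
        : handle_(dlopen(path, RTLD_NOW)) {
        if (!handle_) {
            const char* msg = dlerror();
            throw std::runtime_error(msg ? msg : "dlopen failed");
        }
    }
    ~library_handle() { dlclose(handle_); }
    library_handle(const library_handle&) = delete;
    library_handle& operator=(const library_handle&) = delete;

    // Returns a pointer to the object or function called `name`, or null
    // if there is no such symbol. The type T is taken on trust: dlsym
    // knows only names, which is why overloaded names are an open issue.
    template <class T>
    T* fetch_symbol(const char* name) const {
        return reinterpret_cast<T*>(dlsym(handle_, name));
    }

private:
    void* handle_;
};

// Usage: find the C library's atoi through the program's own handle.
// A C function is used because its symbol name is not mangled.
int example_parse(const char* text) {
    library_handle self(nullptr);
    using atoi_fn = int(const char*);
    atoi_fn* f = self.fetch_symbol<atoi_fn>("atoi");
    return f ? f(text) : 0;
}
```

A failed load is reported here by throwing an exception. That is one possible design, not a recommendation.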

Type equivalence

If a type with the name X is defined in multiple load units, is it considered to be the same type? (Imagine a header file that's #included into multiple source files, for example.) Some specific considerations:

Does the type_info object returned by typeid in one load unit compare equal to the object returned by typeid in another load unit? (Proposed answer: yes if the type is defined as global in both load units.)
Can you throw an exception of type X in one load unit and catch it in another load unit? (Proposed answer: yes if type X is defined as global in both load units.)
Is it legal to declare a type X in one load unit and an unrelated type of the same name in another load unit, or should we take that as violating the ODR? (Proposed answer: it's legal if X is defined as nonglobal in both load units.)
If a global class template is instantiated with the same parameters in two different load units, do we get the same type? (Note that the answer might be different depending on whether the template parameters are local or global, whether we're talking about non-type template parameters, etc.)
Does an explicit instantiation in two different load units violate the ODR?
Can an explicit specialization of a template override an implicit instantiation in a different load unit?
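Whatever answers we choose, user code that compares types is probably best written in terms of type_info equality rather than the addresses of type_info objects, since separate load units might plausibly carry separate type_info objects for the same type. A minimal single-load-unit sketch, with hypothetical class names:

```cpp
#include <cassert>
#include <typeindex>
#include <typeinfo>

struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// Compares dynamic types with type_info::operator==. It deliberately
// does not compare &typeid(a) with &typeid(b).
bool same_dynamic_type(const Shape& a, const Shape& b) {
    return typeid(a) == typeid(b);
}

// std::type_index wraps the same comparison in a value type.
bool is_circle(const Shape& s) {
    return std::type_index(typeid(s)) == std::type_index(typeid(Circle));
}

void example_type_equivalence() {
    Circle c1, c2;
    Square sq;
    assert(same_dynamic_type(c1, c2));
    assert(!same_dynamic_type(c1, sq));
    assert(is_circle(c1));
}
```

Within one load unit the Standard already guarantees these results. The open question is whether they still hold when the two objects were created in different load units.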

Object identity

Some objects may be defined in multiple places, and the Standard says the implementation is responsible for making sure that only one object appears in the final executable. What happens if an object is defined in multiple load units that get linked into the same load set? Does the answer depend on whether we're talking about direct dependencies, indirect dependencies, or loadable libraries? Examples:

If a global function template is instantiated in multiple load units, is the address of the instantiation the same in all load units?
If a global class template with a static member variable is instantiated in multiple load units, do we get one copy or multiple copies? We can test this by checking the variable's address, or we can just write X<int>::foo = 7 in one load unit and check the value of X<int>::foo in another load unit.
If a global inline function is defined in multiple load units, does that function have the same address in all load units? Are its static variables (if any) shared between load units?
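The static member test described above might be written as follows. This sketch puts both halves in one file, where the Standard already guarantees a single copy. The functions named for unit A and unit B are hypothetical and would live in separate load units in the real experiment.

```cpp
#include <cassert>

// Would be defined as global under the proposal, with the same text
// appearing in a header included by every load unit.
template <class T>
struct X {
    static int foo;
};
template <class T> int X<T>::foo = 0;

// Imagine these two functions compiled into load unit A.
void write_from_unit_a() { X<int>::foo = 7; }
const int* address_in_unit_a() { return &X<int>::foo; }

// Imagine these two functions compiled into load unit B.
int read_from_unit_b() { return X<int>::foo; }
const int* address_in_unit_b() { return &X<int>::foo; }

// With a single shared copy, B observes A's write and both see one address.
void example_object_identity() {
    write_from_unit_a();
    assert(read_from_unit_b() == 7);
    assert(address_in_unit_a() == address_in_unit_b());
}
```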

A complementary question: suppose that a load unit is part of multiple load sets on the same system. Is its state (e.g. the values of static and namespace-scope variables) shared between load sets, or does each load set conceptually have a separate copy of the load unit?

Order of initialization

We need to consider order of initialization of nonlocal objects within load units, order of initialization between load units, and interaction with atexit. Some specific issues:

Do we require that dynamic initialization of nonlocal objects in a dynamic library must take place at load time, or do we make the weaker guarantee that dynamic initialization within a translation unit takes place before any functions from that translation unit get called?
Do we have a mechanism for detecting failure in dynamic initialization of nonlocal objects? We might, for example, return an error code or throw an exception when we try to explicitly load a loadable library.
Is it guaranteed that all nonlocal objects in a dynamic library get initialized eventually?
Once we throw loadable libraries into the mix, we can no longer require that destructors run in the opposite order from constructors. (Constructors can run no sooner than when a library is loaded, and destructors can run no later than when it's unloaded. It's easy to construct cases where destruction in opposite order is impossible.) Similarly for interleaving with atexit. What ordering guarantees can we make instead?
What happens if a destructor or an atexit function uses resources from a library that has been unloaded by the time the function gets run? In one sense the answer is obvious: we can't make this work, and we have to say it's undefined. But the next question is how to say that in the Standard. What restrictions does the user have to satisfy? What guarantees on ordering do we have to impose to make sure that this mess only happens when users deliberately shoot themselves in the foot? (Partial suggestion: libraries that are loaded automatically, i.e. direct and indirect dependencies of the executable, are unloaded in opposite order from the load order.)
Is there a mechanism for users to specify arbitrary code that is to be executed immediately after a dynamic library is loaded or immediately before it is unloaded? If so, can users provide multiple such hooks, and is there a mechanism for controlling their ordering?
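In the absence of a dedicated mechanism, the nearest portable approximation to a load or unload hook is a nonlocal object whose constructor and destructor do the work. The sketch below uses hypothetical names. Whether the constructor is guaranteed to run at load time is exactly the first question in the list above.

```cpp
#include <cassert>

// State lives in a function-local static so that it is usable no matter
// when the hook object below is constructed or destroyed.
int& hook_state() {
    static int state = 0;  // 0 = not loaded, 1 = loaded, 2 = unloading
    return state;
}

// Hypothetical hook: the constructor stands in for "run after load",
// the destructor for "run before unload".
struct LoadHook {
    LoadHook()  { hook_state() = 1; }
    ~LoadHook() { hook_state() = 2; }
};

// A nonlocal object with dynamic initialization. Today's Standard only
// promises that it is initialized before the first use of any function
// or object defined in this translation unit.
static LoadHook hook_instance;

// Any function in this translation unit may rely on the hook having run.
bool library_is_ready() { return hook_state() == 1; }

void example_load_hook() {
    assert(library_is_ready());
}
```

Several hooks can be expressed as several such objects in one translation unit, where they are initialized in order of definition. Ordering across translation units and load units is not controlled by this idiom.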

The Standard Library

What new features need to be added to the Standard Library to control dynamic linking? We've already discussed an interface for loading libraries and accessing symbols from them; is there anything else?
Which load unit are Standard Library names defined in: their own load unit, or some unspecified load unit, or every load unit?
Which names from the Standard Library, if any, are global?
Is the state of the Standard Library (e.g. the default locale, the new handler, the unexpected handler) shared between load units, private to a load unit, or something in between?
Are specializations of Standard Library templates (including user-defined specializations) global?
Does user-defined code that's intended to be called by the Standard Library have to be global?
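The question about shared state can be illustrated with the new handler. In the sketch below, a library installs its own handler. If Standard Library state is shared between load units, every load unit in the load set then observes that handler. The function names are hypothetical, and std::get_new_handler assumes C++11 or later.

```cpp
#include <cassert>
#include <new>

// Hypothetical handler that a dynamic library might install.
void library_new_handler() { throw std::bad_alloc(); }

// Installs the library's handler and returns the one it replaced,
// so that the caller can restore it later.
std::new_handler install_library_handler() {
    return std::set_new_handler(library_new_handler);
}

// True if the handler currently in effect is the library's handler.
bool library_handler_is_active() {
    return std::get_new_handler() == library_new_handler;
}

void example_new_handler() {
    std::new_handler previous = install_library_handler();
    assert(library_handler_is_active());
    std::set_new_handler(previous);  // restore the earlier handler
    assert(!library_handler_is_active());
}
```

In a single load unit the handler is trivially shared. The interesting case is calling library_handler_is_active from a different load unit than the one that installed the handler.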