1. Revision History
1.1. Revision 1
Added actionable language for inclusion into the standard. We now state that splayed layouts are the minimum requirement and any hierarchy or module name enforcement is solely the responsibility of the build system.
1.2. Revision 0
Initial Release 🎉
2. Motivation
As the advent of modules approaches, build systems and developers are still waiting on the final bits that will make it into C++20. However, because of various limitations, the standard cannot enforce specific conventions, merely encourage them. This paper seeks to provide a possible convention that reduces the work required by build systems, reduces the general effort required from compilers (e.g., they will not need to implement a local socket-based server for information passing), and makes the lives of developers easier, as they will get to experience a fairly consistent development process when moving between or across projects.
3. Design
The design for this behavior is as follows:
A so-called module entrypoint is placed inside of a directory. A module
entrypoint, as defined in §5 Wording, is a file. The name of this entrypoint is user defined, with a fallback to a file with the base name
module and possibly some file extension. We do not enforce the requirement of
a file extension so that operating systems without them can do as they please.
This also allows the community to eventually select a "better" file extension.
That said, the author recommends .cxx,
as CXX is the environment variable used
by plenty of build systems to represent the C++ compiler, whereas CPP
represents the C preprocessor. Additionally, moving libraries from
non-module APIs to modularized APIs can have a path of least resistance instead
of requiring something that has nothing to do with the name C++, like
or
.
Therefore, if a module’s interface has
, the compiler will look for a file with a base name of
, and (optionally) the same file extension as
.
This behavior only occurs during the construction of a BMI. If dependent modules are not yet compiled, or if users are expecting an object file to magically pop out of the compiler, they will be surprised when the compiler gives an error.
The building of object files is still left up to build systems so that existing distributed build system workflows are not interrupted (such as in the case of tools like Icecream, distcc, or sccache).
Where things get interesting is when a user desires to import another module into a module entrypoint. Given the following directory layout:
.
└── src
    └── core
        ├── module.cxx
        ├── list.cxx
        └── io
            ├── module.cxx
            └── file.cxx
We can assume that, perhaps, the source for
looks something
like:
export module name;
export import core.io;
import :list;
In other languages, this would imply that the compiler has to now recurse into
the
directory. We do not do this. Instead, the build system is
required to have seen the
and passed that directory along to
the compiler first. This is a combination of the various module systems, but
has the following properties:
- It does not make the compiler a build system.
- Existing work that has been done to handle dependency management does not need to be thrown away.
- The core.io module does not have to exist in the directory core/io. Rather, it can technically exist anywhere.
- Build systems are free to tie the name of a module to its location on disk, while others are permitted to ignore it entirely.
- Build systems can have a guaranteed fallback location if developers don’t want to have to manually specify the location of each and every module.
- This doesn’t actually tie the compiler to a filesystem approach, as this is just a general convention.
- Build systems are free to implement additional conventions, such as the Pitchfork or Coven filesystem layout, and enforce them for modules having legacy non-module code in the same project layout.
- It allows developers to view modules as hierarchical, even if they aren’t. This means that, if treating modules as a hierarchy becomes widespread enough, the standard could possibly enforce modules as hierarchies in the future.
- Platforms where launching processes is expensive can take advantage of improved throughput when reading from files.
- Build systems and compilers are free to take an optimization where only the modified time of a directory is checked before the contents of that directory are checked. On every operating system (yes, every operating system), directories change their timestamp if any of the files contained within change, but do not update when only the contents of a child directory change. While some operating systems permit mounting drives and locations without modified times, doing so breaks nearly every other build system in existence. Thus we can safely assume that a build system does not need to reparse or rebuild a module if its containing directory has not changed.
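The timestamp optimization in the last point can be sketched as follows. This is only an illustration that takes the premise above at face value; the function name and the notion of a single BMI file per module container are assumptions made for the example, not part of the proposal. It requires C++17 for std::filesystem.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: reports whether the BMI at `bmi` is missing or older
// than the last recorded modification of the directory `container`.
// Only the directory's own timestamp is consulted; no file inside it is opened.
bool container_may_be_stale(const fs::path& container, const fs::path& bmi) {
    assert(fs::is_directory(container));
    if (!fs::exists(bmi)) {
        return true;  // nothing has been built yet
    }
    return fs::last_write_time(container) > fs::last_write_time(bmi);
}
```

A build system could run this check first and only fall back to inspecting individual files when it returns true.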
4. Examples
The following two examples show how implicit module partition lookup can be used for both hierarchical and "splayed" directory layouts.
4.1. Hierarchical
This sample borrows from the above example. Effectively, to import
,
one must build it before building
simply because the build system
assumes that
refers to a directory named
from the project
root.
.
└── src
    └── core
        ├── module.cxx
        ├── list.cxx
        └── io
            ├── module.cxx
            └── file.cxx
This behavior is not enforced by the compiler, but rather by the build system. If a build system does not support a hierarchical implicit lookup, it can at least support a splayed implicit lookup.
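As a rough sketch of what a hierarchical build system might do, the helper below maps a dotted module name onto a directory beneath a root. The function name, the root parameter, and the rule "each dot becomes a directory separator" are assumptions made for illustration; the paper leaves this policy entirely to the build system.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>
#include <string_view>

// Hypothetical build-system helper: maps a dotted module name such as
// "core.io" to a directory below `root`, i.e. <root>/core/io.
std::filesystem::path hierarchical_container(const std::filesystem::path& root,
                                             std::string_view module_name) {
    assert(!module_name.empty());
    std::filesystem::path dir = root;
    std::size_t start = 0;
    while (true) {
        const std::size_t dot = module_name.find('.', start);
        const std::size_t end =
            (dot == std::string_view::npos) ? module_name.size() : dot;
        // Append one path component per dot-separated segment.
        dir /= std::string(module_name.substr(start, end - start));
        if (dot == std::string_view::npos) {
            break;
        }
        start = dot + 1;
    }
    return dir;
}
```

With the layout above, a call with the root src and the name core.io would yield the directory src/core/io, which the build system then hands to the compiler as a module container.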
4.2. Splayed
This approach is one that might be more commonly seen as C++ developers move from headers to modules.
.
├── core
│   ├── module.cxx
│   └── list.cxx
└── io
    ├── module.cxx
    └── file.cxx
In the above layout,
is located in
, rather than under the
directory. A sufficiently simple build system could be told that
resides under
and not to rely on some kind of hierarchical
directory layout.
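A sufficiently simple build system of this kind could be little more than a lookup table. The sketch below assumes a hypothetical table type filled from the build system's own configuration; neither the type nor the function name comes from the proposal.

```cpp
#include <cassert>
#include <filesystem>
#include <map>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical table a build system might fill from its configuration:
// module name -> module container directory.
using ContainerTable = std::map<std::string, fs::path>;

// Returns the container registered for `module_name`, or nothing if the
// build system was never told where that module lives.
std::optional<fs::path> splayed_container(const ContainerTable& table,
                                          const std::string& module_name) {
    assert(!module_name.empty());
    const auto it = table.find(module_name);
    if (it == table.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

Because the table is explicit, the on-disk location of a module does not need to resemble its name at all.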
5. Wording
The following is to be placed into the current working draft at a location within close proximity to, or adjacent to, the current merged modules wording:
1 This subclause describes operations to be supported regarding modules, their implicit partitions, and interactions within the filesystem.
2 A module container is a collection of files represented by a directory. [fs.general]
3 A module entrypoint is a filesystem object within a module container that holds the purview of an exported module.
4 An implicit module partition is a filesystem object adjacent to a module entrypoint.
1 Conformance is required for all vendors whose implementations create a final program to be run on the abstract machine.
2 Compilers must provide a flag for users to declare a module container as a filesystem directory.
3 Compilers must provide a flag for users to declare the name of a module entrypoint.
4 Compilers must use a predefined module entrypoint if none is
provided by a user. This entrypoint must be a file with a base name of module. The extension of the file is currently left unspecified, but should
be one of the several file extensions that have been historically used. (e.g.,
,
,
, and
, et al.)
1 In the module entrypoint's preamble, when an implementation encounters a module partition import, it needs to look for an implicit module partition with the same base name as the module partition.