Document #: | ISO/IEC/JTC1/SC22/WG21/P3335R0 |
Date: | 2024-07-11 |
Audience: | SG15 |
Authors: | René Ferdinand Rivera Morell |
Reply-to: | |
Copyright: | Copyright 2024 René Ferdinand Rivera Morell, Creative Commons Attribution 4.0 International License (CC BY 4.0) |
1. Abstract
Specify a minimal set of core structured options [1] for C++ compiler front ends.
3. Motivation
Tools in the C++ ecosystem have, for decades, dealt with a myriad of different options for invoking C++ compiler front ends. Although we have found ways to manage the variety, it is advantageous to agree on a common language to reduce the growing complexity that the variety creates.
Having a standard common set of structured options allows for:
- Reuse of implementation by tools that interface with compiler front ends.
- Wider adoption of tools that use, as consumers or producers, the common options.
- Lower barriers for inexperienced users, as they have less to learn.
- A basis for other standards to form a common configuration vocabulary.
4. Scope
This proposal aims to specify a set of C++ compiler frontend structured options [1] sufficient to build common C++ use cases. This includes specifying both the names and semantics of the structured options.
This proposal does not aim to standardize compiler frontend command line arguments, although vendors are free to adopt the names and semantics specified if they wish, and we encourage such adoption.
5. Design
The approach for the names and semantics of the options follows these goals:
- Prefer widely used terms in current tools, not just C++ compiler front ends.
- Use widely understood semantics.
- Improve the structure of the data.
5.1. File References
In various places the options need to refer to file names and paths. In some of those instances it is also useful to specify one or more attributes for the file, for example when indicating the language of a source file. We looked at various ways to make that specification both optional and future proof. The schema we ended up with is multi-stage:
- Solitary String
  A "file.ext" can be given such that it is up to the tool to determine any other file attributes as it can. Many times this means that the tool will use the extension to determine things like source language or output format.
- Object With String Key+Value
  A { "file.ext": "value" } specifies that file.ext will set a default attribute of value. What the default attribute is depends on the option we are specifying. For example, for a source it would indicate the source language.
- Object With Object Value
  A { "file.ext": { "a": value, … }} would indicate specific attributes for the "file.ext".
This staged schema allows a minimal syntax for the common use cases, while allowing more information to be specified incrementally as needed.
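As an illustration of the three stages, the following is a minimal sketch of how a consuming tool might normalize one file reference into a filename and a set of attributes. The function name `normalize_file_ref` and the `default_attr` parameter are hypothetical and not part of this proposal. The single-entry restriction on the object form is an assumption taken from the source object description.

```python
from typing import Any, Dict, Tuple


def normalize_file_ref(ref: Any, default_attr: str) -> Tuple[str, Dict[str, Any]]:
    """Return (filename, attributes) for one file reference.

    default_attr is a hypothetical parameter naming the attribute that the
    string shorthand sets, for example "language" for a source.
    """
    if isinstance(ref, str):
        # Solitary string: attributes are left for the tool to determine.
        return ref, {}
    if isinstance(ref, dict) and len(ref) == 1:
        filename, value = next(iter(ref.items()))
        if isinstance(value, str):
            # Object with string value: sets the option's default attribute.
            return filename, {default_attr: value}
        if isinstance(value, dict):
            # Object with object value: explicit attributes.
            return filename, dict(value)
    raise ValueError(f"unsupported file reference: {ref!r}")


print(normalize_file_ref("main.cpp", "language"))       # ('main.cpp', {})
print(normalize_file_ref({"api.i": "c"}, "language"))   # ('api.i', {'language': 'c'})
```

A tool that receives an empty attribute set would then fall back to its own detection, such as inspecting the file extension.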
5.2. Lists
We need to support specifying lists of homogeneous values, like files, while
also minimizing the amount of syntax for common cases. In the case of
indicating a list of values the approach we take is to allow for a single
value to be equivalent to a single element array (without adding array markers).
For example: "file.ext" → [ "file.ext" ], or { "file.ext": "value" } → [ { "file.ext": "value" } ].
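The single-value shorthand can be sketched as a small normalization step. This is only an illustration; the helper name `as_list` is hypothetical.

```python
def as_list(value):
    """Treat a single value as a one-element list; copy real lists."""
    return list(value) if isinstance(value, list) else [value]


print(as_list("file.ext"))               # ['file.ext']
print(as_list({"file.ext": "value"}))    # [{'file.ext': 'value'}]
print(as_list(["a.cpp", "b.cpp"]))       # ['a.cpp', 'b.cpp']
```

Normalizing once at the point of reading lets the rest of a tool handle only the array form.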
5.3. Examples
Below are some examples that show a traditional command line invocation and the corresponding structured options specification. The examples are informational only, meant to illustrate how the structured options could work. As such they may include options that are not proposed in this document. The example command line invocations may also contain some options that are not present in the structured options; i.e. they are not meant to be a one-to-one correspondence. Generally the examples were generated by running a real build system and collecting the command lines it invokes.
5.3.1. Hello World
This is the classic simplest C++ program with the twist that we want to allow full debugging when running it.
"g++" -O0 -fno-inline -Wall -g -static "hello.cpp" -o "hello"
"cl" "hello.cpp" /Fehello -TP /EHs /GR /Z7 /Od /Ob0 /W3 /Op /MLd /DEBUG
/subsystem:console
Those invocations can be represented as somewhat more meaningful structured options. This specification is formulated to be a single cross-vendor object by using a vendor specific section to represent options that only msvc understands, and that other tools could ignore.
{
  "source": "hello.cpp",
  "output": {
    "hello": "exec"
  },
  "optimization": {
    "compile": "off",
    "inline": false
  },
  "warnings": {
    "enable": "all"
  },
  "debug": true,
  "runtime": {
    "multithread": false,
    "debug": true,
    "static": true
  },
  "vendor": {
    "msvc": {
      "subsystem": "console"
    }
  }
}
5.3.2. Compile And Link
A single invocation that does everything is not particularly common, except in basic textbook examples. Here we see the more common case of compiling to produce an object file for the TU, then linking to get the final executable.
"g++" -fPIC -O0 -fno-inline -Wall -g -c -o "hello.o" "hello.cpp"
"g++" -g "hello.o" -o "hello"
The compile only equivalent structured options:
{
  "source": "hello.cpp",
  "output": {
    "hello.o": "object"
  },
  "optimization": {
    "compile": "off",
    "inline": false
  },
  "warnings": {
    "enable": "all"
  },
  "debug": true,
  "runtime": {
    "multithread": false,
    "debug": true,
    "static": true
  }
}
Followed by the structured options to accomplish the link:
{
  "source": "hello.o",
  "output": {
    "hello": "exec"
  },
  "debug": true
}
5.3.3. Many Sources
This is a single command B2 uses to bootstrap its engine on Linux with GCC, and Windows with MSVC. This is a variation on a simple basic invocation that builds many files with some extra options.
g++ -x c++ -std=c++11 -pthread -O2 -s -DNDEBUG bindjam.cpp builtins.cpp
class.cpp command.cpp compile.cpp constants.cpp cwd.cpp debug.cpp
debugger.cpp events.cpp execcmd.cpp execnt.cpp execunix.cpp filent.cpp
filesys.cpp fileunix.cpp frames.cpp function.cpp glob.cpp hash.cpp
hcache.cpp hdrmacro.cpp headers.cpp jam_strings.cpp jam.cpp jamgram.cpp
lists.cpp make.cpp make1.cpp md5.cpp mem.cpp modules.cpp native.cpp
option.cpp output.cpp parse.cpp pathnt.cpp pathsys.cpp pathunix.cpp
regexp.cpp rules.cpp scan.cpp search.cpp startup.cpp tasks.cpp
timestamp.cpp value.cpp variable.cpp w32_getreg.cpp mod_command_db.cpp
mod_db.cpp mod_jam_builtin.cpp mod_jam_class.cpp mod_jam_errors.cpp
mod_jam_modules.cpp mod_order.cpp mod_path.cpp mod_property_set.cpp
mod_regex.cpp mod_sequence.cpp mod_set.cpp mod_string.cpp mod_summary.cpp
mod_sysinfo.cpp mod_version.cpp -o b2
Other than having many more files, this example doesn’t differ much from the Hello World example.
{
  "source": [
    "bindjam.cpp",
    "builtins.cpp",
    "class.cpp",
    "command.cpp",
    "compile.cpp",
    "constants.cpp",
    "cwd.cpp",
    "debug.cpp",
    "debugger.cpp",
    "events.cpp",
    "execcmd.cpp",
    "execnt.cpp",
    "execunix.cpp",
    "filent.cpp",
    "filesys.cpp",
    "fileunix.cpp",
    "frames.cpp",
    "function.cpp",
    "glob.cpp",
    "hash.cpp",
    "hcache.cpp",
    "hdrmacro.cpp",
    "headers.cpp",
    "jam_strings.cpp",
    "jam.cpp",
    "jamgram.cpp",
    "lists.cpp",
    "make.cpp",
    "make1.cpp",
    "md5.cpp",
    "mem.cpp",
    "modules.cpp",
    "native.cpp",
    "option.cpp",
    "output.cpp",
    "parse.cpp",
    "pathnt.cpp",
    "pathsys.cpp",
    "pathunix.cpp",
    "regexp.cpp",
    "rules.cpp",
    "scan.cpp",
    "search.cpp",
    "startup.cpp",
    "tasks.cpp",
    "timestamp.cpp",
    "value.cpp",
    "variable.cpp",
    "w32_getreg.cpp",
    "mod_command_db.cpp",
    "mod_db.cpp",
    "mod_jam_builtin.cpp",
    "mod_jam_class.cpp",
    "mod_jam_errors.cpp",
    "mod_jam_modules.cpp",
    "mod_order.cpp",
    "mod_path.cpp",
    "mod_property_set.cpp",
    "mod_regex.cpp",
    "mod_sequence.cpp",
    "mod_set.cpp",
    "mod_string.cpp",
    "mod_summary.cpp",
    "mod_sysinfo.cpp",
    "mod_version.cpp"
  ],
  "output": {
    "b2": "exec"
  },
  "define": {
    "NDEBUG": null
  },
  "language": {
    "name": "c++",
    "standard": "11"
  },
  "optimization": {
    "compile": "safe",
    "link": true,
    "msvc.global_data": true
  },
  "runtime": {
    "multithread": true,
    "debug": false,
    "static": true
  }
}
"cl" /nologo /MP /MT /TP /Feb2 /wd4996 /wd4675 /O2 /GL /EHsc /Zc:wchar_t /Gw
-DNDEBUG bindjam.cpp builtins.cpp class.cpp command.cpp compile.cpp
constants.cpp cwd.cpp debug.cpp debugger.cpp events.cpp execcmd.cpp
execnt.cpp execunix.cpp filent.cpp filesys.cpp fileunix.cpp frames.cpp
function.cpp glob.cpp hash.cpp hcache.cpp hdrmacro.cpp headers.cpp jam.cpp
jamgram.cpp lists.cpp make.cpp make1.cpp md5.cpp mem.cpp modules.cpp
native.cpp option.cpp output.cpp parse.cpp pathnt.cpp pathsys.cpp
pathunix.cpp regexp.cpp rules.cpp scan.cpp search.cpp jam_strings.cpp
startup.cpp tasks.cpp timestamp.cpp value.cpp variable.cpp w32_getreg.cpp
mod_command_db.cpp mod_db.cpp mod_jam_builtin.cpp mod_jam_class.cpp
mod_jam_errors.cpp mod_jam_modules.cpp mod_order.cpp mod_path.cpp
mod_property_set.cpp mod_regex.cpp mod_sequence.cpp mod_set.cpp
mod_string.cpp mod_summary.cpp mod_sysinfo.cpp mod_version.cpp
/link kernel32.lib advapi32.lib user32.lib
/MANIFEST:EMBED /MANIFESTINPUT:b2.exe.manifest
The msvc equivalent has the addition of listing some system libraries and the special Windows embedded manifest for the executable.
{
  "source": [
    "bindjam.cpp",
    "builtins.cpp",
    "class.cpp",
    "command.cpp",
    "compile.cpp",
    "constants.cpp",
    "cwd.cpp",
    "debug.cpp",
    "debugger.cpp",
    "events.cpp",
    "execcmd.cpp",
    "execnt.cpp",
    "execunix.cpp",
    "filent.cpp",
    "filesys.cpp",
    "fileunix.cpp",
    "frames.cpp",
    "function.cpp",
    "glob.cpp",
    "hash.cpp",
    "hcache.cpp",
    "hdrmacro.cpp",
    "headers.cpp",
    "jam_strings.cpp",
    "jam.cpp",
    "jamgram.cpp",
    "lists.cpp",
    "make.cpp",
    "make1.cpp",
    "md5.cpp",
    "mem.cpp",
    "modules.cpp",
    "native.cpp",
    "option.cpp",
    "output.cpp",
    "parse.cpp",
    "pathnt.cpp",
    "pathsys.cpp",
    "pathunix.cpp",
    "regexp.cpp",
    "rules.cpp",
    "scan.cpp",
    "search.cpp",
    "startup.cpp",
    "tasks.cpp",
    "timestamp.cpp",
    "value.cpp",
    "variable.cpp",
    "w32_getreg.cpp",
    "mod_command_db.cpp",
    "mod_db.cpp",
    "mod_jam_builtin.cpp",
    "mod_jam_class.cpp",
    "mod_jam_errors.cpp",
    "mod_jam_modules.cpp",
    "mod_order.cpp",
    "mod_path.cpp",
    "mod_property_set.cpp",
    "mod_regex.cpp",
    "mod_sequence.cpp",
    "mod_set.cpp",
    "mod_string.cpp",
    "mod_summary.cpp",
    "mod_sysinfo.cpp",
    "mod_version.cpp",
    "kernel32.lib",
    "advapi32.lib",
    "user32.lib"
  ],
  "output": {
    "b2": "exec"
  },
  "define": {
    "NDEBUG": null
  },
  "language": {
    "name": "c++",
    "standard": "11"
  },
  "optimization": {
    "compile": "safe",
    "link": true,
    "msvc.global_data": true
  },
  "runtime": {
    "multithread": true,
    "debug": false,
    "static": true
  },
  "vendor": {
    "msvc": {
      "manifest": {
        "source": "b2.exe.manifest",
        "embed": true
      }
    }
  }
}
5.4. Options
Note: In the tables below, for compiler drivers or front ends we list the un-prefixed option name, and for build systems we list any abstraction for the option. Importantly, we don’t list the build system if it only allows specifying the raw option, as that adds no information beyond what is given for the compiler driver.
The options specified below show first how the concept is specified in two kinds of tools, compiler driver/front-ends and build systems. This is a small, but hopefully representative, sampling of syntax and semantics. The tools considered are: MSVC [2], GCC [3], CMake [4], and B2 [5].
Subsequently we show the chosen key, value, and semantics of the option. The schema for the value is explained in the value if it’s brief, but is otherwise defined in the semantics of each. As mentioned previously, there can be multiple value types for an option. Each of those is outlined to the extent that the specific option allows (currently). The explanations of the options are not exhaustively precise; that is left for future wording.
5.4.1. Source
Tool | Name | Semantics |
---|---|---|
MSVC | | A file specified as a regular argument is added as a source to process. |
GCC | | A file specified as a regular argument is added as a source to process. |
CMake | | A file specified as an argument to a target is added as a source to process. |
B2 | | A file specified as an argument to a target is added as a source to process. |
- Key
  Use std.source or shortened source.
- Value
  Can be a single string, an object, or an array (with strings or objects).
- Semantics
  Adds the sources given to the set of files to process. Depending on the value the semantics can be adjusted:
  - string
    The single file is added to the set. The type of file is determined by the file extension.
    { "source": "main.cpp" }
  - object
    Specifying an object defines additional properties for the source.
    { "source": { "main.cpp": "c++" } }
  - array
    An array can contain either string or object values for the source. Each source in the array is added in order.
    { "source": [ "main.cpp", "utils.cpp", "algo.cpp", { "api.i": "c" } ] }
- Merge Semantics
  The sources in this specification are appended to any existing sources.
5.4.1.1. Source Object
When a source is specified as an object it consists of a single key and value item, where the key is the filename of the source and the value is the kind of file it is. The minimal set of file types that a tool should support is:
- c++
  A file to be interpreted as containing C++ source code to process.
- object
  A compiled TU binary object to process, usually to link.
- dynamic_lib
  A collection of compiled TUs to process, usually to resolve at load time.
- archive_lib
  A collection of compiled TUs to process, usually to resolve at link time.
Source types to match the output and language options should be supported in addition to those above.
The choice of using an object with the single filename+type is to allow an abbreviated method to override the default file type determination.
5.4.2. Output
Tool | Name | Semantics |
---|---|---|
MSVC | | Set the name of the generated output. The option specifies the kind of output generated as: |
GCC | | Sets the file to output with ( |
CMake | | Defines a target for an executable or library. |
B2 | | Defines a target of the given type: |
- Key
  Use std.output or shortened output.
- Value
  An object with a single entry.
- Semantics
  Specifies the output file, and kind of output, to generate when processing the sources.
  {
    "output": {
      "a.out": "exec"
    }
  }
  The key in the entry specifies the filename of the output, and the value in the entry specifies the kind of output. The kind of output also indicates the type of operation the tool will do. Possible kinds of outputs:
  - exec
    Links the compiled sources into an executable file.
  - object
    Compiles the sources into a linkable object file.
  - dynamic_lib
    Links the compiled sources into a dynamically loadable library.
  - archive_lib
    Collects the compiled sources into an archive library of object files.
5.4.3. Include Directories
Tool | Name | Semantics |
---|---|---|
MSVC | | Adds the directory to the include search list. |
GCC | | Adds the directory to the include search list. |
CMake | | Adds the directories to the include search list. |
B2 | | Adds the directory, order unspecified, to the include search list. |
- Key
  Use std.include_dirs or shortened include_dirs.
- Value
  Either a single string or an array of strings. Each string is a pathname whose interpretation is up to the application.
- Semantics
  Adds the listed pathnames to the end of the include directories of the application. It is up to the application to interpret how the composed list of directories is used, but commonly #include preprocessor directives look for files in the order of the include directories list.
  {
    "include_dirs": "/opt/boost_config/include"
  }
  {
    "include_dirs": [
      "/opt/boost_config/include",
      "/opt/openssl/include"
    ]
  }
- Merge Semantics
  The directories in this specification are appended to any existing directories.
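The appending merge can be sketched as follows, assuming a tool first normalizes single strings to arrays. The names `as_list` and `merge_append` are hypothetical. The sketch keeps duplicates, since the text does not say whether repeated directories are collapsed.

```python
def as_list(value):
    """Treat a single value as a one-element list; copy real lists."""
    return list(value) if isinstance(value, list) else [value]


def merge_append(existing, new):
    """Append the new directories after any existing ones, preserving order."""
    return as_list(existing) + as_list(new)


merged = merge_append(
    "/opt/boost_config/include",
    ["/opt/openssl/include"],
)
print(merged)  # ['/opt/boost_config/include', '/opt/openssl/include']
```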
5.4.4. Library Directories
Tool | Name | Semantics |
---|---|---|
MSVC | | Adds to the list of directories to search for link libs. The |
GCC | | |
CMake | | Adds to the list of directories which will be used by the linker to search for libraries. Specifying |
B2 | | Adds to the list of directories which will be used by the linker to search for libraries. |
- Key
  Use std.library_dirs or shortened library_dirs.
- Value
  Either a single string or an array of strings. Each string is a pathname whose interpretation is up to the application.
- Semantics
  Adds the listed pathnames to the end of the library search directories of the application. It is up to the application to interpret how the composed list of directories is used.
  {
    "library_dirs": "/opt/boost_config/lib"
  }
  {
    "library_dirs": [
      "/opt/boost_config/lib",
      "/opt/openssl/lib"
    ]
  }
- Merge Semantics
  The directories in this specification are appended to any existing directories.
5.4.5. Define Preprocessor Symbols
Tool | Name | Semantics |
---|---|---|
MSVC | | Defines a preprocessor symbol to a value overriding any previous definition. If no value is given |
GCC | | Defines a preprocessor symbol to a value overriding any previous definition. If no value is given |
CMake | | Defines a preprocessor symbol to a value overriding any previous definition. If no value is given no value is used and the value is up to the compiler. |
B2 | | Defines a preprocessor symbol to a value overriding any previous definition. If no value is given no value is used and the value is up to the compiler. |
- Key
  Use std.define or shortened define.
- Value
  The option is a dictionary where the keys are the preprocessor symbols to define and the values are mapped from JSON to corresponding preprocessor values.
- Semantics
  For each symbol (the key name) the C++ preprocessor will define the symbol to the value. The value is converted from JSON values as follows:
  - A JSON number is converted to a string and pasted.
  - A JSON string is used directly.
  - A JSON boolean is converted as true ⇒ 1 and false ⇒ 0.
  - A JSON null converts to nothing, and hence the default implementation value should be used.
  {
    "define": {
      "BOOST_ALL_NO_LIB": 1,
      "_WIN32_WINNT": "0x0600",
      "_GNU_SOURCE": true,
      "U_USING_ICU_NAMESPACE": false,
      "NOMINMAX": null
    }
  }
- Merge Semantics
  The definitions in this specification either add to the existing set of definitions when the symbol doesn’t exist, or replace the definition when the symbol already exists.
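To make the value conversion and merge rules concrete, here is a minimal sketch that turns a define object into GCC-style -D arguments. The helper names are hypothetical, and rendering to -D arguments is an assumption chosen for illustration; a real tool would emit whatever form its compiler expects.

```python
def define_value(value):
    """Convert one JSON value to preprocessor text, or None for no value."""
    if value is None:
        return None                      # null: leave the value to the compiler
    if isinstance(value, bool):          # check bool first: bool is an int in Python
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)                # number: converted to a string
    if isinstance(value, str):
        return value                     # string: used directly
    raise TypeError(f"unsupported define value: {value!r}")


def merge_defines(existing, new):
    """Add new symbols and replace symbols that already exist."""
    merged = dict(existing)
    merged.update(new)
    return merged


def define_flags(defines):
    """Render the defines as GCC-style arguments (an illustrative choice)."""
    flags = []
    for symbol, value in defines.items():
        text = define_value(value)
        flags.append(f"-D{symbol}" if text is None else f"-D{symbol}={text}")
    return flags


defines = merge_defines({"NOMINMAX": 1}, {"_GNU_SOURCE": True, "NOMINMAX": None})
print(define_flags(defines))  # ['-DNOMINMAX', '-D_GNU_SOURCE=1']
```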
5.4.6. Undefine Preprocessor Symbols
Tool | Name | Semantics |
---|---|---|
MSVC | | Undefines the given preprocessor symbol. |
GCC | | Undefines the given preprocessor symbol. |
CMake | N/A | |
B2 | | Undefines the given preprocessor symbol. |
- Key
  Use std.undef or shortened undef.
- Value
  Either a single string or an array of strings. Each string is a symbol to undefine.
- Semantics
  Each string in the value "undefines" the named preprocessor symbol. The option is evaluated after the define option.
  {
    "undef": [
      "NDEBUG"
    ]
  }
- Merge Semantics
  The undefs in this specification either add to the existing set of undefs when the symbol doesn’t exist, or replace the undefs when the symbol already exists.
5.4.7. Language
Tool | Name | Semantics |
---|---|---|
MSVC | | Specified a source file is a C ( |
GCC | | Specified source files are the given language. Otherwise the file extension is used. |
CMake | | Specified source files are the given language. Otherwise the file extension is used. |
B2 | | Specified source files are the given language. Otherwise the file extension is used. |
- Key
  Use std.language or shortened language.
- Value
  The option value is a single string indicating the name of a language. The set of values is open, but at minimum c++ and c must be recognized. Other values could be: assembly, objective-c, objective-c++, fortran, go, d, ada.
- Semantics
  Sets the language to use for sources that do not otherwise specify one. The tool should indicate an error for languages it doesn’t recognize.
  {
    "language": "c++"
  }
- Merge Semantics
  The language in this specification replaces an existing language specification.
5.4.8. Optimization
Tool | Name | Semantics |
---|---|---|
MSVC | | Disables ( |
GCC | | Disables ( |
CMake | | Generates build description that may enable optimizations. |
B2 | | Disables ( |
- Key
  Use std.optimization or shortened optimization.
- Value
  When the value is a string it indicates the level of optimization.
- Semantics
  The level of optimization is applied to all the sources being processed. The set of values for optimization is fixed, but tools are free to ignore values or to use an equivalent for them. Which optimizations the tool performs for each value is up to the tool. The only required semantic is that off must disable all optimizations. Possible values:
  - off - Disable optimizations.
  - minimal - Optimizations that may improve speed and space.
  - safe - More optimizations that need more work.
  - speed - Prefer speed over space optimizations.
  - space - Prefer smaller binaries over speed optimizations.
  - debug - Optimize such that debugging capabilities are preserved.
  {
    "optimization": "minimal"
  }
  If the value is an object it can have the following fields and semantics:
  - compile
    Same effect and values as above when the value is a string.
  - link
    A boolean value that when true enables link time (whole program) optimizations. When false, or not present, disables link time optimizations.
  An optimization value of a string is equivalent to the following object specifications:
  - off - { "compile": "off", "link": false }
  - minimal - { "compile": "minimal" }
  - safe - { "compile": "safe" }
  - speed - { "compile": "speed" }
  - space - { "compile": "space" }
  - debug - { "compile": "debug" }
- Merge Semantics
  The fields in the object value replace existing optimization fields. For a single string value the equivalent object for that value is merged.
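The string-to-object equivalence and the field-wise merge can be sketched as below. The function names are hypothetical, and rejecting unknown levels with an error is an assumption of this sketch rather than something the proposal requires.

```python
LEVELS = {"off", "minimal", "safe", "speed", "space", "debug"}


def expand_optimization(value):
    """Expand the string shorthand into its equivalent object form."""
    if isinstance(value, str):
        if value not in LEVELS:
            raise ValueError(f"unknown optimization level: {value!r}")
        expanded = {"compile": value}
        if value == "off":
            # Only "off" also carries an explicit link field.
            expanded["link"] = False
        return expanded
    if isinstance(value, dict):
        return dict(value)
    raise TypeError(f"unsupported optimization value: {value!r}")


def merge_optimization(existing, new):
    """Fields of the new value replace the matching existing fields."""
    merged = expand_optimization(existing)
    merged.update(expand_optimization(new))
    return merged


print(merge_optimization({"compile": "safe", "link": True}, "speed"))
# {'compile': 'speed', 'link': True}
print(merge_optimization({"compile": "safe", "link": True}, "off"))
# {'compile': 'off', 'link': False}
```

Note how merging "speed" leaves an existing link setting untouched, while merging "off" replaces it, because only the off equivalent includes a link field.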
5.4.9. Debug
Tool | Name | Semantics |
---|---|---|
MSVC | | Specifies the type of debugging information to generate. |
GCC | | Produce debugging information. |
CMake | | Generates build descriptions with debug building. |
B2 | | Enable ( |
- Key
  Use std.debug or shortened debug.
- Value
  The option is a single boolean or an object.
- Semantics
  When the value is true, enables generation of debug information. When the value is false, disables generation of debug information.
  {
    "debug": true
  }
- Merge Semantics
  The debug value in this specification replaces an existing debug value.
5.4.10. Vendor
We recognize that std options will never be sufficient, or practical, to delineate all possible functionality. To accommodate the flexibility needed over time to support all build capabilities we need to allow for tools to define their own options outside of the standard. While it is possible for tools to use scoped keys to specify their own options, that method may be harder to manage for some environments. To allow for easier destructuring we introduce a vendor option.
- Key
  Use std.vendor or shortened vendor.
- Value
  The option contains a single object with tool defined keys and values. The names of the keys, the types of the values, and the semantics are not specified here. It is up to the tool creators to coordinate on unique keys.
- Semantics
  The value in the vendor specific fields is interpreted per the tool requirements. Any number of vendor keys+values is allowed. Tools are not restricted in what they support: either their own keys+values, or the keys+values of other vendor tools. This allows for some level of interchange for tools that need to support some understanding of what other tools specify. For example, static analyzers often need to digest sources across different vendors.
  {
    "debug": {
      "enable": true,
      "vendor": {
        "gcc": {
          "compressed": true
        }
      }
    },
    "optimization": {
      "compile": "safe",
      "link": true,
      "vendor": {
        "msvc": {
          "global_data": true
        }
      }
    },
    "vendor": {
      "msvc": {
        "manifest": {
          "source": "b2.exe.manifest",
          "embed": true
        },
        "subsystem": "console"
      }
    }
  }
- Merge Semantics
  The semantics are up to the vendor to specify for the individual options they define. Above that, if a vendor key in this specification is not present in the existing specification it is added. Otherwise, a key in the vendor object of this specification that is not present in the existing specification is added.
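As a sketch of the destructuring that the vendor option is meant to ease, a tool could pick out only the vendor section it understands and ignore the rest. The function name `vendor_options` is hypothetical, and the sketch only looks at the top-level vendor object, not at vendor objects nested inside other options.

```python
def vendor_options(spec, vendor):
    """Return the options for one vendor key, or an empty dict if absent."""
    # Accept either the full key or the shortened key.
    section = spec.get("std.vendor", spec.get("vendor", {}))
    return dict(section.get(vendor, {}))


spec = {
    "source": "hello.cpp",
    "output": {"hello": "exec"},
    "vendor": {"msvc": {"subsystem": "console"}},
}
print(vendor_options(spec, "msvc"))  # {'subsystem': 'console'}
print(vendor_options(spec, "gcc"))   # {}
```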
6. License
This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.