Document #: | ISO/IEC/JTC1/SC22/WG21/P3051R0 |
Date: | 2023-12-11 |
Audience: | SG15 |
Authors: | René Ferdinand Rivera Morell |
Reply-to: | |
Copyright: | Copyright 2023 René Ferdinand Rivera Morell, Creative Commons Attribution 4.0 International License (CC BY 4.0) |
1. Abstract
This proposal aims to define a standard structured response file format that can become the best way to communicate C++ compile commands between tools.
3. Motivation
A key aspect of inter-operation between tools in the ecosystem is having a common language to express tool commands, i.e. in compiler drivers, that can be understood and/or translated between different tools and platforms.
Currently tools use differing, but related, ways for users (and other tools) to specify the set of options to "toolsets" (compiler drivers, linkers, etc). While there are some commonalities in how those options are specified as "configuration response files" containing bare options, there are sufficient differences to hinder general inter-operation.
4. Scope
This proposal aims to specify a method for tools to pass arguments to other tools in a consistent and flexible manner. As such, here is what it does and does not aim to accomplish:
-
It does not aim to remove current argument handling. It does allow for incremental adoption of an alternative that facilitates common tool arguments.
-
It does not specify any particular options to replace existing options (except the ones to indicate the new response file). It does aim to specify an additional alternative option style that reduces the parsing complexity, and perhaps ambiguities, in tools.
5. Current Response Files
Current response files commonly contain an "unstructured" sequence of command line arguments. Some also allow recursive inclusion and expansion of additional response files. Below is a summary of the syntax, capabilities, and restrictions of some compiler drivers.
5.1. Clang
-
Use of @filename argument.
-
Use of --config=filename argument.
-
References to other response files allowed.
Example response file:
# Several options on line
-c --target=x86_64-unknown-linux-gnu
# Long option split between lines
-I/usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/C++/5.4.0
# other config files may be included
@linux.options
Source: Clang Compiler User’s Manual [1]
5.1.1. GNU Compiler Collection, GCC
-
Use of @filename argument.
-
References to other response files allowed.
Example response file:
-o "hello" -Wl,--start-group "bin/hello.o" -Wl,-Bstatic -Wl,-Bdynamic -Wl,--end-group -fPIC -g
Source: GCC Documentation [2]
5.2. Intel® oneAPI DPC++/C++
-
Use of @filename argument.
-
References to other response files disallowed.
-
Platform specific option syntax.
-
# prefixed line comments.
Example response file for "Linux":
# compile with these options
-O0
# end of response file
Example response file for "Windows":
# compile with these options
/Od
# end of response file
Source: Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference [3]
5.3. NVIDIA CUDA Compiler Driver NVCC
-
Use of the --options-file filename,… or -optf filename,… argument.
-O0
Source: NVIDIA CUDA Compiler Driver NVCC Documentation [4]
5.4. Microsoft Visual C++
-
Use of @filename argument.
-
References to other response files disallowed.
Example response file:
"hello.cpp" -c -Fo"bin\hello.obj" -TP /wd4675 /EHs /GR /Zc:throwingNew /Z7 /Od /Ob0 /W3 /MDd /Zc:forScope /Zc:wchar_t /Zc:inline /favor:blend
Source: Microsoft C++, C, and Assembler documentation [5]
5.5. Other
- Edison Design Group C++ Front End (edgcpfe)
-
Does not support response configuration files. [6]
- Embarcadero C++ Builder
-
Supports at least the @filename option, with bare arguments syntax. [7]
- IBM Open XL C/C++ for AIX 17.1.0
-
Supports the Clang [1] --config option.
- IBM Open XL C/C++ for Linux on Power 17.1.1
-
Has migrated to using the Clang [1] toolchain and supports the same options.
- IBM Open XL C/C++ and XL C/C++ for z/OS
-
Supports the Clang [1] --config option.
- HPE Cray Programming Environment (CPE)
-
Support depends on the platform compiler environment.
- NVIDIA HPC C++ (NVC++)
-
Does not support response configuration files. [8]
- Oracle® Developer Studio 12.6
-
Supports a single global options configuration file, with bare arguments syntax.
- Python argparse module
-
Supports response files introduced by an arbitrary prefix character. By default, the files contain one argument per line. [9]
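As a small illustration of the last entry, the following sketch uses the fromfile_prefix_chars parameter of argparse.ArgumentParser to expand an @file argument. The option names (-O, -c), the file name hello.rsp, and the helper parse_with_response_file are assumptions made only for this example.

```python
import argparse
import os
import tempfile


def parse_with_response_file(argv):
    # fromfile_prefix_chars makes argparse expand "@file" arguments by
    # reading one argument per line from the named file.
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("-O", dest="opt_level")
    parser.add_argument("-c", dest="compile_only", action="store_true")
    return parser.parse_args(argv)


# Write a throwaway response file with one argument per line.
with tempfile.TemporaryDirectory() as tmp:
    rsp_path = os.path.join(tmp, "hello.rsp")
    with open(rsp_path, "w") as f:
        f.write("-O0\n-c\n")
    args = parse_with_response_file(["@" + rsp_path])

print(args.opt_level, args.compile_only)
```

The file content is treated exactly as if its lines had been typed on the command line, which is the "unstructured" behaviour the compiler drivers above share.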
6. Design
Abstractly, response files are files in operating system storage that contain arguments and options in addition to the ones given directly in the tool invocation. For this design we refer to two different ways to pass the information to the tools:
- Arguments
-
Arguments use the syntax that one would specify directly in the command line as a user. This would be things like the -O1 optimization flag argument.
- Options
-
Options are the conceptual flag options that the tool understands. They do not necessarily follow the same syntax as the flags specified in command line arguments.
Using those two distinct definitions allows us to specify them differently in the response file. Using arguments, we follow the existing command line syntax, which keeps a form of compatibility with existing practice but restricts us to that syntax. Using options, we can use a definition that fits best with a structured data definition.
The last consideration is the choice of structured data format for the response files. Keeping with previous work and practice we will use JSON text as that format. [10] [11]
With that context, here are two example structured response files:
Simple Arguments | Structured Options |
---|---|
|
|
The simple arguments example shows specifying an "arguments" key with an array of values corresponding to the regular command line arguments. This mirrors what one would see in a JSON compilation database [11]. This style has some advantages:
-
There is a direct correlation with the JSON compilation database format, which some tools support. Those tools already have code to load and interpret the JSON.
-
There’s a direct mapping for regular command line arguments. Hence it will be less effort to support this style for tools.
And there are some disadvantages:
-
The arguments still have to be parsed to get at the option and value.
-
It is subject to the same limitations as regular command line arguments, such as the complexity and ambiguities of command line syntax.
The structured options example shows an "options" key with an array of option names or option objects, where each option object contains an option name and a structured value. Some advantages of this style of structured data are:
-
The option names do not indicate a particular option prefix (i.e. -, --, /, etc.), making it possible to use tool-agnostic common names.
-
The ability to use arrays, or possibly objects, for the option values allows for logical groupings and avoids the extra tracking of such groupings that command line parsing of options requires.
Some possible disadvantages:
-
Tools will need an alternative path to understand the new options. Although hopefully this is balanced by the more direct availability of the values.
-
If this style is also to be supported in the JSON compile database format it means more work to accomplish that. But again, the hope is that there is an easier mapping from internal structures to this format.
One additional aspect of how the arguments and options are specified is that they allow for a simple transformation between them if needed, although transforming options to arguments is easier than the converse.
The design we are proposing has the following key points:
-
The format of the file is well formed JSON text.
-
The top level of that is a JSON object with one of, or both, the arguments and options fields.
-
The addition of one command line option to specify the structured response file.
-
The arguments field has an array value with string values.
-
The arguments values are single strings containing the same options as would be specified in the command line.
-
The arguments values can contain the option for other structured response files, which will be recursively inserted at the location of the option.
-
The options field has an array of values for structured options.
-
The options values can be either a single option name (for flag options) or an object.
-
The options values that are an object contain a single field with the base option name (i.e. without an option prefix).
-
For an options value that is an object, the field value can be either a single value or an array of values.
-
The arguments and option names are not specified, and as such are implementation defined.
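The structural key points above can be checked mechanically. The following is a minimal sketch of such a check, not a normative validator. The function name check_response and the wording of the messages are hypothetical, and it assumes option values are strings or arrays of strings.

```python
import json


def check_response(obj):
    """Return a list of problems found in a parsed response file object."""
    if not isinstance(obj, dict):
        return ["top level is not a JSON object"]
    problems = []
    if "arguments" not in obj and "options" not in obj:
        problems.append("neither 'arguments' nor 'options' is present")
    arguments = obj.get("arguments", [])
    if not isinstance(arguments, list):
        problems.append("'arguments' is not an array")
        arguments = []
    for arg in arguments:
        if not isinstance(arg, str):
            problems.append(f"argument is not a string: {arg!r}")
    options = obj.get("options", [])
    if not isinstance(options, list):
        problems.append("'options' is not an array")
        options = []
    for opt in options:
        if isinstance(opt, str):
            continue  # a flag option given by name only
        if not isinstance(opt, dict):
            problems.append(f"option is neither a string nor an object: {opt!r}")
            continue
        for name, value in opt.items():
            values = value if isinstance(value, list) else [value]
            if not all(isinstance(v, str) for v in values):
                problems.append(f"option {name!r} has a non-string value")
    return problems


good = json.loads('{"options": ["fPIC", {"O": "0", "W": ["all", "error"]}]}')
bad = json.loads('{"arguments": ["-c", 1]}')
print(check_response(good))
print(check_response(bad))
```

The check deliberately says nothing about which argument or option names are valid, since those are implementation defined.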
6.1. Command Line
We propose to add a single new command line option as a requirement to implementing this capability:
$ tool --std_rsp=file
Or:
$ tool -std_rsp:file
The std_rsp
command line option, which can be repeated, will read the
indicated file and parse the JSON text contents to configure the tool as needed.
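As a rough sketch of how a tool might pick the repeated option out of its command line, the following separates response file references from ordinary arguments. The pattern is deliberately lenient about the separator inside the option name and about the = or : value separator. That leniency, and the name split_command_line, are assumptions of this sketch and not part of the proposal.

```python
import re

# Lenient pattern: "--name=file" or "-name:file", with "-" or "_" in the name.
STD_RSP = re.compile(r"^--?std[-_]rsp[=:](.+)$")


def split_command_line(argv):
    """Separate response file references from ordinary arguments, keeping order."""
    rsp_files, other = [], []
    for arg in argv:
        m = STD_RSP.match(arg)
        if m:
            rsp_files.append(m.group(1))
        else:
            other.append(arg)
    return rsp_files, other


result = split_command_line(["--std_rsp=common.json", "main.cpp", "--std_rsp=extra.json"])
print(result)
```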
6.2. File Format
The response file is a valid JSON text file with a single JSON object as the root object. There are two mutually exclusive fields to specify the command information: arguments or options.
There are two additional, optional, fields: $schema and version. The $schema field points at the released JSON Schema. [12] The version field indicates the response format version of the file. The version number follows the capability introspection version numbering and semantics.
The arguments
field specifies a single array value of strings. Each string
array entry is a command line argument to be used directly by the tool. The
specific syntax of the arguments is up to the specific tool. For example a
compile invocation for GCC, and compatible compiler front-ends, would look like:
{
"$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_rsp-1.0.0.json",
"version": "1",
"arguments": ["-fPIC", "-O0", "-fno-inline", "-Wall", "-Werror", "-g", "-I\"util/include\"", "-c" ]
}
You can also include --std_rsp=file
options in the list of arguments to
include the arguments that are referenced in another response file, and so on.
For example, given a common.json
response file as such:
{
"$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_rsp-1.0.0.json",
"version": "1",
"arguments": ["-fPIC", "-O0", "-fno-inline", "-Wall", "-Werror", "-g", "-I\"util/include\"", "-c" ]
}
One can refer to it in a main response file that compiles a C++ source file:
{
"$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_rsp-1.0.0.json",
"version": "1",
"arguments": [ "--std_rsp=common.json", "main.cpp", "-o", "main.o" ]
}
The effect is that the arguments in common.json are inserted in the arguments array at the location of the --std_rsp=common.json argument.
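The in-place insertion can be sketched as a small recursive expansion. To keep the example self-contained, the response files are held in an in-memory dictionary (FILES) with a shortened common.json instead of being read from disk. The cycle check is an extra safeguard assumed here and is not something the proposal specifies.

```python
# Stand-in for files on disk: file name -> parsed JSON object.
FILES = {
    "common.json": {"arguments": ["-fPIC", "-O0", "-c"]},
    "main.json": {"arguments": ["--std_rsp=common.json", "main.cpp", "-o", "main.o"]},
}

PREFIX = "--std_rsp="


def expand(name, seen=()):
    """Return the flat argument list for the response file called name."""
    if name in seen:
        raise ValueError(f"response file includes itself: {name}")
    result = []
    for arg in FILES[name]["arguments"]:
        if arg.startswith(PREFIX):
            # Splice the referenced file's arguments in at this position.
            result.extend(expand(arg[len(PREFIX):], seen + (name,)))
        else:
            result.append(arg)
    return result


print(expand("main.json"))
```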
The options
field specifies a single array value of options values. An
option value can be either a single string or a JSON object (option object).
The option object contains fields for each option to be used with each having
a value that is a single string or an array of strings. For example:
{
"$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_rsp-1.0.0.json",
"version": "1",
"options": [
"fPIC",
{ "O": "0",
"W": [ "all", "error" ],
"I": [ "util/include" ] },
"fno-inline",
"g",
"c"
]
}
When compared to the arguments field, there are more constraints on the syntax of the options:
-
The option names do not contain prefix characters (i.e. --, -, /, etc.) or the value separator (i.e. =, :, etc.).
-
Flags, i.e. options without a value, must be specified as a single string in the options array.
-
Options with a value (i.e. a --opt=value command line argument) must be specified as part of an option object, i.e. as { "opt": "value" }.
-
Options that can be specified multiple times can be specified as multiple separate entries in the options array, or specified once with the multiple values given as an array, i.e. as { "opt": [ "value0", "value1" ] }.
But like the arguments
field, additional response files can be inserted at the
indicated location by specifying a { "std_rsp": "file.json" }
option.
Note that even though the option names have a specific naming format, they are still defined by the tool. The goal of having the restrictions on the option names is to make it possible in the future to specify tool agnostic options to facilitate general interop. But that is a subject for future proposals.
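One way to see that the separate-entry and array forms of a repeated option are equivalent is to flatten both into the same sequence of name and value pairs. The helper name flatten_options is hypothetical, and the sketch assumes the order of fields inside an option object is significant.

```python
def flatten_options(options):
    """Turn an options array into an ordered list of (name, value) pairs."""
    pairs = []
    for entry in options:
        if isinstance(entry, str):
            # A flag has a name but no value.
            pairs.append((entry, None))
            continue
        for name, value in entry.items():
            values = value if isinstance(value, list) else [value]
            pairs.extend((name, v) for v in values)
    return pairs


separate = ["g", {"W": "all"}, {"W": "error"}]
grouped = ["g", {"W": ["all", "error"]}]
print(flatten_options(separate) == flatten_options(grouped))
```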
‼
|
Tools need to support referring to an options style response file from an arguments style response file, and conversely.
|
6.3. Flags or Names
There is a question as to whether it’s better to use command line flags (for example W, o, I, etc.) or non-command line names (for example warning, output, include, etc.) in the options field. We will call the former the "Flags" choice, and the latter the "Names" choice. Each would mean:
- Flags
-
The keys would correlate directly to the command line options of the specific tool, i.e. compiler. For example an I field name would match the -I command line option.
- Names
-
The keys would be symbolic names correlating to a concept that may map to one or more current command line options. For example an include field name would map to one or more -I command line options. But it would also map to one or more new --include command line options.
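To make the distinction concrete, the following sketch rewrites a "Names" style options array into a "Flags" style one. The mapping table is hypothetical and contains only the three name and flag pairs used as examples in this section. A real mapping would be tool specific and is out of scope for this proposal.

```python
# Hypothetical name-to-flag table, built only from the pairs named in the text.
NAME_TO_FLAG = {"warning": "W", "output": "o", "include": "I"}


def names_to_flags(options):
    """Rewrite a "Names" style options array into a "Flags" style one."""
    result = []
    for entry in options:
        if isinstance(entry, str):
            # Unknown names pass through unchanged.
            result.append(NAME_TO_FLAG.get(entry, entry))
        else:
            result.append({NAME_TO_FLAG.get(k, k): v for k, v in entry.items()})
    return result


flags = names_to_flags([{"include": ["util/include"], "warning": ["all", "error"]}, "g"])
print(flags)
```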
Given those definitions we can consider the pros and cons of each method.
Flags | Names | |
---|---|---|
Pros |
|
|
Cons |
|
|
ℹ
|
One key consideration is that this proposal does not prevent choosing
either flags or names as the options fields. Specifying the fields can be
accomplished in a further proposal that specifies either flags or names as
common specified syntax.
|
There is a set of possible design questions for this proposal that come to mind:
-
Specify options fields as "Flags" or "Names"?
-
Specify the options in this proposal or a separate proposal?
7. Questions
- Why use a new option (--std_rsp=file) instead of existing response file methods like @file?
-
Implementing support in tools by reusing the existing response file options would:
-
Mean that the option to add the structured response files would vary from tool to tool, as some use different styles for specifying the file.
-
Be harder to implement, as it would require inspecting the file content to determine the parsing method needed.
- Why have an arguments field instead of just using the options field?
-
Having an arguments field has a couple of benefits:
-
It makes it easier for tools to immediately support this format, as they can directly inject the arguments into their existing command line argument parsing.
-
It makes it easier for tools that already support compile_commands.json to produce or consume structured response files, as the format of the arguments field is the same in both.
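The second benefit can be sketched as a conversion from one compilation database entry to an arguments style response object. The entry below is made up for illustration, and the helper name entry_to_response is hypothetical. The sketch assumes the entry uses the "arguments" form rather than the "command" string form.

```python
import json


def entry_to_response(entry):
    """Build an arguments style response object from one compilation database entry."""
    # The first element of "arguments" is the compiler executable, which a
    # response file does not carry, so it is dropped here.
    return {"arguments": list(entry["arguments"][1:])}


entry = {
    "directory": "/home/user/project",
    "file": "main.cpp",
    "arguments": ["g++", "-fPIC", "-O0", "-c", "main.cpp", "-o", "main.o"],
}
response = entry_to_response(entry)
print(json.dumps(response))
```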
8. Implementation Experience
Here is a simple implementation, as a Python 3 script, that accepts any number of arguments, translates any referenced structured response files, and returns a list of arguments to pass to g++. Note that it is not very smart, as it makes various generalizations about option syntax for GCC. But it does show the simplicity of a possible minimal integration:
#!/usr/bin/env python3

import glob
import json
import re
import sys

# Accumulates the final g++ command line arguments, in order.
out_args = []


def add_rsp(filename):
    # Load a structured response file and dispatch on its style.
    with open(filename) as f:
        rsp = json.load(f)
    if 'arguments' in rsp:
        add_rsp_args(rsp)
    elif 'options' in rsp:
        add_rsp_opts(rsp)


def add_rsp_args(rsp):
    for arg in rsp['arguments']:
        # A nested --std_rsp=file argument is expanded in place.
        m = re.match(r'^--std_rsp=(.*)$', arg)
        if m:
            add_rsp(m.group(1))
        else:
            out_args.append(arg)


def add_rsp_opts(rsp):
    for opt in rsp['options']:
        if isinstance(opt, str):
            # A bare string is a flag without a value.
            add_rsp_opt(opt)
        else:
            for (key, value) in opt.items():
                add_rsp_opt(key, value)


def add_rsp_opt(opt, value=None):
    # Normalize a single string value to a one-element list.
    if isinstance(value, str):
        value = [value]
    if opt == "std_rsp":
        for filename in value:
            add_rsp(filename)
    elif value is None:
        out_args.append(opt_as_arg(opt))
    else:
        arg_name = opt_as_arg(opt)
        if opt in ["D", "I", "O", "W"]:
            # GCC options whose value is joined directly, e.g. -O0.
            out_args.extend(arg_name + v for v in value)
        else:
            out_args.extend(arg_name + "=" + v for v in value)


def opt_as_arg(opt):
    # These GCC options take a "--" prefix; all others here take "-".
    if opt in ["help", "target-help", "version", "coverage", "entry",
               "no-sysroot-suffix", "sysroot", "param"]:
        return "--" + opt
    else:
        return "-" + opt


for arg in sys.argv[1:]:
    m = re.match(r'^--std_rsp=(.*)$', arg)
    if m:
        add_rsp(m.group(1))
    else:
        out_args.append(arg)

# Escape glob characters and spaces so the shell passes each argument intact.
out_args = map(glob.escape, out_args)
out_args = map(lambda s: s.replace(' ', '\\ '), out_args)
out_args = ' '.join(out_args)
print(out_args)
Usage:
$ stdrsp.py --std_rsp=example-02.json
-fPIC -O0 -fno-inline -Wall -Werror -g -I"util/include" -c
$ g++ `stdrsp.py --std_rsp=example-02.json` main.cpp
$ ls -1 main.o
main.o