1. Changelog
1.1. Revision 7 - June 27th, 2022
-
Rename is_empty to if_empty, per C++ Standards Committee request.
-
Make sure prefix and suffix are optional alongside if_empty and to be voted on separately.
-
Add an example for if_empty with limit(0).
-
Make sure it is a constraint violation to encounter unrecognized embed parameters: move specification of preprocessor parameters to their own thing in the frontmatter of the preprocessor text entirely.
-
Explain the decisions for § 4.2.1 Parameters. These do not represent a change in the actual design, just a clarification requested by WG21 (the intent was to produce warnings/errors on unrecognized parameters, which this now allows by specifying it as a constraint rather than leaving it up to attribute wording).
-
We do not mention the word "attribute" or use "attribute"-based tokens at all here. We use different grammar terms to fully disambiguate between using attributes versus preprocessor parameters.
-
Explain some of the poll results from WG21 (same explanation as found in WG21 in the C++ version of this paper), in § 2.1 June 2022 Virtual C++ Meeting.
-
Add some explanations and demonstrations for __has_embed versus suffix/prefix/if_empty-style of parameters in § 6.1 __has_embed(…) == 2, suffix/prefix/if_empty parameters, both, or neither?.
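As a rough illustration of the if_empty with limit(0) combination mentioned in this revision, the sketch below embeds the current source file with a limit of zero elements, so the resource counts as empty and the if_empty tokens are used instead. The names EMPTY_MARKER, SKETCH_USES_EMBED, limited, limited_size, and limited_first are hypothetical and exist only for this sketch; the fallback branch is an assumption added so the code still builds on compilers without #embed support.

```c
#include <stddef.h>

/* Hypothetical marker value, chosen only for this sketch. */
#define EMPTY_MARKER 0xEE

/* Only use #embed when the implementation reports the resource as
   "found, but empty" (2) for these exact parameters. */
#if defined(__has_embed)
#  if __has_embed(__FILE__ limit(0) if_empty(EMPTY_MARKER)) == 2
#    define SKETCH_USES_EMBED 1
#  endif
#endif

static const unsigned char limited[] = {
#ifdef SKETCH_USES_EMBED
    /* limit(0) makes the resource empty, so if_empty supplies the tokens. */
#  embed __FILE__ limit(0) if_empty(EMPTY_MARKER)
#else
    /* Fallback for implementations without #embed: same single element. */
    EMPTY_MARKER
#endif
};

/* Number of elements in the array. */
size_t limited_size(void) { return sizeof limited; }

/* First (and only) element of the array. */
unsigned limited_first(void) { return limited[0]; }
```

In both branches the array holds exactly one element, the marker, which is the behavior the if_empty parameter is meant to provide for empty resources.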
1.2. Revision 6 - June 17th, 2022
-
Editorial changes were made to the paper. These changes are non-consequential:
-
Moved is_empty examples to the proper OPTIONAL section.
-
Moved prefix examples to the embed parameter wording sub-clause.
-
-
Added a history section explaining some of the pre-proposal steps this proposal went through to reach the form it is in today in § 6.2 Why a Preprocessor Directive, Specifically?.
-
Add the letter of support sent in to the C Standards and Shepherd’s Oasis, LLC in § 3.3 Support.
1.3. Revision 5 - April 12th, 2022
-
Additional syntax changes based on feedback from Joseph Myers, Hubert Tong, and users.
-
Minor wording tweaks and typo clean up.
-
An implementation is available in Godbolt (it has been available since the last revision as well, and is noted below).
-
The paper’s source code has been refactored:
-
Separated WG21 paper from WG14 paper.
-
Core paper together (rationale, reasoning), included in both C and C++ papers since rationale is identical.
-
-
Changed __has_embed to match feedback from last standards meeting, namely that an empty resource returns 2 instead of 1 (but both decay to a truthy value during preprocessor conditional inclusion expressions). Modified by the wording and the prose in § 4.4 __has_embed.
-
As a reaction to this, the is_empty embed parameter is an optional part of the proposal, as explained in § 4.2.1.3 Empty Signifier. This change did affect a user in an impactful manner; the new functionality is fine, but has some downsides w.r.t. "repeating yourself".
-
-
The wording for the limit parameter (in the embed parameter sub-clauses) adjusted to perform macro expansion, at least once. Exact wording may need help.
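The __has_embed change described in this revision can be sketched as follows. This is a minimal illustration, assuming only what the text states: 2 for a found-but-empty resource, 1 for a found resource with content, and 0 otherwise. The macro SELF_STATUS and the function self_embed_status are hypothetical names, and the resource checked is simply the current source file.

```c
#include <stddef.h>

/* SELF_STATUS is a hypothetical name used only in this sketch. */
#if defined(__has_embed)
#  if __has_embed(__FILE__) == 2
     /* Found, but the resource has no elements. */
#    define SELF_STATUS "found, but empty"
#  elif __has_embed(__FILE__)
     /* Found with content; 1 and 2 are both truthy in #if. */
#    define SELF_STATUS "found, with content"
#  else
#    define SELF_STATUS "not found"
#  endif
#else
   /* Assumption for portability: report lack of support instead of failing. */
#  define SELF_STATUS "__has_embed unsupported"
#endif

/* Returns a human-readable description of the check above. */
const char *self_embed_status(void) { return SELF_STATUS; }
```

Because both 1 and 2 are truthy, code that only cares about presence can write a plain `#if __has_embed(...)`, while code that must special-case emptiness compares against 2 explicitly.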
1.4. Revision 4 - February 7th, 2022
-
Clean up syntax.
-
Reimplement and deploy extension in Clang to ensure an implementation of named parameters works.
-
Change wording to encapsulate the new fixes.
-
Removed C++ wording to focus on C wording for this document.
1.5. Revision 3 - May 15th, 2021
-
Added post C meeting fixes to prepare for hopeful success next meeting.
-
Added 2 more examples to C and C++ wording.
-
Vastly improved wording and reduced ambiguities in syntax and semantics.
-
Fixed various wording issues.
1.6. Revision 2 - October 25th, 2020
-
Added post C++ meeting notes and discussion.
-
Removed type or bit specifications from the #embed directive.
-
Moved "Type Flexibility" section and related notes to the Appendix as they are now unpursued.
1.7. Revision 1 - April 10th, 2020
-
Added post C meeting notes and discussion.
-
Added discussion of potential endianness.
-
Improved wording section at the end to be more detailed in handling preprocessor (which does not understand types).
1.8. Revision 0 - January 5th, 2020
-
Initial release! 🎉
2. Polls & Votes
The votes for the C Committee are as follows:
-
Y: Ye
-
N: Nay
-
A: Abstain
The votes for the C++ Committee are as follows:
-
SF: Strongly in Favor
-
F: In Favor
-
N: Neutral
-
A: Against
-
SA: Strongly Against
2.1. June 2022 Virtual C++ Meeting
"EWG encourages P1967 to define the form of vendor extensions as parameters to #embed?"
SF | F | N | A | SA |
---|---|---|---|---|
4 | 4 | 3 | 1 | 0 |
This was the result of consensus. The extensive discussion also made it clear that unrecognized embed parameters, because they change how an initializer may be formed, must be considered ill-formed. Users may get around this by using __has_embed. To dispel the notion that they may be optional, frontmatter wording was added to cover this case.
Part of the discussion during this meeting was also whether or not the case for emptiness was useful. We moved the empty-based parameters to OPTIONAL pieces of wording, and expect to forward each of these on independent votes aside from the base proposal. This captures the sentiment of folks who may not have spoken up a lot during the meeting but nevertheless felt uneasy: we can simply go with whatever the poll says next meeting.
We took the feedback to rename is_empty to if_empty, since it is a better name for a "do-something-if-predicate-is-true" style attribute.
2.2. January/February 2022 C Meeting
"Does WG14 want the embed parameter specification as shown in N2898?"
Y | N | A |
---|---|---|
12 | 2 | 8 |
From the January/February 2022 Meeting Minutes, Summary of Decisions:
WG14 wants the embed parameter specification as shown in N2898.
We interpret this as consensus. We keep the parameters but make the one that folks were questioning (is_empty) optional in response to the feedback during and after the meeting.
2.3. December 2020 Virtual C Meeting
"Do we want to allow #embed to appear in any context that is different from an initialization of a character array?"
Y | N | A |
---|---|---|
5 | 8 | 6 |
"Leaning in the direction of no but not clear." The paper author after consideration chose to keep this as-is right now. Discussion of the feature meant that trying to ban this from different contexts meant that a naïve, separated-preprocessor implementation would be banned and it would require special compiler magic to diagnose. Others pointed out that just trying to leave it "unspecified whether it works outside of the initialization of an array or not" is very dangerous to portability. The author agrees with this assessment and therefore will leave it as-is. The goal of this feature is to enable implementers to use the magic if they so choose, as an implementation detail and a Quality of Implementation selling point. Vendors who provide a simple expansion may not see improvements to throughput and speed of translation but that is their choice as an implementer. Therefore, we cannot do anything which would require them or any preprocessor implementation to traffic in magic directives unless they want to.
2.4. April 2020 Virtual C Meeting
"We want to have a proper preprocessor #embed over a #pragma-based directive."
This had UNANIMOUS CONSENT to pursue a proper preprocessor directive and NOT use the #pragma syntax. It is noted that the author deems this to be the best decision!
The following poll was later superseded in the C and C++ Committees.
"We want to specify embed as using
rather than
." (2-way poll.)
Y | N | A |
---|---|---|
10 | 2 | 3 |
-
Y: 10 bits-per-element (Ye)
-
N: 2 type-based (Nay)
-
A: 4 Abstain (Abstain)
This poll will be a bit harder to accommodate properly. Using a
that produces a numeric constant means that the max-length specifier is now ambiguous. The syntax of the directive may need to change to accommodate further exploration.
3. Introduction
For well over 40 years, people have been trying to plant data into executables for varying reasons. Whether it is to provide a base image with which to flash hardware in a hard reset, icons that get packaged with an application, or scripts that are intrinsically tied to the program at compilation time, there has always been a strong need to couple and ship binary data with an application.
Neither C nor C++ makes this easy for users to do, resulting in many individuals reaching for utilities such as xxd, writing python scripts, or engaging in highly platform-specific linker calls to set up extern variables pointing at their data. Each of these approaches comes with benefits and drawbacks. For example, while working with the linker directly allows injection of very large amounts of data (5 MB and upwards), it does not allow accessing that data at any other point except runtime. Conversely, doing all of these things portably across systems and additionally maintaining the dependencies of all these resources and files in build systems both like and unlike make is a tedious task.
Thusly, we propose a new preprocessor directive whose sole purpose is to be #include, but for binary data: #embed.
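To make the idea concrete, here is a minimal sketch of the directive initializing a byte array. It embeds the current source file so that no extra resource is needed. The names HAVE_SELF_EMBED, self_bytes, and self_size are hypothetical, and the fallback branch is an assumption added only so the sketch builds on implementations that do not yet provide #embed.

```c
#include <stddef.h>

/* HAVE_SELF_EMBED is a hypothetical macro for this sketch: it is set only
   when the implementation supports #embed and can find this source file. */
#if defined(__has_embed)
#  if __has_embed(__FILE__)
#    define HAVE_SELF_EMBED 1
#  endif
#endif

static const unsigned char self_bytes[] = {
#ifdef HAVE_SELF_EMBED
    /* Expands to a comma-delimited list of integer constants,
       one per byte of the resource. */
#  embed __FILE__
#else
    /* Fallback stand-in data, used only when #embed is unavailable. */
    0x23, 0x65, 0x6d, 0x62, 0x65, 0x64
#endif
};

/* Size of the embedded data, known at translation time. */
size_t self_size(void) { return sizeof self_bytes; }
```

Unlike the linker-based approaches, the data here is an ordinary array whose contents and size are visible to the compiler during translation.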
3.1. Motivation
The reason this needs a new language feature is simple: current source-level encodings of "producing binary" to the compiler are incredibly inefficient both ergonomically and mechanically. Creating a brace-delimited list of numbers in C comes with baggage in the form of how numbers and lists are formatted. C’s preprocessor and the forcing of tokenization also forces an unavoidable cost to lexer and parser handling of values.
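The formatting baggage can be sketched with a small generator that turns bytes into the text of a brace-delimited initializer, in the spirit of what external tools emit. This is an illustrative sketch, not the output format of any particular tool; format_initializer_list and initializer_token_count are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats n bytes as "0x00, 0xff, ..." into out (capacity cap, always
   NUL-terminated when cap > 0). Returns the number of characters the
   full text needs, excluding the terminating NUL. */
size_t format_initializer_list(const unsigned char *data, size_t n,
                               char *out, size_t cap) {
    size_t used = 0;
    assert(data != NULL || n == 0);
    if (cap > 0) {
        out[0] = '\0';
    }
    for (size_t i = 0; i < n; ++i) {
        /* Remaining room in the buffer; 0 once the text no longer fits. */
        size_t room = used < cap ? cap - used : 0;
        int written = snprintf(room ? out + used : NULL, room, "%s0x%02x",
                               i ? ", " : "", (unsigned)data[i]);
        used += (size_t)written;
    }
    return used;
}

/* Preprocessing tokens in the list: one number per byte plus the
   commas between them. */
size_t initializer_token_count(size_t n) {
    return n ? 2 * n - 1 : 0;
}
```

Each source byte becomes roughly six characters of text and two tokens, all of which the compiler must lex and parse before it can recover the original byte values.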
Therefore, using arrays with specific initialized values of any significant size becomes borderline impossible. One would think this old problem would be work-around-able in a succinct manner. Given how old this desire is (that comp.std.c thread is not even the oldest recorded feature request), proper solutions would have arisen. Unfortunately, that could not be farther from the truth. Even the compilers themselves suffer build time and memory usage degradation, as contributors to the LLVM compiler ran the gamut of the biggest problems that motivate this proposal in a matter of a week or two earlier this very year. Luke is not alone in his frustrations: developers all over suffer from the inability to include binary in their program quickly and perform exceptional gymnastics to get around the compiler’s inability to handle these cases.
C developer progress is impeded by the inability to handle this use case, and it leaves both old and new programmers wanting.
Finally, Microsoft has an ABI problem with its maximum string literal size that cannot be solved using string literals or anything treated like string literals, as the LLVM thread and the thread from Claire Xen make clear. It has also frustrated both C and C++ programmers alike, despite their best efforts. It was so frustrating that even extended-C-and-C++-compilers, like Circle, solve this problem with custom directives.
3.2. But How Expensive Is This?
Many different options as opposed to this proposal were seriously evaluated. Implementations were attempted in at least 2 production-use compilers, and more in private. To give an idea of usage and size, here are results for various compilers on a machine with the following specification:
-
Intel Core i7 @ 2.60 GHz
-
24.0 GB RAM
-
Debian Sid or Windows 10
-
Method: Execute command hundreds of times, stare extremely hard at htop/Task Manager
While the time and Measure-Command utilities work well for getting accurate timing information and can be run several times in a loop to produce a good average value, tracking memory consumption without intrusive efforts was much harder and thusly relied on OS reporting with fixed-interval probes. Memory usage is therefore approximate and may not represent the actual maximum of consumed memory. All of these are using the latest compiler built from source if available, or the latest technology preview if available. Optimizations at
(GCC & Clang style)/
(MSVC style) or equivalent were employed to generate the final executable.
3.2.1. Speed
Strategy | 40 kilobytes | 400 kilobytes | 4 megabytes | 40 megabytes |
---|---|---|---|---|
#embed GCC | 0.236 s | 0.231 s | 0.300 s | 1.069 s |
xxd-generated GCC | 0.406 s | 2.135 s | 23.567 s | 225.290 s |
xxd-generated Clang | 0.366 s | 1.063 s | 8.309 s | 83.250 s |
xxd-generated MSVC | 0.552 s | 3.806 s | 52.397 s | Out of Memory |
3.2.2. Memory Size
Strategy | 40 kilobytes | 400 kilobytes | 4 megabytes | 40 megabytes |
---|---|---|---|---|
#embed GCC | 17.26 MB | 17.96 MB | 53.42 MB | 341.72 MB |
xxd-generated GCC | 24.85 MB | 134.34 MB | 1,347.00 MB | 12,622.00 MB |
xxd-generated Clang | 41.83 MB | 103.76 MB | 718.00 MB | 7,116.00 MB |
xxd-generated MSVC | ~48.60 MB | ~477.30 MB | ~5,280.00 MB | Out of Memory |
3.2.3. Analysis
The numbers here are not reassuring that compiler developers can reduce the memory and compilation time burdens with regard to large initializer lists. Furthermore, privately owned compilers and other static analysis tools perform almost exponentially worse here, taking vastly more memory and thrashing CPUs to 100% for several minutes (to sometimes several hours if e.g. the Swap is engaged due to lack of main memory). Every compiler must always consume a certain amount of memory in a relationship directly linear to the number of tokens produced. After that, it is largely implementation-dependent what happens to the data.
The GNU Compiler Collection (GCC) uses a tree representation and has many places where it spawns extra "garbage", as it's called in the various bug reports and work items from implementers. There has been a 16+ year effort on the part of GCC to reduce its memory usage and speed up initializers (C Bug Report and C++ Bug Report). Significant improvements have been made and there is plenty of room for GCC to improve here with respect to compiler and memory size. Somewhat unfortunately, one of the current changes in flight for GCC is the removal of all location information beyond the 256th initializer of large arrays in order to save on space. This technique is not viable for static analysis compilers that promise to recreate source code exactly as was written, and therefore discarding location or token information for large initializers is not a viable cross-implementation strategy.
LLVM’s Clang, on the other hand, is much more optimized. They maintain a much better scaling and ratio but still suffer the pain of their token overhead and Abstract Syntax Tree representation, though to a much lesser degree than GCC. A bug report was filed but talk from two prominent LLVM/Clang developers made it clear that optimizing things any further would require an extremely large refactor of parser internals with a lot of added functionality, with potentially dubious gains. As part of this proposal, the implementation provided does attempt to do some of these optimizations, and follows some of the work done in this post to try and prove memory and file size savings. (The savings in trying to optimize parsing large array literals were "around 10%", compared to the order-of-magnitude gains from #embed and similar techniques.)
Microsoft Visual C (MSVC) scales the worst of all the compilers, even when given the benefit of being on its native operating system. Both Clang and GCC outperform MSVC on Windows 10 or WINE as of the time of writing.
Linker tricks on all platforms perform better with time (though slower than the #embed implementation), but force the data to be optimizer-opaque (even on the most aggressive "Link Time Optimization" or "Whole Program Optimization" modes compilers had). Linker tricks are also exceptionally non-portable: whether it is the
assembly command supported by certain compilers, specific invocations of
/
or others, non-portability plagues their usefulness in writing Cross-Platform C (see Appendix for listing of techniques). This makes C decidedly unlike the "portable assembler" advertised by its proponents (and my Professors and co-workers).
3.3. Support
To say that #embed enjoys broad C Community support is an understatement. In all the years we have written proposals for C and C++, this is the only one where someone physically mailed us a letter - from a different country - directly to the Standards Body to try and make a case for the feature directly, rather than what was already in the paper: