1. Changelog
1.1. Revision 0 - December 10th, 2023
- Initial release. ✨
2. Introduction, Motivation, and Prior Art
defer in C would have avoided probably half of the issues ever caught by the Clang Static Analyzer’s malloc/free and retain/release checking, at least ten years ago when I worked on it.
— Jordan Rose, Formerly Swift @ Apple, Currently Signal, December 1st, 2023
The need to clean up resources, undo partially-successful function invocations, and perform actions upon early return has been a computing need since we started having computers that were capable of calculation. This need intensified with the introduction of resources that work with the boundaries of a system, from sockets and files to memory allocations and parallelism primitives.
We have also had a large variety of failures related to the goto-style of programming. It becomes incredibly precarious to balance such code correctly, and sometimes individuals even opt out of goto entirely and simply repeat necessary cleanup on exit of each scope:
So this isn’t as ... "nice" ... as those "... goto err1; goto err2;" style solutions, this is my own little piece of hell :) It’s the result of retrofitting freeing memory in error situations to an application which used to not care about that because it was a one-shot thing.
— Martin Dørum, with code from: Housecat
As stated by the code author, this is not more robust code. In fact, this sort of idiom — while explicit — has the user repeating themselves multiple times over and over again. Each nested scope, each conditional, is a chance to potentially forget to free an element, or free too many elements too many times.
Conversely, there is the opposite idiom where, in an attempt to reduce the sort of code shown above, the author deploys a series of outside-in, inside-out gotos and labels to handle failure. Other times, there is only one failure path, and sentinel values (such as the null pointer constant) are used so that the cleanup calls effectively become no-ops. But, even a simple set of conditionals with only one goto can be error-prone in conjunction with formatting failures and other issues.
2.1. Language-based Solutions
Language-based solutions are far superior to library-based solutions for this problem. They provide a level of guarantees to make code a lot cleaner than normal.
- GCC/Clang attribute __attribute__((cleanup(…))), which takes a function with a single void * referencing the value passed in;
- GTK’s glib g_autoptr, which works only on Clang/GCC using the language-based __attribute__((cleanup(…))) functionality;
- qemu’s Lockable, which uses the above g_autoptr to build a macro to do proper scoped locks;
- an ObjC defer based on __attribute__((cleanup(…)));
- an Apple Open Source auto-cleanup, based still on __attribute__((cleanup(…)));
- and, MSVC’s try-finally blocks, which offer a more imperative way of doing things rather than the declaration-based attribute.
Even individuals with legitimate technical grievances against C++ speak highly of things such as g_autoptr:
It takes about 10 minutes to get used to the “g_autoptr(foo)” stuff, and then its not a big deal. I’ve basically removed all of my “goto error;” handling.
If you need MSVC support, clearly you shouldn’t be using this yet. My hope is that the clang interop with MSVC will help make that a non-issue before long.
XlC might actually support it with recent versions. They are pretty good about tracking GCC frontend features. I doubt suncc will get to it though.
This is not the first time programmers have moved towards this sort of solution. However, doing so often came at the cost of needing to leave the C language, or — as done above — embracing a potentially high degree of non-portability to achieve the goals. As a Standards Committee, it would seem prudent to produce a viable implementation of this that is capable of satisfying all of C’s stakeholders without adding undue burden to the language.
The only drawback of __attribute__((cleanup(…))) is that it is tied to passing in a single function, and only taking a single void * parameter. It becomes difficult to properly coordinate additional information to the function without figuring out potentially contested data transmission mechanisms between the function put into the cleanup(…) portion of the attribute and the actual run of the function itself.
2.2. Library-based Solutions
The existence of language-based solutions that others have built libraries on top of has not stopped people from creating their own solutions entirely within library-based code. There are both simple and robust examples of this sort of code in the wild, with some of the most readily available being:
- moonchilled’s Defer/Deferrable macros (requires top-of-scope declarations);
- and, Jens Gustedt’s guard and defer macros (an incredibly robust implementation using all sorts of tricks to provide the right behavior).
The details of these libraries, both the simple version in moonchilled’s rendition and the more robust version in Jens Gustedt’s offering, are testaments to the ability of a dedicated individual to work through the conditions of their language to produce remarkable code to solve the engineering problems they are facing. Unfortunately, each has a wide variety of implementation drawbacks:
- adds execution overhead to register each Defer(…) or similarly-named macro into the current scope;
- generally requires a change in the way the program is written (e.g., using defer_return instead of return or defer_break instead of break);
- and, typically has a maximum number of registrations (as the storage has to be pre-allocated at the start of the scope to hold deferral registrations).
Even when the final code compiles down to something fairly efficient, these macros still tend to cost extra stack space or take up additional binary size through (potential) dynamic allocations to hold defers that go beyond a certain limit and need extra space to hold certain kinds of callbacks. This means that most code written using library solutions ends up fairly suboptimal compared to the language-based or preprocessor-based solutions.
3. Design
The design of this feature chooses 4 core tenets upon which it is based:
- compile-time available (no run-time construction or space needed, do not pay for what you do not use);
- binds to the innermost scope (closest);
- can only appear as part of a statement (e.g., at block scope);
- is not a full call stack-unwinding mechanism.
These four tenets are important to cleave close to existing practice and avoid any potential run-time overhead.
By making it a locally-scoped entity, we can have many more statements and much more data accessible to what is the effective equivalent of the cleanup function. This prevents us from fiddling with making larger structures and calling a function, when we can instead have it entirely in-line and improve code motion and optimization opportunities that would not be available otherwise.
Notably, this is not similar to C++'s "RAII" (Resource Acquisition is Initialization) idiom in the specific area that throw will run every object’s destructor up the entire call stack. While C++ implementations with extensions may offer to give the same behavior as a run destructor for defer blocks in their C++-with-extensions mode, C does not need to have any throw-like function or unwinding capabilities at all.
We explain these tenets and our design choices in the following subsections.
3.1. Defer Binding: Scope-Based
defer binds to the scope it was defined in. There is one other choice for defer, which is to bind to the scope of the function call. This is the programming language Go’s choice of defer [go-defer]. Choosing function-based binding for C would be an unmitigated disaster of corner cases and add the potential for needing run-time accumulation (e.g., dynamic allocation) of resources for defer in order to handle defers which appear in loops or other constructs whose execution is only determined at run-time but is still scoped to a compile-time entity like the scope of a function.
To give an example of how quickly this behavior unravels itself, consider the following Go code:
    for i := 0; i < 100000; i++ {
        mutex.Lock()
        defer mutex.Unlock()
        *counter += 1
    }
This code immediately deadlocks. The fix to this in Go is to write the loop like this:
    for i := 0; i < 100000; i++ {
        func() {
            mutex.Lock()
            defer mutex.Unlock()
            *counter += 1
        }()
    }
This is just an extremely long-winded way of having a scope-based defer. It is not all bad for function-based defer: advantages include queueing up a piece of work to be done only if certain conditions are met. E.g., one can place a defer inside of an if statement and then have it run at the end of the function if and only if that if was entered. However, the cost of such behaviors means attempting to shoehorn in a design from Go, which has the backing of a garbage collector and on-demand allocation. The first Go snippet above, if it had not deadlocked, required dynamic allocation in an earlier version of Go. It took significant optimization work to get to a place where this would no longer be the case.
We know for a fact that many C compilers are averse to taking control of hidden dynamically-sized (not necessarily heap) allocations. It can often result in issues in the portability of code to smaller platforms. We also know for a fact that memory is neither free nor cheap in the C programming language; Go can pull this off because it has a run-time that can manage its garbage collector and is generally geared towards high-resource environments (even if it uses those resources efficiently). As a language feature, we cannot prioritize a design which may require an unbounded amount of code to be (potentially) stored in heap space so that it can be run as a callback, with potentially-saved data from each iteration of a loop stored in that construct as well.
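For comparison, scope-based binding can already be approximated in C today with the GCC/Clang __attribute__((cleanup(…))) extension cited earlier. The following is a minimal sketch under that assumption, not the proposed defer syntax; fake_mutex, fake_lock, fake_unlock, unlock_guard, and count_with_scoped_unlock are hypothetical names invented for illustration. The cleanup runs at the end of each loop iteration's scope, so the lock is released before the next iteration tries to take it, and no run-time bookkeeping is needed.

```c
#include <assert.h>

/* Hypothetical stand-in for a real mutex: it only tracks whether it is held. */
typedef struct {
    int held;
    int unlock_count;
} fake_mutex;

static void fake_lock(fake_mutex *m) {
    /* Locking twice without unlocking would deadlock a real mutex. */
    assert(!m->held);
    m->held = 1;
}

static void fake_unlock(fake_mutex *m) {
    m->held = 0;
    m->unlock_count += 1;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void unlock_guard(fake_mutex **guard) {
    fake_unlock(*guard);
}

int count_with_scoped_unlock(fake_mutex *m, int iterations) {
    int counter = 0;
    for (int i = 0; i < iterations; ++i) {
        fake_lock(m);
        /* Runs when the loop body's scope ends, i.e. once per iteration. */
        __attribute__((cleanup(unlock_guard))) fake_mutex *guard = m;
        counter += 1;
    }
    return counter;
}
```

Calling count_with_scoped_unlock with a zero-initialized fake_mutex and 1000 iterations should leave the mutex released, with one unlock recorded per iteration.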
3.2. defer Syntax and Grammar: secondary-block
The syntax of a defer block simply uses:
defer-statement:
    defer secondary-block
A secondary-block is the same grammar term used for e.g. if statements, so all of the typical syntactic constructs (even the ones that look questionable) are allowed. We expect coding guidelines and build-failing tools (e.g., clang-tidy) to enforce conventions that make these more legible and resistant to the usual failure cases.
We also chose the all-lowercase name defer for this feature. This is, technically, a breaking change. We do not mind swapping every instance of defer in this paper for a different spelling; the exact spelling of the introductory keyword is of little consequence to us. We use defer in this paper to draw clear connections to the existing practice in other languages which use a similar keyword, such as the Go programming language. As shown in the existing practice above, defer is also a very common spelling for this feature.
3.3. Reference Captures "by Default"
This proposal does not tie the acceptance of the proposal to the presence of an explicit capturing clause, as was the case in other versions[N2895][N2542]. It simply allows any variables that are visible in the scope of the defer statement to be used like any other named entity. No implicit or invisible copy of the variables is performed: the defer statement refers to those variables in the same way as the rest of the surrounding code. This is the safest and best way to handle the way that this feature works.
The reason this is critically important is due to the potential to double-free. For example, consider the following function call:
    void f() {
        void *p = malloc(meow);
        defer free(p);
        /* … */
        if (some_important_condition) {
            take_ownership_and_use(p);
            p = NULL;
        }
        /* … */
    }
If captures are done by-value (the pointer’s value is copied and held onto for the duration of the code until the function is exited by some means), then this example is a potential double-free. This is an enormous footgun. Copying by-value also introduces a (hidden, semi-uncontrolled) state that will exist on all implementations until optimizers can potentially get rid of the extra copies stored in the defer statement. We note that this is the position Swift tends to take with its closures: values are referred to by-reference if the compiler can prove such accesses are safe, but otherwise decay to by-value copies. However, Swift generally codifies and relies more on its ability to perform certain optimizations, whereas C implementations are allowed to be far weaker in terms of their optimization and translation/evaluation/execution guarantees.
Copying by-value for defer is bad design in general; anything that exists in the same scope and cannot escape said scope (such as statement expressions, defer statements, and otherwise) should always refer to existing variables through their name and at the same location/address. There is no risk of failure here because defer statements are not objects or declarations; they do not occupy a fixed amount of space as a reference-able object, they cannot be passed as arguments, they cannot be copied or put on the heap versus the stack or some other form of storage location. They are simply a form of code (and code organization) like other flow control and code control entities. Treating them like callbacks (e.g., things that can be saved/transported/invoked at an arbitrary point later in time) is antithetical to the feature itself.
For these reasons, defer should refer to existing variables as though they were normal l-values (because they are and they should be). We anticipate that, in a future where Lambdas/Nested Functions/etc. are possible, specific styles of capture can be obtained through their use where it will be explicit and documented neatly by the use of such hypothetical features themselves.
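The ownership-transfer scenario above can be sketched with the existing GCC/Clang __attribute__((cleanup(…))) extension, which also observes the variable by reference (it is handed the variable's address). This is only an approximation of the proposed semantics; release, take_ownership_and_use, run, and free_calls are hypothetical names for illustration. Because the handler reads the current value of p, setting p to NULL after transferring ownership prevents a second free.

```c
#include <stdlib.h>

/* Counts how many non-null pointers the cleanup handler released. */
static int free_calls = 0;

/* Hypothetical cleanup handler: it sees the variable's current value. */
static void release(void **pp) {
    if (*pp != NULL) {
        free(*pp);
        free_calls += 1;
    }
}

/* Hypothetical consumer that takes ownership and frees on its own. */
static void take_ownership_and_use(void *p) {
    free(p);
}

int run(int transfer) {
    free_calls = 0;
    {
        __attribute__((cleanup(release))) void *p = malloc(16);
        if (transfer) {
            take_ownership_and_use(p);
            /* The handler reads p by reference, so it will see NULL. */
            p = NULL;
        }
    }
    return free_calls;
}
```

With a by-value capture, the handler would still hold the original pointer after the transfer and free it a second time; with by-reference access, run(1) performs no free in the handler and run(0) performs exactly one.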
3.4. "Why Does This Not Unwind The Whole Call Stack??"
No C implementation provides a compiler-driven unwinding that we could find, even with __attribute__((cleanup(…))). There is one notable exception, but it requires code to be in "C++ mode" (or have the equivalent of -fexceptions passed to the compiler to enable it in "C mode"). Right now, calling any of:
- exit;
- _Exit;
- quick_exit;
- thrd_exit;
- or, abort;
did not produce any code that ran the cleanup of cleanup-annotated variables, or any other code.
defer works similarly: no stack unwinding or call stack back-travel is done when a function that never returns, and instead hands control back to the host environment, is called.
Note: This is compatible with C++ semantics for a similar C++ feature: constructors and destructors.
It is noteworthy that not even C++ destructors run on the invocation of any of these functions, either. (You can test that assumption here.) They have to use C++-specific facilities in order to get appropriate unwinding behavior. Therefore, there is no precedent, not even from C++, that C or C++ code should appropriately and carefully unwind the stack.
defer, therefore, will not provide this functionality. This makes it cheaper and easier to implement for platforms that do not have an unwinding mechanism, while also following existing practice to the letter. Notably, the "cheapness" and "ease" that will come from the implementation means that at no point will there ever need to be a maintained runtime of unwind scopes or exception handling-alike tables. In fact, no storage of any form of propagation information is necessary for this feature. It simply incentivizes the programming practices currently available to C programs: error codes, structured returns (with error codes embedded), and other testable function outputs in conjunction with better-defined cleanup code.
The one place this does not hold up is thrd_exit. Consider the following code:
    #include <stdlib.h>
    #include <stdio.h>
    #include <threads.h>

    extern void *ep;
    extern void *ep2;
    extern int alternate;

    void cfree(void *userdata) {
        void **pp = (void **)userdata;
        printf("freeing %p !!\n", *pp);
        free(*pp);
    }

    [[gnu::noinline]] void use(void *p) {
        if ((++alternate % 2) == 0)
            ep = p;
        else
            ep2 = p;
    }

    int thread([[maybe_unused]] void *arg) {
        __attribute__((cleanup(cfree))) void *p = malloc(1);
        printf("allocating %p !!\n", p);
        use(p);
        thrd_exit(1);
        return 1;
    }

    int main() {
        __attribute__((cleanup(cfree))) void *p = malloc(1);
        printf("allocating %p !!\n", p);
        int r = 0;
        thrd_t th0 = {};
        thrd_create(&th0, thread, NULL);
        thrd_join(th0, &r);
        use(p);
        exit(0);
        return 0;
    }

    void *ep = 0;
    void *ep2 = 0;
    int alternate = 0;
As of December 1st, 2023 on GCC trunk with the latest libpthreads, this code will print:
    allocating 0xa072a0 !!
    allocating 0x7f8034000b70 !!
    freeing 0x7f8034000b70 !!
with -fexceptions turned on (or built in C++ mode), and
    allocating 0x47e2a0 !!
    allocating 0x7f7e14000b70 !!
with -fexceptions not provided. (See it running and change the flags here.) This indicates that, specifically for thrd_exit and its underlying pthreads-based implementation, the system will deploy a C++-style exception to do unwinding. This is fine for an implementation, and it is a conforming extension to add unwinding on top of C in this manner (to e.g. be more behavior-compatible with C++ or to protect precious thread-based resources).
However, note that even in this example, the memory from main is always leaked, no matter what. This means that even in C++ mode or C mode with -fexceptions specified, exit and similar functions do not provide unwinding capabilities. Implementations should feel free to change or enhance this behavior.
Finally, we note that pretty much everything in MSVC is done by doing stack unwinding with their Structured Exception Handling (SEH) or similar techniques, so for the macros we provide almost every single one will be defined and have the value of
. This includes even
.
3.4.1. What about thread static storage destructors, atexit, etc.?
These features require explicit opt-in from the user in order to do program-specific and thread-specific cleanup in C. (C++, for threads, just relies on its RAII primitives in conjunction with parallelism language features and parallelism primitives). They can be hooked into to register each defer statement’s code with them, providing a form of artisanal, manual unwinding. Some applications that must retain the integrity of their data tend to use these features as a way to perform rollback or as a last-minute way to sanity check assumptions and data.
This proposal does not change anything about the semantics of these functions in any way.
3.4.2. So how does this proposal handle it??
We leave room for a future paper adding conditionally supported, compile-time checkable unwinding semantics to C. That is, we say that any defer D that is reached may or may not run if a non-local jump or program termination occurs. We state that this is implementation-defined. Right now, we provide no macros or other hard-specified behavior on this. This will allow us to write papers immediately after the defer paper to properly define unwinding/stack unwinding and its associated behaviors with non-local jumps, program and thread termination functions, and other similar functionality.
To prepare for such a future, this paper was written to eventually cover such behaviors and document them in a way that a program can react to the presence of unwinding reliably. That paper is here: https://thephd.dev/_vendor/future_cxx/papers/C%20-%20Unwinding.html.
3.5. Compile-time Construct
Due to the nature of the design, all defer blocks can be transformed during translation and require no execution-time coordination or marking. This is imperative to ensure that the feature produces no overhead compared to __attribute__((cleanup(…))) functionality, to __try + __finally with no Structured Exception Handling catching on MSVC, or to manually writing a series of frees in a (potentially deeply-nested) set of ifs. For example, consider this function from real-world code (the linked code from Martin, listed previously in this proposal) that performs a series of nested ifs with a series of frees for cleanup.
    h_err *h_build_plugins(const char *rootdir, h_build_outfiles outfiles, const h_conf *conf) {
        char *pluginsdir = h_util_path_join(rootdir, H_FILE_PLUGINS);
        if (pluginsdir == NULL)
            return h_err_create(H_ERR_ALLOC, NULL);
        char *outpluginsdirphp = h_util_path_join(
            rootdir, H_FILE_OUTPUT "/" H_FILE_OUT_META "/" H_FILE_OUT_PHP);
        if (outpluginsdirphp == NULL) {
            free(pluginsdir);
            return h_err_create(H_ERR_ALLOC, NULL);
        }
        char *outpluginsdirmisc = h_util_path_join(
            rootdir, H_FILE_OUTPUT "/" H_FILE_OUT_META "/" H_FILE_OUT_MISC);
        if (outpluginsdirmisc == NULL) {
            free(pluginsdir);
            free(outpluginsdirphp);
            return h_err_create(H_ERR_ALLOC, NULL);
        }

        //Check status of rootdir/plugins, returning if it doesn’t exist
        {
            int err = h_util_file_err(pluginsdir);
            if (err == ENOENT) {
                free(outpluginsdirphp);
                free(outpluginsdirmisc);
                free(pluginsdir);
                return NULL;
            }
            if (err && err != EEXIST) {
                free(outpluginsdirphp);
                free(outpluginsdirmisc);
                free(pluginsdir);
                return h_err_from_errno(err, pluginsdir);
            }
        }

        //Create dirs if they don’t exist
        if (mkdir(outpluginsdirphp, 0777) == -1 && errno != EEXIST) {
            free(outpluginsdirphp);
            free(outpluginsdirmisc);
            free(pluginsdir);
            return h_err_from_errno(errno, outpluginsdirphp);
        }
        if (mkdir(outpluginsdirmisc, 0777) == -1 && errno != EEXIST) {
            free(outpluginsdirphp);
            free(outpluginsdirmisc);
            free(pluginsdir);
            return h_err_from_errno(errno, outpluginsdirmisc);
        }

        //Loop through plugins, building them
        struct dirent **namelist;
        int n = scandir(pluginsdir, &namelist, NULL, alphasort);
        int i;
        for (i = 0; i < n; ++i) {
            struct dirent *ent = namelist[i];
            if (ent->d_name[0] == '.') {
                free(ent);
                continue;
            }
            char *dirpath = h_util_path_join(pluginsdir, ent->d_name);
            if (dirpath == NULL) {
                free(outpluginsdirphp);
                free(outpluginsdirmisc);
                free(pluginsdir);
                return h_err_create(H_ERR_ALLOC, NULL);
            }
            char *outdirphp = h_util_path_join(outpluginsdirphp, ent->d_name);
            if (outdirphp == NULL) {
                free(dirpath);
                free(outpluginsdirphp);
                free(outpluginsdirmisc);
                free(pluginsdir);
                return h_err_create(H_ERR_ALLOC, NULL);
            }
            char *outdirmisc = h_util_path_join(outpluginsdirmisc, ent->d_name);
            if (outdirmisc == NULL) {
                free(dirpath);
                free(outdirphp);
                free(outpluginsdirphp);
                free(outpluginsdirmisc);
                free(pluginsdir);
                return h_err_create(H_ERR_ALLOC, NULL);
            }
            h_err *err;
            err = build_plugin(dirpath, outdirphp, outdirmisc, outfiles, conf);
            if (err) {
                free(dirpath);
                free(outdirphp);
                free(outdirmisc);
                free(outpluginsdirphp);
                free(outpluginsdirmisc);
                free(pluginsdir);
                return err;
            }
            free(dirpath);
            free(outdirphp);
            free(outdirmisc);
            free(ent);
        }
        free(pluginsdir);
        free(outpluginsdirphp);
        free(outpluginsdirmisc);
        free(namelist);
        return NULL;
    }
This is a fairly small function, clocking in at some 130 lines long. There are, as far as most reviewers can tell, no errors in the creation or deletion of the various kinds of resources (particularly, repeated memory allocations). The exact same code can be restructured as follows.
    h_err *h_build_plugins(const char *rootdir, h_build_outfiles outfiles, const h_conf *conf) {
        char *pluginsdir = h_util_path_join(rootdir, H_FILE_PLUGINS);
        if (pluginsdir == NULL)
            return h_err_create(H_ERR_ALLOC, NULL);
        defer free(pluginsdir);
        char *outpluginsdirphp = h_util_path_join(
            rootdir, H_FILE_OUTPUT "/" H_FILE_OUT_META "/" H_FILE_OUT_PHP);
        if (outpluginsdirphp == NULL) {
            return h_err_create(H_ERR_ALLOC, NULL);
        }
        defer free(outpluginsdirphp);
        char *outpluginsdirmisc = h_util_path_join(
            rootdir, H_FILE_OUTPUT "/" H_FILE_OUT_META "/" H_FILE_OUT_MISC);
        if (outpluginsdirmisc == NULL) {
            return h_err_create(H_ERR_ALLOC, NULL);
        }
        defer free(outpluginsdirmisc);

        //Check status of rootdir/plugins, returning if it doesn’t exist
        {
            int err = h_util_file_err(pluginsdir);
            if (err == ENOENT) {
                return NULL;
            }
            if (err && err != EEXIST) {
                return h_err_from_errno(err, pluginsdir);
            }
        }

        //Create dirs if they don’t exist
        if (mkdir(outpluginsdirphp, 0777) == -1 && errno != EEXIST) {
            return h_err_from_errno(errno, outpluginsdirphp);
        }
        if (mkdir(outpluginsdirmisc, 0777) == -1 && errno != EEXIST) {
            return h_err_from_errno(errno, outpluginsdirmisc);
        }

        //Loop through plugins, building them
        struct dirent **namelist;
        int n = scandir(pluginsdir, &namelist, NULL, alphasort);
        if (n == -1) {
            return h_err_from_errno(errno, namelist);
        }
        defer {
            for (int i = 0; i < n; ++i) {
                free(namelist[i]);
            }
            free(namelist);
        }
        for (int i = 0; i < n; ++i) {
            struct dirent *ent = namelist[i];
            if (ent->d_name[0] == '.') {
                continue;
            }
            char *dirpath = h_util_path_join(pluginsdir, ent->d_name);
            if (dirpath == NULL) {
                return h_err_create(H_ERR_ALLOC, NULL);
            }
            defer free(dirpath);
            char *outdirphp = h_util_path_join(outpluginsdirphp, ent->d_name);
            if (outdirphp == NULL) {
                return h_err_create(H_ERR_ALLOC, NULL);
            }
            defer free(outdirphp);
            char *outdirmisc = h_util_path_join(outpluginsdirmisc, ent->d_name);
            if (outdirmisc == NULL) {
                return h_err_create(H_ERR_ALLOC, NULL);
            }
            defer free(outdirmisc);
            h_err *err;
            err = build_plugin(dirpath, outdirphp, outdirmisc, outfiles, conf);
            if (err) {
                return err;
            }
        }
        return NULL;
    }
All of the special resource cleanup along certain branches is now completely gone. We have shrunk this function by around 14 lines of code. There are no resource leaks in this code. There was a potential resource leak in the first version of the code: if n was six in the code’s for loop and the third iteration failed, it would return but fail to release the rest of the scandir results. We added an additional early exit and check to this, and such a check does not require another list of frees (which makes the differential here even more significant, since we have eliminated even potential future code with defer). When defer is used, additional early returns and checks added to the function later carry no risk of forgetting to free or clean up specific temporary resources. Getting to:
- shrink code by 18% of its original size (when including the differential for the additional checks added here and the frees they do not have to do);
- guarantee that further error checks and error handling do not introduce new forms of vulnerabilities/leaks;
- keep resources in their scope without requiring a refactor of nested ifs OR early returns;
is a fairly good yield for a standard C feature. This sort of high-impact, high-quality refactoring enables better cleanup (as realized by GTK in its earliest iterations of g_autoptr) and resistance to changes-over-time.
3.6. Visibility & Clarity of Code
This feature also achieves something that C users have frequently requested for functionality of this caliber. While C++ destructors "hide" the code behind the destruction of an object, defer leaves that effect clear in the code by requiring that it is placed in the requisite scope. One can simply trace backward from a return or a goto paired with its label and see what actions will be taken by defer.
This is also a bit harder to achieve with __attribute__((cleanup(…)))-style code. Furthermore, because of the way the attribute works, one cannot use the normal and typical free/delete functions that have the usual behavior and are easily understandable. __attribute__((cleanup(…))) passes a pointer to the declared variable. So, writing the following:
    int main() {
        __attribute__((cleanup(free))) void *p = malloc(1);
        return 0;
    }
is incorrect, because it will pass &p (a void ** that is then converted to void *) to the free function. This means this code will compile, link, and run, but attempt to free the address of the stack variable that represents the pointer and not the value of the pointer itself. (Thankfully, GCC will warn about this.) A new function has to be written that takes a void *, then casts it to void **, and then dereferences the void ** to pass it to the free function.
It is not an ideal interface.
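A minimal sketch of such a wrapper follows, assuming a GCC or Clang compiler with the cleanup attribute. The names free_pointee, allocate_and_release, released, and last_freed are hypothetical and exist only so the behavior can be observed; the integer copies of the pointer are kept purely for comparison and are never dereferenced.

```c
#include <stdint.h>
#include <stdlib.h>

static int released = 0;          /* how many times the wrapper ran */
static uintptr_t last_freed = 0;  /* address value that was handed to free */

/* Hypothetical wrapper: the attribute passes the address of the variable. */
static void free_pointee(void *userdata) {
    void **pp = (void **)userdata; /* this is &p, not p */
    last_freed = (uintptr_t)*pp;   /* record the value before releasing it */
    free(*pp);                     /* release what p points to */
    released += 1;
}

uintptr_t allocate_and_release(void) {
    uintptr_t allocated;
    {
        __attribute__((cleanup(free_pointee))) void *p = malloc(1);
        allocated = (uintptr_t)p;
    } /* free_pointee(&p) runs here */
    return allocated;
}
```

The extra indirection is exactly the boilerplate a block-scoped defer would make unnecessary, since the deferred code could call free on the variable directly.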
3.7. Safety & defer: Preventing Leaks from Human Vulnerability
One of the most important tenets of this feature is resistance to human fallibility. There are a lot of ways in which a human being who deals with resource handling and required calls may fail to make the explicitly paired calls for creating a resource/entity and releasing a resource/entity. Many vulnerabilities happen because code is restructured into free ladders (shown above to have still leaked some resources from the scandir code, though that code is just a leak and not directly vulnerable) or goto-style code that leaves out necessary actions, e.g. [goto-fail]. One such CVE that highlights human fallibility in the face of necessary control flow is the Linux Kernel’s somewhat recent CVE-2021-3744[cve-2021-3744].
[cve-2021-3744] is not a vulnerability where they forgot to free the data in totality: it was that some of the data was not freed along very specific paths in a crop of sensitive code.
Adding a defer at the allocation of these resources (particularly, line 877 after the tag was successfully init with the DM work area) would make it impossible to forget to release the resources associated with that variable (or to forget setting it to NULL along specific paths where the data was transferred off or taken ownership of).
Note: This is not an indictment of the quality of Linux Kernel source code, but — as this sort of vulnerability has been repeated time and time again over the last two decades — a cautionary tale of how human beings are allowed to be fallible.
To quote Daniel Stenberg, maintainer of curl:
It burns in my soul
Reading the code now it is impossible not to see the bug. Yes, it truly aches having to accept the fact that I did this mistake without noticing and that the flaw then remained undiscovered in code for 1315 days. I apologize. I am but a human.
(Emphasis mine.) We are all human. One of the takeaways that people usually have from this is that we need to put "more eyeballs" or "do better teaching", but this is — again — not the first time this sort of vulnerability happened. For example, the same kind of bug also came from the GnuTLS implementation back in 2014[gnu-tls-bug-analysis]. The moral of the story is not to beat human beings up for their fallibility or mistakes, but to turn around to the language designers and actually ask them why this sort of problem can go on for nearly 40 years of C programming and nobody actually bring a solution to the problem.
It is time to start acknowledging that our lack of built-in tools in C is not doing the job quite right. The fact that titans in our industry over 30, 40 years can have even the tiniest slip-up or slightest indentation mistake used against them and their code, points to a fundamental issue in the way the language interacts with and works with its user base. Requiring perfect, 24/7 vigilance from people who support trillion-USD market cap industries and billion-USD quarterly budget businesses across the entire globe — and in space itself — while only earning a fraction of that while sometimes unable to support themselves is quite frankly bonkers. Our users have faithfully served the C language for decades.
They deserve tools and features that can cover their fallibility and make it difficult for them to forget to handle certain cases of bugs, whether it’s heap overflow, use-after-free/double-free, integer overflow or otherwise. It is time we recognize that, in some ways, we made it error-prone and wrong, and that we can do small, simple things to make it better.
3.8. defer Ordering?
The code in a defer block runs after every other statement and expression in the block, save for other defers (which execute in reverse-lexical order from whence they appeared). Nested defer statements execute at the end of the block for the defer they are within. It is recognized that nested defers may, for some people, be considered a sincere code smell. Therefore, there is optional wording below (6.6) to allow for this if people rally behind this idea.
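The reverse-lexical ordering matches what the existing GCC/Clang cleanup attribute already does for annotated variables in a single scope. The following is a sketch under that assumption rather than the proposed defer syntax; record, run_in_order, order_log, and order_len are hypothetical names used only to log the order of execution.

```c
#include <string.h>

static char order_log[8];     /* records the order in which handlers ran */
static size_t order_len = 0;

/* Appends the character pointed to by tag to the log. */
static void record(const char *tag) {
    if (order_len + 1 < sizeof order_log) {
        order_log[order_len++] = *tag;
        order_log[order_len] = '\0';
    }
}

const char *run_in_order(void) {
    order_len = 0;
    order_log[0] = '\0';
    {
        __attribute__((cleanup(record))) char first = 'a';
        __attribute__((cleanup(record))) char second = 'b';
        __attribute__((cleanup(record))) char third = 'c';
        record("-"); /* ordinary statement: runs before any cleanup */
    }
    return order_log;
}
```

The ordinary statement is logged first, followed by the three cleanups in the reverse of their declaration order.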
3.8.1. defer interleaved with return?
A defer clause executes at the end of the scope, after every other kind of execution for the scope finishes. This includes after the evaluation of the expression of a return statement. That is, given this example code:
int woof (); int bark (); int use ( int x ){ defer { woof (); } return bark (); }
The order of execution is
and then
. Another example:
int use ( int x ){ int * p = & x ; * p = 400 ; defer { * p = 500 ; } return * p ; }
The return value of the
function is
, not
. Also notably, because it runs after all other non-
statements but just before the termination of the (function or block) scope, all of the variables are still alive when referred to.
Note: This is compatible with C++ semantics for a similar C++ feature: constructors and destructors. See this live code snippet:
struct destroy_me {
    int& r;
    ~destroy_me() { r = 5; }
};

int main() {
    int r = 4;
    destroy_me dm{ r };
    return r;
}
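The same "return operand first, cleanup second" ordering can be checked in C with the GCC/Clang cleanup attribute. This is a sketch that assumes that extension is available and treats it as an approximation of defer; set_to_500 and cleanup_runs are hypothetical names introduced for the illustration.

```c
#include <assert.h>

int cleanup_runs; /* hypothetical counter: how often the handler ran */

/* Cleanup handler: overwrites the int that the annotated pointer refers to. */
static void set_to_500(int **pp) {
    assert(pp != 0 && *pp != 0);
    **pp = 500;
    ++cleanup_runs;
}

int use(int x) {
    int *p = &x;
    *p = 400;
    /* Stands in for: defer { *p = 500; } */
    int *guard __attribute__((cleanup(set_to_500))) = p;
    return *p; /* the operand is evaluated before the handler runs */
}
```

Under these assumptions use returns 400 even though the handler does run and stores 500 into x afterwards.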
3.8.2. Flow Control / Jumps out of defer
defer statements cannot allow compile-time jumps out of themselves and into other defers or out into a surrounding scope. Otherwise, that would damage the integrity of a sequence of defers:
int get_work_order();
void rollback(int handle);
void attempt_transaction(int handle0, int handle1);

int main() {
    int very_important_handle = get_work_order();
    defer { rollback(very_important_handle); }
    int very_important_handle2 = get_work_order();
    defer {
        rollback(very_important_handle2);
        // !!!
        goto try_attempt;
    }
try_attempt:
    attempt_transaction(very_important_handle, very_important_handle2);
    return 0;
}
That goto jumps out of the second defer statement, and back into the main block. This would skip over execution of the first defer (due to the reverse-lexical order in which defer blocks are run), which would result in unintuitive behavior. In order to cut this off, we simply do not allow compile-time jumps (return, goto, break, or continue) out of a defer block. Should more reasonable semantics be nailed down at a later date, we can go back and fill in these intentional blanks (which, generally, is not possible when something is made undefined behavior rather than a constraint violation).
3.8.3. goto and other Flow Control over an existing defer
Much like the previous section, this code is also banned:
#include <stdio.h>

int main() {
    goto b;
    defer { printf(" meow"); }
b:
    printf("cat says");
}
Ostensibly, one could justify the way this works. "The defer is not really executed, it is more-so translated to happen at the end of the scope. Therefore, this should print cat says meow", some may state. However, for consistency’s sake, this gets a bit more confusing when it is not just a goto jumping over a defer statement, but a return making it look like it’s entirely unreachable:
#include <stdio.h>

int main() {
    printf("cat says");
    return 0;
    defer { printf(" meow"); }
}
Do we still print cat says meow here? The conclusion this paper comes to is "no". While defer provides code motion at compile-time, there is still the mental model of the programmer to consider and the tooling of the compiler to consider. In almost every case that anyone looks at this code, the defer looks like it is part of an unreachable set of code; if it were mandated to run, this would be problematic.
For this case specifically, a return before a defer in the same scope simply means it is not run. In summary, defers cannot be jumped over by gotos, similar to how Variable-Length Arrays cannot be jumped over by similar control flow constructs when the compiler can know about it. But one can return early before a defer is ever reached.
3.8.4. goto and other Flow Control into an existing defer
This case has to be banned. Under no circumstances can we allow goto or similar into a defer statement. Consider the following code, which has 2 potential exit branches (however contrived):
#include <stdlib.h>

int main(int argc, char* argv[]) {
    void* p = malloc(1);
    defer {
    my_label:
        free(p);
    }
    goto my_label;
    if (argc < 2) {
        return 1;
    }
    /* … */
    return 0;
}
If the defer is leapt into, which exit "branch" are we running: the early return inside the if, or the one at the other end of the function? Furthermore, defer runs after, not before, things like return expressions are evaluated. What is the return value of the function? What does this function end up doing? All of these make absolutely no sense. Any flow control into a defer is 110% banned, and for good reason.
3.8.5. What about longjmp?
Unfortunately, longjmp is non-local, runtime-controlled control flow. It can easily defy all of C’s typical static analysis. Therefore, we cannot form any reasonable Constraint Violations (compile-time errors) for this category of behavior. Still, we enumerate several cases of execution-time behavior where, so long as the non-local jump does not put execution outside of the current defer block, the behavior remains well-defined (modulo other issues with jumping into/over/etc. things that may be within the defer block). If the non-local jump escapes the defer, though, we say the behavior is utterly undefined.
4. C++ Compatibility: Why Not Member Functions + Constructors/Destructors?
As will be asked one hundred thousand times throughout the course of this proposal’s life:
4.1. Why Not Just Put Member Functions And Constructors/Destructors Into C? RAII Is Powerful And Solves This Problem?
There are several purely technical reasons for not pursuing a constructor/destructor-alike solution that is fully compatible with C and C++ member declarations. Briefly, they can be categorized as follows.
- Function overloading required for constructor syntax, even without other member function syntax.
- Name mangling required for constructor syntax, even without other member function syntax.
- And, interoperability with older C code that will suddenly be imbued with the potential for constructor/destructor semantics and may not produce binary-compatible construction representations (implementation-specific).
The first two are interconnected and also simply part of the bargain. If any of GCC, Clang, or MSVC wanted to adopt a proposal for C that would inject member functions into the language, they would (naturally and correctly) do so with the implementation that has served C++ well over the last few decades: name mangling. But implementation-controlled name mangling is abhorrent to C developers for a wide variety of reasons, not least that it would leave them with even less control over their Application Binary Interfaces than they have today. While C mangling is fairly consistent across most platforms (in that there is either little or none at all), C++ name mangling implementations can outstrip the implementation of many C17 frontends in their entirety (and, in one documented case, have done so for a vendor in WG14 that supports both C and C++).
Function overloading — and the requisite name mangling schemes that would come with it — are very much not feasible for C implementations or the C language.
Furthermore, defer actually escapes a small issue that C++ has created for itself with noexcept, exceptions, and destructors. As C++ would generally rightfully claim, this feature is redundant with its RAII concept. They would be correct, except in one place: the C++ standard library. Peter Sommerlad has spent over 10 years working on a proposal for a "Generic Scope Guard" for C++, so long that it has a C++ Paper Number from both the P-based proposal system and also a large series of WG21 N-document numbers[p0052].
Sommerlad’s efforts ultimately failed in C++ because C++ has a rule that objects created and owned by the C++ Standard Library must never have a non-noexcept destructor. One of the most pertinent use cases is having scope guards which, in their destructor, could roll back transactions and then throw an exception. Because a hypothetical standard scope guard object could not possibly have a non-noexcept destructor, creating such an object, giving it a destructor that invokes the function stored in the scope guard, and having that function throw an exception would not achieve its desired purpose. Instead of throwing that exception, it would immediately call std::terminate and kill the program on the spot. (Any exception that reaches the boundary of a noexcept-marked function, including the noexcept destructor that would be mandated for a standard scope guard, immediately calls std::terminate.)
Thus, C++'s own rules about destructors, which they refused to break in this one case, make it impossible to create an RAII object that fulfills one of the primary uses of a scope guard. defer, as a neutral language construct that is not tied to a member object, does not have this problem as there is no implication of success or failure; it is just an alternate form of code motion that is not tied to the lifetime of an object, but instead to lexical scope directly. Given the history of Peter Sommerlad’s [p0052], we are unsure if they will pursue a language-based solution to escape the rules of their own Standard Library (which are, for many reasons, completely justified). Therefore, this looks like one of those features that will, despite being very fundamental in either C or C++ code, be taken care of by user-defined libraries, shims, and polyfills (e.g., hand-rolled or Boost) for the foreseeable future.
4.2. Signaling Failure in Destructors
Destructors also present another serious problem in that the only way to communicate things out of them is to either:
- capture specific information in the constructor and then propagate it out of the defer call in some fashion (antithetical to the design of both destructors and defer);
- or, throw an exception.
Aside from the general issues of how palatable exceptions may or may not be for C, destructors with unwinding and exceptions prove to be truly untenable in many cases. For example, for file-type resources or thread-type resources in C++, the standard mandates that any errors or exceptions generated are completely consumed and swallowed, and never communicated outside of their destructor. This presents a greater issue for the degree of resource safety, and users have often had to flush file streams and close them manually to check for errors properly, rather than find out that log files would suddenly truncate their output in the middle of serious outages where that information was necessary:
virtual ~basic_filebuf();
Effects: Calls close(). If an exception occurs during the destruction of the object, including the call to close(), the exception is caught but not rethrown (see [res.on.exception.handling]).
-- C++ Standard, 31.10.3.2 [filebuf.cons], December 10th, 2023
This happens in numerous other places in the C++ standard library. It is sufficient to state that this is very undesirable for C; swallowing exceptions (or any other failure) in a destructor-like design is not a good design for C. It is analogous to having error codes set on errno that get ignored, and this has been a significant source of problems for C development (especially when the actual values of errno being set end up implementation-defined, as they did for the
/
messes that resulted in the functionality of the latter becoming undefined behavior).
Note: Having cleanup behavior completely devoid of context and potential failures, trapped in its own function scope, removes the ability to react to important context related to the success or failure of certain complex hardware and operating system resources. defer, by its design, is always placed local to the scope and has access to local information. At the cost of needing to constantly write the defer itself, it retains the ability to be shuffled or moved and to take into account error codes, failed operations, and more that typical context-devoid type-based RAII resources cannot (without being explicitly written to with function objects or similar variables taken into its constructor).
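To make the point about local context concrete, the sketch below uses plain C with the cleanup written by hand at the end of the scope, which is where the body of a defer would run. It is not taken from the proposal: struct txn, txn_close, and run_txn are hypothetical names for a resource whose release step can fail.

```c
#include <assert.h>

/* Hypothetical resource whose release step can fail (illustration only). */
struct txn {
    int pending;  /* number of queued operations */
    int flush_ok; /* whether flushing them on close will succeed */
};

static int txn_close(struct txn *t) {
    assert(t != 0);
    int failed = t->pending > 0 && !t->flush_ok;
    t->pending = 0;
    return failed ? -1 : 0;
}

/* Returns 0 on success, -1 if the work failed, -2 if only the cleanup failed. */
int run_txn(int work_ok, int flush_ok) {
    struct txn t = { 0, flush_ok };
    int rc = 0;

    t.pending = 1; /* queue some work */
    if (!work_ok) rc = -1;

    /* Cleanup written in the same scope as the work: it can read rc and
       report its own failure instead of swallowing it. */
    if (txn_close(&t) != 0 && rc == 0) rc = -2;
    return rc;
}
```

A destructor-style handler that only receives the object would have no access to rc and no channel through which to return -2; code placed in the enclosing scope has both.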
4.3. The Ideal World
Speaking briefly from a language design standpoint: in an ideal world, both of these solutions would be present side by side to offer the user a maximally flexible choice of error handling, automated cleanup, general-purpose undo power, and freedom to choose context (or not have any contextual information at all). Unfortunately, due to the technical challenges of name mangling from member functions (destructors and constructors) it may be some time before C sees a solution where an object contains its own clean up code and that clean up code follows an object through the system (e.g., by using types).
Despite the advances that defer makes for C, this will present a greater problem over time as C users get used to functional, block-scoped cleanup but do not have the ability to attach defers to individual struct fields or to the lifetime of a given object. The cleanup attribute attempted to do this, but in a way that was brittle. It did not transfer to other object declarations and could not be properly relocated without runtime coordination and orchestration by the user. It also could not be applied to structure fields, providing one kind of block-scope safety but making it less ideal in other cases and requiring extra boilerplate functions to get the job done. RAII achieves this by leveraging the type system. defer achieves this but requires that every place a resource is used must have a defer block written. It is much more manual than RAII.
4.4. The Polyfill/C++ Fix
In either case, for the above stated reasons, we will not be pursuing providing member functions (constructors/destructors) for C. We also do not anticipate C++ being thrilled about that, and they may see us asking for defer as trying to inject yet another alternative design into C++ for poor reasons. We would like to not antagonize C++ any further while respecting the design of C, and therefore offer the below partial solution to the problem.
Note: We do invite C++ users, for the sake of interoperation, to create their own using a structure/class with Class Template Argument Deduction (CTAD) and lambdas, as that can cover the space fairly nicely with syntax that is effectively identical to this proposal (it needs an extra ; and requires braces) and no Standard Library rules to worry about. Feel free to use ours:
#include <type_traits>
#include <utility>

template <typename _Fx>
struct __defer_t {
    _Fx __fx;
    __defer_t(_Fx&& __arg_fx) noexcept(::std::is_nothrow_move_constructible_v<_Fx>)
        : __fx(::std::move(__arg_fx)) {}
    ~__defer_t() noexcept(::std::is_nothrow_invocable_v<_Fx>) { __fx(); }
};

template <typename _Fx>
__defer_t(_Fx __fx) -> __defer_t<::std::decay_t<_Fx>>;

#define __DEFER_TOK_CONCAT(X, Y) X ## Y
#define __DEFER_TOK_PASTE(X, Y) __DEFER_TOK_CONCAT(X, Y)
#define defer __defer_t \
    __DEFER_TOK_PASTE(__scoped_defer_obj, __COUNTER__) = \
    [&]()

#include <stdio.h>

int main() {
    defer {
        defer { printf(" :3"); };
        printf(" meow");
    };
    printf("cat says");
    return 0;
}
5. Implementation Experience
This proposal is modeled after existing practice, but is not directly provided in a C compiler in C mode. It can be approximated (as shown above) using C++, but C++ does not have a language feature for this form that isn’t just a stripped-down form of RAII.
This leaves only the cleanup attribute (spelled __attribute__((cleanup(...))) in GCC and Clang), which has been implemented in many compilers, some of which are GCC, Clang, XLC, and Tiny C Compiler. Most of what has shaped this proposal has been driven by this feature. However, we specifically do not use an attribute in this version of the proposal because the cleanup attribute is:
- ignorable by the wording for attributes (and their expected behavior);
- tied too strongly to a declaration (and may fall off in an unwarranted fashion);
- and, in the existing practice, does not have support for single fields in a structure or similar without writing an extended function to pick out specific pieces of a structure to clean up.
In contrast, defer has much clearer mechanisms to achieve the same goal, and may be applied to a wider variety of instances and cases than the attribute can.
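The last limitation in the list can be seen in a short sketch. It assumes a GCC- or Clang-compatible compiler; struct pair, pair_release, pairs_released, and pair_demo are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical two-field resource, used only for illustration. */
struct pair {
    char *name;
    char *data;
};

int pairs_released; /* counts handler invocations, for the demo */

/* The attribute attaches to one declaration, so releasing the individual
   fields takes a dedicated helper that picks the structure apart. */
static void pair_release(struct pair *p) {
    assert(p != NULL);
    free(p->name);
    free(p->data);
    p->name = NULL;
    p->data = NULL;
    ++pairs_released;
}

/* Returns 0 on success, -1 if an allocation failed; either way the helper
   runs exactly once when p goes out of scope. */
int pair_demo(void) {
    struct pair p __attribute__((cleanup(pair_release))) = { NULL, NULL };
    p.name = malloc(8);
    if (p.name == NULL) return -1;
    p.data = malloc(32);
    if (p.data == NULL) return -1;
    return 0;
}
```

With defer as proposed, each field could instead be released by its own deferred statement written directly after the corresponding allocation, without the extra helper function.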
In the future, we expect that, should someone solve the tension between name mangling, member functions, constructors/destructors, and more, we could consider moving to a destructor-based solution that is tied to the type system and objects. We view that as a better way to solve the problem for individual member fields and larger objects, while reducing the number of times that the cleanup code needs to be written. However, C users have expressed a strong attachment to having the code that will run be visible in the scope. Macros violate this rule (e.g.
backed by
), destructors violate this rule, but
defer as a feature does not.
6. Wording
Wording is relative to the latest draft revision of the C Standard.
Note: This wording uses the lowercase defer
keyword directly in its wording. We recognize that this may not be suitable, and it is not an integral part of this proposal to type in
for us. It is just simpler to read in this form. If it is necessary to change it, we’ve got more than enough time to add
and do the same
to provide the convenience macro for the lowercase spelling.
6.1. Modify §5.1.2.2.3 Program termination to ensure defers in main are run
5.1.2.2.3 Program termination
If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument after all active defer statements of the function body of main have been executed; …
6.2. Modify 6.4.1 Keywords to include defer
6.4.1 Keywords
Syntax
keyword: one of
    …
    default
    defer
    do
    …
6.3. Modify §6.8 Statements’s unlabeled-statement grammar production to include a new defer-statement
6.8 Statements
Syntax
statement:
    labeled-statement
    unlabeled-statement
unlabeled-statement:
    expression-statement
    attribute-specifier-sequence(opt) primary-block
    attribute-specifier-sequence(opt) jump-statement
    defer-statement
primary-block:
    compound-statement
    selection-statement
    iteration-statement
secondary-block:
    statement
6.4. Add a new section §6.8.7 describing the new defer-statement
6.8.7 Defer statements
Syntax
defer-statement:
    defer secondary-block
Description
Let D be a defer statement, S be the secondary block of D referred to as its deferred content, and E be the enclosing block of D.
Constraints
Jumps by means of goto into E shall not jump over a defer statement in E.
Jumps by means of goto shall not jump into any defer statement.
Jumps by means of return, goto, break, or continue shall not exit S.
Semantics
When execution reaches a defer statement D, its S is not immediately executed during sequential execution of the program. Instead, S is executed upon:
- the termination of the block E (such as from reaching its end);
- or, any exit from E through means of flow control such as return, goto, break, switch, or continue.
The execution is done just before leaving the enclosing block E.
Multiple defer statements execute in the reverse lexical order they appeared in E. Within a single defer statement D, if D contains one or more defer statements of its own, then these defer statements are also executed in reverse lexical order at the end of S, recursively, according to the rules of this clause.
If E has any defer statements D that have been reached and their S have not yet executed, but the program is terminated or leaves E through any means such as:
- a function with the deprecated _Noreturn function specifier, or a function annotated with the noreturn/_Noreturn attribute, is called;
- or, any signal SIGABRT, SIGINT, or SIGTERM occurs;
then any such S are not run, unless otherwise specified by the implementationFN0✨). Any other D that have not been reached are not run.
FN0✨)The execution of deferred statements upon non-local jumps or program termination is a technique sometimes known as "unwinding" or "stack unwinding", and some implementations perform it. See also ISO/IEC 14882 Programming languages — C++, section [except.ctor].
If a non-local jump (such as longjmp) is used within E but before the execution of D:
- if control leaves E, D's statements will not be executed;
- otherwise, if control returns to a point in E and causes D to be reached more than once, there is no effect.FN1✨)
FN1✨)This is because the "execution" of a defer statement only lets the program know that S will be run. There is no observable side effect to repeat from reaching D, as the manifestation of any of the effects of S will happen if and only if E is exited or terminated as previously specified.
If a non-local jump (such as longjmp) is executed from S and control leaves S, the behavior is undefined.
If a non-local jump (such as longjmp) is executed outside of any D and:
- it jumps into any S;
- or, it jumps over any D;
the behavior is undefined.
EXAMPLE 1: Defer statements cannot be jumped over or jumped out of.
#include <stdio.h>

int f () {
    goto b; // constraint violation
    defer { printf(" meow"); }
    b:
    printf("cat says");
    return 1;
}

int g () {
    return printf("cat says");
    defer { printf(" meow"); } // okay: no constraint violation, not executed
    // prints "cat says" to standard output
}

int h () {
    goto b;
    {
        // okay: no constraint violation
        defer { printf(" meow"); }
    }
    b:
    printf("cat says");
    return 1; // prints "cat says" to standard output
}

int i () {
    {
        defer { printf("cat says"); }
        // okay: no constraint violation
        goto b;
    }
    b:
    printf(" meow");
    return 1; // prints "cat says meow" to standard output
}

int j () {
    defer {
        goto b; // constraint violation
        printf(" meow");
    }
    b:
    printf("cat says");
    return 1;
}

int k () {
    defer {
        return 5; // constraint violation
        printf(" meow");
    }
    printf("cat says");
    return 1;
}

int l () {
    defer {
        b:
        printf(" meow");
    }
    goto b; // constraint violation
    printf("cat says");
    return 1;
}

int m () {
    goto b; // okay: no constraint violation
    {
        b:
        defer { printf("cat says"); }
    }
    printf(" meow");
    return 1; // prints "cat says meow" to standard output
}

int n () {
    goto b; // constraint violation
    {
        defer { printf(" meow"); }
        b:
    }
    printf("cat says");
    return 1;
}
EXAMPLE 2: All the expressions and statements of an enclosing block are evaluated before executing defer statements. After all defer statements are executed, then the block is left.
int main () {
    int r = 4;
    int* p = &r;
    defer { *p = 5; }
    return *p; // return 4;
}
EXAMPLE 3: It is implementation-defined if defer statements will execute if the exiting / non-returning functions detailed previously are called.
#include <stdio.h>
#include <stdlib.h>

int main () {
    void* p = malloc(1);
    if (p == NULL) {
        return 0;
    }
    defer free(p);
    exit(1); // "p" may be leaked
}
EXAMPLE 4: Defer statements, when execution reaches them, are tied to their enclosing block.
#include <stdio.h>
#include <stdlib.h>

int main () {
    {
        defer { printf(" meow"); }
        if (true)
            defer printf("cat");
        printf(" says");
    }
    // "cat says meow" is printed to standard output
    exit(0);
}
EXAMPLE 5: Defer statements execute in reverse lexical order, and nested defer statements execute in reverse lexical order but at the end of the defer statement they were invoked within. The following program:
int main () {
    int r = 0;
    {
        defer {
            defer r *= 4;
            r *= 2;
            defer { r += 3; }
        }
        defer r += 1;
    }
    return r; // return 20;
}
is equivalent to:
int main () {
    int r = 0;
    r += 1;
    r *= 2;
    r += 3;
    r *= 4;
    return r; // return 20;
}
EXAMPLE 6: Defer statements can be executed within a switch, but a switch cannot be used to jump over a defer statement.
#include <stdlib.h>

int main () {
    void* p = malloc(1);
    switch (1) {
        defer free(p); // constraint violation
    default:
        defer free(p);
        break;
    }
    return 2;
}
6.5. OPTIONAL: Add to 6.8.7 Defer statements a new paragraph 3 additional constraint to reject multiply-nested defer.
6.8.7 Defer statements
Syntax
defer-statement:
    defer secondary-block
Description
Let D be a defer statement, S be the secondary block of D, and E be the enclosing block of D.
Constraints
A defer statement shall not appear within another defer statement.
Note: 📝 Editor: also edit Example 4 and Example 5 with
in the appropriate places, and change the description to make it clear it is a constraint violation.
6.6. Modify Annex J’s list of undefined behaviors with non-local jump undefined behavior (e.g. longjmp)
Note: 📝 For the editor to do within the Annex J undefined behavior list.