To make progress, we need better data on costs and performance to evaluate the - often simplistic and narrowly focused - solutions suggested. — Direction for ISO C++ [P0939R2]
1. Introduction
In this paper, we will look at the size costs of error handling. We’ll break things down into one-time costs and incremental costs, and subdivide by costs paid for error neutral functions, raising an error, and handling an error.2. Changes
R1 of this paper removed the discussion on how exceptions are implemented, and reduced the set of error handling strategies down to throwing exceptions, return codes, abort, and noexcept abort. This is to make the paper and charts easier to read. The data is the same.R1 renamed the "stripped" cases to "baremetal" cases, to better frame where those numbers would be applicable.
3. Measuring methodology
All benchmarks lie. It’s important to know how a benchmark is set up so that the useful parts can be distinguished from the misleading parts.The specific build flags can be found in Appendix B. Following is a brief summary.
MSVC 2019 was used for MSVC x86 and MSVC x64 builds. The /d2FH4 flag described in [MoFH4] was used, and /EHs was used when exceptions were on.
GCC 7.3.1 from the Red Hat Developer Toolset 7.1 was used for my GCC builds. The Linux x64 platform was targeted.
Clang 8.0.0, libc++, and libc++abi was used for my Clang builds. The Linux x64 platform was targeted. The system linker and C library leaked in to this build. The system GCC was GCC 4.8.4 from Ubuntu 14.04.3.
All the binaries are optimized for size, rather than speed.
All the binaries are built with static runtimes, so that we can also see the costs of the error handling runtime machinery. For many people, this is a sunk cost. If the cost of the runtime machinery isn’t of interest, then don’t pay attention to the one-time costs, and just look at the incremental costs. Sizes were not calculated by just doing the "easy" thing and comparing the on-disk sizes of the resulting programs. Programs have lots and lots of padding internal to them due to alignment constraints, and that padding can mask or inflate small cost changes. Instead, the size is calculated by summing the size of all the non-code sections, and by summing the size of each function in the code sections. Measuring the size of a function is a little tricky, as the compiler doesn’t emit that information directly. There are often padding instructions between consecutive functions. My measurements omit the padding instructions so that we can see code size differences as small as one byte.
Measurements are also included where the size of some data sections related to unwinding are omitted. On x64 Linux, programs can have an .eh_frame and .eh_frame_hdr section that can help with emitting back traces. x64 Windows has similar sections named .xdata and .pdata. These sections aren’t sufficient to implement C++ exception handling, and they don’t go away when exceptions are turned off. On Linux and Windows, these sections should be considered a sunk cost, but on more exotic platforms, it is reasonable to omit those sections, as stack trace costs may not be tolerable. These measurements are all labeled as "baremetal". x86 Windows doesn’t have these sections, so the "baremetal" measurements are the same as the regular measurements.
Note that on Linux, the entire user mode program can be statically linked. This is the program under test, the C++ runtime, the C runtime, and any OS support. On Windows, the program, the C++ runtime, and the C runtime can be statically linked, but the OS support (kernel32.dll) is still distinct. With this in mind, refrain from comparing the one-time MSVC sizes to the Clang and GCC sizes, as it isn’t comparing the same set of functionality.
These benchmarks are run on very small programs. On larger programs, various code and data deduplication optimizations could substantially change the application-level costs of error handling. [MoFH4] documents the kinds of deduplication that MSVC 2019 performs.
4. Starter test cases
To start with, we will look at code similar to the following:This code has some important properties for future comparisons.struct Dtor { ~ Dtor () {}}; int global_int = 0 ; void callee () { /* will raise an error one day*/ } void caller () { Dtor d ; callee (); global_int = 0 ; } int main () { caller (); return global_int ; }
-
will eventually raise errors.callee () -
needs to clean up thecaller ()
object in error and non-error conditions.d -
should only setcaller ()
in success cases.global_int -
The code doesn’t have any error cases yet. We can see the cost of error handling machinery when no errors are involved.
In the actual tests all the function bodies are in separate .cpp files, and link-time / whole-program optimizations aren’t used. If they had been used, the entire program would get optimized away, removing our ability to measure error handling differences.
The above program is a useful template when using exceptions or
as an error handling mechanism, but it won’t work as well for error codes. So we mutate the program like so...
This is pretty typical integer return value code, without any macro niceties.int callee () { return 0 ;} int caller () { Dtor d ; int e = callee (); if ( e ) return e ; global_int = 0 ; return e ; }
Most of the programs were built with exceptions turned off, but the throw_* cases and noexcept_abort all had exceptions turned on in the program.
-
abort: When an error is encountered, kill the program with
.std :: abort -
noexcept_abort: Same as abort, except exceptions are turned on, and all the functions declared in user source are marked as
.noexcept -
return_val: Return an integer error code, where zero represents success.
-
throw_exception: Throw an exception deriving from std::exception, that contains only an int. This should represent more typical use cases.
Data on many more error handling strategies can be found in R0 of this paper.
Expository code for all the cases can be found in Appendix C. The actual code used for the benchmark can be found on my github.
5. Measurements
5.1. Initial error neutral size cost
My first batch of measurements is comparing each of the mechanisms to the baremetal.abort test case. This lets us focus on the incremental costs of the other mechanisms.Warning! Logarithmic axis! Linear version here
Warning! Logarithmic axis! Linear version here
These tables show us that the one-time cost for exceptions is really high (6KB on MSVC x86, 382KB on Clang x64), and the one time cost for unwind information is pretty high too (6KB on MSVC x64, 57KB on Clang). Note that noexcept_abort has the same cost as regular abort right now. If everything is
, the exception machinery costs are not incurred.
5.2. Incremental error neutral size cost
To measure the incremental cost of error neutral code, the code will be updated as follows:The "2" versions of these functions are slightly different than the original versions in order to avoid optimization where identical code is de-duplicated (COMDAT folding). Each error handling case was updated to the idiomatic form that had the same semantics as this error neutral form. Here are the incremental numbers:void callee2 ( int amount ) { global_int += amount ; // will error one day } void caller2 ( int amount ) { Dtor d ; callee2 ( amount ); global_int += amount ; } int main () { caller (); caller2 ( 0 ); return global_int ; }
The delta between the best and the worst is much smaller in the incremental error neutral measurements than in the one-time cost measurements.
and return values were always cheaper than exceptions, even with included unwind information.
5.3. Initial size cost of signaling an error
What happens when an error is signaled first time? What’s the one-time cost of that first error?void callee () { if ( global_int == INT_MAX ) throw 1 ; }
Warning! Logarithmic axis! Linear version here
Warning! Logarithmic axis! Linear version here
On MSVC, there are multiple ways to build with exceptions "on". This experiment was built with
, which turns on exceptions in a C++ conforming manner. The Microsoft recommended flag is
, which turns on exceptions for all C++ functions, but assumes that
functions won’t throw. This is a useful, though non-conforming option. The trick is that the noexcept_abort
implementation calls
, and that’s an
function that isn’t marked as
, so we suddenly need to pay for all the exception handling costs that we had been avoiding by making everything
. We can’t easily make the C runtime, or other people’s code
. We don’t see this on GCC and Clang because the C library they are calling marks abort as
, and that lets them avoid generating the exception machinery.
GCC’s first throw costs look worse than Clang’s because Clang paid a lot of those costs even before there was a throw.
5.4. Incremental size cost of signaling an error
void callee2 ( int amount ) { if ( global_int + amount == INT_MAX ) throw 1 ; global_int += amount ; }
These numbers are all over the place. Here are some highlights:
-
On GCC and Clang, throwing an exception is more incrementally expensive than all the non-throwing variants.
-
On MSVC, it isn’t just the first non-
function that is expensive in noexcept_abort, but the later calls are expensive too.noexcept
5.5. Initial size cost for handling an error
To get the initial handling costs, we’ll rewrite main to look something like this...int main () { try { caller (); } catch ( int ) { global_int = 0 ; } caller2 ( 0 ); return global_int ; }
abort
results won’t be included here, because there is no "handling" of an abort
call in C++. The environment needs to handle it and restart the process, reboot the system, or relaunch the rocket.
Here we see that the initial catch cost exceptions is high compared to the alternatives.
5.6. Incremental size cost for handling an error
Now for the incremental code, and the associated costs.Note that this is measuring the cost of handling a second error within a single function. If the error handling were split over multiple functions, the cost profile may be different.int main () { try { caller (); } catch ( int ) { global_int = 0 ; } try { caller2 ( 0 ); } catch ( int ) { global_int = 0 ; } return global_int ; }
6. Conclusion
Exceptions and on-by-default unwinding information are reasonable error handling strategies in many environments, but they don’t serve all needs in all use cases. C++ needs standards conforming ways to avoid exception and unwind overhead on platforms that are size constrained. C++ is built on the foundation that you don’t pay for what you don’t use, and that you can’t write the language abstractions better by hand. This paper provides evidence that you can write error handling code by hand that results in smaller code than the equivalent exception throwing code if all you use is terminate semantics or an integer’s worth of error information. In each of the six test cases, terminate and integer return values beat exceptions on size, even before stripping out unwind information.7. Acknowledgments
Simon Brand, Niall Douglas, Brad Keryan, Reid Kleckner, Modi Mo, Herb Sutter, John McFarlane, Ben Saks, and Richard Smith provided valuable review commentary on R0 of this paper.Thanks to Lawrence Crowl, for asking the question "what if everything were
?".
Charts generated with [ECharts].
Appendix A: Why no speed measurements?
[P1886] contains speed measurements for various error handling strategies.Appendix B: The build flags
MSVC
The compiler and flags are the same for 32-bit and 64-bit builds, except that the 32-bit linker uses /machine:x86 and the 64-bit linker uses /machine:x64Compiler marketing version: Visual Studio 2019
Compiler toolkit version: 14.20.27508
cl.exe version: 19.20.27508.1
Compiler codegen flags (no exceptions): /GR /Gy /Gw /O1 /MT /d2FH4 /std:c++latest /permissive- /DNDEBUG
Compiler codegen flags (with exceptions): /EHs /GR /Gy /Gw /O1 /MT /d2FH4 /std:c++latest /permissive- /DNDEBUG
Linker flags: /OPT:REF /release /subsystem:CONSOLE /incremental:no /OPT:ICF /NXCOMPAT /DYNAMICBASE /DEBUG *.obj
Clang x64
Toolchains used:-
Clang 8.0.0 and libc++
-
System linker from Ubuntu 14.04.3’s GCC 4.8.4 installation
Compiler codegen flags (no exceptions): -fno-exceptions -Os -ffunction-sections -fdata-sections -std=c++17 -stdlib=libc++ -static -DNDEBUG
Compiler codegen flags (exceptions): -Os -ffunction-sections -fdata-sections -std=c++17 -stdlib=libc++ -static -DNDEBUG
Linking flags: -Wl,--gc-sections -pthread -static -static-libgcc -stdlib=libc++ *.o libc++abi.a
GCC x64
Toolchain used: GCC 7.3.1 from the Red Hat Developer Toolset 7.1Compiler codegen flags (no exceptions): -fno-exceptions -Os -ffunction-sections -fdata-sections -std=c++17 -static
Compiler codegen flags (exceptions): -Os -ffunction-sections -fdata-sections -std=c++17 -static
Linking flags: -Wl,--gc-sections -pthread -static -static-libgcc -static-libstdc++ *.o
Appendix C: The code
As stated before, this isn’t the exact code that was benchmarked. In the benchmarked code, functions were placed in distinct translation units in order to avoid inlining. The following code is provided to demonstrate what the error handling code looks like.Common support code
Expand to see code snippets
All casesstruct Dtor { ~ Dtor () {}}; int global_int = 0 ;
throw_exception
class err_exception : public std :: exception { public : int val ; explicit err_exception ( int e ) : val ( e ) {} const char * what () const noexcept override { return "" ; } };
Initial error neutral functions
This section lays the groundwork for future comparisons. All of these cases are capable of transporting error information from a future signaling site (callee
) to a future catching site (main
). No errors are signaled here, but the plumbing is in place.
Expand to see code snippets
Default main functionabort, throw_exceptionint main () { caller (); return global_int ; }
noexcept_abortvoid callee () { /* will raise an error one day*/ } void caller () { Dtor d ; callee (); global_int = 0 ; }
return_valvoid callee () noexcept { /* will raise an error one day*/ } void caller () noexcept { Dtor d ; callee (); global_int = 0 ; }
int callee () noexcept { return 0 ;} int caller () noexcept { Dtor d ; int e = callee (); if ( e ) return e ; global_int = 0 ; return e ; }
Incremental error neutral functions
Here, we add an extra two functions with error transporting capabilities so that we can measure the incremental cost of error neutral functions. These functions need to be slightly different than the old functions in order to avoid deduplication optimizations.In order to save on text length, the only functions that will be listed here are the functions were added or changed compared to the previous section.
Expand to see code snippets
Default main functionabort, throw_exceptionint main () { caller (); caller2 ( 0 ); return global_int ; }
noexcept_abortvoid callee2 ( int amount ) { global_int += amount ; } void caller2 ( int amount ) { Dtor d ; callee2 ( amount ); global_int += amount ; }
return_valvoid callee2 ( int amount ) noexcept { global_int += amount ; } void caller2 ( int amount ) noexcept { Dtor d ; callee2 ( amount ); global_int += amount ; }
int callee2 ( int amount ) { global_int += amount ; return 0 ; } int caller2 ( int amount ) { Dtor d ; int e = callee2 ( amount ); if ( e ) return e ; global_int += amount ; return e ; }
Initial signaling of an error
Expand to see code snippets
abortnoexcept_abortvoid callee () { if ( global_int == INT_MAX ) abort (); }
return_valvoid callee () noexcept { if ( global_int == INT_MAX ) abort (); }
throw_exceptionint callee () { if ( global_int == INT_MAX ) { return 1 ; } return 0 ; }
void callee () { if ( global_int == INT_MAX ) throw err_exception ( 1 ); }
Incremental signaling of an error
Expand to see code snippets
abortnoexcept_abortvoid callee2 ( int amount ) { if ( global_int + amount == INT_MAX ) abort (); global_int += amount ; }
return_valvoid callee2 ( int amount ) noexcept { if ( global_int + amount == INT_MAX ) abort (); global_int += amount ; }
throw_exceptionint callee2 ( int amount ) { if ( global_int + amount == INT_MAX ) { return 1 ; } global_int += amount ; return 0 ; }
void callee2 ( int amount ) { if ( global_int + amount == INT_MAX ) throw err_exception ( 1 ); global_int += amount ; }
Initial handling of an error
Expand to see code snippets
return_valreturn_structint main () { if ( caller ()) { global_int = 0 ; } caller2 ( 0 ); return global_int ; }
throw_exceptionint main () { if ( caller (). error ) { global_int = 0 ; } caller2 ( 0 ); return global_int ; }
int main () { try { caller (); } catch ( const std :: exception & ) { global_int = 0 ; } caller2 ( 0 ); return global_int ; }
Incremental handling of an error
Expand to see code snippets
return_valthrow_exceptionint main () { if ( caller ()) { global_int = 0 ; } if ( caller2 ( 0 )) { global_int = 0 ; } return global_int ; }
int main () { try { caller (); } catch ( const std :: exception & ) { global_int = 0 ; } try { caller2 ( 0 ); } catch ( const std :: exception & ) { global_int = 0 ; } return global_int ; }