N3486
__self_func

Published Proposal,

Authors:
Latest:
https://thephd.dev/_vendor/future_cxx/papers/C%20-%20__self_func.html
Previous Revisions:
n3470 (r1), n3439 (r0)
Paper Source:
GitHub ThePhD/future_cxx
Project:
ISO/IEC 9899 Programming Languages — C, ISO/IEC JTC1/SC22/WG14

Abstract

This is a proposal to allow for a function to refer to itself, useful in macros and within statement expressions.

1. Changelog

1.1. Revision 2 - February 11th, 2025

1.2. Revision 1 - February 3rd, 2025

1.3. Revision 0 - December 23rd, 2024

2. Introduction and Motivation

C99 introduced __func__, which is the string name of the function. Microsoft added __FUNCSIG__ and __FUNCDNAME__ to provide another way to get the string name of the function call. But, none of the compilers ever added a way to refer to the current function directly, despite the need for it appearing it macros and other places that wished to implement e.g. Tail Recursion or other traits in a function name-agnostic ways. Macro authors forced users to pass in the name of the function so it could be properly recursed on, but this is slightly cumbersome.

Recently, _Self and -- at the behest of the Community to rename it -- _Recur have both shown up in Celeste’s "C Extensions to Support Generalized Functions Calls, v3.5" [N3315]. Its specification is there in the wording but it only exists to describe tail-calling. Some support was expressed for lifting it out and making it its own entity rather than something that existed purely in the wording itself.

This paper is the lift out, implementing __self_func as a keyword.

3. Design

The design for this is, thankfully, very simple and easy: __self_func is a keyword/identifier that represents the current function invocation the compiler is in. This is implementable very simple in the compiler frontend by simply performing an identifier substitution for the name of the function being translated, and erroring if at file scope. __self_func is a "function designator" in C Standardese terms, that represents the current function. It is a constraint violation for it to be used at any non-block scope. The wording tries to make this easy by making it part of the block grammar, banning it for existing at file scope.

In general, this allows someone to

3.1. Separate from the function itself

One of the key benefits of __self_func is the ability to use it in code without necessarily knowing the name of the function or being tied to the function in any way. This was actually an important addendum to emulating va_start on Windows’s x64 ABI, which is not like the fully-documented SysV ABI. In C++, decltype(func-name) or func-name had to be passed to the function in order to get type information from the function to determine important qualities of the ABI and calling convention. This is what enabled full emulation of va_start outside of the C or C++ standard library and retrieving values correctly on those platforms (e.g., being able to pass all the tests).

However, C++ still has a vast deficiency here in that even with their new Deducing This feature (see: Gasper Azman’s P0847) which allows you to get the "this" of the function (even a "callable function" like a Lambda), it

  1. does not work as a substitute for the function itself, just the "object" the function is tied to (if it has one);

  2. and, does not allow someone to get the signature of the function they are within without explicitly modifying the function itself to pass that object to something like what is attempted in va_start in the above-linked code.

This means that C++ still can’t viably implement va_start(list) with neither the last argument nor the function name passed explicitly to the macro. Deducing this also does not solve this problem because deducing this does not work inside of normal function calls. In C, there’s two critical portions of this for a cross-platform implementation of va_start that relies purely on the library to perform this trick:

  1. typeof(foo) allowing the retrieval of the function’s type (standardized in C23);

  2. _Generic (or other faux-reflection) allows probing the signature of the function in certain ways;

  3. and, some way to access the name of the function or function object without needing to modify the function signature as it stands (currently unavailable).

This means that the implementation of va_start above still has the same deficiency in both the C and C++ versions: we cannot get the name of the calling function independent of the user passing that in -- or passing the function type signature -- as extra information. The user must pass it in to us, which breaks the interface of va_start mandated by the standard (and is not helped by the old va_start interface that took an additional parameter representing the last argument to the function call).

This proposal enables standalone, independent retrieval of the function designator of the function in use, which gives access to both the function itself and, importantly for a pure, platform-based, assembly-hacking, ABI-probing, library-only implementation using information from the function’s type signature and other (platform) macro definitions/compile-time information.

3.2. Current and Future: "Unnamed" Functions/Blocks, and Lambdas

This is important when lambdas (e.g. Jens Gustedt’s Basic Lambdas for C: N2892) need this as they are, effectively, anonymous complete objects (when not a "function literal") that is callable. They don’t necessarily have a name. This also applies to Apple’s C extension, Blocks, which can be nameless when created. There’s no way to get the name of a Lambda or an unnamed Block.

This also solves the problem for plain ol' C functions whose names are generated (either by a code generator which does not immediately provide the name its generating to the code being stuffed into the function) or are created VIA macro (e.g. using the extension __COUNTER__).

In this way, __self_func solves many existing problems.

3.3. Keyword Name

We do not care what this is called. This has 3 popular names:

__self and _Self are common identifiers in some places. _Recur and __recur are fine and N3315 changed its nomenclature to that to follow feedback from the C community about what it could be named. __func__, unfortunately, already exists as a predefined identifier in C (it should have been called __func_name__ but we can’t fix that now). __self_func is what this proposal settled on, using the double underscores to have it exist in a space similar to an identifier, even if the intent is to exist as a keyword.

Another popular name is __this_func__ or __this_func, but this and __this and __this__ are highly contentious keywords that already exist and typically have meaning, thanks to C++ and member objects in-general.

As stated earlier, we do not care what it is called, so long as it exists. Suggestions are welcome and any name will do just fine.

4. Wording

This wording is relative to C’s latest working draft.

📝 Editor’s Note: The ✨ characters are intentional. They represent stand-ins to be replaced by the editor.

4.1. OPTIONAL: Modify "Predefined identifiers" (6.4.3.2)

4.2. Add the new keyword __self_func to §6.4.2

Syntax

1 keyword: one of

...

__self_func

4.3. Add __self_func to the primary-expression grammar of §6.5.3.1

Syntax

1 postfix-expression:

primary-expression

postfix-expression [ *expression ]

postfix-expression ( argument-expression-listopt )

postfix-expression . identifier

postfix-expression -> identifier

postfix-expression ++

postfix-expression --

compound-literal

__self_func

4.4. Add a new section §6.5.3.✨ "__self_func

6.5.3.✨ __self_func
Constraints
__self_func shall only appear in the compound statement of a function definition.
Semantics

__self_func is a function designator (6.3.3.1) designating and having the type of the function it is used in.

References

Informative References

[N3315]
Alex Celeste. C Extensions to support Generalized Function Calls, v3.5. August 19th, 2024. URL: https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3315.htm