______________________________________________________________________

16 Preprocessing directives [cpp]

______________________________________________________________________

1 A preprocessing directive consists of a sequence of preprocessing tokens. The first token in the sequence is a # preprocessing token that is either the first character in the source file (optionally after white space containing no new-line characters) or that follows white space containing at least one new-line character. The last token in the sequence is the first new-line character that follows the first token in the sequence.1)

_________________________
1) Thus, preprocessing directives are commonly called "lines." These "lines" have no other syntactic significance, as all white space is equivalent except in certain situations during preprocessing (see the # character string literal creation operator in _cpp.stringize_, for example).

        preprocessing-file:
                groupopt
        group:
                group-part
                group group-part
        group-part:
                pp-tokensopt new-line
                if-section
                control-line
        if-section:
                if-group elif-groupsopt else-groupopt endif-line
        if-group:
                # if     constant-expression new-line groupopt
                # ifdef  identifier new-line groupopt
                # ifndef identifier new-line groupopt
        elif-groups:
                elif-group
                elif-groups elif-group
        elif-group:
                # elif   constant-expression new-line groupopt
        else-group:
                # else   new-line groupopt
        endif-line:
                # endif  new-line
        control-line:
                # include pp-tokens new-line
                # define  identifier replacement-list new-line
                # define  identifier lparen identifier-listopt ) replacement-list new-line
                # undef   identifier new-line
                # line    pp-tokens new-line
                # error   pp-tokensopt new-line
                # pragma  pp-tokensopt new-line
                #         new-line
        lparen:
                the left-parenthesis character without preceding white-space
        replacement-list:
                pp-tokensopt
        pp-tokens:
                preprocessing-token
                pp-tokens preprocessing-token
        new-line:
                the new-line character

2 The only white-space characters that shall appear between preprocessing tokens within a preprocessing directive (from just after the introducing # preprocessing token through just before the terminating new-line character) are space and horizontal-tab (including spaces that have replaced comments or possibly other white-space characters in translation phase 3).

3 The implementation can process and skip sections of source files conditionally, include other source files, and replace macros. These capabilities are called preprocessing, because conceptually they occur before translation of the resulting translation unit.

4 The preprocessing tokens within a preprocessing directive are not subject to macro expansion unless otherwise stated.

16.1 Conditional inclusion [cpp.cond]

1 The expression that controls conditional inclusion shall be an integral constant expression except that: it shall not contain a cast; identifiers (including those lexically identical to keywords) are interpreted as described below;2) and it may contain unary operator expressions of the form
        defined identifier
  or
        defined ( identifier )
  which evaluate to 1 if the identifier is currently defined as a macro name (that is, if it is predefined or if it has been the subject of a #define preprocessing directive without an intervening #undef directive with the same subject identifier), zero if it is not.
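
As an informal illustration (not part of the draft text), the two forms of the defined operator and the effect of an intervening #undef can be sketched as follows. The macro names FEATURE_A and FEATURE_B, and every other name in the block, are assumptions invented for the sketch.

```cpp
#include <cassert>

// FEATURE_A and FEATURE_B are hypothetical macro names invented for this sketch.
#define FEATURE_A

#if defined FEATURE_A            // form: defined identifier
#define HAS_A 1
#else
#define HAS_A 0
#endif

#if defined(FEATURE_B)           // form: defined ( identifier ); never defined here
#define HAS_B 1
#else
#define HAS_B 0
#endif

#undef FEATURE_A                 // an intervening #undef ends the definition

#if defined(FEATURE_A)
#define HAS_A_AFTER_UNDEF 1
#else
#define HAS_A_AFTER_UNDEF 0
#endif

const int has_a = HAS_A;
const int has_b = HAS_B;
const int has_a_after_undef = HAS_A_AFTER_UNDEF;

// Runtime restatement of the choices made during preprocessing.
void check_defined_operator() {
    assert(has_a == 1);              // FEATURE_A was defined when tested
    assert(has_b == 0);              // FEATURE_B was never defined
    assert(has_a_after_undef == 0);  // FEATURE_A was undefined before the last test
}
```

A macro with an empty replacement list, such as FEATURE_A here, still counts as defined; only #undef removes it.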
_________________________
2) Because the controlling constant expression is evaluated during translation phase 4, all identifiers either are or are not macro names -- there simply are no keywords, enumeration constants, and so on.

2 Each preprocessing token that remains after all macro replacements have occurred shall be in the lexical form of a token (_lex.token_).

3 Preprocessing directives of the forms
        # if   constant-expression new-line groupopt
        # elif constant-expression new-line groupopt
  check whether the controlling constant expression evaluates to nonzero.

4 Prior to evaluation, macro invocations in the list of preprocessing tokens that will become the controlling constant expression are replaced (except for those macro names modified by the defined unary operator), just as in normal text. If the token defined is generated as a result of this replacement process or use of the defined unary operator does not match one of the two specified forms prior to macro replacement, the behavior is undefined. After all replacements due to macro expansion and the defined unary operator have been performed, all remaining identifiers and keywords3), except for true and false, are replaced with the pp-number 0, and then each preprocessing token is converted into a token. The resulting tokens comprise the controlling constant expression which is evaluated according to the rules of _expr.const_ using arithmetic that has at least the ranges specified in _lib.support.limits_, except that int and unsigned int act as if they have the same representation as, respectively, long and unsigned long. This includes interpreting character literals, which may involve converting escape sequences into execution character set members.
  Whether the numeric value for these character literals matches the value obtained when an identical character literal occurs in an expression (other than within a #if or #elif directive) is implementation-defined.4) Also, whether a single-character character literal may have a negative value is implementation-defined. Each subexpression with type bool is subjected to integral promotion before processing continues.

5 Preprocessing directives of the forms
        # ifdef  identifier new-line groupopt
        # ifndef identifier new-line groupopt
  check whether the identifier is or is not currently defined as a macro name. Their conditions are equivalent to #if defined identifier and #if !defined identifier respectively.

6 Each directive's condition is checked in order. If it evaluates to false (zero), the group that it controls is skipped: directives are processed only through the name that determines the directive in order to keep track of the level of nested conditionals; the rest of the directives' preprocessing tokens are ignored, as are the other preprocessing tokens in the group. Only the first group whose control condition evaluates to true (nonzero) is processed. If none of the conditions evaluates to true, and there is a #else directive, the group controlled by the #else is processed; lacking a #else directive, all the groups until the #endif are skipped.5)

_________________________
3) An alternative token (_lex.digraph_) is not an identifier, even when its spelling consists entirely of letters and underscores. Therefore it is not subject to this replacement.
4) Thus, the constant expression in the following #if directive and if statement is not guaranteed to evaluate to the same value in these two contexts.
        #if 'z' - 'a' == 25
        if ('z' - 'a' == 25)

16.2 Source file inclusion [cpp.include]

1 A #include directive shall identify a header or source file that can be processed by the implementation.

2 A preprocessing directive of the form
        # include <h-char-sequence> new-line
  searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.

3 A preprocessing directive of the form
        # include "q-char-sequence" new-line
  causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
        # include <h-char-sequence> new-line
  with the identical contained sequence (including > characters, if any) from the original directive.

4 A preprocessing directive of the form
        # include pp-tokens new-line
  (that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text (each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens). If the directive resulting after all replacements does not match one of the two previous forms, the behavior is undefined.6) The method by which a sequence of preprocessing tokens between a < and a > preprocessing token pair or a pair of " characters is combined into a single header name preprocessing token is implementation-defined.

_________________________
5) As indicated by the syntax, a preprocessing token shall not follow a #else or #endif directive before the terminating new-line character. However, comments may appear anywhere in a source file, including within a preprocessing directive.
6) Note that adjacent string literals are not concatenated into a single string literal (see the translation phases in _lex.phases_); thus, an expansion that results in two string literals is an invalid directive.

5 The mapping between the delimited sequence and the external source file name is implementation-defined. The implementation provides unique mappings for sequences consisting of one or more nondigits (_lex.name_) followed by a period (.) and a single nondigit. The implementation may ignore the distinctions of alphabetical case.

6 A #include preprocessing directive may appear in a source file that has been read because of a #include directive in another file, up to an implementation-defined nesting limit.

7 [Example: The most common uses of #include preprocessing directives are as in the following:
        #include <stdio.h>
        #include "myprog.h"
  --end example]

8 [Example: Here is a macro-replaced #include directive:
        #if VERSION == 1
                #define INCFILE "vers1.h"
        #elif VERSION == 2
                #define INCFILE "vers2.h"   /* and so on */
        #else
                #define INCFILE "versN.h"
        #endif
        #include INCFILE
  --end example]

16.3 Macro replacement [cpp.replace]

1 Two replacement lists are identical if and only if the preprocessing tokens in both have the same number, ordering, spelling, and white-space separation, where all white-space separations are considered identical.

2 An identifier currently defined as a macro without use of lparen (an object-like macro) may be redefined by another #define preprocessing directive provided that the second definition is an object-like macro definition and the two replacement lists are identical, otherwise the program is ill-formed.

3 An identifier currently defined as a macro using lparen (a function-like macro) may be redefined by another #define preprocessing directive provided that the second definition is a function-like macro definition that has the same number and spelling of parameters, and the two replacement lists are identical, otherwise the program is ill-formed.

4 The number of arguments in an invocation of a function-like macro shall agree with the number of parameters in the macro definition, and there shall exist a ) preprocessing token that terminates the invocation.

5 A parameter identifier in a function-like macro shall be uniquely declared within its scope.

6 The identifier immediately following the define is called the macro name. There is one name space for macro names. Any white-space characters preceding or following the replacement list of preprocessing tokens are not considered part of the replacement list for either form of macro.

7 If a # preprocessing token, followed by an identifier, occurs lexically at the point at which a preprocessing directive could begin, the identifier is not subject to macro replacement.

8 A preprocessing directive of the form
        # define identifier replacement-list new-line
  defines an object-like macro that causes each subsequent instance of the macro name7) to be replaced by the replacement list of preprocessing tokens that constitute the remainder of the directive.8) The replacement list is then rescanned for more macro names as specified below.

9 A preprocessing directive of the form
        # define identifier lparen identifier-listopt ) replacement-list new-line
  defines a function-like macro with parameters, similar syntactically to a function call. The parameters are specified by the optional list of identifiers, whose scope extends from their declaration in the identifier list until the new-line character that terminates the #define preprocessing directive.
  Each subsequent instance of the function-like macro name followed by a ( as the next preprocessing token introduces the sequence of preprocessing tokens that is replaced by the replacement list in the definition (an invocation of the macro). The replaced sequence of preprocessing tokens is terminated by the matching ) preprocessing token, skipping intervening matched pairs of left and right parenthesis preprocessing tokens. Within the sequence of preprocessing tokens making up an invocation of a function-like macro, new-line is considered a normal white-space character.

10 The sequence of preprocessing tokens bounded by the outside-most matching parentheses forms the list of arguments for the function-like macro. The individual arguments within the list are separated by comma preprocessing tokens, but comma preprocessing tokens between matching inner parentheses do not separate arguments. If (before argument substitution) any argument consists of no preprocessing tokens, the behavior is undefined. If there are sequences of preprocessing tokens within the list of arguments that would otherwise act as preprocessing directives, the behavior is undefined.

_________________________
7) Since, by macro-replacement time, all character literals and string literals are preprocessing tokens, not sequences possibly containing identifier-like subsequences (see the translation phases in _lex.phases_), they are never scanned for macro names or parameters.
8) An alternative token (_lex.digraph_) is not an identifier, even when its spelling consists entirely of letters and underscores. Therefore it is not possible to define a macro whose name is the same as that of an alternative token.

16.3.1 Argument substitution [cpp.subst]

1 After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place.
  A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument's preprocessing tokens are completely macro replaced as if they formed the rest of the translation unit; no other preprocessing tokens are available.

16.3.2 The # operator [cpp.stringize]

1 Each # preprocessing token in the replacement list for a function-like macro shall be followed by a parameter as the next preprocessing token in the replacement list.

2 If, in the replacement list, a parameter is immediately preceded by a # preprocessing token, both are replaced by a single character string literal preprocessing token that contains the spelling of the preprocessing token sequence for the corresponding argument. Each occurrence of white space between the argument's preprocessing tokens becomes a single space character in the character string literal. White space before the first preprocessing token and after the last preprocessing token comprising the argument is deleted. Otherwise, the original spelling of each preprocessing token in the argument is retained in the character string literal, except for special handling for producing the spelling of string literals and character literals: a \ character is inserted before each " and \ character of a character literal or string literal (including the delimiting " characters). If the replacement that results is not a valid character string literal, the behavior is undefined. The order of evaluation of # and ## operators is unspecified.

16.3.3 The ## operator [cpp.concat]

1 A ## preprocessing token shall not occur at the beginning or at the end of a replacement list for either form of macro definition.
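
As an informal illustration (not part of the draft text), the following sketch exercises the # operator of _cpp.stringize_ together with a simple use of the ## operator that this subclause specifies. The names STR, XSTR, GLUE and LIMIT are assumptions invented for the sketch.

```cpp
#include <cassert>
#include <cstring>

// STR, XSTR, GLUE and LIMIT are hypothetical names invented for this sketch.
#define STR(s)      #s          // stringize the argument as spelled
#define XSTR(s)     STR(s)      // expand the argument first, then stringize
#define GLUE(a, b)  a##b        // concatenate two preprocessing tokens
#define LIMIT       42

const char* const as_spelled = STR(LIMIT);         // parameter preceded by #: no expansion
const char* const expanded   = XSTR(LIMIT);        // argument expanded before STR is applied
const char* const spaced     = STR(  a   +   b );  // inner white space collapses, outer is deleted
const char* const quoted     = STR("hi\n");        // \ inserted before each " and \ of the literal

int GLUE(val, ue) = 7;          // declares an int named value

void check_operators() {
    assert(std::strcmp(as_spelled, "LIMIT") == 0);
    assert(std::strcmp(expanded, "42") == 0);
    assert(std::strcmp(spaced, "a + b") == 0);
    assert(std::strcmp(quoted, "\"hi\\n\"") == 0);
    assert(value == 7);
}
```

The STR/XSTR pair shows why a second macro level is the usual way to stringize the expansion of an argument rather than its spelling.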

2 If, in the replacement list, a parameter is immediately preceded or followed by a ## preprocessing token, the parameter is replaced by the corresponding argument's preprocessing token sequence.

3 For both object-like and function-like macro invocations, before the replacement list is reexamined for more macro names to replace, each instance of a ## preprocessing token in the replacement list (not from an argument) is deleted and the preceding preprocessing token is concatenated with the following preprocessing token. If the result is not a valid preprocessing token, the behavior is undefined. The resulting token is available for further macro replacement. The order of evaluation of ## operators is unspecified.

16.3.4 Rescanning and further replacement [cpp.rescan]

1 After all parameters in the replacement list have been substituted, the resulting preprocessing token sequence is rescanned with all subsequent preprocessing tokens of the source file for more macro names to replace.

2 If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file's preprocessing tokens), it is not replaced. Further, if any nested replacements encounter the name of the macro being replaced, it is not replaced. These nonreplaced macro name preprocessing tokens are no longer available for further replacement even if they are later (re)examined in contexts in which that macro name preprocessing token would otherwise have been replaced.

3 The resulting completely macro-replaced preprocessing token sequence is not processed as a preprocessing directive even if it resembles one.

16.3.5 Scope of macro definitions [cpp.scope]

1 A macro definition lasts (independent of block structure) until a corresponding #undef directive is encountered or (if none is encountered) until the end of the translation unit.
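
As an informal illustration (not part of the draft text), the sketch below shows that a macro defined inside a function body remains defined after the closing brace, and stops being defined only at a #undef. The name LOCAL_LIMIT and the function names are assumptions invented for the sketch.

```cpp
#include <cassert>

// LOCAL_LIMIT is a hypothetical macro name invented for this sketch.
int inside_block() {
#define LOCAL_LIMIT 10          // the directive appears inside a function body
    return LOCAL_LIMIT;
}

int after_block() {
    return LOCAL_LIMIT;         // still defined: block structure is irrelevant
}

#undef LOCAL_LIMIT              // the definition ends here

#ifdef LOCAL_LIMIT
const bool defined_after_undef = true;
#else
const bool defined_after_undef = false;
#endif

void check_macro_scope() {
    assert(inside_block() == 10);
    assert(after_block() == 10);
    assert(!defined_after_undef);
}
```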

2 A preprocessing directive of the form
        # undef identifier new-line
  causes the specified identifier no longer to be defined as a macro name. It is ignored if the specified identifier is not currently defined as a macro name.

3 [Note: The simplest use of this facility is to define a "manifest constant," as in
        #define TABSIZE 100
        int table[TABSIZE];

4 The following defines a function-like macro whose value is the maximum of its arguments. It has the advantages of working for any compatible types of the arguments and of generating in-line code without the overhead of function calling. It has the disadvantages of evaluating one or the other of its arguments a second time (including side effects) and generating more code than a function if invoked several times. It also cannot have its address taken, as it has none.
        #define max(a, b) ((a) > (b) ? (a) : (b))
  The parentheses ensure that the arguments and the resulting expression are bound properly.

5 To illustrate the rules for redefinition and reexamination, the sequence
        #define x    3
        #define f(a) f(x * (a))
        #undef  x
        #define x    2
        #define g    f
        #define z    z[0]
        #define h    g(~
        #define m(a) a(w)
        #define w    0,1
        #define t(a) a

        f(y+1) + f(f(z)) % t(t(g)(0) + t)(1);
        g(x+(3,4)-w) | h 5) & m
                (f)^m(m);
  results in
        f(2 * (y+1)) + f(2 * (f(2 * (z[0])))) % f(2 * (0)) + t(1);
        f(2 * (2+(3,4)-0,1)) | f(2 * (~5)) & f(2 * (0,1))^m(0,1);

6 To illustrate the rules for creating character string literals and concatenating tokens, the sequence
        #define str(s)      # s
        #define xstr(s)     str(s)
        #define debug(s, t) printf("x" # s "= %d, x" # t "= %s", \
                                        x ## s, x ## t)
        #define INCFILE(n)  vers ## n /* from previous #include example */
        #define glue(a, b)  a ## b
        #define xglue(a, b) glue(a, b)
        #define HIGHLOW     "hello"
        #define LOW         LOW ", world"

        debug(1, 2);
        fputs(str(strncmp("abc\0d", "abc", '\4') /* this goes away */
                == 0) str(: @\n), s);
        #include xstr(INCFILE(2).h)
        glue(HIGH, LOW);
        xglue(HIGH, LOW)
  results in
        printf("x" "1" "= %d, x" "2" "= %s", x1, x2);
        fputs("strncmp(\"abc\\0d\", \"abc\", '\\4') == 0" ": @\n", s);
        #include "vers2.h"    (after macro replacement, before file access)
        "hello";
        "hello" ", world"
  or, after concatenation of the character string literals,
        printf("x1= %d, x2= %s", x1, x2);
        fputs("strncmp(\"abc\\0d\", \"abc\", '\\4') == 0: @\n", s);
        #include "vers2.h"    (after macro replacement, before file access)
        "hello";
        "hello, world"
  Space around the # and ## tokens in the macro definition is optional.

7 And finally, to demonstrate the redefinition rules, the following sequence is valid.
        #define OBJ_LIKE      (1-1)
        #define OBJ_LIKE      /* white space */ (1-1) /* other */
        #define FTN_LIKE(a)   ( a )
        #define FTN_LIKE( a )(          /* note the white space */ \
                                a /* other stuff on this line */ )
  But the following redefinitions are invalid:
        #define OBJ_LIKE      (0)       /* different token sequence */
        #define OBJ_LIKE      (1 - 1)   /* different white space */
        #define FTN_LIKE(b)   ( a )     /* different parameter usage */
        #define FTN_LIKE(b)   ( b )     /* different parameter spelling */
  --end note]

16.4 Line control [cpp.line]

1 The string literal of a #line directive, if present, shall be a character string literal.

2 The line number of the current source line is one greater than the number of new-line characters read or introduced in translation phase 1 (_lex.phases_) while processing the source file to the current token.

3 A preprocessing directive of the form
        # line digit-sequence new-line
  causes the implementation to behave as if the following sequence of source lines begins with a source line that has a line number as specified by the digit sequence (interpreted as a decimal integer). If the digit sequence specifies zero or a number greater than 32767, the behavior is undefined.

4 A preprocessing directive of the form
        # line digit-sequence "s-char-sequenceopt" new-line
  sets the line number similarly and changes the presumed name of the source file to be the contents of the character string literal.
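
As an informal illustration (not part of the draft text), the effect of the second form of #line on __LINE__ and __FILE__ can be sketched as follows. The line number 100 and the file name "renamed.cpp" are arbitrary values chosen for the sketch.

```cpp
#include <cassert>
#include <cstring>

// The number 100 and the name "renamed.cpp" are arbitrary values chosen for this sketch.
#line 100 "renamed.cpp"
const int first_line = __LINE__;             // the line after the directive is line 100
const int second_line = __LINE__;            // numbering continues from there
const char* const presumed_name = __FILE__;  // the presumed source file name

void check_line_control() {
    assert(first_line == 100);
    assert(second_line == 101);
    assert(std::strcmp(presumed_name, "renamed.cpp") == 0);
}
```

Diagnostics issued for the lines that follow the directive would normally report the presumed name and renumbered lines as well, which is why code generators commonly emit #line.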

5 A preprocessing directive of the form
        # line pp-tokens new-line
  (that does not match one of the two previous forms) is permitted. The preprocessing tokens after line on the directive are processed just as in normal text (each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens). If the directive resulting after all replacements does not match one of the two previous forms, the behavior is undefined; otherwise, the result is processed as appropriate.

16.5 Error directive [cpp.error]

1 A preprocessing directive of the form
        # error pp-tokensopt new-line
  causes the implementation to produce a diagnostic message that includes the specified sequence of preprocessing tokens, and renders the program ill-formed.

16.6 Pragma directive [cpp.pragma]

1 A preprocessing directive of the form
        # pragma pp-tokensopt new-line
  causes the implementation to behave in an implementation-defined manner. Any pragma that is not recognized by the implementation is ignored.

16.7 Null directive [cpp.null]

1 A preprocessing directive of the form
        # new-line
  has no effect.

16.8 Predefined macro names [cpp.predefined]

1 The following macro names shall be defined by the implementation:

  __LINE__
        The line number of the current source line (a decimal constant).

  __FILE__
        The presumed name of the source file (a character string literal).

  __DATE__
        The date of translation of the source file (a character string literal of the form "Mmm dd yyyy", where the names of the months are the same as those generated by the asctime function, and the first character of dd is a space character if the value is less than 10). If the date of translation is not available, an implementation-defined valid date is supplied.

  __TIME__
        The time of translation of the source file (a character string literal of the form "hh:mm:ss" as in the time generated by the asctime function).
        If the time of translation is not available, an implementation-defined valid time is supplied.

  __STDC__
        Whether __STDC__ is predefined and if so, what its value is, are implementation-defined.

  __cplusplus
        The name __cplusplus is defined to the value 199711L when compiling a C++ translation unit.9)

+------- BEGIN BOX 1 -------+
The date is intended to specify the expected official date of the standard. The Project Editor will revise the date as needed.
+------- END BOX 1 -------+

_________________________
9) It is intended that future versions of this standard will replace the value of this macro with a greater value. Non-conforming compilers should use a value with at most five decimal digits.

2 The values of the predefined macros (except for __LINE__ and __FILE__) remain constant throughout the translation unit.

3 If any of the predefined macro names in this subclause, or the identifier defined, is the subject of a #define or a #undef preprocessing directive, the behavior is undefined.
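
As an informal illustration (not part of the draft text), the predefined macro names can be inspected as in the sketch below. It assumes an implementation that follows this subclause; all variable and function names are invented for the sketch, and no particular date or time value is assumed.

```cpp
#include <cassert>
#include <cstring>

// All names below are hypothetical and exist only for this sketch.
const long language_version = __cplusplus;   // 199711L in this draft; later revisions use greater values
const char* const build_date = __DATE__;     // form "Mmm dd yyyy"
const char* const build_time = __TIME__;     // form "hh:mm:ss"

int line_here()  { return __LINE__; }
int line_below() { return __LINE__; }        // exactly one source line after line_here

void check_predefined_macros() {
    assert(language_version >= 199711L);
    assert(std::strlen(build_date) == 11);   // three-letter month, space, dd, space, yyyy
    assert(std::strlen(build_time) == 8);
    assert(build_time[2] == ':' && build_time[5] == ':');
    assert(line_below() == line_here() + 1); // __LINE__ changes within the translation unit
}
```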