Submitter: Fred J. Tydeman
Submission Date: 2017-02-23
Document: WG14 N2115

Problem being solved

Implementations differ in how the preprocessor handles line numbers and __LINE__, and so do some releases of the same implementation.

For most implementations, it is not clear what conventions are being used for line numbering; therefore, one cannot predict what line number will be associated with a given preprocessor token.

This makes it very difficult to write portable code that has consistent behaviour.

I believe that the following are either implicitly undefined or unclear:

  1. value of a __LINE__ token in a preprocessor directive
  2. value of a __LINE__ token in a macro definition
  3. use of __LINE__ token as a parameter in a function-like macro definition
  4. value of __LINE__ exceeds 2147483647
  5. value of a multi-line __LINE__ token
  6. line number associated with a multi-line preprocessor directive
  7. line number associated with a multi-line macro invocation
  8. value of a __LINE__ token argument in a multi-line function-like macro invocation

Do we want to make these defined behaviour, implementation-defined behaviour, unspecified behaviour, or leave them as undefined?

These have shown up in (at least) DRs 173, 464, 483.

DR 464 has: In a distributed development environment, the exact file name passed to the compiler or preprocessor may vary from site to site. It is therefore desirable to be able to set the file name as seen by __FILE__ and elsewhere to a uniform value. The mechanism to do this is the
#line <num> "<string>"
form of the '#line' preprocessor directive. It is also necessary that such a directive leave the line numbering sequence unchanged. Further, it is desirable that edits that change the location of the directive in the source module should not require modification to the directive and that comments embedded in the directive likewise do not have to be accounted for.

Searches of the online literature show that a directive of the form
#line __LINE__ "string"
is expected to have this property.

DR 173 (against C89/C90) raised two questions.

  1. If a __LINE__ token is split over multiple lines, what is its line value?

    An example of that.

    
    #line 100
     int j = __LI\
    NE__;
    
    
  2. If a macro invocation is split over multiple lines, what is its line value?

    An example of that.

    
    #define LINE __LINE__
    #define MAC( a, b ) a, b, __LINE__
    #line 200
      int j[] = { 
        MAC(
            __LINE__,
            LINE
           )
       };
    
    

I have yet to find the committee response to those two questions (it might be in WG14 N333, which I cannot find).

Related to the above is: If __LINE__ (which could come from a macro expansion) is an argument to a macro, what value does it get? Some implementations give it the value of the line it is on, others give it the value of the macro invocation.
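One way to see which convention a given implementation follows is a small probe along the following lines. This is only a sketch: the macro ID, the enumerators probe_name_line and probe_arg_value, and the function report_probe are names made up for illustration. The only thing it assumes about the result is that it lies somewhere between the line of the macro name and the line of the closing parenthesis.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: an identity macro, so that the only open
   question is which line number the argument __LINE__ receives. */
#define ID(x) x

#line 300
enum {
    probe_name_line = __LINE__,   /* presumed line 301: reference point */
    probe_arg_value = ID(         /* macro name on presumed line 302 */
        __LINE__                  /* argument token on presumed line 303 */
    )                             /* closing parenthesis on line 304 */
};

/* Print what this implementation chose.  The assertion is only a sanity
   check that the value falls within the lines the invocation spans. */
void report_probe(void)
{
    assert(probe_arg_value >= 302 && probe_arg_value <= 304);
    printf("reference line: %d, argument __LINE__: %d\n",
           (int)probe_name_line, (int)probe_arg_value);
}
```

Calling report_probe() from a test program shows whether the argument got the value of the line it is on (303) or a value tied to the invocation.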

DR 483 asks the question: What is the line number of a __LINE__ token in a macro replacement list: Is it the line number of the macro definition or of the macro invocation?

An example of that.


#line 500
#define LINE __LINE__

#line 1000
 int j = LINE;         /* is this 500 or 1000? */

The committee response does not explicitly answer the question (but implies it is the line number of the macro invocation); the behaviour is nevertheless still undefined.
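The implied answer can be checked mechanically. The sketch below reuses the DR 483 example and records the result in an enumerator; the names replacement_list_line and replacement_list_uses_invocation_line are illustrative only. The outcome it looks for (1000) is what the committee response implies, not something the current wording guarantees.

```c
#include <assert.h>

#line 500
#define LINE __LINE__

#line 1000
enum { replacement_list_line = LINE };   /* on presumed line 1000 */

/* Returns 1 if __LINE__ in the replacement list took the line of the
   macro invocation, 0 if it took the line of the macro definition.
   The assertion only rules out values DR 483 does not consider. */
int replacement_list_uses_invocation_line(void)
{
    assert(replacement_list_line == 500 || replacement_list_line == 1000);
    return replacement_list_line == 1000;
}
```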

Thinking about the above issues raises another, related one: If a preprocessing directive spans multiple lines, what line number is associated with it? And what line number is associated with a __LINE__ token (which might come from macro expansion) that is part of a preprocessing directive?

An extreme example of that.


#line 2000
\
 \
\
 #\
 \
\
l\
\
i\
\
n\
\
e\
 \
 \
_\
_\
\
L\
\
I\
\
N\
\
E\
_\
_\
\
 \
\
 //
  int k = __LINE__;
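A tamer variant of the same question can be turned into a probe. In the sketch below the directive spans three physical lines, and the line number that follows it depends on which value the implementation gives the __LINE__ token inside the directive. The names line_after_directive and report_directive_probe are hypothetical, and the check assumes only that the result lies between the line of the # (2000) and the line the next source line would have had without the directive (2003).

```c
#include <assert.h>
#include <stdio.h>

/* The directive below is "#line __LINE__" split with backslash-newline:
   the # is on presumed line 2000, "line" on 2001, __LINE__ on 2002. */
#line 2000
#\
line \
__LINE__
enum { line_after_directive = __LINE__ };

/* Print the line number this implementation assigned to the first
   source line after the multi-line directive. */
void report_directive_probe(void)
{
    assert(line_after_directive >= 2000 && line_after_directive <= 2003);
    printf("line after the directive: %d\n", (int)line_after_directive);
}
```

A result of 2000 would mean the __LINE__ token took the line of the #, while 2002 would mean it took the line the token itself is on.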

Another issue is what to do about nested macros. For example, the gcc documentation says:

All the tokens resulting from macro expansion are reported as having appeared on the line of the source file where the outermost macro was used. We intend to be more accurate in the future.

An example of that.


  int line[5];
#line 1000
  assert(               /* 1000 */
   (assert(             /* 1001 */
    (assert(            /* 1002 */
     (assert(           /* 1003 */
      (line[0] = __LINE__, 1004,
       line[1] = __LINE__, 1005)
     ),line[2] = __LINE__, 1006)
    ), line[3] = __LINE__, 1007)
   ),  line[4] = __LINE__, 1008)
  );                    /* 1009: end of outer assert() */

The following is not allowed by 6.10.4#3:


#line 2147483648

but the following is(?) allowed:


#line 2147483647
  printf("%i\n", __LINE__);     /* 2147483647 */
  printf("%i\n", __LINE__);     /* 2147483648 or -2147483648 or ??? */

The status of this code (which uses __LINE__ as a macro parameter) is not clear:


#define macro( __LINE__ ) __LINE__
  int i = macro( -3 );

Solution

Add the following to 6.10.4 Line control:

The presumed line number (footnote 1) of a __LINE__ token (which could come from macro expansion):

(footnote 1) The presumed line number is the same as the physical line number unless changed by a #line directive.

(footnote 2) Required by the assert() macro.

The presumed line number of a preprocessing directive (which could span multiple lines) shall/should be the presumed line number of the # preceding the directive name.

The presumed line number associated with a macro definition (which could span multiple lines) shall/should be the presumed line number of each invocation of that macro.

The presumed line number associated with a macro invocation (which could span multiple lines) shall/should be the presumed line number of the macro name of the invocation.

If the value of __LINE__ exceeds an implementation-defined value (which shall be at least 2147483647), the behaviour shall/should be undefined.

NOTE: The following is implied by the above:

A preprocessing directive of the form
#line __LINE__ "newFilename"
shall/should change the presumed file name without changing the presumed line number.

Example

#line 3000
#line __LINE__ "newFilename"
int k = __LINE__; /* 3000 */

A preprocessing directive of the form
#line newNumber __FILE__
shall/should change the presumed line number without changing the presumed file name.

END NOTE.

Use 'shall' in the above if we want well-defined behaviour; otherwise, use 'should' if we want to continue with implicitly undefined behaviour. With different wording, we could allow for implementation-defined or unspecified behaviour.
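The note gives an example only for the first form. A corresponding sketch for the second form might look as follows. The names file_before, file_after, line_after and second_form_keeps_file_name are hypothetical, and the value 4000 is arbitrary. The check relies on 6.10.4, under which the preprocessing tokens after "line" are macro-replaced before the directive is interpreted, so that __FILE__ becomes a string literal.

```c
#include <assert.h>
#include <string.h>

/* Presumed file name before the directive. */
static const char file_before[] = __FILE__;

#line 4000 __FILE__
enum { line_after = __LINE__ };              /* presumed line 4000 */

/* Presumed file name after the directive. */
static const char file_after[] = __FILE__;

/* Returns 1 if the directive changed the presumed line number to 4000
   and left the presumed file name as it was. */
int second_form_keeps_file_name(void)
{
    assert(line_after == 4000);
    return strcmp(file_before, file_after) == 0;
}
```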

Existing practice

Based upon my testing, each of the above requirements has been implemented by at least one implementation; however, it is not clear whether any implementation does them all.

Feedback from implementors

Great that you are addressing this!

We don't see any problem with the proposal, and we agree that there is some looseness in the current wording that could stand to be addressed.

Sorry, but I don't see any real benefit to users of the C standard to try to nail down the value of __LINE__ beyond what's already there.

Personally, I would be very, very disappointed if the standard nailed down exactly what that program should output. I absolutely want there to be wiggle room.

What is the purpose of having a standard other than to nail down exactly what a program should output? If the standard does provide wiggle room we should tighten it up.

Surely the Standard should be definitive...

If it is believed that the standard doesn't provide a good deal of implementation latitude, I would like to address that by making the requirements looser. Trying to "clarify" the requirements would be a step backward; I don't want to go there at all.

I'm curious, and probably missing the point; but what problem is this solving? In the past WG14 has been really good at providing a parallel "Rationale" for the decisions the committee makes; is there a Rationale statement, or a work item that explains why this is necessary?