[SG16-Unicode] P1208R3 / source_location

Axel Naumann Axel.Naumann at cern.ch
Wed Feb 20 01:25:03 CET 2019


Thanks everyone, this is what I'll take to Core.
Axel.

On 19.02.19 13:58, Corentin wrote:
> After talking with Tom, I'd like to modify function_name to be an
> NTMBS as it is something we can actually guarantee and I don't think
> __func__ should constrain the design of source location. It would
> be consistent with the TS and satisfy the NB comment (whose resolution
> was adopted in that direction this morning).
>
> Tom convinced me that filename cannot and should not be an NTMBS.
>
>
> On Tue, 19 Feb 2019 at 13:22 Robert Douglas <rwdougla at gmail.com> wrote:
>
>     Agree.
>
>     On Tue, Feb 19, 2019 at 5:17 PM Tom Honermann <tom at honermann.net> wrote:
>
>         On 2/18/19 1:17 PM, Robert Douglas wrote:
>>         Historical footnote, these are intended to be as drop-in as
>>         possible for existing facilities. __FILE__ is a "character
>>         string literal," which gets its null termination in phase 7.
>>         Since we are accessing these at run-time, we should thus
>>         expect these to be NTBS. Changes to this expectation would be
>>         a deviation from these being a drop-in replacement to
>>         __FILE__ and __func__. Note that [dcl.fct.def.general]
>>          p 8 defines __func__ as an implementation-defined string as
>>         if static const char __func__[] = "function-name "; which
>>         implies, also, an NTBS. This is the reasoning for NTBS. To do
>>         otherwise would make this feature deviate from __FILE__ and
>>         __func__, which it is designed to replace.
>
>         Agreed.  Certainly guaranteeing that these have a null
>         terminator is required given that file_name() returns const
>         char*.  I don't agree with associating these with NTMBSs
>         though since multi-byte has encoding implications.
>
>         Tom.
>
>>
>>
>>         On Mon, Feb 18, 2019 at 11:20 AM Corentin
>>         <corentin.jabot at gmail.com> wrote:
>>
>>             Quick reply: display only, no expectation that the file
>>             can be opened, or exists, or is a file. It's purely
>>             informative. But there is an expectation that it can be
>>             displayed, the main use case being logging. Otherwise I
>>             agree with you.
>>
>>             On Mon, Feb 18, 2019, 7:16 AM Tom Honermann
>>             <tom at honermann.net> wrote:
>>
>>
>>                 On Feb 18, 2019, at 10:04 AM, Corentin
>>                 <corentin.jabot at gmail.com> wrote:
>>
>>>
>>>                 Very good points. 
>>>                 Wouldn't it be sufficient to specify that the
>>>                 strings are NTMBS encoded using the execution
>>>                 character set?
>>>                 source_location currently avoids making any
>>>                 assumption about how these strings are formed,
>>>                 including that they are derived from a source file.
>>>                 So since the value is implementation-defined, so
>>>                 should be the way it's constructed. 
>>>                 However, it is reasonable to assume that these
>>>                 things are valid text and therefore have a known
>>>                 encoding.
>>>
>>>                 Adding Tom, because this is borderline SG16 territory. 
>>
>>                 This isn’t borderline as we have (recently) requested
>>                 review of anything involving file names. 
>>
>>>
>>>
>>>                 @Tom: Do you want to see source_location this week
>>>                 knowing that I'd hope it would get through LWG
>>>                 before the end of the week?
>>>                 Or do you think having function_name / filename as
>>>                 multibyte strings encoded using the execution
>>>                 character set is reasonable?
>>>                 The alternatives I see are
>>>
>>>                   * Leave it unspecified
>>>                   * Force a specific character set... which the
>>>                     world is not ready for
>>>
>>                 I think there is a higher level question to answer.
>>                 Are the provided file names display only, or should
>>                 one expect to be able to open the file using the
>>                 provided name?
>>
>>                 If they are display only, then we can specify an
>>                 encoding for them similarly to what is done for
>>                 member functions of std::filesystem::path. In this
>>                 case, we must explicitly acknowledge that the names
>>                 do not roundtrip through the filesystem (though
>>                 typically will in practice). Note that, on Windows,
>>                 file names cannot be represented accurately using
>>                 char based strings, so unless we want to add wchar_t
>>                 support, these names will be technically display only. 
>>
>>                 If they are potentially not display only, then we
>>                 can’t associate an encoding and the names are
>>                 bags-of-bytes. This is a limitation of POSIX. But
>>                 then we need wchar_t support for Windows. 
>>
>>                 In San Diego, the guidance we gave for the stacktrace
>>                 proposal is that file names are implementation-defined
>>                 bags-of-bytes. If we advised otherwise for
>>                 source location, we would be giving inconsistent
>>                 guidance. 
>>
>>                 I think we should discuss this in SG16 this week. Not
>>                 necessarily to propose changes for the proposal, but
>>                 to solidify our collective thinking around file names. 
>>
>>                 Tom. 
>>>
>>>                 Thanks, 
>>>                 Corentin
>>>
>>>
>>>
>>>                 On Mon, 18 Feb 2019 at 03:56 Axel Naumann
>>>                 <Axel.Naumann at cern.ch> wrote:
>>>
>>>                     Hi Robert,
>>>
>>>                     Regarding your P1208R3:
>>>
>>>                     Nit: it's titled "D1208R3", it doesn't mention
>>>                     email addresses.
>>>
>>>                     Not-so-nit: an NB comment on the reflection TS
>>>                     asks to use NTMBS instead of NTBS, and notes that
>>>                     "Where NTBS is mentioned in the document under
>>>                     ballot, the encoding used for the string’s value
>>>                     is unspecified." Jens agrees that the proposed
>>>                     solution should be applied: "Specify that the
>>>                     strings are first formed using the basic source
>>>                     character set (with universal-character-names as
>>>                     necessary) then mapped in the manner applied to
>>>                     string literals with no encoding prefix in phases
>>>                     5 and 6 of translation."
>>>
>>>                     I would very much hope that both changes are
>>>                     also applied to P1208R3. I
>>>                     call this out explicitly in our recommended NB
>>>                     comment response paper.
>>>
>>>                     Cheers, Axel.
>>>
>
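
For readers following the NTBS argument quoted above, here is a minimal C++ sketch of why a guaranteed null terminator matters once these strings are handed out as const char*. It is not taken from the thread or from P1208R3. It uses only __FILE__ and __func__, the existing facilities the proposal is meant to replace. The names call_site, CURRENT_CALL_SITE, describe and sample_function are hypothetical and exist only for illustration. The sketch treats both strings as display-only bytes and assumes nothing about their encoding.

```cpp
#include <cstring>
#include <string>

// Hypothetical record standing in for the two strings discussed in the thread.
struct call_site {
    const char* file_name;      // from __FILE__, a character string literal
    const char* function_name;  // from __func__, an implementation-defined string
};

// Hypothetical macro: captures __FILE__ and __func__ at the point of expansion.
// It must be expanded inside a function body, because __func__ only exists there.
#define CURRENT_CALL_SITE() (call_site{__FILE__, __func__})

// Formats a call site for display only (for example, in a log line).
// No attempt is made to open the file or to interpret the encoding of the bytes.
std::string describe(const call_site& where) {
    std::string out;
    // std::strlen walks to the terminating '\0'; without a guaranteed
    // terminator this call would have undefined behavior.
    out.reserve(std::strlen(where.file_name) + 2 +
                std::strlen(where.function_name));
    out.append(where.file_name).append(": ").append(where.function_name);
    return out;
}

// __func__ behaves like a function-local static array, so the pointer stored
// in the returned call_site stays valid after this function returns.
call_site sample_function() {
    return CURRENT_CALL_SITE();
}
```

Calling describe(sample_function()) yields the file name, a colon and a space, then whatever string the implementation chose for the function name. The exact contents of both strings are implementation-defined, which is the point under discussion in the thread.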
