Doc. no.: P0417R0
Date: 2016-07-13
Reply to: Beman Dawes <bdawes at acm dot org>
Audience: Core, Library

C++17 should refer to ISO/IEC 10646 2014 instead of 1994

ISO standards are supposed to make normative references only to the latest version of other ISO standards, yet the C++17 CD still refers to ISO/IEC 10646-1:1993, Information technology — Universal Multiple-Octet Coded Character Set (UCS) — Part 1: Architecture and Basic Multilingual Plane.

There have been three revisions and numerous amendments of 10646 since 1994. The changes that impact the C++17 CD include:

See http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html for a copy of ISO/IEC 10646:2014.

Proposed changes

Strike the wording highlighted in red and add the wording highlighted in green.

1.2 Normative references [intro.refs]

— ISO/IEC 10646[strike: -1:1993, Information technology — Universal Multiple-Octet Coded Character Set (UCS) — Part 1: Architecture and Basic Multilingual Plane][add: :2014, Information technology — Universal Coded Character Set (UCS)]

22.5 Standard code conversion facets [locale.stdcvt]

For the facet codecvt_utf8:

— The facet shall convert between UTF-8 multibyte sequences and [strike: UCS2][add: UTF-16] or [strike: UCS4][add: UTF-32] (depending on the size of Elem) within the program.

...

For the facet codecvt_utf16:

— The facet shall convert between UTF-16 multibyte sequences and [strike: UCS2][add: UTF-16] or [strike: UCS4][add: UTF-32] (depending on the size of Elem) within the program.
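
The practical difference between the struck and added encoding names appears for code points above U+FFFF: UCS-2 has no representation for them, while UTF-16 encodes each one as a surrogate pair. The following is a minimal sketch of that encoding step, assuming a C++14 or later compiler. The names Utf16Units and encode_utf16 are hypothetical, chosen for illustration only; they are not part of the facets or of the proposed wording.

```cpp
// Illustrative sketch only: Utf16Units and encode_utf16 are hypothetical
// names, not part of <codecvt> or of the proposed wording.
struct Utf16Units {
    char16_t unit[2];  // UTF-16 code units, in order
    int count;         // number of valid entries: 1 or 2, or 0 if not encodable
};

constexpr Utf16Units encode_utf16(char32_t cp) {
    // Surrogate code points and values above U+10FFFF cannot be encoded.
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return Utf16Units{{0, 0}, 0};

    // Basic Multilingual Plane: one code unit, identical to the UCS-2 value.
    if (cp < 0x10000)
        return Utf16Units{{static_cast<char16_t>(cp), 0}, 1};

    // Supplementary planes: split the 20-bit offset into a surrogate pair.
    return Utf16Units{{static_cast<char16_t>(0xD800 + ((cp - 0x10000) >> 10)),
                       static_cast<char16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF))},
                      2};
}

// U+0041 needs a single code unit; U+1F600 needs the pair D83D DE00.
static_assert(encode_utf16(0x0041).count == 1, "BMP code point: one unit");
static_assert(encode_utf16(0x1F600).count == 2, "supplementary: two units");
```

Under this sketch, a 16-bit Elem can carry any code point up to U+10FFFF as UTF-16, which a strict UCS-2 reading would not allow.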

E.1 Ranges of characters allowed [charname.allowed]

10000-1FFFD, 20000-2FFFD, 30000-3FFFD, 40000-4FFFD, 50000-5FFFD,
    60000-6FFFD, 70000-7FFFD, 80000-8FFFD, 90000-9FFFD, A0000-AFFFD,
    B0000-BFFFD, C0000-CFFFD, D0000-DFFFD, E0000-EFFFD, F0000-FFFFD,
    100000-10FFFD
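
The ranges shown here follow a regular pattern: every code point in planes 1 through 16 is included except the last two of each plane (xFFFE and xFFFF). A compact check for just these ranges might look like the sketch below. The name in_listed_supplementary_range is hypothetical, and the function deliberately ignores any other ranges that [charname.allowed] lists.

```cpp
// Illustrative sketch only: the function name is hypothetical and the check
// covers only the supplementary-plane ranges 10000-1FFFD ... 100000-10FFFD.
constexpr bool in_listed_supplementary_range(char32_t cp) {
    return cp >= 0x10000          // at or above the start of plane 1
        && cp <= 0x10FFFF         // at or below the end of plane 16
        && (cp & 0xFFFF) <= 0xFFFD;  // not one of the plane's last two code points
}

static_assert(in_listed_supplementary_range(0x10000), "start of first range");
static_assert(in_listed_supplementary_range(0x10FFFD), "end of last range");
static_assert(!in_listed_supplementary_range(0x1FFFE), "gap between ranges");
```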