[SG16-Unicode] Abstract and notes for D1859R0: Standard terminology for execution character set encodings

Corentin Jabot corentinjabot at gmail.com
Mon Sep 9 20:48:03 CEST 2019


Character Repertoire. The collection of characters included in a character
set.
Character Set. A collection of elements used to represent textual
information.
Coded Character Set. A character set in which each character is assigned a
numeric code point. Frequently abbreviated as character set, charset, or
code set; the acronym CCS is also used.
Abstract Character. A unit of information used for the organization,
control, or representation of textual data.

I will admit I am confused. It's either Character Set or Character
Repertoire.
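
For what it's worth, here is a rough C++ sketch of the distinction as Tom
describes it in the quoted message below: a repertoire is just a set of
abstract characters, and a coded character set additionally assigns each of
them a code point within a codespace. All type and function names are
hypothetical and made up for illustration; none of them come from D1859R0 or
from Unicode. The code point values are the usual ASCII and EBCDIC (code
page 037) ones.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical model, for illustration only.
// An abstract character is identified here by a descriptive name.
using AbstractCharacter = std::string;
// A repertoire is a plain set of abstract characters: no numbers involved.
using CharacterRepertoire = std::set<AbstractCharacter>;
using CodePoint = std::uint32_t;

// A coded character set maps each member of a repertoire to a code point
// inside a codespace, modeled here as the range [0, codespace_max].
struct CodedCharacterSet {
    CodePoint codespace_max;
    std::map<AbstractCharacter, CodePoint> assignments;

    void assign(const AbstractCharacter& ch, CodePoint cp) {
        assert(cp <= codespace_max);  // code points must lie in the codespace
        assignments[ch] = cp;
    }

    // The repertoire is what remains once the code points are dropped.
    CharacterRepertoire repertoire() const {
        CharacterRepertoire r;
        for (const auto& entry : assignments) r.insert(entry.first);
        return r;
    }
};

// Two tiny samples covering the same two abstract characters.
inline CodedCharacterSet ascii_sample() {
    CodedCharacterSet ccs{0x7F, {}};
    ccs.assign("LATIN CAPITAL LETTER A", 0x41);
    ccs.assign("DIGIT ZERO", 0x30);
    return ccs;
}

inline CodedCharacterSet ebcdic_sample() {
    CodedCharacterSet ccs{0xFF, {}};
    ccs.assign("LATIN CAPITAL LETTER A", 0xC1);
    ccs.assign("DIGIT ZERO", 0xF0);
    return ccs;
}
```

Under this model, ascii_sample() and ebcdic_sample() have the same
repertoire but are different coded character sets, which is the distinction
that "character set" on its own tends to blur.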



On Mon, 9 Sep 2019 at 20:37, Zach Laine <whatwasthataddress at gmail.com>
wrote:

> On Sun, Sep 8, 2019 at 8:16 PM Tom Honermann <tom at honermann.net> wrote:
>
>> On 9/8/19 12:02 PM, Steve Downey wrote:
>>
>> Character repertoire sounds good, and I will eventually learn to spell
>> it. Character set is definitely terminology from the pre-Unicode times, and
>> unfortunately tends to merge the repertoire and encoding,
>> https://www.iana.org/assignments/character-sets/character-sets.xhtml
>>
>> I think I was a little overzealous earlier in stating that Unicode uses
>> "character repertoire" as I described.  I looked again and don't find that
>> term formally defined in the standard.  However, "repertoire" is used
>> throughout the standard in ways that I believe are consistent with my
>> description.  I wasn't able to find an alternative formal term.
>>
> I fully endorse overzealousness as applied to Unicode discussions.
>
>> The way I've been thinking about it is that a "character repertoire"
>> describes a set of *abstract characters* (a formal Unicode term) and a
>> "character set" describes a set of *encoded characters* (a formal
>> Unicode term) that associate each *abstract character* member of a
>> "character repertoire" with a *code point* (a formal Unicode term)
>> within a *codespace* (a formal Unicode term).  See sections 2.4 and 3.4
>> of Unicode 12 and uses of the word "repertoire" within those chapters.  The
>> Unicode standard does use the term "character set", but I didn't find a
>> formal definition.
>>
> I think I follow, except that I don't see whether there is a distinction
> between "character repertoire" and "abstract characters".  Is there?  I'm
> asking because if there is not, I'd prefer to standardize the formally
> described term, which it sounds like is "abstract characters".
>
> Zach
>