JTC1/SC22
N2768
Date: Mon, 20 Jul 1998 17:23:32 -0400 (EDT)
From: "william c. rinehuls" <rinehuls@access.digex.net>
To: sc22docs@dkuug.dk
Subject: SC22 N2768 - Minutes of WG20 June Meeting
__________________ beginning of title page _________________________
ISO/IEC JTC 1/SC22
Programming languages, their environments and system software interfaces
Secretariat: U.S.A. (ANSI)
ISO/IEC JTC 1/SC22
N2768
TITLE:
Minutes of SC22/WG20 (Internationalization) Meeting on June 15-19, 1998 in
Dublin, Ireland
DATE ASSIGNED:
1998-07-20
SOURCE:
Secretariat, ISO/IEC JTC 1/SC22
BACKWARD POINTER:
N/A
DOCUMENT TYPE:
Minutes of WG20 Meeting
PROJECT NUMBER:
N/A
STATUS:
N/A
ACTION IDENTIFIER:
FYI
DUE DATE:
N/A
DISTRIBUTION:
Text
CROSS REFERENCE:
SC22 N2762
DISTRIBUTION FORM:
Open
Address reply to:
ISO/IEC JTC 1/SC22 Secretariat
William C. Rinehuls
8457 Rushing Creek Court
Springfield, VA 22153 USA
Telephone: +1 (703) 912-9680
Fax: +1 (703) 912-2973
email: rinehuls@access.digex.net
__________________ end of title page; beginning of minutes _________
MINUTES - DUBLIN, June 15-19, 1998
ISO/IEC JTC 1 SC22/WG20
Meeting #14 - Internationalization
July 20, 1998
1. Introduction and announcements by Convenor
* New people: Takata, Whistler, Clews, Yamanaka, Küster
* finding consensus
* additional agenda point for convenor's report to the SC22 plenary -
15.3, N566
* clarification on electronic document distribution: Word 2.0-6.0
* Miles Ellis - PDF converter: In future you should connect to
com1.etrc.ox.ac.uk using the user name SC22 and the password JTC1SC22
(case sensitive) rather than using anonymous ftp. Otherwise the
procedure remains the same except that, since this service is relatively
little used I would be grateful if you would let me know when you have
sent any file(s) in order to ensure that I run Distiller. If you don't
let me know then I may not realise and may not run Distiller for several
days!
* Timing: discuss sort after Tuesday to allow Clews to participate
* Please update the distribution list
2. Introduction of national delegations, liaisons, and cooperations
Clews, John        UK        Sesame (6/16/98)
Everson, Michael   Ireland   ETG
Fujimura, Koreaki  Japan     Electrotechnical Laboratory
Garland, Tom       Ireland   Sun (98-06-18 only)
Küster, Marc       Germany   Uni Tübingen
LaBonté, Alain     Canada    Trésor du Québec
Sherif, Khaled     Egypt     IBM
Simonsen, Keld     Denmark   DKUUG
Soor, Baldev       Canada    IBM
Takata, Masayuki   Japan     Edogawa University
Whistler, Ken      USA       Sybase
Winkler, Arnold    (USA)     Unisys, convener
Yamanaka, Gail     USA       Oracle
3. Appointment of chairperson, secretary, and drafting committee
Chair: Winkler
Secretary: Winkler
Drafting committee: Simonsen, Fujimura, Whistler, Everson
all approved as presented
4. Approval of prior meeting's minutes
544  Minutes - Cairo, November 1997                    Winkler          97-11-20   admin
Minutes are approved.
5. Future Meeting Schedule and Plans
#15  October 18-22, 1998 (changed)  Tel Aviv    Israel
#16  May 3-7, 1999                  Malvern     USA
#17  tbd                            Copenhagen  Denmark
#18  tbd                            Quebec      Canada
6. Recognition of new documents and assignment to agenda items
Nr.  Title                                             Source           Date       Project
557  Resolutions from the CAW - January 1998           CAW              98-01-22   admin
558  Liaison report to WG15                            Keld Simonsen    98-04-23   admin
559  Language Independent Specification techniques     Keld Simonsen    98-04-24   22.30.02.03
560  Digital Winter - contribution from Bob Barbour    Bob Barbour      98-05-06   admin
561  Final Agenda - Dublin, June 15-19, 1998           Winkler          98-06-15   admin
562  Participants - Dublin, June 15-19, 1998           Winkler          98-06-19   admin
563  Minutes - Dublin, June 15-19, 1998                Winkler          98-06      admin
564  Resolutions - Dublin, June 15-19, 1998            SC22/WG20        98-06-19   admin
565  JTC1 resolutions from Sendai                      JTC1 N5448       98-06-05   admin
566  Convenor's report to SC22 plenary Sept. 98        Winkler          98-06-19   admin
567  Summary of voting and comments to FCD 14651 -     SC22 N2719       98-05-18   22.30.02.02
     International string ordering (N2607)
568  Disposition of comments to FCD 14651 ballot                                   22.30.02.02
569  Summary of voting and comments on FCD 14652 -     SC22             98-06-12   22.30.02.03
     Specification of cultural conventions
570  Disposition of comments to FCD 14652 ballot                                   22.30.02.03
571  Liaison report from SC2/WG2                       Winkler          June 1998  admin
572  SC22 chairman's report on JTC1 plenary and        SC22 N2726       98-06-10   admin
     chairmen's forum in Sendai
573  Dual currency handling in Locales with respect    Soor, Uma        98-05-28   22.30.02.03
     to the euro                                       IBM Canada
574  Internationalization in Fortran 2000              SC22/WG5 N1320   June 1998  admin
575  Money-to-string function                          Keld Simonsen    June 1998  22.30.02.03
576  Draft Unicode Technical Report #10                Mark Davis       97-03-30   22.30.02.02
     Unicode collation algorithm                       Ken Whistler
577  Table of replies and comments to Fast Track       SC22 N2717       98-05-14   22.30.02.03
     ballot on ISO/IEC DIS 15897 (EN 12005)
578  Suggested BNF (Backus-Naur Format) syntax for     Ken Whistler     98-06-18   22.30.02.02
     template tables for ISO 14651
579  Contributions to the sorting of accents           Everson          98-06-18   22.30.02.02
                                                       Melagrakis
580  Comments to N573 - dual currency handling in      Ienup Sung       98-06-18   22.30.02.03
     locales with respect to the euro                  Tom Garland                 22.15435
7. Approval of Agenda
Additions:
Convenor's report to SC22 plenary N566 (15.3)
Approved with addition above
8. Liaison Reports
8.1 Additions/deletions/changes to liaisons
Winkler: Reconfirm liaison with SQL - Jim Melton, now SC32/WG3. Jim
asked for it.
Resolution: liaison with SC32/WG3, request OK from SC22
8.2 SC22/WG4, COBOL
no report. COBOL wanted to keep WG20 in SC22, fear of disruption of
their I18N efforts in the new standard.
8.3 SC22/WG15
Keld: IEEE says that WG15 has no expertise on I18N, but draft 2.b on
the POSIX utility standard contains some I18N, coordinated with WG20
work through Keld. Guideline for "National Profiles" will also address
I18N. 14766 is the number of this new technical report.
8.4 SC2/WG2
571  Liaison report from SC2/WG2                       Winkler          June 1998  admin
Action Winkler: Distribute Plane 14 draft from SC2/WG2
8.5 WG14
Keld: C and C++ will hold some meetings together. New C standard will
include new identifiers (TR 10176) and dual currency specifications.
Dates and time formats are being harmonized between C and C++. Global
locale model remains.
No written report
8.6 WG21
Keld: C++ out for FDIS ballot - all I18N from C and object oriented
extensions. Conversions from input formats, monetary formats, etc.
no written report
8.7 WG5
574  Internationalization in Fortran 2000              SC22/WG5 N1320   June 1998  admin
Takata: internal draft document in WG5 - will NOT adopt the dynamic
locale model. Character "kind" will be used to specify ISO 10646
characters. Also switch from decimal point to comma and back.
8.8 GUIDE/SHARE Europe
dormant, no report
8.9 JTC1/WG5 (now SC35)
Resolution: continue liaison with subject matter and establish liaison
with the new SC35 and appropriate working groups. Request SC22
approval.
8.10 CEN TC 304
About sort: Marc Küster: project team on European ordering rules on
Subset #2 of 10646, including polytonic Greek, Cyrillic, and all Latins,
some symbols. Küster is editor for the sort standard, Everson for the
subset standards. Interested in harmonization with 14651 - no competing
standards. Updated version will be available soon.
Keld: Reorganization in TC304 in project teams - 15 and counting ....
P1 - sorting
P17 - euro locale related to 14652
P11 - alphabets of European languages
P10 - subsets of 10646
P9 - conversion projects using 2022, and others.
P2 - cultural registry (IS 15897)
8.11 TC37
Keld: IS 12199 (sort) is on hold for alignment with 14651 - upon
request of Hjulstad (project leader of CEN sorting standard and editor
of 12199).
Action Küster: ask Hjulstad if he wants to be on the WG20 mailing list
and invite him to further WG20 meetings.
8.12 ITU-T
no report
8.13 Ada report from Keld:
Some Ada people have done some work together with Keld on an Ada binding
for IS 15435.
Action Keld: ask WG9 if they want a formal liaison?
9. Review of prior meetings action items
SD-5  Action item list                                 Winkler          98-05-06   admin
New action for Ken: take over from Kung A9711-8 on word break API.
10. Revision of TR 10176
Reporting that TR is being published and will be available soon. ITTF
did not release publication date yet. It is obvious, that a revision of
this TR is needed soon, because the repertoire of ISO 10646 is growing
rapidly and must be represented fully in Annex A of the TR.
Action Winkler to contact Kido about possible editorship.
Keld said he would be volunteering for editor as previously agreed.
Keld also mentioned that WG20 earlier had resolved to revise the TR to
include further guidelines for internationalization, but no evidence to
that effect could be found in the minutes or disposition of comments.
11. International string ordering ISO/IEC CD 14651
567  Summary of voting and comments to FCD 14651 -     SC22 N2719       98-05-18   22.30.02.02
     International string ordering (N2607)
568  Disposition of comments to FCD 14651 ballot                                   22.30.02.02
576  Draft Unicode Technical Report #10                Mark Davis       97-03-30   22.30.02.02
     Unicode collation algorithm                       Ken Whistler
578  Suggested BNF (Backus-Naur Format) syntax for     Ken Whistler     98-06-18   22.30.02.02
     template tables for ISO 14651
579  Contributions to the sorting of accents           Everson          98-06-18   22.30.02.02
                                                       Melagrakis
The document N567 is incomplete, a part of the UK comments is missing.
The convener will reprint the complete document and distribute it in the
next mailing as N567R. Action Winkler.
Canonical equivalences are explained by Ken Whistler upon request from
Mr. Fujimura. It is not necessary to modify data in order to use
combined and decomposed characters for comparison - assign correct
weights.
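The point about canonical equivalences can be sketched in a few lines of Python. This is only an illustration: it assumes the standard `unicodedata` module as a stand-in for the Unicode decomposition data, the helper name `compare_key` is hypothetical, and plain code points stand in for real collation weights.

```python
import unicodedata

def compare_key(text):
    """Build a comparison key without touching the stored data.

    Decomposition happens only while the key is built, so combined and
    decomposed spellings of the same text yield the same key.
    (Hypothetical helper; code points stand in for 14651 weights.)
    """
    return tuple(ord(ch) for ch in unicodedata.normalize("NFD", text))

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

print(precomposed == decomposed)                            # False
print(compare_key(precomposed) == compare_key(decomposed))  # True
```

The stored strings stay different; only the keys used for comparison agree.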
11.1 Canada
Canada #1:
Alain explains the sort algorithm as a reminder for the new
participants. Everson says, that an open set of combining characters
must be processable to deal with "weird" languages. These are not
defined in any symbolic table.
The Unicode tables specify the allowed decomposition, the weights, and
the ordering of the accents.
Discussion about the validity of the canonical decomposition for
equivalence - important for the implementation of the sort algorithm,
should be transparent for the end user.
Marc: "closed" system is not a solution, any new character would
invalidate the default table. Only an open system can cope with
additional characters without changing the tables. Implementations will
define derived tables which fit into 32 bit (experience from Ken)....
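A derived table entry that fits into 32 bits could look roughly like the sketch below. The 16/8/8 split between the three levels and the function names are assumptions made for the example; the minutes do not record the layout Ken described.

```python
PRIMARY_BITS, SECONDARY_BITS, TERTIARY_BITS = 16, 8, 8  # assumed split, 32 bits total

def pack_weights(primary, secondary, tertiary):
    """Pack three weight levels into one 32-bit table entry."""
    for value, bits in ((primary, PRIMARY_BITS),
                        (secondary, SECONDARY_BITS),
                        (tertiary, TERTIARY_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("weight does not fit its field")
    return ((primary << (SECONDARY_BITS + TERTIARY_BITS))
            | (secondary << TERTIARY_BITS)
            | tertiary)

def unpack_weights(packed):
    """Recover (primary, secondary, tertiary) from a packed entry."""
    tertiary = packed & ((1 << TERTIARY_BITS) - 1)
    secondary = (packed >> TERTIARY_BITS) & ((1 << SECONDARY_BITS) - 1)
    primary = packed >> (SECONDARY_BITS + TERTIARY_BITS)
    return primary, secondary, tertiary

entry = pack_weights(0x1234, 0x05, 0x02)
print(hex(entry))             # 0x12340502
print(unpack_weights(entry))  # (4660, 5, 2)
```

A packed entry is only a storage format: a multi-level comparison still has to compare all primary weights of a string before any secondary weight is looked at.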
Agreement in principle to use the Unicode method for canonical
equivalences. The method must not prevent the construction of storable
keys! Marc:
we are sorting linguistic entities, normalization is a great advantage.
Normalization is outside the scope of the standard. The data have to
"behave" as if they were normalized. Unicode tables support canonical
equivalences.
Ken: Europe could implement sorting of a sub-repertoire that contains
no canonical equivalences. MES-2 has no canonical equivalences, MES-3
does have them all.
Tables must be created for all combination of combining marks (derived
table).
Fujimura: what about Korean Jamos ? This is currently not mentioned in
14651. Ken: An algorithmic transformation creates the Korean syllables
and the weights. The Hanguls order in the binary order. Combining Jamo
tables are weighted in the order of the Jamos.
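The algorithmic transformation between Hangul syllables and Jamos that Ken refers to can be sketched as follows. The constants are those of the Hangul syllable arithmetic in the Unicode Standard; the function names are chosen for this example, and no weights are assigned here.

```python
import unicodedata

S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT   # 11172 precomposed syllables

def decompose_hangul(syllable):
    """Split one precomposed Hangul syllable into its conjoining Jamos."""
    s_index = ord(syllable) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return syllable  # not a precomposed syllable, leave unchanged
    lead = chr(L_BASE + s_index // N_COUNT)
    vowel = chr(V_BASE + (s_index % N_COUNT) // T_COUNT)
    t_index = s_index % T_COUNT
    trail = chr(T_BASE + t_index) if t_index else ""
    return lead + vowel + trail

def compose_hangul(lead, vowel, trail=""):
    """Build the precomposed syllable from its Jamos."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(trail) - T_BASE if trail else 0
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

syllable = "\ud55c"  # HANGUL SYLLABLE HAN
jamos = decompose_hangul(syllable)
print([hex(ord(j)) for j in jamos])                      # ['0x1112', '0x1161', '0x11ab']
print(jamos == unicodedata.normalize("NFD", syllable))   # True
print(compose_hangul(*jamos) == syllable)                # True
```

Because the syllable index is built from the lead, vowel and trail indices in that order, code point order of the syllables follows the order of their Jamos, which matches the remark that the Hanguls order in binary order.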
Canada #2:
agreed
Canada #3:
Ken: what is the exact form of the syntax of the tables in 14651.
Issue is how to tie the table to 14652.
Marc: let's discuss matters of principle, before we cycle through all
the comments.
Accepted in principle, syntax to be discussed when John Clews is
present.
Canada #4:
The question of user tailorability (smalls before upper, etc...) is in
an informative annex. How parameters for different collation behavior
are presented to the user should not be defined in the standard, should
be in the tutorial rather than in a specific annex.
Fujimura: Textual information is acceptable for the toggles, should be
in the tutorial.
Soor: Application has to make the distinction about the toggles.
Example: word or string ordering is dependent on the use of the
ordering algorithm for specific applications.
11.2 Denmark
APIs question must be addressed. Editor has to keep that in mind.
Denmark #1:
OK
Denmark #2:
OK
Denmark #3:
Binary strings are often stored in data bases - a warning to the user is
in order to make him aware that keys can be locale dependent and would
not work correctly in other locales. Add warning !
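Why stored keys are locale dependent can be shown with a toy sketch. Both ordering tables below are invented for the example (one places "ä" next to "a", the other after "z"), and `make_key` is a hypothetical helper, not an interface from the draft.

```python
DEFAULT_ORDER = "aäbcdefghijklmnopqrstuvwxyz"    # toy table: ä next to a
TAILORED_ORDER = "abcdefghijklmnopqrstuvwxyzä"   # toy tailoring: ä after z

def make_key(text, order):
    """Build a storable binary key: one byte per character rank."""
    return bytes(order.index(ch) for ch in text)

words = ["zon", "än", "arm"]
by_default = sorted(words, key=lambda w: make_key(w, DEFAULT_ORDER))
by_tailored = sorted(words, key=lambda w: make_key(w, TAILORED_ORDER))

print(by_default)    # ['arm', 'än', 'zon']
print(by_tailored)   # ['arm', 'zon', 'än']
print(make_key("än", DEFAULT_ORDER) == make_key("än", TAILORED_ORDER))  # False
```

A key written to a database under one table gives the wrong order when compared with keys built under the other, so a stored key is only meaningful together with the table that produced it.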
Denmark #4:
OK
Denmark #5:
"symbol equivalence" for accents and their combinations. This table
should be in 14652 (any number of them, including the Canadian).
Inclusion in 14651 can blow-up the table, in Vienna we decided to take
them out.
Keld requests the tables as a formal annex.
Soor: IBM does not want their hands forced. They would like to use
their own symbolics.
Denmark #6:
accepted
Denmark ed #1:
US preference would be that the names match the names of the 10646
characters.
Denmark ed #2
OK
Denmark ed #3:
accepted, if we retain the APIs.
11.3 Ireland
agreed to correct errors, add text,
Ireland - English language
Everson will edit for proper English.
11.4 Japan:
Japan #1:
Wait for 14652. Just as a warning
Japan #2a:
point is moot if APIs go away
Principle discussion about the APIs: Sweden, USA, Holland, Germany
require that APIs be removed from the standard. Keld: APIs can be
moved to 15435, they are not necessary to the use of the tables in
14651. It makes tactical sense to remove the APIs and get on with the
work. It is better to describe the functions than to require the APIs.
Agreement to remove the APIs.
Principle discussion about the tables: Symbolic tables are fine, but
"order start" is controversial.
Ken: We have to find a way to use the tables for the functionality of
the "order start" statement. Method of tailoring should not be
dependent on the order start.
Marc: tailoring must be kept in 14651, no dependency on POSIX is
desirable.
Keld: tables were developed in WG20, not POSIX.
Ken: tailoring must be able to be communicated to somebody else for
consistent output. Dependency on 14652 is too much for conformance to
14651.
Marc: dependency on 14652 forces POSIX, 14651 should stand independent
from 14652, but complete in itself. 14651 should be freestanding - this
might mean to put all syntax back into the standard.
Separate 14651 from 14652:
UK - yes;
DK - yes, but should meet 14652;
CAN - no separation needed, but for implementations it would be
preferable to have a Unicode core in 14651;
Keld sees no reason to separate.
USA - 14652 can "inherit" the table from 14651 and add all the other
functions necessary, Unicode tailoring is a syntax for numeric or
symbolic table, simple syntax can be implemented in 14652 with any
required extension;
IE -
J - fully against 14652, but possibly support 14651, wants to have
freestanding 14651, syntax in 14652 is only 4 pages, easily moved to
14651;
D - freestanding standard preferred, develop pseudo syntax for
describing the functions;
J - repertoiremap is not needed;
Gail: implementers need a behaviour, not the format;
Egypt: freestanding 14651 is preferred.
Alain: create simplified syntax for 14651 that can also be used in
14652.
Ken: reorder after statements for symbols and for collation elements
are fundamental to any tailoring.
"reorder script" can create problems if single characters of another
script are inserted into a script to be re-ordered. "include" statement
is unnecessary, if all tailoring starts from the default table.
Marc: can characters be given a different property (e.g. space as a
character)? "redefinition" is a valid concept, should not introduce new
syntax. "reorder" could be a "reorder after self" to change the
properties - this is the same as redefine.
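The effect of a "reorder after" statement on an ordered table can be sketched as below. The symbol names and the Danish-style tailoring are invented for the example, and the function is a hypothetical model of the operation, not the syntax under discussion.

```python
def reorder_after(table, anchor, moved):
    """Return a new table with the `moved` symbols placed right after `anchor`.

    `table` is a list of symbols in collation order; the relative order of
    all other symbols is left untouched.
    """
    remaining = [symbol for symbol in table if symbol not in moved]
    position = remaining.index(anchor) + 1
    return remaining[:position] + list(moved) + remaining[position:]

# Toy default table: accented letters sit next to their base letters.
default_table = ["<a>", "<a-ring>", "<ae>", "<o>", "<o-stroke>", "<z>"]

# Toy Danish-style tailoring: ae, o-stroke and a-ring follow z.
danish = reorder_after(default_table, "<z>", ["<ae>", "<o-stroke>", "<a-ring>"])
print(danish)
# ['<a>', '<o>', '<z>', '<ae>', '<o-stroke>', '<a-ring>']
```

Since every tailoring in this model starts from the default table, no "include" step is needed, which is the point made above.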
Ordering of Arabic presentation forms is needed - problem of charmapping
is in 14652, not in the definition of 14651.
Specials must specify halfwidth, circled, etc... Not only 4th level,
also earlier levels. Some might be overkill.
Principle discussion about sorting of Arabic presentation forms: Most
data are stored in physical order on mainframes - shaping of data
results in presentation forms for incorrect sorting. Reverse order
(visual order) for presentation forms - this is not in the scope of UCS
character sorting. Conversion to Unicode requires field by field
reversal of data. A letter and all its presentation forms have the
same primary weight. The presentation forms differ in the tertiary
weight. The table has all forms of the same letter together, the
tailoring defines the scan backward of the presentation forms to achieve
reversed order. Reverse scan per block, not scripts.
The table has to appear in the order in which the characters are to be
sorted. This contradicts the idea to have duplicate blocks of presentation
forms. One section is agreed with intrinsic and presentation forms of
Arabic.
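The rule that a letter and all its presentation forms share a primary weight and differ only at the tertiary level can be sketched as follows. The compatibility decompositions come from Python's `unicodedata` module; the tertiary values and the use of the base code point as primary weight are assumptions for the example.

```python
import unicodedata

# Assumed tertiary values, one per presentation form tag.
FORM_TERTIARY = {None: 1, "<isolated>": 2, "<final>": 3,
                 "<initial>": 4, "<medial>": 5}

def arabic_weights(ch):
    """Return (primary, tertiary) for one character.

    A presentation form takes the primary weight of its base letter;
    only the tertiary weight records which form it is.
    """
    fields = unicodedata.decomposition(ch).split()
    if len(fields) == 2 and fields[0] in FORM_TERTIARY:
        tag, base = fields[0], int(fields[1], 16)
    else:
        tag, base = None, ord(ch)
    return base, FORM_TERTIARY[tag]

print(arabic_weights("\u0628"))   # ARABIC LETTER BEH: (1576, 1)
print(arabic_weights("\ufe91"))   # BEH INITIAL FORM:  (1576, 4)
```

All four presentation forms of BEH (U+FE8F to U+FE92) end up next to the intrinsic letter at the primary level, so one section of the table can hold both.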
What happens with APIs is:
5.1 remains
5.2 away
5.3 warnings
6. revisit conformance
Japan #4:
The word "prehandling" is used differently. In 5.1.1, preparation of the
symbolic table data should be used.
Japan #6 and #7
Definitions need to be improved.
Japan #8
OK
Japan #9
Use "P"
Japan #10
Rewrite due to prior decisions
Japan #16
OK
Japan #17
Benchmarks for toggles should be made informative - "maybe tested ..."
11.5 Netherlands
NA: another FCD will be balloted
NB: the structure will be revised
NC: we take note and the document will be completely checked and revised
by a native speaker of English.
Alain will write detailed comments on the text.
Principle discussion on definitions - refer back to original standards
(character, etc.), for new ones check in the meeting and perhaps new
text.
Prehandling N#52-N#61 comments are moot, this goes into an informative
annex
N#64 - end goes away due to elimination of APIs.
11.6 Sweden
S#1: All APIs and related definitions will be eliminated.
S#2: Reference file formats will be replaced by BNF format, some
symbolic data will remain, but will be self contained. A format will be
proposed but not mandatory.
S#3: OK
S#4: Normalization will not be mandated, but for level 3 implementation
decomposition is needed and Unicode method is recommended. Can be done
in a Note with the URL of the Unicode character table.
S#5: rejected
S#6: accepted. Unicode will use a tailored "default" table (from the
tailorable template) as their standard.
Action: Ken will submit Unicode data as a WG20 document, preferably
electronic.
S#7: Template will reflect logical order, but tailoring will be
allowed. Khaled will write a note to Kent about his opinion on order of
presentation forms.
S#8: Note the Thai problem, put it into "preparation" (same is true for
Tibetan).
S#9: issues for tailoring (1-3), some rejected (4, 5). Discussion with
Fujimura, who proposes to remove compatibility characters from the
template - after explanation of Unicode the proposed method is accepted.
S#10: Outside the scope of the standard. Left to prehandling, if
needed.
S#11: outside the scope of the standard or specific preprocessing (not
default)
S#12: Please bring specific examples if required
S#13: accepted
S#14: OK, same as prehandling
S#15: Word vs string ordering. Which spaces are considered? As
default we use string ordering, for word ordering, we use tailoring
"ignore off for ...". (The table will not contain deliberate errors).
S#16: Mathematical formulation is overkill. The standard will be
clarified so that it is clear that the exact representation is not
mandated by the standard - therefore a mathematical definition is not
needed.
S#17: out of scope
S#18: accepted in principle, except 5.6 and 7, no mathematical
description. BNF plus text. Example could be Danish, including reorder
after and tailoring of space...
S#19: take note.
No change in title is accepted.
Transliterative collation will not be included, scope will be edited for
clarity.
Example: Keld offers to do a Danish example.
Use of notes for non-normative parts - accepted in principle.
Definitions will be revisited.
Section 5: accepted
Symbols Hy etc.. will disappear.
11.7 United Kingdom
UK#1A: maintain 14651 in parallel with the development of 10646 and its
amendments
UK#1B: Conformance clause will be revised, API section is moot. For
conformance, tailoring must be defined, can be zero.
UK#1C: no more relationship to 14652. Abbreviation tables create a
maintenance problem - autogenerate (possibly from the 10646 names).
Readability of abbreviations is desirable, must be fixed, never change.
Format of the autogenerated tables: SUxxxx, AUxxxx, where Uxxxx is the
identifier of the character.
Discussion about min, half, cap, sub - should they also be
transcribed into a symbolic value - decision is NO.
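Autogenerated symbols of the form SUxxxx and AUxxxx could be produced as in the sketch below. The minutes do not say what the prefixes S and A stand for, so the prefix is simply passed in; `symbol_name` is a hypothetical helper.

```python
def symbol_name(prefix, ch):
    """Build a table symbol such as SU0061 from a prefix and one character.

    The part after the prefix is "U" plus the character's code point in
    at least four uppercase hex digits, so the symbol never changes once
    the character is encoded.
    """
    return f"{prefix}U{ord(ch):04X}"

print(symbol_name("S", "a"))        # SU0061
print(symbol_name("A", "\u0301"))   # AU0301
```

Deriving the symbol from the code point rather than from a hand-made abbreviation removes the maintenance problem noted under UK#1C, at some cost in readability.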
Unidata.txt defines the primary order of alphabets. From that we can
generate basekeys.txt and compkeys.txt with all the weights.
symdump.txt is an additional file that supports the sorting of symbols.
All files are generated from the unidata.txt that is updated with new
additions. UnicodeData-3.0.2.txt (version dependent) feeds into
unidata.txt for collation related information.
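The first step of such a generation pipeline might look like the sketch below, which splits entries into base characters and characters with a canonical decomposition. It assumes the semicolon-separated field layout of UnicodeData.txt; the layout of unidata.txt, basekeys.txt and compkeys.txt is not given in these minutes, so the sample lines, the function name and the output shape are illustrative only.

```python
# Three sample lines in the UnicodeData.txt field layout.
SAMPLE = """\
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
00E0;LATIN SMALL LETTER A WITH GRAVE;Ll;0;L;0061 0300;;;;N;LATIN SMALL LETTER A GRAVE;;00C0;;00C0
0300;COMBINING GRAVE ACCENT;Mn;230;NSM;;;;;N;NON-SPACING GRAVE;;;;
"""

def split_entries(text):
    """Separate base characters from canonically decomposable ones.

    Field 0 is the code point, field 1 the name, field 5 the
    decomposition; a leading "<tag>" marks a compatibility mapping,
    which is treated here like a base entry.
    """
    base, composed = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        code, name, decomposition = fields[0], fields[1], fields[5]
        if decomposition and not decomposition.startswith("<"):
            composed.append((code, decomposition.split()))
        else:
            base.append((code, name))
    return base, composed

base, composed = split_entries(SAMPLE)
print(base)      # [('0061', 'LATIN SMALL LETTER A'), ('0300', 'COMBINING GRAVE ACCENT')]
print(composed)  # [('00E0', ['0061', '0300'])]
```

Weights for the composed entries can then be derived from the weights of their parts, so only the base entries need to be maintained by hand when the repertoire grows.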
UK#1D: bindings: moot with APIs gone
UK#1E: remove these user requirements
UK#1F: Definitions: will be revised, field, procedure, precision will
go away with the APIs. Glyph goes away also.
Editorial comments from the UK:
Scripts need not be defined in the template - tailoring can be defined
by "section start" and "section end".
Principle discussion of section (script) definition: the section start
and section end statements are part of the tailoring, not part of the
template. Could also be done by specification of the first symbol of a
section.
BNF needs to be specified, Ken will provide a proposal.
UK#3A1: see script definition above.
UK#3A2: order of scripts - see below the principle discussion.
Principle discussion of script sequence: there seem to be requirements
that scripts that behave similarly, are sorted so that they follow each
other. Everson and Clews would like that the template (default) follows
these relationships. There are widely differing opinions on this subject
with additional problems of scripts to be added later that would break
the default sequence and need tailoring of the tables anyway.
Solution for many seems unnecessary - binary sequence is fine as the
default. No user input exists on these scripts. One other solution is
that Europe will define a standard European profile that would possibly
be implemented by the computer companies.
Result: can be re-visited when user input exists. WG20 will keep the
UK request in mind in particular for Ethiopic and Georgian scripts.
UK#3A3: agreed
UK#3A4: script codes is moot
UK#3B: handled by tailoring method
UK#3C: solved by numeric symbols
UK#3D: Hebrew precedes Arabic, covered by table. Order of accents
needs to be defined.
UK#3De: Greek input from Evangelos was distributed. Problems with
Cyrillic letters are noted and need to be discussed in UTC.
UK#3E: accepted
11.8 USA comments:
USA#1: No more than 3 customizable levels in the conformance. This is
mainly for Java to be able to conform to 14651. Java has no mechanism
to tailor the 4th level. The Unicode order will be declared to be
conformant.
USA#2: create equivalent results is OK
USA#3: moot
USA#4: OK
USA#5: should include the latest amendments.
USA#6: YES
USA#7: Tables were provided, BNF to follow.
Principle discussion about the sequence of specials: Unicode treats
punctuations as ignorable with the symbolic weight in addition. For
some 3000 characters work needs to be done - Unicode sorts according to
code points. Very little "user requirement" exists, and a standards
organization should not mandate what the user requirements are. A
maintenance problem exists for any additional specials that are not in
binary order.
Two options are obvious: full binary or full logical (Everson) order.
Possible compromise could be to order the specials with existing
preferences (about 50 perhaps) and order the rest binary. This would
eliminate the maintenance problem.
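The compromise can be sketched as a key function: specials on a short preference list come first in the listed order, all others follow in binary (code point) order. The preference list here is invented and much shorter than the 50 or so entries mentioned.

```python
# Hypothetical preference list; the real one would hold about 50 specials.
PREFERRED_SPECIALS = ["_", "-", ",", ";", ":", "!", "?"]

def special_key(ch):
    """Rank preferred specials by list position, all others by code point."""
    if ch in PREFERRED_SPECIALS:
        return (0, PREFERRED_SPECIALS.index(ch))
    return (1, ord(ch))

print(sorted(["\u00a7", "?", "#", "-", "_"], key=special_key))
# ['_', '-', '?', '#', '§']
```

A special added to the repertoire later needs no table change: it falls into the second group and is placed by its code point.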
Unicode order is good; for data in other character sets, a table showing
this sequence would be helpful for the users. Alain's proposal to take
into account the Canadian standard (for about 40 specials). The
sequence in the Unicode tables is dependent on character properties. It
would be unfortunate to mix characters of differing properties
arbitrarily. Fundamental problem: compatibility (spacing) accents in
Canada will need tailoring.
Proposal Marc: use the 76 specials from Canada as the example of the
tailoring from the Unicode sequence to the one compliant to the Canadian
standard. Ken and John: specials should not be dealt with in a special way,
if needed, this should be done by tailoring. Ken: The reason for the
Canadian standard was to order EBCDIC and ASCII data in the same
expected way.
Marc: proposed examples for tailoring: Arabic, Danish, Canadian,
Specials (IBM)
Ken, Baldev, and Alain will discuss the final order of specials and make
a reasonable order. The decision of this group will be reflected in the
next draft.
Principle discussion about the sequence of accents: Unicode has a
different order than 14651, putting all possible accents to the letter
a. A limited subset of these accents has been used in the market for
some years. The order of the diacritics is very similar to the order of
the specials - neutral way to order them in the sequence of Unicode and
let people tailor the sequence. Double diacritics have a derived value.
Ken would like to minimize the number of diacritics whose weights must
be changed to get consensus.
Arabic sequence is also considered - 14651 is correct, Unicode will be
adjusted.
For accents: research is needed, list of accents in 14651 must be
extended to completely cover all. Minimize changes.
Michael will prepare a paper and discuss it on e-mail.
Solution: take what is in 14651 today and extend it with all other in
10646 order.
Ken Whistler prepared a contribution for the BNF syntax for the
generation and tailoring of the tables in 14651 (N578). Comments are
appreciated to produce a final document that will become part of the
document as normative.
Conformance: pick a repertoire, generate a set of strings. Demonstrate
that a sort generates the exact same order as if sorted with the
implementation according to the standard. Same for tailoring.
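The conformance procedure described above amounts to a small test harness, sketched here. The repertoire, the string generator and both key functions are toy stand-ins; in a real test the reference key would come from an implementation built directly on the standard's table.

```python
import itertools

def generate_strings(repertoire, max_length):
    """All strings over the repertoire with 1..max_length characters."""
    return ["".join(chars)
            for n in range(1, max_length + 1)
            for chars in itertools.product(repertoire, repeat=n)]

def same_order(strings, reference_key, candidate_key):
    """True if both sorts put the test strings in exactly the same order."""
    return sorted(strings, key=reference_key) == sorted(strings, key=candidate_key)

def reference_key(s):
    # Toy reference: letters first, case as tie-breaker.
    return (s.lower(), s)

def candidate_key(s):
    # Toy candidate that reaches the same order by a different route.
    return (s.casefold(), s)

test_set = generate_strings("aAbB", 2)
print(same_order(test_set, reference_key, candidate_key))  # True
print(same_order(test_set, reference_key, lambda s: s))    # False
```

The same harness covers tailoring: apply the tailoring to both the reference and the candidate and compare the resulting orders again.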
Planned Progression for 14651:
Mid September   New draft available to restricted web                     Alain
Israel meeting  final review and decision to send to SC22 for FCD ballot  participants in Israel
End June        draft disposition of comments                             Alain to all participants of the Dublin meeting
End of July     comments on DoC to Alain                                  participants
End of August   Final DoC                                                 send to SC22
FTP://dkuug.dk/jtc1/sc22/incoming for upload of documents, tables, etc.
John Clews described efforts in CEN TC304, TC37, and TC46 for ordering
especially in library sorting. ISO 12199 is for Latin only, ISO 999 is
a filing standard, TC46/SC2 is working on a combination of
transliteration and filing (where TC46/SC2/WG8 deals with the
transformation issues).
12. Cultural convention-specification standard ISO/IEC CD 14652
569  Summary of voting and comments on FCD 14652 -     SC22 N2732       98-06-12   22.30.02.03
     Specification of cultural conventions
570  Disposition of comments to FCD 14652 ballot                                   22.30.02.03
575  Money-to-string function                          Keld Simonsen    June 1998  22.30.02.03
577  Table of replies and comments to Fast Track       SC22 N2717       98-05-14   22.30.02.03
     ballot on ISO/IEC DIS 15897 (EN 12005)
580  Comments to N573 - dual currency handling in      Ienup Sung       98-06-18   22.30.02.03
     locales with respect to the euro                  Tom Garland                 22.15435
Keld explains that this ballot, had it been a DIS ballot, would have
passed.
12.1 Canadian comments:
Change bars not possible, Keld agrees to produce an editor's report and
line numbers for easier identification of changes. Boxes or other
functions for readability will be used where possible.
CAN#1: User cannot necessarily select the cultural specification for a
specific application - this is a choice of the system administrator and
the application provider. Accepted in principle, some re-write will be
done to explain the situation
CAN#2: see Japanese comments. Remove sentence in the introduction
that references the 14651 dependency.
CAN#3a: accepted:... dependent on language, culture, or ...
CAN#3b: accepted to remove
CAN#3c:
CAN#3d: accepted, point to other standard
CAN#3e: accepted
CAN#3f: accepted
CAN#3g: clarify need
CAN#3h: accepted, will be explained
Principle discussion about scope and relationship to POSIX.
J#01: "nothing more than POSIX" Keld says that this is the only
project that addresses I18N functionality in POSIX and other systems.
New TD on cultural and linguistic adaptability is formed to support the
enhancement of I18N in all JTC1 standards. Question is the market
relevance of the POSIX format for the future.
Keld: the POSIX format allows processing on IT systems.
Ken: strong feedback from the vendor community outside POSIX - can not
"just" be compiled and used.
Gail: vendors want content, not the format
Fujimura: if this is a compilable set of definitions, the standard must
contain the conditions of the compilation.
Marc: registry standard EN 12005 in CEN is fast tracked as ISO 15897.
Iceland could not create a conformant locale to this EN 12005. Question
about the reliability of the information, collected in free format.
Keld: errors in the POSIX syntax, needed is only the free-form.
Release of copyright is an additional issue. X/Open registry of locales
exists but costs money for access. Implementation today (Gnu) is to
prove the concept.
Major UNIX vendors and Mac would be potential customers for this
standard. This standard would extend the POSIX work with I18N
functionality. Fujimura: then the work should not be done in WG20, but
in POSIX.
Tom: we should possibly find a language independent way - POSIX exists
and works fine for Sun and/or any other POSIX conversant company. Is
there another way?
Ken: right way to do the work is to find the significant categories and
then seek the best way for specifications for each of the categories.
Example: collect all monetary formats, document them and then give
information, which are used where. This helps every company that needs
this kind of information.
Ken: major change in this standard is the introduction of 10646
(repertoiremap, collation, etc...). For many of these categories,
various other specification methods might be better suited.
Arnold gives a little history - POSIX asked WG20 to enhance the I18N
functionality, JTC1 rejected the proposed cultural registry standard
(due to lack of participating NBs).
Soor: in Cairo, we tried to make the standard more palatable to
NON-POSIX systems. Especially Java would be very much in need for the
cultural data. We need to pick a format for the specification that is
easier to use for more vendors, including the architects of the Java
I18N efforts.
Michael and Baldev: let's restrict the work to POSIX and get on with it.
Arnold gives his personal opinion that POSIX is "dead". Sun: POSIX is
not dead! We want to come up with specification of cultural
conventions. We have to take care of POSIX and also other systems.
Baldev: if 14652 should be valuable for other systems than POSIX, we
need to change the format of the specification.
Question: what is more valuable for the IT community - the information
or the format in which the information is presented ????
Decision: make clear in the scope that this standard is more POSIX
related. Keld promises to provide BNF syntax. Will look like BNF
syntax used in the POSIX or X/Open standards. Japan can accept the work
if new syntax is used and if new functionality is added. Suggests to
fall back to CD stage.
A Japanese suggestion to work on registration procedures was countered
by the convener, that this work item had been rejected by JTC1 and no
new work item has been requested by WG20.
Suggested title change to "Specification method for cultural
conventions". Resolution. Action: Include in convener's report.
J#08 and J#09: accepted
J#10: accepted - ellipsis will be fully defined in the BNF syntax.
J#11: accepted
J#12: use UCS names wherever symbolic names are needed in the
document. Accepted in principle.
J#13: comment_char - Keld to check whether it is allowed.
J#14:
The editor will propose a disposition of comments to all the commenting
NBs and try to resolve the issues of syntax. Substantive issues, such as
the contents of new LC_types, must be discussed here. Other issue -
paper based vs computer based. DK suggested some new categories. Japan
has BIDI issues.
No: spelling, hyphenation, transliteration,
Japan wants:
BI-DI, other time formats (JIS X0301)
USA is opposed to putting more things into FDCC-sets, as this would
further split the I18N conformance.
Character properties must not be specified in the cultural convention,
but are properties of the character itself. Add no new things that are
not needed for POSIX compatibility.
LCC_type has many parameters that are contentious. BIDI is a case in
point.
Resolution: (proposed by Michael Everson) that the liaison to SC2/WG2
request that WG2 define properties for all characters in ISO 10646.
Action - Winkler: check the possibility of making it a TR.
Fujimura: J19 and CAN #4b - what is an "outdigit"? Keld: Output must
be defined in the locale (example: Saudi Arabia uses only Arabic
numbers, even in English text). If it is numeric formatting, it should
be in the LC_NUMERIC section. This makes sense only with IS 15435. There
is discussion whether the specification standard needs extension before
the API standard is approved.
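One possible reading of the "outdigit" question above is a per-locale
table that maps the ASCII digits to the digits used on output. The
sketch below illustrates only that reading. The table OUTDIGITS and the
function format_with_outdigits are hypothetical names, and the use of
Arabic-Indic digits (U+0660 to U+0669) for the Saudi Arabia example is
an assumption, not something stated in the meeting.

```python
# Hypothetical table: locale name -> the ten digits used for output.
# The "ar_SA" entry assumes Arabic-Indic digits (U+0660..U+0669).
OUTDIGITS = {
    "C": "0123456789",
    "ar_SA": "".join(chr(0x0660 + i) for i in range(10)),
}


def format_with_outdigits(number: int, locale_name: str) -> str:
    """Render an integer using the output digits of the given locale.

    Unknown locale names fall back to the plain ASCII digits.
    """
    digits = OUTDIGITS.get(locale_name, OUTDIGITS["C"])
    table = str.maketrans("0123456789", digits)
    return str(number).translate(table)
```

In this sketch the digit mapping is applied after ordinary numeric
formatting, which matches the remark that it would belong with
LC_NUMERIC if it is treated as numeric formatting.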
Progression for 14652:
Draft disposition of comments by end of July
3 weeks for comments by participants
On September 7 we will have an agreed disposition of comments.
On September 21 a new draft of 14652 will be available
Further discussion in the meeting in Israel.
13. Internationalization API standard
Use of X/Open specifications OK, if credited to X/Open. Petr Janecek on
6/4/98 - e-mail followed, announcing that the OpenGroup board will
decide in its next meeting.
There are various ways to write the specifications in a
language-independent manner. Keld gives examples - the group recommends:
procedure setlocale (category, string, localename)
parameter: 1: category  input
           2: string    input
           3: locale    output results
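As a rough illustration of how the language-independent form above might
map onto one concrete language binding, the sketch below uses the
setlocale function of Python's standard locale module. The wrapper name
set_locale is hypothetical, and treating the third (output) parameter as
a return value is an assumption, not something the group specified.

```python
import locale


def set_locale(category: int, name: str) -> str:
    """Inputs: category and locale string. Output: resulting locale name."""
    # locale.setlocale returns the name of the locale now in effect
    # for the given category.
    return locale.setlocale(category, name)


# The "C" locale is used here because it is always available.
result = set_locale(locale.LC_NUMERIC, "C")
```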
Ken: any API specification is a rather large document; clarity is
required, not compactness or brevity.
The relation to 14652 is important. Keld will provide a paper that shows
which APIs deal with which 14652 specifications.
Progression of 15435:
New working draft by next meeting in Israel.
Registration request before the end of 1998.
14. ISO/IEC 10646 Issues
Input method for 10646 is in IS 14755 - very basic methods (pick from
screen, UCS identifiers, etc...).
Ken: Windows NT has a method to pick a character from the charmap.
Alain: This method does not provide any information about the
character. The standard mandates that the coding and the name be
displayed (in the user's language).
Ken: This might delay the implementation for some time (at least this
part).
Ken: A major problem: We are now at amendment 27 of 10646 - the
amendments need to be reflected in the tables in various WG20 standards.
In particular, TR 10176 needs to be re-issued with the extended list. A
revision should be done in such a way that the TR points to a dynamic
table of identifiers - thus the TR does not need to be changed for every
amendment of 10646. The dynamic table needs clear and precise versioning
(version number and date at least, plus the included amendments as
additional information). It also needs a version history.
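The versioning requirements described above (version number, date,
included amendments, version history) could be modelled roughly as
follows. This is only a sketch under stated assumptions: the class names
TableVersion and IdentifierTable and their methods are hypothetical, and
the sample dates, amendment numbers and code points are placeholders, not
data from any real amendment of 10646.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class TableVersion:
    number: int             # version number of the table
    issued: date            # date of this version
    amendments: tuple       # 10646 amendments included (additional info)
    code_points: frozenset  # characters allowed in identifiers


@dataclass
class IdentifierTable:
    history: list = field(default_factory=list)  # all versions, oldest first

    def publish(self, issued, amendments, added):
        """Add a new version that extends the previous one."""
        previous = self.history[-1] if self.history else None
        old_amendments = previous.amendments if previous else ()
        old_points = previous.code_points if previous else frozenset()
        version = TableVersion(
            number=len(self.history) + 1,
            issued=issued,
            amendments=old_amendments + tuple(amendments),
            code_points=old_points | frozenset(added),
        )
        self.history.append(version)
        return version

    def allowed(self, code_point, version_number=None):
        """Check a code point against the latest or a named version."""
        index = -1 if version_number is None else version_number - 1
        return code_point in self.history[index].code_points


# Placeholder values only, not taken from any real amendment.
table = IdentifierTable()
table.publish(date(1998, 1, 1), amendments=[], added=range(0x41, 0x5B))
table.publish(date(1998, 6, 1), amendments=[1], added=[0x100])
```

Keeping every published version in the history lets a programming
language standard cite a specific version of the table while the table
itself continues to grow with each amendment.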
Fujimura: wait for requests from programming language standards writers
asking for advice on the use of characters in identifiers.
Ken: cannot agree; the update must be done fast and with the knowledge of
WG2 specialists. Language committees should be put on notice that the
table of 10646 characters in identifiers is dynamic and changes with
each amendment. This has to be considered in programming language
standards.
It is important that the implementers and the standards committees work
closely together on such issues. Only one list can exist.
Action: WG20 to request liaison with the Unicode Consortium in a
resolution to SC22 plenary (and convener's report).
Action: WG20 to request liaison from Unicode Consortium. (Ken).
15. Other business
15.1 Results from JTC1 meeting in Sendai
565  JTC1 resolutions from Sendai (JTC1 N5448, 98-06-05, admin)
572  SC22 chairman's report on JTC1 plenary and chairmen's forum in
     Sendai (SC22 N2726, 98-06-10, admin)
New TD for cultural and linguistic adaptability, consisting of SC2,
SC22/WG20, and JTC1 WG5. WG20 remains in SC22. JTC1 WG5 becomes SC35 -
secretariat needed.
There will be 2 meetings, one to plan in New York Oct. 2-4, 1998, the
other to work out details in Paris on December 2-4, 1998. Nominations
for New York: Alain, Keld, Arnold - resolution.
Direction to the participants: Make sure that the voting procedures are
in line with the technical contents of the work. Make sure that the
agenda is concise and does not allow free discussion. Documents should
be requested to be well structured and specifically targeted to the task.
15.2 CAW results from Ottawa
557  Resolutions from the CAW - January 1998 (CAW, 98-01-22, admin)
moot, see 15.1
15.3 Convenor's report for SC22 plenary
566  Convenor's report to SC22 plenary Sept. 98 (Winkler, 98-06-19,
     admin)
Add liaison request between SC22 and Unicode.
Add liaison reconfirmation request with SC32/WG3 on SQL.
Add liaison request between SC22/WG20 and SC35 with the relevant working
groups.
16. Review of Priorities and Target Dates
reviewed and approved
17. Review of Action Items from this meeting
reviewed and approved
18. Approval of Resolutions
approved
19. Adjournment
meeting is adjourned.
____________________ end of SC22 N2768 ____________________________