SC22/WG20 N417


Minutes of the WG20 meeting #9 - Copenhagen

ISO/IEC JTC1/SC22/WG20

Internationalization


September 29, 1995

DATE: September 25-29, 1995

LOCATION: Danish Standards Association

Baunegårdsvej 73

Hellerup

AGENDA ITEMS:

Introduction by Convenor

Japan:                                                                 
T.K. Sato              HP                                              
A. Kido                IBM                                             
Canada                                                                 
A. LaBonté             Trésor de Québec                                
Denmark                                                                
K. Simonsen            RAP (consultant)                                
USA                                                                    
M. Kung                SGI                                             
Liaison                                                                
A. Wallace             IBM, COBOL                                      
Cooperations                                                           
Þorvaður Kári          CEN TC304 secretary                             
Ólafsson                                                               
Convenor                                                               
A. Winkler             Unisys                                          

Appointment of chairperson, secretary, and drafting committee

Chair: Winkler

Secretary: Winkler

Drafting: Simonsen, Sato

Approval of prior meeting's minutes

371R Draft minutes of the WG20 meeting #8 in Paris, May       Winkler        
     15-19, 1995                                                             

The minutes were approved without changes.

Future Meeting Schedule and Plans

Recognition of new documents and assignment to agenda items

 411 Cultural element specification WD #2 (IS 14652)          15             
 412 Summary report of CEN/TC304/PT01 project team report     16.1           
 416 Input to WD#3 of 14651 (A9505-08)                        13             
 418 Revised input for TR 10176                               12             
 419 Some text for TR 10176 WD5 (A9505-12, A9410-07,          12             
     A9410-05)                                                               
 421 Non editorial comments to TR 10176 WD4                   12             

Approval of Agenda

 375 Preliminary agenda for Meeting in Copenhagen             Winkler        

Agenda was approved with additions. Final agenda is document N401.

Convenor's report about the SC22 plenary, Sep. 18-22, 1995, Annapolis and JTC1 plenary in Kista

 379 Resolutions of the 8th plenary meeting of JTC1, June     SC22 N1882     
     13-16, 1995 in Kista, Sweden                             JTC1 N3586     
 380 SC22 secretariat report to the SC22 plenary Sep-95       SC22 N1883     
 381 SC22 Program of Work                                     SC22 N1884     
 382 Revised agenda for SC22 plenary September 18-22, 1995    SC22 N1887     
     in Annapolis                                                            
 383 Retention of projects, not reaching CD stage within 3    SC22 N1886     
     years of NP approval                                                    
 408 Draft Resolutions from SC22 plenary in Annapolis,        SC22           
     September 18-22, 1995                                                   
 409 Rationale for the inclusion of paragraph numbers in      SC22 N1967     
     SC22 standards                                                          

Convenor reported about the SC22 plenary with information about the resolutions that have effect on WG20:

AN - concurrent PDTR registration and ballot for 10176, AO - change of titles, AP - request for short identifiers, AQ - NP for APIs for I18N, BD for paragraph numbering of standards, BE & BF - cooperation with CEN TC304

Lots of discussion and work on proposals for electronic document transfer and use of WWW.

Liaison Reports

X3L2 for project JTC1.22.30.02.02 (International string ordering) requested by X3L2.

 414 Liaison report from WG4 - COBOL                          A. Wallace     

Ann gave a report based on N414. She expressed the desire to synchronize with WG20 the availability of standards for I18N with their use in COBOL. Identifiers of 10646 need case information for use in COBOL. Ann is writing most papers for I18N functionality; only numeric formatting is written by somebody else.

New verb VALIDATE will be based on external locales, API standard is also needed, more features will be included in next release of the standard.

Question is: should WG4 include features even if no standard is in sight, or would that be too fast and thus incompatible with future standards?

Should there be a Right-case function - this is beyond the compiler functionality, more an orthographic application.

Should there be an I18N tag for data items that are culture dependent? Good idea.

Timing of sorting standard - CEN

WG20 has composed an answer to COBOL's liaison statement and submits it as N422 to WG4.

Ann wants the tables of the 10646 character properties made available in machine and human readable form so that COBOL implementers have access to them (for free). We could make them an informative annex for a TR or write the character properties standard asap.

No specific report. The amendment to the shell and utilities is very much I18N oriented; it progresses very slowly.

 386 Resolutions SC2 meeting Helsinki                         SC2 N2616      
 387 Resolutions SC2/WG2 meeting Helsinki                     SC2/WG2 N1254  

No additional report besides resolutions and minutes, which are available. Korean "old" characters might cause problems with short identifiers.

Revision of C standard with input from WG20. Enhanced character set support, make better approach for character handling, localization. Amendment for internationalization (MSE) has been published. Proposal for POSIX alignment.

CD ballot, object oriented locales, new string class, object oriented APIs.

Nothing to report, merge of the organizations not finished.

Input for 10646 - second CD with fewer specifications, no composition. Complex things will go into TR. Symbols on numeric keypad - decimal point is a function, not a character.

See 16.1 and N413. Agreement on necessary cooperation, especially for sorting.

New copies of their sorting standard. Quite advanced in their specification, also for different languages. Are harmonizing with CEN.

 391 Liaison document on alphabetical ordering of             ISO/TC37,      
     multilingual terminological and lexicographical data     SC3/WG3        
     represented in the Latin alphabet                        N58, N59, N60  

Alain will write personal letter and send it to Winkler for forwarding to Hans Wellisch.

 385 ITU-T recommendations and ISO standards dealing with    Stefan Fuchs   
     character coding                                                       

Review of prior meetings action items

SD-5 List of Action Items                                    Winkler        

The action items from prior meetings were reviewed and the list updated.

Framework and Requirements for Internationalization TR 11017

354R Disposition of comments to TR 11017, including           T.K. Sato      
   R conclusions of Paris meeting (May 95)                                   
359R Proposed text including comments from Paris and e-mail   T.K. Sato      
     on PDTR 11017                                                           
360R Proposed addition of Management Summary to               T.K. Sato      
     PDTR 11017                                                              
361R Proposed changes to section 6.2 (SCRIPT) of              T.K. Sato      
     PDTR 11017                                                              
362R Proposed revision of section 4.2.7 (cross cultural       T.K. Sato      
     friendliness) of PDTR 11017                                             
366R Final draft for ISO/IEC DTR 11017: Framework for         T.K. Sato      
     internationalization                                                    
 395 Addition of Annex C to PDTR 11017                        T.K. Sato      
     (Bi-directional text)                                    R. Belhadj     
 396 Disposition results on section 5.6 and 5.7 of            T.K. Sato      
     PDTR 11017                                                              
 397 Summary of differences N277 (PDTR) to N366R (DTR 11017)  T.K. Sato      

No SC22 resolution necessary for further processing, but clear understanding that the TR will be submitted for DTR ballot after Miles Ellis has edited it for proper English.

Discussion on inclusion of long annex on bidi presentation. If bidi, what about Hangul, Thai, etc.... We decided against such examples.

N396: add explanation of "customization" in 5.6

Management overview : many editorial changes.

More editorial changes throughout the document. After addition of bidi contribution, the TR text will be frozen. No more changes are allowed.

Convenor will get the DTR document and forward it to IETF and/or SC22.

Revision of TR 10176

 377 ISO/IEC PDTR 10176, 2nd edition: Guidelines for          A. Kido, M.    
     preparation of programming language standards            Noda           
                                                              SC22 N1931     
 389 Broadening the revision criteria for TR 10176 to         Winkler        
     accommodate WG11 issues                                                 
 392 Proposed disposition of comments to WD5 of PDTR 10176    Akio Kido      
     (Comments: B. Meek, A. Winkler)                                         
 394 More comments to TR 10176 from Brian Meek and Keld       Brian Meek,    
     Simonsen                                                 Keld Simonsen  

Oct.95-Nov.   Kido           preparation of WD#6 for concurrent registration    
95                           and approval, send to Winkler.  Kido to request    
                             NB comments from people that sent technical        
                             comments now                                       
 End Nov. 95  Winkler        send to SC22 secretariat for registration and      
                             approval ballot                                    
April 96      Kido           prepare comment disposition on ballot comments     
                             for discussion in Kyoto                            
June 96                      second PDTR ballot (final PDTR)                    
September 96  Kido           disposition of comments from final PDTR ballot     
                             and discussion in WG20 meeting in Vienna           
November      ?              Editing for proper English                         
1996                                                                            

Kido explained the current organization of the document. Statements and built-in functions are in the body of the document; internationalization specific details are described in annexes, normative and non-normative.

After discussion it was agreed that all I18N functionality should be described in the body of the TR as desired in the programming language, details might be in an annex.

Next discussion about binding of cultural conventions to processes or threads. Multi-locale processing must be synchronized - each thread must know which locale is to be used in the processing, otherwise default locale applies. Agreement: description of need for multi-locale support is in main body, a description of how C or POSIX handles this subject is in an annex (with or without coding example).
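The thread-binding requirement above can be illustrated with a minimal sketch. It is not taken from any WG20 document: the names DEFAULT_LOCALE, set_thread_locale and current_locale are hypothetical, and locales are represented only by their names. Each thread that sets a locale sees its own value; a thread that sets nothing falls back to the default.

```python
import threading

DEFAULT_LOCALE = "C"  # hypothetical name of the default locale

_state = threading.local()  # every thread gets its own attribute namespace


def set_thread_locale(name):
    """Bind a locale name to the calling thread only."""
    _state.locale = name


def current_locale():
    """Return the calling thread's locale, or the default if none was set."""
    return getattr(_state, "locale", DEFAULT_LOCALE)


def worker(name, results, key):
    # A thread that is given no locale never calls set_thread_locale.
    if name is not None:
        set_thread_locale(name)
    results[key] = current_locale()


results = {}
threads = [
    threading.Thread(target=worker, args=("da_DK", results, "t1")),
    threading.Thread(target=worker, args=("ja_JP", results, "t2")),
    threading.Thread(target=worker, args=(None, results, "t3")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In this sketch the main thread never sets a locale either, so it also sees the default.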

Discussion of character set support: character types. Must be coding independent, although languages might need single byte data types in addition to multi-byte types, with a conversion function between the two. A character boundary detection function is also needed. Programming language dependent; subtypes can be implemented.
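A boundary detection function of the kind mentioned above could look like the following sketch. The choice of UTF-8 as the multi-byte encoding is an assumption made for illustration only (the minutes do not name an encoding), the function names are hypothetical, and the input is assumed to be well formed.

```python
def is_char_start(byte):
    """True if the byte begins a character in UTF-8.

    Continuation bytes have the bit pattern 10xxxxxx; every other byte
    starts a new character.
    """
    return (byte & 0xC0) != 0x80


def char_boundaries(data):
    """Return the byte offsets at which characters start in UTF-8 data."""
    return [i for i, b in enumerate(data) if is_char_start(b)]


# One-byte, two-byte and three-byte characters in sequence.
sample = "a\u00e9\u65e5".encode("utf-8")
offsets = char_boundaries(sample)
```

A program working on the single byte type can use such offsets to avoid splitting a character in the middle.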

Keld recommends not only a code independent data type but also an encoding dependent data type for things such as UTF-16 data. Ann sees no need for a coding dependent data type; translation can be done in the I/O interface. Fortran has an N-data type for an implementation dependent type where the meaning is defined by sub-type (attribute). Charmap dependence?? Mike supports a UCS data type to make applications portable to various platforms. Kido: POSIX binds the charmap to the character set independent character convention. Ann wants a high level data type that allows the use of combining sequences. Keld and Mike volunteered to prepare text for the document this evening for discussion tomorrow.

Equivalence discussion: Glyph shape or meaning of character? Kido: programming language should have an equivalence table for characters to be treated equally, e.g. Latin upper case A and Latin lower case A and Greek upper case A, etc. Alain - Keld: no, also no equivalence of accented letters. Ann: COBOL does not look at different A's - they are expected to be the same. Problem: there are about 16 blank characters - are they the same? Keld: different handling of identifiers from text. Ann: source code portability is important when transferring the program from Russian to Latin. Ann and Alain: source code should be locale independent.

We stopped the discussion on source locale, but will have to come back to it.

Back to equivalence: Cyrillic A is the same as Greek A. Ann: Language syntax is ISO 646 invariant. If other character sets are used, these characters should not be equivalent. Alain: it is necessary to declare the UCS repertoire as the language independent portable character set. Ann: backward compatibility is so important that ISO 646 has to be the portable character set, not 10646. Also, the portable character set ought to be defined in the programming language, not in a locale. Equivalence is in the domain of the language.

Comparison and collation of data in execution: Ann recommends that the code point is the default, but that programming languages might define "fuzzy" functionality - level of equivalence from sorting standard. Keld: at execution time all comparisons should be locale defined, with varying precision levels according to sorting standard. Literal comparison: at run time, based on the character set to which it is compared? At compile time the default order is locale dependent.
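The difference between code point comparison and comparison at varying precision levels can be sketched as follows. This is a toy illustration only, not the algorithm of the sorting standard: the function names and the three levels (base letters, accents, case) are assumptions, and accents are separated with Unicode canonical decomposition from the standard library.

```python
import unicodedata


def collation_key(text, level=3):
    """Build a comparison key with 1 to 3 levels of precision."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    key = [base.casefold()]                # level 1: base letters only
    if level >= 2:
        key.append(decomposed.casefold())  # level 2: accents also count
    if level >= 3:
        key.append(text)                   # level 3: case also counts
    return tuple(key)


def equal_at(a, b, level):
    """Compare two strings at the given precision level."""
    return collation_key(a, level) == collation_key(b, level)


words = ["Zebra", "apple", "\u00c9clair"]
by_code_point = sorted(words)               # default: code point order
by_levels = sorted(words, key=collation_key)
```

With code points as the default, upper case and accented letters sort away from their base letters; the keyed sort groups them with their base letters first and uses accents and case only to break ties.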

Range: to define ranges in programming languages is an execution time problem. We have no solution; we need to define the problem - perhaps 2 different ranges, one of which is binary, the other locale dependent. Keld thinks that he has a proposed solution; we need more contributions and also text in the document to trigger comments. New syntax might be needed for culturally correct range finding - according to locale.
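The two kinds of range can be contrasted in a short sketch. It is only an illustration of the problem, not a proposed solution: the function names are hypothetical, and base_key (strip accents, ignore case) merely stands in for a real locale-defined ordering, which would differ from locale to locale.

```python
import unicodedata


def base_key(ch):
    """Stand-in for a locale ordering key: accents and case are ignored."""
    decomposed = unicodedata.normalize("NFD", ch)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def in_binary_range(ch, low, high):
    """Range test on code points."""
    return low <= ch <= high


def in_collated_range(ch, low, high, key=base_key):
    """Range test on ordering keys supplied by the caller."""
    return key(low) <= key(ch) <= key(high)


checks = {
    "binary_m": in_binary_range("m", "a", "z"),
    "binary_e_acute": in_binary_range("\u00e9", "a", "z"),
    "collated_e_acute": in_collated_range("\u00e9", "a", "z"),
    "binary_B": in_binary_range("B", "a", "z"),
    "collated_B": in_collated_range("B", "a", "z"),
}
```

The stand-in key places every accented letter with its base letter; a locale that sorts some letters after z would need a different key, which is why the key is a parameter here.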

Brian Meek's comments: conflict with Antoni's comments to WD#2, US contribution is needed. Kido may use abbreviated words for redundant specifications.

Definitions: Terminology will be in the glossary

Extended identifier list: corrected list will be included, no modification needed at this time

I18N library: include list from TR 11017, perhaps more needed

Multiple culture support: concept in main document, example in annex

Identification announcement mechanism:

Mapping:

Multiple character set support:

Discussion about text for annex B (or 4.7): Long discussion about Keld's proposed text for the guidelines in annex B. The group agrees that the proposed text is counterproductive. It is better for the future to wait for the API standard to be advanced enough to be referenced in the TR rather than give incomplete guidelines in the TR itself - that could lead to incomplete and differing implementations which might have to be changed when the API standard is complete. A list of services will be taken out of the TR 11017 and the standards developer will be informed that these services will be available via the internationalization APIs.

Discussion about character set announcement mechanism: No mechanism or tagging schema is available, we want to stay away from 2022 tags. Better no recommendation than 2022. Applications might have to do the announcement and recognition of character sets. Message must contain the warning that "old" character sets will not go away quickly.

Instructions to the editor: In 4.7 add a description of the model - intrinsic functions and platform provided services where available via internationalization APIs. Add a note in which the current status of the API standard is described and a list of the services that will be covered by it. Add a section for the - non-existing - character set announcement method and point to the API standard note. Also point to the list of services described in TR 11017 and ask the convenors to send us their groups' requirements for internationalization APIs and other functionalities.

Kido's question: does TR 10176 have a message to the developers of programming language standards about the use of culturally correct ordering during execution time? Message: at compilation time use the unchangeable default locale; language specific wording is needed in the p.l. to allow invocation of LC-LOCALE, and a verb to invoke the comparison mechanism.

Discussion of proposed text from Keld and Mike for character handling: abstract? code independent? literals? how to discourage the use of e.g. REDEFINE to get to the coding level and thus make portability impossible? do programming language committees agree with the paper and do they want this kind of guidance? are there locale dependent literals? should literals be in the compiler internal codeset = abstract in the sense of this paper? Decision: rewrite according to discussion, discuss in group, send to PL standard groups for comments.

Discussion on resolution: Doubt whether the document is ready for registration and approval. Majority understanding is that it is necessary to get the document registered and open it up for official comments from NB's and from other specialists and working groups.

Discussion of N423 - Guidelines on character data type in p.l. support: agreed text. Sorting data type has to go into the TR (text from Alain).

International ordering of 10646

 388 ISO/IEC WD 14651 International String Ordering (working  Alain LaBonté  
     draft #3)                                                SC22 N1924     
 378 ANS: Alphabetic Arrangement of Letters and the Sorting   ANSI/NISO      
     of Numerals and Other Symbols                            Z39.75-199x    
 393 Comments on working draft #3 of ISO/IEC 14651            Hans           
                                                              Wellisch,      
                                                              NISO, chair    
                                                              AK             
 398 Instructions for ordering tables from NB's               Alain LaBonté  
 391 Liaison document on alphabetical ordering of             TC37           
     multilingual terminological and lexicographical data                    
     represented in the Latin alphabet                                       

Registration means that all changes from the base document on must be traceable through ballot comments from national bodies and their resolutions !!!

Ordering standard should be synchronized with CEN and other groups that create such drafts.

Alain says that he might be better off to get the tables from CEN or from contacts in NB's.

Discussion about a message to be given to programming standards developers, if and how to inform about the presence of a culturally correct ordering method. The TR 10176 has to deliver a message to all developers of programming languages: all P.L. have a default behaviour (default locale), locale switching invokes user locale at execution time - functionality must be provided.

Compile time locales must not be changeable. Programming languages have the right to add characters for use in identifiers in addition to the "letters" as proposed in WG20's paper of extended identifiers.

Discussion and comments to WD#3 in the meeting:

Sato wants list of requirements; and conformance statement for tailored use of the comparison engine.

Arnold promised to get a copy of the LI procedure call standard to Alain and to Keld.

Possibility to allow tailoring in such a way that the default sequence applies with the exception of the user's own script. That would allow simpler definition of tailoring without touching the "foreign" scripts. A nice and simple way for tailoring is needed.
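The tailoring idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not text from WD#3: the tiny default sequence, the Danish-style tailoring and the function name tailor are all hypothetical. The user supplies only the order of the user's own script; every other script keeps its default relative order.

```python
def tailor(default_order, own_script_order):
    """Return a sequence in which only the user's script is re-ordered.

    The user's letters are placed as one block where the first of them
    occurred in the default sequence; all other characters keep their
    default relative order.
    """
    own = set(own_script_order)
    result = []
    inserted = False
    for ch in default_order:
        if ch in own:
            if not inserted:
                result.extend(own_script_order)
                inserted = True
        else:
            result.append(ch)
    if not inserted:
        result.extend(own_script_order)
    return result


# Hypothetical default sequence: basic Latin letters, then a few Greek ones.
default = list("abcdefghijklmnopqrstuvwxyz") + list("\u03b1\u03b2\u03b3")
# Hypothetical tailoring: three extra letters ordered after z.
danish = list("abcdefghijklmnopqrstuvwxyz\u00e6\u00f8\u00e5")

order = tailor(default, danish)
ranked = sorted(["\u00f8", "z", "a", "\u03b2", "\u00e5"], key=order.index)
```

The Greek letters are never mentioned in the tailoring and stay where the default sequence put them relative to each other.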

ISO/IEC 10646 Issues

 407 Unique identifiers for characters in ISO/IEC 10646       SC22 N1968     

SC22 is asking SC2/WG2 to define short, unique identifiers, based on a US request at the plenary (Hart / Winkler).

Cultural convention-specification standard

 384 prENV 12005:1995  Procedure for European Registration    CEN TC 304     
     of Cultural Elements (final draft)                       prENV 12005    
 411 Cultural convention specification  IS 14652 WD #2        K. Simonsen    

The working draft was discussed, especially the need for language independent specifications, possibly with an example of a binding to "C". The time schedule for WD #3 and WD #4 was discussed (November 1995 and March 1996 respectively). By mid 1996 we want to have a document for registration.

Other business

 390 CEN and ISO cooperation                                  Þ.K. Ólafsson  
 405 CEN/TC304/PT01 User requirements study in the field of   CEN/TC304      
     character set technology                                                
 412 CEN presentation to SC22 by Þorvaður Kári Ólafsson                      

Þorvaður Kári Ólafsson gave a presentation about CEN and its TC304. CEN is the European equivalent to ISO, but takes instructions from the European Commission. First work was on character sets, later it was extended to cover the use of character sets - thus overlapping work with SC2, SC22, SC18, SC21.

CEN members are the 15 EU countries, Finland, Iceland, Switzerland. Affiliated members can be any ISO members in Europe; nine are members now.

Character sets: a mandatory set of about 1200 characters for all official languages, including Russian, Ukrainian, Greek, Turkish, polytonic Greek, etc. An extended subset of about 3000 characters, based on pages of 10646, all Latin, Greek, also Vietnamese, African, etc.

TC304 next meeting in Barcelona

WG1: ordering rules

WG2: cultural elements, registration, TR on locales

WG3: character sets, subsets

WG4: transformations, conversions, transliterations

User requirement study PT01: to show EU what TC304 is doing and what their plans are.

Interesting projects for WG20:

L/11113a:     Glossasoft project                                               
L/11113b      Message interface                                                
L/1311:       European default locale, based on European sub repertoire of     
              10646                                                            
C/1213b:      General European rules for fallback representation               
C/313:        Tools and transformation tables                                  
L/1311:       European default locale                                          
L/1213a:      European conversion and fallback rules                           
L/111381a     Ordering                                                         
L/111381b     Ordering                                                         
C/211         Cultural elements (unregistered)                                 
C/31311       General model for character transformation                       
L/111382      Ordering of UCS characters                                       
L/1312        More cultural conventions                                        
L/132         Formal specifications techniques for cultural data               
C3131         Guide on conversion between UCS coding forms (APIs only)         
C331          UCS in programming languages                                     
C/3311b       Guidelines for the design of internationalization (for           
              programming languages)                                           
C/3312        Language independent API specification for internationalization  
              and UCS                                                          
C/3311a       Support for UCS in programming languages                         

Ideas for cooperation:

- combined meetings, planning and technical

- allow access to mailing lists

Break out (Keld, Alain, Þorvaður) prepared paper (N413) as a first plan and a listing of overlapping projects and current status. The paper recommends common meetings, cross membership, fast track procedures, and the development of a synchronization mechanism. Questions about what counts more, the European position or the CEN position in case of conflicts.

After discussing the paper, agreement for a resolution to forward to SC22 and to CEN/TC304 was reached.

Specific discussion on cooperation on the sorting standard:

1. Exchange documents October 10

2. Member body comments November 15

3. Keld to arrange for a meeting of Alain and possibly other WG20 members with CEN members or

4. Arrange for telephone bridge for discussion of differences with good preparations

 399 Modification of DIS balloting procedures                 SC22 N1938     
 403 SC22 ad hoc report on the use of WWW                     SC22 AN-7      
 404 SC22 Recommendations on JTC1's Electronic Document       SC22 N1965     
     Formatting Guidelines                                    (AN-1R2)       
 406 WWW Sample pages                                         SC22 ad hoc    
                                                              AN-8           

Winkler wants to reduce the paper mailing, and also to prune the mailing list of people who don't need the documents any more. The group decided that a request for confirmation can be sent to the recipients of WG20 mailings. This questionnaire could also contain the question about medium. Some documents should remain on paper, especially working documents. National bodies have to get documents.

 402 NP for Internationalization API standard                 WG20           
                                                              SC22 N1962     

Keld presented a draft document and explained his plan for keeping the standard language independent, but easy to bind to C.

Review of Priorities and Target Dates

Review of Actions Items from this meeting

The action items were discussed and agreed upon. They will be added to SD-5.

Approval of Resolutions

The draft resolutions were discussed and approved. (SC22/WG20 N420)

Adjournment

The meeting was adjourned at 4:45pm.