SC22/WG20 N774
L2/00-307
Collection of reactions to the WG20 convenor's
"Personal thoughts about the future of WG20"
Part 1: August 30 through September 6, 2000
Akio Kido suggested that I collect all reactions to my proposal about the future of WG20 in one document for easy reference. Because of the interest in this subject, the document became rather lengthy, so I decided to put a linked index in front of it; that allows you to go straight to the contribution that interests you. I did not do any formatting. My apologies if the text in HTML does not look as good as it could, but I wanted to maintain the original form of the e-mails the way I received them.
The document got too long – I had to split it into parts:
Parts | SC22/WG20 | NCITS/L2 - UTC
Part 1, from August 30 – September 6, 2000 | |
Part 2, from September 6 on … | |
Index with the latest document on top:
National Body | Name | Date | Content | Supports
USA | Asmus Freytag | 2000-09-06 | | Y
Canada | Dave Blackwood | 2000-09-06 | | N
Norway | Keld Simonsen | 2000-09-06 | | N
USA | Ken Whistler | 2000-09-06 | | Y
Germany | Marc Küster | 2000-09-06 | | ?N?
Sweden | K.I. Larsson | 2000-09-05 | | Y
USA | Ken Whistler | 2000-09-05 | | Y
Canada | Alain LaBonté | 2000-09-05 | | N
Ireland | Michael Everson | 2000-09-05 | | Y
Ireland | Michael Everson | 2000-09-05 | | Y
Ireland | Michael Everson | 2000-09-05 | | Y
Japan | Akio Kido | 2000-09-01 | | Y
Germany | Marc Küster | 2000-09-01 | | Y
Sweden | Kent Karlsson | 2000-09-01 | | Y
UK | John Clews | 2000-09-01 | | ?N?
Japan | T.K. Sato | 2000-08-31 | | Y
Norway | Keld Simonsen | 2000-08-31 | | N
Sweden | Kent Karlsson | 2000-08-31 | | Y
Japan | Masayuki Takata | 2000-08-30 | | Y
Japan | Akio Kido | 2000-08-30 | | Y
SC22 N 3164 | Arnold F. Winkler | 2000-08-30 | | Y
Individual contributions on e-mail:
Japan, Akio Kido, August 30, 2000
I agree with Arnold's thought.
It is a good idea to work in the CLAUI. Some of our standards and TRs are
tightly related to ISO/IEC 10646, and without the involvement
of SC2 we cannot maintain those ISs and TRs and keep them aligned
with the latest ISO/IEC 10646. Following ISO/IEC 10646 means following
a moving target, so we do need to work together with SC2.
Best regards,
Akio Kido (Globalization CoC, Yamato, IBM & Co-chair person of Li18nux)
Japan, Masayuki Takata, August 30, 2000
As an individual, I totally agree with your thoughts. Thanks for the
good ideas. This is not a small thing for us all, so it will take some
time to reach a conclusion in the Japanese working group. However, I have
no doubt that we will agree with you, at least in principle.
As the Head of the Japanese equivalent of WG20, I'll try to find a group
consensus and, probably, delegate Kido-san to discuss in the SC22 Nara
Plenary.
Regards,
TAKATA Masayuki
Sweden, Kent Karlsson, August 31, 2000
I agree in principle with your suggestions (with the exception that
I would prefer the withdrawal also of 14652, assuming that no-one is
willing to rewrite it from scratch foregoing POSIX compatibility;
similar problem with the registry standard).
From a formal point of view, the responsibility of SC22/WG20 matters
within the Swedish NB was very recently transferred from our AG22 to
our AG2, which takes care of TC304, SC2, SC35, and now also SC22/WG20
matters. So from an NB point of view, a transferral of most WG20
projects to SC2 would (now) not make any difference.
Kind regards
/kent k
Norway, Keld Simonsen, August 31, 2000
> Friends,
>
> After some hefty thinking and soul-searching, I decided to send the attached
> personal contribution to SC22 for consideration at the plenary in Nara. I
> will also send it to CLAUI for consideration at their meeting in October in
> France. I wanted you to see it before the official SC22 distribution.
>
> I do hope, you agree with me, at least in parts.
I think it is good that you have done some thinking about it.
I do think that WG20 has a role to play.
In my mind we are now about to begin the real work of WG20.
WG20 was set up to standardize i18n functionality, that is
APIs and also to find out what i18n is all about.
We have completed (more or less) location of i18n and
(with the usual time that it takes) now standardized
kind of what was standardized in other WGs of ISO wrt. i18n.
Then we have done a little more, extended some specifications,
and we made 14651.
So now we are "lords in our own house", and we can begin
standardizing APIs and go beyond standard i18n functionality.
There is a lot of functionality to cover, before we can have
truly internationalized, portable applications.
I think the standardization of APIs and formats for data
specifications are best done in SC22, which standardizes
libraries, and also interacts with the many ISO programming languages.
Moving WG20 activities into SC2, as Arnold Winkler proposes, would be an error, IMHO.
APIs are not in the scope of SC2. Neither are sorting or
character attributes. And sorting and character attributes
have for a long time been a SC22 issue, viz. C, and other
programming languages islower(), isupper() etc. I do not
see the kind of expertise in character attributes at SC2 meetings,
but maybe they are available in Unicode, as the Unicode
Technical Committee chair, Arnold Winkler, is hinting at, and maybe
we should just leave everything to Unicode, and stop making open
world-wide standards. In that way all our culture, not just
our MacDonald hamburgers, can be really standardized:-)
Kind regards
Keld
Japan, T.K. Sato, August 31, 2000
Arnold, you are going to make what I wanted.
I agree with you in principle. For each detail, such as the 2375 extension,
I think some more discussion might be necessary.
Sato
United Kingdom, John Clews, September 1, 2000
I'm sending my thoughts via <SC22WG20@dkuug.dk> which is probably
similar in content to the list of individuals.
I think Keld Simonsen sums up several of the things I'd considered
myself, and I find myself agreeing with several of his points, as
noted below: I also throw in some other issues which may be related
to the wider picture.
In message <20000831202309.A3987@rap.rap.dk> Keld wrote:
> I do think that WG20 [still] has a role to play...
> WG20 was set up to standardize i18n functionality...
No other ISO/IEC JTC1 committee is doing this at present, although we
should certainly continue our liaison within and outside of ISO/IEC
JTC1 committees - in fact JTC1/SC22/WG20 seems to be quite good at
that.
It's also an ideal working group in terms of size, cost
effectiveness, and what it can get done.
> We have completed (more or less) location of i18n and
> (with the usual time that it takes) now standardized
> kind of what was standardized in other WGs of ISO wrt. i18n.
> Then we have done a little more, extended some specifications,
> and we made 14651...
Which is certainly our big success story, also involving extremely
valuable liaison and participation with the Unicode Technical
Committee. This still needs more work in its second edition.
> There is a lot of functionality to cover, before we can have
> truly internationalized, portable applications.
>
> I think the standardization of APIs and formats for data
> specifications are best done in SC22, which standardizes
> libraries, and also interacts with the many ISO programming languages.
>
> Moving WG20 activities into SC2, as Arnold Winkler proposes,
> would be an error, IMHO.
> APIs are not in the scope of SC2. Neither are sorting or
> character attributes. And sorting and character attributes
> have for a long time been a SC22 issue, viz. C, and other
> programming languages...
> I do not
> see the kind of expertise in character attributes at SC2 meetings,
> but maybe they are available in Unicode, as the Unicode
> Technical Committee chair, Arnold Winkler, is hinting at...
Unicode Consortium and the UTC have an extremely important role.
So does ISO, in enabling more international input at expert level
than the UTC does on its own.
It may be useful to have a view on this (not necessarily official) from
the UTC, or somebody within it.
The UTC and ISO/IEC JTC1/SC2/WG2 make a valuable complementary pair:
the UTC and ISO/IEC JTC1/SC22/WG20 also make a valuable complementary
pair.
In passing I also notice that comments from Europe that I have seen
tend towards keeping ISO/IEC JTC1/SC22/WG20, and that comments from
the USA and Japan that I have seen tend towards moving away from
ISO/IEC JTC1/SC22/WG20, although I wouldn't read anything too much
into that.
However, it does remind me that the European Commission has
commissioned Price Waterhouse Coopers, if I have the details correct,
to evaluate future work in CEN/TC304: Information and Communications
Technologies: European Localization Requirements.
Considering the degree of overlap of some aspects of work between
ISO/IEC JTC1/SC22/WG20 and CEN/TC304, and between the Unicode
Technical Committee and CEN/TC304 to a lesser degree (probably
complementing each other rather than overlapping) it may be useful for
ISO/IEC JTC1/SC22/WG20 and/or the UTC to provide some input into that
process in due course as well, to see if this work can also provide a
wider picture of a useful future in ICT standardisation.
Best regards
John Clews
Sweden, Kent Karlsson, September 1, 2000
> -----Original Message-----
> From: Keld Jørn Simonsen [mailto:keld@dkuug.dk]
...
> I do think that WG20 has a role to play.
> In my mind we are now about to begin the real work of WG20.
> WG20 was set up to standardize i18n functionality, that is
> APIs and also to find out what i18n is all about.
> We have completed (more or less) location of i18n and
> (with the usual time that it takes) now standardized
> kind of what was standardized in other WGs of ISO wrt. i18n.
> Then we have done a little more, extended some specifications,
> and we made 14651.
>
> So now we are "lords in our own house", and we can begin
> standardizing APIs and go beyond standard i18n functionality.
> There is a lot of functionality to cover, before we can have
> truly internationalized, portable applications.
>
> I think the standardization of APIs and formats for data
> specifications are best done in SC22, which standardizes
> libraries, and also interacts with the many ISO programming
> languages.
>
> Moving WG20 activities into SC2, as Arnold Winkler proposes,
> would be an error, IMHO.
> APIs are not in the scope of SC2.
True, API development/standardisation should not be done in SC2.
But nobody is suggesting that it should. The suggestion is to
cancel the API standard development of WG20, due to lack of
interest and lack of quality. What is troubling is that if
14652 and the corresponding API standard continue, Linuxers
and C (C++, POSIX) standardisers will be misguided by them.
The only hope for the i18n file format and API standards to
be of any use would be to start over from scratch, essentially
ignore POSIX, but pick up the very best from the others, and
do something completely new. But I don't see that happening in
WG20 at this time.
> Neither are sorting or
> character attributes. And sorting and character attributes
> have for a long time been a SC22 issue, viz. C, and other
> programming languages islower(), isupper() etc. I do not
> see the kind of expertise in character attributes at SC2 meetings,
> but maybe they are available in Unicode, as the Unicode
> Technical Committee chair, Arnold Winkler, is hinting at,
Character attributes and ordering certainly belongs in SC2.
That's where the expertise about such matters is to be found
within ISO, not in SC22. C (and C++ and POSIX) has botched
both character and character string representation, as well
as character properties. C does NOT specify what wchar_t is,
leaving it open to each implementation to do whatever, nor does
it specify any other suitable datatype, and what char is is
locale-dependent. Ada fares a bit better on that point where
Wide_Character and Wide_String are UCS-2 (except in non-conforming
implementations). In C (and C++ and POSIX) islower etc. are
locale-dependent, not character-dependent. C and POSIX are
definitely the wrong places to look for guidance regarding this.
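To make the distinction between locale-dependent and character-dependent classification concrete, here is a minimal sketch in Python (an illustration added for this collection, not part of the original message). It assumes that a "character-dependent" test means looking up a property of the character itself in the Unicode Character Database, so the answer does not change with the process locale. The helper name is_lowercase_letter is hypothetical.

```python
import unicodedata


def is_lowercase_letter(ch: str) -> bool:
    """Return True if the character's Unicode General Category is Ll.

    The result depends only on the character, not on any locale setting.
    """
    return unicodedata.category(ch) == "Ll"


# U+00E9 is LATIN SMALL LETTER E WITH ACUTE, U+00DF is LATIN SMALL LETTER SHARP S.
samples = ["a", "\u00e9", "\u00df", "A", "1"]
results = {ch: is_lowercase_letter(ch) for ch in samples}
```

By contrast, a locale-dependent islower in C may classify the same byte value differently depending on which locale is active.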
> and maybe
> we should just leave everything to Unicode, and stop making open
> world-wide standards. In that way all our culture, not just
> our MacDonald hamburgers, can be really standardized:-)
I find that statement to be uncalled for. In my experience
Unicode consortium is quite open to input, more so than
ISO, and definitely more so than W3C, and extremely much
more so than TC304... And the results from Unicode consortium
are also more open than those from ISO.
> -----Original Message-----
> From: Ordering@sesame.demon.co.uk [mailto:Ordering@sesame.demon.co.uk]
...
> However, it does remind me that the European Commission has
> commissioned Price Waterhouse Coopers, if I have the details correct,
> to evaluate future work in CEN/TC304: Information and Communications
> Technologies: European Localization Requirements.
"European Localisation Requirements" sounds good. Unfortunately
TC304 has come up with some rather useless 'deliverables':
reports that misrepresent Unicode/10646, and seem to argue for
increased use of ISO 2022 (currently at about 0% usage in Europe);
botched MES-1 and MES-2 subsets, lacking MES-3 subsets; a report
on fall-back that is hopelessly outdated; "euro-locales" based
on POSIX 'locales' and that in addition are ambivalent to
localisation (common ordering, but language varying week/month
names); not to mention an internal quarrel about *exactly* what
constitutes Europe (as if that really mattered for TC304).
And the involvement of IT (and communication) industry in
TC304 has, as far as I can tell, been extremely small.
Kind regards
/kent k
Germany, Marc Küster, September 1, 2000
Dear Colleagues,
> > After some hefty thinking and soul-searching, I decided to send the attached
> > personal contribution to SC22 for consideration at the plenary in Nara. I
> > will also send it to CLAUI for consideration at their meeting in October in
> > France. I wanted you to see it before the official SC22 distribution.
> >
> > I do hope, you agree with me, at least in parts.
>
Arnold's thoughts are indeed stimulating and not unfounded. 14651
certainly is WG20's most relevant project at this point in time, and it is
going to become an international standard very soon now. While there is
the need for immediate revision to cover at least the
repertoire of 10646-1:2000, the end is in sight.
There is no point in keeping WG20 as a cosy debating club.
[Keld]
> I do think that WG20 has a role to play.
Still, I do agree with Keld that WG20 may have a role to play. Whether it
is currently doing so in the best possible manner is open to debate -- you
know the German views on both 14652 and the API standard --, but that means
neither that such work is superfluous nor that there is a lot of
value in WG20's deliverables.
When we agreed in Québec to look for "customers" of WG20's deliverables,
especially for the API standard, it was in this spirit. If no-one is
interested in them, by all means cancel them. Yet, I think it is
worthwhile to try. If that may take, as Kent suggests and I agree, drastic
overhaul of the papers, why not?
Moreover, John is right in pointing to the market study that the European
Commission has ordered on CEN/TC304 and that is to be delivered within a
few months. This study by Price Waterhouse Coopers may or may not filter
out important new areas of work for TC304, it may recommend anything in
between closure and drastic extension of responsibilities. In any case
many of the conclusions they draw on a European level will be of value to
WG20, and it is worthwhile to scrutinize them before taking action either
way.
What I am driving at is not keeping WG20 alive at all cost, quite the
contrary. I'd, however, counsel patience and level-headed evaluation of
the development of the next six months.
Best regards,
Marc
***************************************************
Marc Wilhelm Kuester
Computing Centre of the University of Tuebingen
Dept. Literary and Documentary Data Processing
Japan, Akio Kido, September 1, 2000
I think what Arnold proposes is NOT a simple termination of WG20 work;
rather, he proposes to do the work in a more appropriate place.
I would like to understand why some people want to stick with the current WG20.
I'm talking only from the organizational viewpoint. I personally observe that
other WGs in SC22 might have less interest in the further work of WG20,
since we have very little participation of representatives from other
working groups in our meetings. Rather, I observe, they might pay
more attention to the work of ISO/IEC SC2 and Unicode.
Of course, if we start to work in a new group, we should discuss
our further work from scratch. In order to do that, we first need to
cancel our projects that have not yet reached the final stage.
If an existing project is still important to the market, we can issue an NWI again
with a new, workable business plan.
I believe that what we need to discuss is NOT the importance of
some existing work, but where and how we can contribute to internationalization
in a timely manner. We should recognize that some of our projects are delayed.
As the convenor's report said, we cannot give priority to the API standard, although
the ISO three-year timer has already expired.
The reason why I agreed with Arnold is that I think he proposes some
workable actions.
1) Complete the ordering standard ASAP.
2) Terminate WG20 activity once.
3) Bring the work that requires maintenance to appropriate groups.
( Those who are interested in the maintenance work can join the group. )
That proposal would also imply:
a) If no one is interested in some WG20 work which requires maintenance,
those standards or TRs should be frozen.
b) A project to which we cannot give priority in the current WG20 has a
chance to have its importance re-evaluated and to re-start in a new group.
( Of course, if a project cannot get enough support and
participation, it should be a dead project; we should
not re-start such cancelled projects. )
Best regards,
Akio Kido (Globalization CoC, Yamato, IBM & Co-chair person of Li18nux)
Ireland, Michael Everson, September 5, 2000
I am in complete agreement with Arnold's contribution. The only thing I
would say is that I would like the plan for how SC2 is to take over
responsibility for the 14651 table to be elaborated. On the other hand
maybe that is an SC2 matter.
Michael Everson ** Everson Gunn Teoranta ** http://www.egt.ie
Ireland, Michael Everson, September 5, 2000
>Character attributes and ordering certainly belongs in SC2.
Maintaining ISO standards on the character attributes would unfairly burden
SC2 and it would be impossible to make timely changes, such as have been made
numerous times as the bugs in the bidi algorithm have been ironed out.
Industry (UTC) is the right place for that work, as it is eminently
practical. The Generic Ordering Template is easier to maintain and has
already been standardized.
Michael Everson ** Everson Gunn Teoranta ** http://www.egt.ie
Ireland, Michael Everson, September 5, 2000
>Moving WG20 activities into SC2, as Arnold Winkler proposes, would be
>an error, IMHO. APIs are not in the scope of SC2. Neither are
>sorting or character attributes.
Character attributes should not be maintained by SC2 because of the nature
of the balloting process. But you are wrong that sorting is not in the
scope of SC2. ALL of these scripts when presented for encoding have had
ordering scrutinized by WG2 experts in order to put the code tables
together. All the expertise for this is in the UTC and in WG2. To maintain
the default table, it is logical and natural for SC2 to handle this,
whether in WG2 or a new WG4.
The script and linguistic expertise is NOT available in the Programming
Languages subcommittee.
Michael Everson ** Everson Gunn Teoranta ** http://www.egt.ie
Canada, Alain LaBonté, September 5, 2000
Outcome of our Québec 2000-09-05 CAC/SC22 meeting on this issue
Generally speaking: I18N is a fundamental requirement on programming
languages (PL) and PLs don't take care enough about it (currently APL,
COBOL, C, POSIX, ADA and FORTRAN communities have dealt with such issues to
a certain point, and most others have not at all; those who did something
did not completely do what needs to be done), that may be the main problem.
We lose something if we weaken SC22/WG20 too much, if we do not cancel it.
I18N issues need to be reminded all the time to the PL community at least
in Plenaries. The CPL community think this would be a big loss and probably
a mistake at least for this reason. If SC2 takes the lead of most
of SC22/WG20's program of work, programming language i18n will be
neglected even more than today.
On the other hand we know that most, if not all, SC22/WG20 experts are
already working too in SC2, and SC22/WG20 and SC2 work is already
integrated in some countries, including Canada, due to the small community
of experts. That would continue anyway.
There is a need for the i18n community to keep a handle on PL activities.
SC22/WG20 needs to reflect on the reshuffling of current and maintenance
work to perhaps have a greater impact on ISO/IEC PL activities. Canada
would be in favour of reexamining all work having this as a goal.
Specific work:
IS 14651 Sort Standard: can be anywhere... it has to be in SC22 or in
SC2... SC22's advantage would be to maintain the thing open to PL standards
development more. It is sure that SC2's strong collaboration is required,
and if it were in SC2, strong collaboration would also be required from the
PL community.
TR 14652 Specification Method for Cultural Conventions: It is indeed POSIX
oriented but the POSIX WG always said it belonged to WG20... The POSIX WG
now has a lot of challenges and it would not be timely to transfer that
project there (to WG15). Perhaps we need to de-emphasize the perception
that this has more to do with POSIX than with PLs, a perception which
should be wrong, otherwise the TR has at least partly failed. Controversial
issues should be removed and the TR should be enhanced in WG20, with strong
collaboration with SC2.
ISO/IEC 15435 API standard project: we believe in Canada that an I18N API
standard is required, and that otherwise kitchen-made solutions will rather
tie customers to some developers, which is the opposite goal of
international standards. If the current proposal is not OK, then we should
at least try, in a short-time study with a precise deadline, if at all
possible without annoying intellectual property, to find the commonalities
of what is being done by producers and see if we can make a standard with
it or at least a TR. That would belong in SC22 in Canada's opinion.
ISO/IEC 15897 Cultural Registry: Canada believes that this should be
managed in the same way as the Character set Registry with an advisory
group. This advisory group could very well be formed with a mix
representation from SC22/WG20, SC2 and perhaps SC35 (User interfaces). As
everybody said in the past, what we need is an independent "IBM green book"
(National Language Design Guide volume 2). We believe that this registry is
about this and should be marketed better. This data is required by a lot of
communities in JTC1 and even electronic commerce standards community
demonstrated an interest in this in the BT-EC report.
TR 10176 Programming languages standards guidelines: its annex on
identifier-related characters is in our opinion linked to SC2's interests,
while all the rest belongs to SC22. Again a strong collaboration between
SC22 and SC2 is required. The place for maintaining this appears to be in
SC22/WG20.
Other issues: a lot of I18N issues belong to the user interface domain.
This is dealt with in SC35. We should remember that the whole domain of
I18N is a horizontal issue in JTC1 and that cultural and linguistic
adaptability remains a strategic thrust of that super-committee. The TD on
CLAUI will hold a meeting in Southern France in October. That should be an
opportunity for SC2 and SC22's convenors and editors to reflect on all
these problems and come up with a plan for the reshuffling of those
activities in the whole of JTC1, but more particularly in SC22/WG20 and SC2.
USA, Ken Whistler, September 5, 2000
Many thanks to Alain for providing a timely report of the deliberations
on this topic at the CAC/SC22 meeting.
I have a few observations on some of the conclusions that Alain
and his colleagues reached.
> Outcome of our Québec 2000-09-05 CAC/SC22 meeting on this issue
>
> Generally speaking: I18N is a fundamental requirement on programming
> languages (PL) and PLs don't take care enough about it (currently APL,
> COBOL, C, POSIX, ADA and FORTRAN communities have dealt with such issues to
> a certain point, and most others have not at all; those who did something
> did not completely do what needs to be done), that may be the main problem.
> We lose something if we weaken SC22/WG20 too much, if we do not cancel it.
> I18N issues need to be reminded all the time to the PL community at least
> in Plenaries. The CPL community think this would be a big loss and probably
> a mistake at least for this reason. If SC2 takes the lead of most
> of SC22/WG20's program of work, programming language i18n will be
> neglected even more than today.
My main concern here is that the problem of internationalization
is somewhat misconstrued here as a "requirement on programming
languages." Some of the deep trouble that WG20 is in is the result
of attempting to conceive internationalization as being in the
domain of formal programming languages, and thereby setting up an
agenda to create standards that can be grafted back onto a whole
host of PL's. This is, I am afraid, bound to fail, since it is taking
an inherently complex field, full of user-specific cultural behavior,
and trying to find a way to bolt on extensions to existing programming
language standards (some of them *very* old, like COBOL and FORTRAN)
to deal with it.
I, instead, see the *appropriate* adaptation of the programming
languages to consist essentially of making sure they interoperate
with 10646 data and program text, since that seems to be the way
the world is heading. This should consist of specifying that the
languages will work with UTF-8 (as well as Shift-JIS, or whatever)
as program text, and to allow arbitrary textual content into comment
fields, for example. And the C standard has adjusted the definition
of wchar_t to take UCS into account already.
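As a small illustration of what interoperating with 10646 data can mean in practice, the following Python sketch (added for this collection, not part of the original message) decodes the same text from UTF-8 and from Shift-JIS and shows that both yield the same sequence of UCS code points. The sample string and variable names are assumptions chosen for the example.

```python
# The sample text is three kanji, written with escapes so the source stays ASCII.
text = "\u65e5\u672c\u8a9e"

# The same text has different byte representations in different encodings.
utf8_bytes = text.encode("utf-8")
sjis_bytes = text.encode("shift_jis")

# Decoding either representation gives back the same UCS code points.
decoded_from_utf8 = utf8_bytes.decode("utf-8")
decoded_from_sjis = sjis_bytes.decode("shift_jis")

code_points = [f"U+{ord(c):04X}" for c in decoded_from_utf8]
```

Once text is held as code points, the rest of a program does not need to know which external encoding it arrived in.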
*Maybe* some adaptations to extend identifier syntax should be
allowed -- but that would depend on the language. (For example,
there really is no point in messing with FORTRAN in this way.)
Otherwise, most internationalization extensions for PL's are just
asking for trouble, if they weren't designed in from the start.
Does that mean I don't care about internationalization? Not at all.
It is just that I am convinced it is a *software design* issue,
and not a programming language issue at all.
The very best internationalized software I *ever* worked on was
the Metaphor Data Interpretation System. It was Unicode-based,
had multiple language support, both for user messages and for
all aspects of the GUI, including complete dynamic forms generation
that adjusted all graphic objects to the translated text. It
supported localization formatting hierarchies, from individual
cells in spreadsheets, through applications, through a user's
desktop preferences, to network system settings. It had provisions
for user-settable locale-specific collations.
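The layered formatting preferences described above can be pictured as a lookup chain in which the most specific setting wins. The Python sketch below is an illustration added for this collection. The message does not say how Metaphor implemented this, so the layer names, keys, and values are all hypothetical.

```python
from collections import ChainMap

# Hypothetical preference layers, from least to most specific.
network_defaults = {
    "date_format": "YYYY-MM-DD",
    "decimal_sep": ".",
    "collation": "default",
}
desktop_prefs = {"decimal_sep": ",", "collation": "sv"}
application_prefs = {"date_format": "DD.MM.YYYY"}
cell_prefs = {"decimal_sep": "."}

# ChainMap searches its maps left to right, so the cell overrides the
# application, which overrides the desktop, which overrides the network.
effective = ChainMap(cell_prefs, application_prefs, desktop_prefs, network_defaults)
```

With these assumed values, a lookup of "collation" falls through to the desktop layer, while "decimal_sep" is answered by the cell itself.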
Now how did Metaphor do such a thing? Did it depend on internationalization
in the programming language? Hardly. The entire system was programmed
using C -- but all direct OS calls were forbidden (you had to go
through a strictly controlled set of Metaphor kernel routines, so that the
system architects could guarantee portability and stability), and
likewise all locale-related library calls were also forbidden (so that
the Metaphor system was not inexplicably sensitive to differences
in machine set-up that could not be filtered through explicit
user preferences).
Nothing in WG20's program of work related to providing standard
extensions (API's or whatever) for PL's would have helped Metaphor
one whit in that regard -- all such extensions would also have been
tossed so that the system architects could do the software design
that they wanted. C was just treated as it should be -- as a general
purpose programming language widely available on multiple machine
platforms. After that, it is up to the system architects and
software designers to do what they need to do, using the general
purpose programming language as a basic tool for instantiating
algorithms on real machines.
Does this mean that internationalization should *never* be a part
of a PL? Well, no -- just that if you want to do that, it needs
to be carefully built into the language by formal language
designers, and preferably from the very start. Java is the
best example we have to date of a language done this way, and even
that has significant flaws. But creating a "standard" for
internationalization, and then telling all the language committees
to add it to their languages so they will better support
internationalization, is just a recipe for failure. That is why
I have been so opposed to the proposed API standard, 15435.
>
> ISO/IEC 15435 API standard project: we believe in Canada that an I18N API
> standard is required, and that otherwise kitchen-made solutions will rather
> tie customers to some developers, which is the opposite goal of
> international standards.
You are *always* tied to some developers. Someone has to implement
the behavior behind an API, whether you make it an international
standard or not. If you make a particular internationalization API
an international standard and then succeed in getting one or more
language committees to graft it onto their formal language standard,
then all you have managed to do at that point is to push the problem onto
the developers at Microsoft, Symantec, IBM, Borland, Sun, and the gnomes
maintaining Gnu C who then have to implement those extensions. And
customers will in turn be tied to those developers when they use
those tools.
If you don't mandate a particular API in an international standard
connected to a programming language, then *other* developers will come
forth with class libraries and components to do internationalization.
And yes, if a customer chooses to use them, they will be "tied" to
some particular developers. But guess what -- those developers are
going to be offering class libraries and components whether or not
an ISO I18N API standard is ever created -- and customers have been
and will continue to be choosing such libraries to accomplish what
they need to do in their applications.
If the worry is that this is all too chaotic, and standards-making
would lead to better interoperability in this area, I would argue
that at this point this is a little bit like trying to sweep back
the sea from the beach. Everyone would be better off if those of us
who care about internationalization and interoperability worked at
providing usable, reliable resource lists online for developers to
depend on in building class libraries and components.
As an aside, that is why I brought Graham Rhind's address resource
as an exhibit to the Denmark meeting -- it shows the kind of information
collection, compiling, and publication that is actually useful progress
in dealing with internationalization. Instead of just pooh-poohing the
inevitable mistakes in any compilation of that scale, and then
dismissing the effort, it would behoove those involved in international
standards in this area to ask themselves why internationalization
software engineers are immediately attracted to such compilations
as useful for their work, but show little interest in developing
standards for an "internationalization API".
Why does the IBM Green Book get an honored place on the shelf of
every internationalization engineer in the UTC, while the
proposed API for 15435 gets laughed at?
> If the current proposal is not OK, then we should
> at least try, in a short-time study with a precise deadline, if at all
> possible without annoying intellectual property, to find the commonalities
> of what is being done by producers and see if we can make a standard with
> it or at least a TR. That would belong in SC22 in Canada's opinion.
There *is* no commonality at the API level. A C library is different
from a C++ library is different from a Java class library is
different from a software component like a Java Bean. How these
things are structured is a matter of software design which WG20
is ill-equipped to handle -- and which, for that matter, the
PL standards committees are also not prepared to deal with.
The commonality is in the set of problems that people are trying to
solve in software and the kinds of data they need to generate the
tables for parsers, formatters, renderers, converters, transliterators, and
translators.
In any case, I would urge WG20 to *first* do a market relevance
study *before* starting down the road to do some project to compare
all the commercial internationalization libraries looking for
commonalities that could be turned into a standard. If the market
is not clamoring for a standard in this area, then JTC1 should not
be laboring to produce a standard whether it will be used or not.
--Ken
Sweden, K.I. Larsson, September 5, 2000
Arnold,
I completely agree with your views in your Aug. 30 contribution. Are you
going to add the issue to the SC2 Plenary agenda?
Incidentally it seems we foresaw this development here in Sweden, since we
decided earlier this year to transfer responsibility for SC22/WG20 matters
from its traditional Swedish WG into our Character Set WG.
Best regards, and see you in Athens!
KI
USA, Ken Whistler, September 6, 2000
From: Kenneth Whistler [kenw@sybase.com]
Sent: Friday, September 01, 2000 4:40 PM
Subject: Some technical issues regarding the future of SC22/WG20
================================================================
Arnold Winkler has recently raised a number of issues regarding the future
of SC22/WG20 and the standards that it maintains or has under
development, for consideration at the upcoming SC22 plenary in Nara.
Chief among the issues he raised is whether WG20 is now at the
end of its useful life, and whether it should be sunsetted, with
its various projects redistributed over time to other committees as
appropriate for maintenance.
I want to review some of the technical issues that may have a bearing
on where such maintenance should be done, and to further consider
whether some of the projects currently under development in WG20
have enough technical merit to warrant their continuation in some
other committee, should WG20 itself be dissolved sometime in the
not-so-distant future. (Presumably any such dissolution would be
judiciously staged, over a 1-to-2 year period, to allow completion,
termination, or transfer of responsibilities, as appropriate.)
The charter of WG20 was fairly broad: standards in the area of
internationalization, as reflected in the first published TR
developed by WG20: TR 11017, "Framework for internationalization".
However, the committee has, in recent years, focused on a few
significant areas, so I will concentrate my comments on those areas
that have, de facto, constituted the majority of WG20's work.
1. Collation
WG20 developed ISO 14651, soon to be approved and published as an
international standard. This standard needs an immediate
amendment, to deal with the larger repertoire of characters added
for 10646-1:2000 (= Unicode 3.0). The question arises as to the
appropriate venue for that maintenance, if not WG20. The alternatives
being argued are SC22 or SC2.
This issue is actually rather easy to resolve on technical grounds. The
character-related expertise in SC2, and in particular in SC2/WG2
(maintainer of ISO 10646) is exactly what is needed to be able to
do the extensions of the tables required for ISO 14651. And that is
in fact the main work that will need to be done for 14651 maintenance.
The architecture for string ordering in 14651 is complete -- 14651 is
just in need of extension of the weights listed in the tailorable
template table, to keep up with the continual additions of characters
to 10646. The best way to accomplish that is to keep that standard
with the committee that actually does the additions of the
characters -- they know what the characters are and would best be
able to do timely coordination of updates for a related standard that
needs to add those characters to its tables.
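To illustrate why this maintenance is mostly table work, here is a minimal
sketch of multi-level string ordering. The weights below are hypothetical
(base letter, accent, case) and are not taken from the 14651 template table,
whose real syntax and number of levels differ; the point is only that
supporting a new character means adding a table entry, not changing the
comparison logic.

```python
# Hypothetical three-level weights: (base letter, accent, case).
# These values are invented for illustration, not copied from ISO 14651.
WEIGHTS = {
    "a": (1, 1, 1), "A": (1, 1, 2),
    "á": (1, 2, 1), "Á": (1, 2, 2),
    "b": (2, 1, 1), "B": (2, 1, 2),
}

def sort_key(s):
    """Build a key that compares all level-1 weights first, then level 2, then level 3."""
    levels = zip(*(WEIGHTS[ch] for ch in s))
    return tuple(tuple(level) for level in levels)

words = ["Ab", "ab", "áb", "aB", "ba"]
print(sorted(words, key=sort_key))

# Extending the repertoire is a table edit only; sort_key() is unchanged.
WEIGHTS["c"] = (3, 1, 1)
print(sorted(["ca", "ab", "ba"], key=sort_key))
```

In this sketch, case and accent differences only matter when the base
letters of two strings are identical, which is the general idea behind
level-by-level comparison.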
Furthermore, among the active participants in WG2 are the experts
on collation (with implementation experience) who actually ended
up authoring much of the content of 14651. Comparable experience is
not obviously available in the SC22 committees other than WG20.
Furthermore, because of the current close working relationship
between WG2 and the Unicode Technical Committee, WG2 is also the
best place to maintain a standard that should stay in synch with
the Unicode Collation Algorithm maintained by the UTC, to prevent
unanticipated "drift" between the two standards.
2. Locale Extensions
WG20 is developing TR 14652, "Specification Method for Cultural
Conventions". The specifications defined in 14652 are very closely
modeled on the definition of locale in ISO 9945, the POSIX standard,
and as reflected in related documentation such as XPG4 from X/Open.
In effect, it was conceived of as an extension to the locale
constructs: to add more internationalization elements, as mentioned
in TR 11017, into a formal syntactic construct that could be used
to generate machine-readable locale definitions. So it adds
definitions for LC_NAME, LC_ADDRESS, LC_IDENTIFICATION, etc. to
the older groupings LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY,
LC_NUMERIC, and LC_TIME. Furthermore, it attempts to extend the
preexisting categories with new keywords to deal with collation
as defined in 14651, with the new large character set defined in
10646, and new internationalization issues such as monetary
formats involving the euro sign.
It is pretty clear that the impetus and rationale for 14652 derive
from the POSIX side. As such, it logically belongs in SC22/WG15 for
further development, rather than in SC2. The participants in SC2,
while interested in internationalization issues related to locales,
have no particular interest or expertise in the POSIX-specific
syntax extensions covered by 14652, nor do they have any expertise
in ISO 9945 itself, which has to be closely tracked in the development
of 14652, to avoid superfluous inconsistencies. SC2 also has no
established history of working liaison relationships with SC22/WG15--
a situation which would bode ill for trying to develop what is
effectively a POSIX extension in a committee ill-suited to do so.
3. Character Properties
The most contentious issue regarding DTR 14652 is the effort to
extend LC_CTYPE to cover the repertoire of ISO 10646-1. The contending
positions effectively reflect a worldview divide among the participants
regarding character properties:
Position A: Character properties have not traditionally been covered
by character encoding standards, and have not been viewed as the
domain of the ISO committee responsible for encoding characters: SC2.
Instead, character properties are an implementation issue, traditionally
dealt with in the standards most directly concerned with character
implementation -- namely the formal language standards -- and are
dealt with in ISO by the working groups under SC22. In the context
of 14652, the appropriate place to define character properties is
LC_CTYPE, where the properties would be usable in a POSIX context as
part of locale definitions.
Position B: Character properties for the *universal* character set --
namely ISO 10646 (= Unicode) are inherent to *characters*, and should
*not* be defined in locales. The locale model and LC_CTYPE were an
attempt to provide a mechanism for dealing with properties of characters
in alternate encodings, but that model does not scale well for dealing
with properties for the universal repertoire of 10646. Furthermore,
it is inappropriate to assert that character properties are defined
in locales, and are thus subject to locale-specific variation, since
such a position would lead to inconsistent and inexplicable differences
in application behavior, depending on locale, in ways that have
no bearing on the usually understood issues of locale-specific
formatting differences, etc. Because character properties are closely
tied to the characters themselves, responsibility for defining them
should belong with the character encoding committees, rather than
with the language committees -- and thus in SC2, rather than SC22.
It is clear that among the rather large community of implementers
of 10646 (= Unicode), Position B has much more widespread support
than Position A. Position A is, however, a vocally held minority
opinion among those committed to the extension of the POSIX framework.
In point of actual fact, the *real* work on standardization of
10646 character properties is being done almost entirely
by the Unicode Technical Committee, which for years now has been
publishing machine-readable tables of character properties and
associated technical reports that are in widespread implementation
in many products. A very few character properties, most notably
"combining" and "mirroring", are also formally maintained by SC2/WG2 in
ISO 10646 itself, and those properties are tracked in parallel by
the UTC.
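A minimal sketch of what looking up properties from character tables,
rather than from a locale, can look like. It assumes Python's standard
unicodedata module, which is generated from the Unicode Character Database;
the describe() helper and the sample characters are chosen here only for
illustration. Note that no locale is consulted anywhere.

```python
import unicodedata

# Sample characters, picked only for illustration.
samples = {
    "COMBINING ACUTE ACCENT": "\u0301",
    "LEFT PARENTHESIS": "(",
    "LATIN SMALL LETTER A": "a",
}

def describe(ch):
    """Return (general category, combining class, mirrored flag) for one character."""
    return (
        unicodedata.category(ch),
        unicodedata.combining(ch),
        bool(unicodedata.mirrored(ch)),
    )

for name, ch in samples.items():
    print(name, describe(ch))
```

The same call returns the same answer whatever locale the process runs in,
which is the behavior Position B argues for.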
On balance, it would seem far preferable to conclude that within
JTC1 any responsibility for character properties should belong
to SC2, rather than SC22. Once again, this is a matter of expertise
regarding the huge number of characters in 10646. That expertise
is in SC2, and not in SC22. And the implementation experience
regarding character properties resides in the UTC, which has a
firm working relationship with SC2, but no close ties to SC22.
Regarding LC_CTYPE in particular, the maintenance or extension of
LC_CTYPE should be remanded to WG15, along with all of DTR 14652,
but with the following recommendations: Rather than attempting to
independently extend LC_CTYPE definitions to cover 10646, a mechanism
should be developed whereby POSIX implementations using LC_CTYPE
can make use of the more widespread and better researched and
reviewed character property definitions developed by the UTC, in
cooperation with SC2/WG2's development of 10646. This should be
done by *reference*, rather than by enumerating lists of characters
in SC22 standards or TR's, because of the danger of those lists
getting out of synch or introducing errors that cause interoperability
problems. Furthermore, this practice of dealing with character
properties by reference to UTC and/or SC2 developed standards
for them, should be recommended to *all* the SC22 committees, as
the generic way to deal with character properties in formal
language standards.
4. Internationalization API Standard
WG20 has a project on the books, 15435, to develop an API standard
for internationalization. To date, there has been very little
evidence proffered that there is any actual demand for such a
standard. There is no list of IT companies requesting it to solve
some interoperability problem. The big OS and tools vendors are not
requesting it. The Linux internationalization community has rejected it
in favor of other options. The Java community has no interest -- they
already have a sophisticated internationalization architecture. The Unicode
Technical Committee, which has very widespread representation from
the implementing community, has indicated zero interest in the
15435 project.
No one in WG20 but the project editor seems to be doing any active
work to develop the API standard for internationalization, and the
committee feedback to date has largely been that the quality of
the drafts is poor. Fundamental questions regarding the nature
of the API design have not been resolved. Furthermore, there has
been a lot of hand-waving over the issue of how closely tied the
proposed API is to the locale extension constructs of DTR 14652.
The API under development for 15435 is locale-centric, in that
it requires information in an "FDCC-set" defined a la DTR 14652,
assuming API behavior will depend on that information, resident
in some implementation-defined "database".
Modern internationalization libraries have largely eschewed that
kind of locale-centric design as too constrained, instead breaking up
the problem of internationalization support into more modular
designs that separate out different aspects of the problems
involved.
Furthermore, the proposed API standard aspires to platform
independent design. That, however, inappropriately conflates the
issue of designing appropriate behavior for internationalization
with the problem of designing appropriately abstracted API's
for that behavior on distinct platforms. In actual practice,
implementers are tending to make use of available libraries that
surface correct internationalization behavior (such as the
ICU classes) and then writing whatever wrappers are necessary to
abstract that behavior into their systems. The days of trying
to define complex behavior via ISO API standards, to be rolled
out by language compiler vendors in standard C libraries and such,
are being overtaken by object-oriented design and software
component models.
At this point, WG20's project 15435 should just be abandoned as
a well-intentioned but obsolete project that has no demonstrated
need or support for its development.
5. Cultural Registry Standard
WG20 is also charged with the maintenance of the cultural registry
standard, ISO 15897. That registry needs a firm review and
resolution process to ensure its correctness and market relevance.
WG20 should be able to provide the definition of such a resolution
process, along the lines provided by ISO 2375 for the character
set registry. Once the review is done, and ISO 15897 has been
appropriately updated, it should be a stabilized standard, requiring
little further work or attention.
It will then be the responsibility of the registering agency (DKUUG)
to follow the registration process and to make the cultural element
registry worthwhile.
6. Identifiers
An issue that WG20 has had to deal with fairly recently is the
list of recommended characters for identifiers, in Annex A of
TR 10176, "Guidelines for the preparation of programming
language standards". Because the list of recommended characters
for identifiers is based on the repertoire of ISO 10646, this
is another area where repeated maintenance into the future can
be foreseen, as the repertoire of 10646 continues to expand.
Once again, because of the location of character expertise regarding
all the characters added to 10646, the logical source for recommendations
about how to extend the list in Annex A in the future is SC2. This
is supported by the additional fact that determination of which
characters are and are not appropriate in identifiers implicitly
depends on specification of a constellation of properties
for those characters -- again an area in which the expertise is
located in SC2.
However, there is somewhat of a conundrum here, since the remainder
of the content of TR 10176 is clearly in the domain of SC22, and the
TR as a whole is inappropriate for maintenance in SC2. Perhaps
some kind of understanding could be arranged between the SC's
to guarantee that modifications to Annex A of TR 10176 should only be made
with timely, coequal input from SC2.
A better solution, in the long run, would be to sever the contents
of the exact table in Annex A, which has to track character repertoires
and properties that are (or should be) the responsibility of SC2,
from TR 10176 per se, and instead insert a reference there to a
standard list maintained by SC2, either in the context of 10646
itself or in some associated TR to be developed by WG2 for this
purpose. That would more appropriately divide the responsibilities
for the part of TR 10176 associated with formal language syntax
and design and the part which is attempting to track the universal
character encoding repertoire as it expands over time.
Another reason for moving in this direction is the particular interest
that the Unicode Technical Committee has in the identifier content
problem. The Unicode Standard has detailed recommendations regarding
identifiers, and the Unicode Technical Committee is currently working
on even more detailed specifications regarding identifiers and
identifier-like constructs for use in various contexts on the Worldwide
Web and the Internet. It is in JTC1's interest to keep this particular
technical issue active in a venue, namely SC2/WG2, where the character
encoding expertise is available and the working relation with the UTC
is strong. Even though on the surface it might seem that programming
identifier syntax clearly belongs to SC22, the real issue is not the
syntax per se (which is quite simple), nor the concept of an identifier
and its relation to other programming language constructs (which the
UTC and SC2 have little interest in and consider to be long ago
fixed and decided by the SC22 standards). No, the *real* issue that
remains open and problematical is how to classify and distribute all
the thousands of additional characters in 10646, and how to deal
with the complex ramifications of inclusions of various compatibility
characters which may or may not change under various kinds of
identifier normalization processes. That is where the UTC and WG2
expertise would be most helpful, and where joint development of
Unicode and ISO standards would be most likely to minimize
interoperability problems for identifiers in different programming
languages and Internet and Web protocols.
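The compatibility-character problem can be made concrete with a small
sketch. It assumes Python's standard unicodedata module and assumes, purely
for illustration, that identifiers are normalized with form NFKC; which
normalization (if any) a given language should apply is exactly the open
question. The normalize_identifier() helper is hypothetical.

```python
import unicodedata

def normalize_identifier(name):
    """Apply NFKC, which replaces compatibility characters (an assumed policy)."""
    return unicodedata.normalize("NFKC", name)

plain = "file"
ligature = "\ufb01le"     # U+FB01 LATIN SMALL LIGATURE FI followed by "le"
fullwidth = "\uff46ile"   # U+FF46 FULLWIDTH LATIN SMALL LETTER F followed by "ile"

for candidate in (plain, ligature, fullwidth):
    folded = normalize_identifier(candidate)
    print(repr(candidate), "->", repr(folded), folded == plain)

# Under NFC the ligature is left alone, so the two spellings stay distinct.
print(unicodedata.normalize("NFC", ligature) == plain)
```

Three visually similar spellings collapse to one identifier under NFKC but
remain distinct under NFC, so two languages or protocols that pick
different forms will disagree about whether the names match.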
This entire issue, is, by the way, also of intense interest to
the Database standards arena, where it is of direct relevance
to the SQL standard, for example. So the SC22 working groups are
not the only JTC1 groups with an interest in standard,
interoperable results in this area for 10646 characters.
7. Case Mapping and Case Folding
WG20 has not spent much time dealing with case mapping and case
folding issues, although those clearly have an internationalization
angle, because of local differences in case mapping preferences.
The one point where this has been dealt with by WG20 is in the
LC_CTYPE specification in DTR 14652. This is because LC_CTYPE is
the location of the information used by the tolower() and toupper()
case mapping transforms for C (and by extension, other languages).
As a result, PDTR 14652 includes tables of case pairs for all
of the 10646 characters that have case pairs.
However, the inclusion of these case mappings explicitly in the
"i18n" LC_CTYPE definition in DTR 14652 has been controversial in
the committee, in part because of a small number of unexplained
inconsistencies between those tables and the case mappings provided
by the Unicode Consortium on its website. The Unicode case mappings
are very widely implemented in many products, and are being treated
by the industry as a de facto standard. So it is problematical for
DTR 14652 to be proposing slightly different case mappings for
a standards document that contradict widespread practice.
This is once again an area where the JTC1 standards arena would be
better served by using references to de facto practice, rather than
trying to reinvent the wheel with long lists in other standards or
TR's, subject to the introduction of error or drift that can
introduce interoperability problems. Perhaps here the SC22 language
working groups could work with SC2/WG2 to find a way to get the
de facto Unicode tables to be referenceable through an SC2 TR of
some sort, to avoid the synchronization issues of trying to maintain
two (huge) lists separately.
The area of case folding is related to case mapping, but is subtly
different. WG20 has not dealt with this issue, but it is clear
that SC22 language working groups need to deal with this. In particular,
COBOL, Pascal, and other languages that have case-insensitive
identifiers, need to be able to do reliable case-folding during their
parsing/lexing phases of program text interpretation. For that, they need
reliable definitions of case-folding as applied to 10646 characters
for the domain of characters allowed inside identifiers for each
language.
While WG20 has not touched on this issue and the SC22 working groups
are starting to search for an answer, the Unicode Technical Committee
and the IETF have moved ahead, creating de facto solutions that will
see widespread implementation in the near future.
The Unicode Technical Committee has already published CaseFolding.txt, a
machine-readable file with recommendations on exactly how to do
case-folding for all Unicode 3.0 characters (i.e. 10646-1:2000 characters).
The SC22 committees should be reviewing that file, and the associated
case mapping information available in UnicodeData.txt and in
SpecialCasing.txt -- also available on the Unicode website -- before
concluding that new standardization efforts need to be initiated in
SC22 (whether in WG20 or in other working groups), to repeat the
work involved in creating those files, which are already freely available
to all implementers.
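As a rough sketch of how little code is needed to consume such a file, the
following reads a few sample lines in the semicolon-separated layout
"code; status; mapping; # name". That layout and the status letters C
(common) and F (full) are assumptions about the file format made for this
example; the helper names are hypothetical, and a real implementation
would read the complete published file rather than an inline sample.

```python
# Inline sample in the assumed "code; status; mapping; # name" layout.
SAMPLE = """\
0046; C; 0066; # LATIN CAPITAL LETTER F
0053; C; 0073; # LATIN CAPITAL LETTER S
0055; C; 0075; # LATIN CAPITAL LETTER U
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
"""

def parse_case_folding(text, statuses=("C", "F")):
    """Return {character: folded string} for lines whose status is accepted."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comment
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        code, status, mapping = fields[0], fields[1], fields[2]
        if status in statuses:
            folded = "".join(chr(int(cp, 16)) for cp in mapping.split())
            table[chr(int(code, 16))] = folded
    return table

def fold(s, table):
    """Fold a string character by character; unmapped characters pass through."""
    return "".join(table.get(ch, ch) for ch in s)

table = parse_case_folding(SAMPLE)
print(fold("FUSS", table), fold("Fu\u00df", table))
print(fold("FUSS", table) == fold("Fu\u00df", table))
```

A case-insensitive identifier comparison then reduces to comparing the
folded forms of the two names.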
The UTC and the IETF are currently working on the even thornier
problem of determining how best to define identifiers in a context
(such as internationalized domain names) where certain characters
are disallowed (such as punctuation that has other reserved uses in
URL syntax), where case folding is required, where normalization of
data is also required (disallowing of equivalent sequences that might
otherwise appear identical), and where even visual look-alikes of
otherwise different characters are to be avoided if possible because
of the confusion they can pose for user entry and the possibility
of spoofing. This is an area where intimate knowledge of all the
characters in 10646 and their interaction of properties and appearances
is required. Yet again, it would behoove the SC22 working groups
to participate in the joint UTC/IETF effort in this area through
review and feedback, rather than trying to reinvent the wheel in
a committee context where less relevant expertise would be available
to start with.
Germany, Marc Küster, September 6, 2000
Dear Alain,
> There is a need for the i18n community to keep a handle on PL activities.
>
> SC22/WG20 needs to reflect on the reshuffling of current and maintenance
> work to perhaps have a greater impact on ISO/IEC PL activities. Canada
> would be in favour of reexamining all work having this as a goal.
>
IMHO Canada is quite right on this emphasis.
That said, SC22/WG20 would benefit from an increased "customer focus". But
who are WG20's direct customers? Not necessarily enterprises or
individuals, but first and above all the other PL (+ POSIX) working groups
who need to consider i18n.
That has happened, but in my personal experience rather without WG20's
direct involvement. E. g., let's have a look at the new i18n features of
C++ with its intelligent facet mechanism. I cannot remember that these
extensions to the C++ standard library have ever been discussed in the
context of WG20's own API standard. (I'll gladly stand corrected if these
have been an issue in the past, prior to my personal involvement).
On a national level, i. e. in our national SC22 mirror committee, we have
decided to look into i18n features of the different programming languages,
just as before we have studied different OO-techniques (for, while many
PLs nowadays claim to be object oriented, the differences between the
realizations are significant).
This is a kind of work that would have to be performed before WG20 rushes
at an API standard that, as Ken rightly points out, would be best ignored
if it is made without taking into account what has been done elsewhere --
especially in Java. Even then, it is doubtful if a formal WG20 standard is
needed at all.
This kind of work can largely be performed online, making a significant
reduction in WG20's meeting schedule feasible.
> Specific work:
>
> IS 14651 Sort Standard: can be anywhere... it has to be in SC22 or in
> SC2... SC22's advantage would be to maintain the thing open to PL standards
> development more. It is sure that SC2's strong collaboration is required,
> and if it were in SC2, strong collaboration would also be required from the
> PL community.
>
Agreed. It would be best, however, to keep 14651 located within the SC22
framework. Standards are not only developed by individuals or individual
working groups. They are firmly bound to an organizational structure --
and that is SC22 and, in many countries, its national counterparts.
Best regards,
Marc
WG20 convenor, Arnold Winkler, August 30, 2000
Personal thoughts about the future of SC22/WG20 - Internationalization
for consideration by the SC22 plenary in Nara
From: Arnold F. Winkler (convenor)
Date: August 30, 2000
The following contribution to the SC22 plenary holds my very personal thoughts about the work of SC22/WG20 (Internationalization) and what I see as the best way to serve the programming language community in SC22.
I think it is time to wrap up WG20's life.
WG20's most important work will hopefully be completed this fall:
* making the world aware of I18N in TR 11017
* carrying ISO 10646 and I18N into programming languages in TR 10176
* establishing a culturally correct sorting method for ISO 10646 encoded data in IS 14651
When we started working on these projects, and when we asked for projects for a cultural specification standard and an API standard, mainly using POSIX syntax, there was no other method on the market. This is no longer true: object orientation, Java, the web, W3C, Linux, and even Microsoft's I18N have changed the playing field forever. WG20 "inherited" the CEN registration for cultural conventions as IS 15897, once again a bit late for the modern languages and implementations.
In my (and the US) opinion, WG20 should not do much more new development work. It could go away totally, when the sort standard is approved and when we have found good homes for the maintenance of the completed work.
I would not touch TR 11017, unless somebody makes a comprehensive contribution that covers the full extent of I18N technologies and requirements as presented in the marketplace today. The web, the proliferation of ISO 10646, access technologies for disabled persons in all countries - these are subjects that could, but don't NEED to be addressed in TR 11017, in case somebody has the time, resources, and interest to do a revision.
TR 10176 is fine, the amendments due to extended character repertoire (Annex A) could easily be done by SC2. That's where the experts are.
IS 14651, the sort standard, will also need amendments once it is approved, to keep up with the repertoire additions in ISO 10646. Again, it is the maintenance of the table and could/should be done by SC2.
The cultural elements stuff (specification, API, registry) is in my opinion outdated and most likely almost unnecessary. With lots of input from the US (Ken Whistler), and valuable additions from Japan (Takata), both 14652 and 15435 will get new drafts before the meeting in November in Malvern, Pennsylvania.
ISO/IEC 14652 is now a TR, and could be useful to the specific group it was defined for. However, the US is only interested in ensuring that compliance with this document is never a requirement for modern programming languages, such as Java.
Project 22.15435, the API standard, should be withdrawn. There is no interest in the user community and the project has not seen a ballot document for 3 years.
One concern is the registry ISO/IEC 15897 - DKUUG is the registration authority. I believe that no real standards work is needed, but good registration procedures need to be established. We are currently looking into the SC2 registration process for character sets - ISO 2375 is being distributed to WG20 as a template for a working process with all the ingredients: submission process (who - individuals, companies, NBs), review process (who, time), resolution of difficulties, etc... If we can get this set up correctly, the registry will be helpful, especially if it can be made available on the web.
And any additional work would be related to character properties - much better located in SC2 where we find all the experts. We had a short discussion in the last meeting and I was told that I had "no vision" for new work. I guess, this is right, but nobody else came up with anything either that fit into the WG20 scope. The UK pushed transliterations, the WAP pictograms came up, and user interfaces - none of which is within the knowledge base of WG20 and other subjects are already placed in other WGs in JTC1 or ISO or other SDOs.
There will be a meeting of the Technical Direction (CLAUI) for cultural and linguistic adaptability and user interfaces - October 19-20 in France. I will NOT be able to go there to represent WG20. This would be the best place to find competent homes for the maintenance of the WG20 completed work and agree on the registration process, at least in principle.
I would like to see WG20 :
* complete the sort standard ISO 14651
* find home(s) for the maintenance of its completed work (TR 11017, TR 10176, and sort), preferably in SC2
* agree on registration processes for the registration of cultural elements in ISO/IEC 15897 by adjusting the ISO 2375 process
* move the project TR 14652 for the specification of cultural conventions to SC22/WG15
* withdraw ISO 15435, the API standard
* and go out of business in about 1 1/2 years.
This would mean for SC22:
* Agree with this plan in principle
* Encourage WG15 to take over TR 14652
* Withdraw project 22.15435
* Ask SC2 for specific support in the complex issues of character properties as they apply to identifiers in programming languages
* Move the maintenance of IS 14651 and TR 10176 to SC2 (provided SC2 agrees, e.g. at the CLAUI meeting)
* Dissolve SC22/WG20 when all above items are completed and the registry is operational.
Best regards
Arnold
Norway, Keld Simonsen, September 6, 2000
On Wed, Sep 06, 2000 at 02:38:42PM +0200, Marc Wilhelm Küster wrote:
>
> That has happened, but in my personal experience rather without WG20's
> direct involvement. E. g., let's have a look at the new i18n features of
> C++ with its intelligent facet mechanism. I cannot remember that these
> extensions to the C++ standard library have ever been discussed in the
> context of WG20's own API standard. (I'll gladly stand corrected if these
> have been an issue in the past, prior to my personal involvement).
We did in WG20 decide (but later reverted) that we wanted a C++ binding,
but we did not explicitly discuss the facet mechanism of C++ for
this. I have gone 2 times to WG21 to discuss the i18n API with them
and one German representative (Dietmar?) promised to help, but later
declined due to lack of time. We have later decided just to do
a C version of the API.
Keld
Canada, Dave Blackwood, September 6, 2000
The fact that POSIX has dealt with internationalization issues at all is a
tribute to those involved. The problems that we have encountered however
are not from a lack of caring but a lack of expertise. While many
internationalization experts may be willing to devote time to WG20 and/or
SC2, relatively few of them are willing to attend WG15 meetings (and more
importantly IEEE PASC and Austin Group meetings where the real technical
development is done) to explain the issues and help develop the solutions.
It is insufficient to simply have a liaison between working groups whose
primary role is to report what one group is doing that may be of interest to
the other. We need real, ongoing and substantive involvement.
We have also seen many requests for the operating system to fix what are
essentially application problems. The subtle differences between dictionary
and telephone book sorting across cultures are beyond the functionality that
can be expected from an OS, as are the storage, format, and presentation of
dates very, very far into the past or very, very far into the future as may
be required for astronomical calculations, etc.
WG20 could be more effective if it worked to incorporate i18n solutions into
existing PL and OS standards and concentrated less on developing stand-alone
i18n standards that are based on invention rather than existing practice and
consequently are rarely implemented fully by PL and OS vendors. A C
compiler conforms to the C standard; a POSIX OS conforms to the POSIX
standard. What conforms to an i18n standard?
Dave
--
D. J. Blackwood, Chair
Canadian POSIX Working Group
USA, Asmus Freytag, September 6, 2000
At 12:18 PM 9/5/00 -0400, Alain LaBonté wrote:
>Outcome of our Québec 2000-09-05 CAC/SC22 meeting on this issue
>
>Generally speaking: I18N is a fundamental requirement on programming
>languages (PL) and PLs don't take care enough about it (currently APL,
>COBOL, C, POSIX, ADA and FORTRAN communities have dealt with such issues
>to a certain point, and most others have not at all; those who did
>something did not completely do what needs to be done),
This statement excludes the forward looking work of languages such as C++
and even more so Java.
A more important issue is that while I18n is indeed a fundamental
requirement for PL it is a fundamental requirement for all aspects of IT -
languages, operating systems, applications, data formats, query languages,
markup languages, ....
Burying this work in SC22 has the effect of isolating it from all those
fields of application that are not SC22 developed programming language
standards.
>If SC2 takes the lead of most of SC22/WG20's program of work, programming
>language i18n will be neglected even more than today.
I'm not sure that I agree. I see a lot of the impetus for strong support
for internationalization go hand in hand with adoption of support for
10646/Unicode. Since the majority of new work on i18n is built upon the use
of 10646/Unicode, it would be natural for those doing the work to look to a
single SC2.
From a JTC1 perspective, the question of where certain work is being done
must address the needs of all of JTC1 and its liaison organizations (such as
IETF and W3C) and not just the needs of a particular SC to motivate its
working groups to get internationalization support added to their
programming language standards.
>On the other hand we know that most, if not all, SC22/WG20 experts are
>already working too in SC2, and SC22/WG20 and SC2 work is already
>integrated in some countries, including Canada, due to the small community
>of experts. That would continue anyway.
The point is that a move to WG2 with its larger community of experts would
in all likelihood be quite positive from an organizational point of view
and would help to elevate the visibility of the I18n efforts.
>Specific work:
>
>IS 14651 Sort Standard: can be anywhere... it has to be in SC22 or in
>SC2... SC22's advantage would be to maintain the thing open to PL
>standards development more. It is sure that SC2's strong collaboration is
>required, and if it were in SC2, strong collaboration would also be
>required from the PL community.
>
>TR 14652 Specification Method for Cultural Conventions: It is indeed POSIX
>oriented but the POSIX WG always said it belonged to WG20... The POSIX WG
>now has a lot of challenges and it would not be timely to transfer that
>project there (to WG15). Perhaps we need to de-emphasize the perception
>that this has more to do with POSIX than with PLs, a perception which
>should be wrong, otherwise the TR has at least partly failed.
>Controversial issues should be removed and the TR should be enhanced in
>WG20, with strong collaboration with SC2.
>
>ISO/IEC 15435 API standard project: we believe in Canada that an I18N API
>standard is required, and that otherwise kitchen-made solutions will
>rather tie customers to some developers, which is the opposite of the goal of
>international standards.
My sense is that the nature of APIs itself is still under strong debate and
hasn't settled into a consensus where one could do an i18n API set without
inadvertently taking sides in the larger debates of object-oriented vs.
procedural and whether C++ style or Java style etc. When the first POSIX
standard was written, the world was a simpler place and such an effort made
a lot of sense. Nowadays it's all more difficult.
>TR 10176 Programming languages standards guidelines: its annex on
>identifier-related characters is in our opinion linked to SC2's interests,
>while all the rest belongs to SC22. Again a strong collaboration between
>SC22 and SC2 is required. The place for maintaining this appears to be in
>SC22/WG20.
The problem of identifier guidelines is so firmly linked with character
issues, that a way needs to be found to separate that part and move it into
SC2. Identifiers are not only needed in programming languages, but in many
other types of languages and internet related services (domain names).
Since the issues connect with the character set standard on which they are
based, SC2 is the right place.
>Other issues: a lot of I18N issues belong to the user interface domain.
>This is dealt with in SC35. We should remember that the whole domain of
>I18N is a horizontal issue in JTC1
This is indeed the case. While I am firmly in support of moving
character-related maintenance and standards into SC2, there are many i18n areas that
should be placed in other places. In all cases, though, if programming
languages are not central to the issue, the work should probably be taken
out of SC22.
A./
For more reactions please see the links at the top of this document.
Arnold
September 11, 2000