Title: Summary of Voting on JTC 1/SC 34 N 320 Rev – FCD Ballot to ISO/IEC 19757-2 – Document Schema Definition Languages (DSDL) – Part 2: Grammar-based validation – RELAX NG
Source:
Project: ISO/IEC 19757 – Document Schema Definition Languages (DSDL)
Project editors: J. Clark
Status:
Action: Project Editors are requested to review comments and take them into consideration when preparing revised text.
Date: 26 November 2002
Summary: Based on the result of voting, this document has been APPROVED.
Distribution: SC34 and Liaisons
Refer to:
Supersedes:
Reply to: Dr. James David Mason

SC 34 Voting Summary on JTC 1/SC 34 N 320 Rev
FCD to ISO/IEC 19757-2 – Document Schema Definition Languages (DSDL) – Part 2: Grammar-based validation – RELAX NG

P-Member | APPROVAL OF THE DRAFT AS PRESENTED | APPROVAL OF THE DRAFT WITH COMMENTS AS GIVEN ON THE ATTACHED | DISAPPROVAL OF THE DRAFT FOR REASONS ON THE ATTACHED | Acceptance of these reasons and appropriate changes in the text will change our vote to approval | ABSTENTION (For Reasons Below)
Brazil | | | | |
Canada | X | | | |
China | | | | |
Denmark | | | | |
France | | | | |
Ireland | | | | |
Italy | | | | |
Japan | | | | |
Republic of Korea | | X | | |
Netherlands | X | | | |
Norway | X | | | |
United Kingdom | | | X | X |
United States | | X | | |
TOTAL | 3 | 2 | 1 | 1 |

Japan
An <attribute> with an infinite name class that does not have <text> as its content should be prohibited.
Relax the constraints on <interleave> without causing non-determinism.
Disallow <interleave> occurring in <oneOrMore>.
United Kingdom
UK Vote: DISAPPROVAL OF THE DRAFT FOR REASONS APPENDED BELOW. Acceptance of these reasons and appropriate changes in the text will change our vote to approval.
General
The UK is concerned about the potential confusion between the short names assigned to the existing ISO TR 22250-1 (RELAX Core) and the proposed Part 2 for ISO 19757 (RELAX NG).
The UK notes that the formal standard does not fully conform to the technical report's recommendations, so that RELAX NG cannot be said to be an extension of RELAX Core. If the TR is not withdrawn, the short form of one of the two documents should be changed to avoid the expectation that the standard is dependent on a TR. (It is noted that there is no connection between the short form of the name and the full title of the document in either case. Perhaps there should be, and RELAX NG should be changed to something like DSDL - GBV.)
Clause 3
1) Terms should be ordered alphabetically to simplify the finding of relevant definitions.
2) Add definitions for the following terms used without definition in the text:
- match
- weakly match
- content type
- in-scope grammar
- mixed sequence
- union
Clause 3.1
The definition of resource is ambiguous: the word “potentially” does not clarify whether a resource must or must not be addressable by a URI. If the resource is something that has identity but cannot be addressed using a URI (e.g. “Acts Ch5 V2”) how can it be used by 19757-2?
Clause 3.4
The term “another URI” is misleading given that another URI could still be a relative URI. Change to “a complete URI”.
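The distinction behind this comment can be sketched in a few lines. This is only an illustration, assuming that "complete" means a URI that carries its own scheme; the base URI and the relative reference are hypothetical and do not come from the draft.

```python
from urllib.parse import urljoin, urlsplit


def is_complete(uri: str) -> bool:
    """Assumption: a URI is 'complete' when it carries its own scheme."""
    return bool(urlsplit(uri).scheme)


# Hypothetical values, chosen only for illustration.
base = "http://example.org/schemas/main.rng"
relative_ref = "common/inline.rng"

# Resolving the relative reference against the base yields a complete URI.
resolved = urljoin(base, relative_ref)

print(is_complete(relative_ref))  # the reference alone has no scheme
print(is_complete(resolved))      # the resolved form does
print(resolved)
```

The relative reference is itself "another URI" in the loose sense, which is why the wording "a complete URI" removes the ambiguity.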
Clause 3.10
Change “an NCName” to “a local name”. (The definition of local name introduces the need for it to be an NCName: the important point for the name definition is that the second part is the local name, not that it conforms to the NCName rule.)
Clause 3.14
It is not clear what form a “specification” takes. Either explain the format or remove the words “specification of”.
Clause 3.22
The term equivalence relation is not explained in the text, or in the definitions. There is no clear explanation as to why a datatype should consist of a “set of strings”: for example, rules for validating dates are certainly not defined in terms of sets of strings, neither are rules for defining ranges of numeric values, etc., which are typically used to constrain datatypes.
Clause 4.2.1
The word “thus” should not be used in ISO standards (see definition of m). The definition of m mentions the possibility of consecutive strings, which is not permitted in XML or the rules in 19757-2. The definition might be better stated as:
- m ranges over sequences of elements and strings; a sequence with a single member is considered the same as that member; there are sequences ranged over by m that cannot occur as the children of an element because the sequences ranged over by m may contain consecutive strings and may contain strings that are empty.
Clause 4.2.2
The definition of the subscript c in p :c ct is not explained, and neither is the term content-type (see comment on clause 3).
Clause 4.2.3
For the third listed entry, something needs to be said about the de-duplication implied by the union process.
Clause 6
The URL for identifying the specification in ISO 19757-2 should start with a reference to ISO, and not to an outside organization. I would suggest it takes the form:
http://www.iso.ch/jtc1/sc34/ISO19757/Part2/1.0
The text of the second paragraph will make it impossible to use any extensions or updates to IETF RFC 2396 to identify resources. Given that we know of at least one planned extension, to allow the full Unicode code set to be used for parts of URIs that will form an Internationalized Resource Identifier (IRI), it would seem wise to add the phrase “or any IETF approved standard that replaces or extends this specification” after the bracketed reference to RFC 2732.
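To make the concern concrete: an identifier containing non-ASCII characters has to be escaped before it fits the character set that RFC 2396 permits. The sketch below assumes the common mapping of percent-encoding the UTF-8 bytes; the path is hypothetical, and the mapping that an eventual IRI specification mandates may differ in detail.

```python
from urllib.parse import quote, unquote

# Hypothetical identifier path containing a non-ASCII character.
iri_path = "/schémas/livre.rng"

# Assumed mapping: percent-encode the UTF-8 bytes, leaving "/" untouched.
uri_path = quote(iri_path)

print(uri_path)                       # ASCII-only form
print(unquote(uri_path) == iri_path)  # the mapping is reversible
```

A clause that names only RFC 2396 admits the escaped form but not the readable one, which is the restriction the suggested wording is meant to lift.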
The EBNF constructs defined for the full grammar in this clause are not formally defined within the standard. Their use is explained and constrained by rules defined in the RELAX NG Tutorial, which is not a normatively referenced part of this standard (or of the OASIS RELAX NG specification). A brief explanation of the purpose of each of the elements permitted in the full model, and a full explanation of any rules that constrain their application, should be included in this clause.
Clause 7.2
This rule would seem to require that any xml:lang attribute associated with an element would need to be removed as part of the simplification process. Why should this attribute, or any other attribute that needs to be added for the purpose of managing schema objects within specific applications, have to be removed?
Clause 7.5
The phrase “a type attribute is added with value token” can be misread. Suggest changing it to “a type attribute whose content is the token ‘value’.” (Alternatively, use fonts that distinguish type and value as the code to be entered, rather than having them in the same face as the adjacent text.)
Clause 7.8
The 2nd and 3rd, and the 4th and 5th, sentences of the third paragraph are self-contradictory. The instruction in the second part of the 2nd sentence states that “the grammar element shall have a start component” while the second part of the 3rd sentence reads “all start elements are removed from the grammar element”. A similar conflict occurs when the 5th sentence states that “all define components with the same name are removed from the grammar element”, which requires that there be no definition left for the name. Adding the word “other” after the occurrence of “all” in the 3rd sentence is likely to correct the first error, presuming that the purpose is to select the first start element as the valid one and discard all subsequent ones. However, it is unclear that this is the case for the define element covered by the 5th sentence, as the rules in 7.18 for combining define elements with the same name would seem not to be applicable if the rules in 7.8 have been applied.
Clause 7.9 and 7.10
In 7.10 the value of the ns attribute of a name is inherited from the nearest ancestor with an ns attribute (as you would expect in XML) but for some reason 7.9 forces the ns attribute to be empty for the name of an attribute definition. XML requires that attributes that are not assigned a namespace are assigned the namespace of their parent element. How do the rules in 7.9 ensure this?
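The interaction of the two rules, as this comment describes them, can be traced with a small sketch. It is an interpretation of the comment's wording, not a rendering of the draft's normative text; the schema fragment, the namespace URI http://example.org/doc and the helper names are all hypothetical.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"


def q(local: str) -> str:
    """Qualified tag name in ElementTree's {namespace}local notation."""
    return f"{{{RNG}}}{local}"


def expand_name_attributes(root: ET.Element) -> None:
    """7.9 as described above: turn name="..." into a <name> child and
    force ns="" for the name of an attribute that has no ns of its own."""
    for el in list(root.iter()):  # snapshot, since the tree is modified
        if el.tag in (q("element"), q("attribute")) and "name" in el.attrib:
            name_el = ET.Element(q("name"))
            name_el.text = el.attrib.pop("name")
            if el.tag == q("attribute") and "ns" not in el.attrib:
                name_el.set("ns", "")
            el.insert(0, name_el)


def inherit_ns(el: ET.Element, inherited: str = "") -> None:
    """7.10 as described above: a <name> without ns takes the ns value of
    its nearest ancestor that has one, or the empty string."""
    if el.tag == q("name") and "ns" not in el.attrib:
        el.set("ns", inherited)
    current = el.attrib.get("ns", inherited)
    for child in el:
        inherit_ns(child, current)


# Hypothetical schema fragment, not taken from the draft.
SCHEMA = """\
<element xmlns="http://relaxng.org/ns/structure/1.0"
         name="para" ns="http://example.org/doc">
  <attribute name="id"><text/></attribute>
  <text/>
</element>
"""

root = ET.fromstring(SCHEMA)
expand_name_attributes(root)
inherit_ns(root)
result = {n.text: n.get("ns") for n in root.iter(q("name"))}
print(result)
```

Under this reading the element name para ends up with the inherited namespace, while the attribute name id keeps the empty ns assigned in the first step. That difference is the outcome the comment asks the editors to justify.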
Clause 7.13
In the 2nd paragraph remove the word “Similarly” and the following comma, and start a new paragraph at this point. (The need to use a different font for codes mentioned above is highlighted by the problems with the phrase “An element element”!)
Clause 7.19
It is unclear what the term “the in-scope grammar of the in-scope grammar” in the 2nd paragraph means. (Adding a decent definition of the term in-scope grammar to section 3 may help here, but the real problem is the explanation of how in-scope grammars nest, which is not explained anywhere in this specification.)
Clause 8
The definition of the simplified grammar element given here would allow the following to be a legal definition:
<grammar><start><notAllowed/></start></grammar>
There does not seem to be a constraint to prevent this option within 10.2.6, or an explanation as to when such a production might be valid.
NB: The production also allows <grammar><start><empty/></start></grammar>, but 10.2.6 prohibits the use of this.
Clause 9.1
The purpose of the “subscript” mentioned in the second paragraph is not defined.
Clause 9.2 and 9.3
The meaning of all productions in these sections should be explained textually, as occurs for the first two examples, but not for the rest. Specifically, the following should be explained:
1) The relationship between (name choice 1) and (name choice 2) in 9.2
2) The relationship between (choice 1) and (choice 2) in 9.3.1
3) The role of the optional [cx2] attribute in the definition of (value) in 9.3.8, and why this is always a valid attribute.
4) How (token equal) can determine the equality of unsorted token lists such as “a b c d” and “b d c a”
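Item 4 turns on which notion of equality is intended, and a short sketch shows how the two candidate readings differ. Both functions are hypothetical illustrations rather than the draft's definition: the first assumes equality after whitespace normalization with token order preserved, the second assumes order is ignored.

```python
def normalize_whitespace(s: str) -> str:
    """Collapse runs of whitespace and trim both ends.
    Note: str.split() also splits on non-XML whitespace characters."""
    return " ".join(s.split())


def token_equal_ordered(s1: str, s2: str) -> bool:
    """Reading 1 (assumed): equal after whitespace normalization."""
    return normalize_whitespace(s1) == normalize_whitespace(s2)


def token_equal_unordered(s1: str, s2: str) -> bool:
    """Reading 2 (assumed): equal as unordered collections of tokens."""
    return sorted(s1.split()) == sorted(s2.split())


print(token_equal_ordered("a b c d", "  a  b c\td "))  # spacing differs only
print(token_equal_ordered("a b c d", "b d c a"))       # order differs
print(token_equal_unordered("a b c d", "b d c a"))     # order ignored
```

The strings from the comment compare as different under the first reading and as equal under the second, so a textual explanation in the clause would need to say which one applies.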
Clause 10.2.1
Referring to both x/p as a path and to p on its own as a path is confusing. x/p is a path but p is really only a descendant which may or may not be a direct child of the parent (x). It might be better to use p for parent, c for child and d for descendants and refer to the paths p/c and p//d rather than x/p and x//p.
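The child versus descendant distinction behind the two path forms can be shown with a small sketch. The pattern fragment is hypothetical and written without a namespace for brevity, and ElementTree's path syntax is used here only as a stand-in for the draft's own notation.

```python
import xml.etree.ElementTree as ET

# Hypothetical pattern fragment: the ref is nested inside a choice,
# so it is a descendant of attribute but not a direct child.
attribute = ET.fromstring(
    "<attribute><name>a</name><choice><text/><ref name='foo'/></choice></attribute>"
)

# Direct children only, comparable to the single-slash form.
direct_refs = attribute.findall("ref")

# Any depth below the starting element, comparable to the double-slash form.
descendant_refs = attribute.findall(".//ref")

print(len(direct_refs))
print([r.get("name") for r in descendant_refs])
```

The same ref is missed by the child step and found by the descendant step, which is the difference the proposed p/c and p//d naming is meant to make visible.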
foo and foobar are not paths but names, one of which is undefined. While there is a case for leaving foo in the list, foobar should definitely be removed from the penultimate paragraph.
Clause 10.3
It is not clear whether the restrictions shown apply to the full syntax as well as the simplified syntax (partly because there are no constraints specified in the standard for the full syntax). If these restrictions only apply to the simple syntax this should be clearly stated.
Textual explanations of the meanings of each of the constraints should be provided for those without a mathematical background.
Annex A
It should be made clear whether the schema is defined using the full or the simplified syntax.
Doesn’t the reference to any within the definition of any (the last definition) conflict with the rules in 7.20 that restrict looping references?
Annex B
The example is unrealistic, and fails to show key features of the language. At least one of the elements should have an attribute other than a namespace attribute for which specific values have been defined as valid. At least one of the elements should have nested elements, and at least one of the nested elements should be allowed to contain data that is managed using a datatype.
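As an illustration of the kind of example this comment asks for, the sketch below embeds a small schema with an enumerated attribute, nested elements and one datatype-constrained element. The schema is hypothetical and is not proposed text for the annex; the surrounding Python only parses it and lists those three features, and it does not validate the schema against the RELAX NG rules.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"

# Hypothetical schema, not taken from the draft.
SCHEMA = """\
<element name="book" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <attribute name="status">
    <choice>
      <value>draft</value>
      <value>final</value>
    </choice>
  </attribute>
  <element name="title">
    <text/>
  </element>
  <element name="published">
    <data type="date"/>
  </element>
</element>
"""

root = ET.fromstring(SCHEMA)

# Attribute with specific permitted values.
allowed_status = [v.text for v in root.iter(f"{{{RNG}}}value")]
# Elements nested directly inside the outer element.
nested_elements = [e.get("name") for e in root.findall(f"{{{RNG}}}element")]
# Content managed by a datatype.
datatypes = [d.get("type") for d in root.iter(f"{{{RNG}}}data")]

print(allowed_status, nested_elements, datatypes)
```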
Bibliography
The URLs assigned to the two referenced documents need to be switched to refer to the correct documents.
United States
The U.S. is concerned about potential confusion between 19757-2, Part 2 (RELAX NG) and TR 22250-1 (RELAX Core). We request that the name of 19757-2, Part 2, be changed to something different (and more appropriate to an International Standard), such as, for example, "International Standard Schema Language (ISSL)".