Submission from the UK of an initial Working Draft for the proposed DSDL
standard, which identifies user requirements for the standard in Annex 1.
This document is submitted as originally supplied. Although the User
Requirements are contained in an annex that is marked as normative, the UK
does not consider that these requirements, which are instructions to the
Project Editor, should remain as normative requirements on the users of the
published standard. SC 34 may wish to consider whether these requirements
should instead be contained in a separate User Requirements document that
could form definitive instructions to the editor.
TITLE: U.K. National Body Contribution to First Working Draft of Document Schema Definition Language (DSDL)
SOURCE: G. Williams, U.K.
PROJECT:
PROJECT EDITOR: M. Bryan
STATUS: First Working Draft
ACTION: This document was included in the NWI comments, but the U.K. intended to have it distributed separately in its entirety to serve as a base document for further development.
DATE:
DISTRIBUTION: SC34 and Liaisons
REFER TO:
REPLY TO: Dr. James David Mason
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75% of the national bodies casting a vote.
International Standard ISO/IEC 13240 was prepared by Joint Technical Committee JTC1, Information technology.
SGML Document Type Definitions (DTDs) allow document structures to be formally modelled, but they do not allow details of data types or data relationships to be recorded in an XML-compatible way. The W3C XML Schema Definition language (XSD) does allow data types to be used to validate the contents of SGML elements and the values of attributes, but it does not allow the relationships between the values of different attributes and the contents of elements to be validated. A new, compact, efficient and XML-based document type definition for the integrated description of document structures, data types and data relationships will make it possible to automate the processing of structured information resources to the level required by business users, whose requirements are more demanding than those of the publishing community for which SGML was originally developed. The standard will also define the scope and notation for converting and interworking a core subset of document structure, data type and data relationship constraint models among the three notations: DSDL, DTD declarations and XSD.
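To make the notion of a data relationship constraint concrete, the following is a minimal Python sketch, not DSDL syntax, of a check that relates the values of two attributes on the same element. The `period` element and its `start` and `end` attributes are hypothetical names invented for this illustration. The rule that `end` must not precede `start` is the kind of cross-value relationship that the paragraph above says cannot be validated with a DTD or XSD.

```python
import xml.etree.ElementTree as ET
from datetime import date


def check_period(xml_text: str) -> list[str]:
    """Return one message per <period> whose end date precedes its start date."""
    errors = []
    root = ET.fromstring(xml_text)
    # iter() visits the root element itself as well as its descendants.
    for period in root.iter("period"):
        # Hypothetical attributes; both are assumed to hold ISO 8601 dates.
        start = date.fromisoformat(period.get("start"))
        end = date.fromisoformat(period.get("end"))
        if end < start:
            errors.append(f"period {start}..{end}: end precedes start")
    return errors


doc = """<schedule>
  <period start="2001-03-01" end="2001-03-31"/>
  <period start="2001-05-10" end="2001-05-01"/>
</schedule>"""

print(check_period(doc))
```

Each attribute value on its own is a well-formed date, so a data type check alone would accept both `period` elements; only the comparison between the two values reveals the second one as invalid.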
This International Standard, known as the Document Schema Definition Language (DSDL), allows the definition of document structures, data types and data relationship constraints that can be applied to data represented using the ISO/IEC 8879 Standard Generalized Markup Language and its derivatives, such as ISO/IEC 10744, Hypermedia/Time-based Structuring Language (HyTime), and the W3C Extensible Markup Language (XML).
To be defined
ISO 8879:1986, Information processing -- Text and office systems -- Standard Generalized Markup Language (SGML)
W3C Extensible Markup Language (XML) (http://www.w3.org/TR/REC-xml)
W3C XML Schema Part 2: Datatypes (http://www.w3.org/TR/xmlschema-2/)
DSDL
Document Schema Definition Language
SGML
Standard Generalized Markup Language (ISO/IEC 8879)
XML
W3C Extensible Markup Language
Any references in this document to industry and proprietary standards, products, user groups, and publications are not normative, and do not imply endorsement by ISO, IEC, or their national member bodies or affiliates. Any brand names or trademarks mentioned are the property of their respective owners.
The formal definitions are expressed using the W3C XML subset of SGML.
The formal definitions are part of the text of this International Standard and are protected by copyright. In order to facilitate conformance to DSDL, the formal definitions may be copied as specified in the following copyright notice: Copyright (C) 200? International Organization for Standardization. Permission to copy in any form is granted for use with conforming DSDL systems and applications as defined in ISO/IEC ????, provided this notice is included in all copies. The permission to copy does not apply to any other material in this International Standard.
Note 5. This document uses editorial conventions mandated by the ISO with which the reader should be familiar in order to understand the implications of certain words.
The text describing each construct emphasizes semantics, while the formal XML definition provides the rigorous syntactic definitions underlying the text descriptions.
Note 6. For this reason, it is recommended that the reader refer to the XML definitions while reading the textual descriptions. Although the XML definition always follows the related text, the user may find it helpful to read the XML first in some cases.
When a construct is first introduced, it is described in the text. If the construct occurs in the formal XML specification, both the formal XML name and a full name in English are presented, as follows:
This standard is designed to provide the following functionality:
The following DSDL components can be used to describe
documents conforming to the WebSGML subset of ISO/IEC 8879:
Possible DSDL element/attribute | Defined in clause | Equivalent ISO 8879 Construct | Equivalent XML DTD construct | Equivalent XML Schema element
<attribute | | [143] attribute definition | AttlistDecl | <attribute
<attribute | | [144] attribute name | Name | <attribute
<attribute | | [145] declared value | AttType | <attribute
<attribute | | [147] default value | DefaultDecl | <attribute
<attribute | | [147] default value ["FIXED"] | DefaultDecl | <attribute
<attribute | | [147] default value ["IMPLIED"|"REQUIRED"] | DefaultDecl | <attribute
<characterSet | | [173] character set description | EncodingDecl | encoding
<comment | Should this be <annotation? | [91] comment declaration | Comment | N/A
<data | | From Relax-NG | N/A | <simpleType
<element | | [116] element declaration | elementdecl | <element
<element | | [30] generic identifier | Name | <element
<element | Do we need this? Does it need to conflate with type? | [125] declared content | contentspec | <any
<element | | From Relax-NG | N/A | <element
<element | | Extension based on W3C XML Schema that generalizes the specifically named options provided in Relax-NG | N/A | <element
<element | | From W3C XML Schema (Relax-NG uses a separate ref element) | N/A | <element
<externalEntity | | [108] external entity specification | GEDecl | N/A
<externalEntity | | [102] entity name | Name | N/A
<externalEntity | | [73] external identifier | ExternalID | N/A
<externalEntity | | [41] notation name | NDataDecl | N/A
<group | | [127] model group (with modifications based on W3C XML Schema that generalize the specifically named options provided in Relax-NG) | children (as modified by W3C XML Schema) | <complexType
<inclusion name | | [104] parameter entity name | PEDecl | N/A
<inclusion | Do we still need to separate out the definition of external parameter entities from their call, or should we move these two properties to the <include element? | | PEDef | N/A (moved to the import request)
<include | | [60] parameter entity reference | PEReference | <import (but unnamed, with direct reference to the source, see above)
<localEntity | | [101] entity declaration | GEDecl | N/A
<localProcess | Do we need this? | [44] processing instruction | PI | N/A
<markedSection | | [93] marked section declaration | CDSect | N/A
<notation | | [148] notation declaration | NotationDecl | <notation
<notation | | [41] notation name | Name | <notation
<notation | | [149] notation identifier | ExternalID | <notation
<permittedValue | | Based on W3C XML Schema enumeration and Relax-NG value elements. Extends [145] declared value [name token group] to constrain contents of text fields as well as attribute values | Enumeration (as extended to element content by W3C XML Schema and Relax-NG) | <enumeration
<schema | Do we need a public identifier? | [110] document type declaration [external identifier] | doctypedecl External ID | <schema + <import or <include
<schema | | [111] document type name | doctypedecl Name | N/A
<text | | [47] character data | #PCDATA |
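The `<permittedValue` row above extends the enumeration idea from attribute values to element content. As a rough illustration only, the Python sketch below checks both an attribute value and an element's text against a set of allowed values. The element name `status`, the attribute name `priority` and both value sets are hypothetical; this draft does not yet define how DSDL would express them.

```python
import xml.etree.ElementTree as ET

# Hypothetical permitted values. A DTD name token group could declare only
# the attribute set; the element content set is the proposed extension.
PERMITTED_PRIORITY = {"low", "normal", "high"}          # attribute values
PERMITTED_STATUS = {"draft", "approved", "withdrawn"}   # element content


def check_permitted(xml_text: str) -> list[str]:
    """Report attribute values and element text outside the permitted sets."""
    errors = []
    for status in ET.fromstring(xml_text).iter("status"):
        priority = status.get("priority")
        if priority is not None and priority not in PERMITTED_PRIORITY:
            errors.append(f"priority={priority!r} is not permitted")
        # Element text may be None for an empty element.
        text = (status.text or "").strip()
        if text not in PERMITTED_STATUS:
            errors.append(f"content {text!r} is not permitted")
    return errors


doc = """<report>
  <status priority="high">approved</status>
  <status priority="urgent">pending</status>
</report>"""

for message in check_permitted(doc):
    print(message)
```

The same membership test is applied to the attribute and to the text content, which is the point of the extension: one enumeration mechanism covers both places where a value can appear.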
The following extensions could be made if it is decided that
DSDL should be able to express all constructs in SGML document instances as
well as the WebSGML subset.
Possible DSDL element/attribute | Defined in clause | Equivalent ISO 8879 Construct
<applicationInfo | Do we need this? | [199] application-specific information
<attribute source | | [147] default value ["IMPLIED"|"REQUIRED"|
<capacitySet publicIdentifier | Do we need this? | [180] capacity set
<characterDescription | Do we need this? | [176] character description
<characterDescription startingFrom | Do we need this? | [177] described character set number
<characterDescription for | Do we need this? | [179] number of characters
<characterDescription becomes | Do we need this? | [178] base character set number, "UNUSED" or literal
<externalEntity | Do we need this? | [109] entity type
<externalEntity | Do we need this? Should the data attributes be defined as the contents of the entity definition? | [149.2] data attribute specification
<dataTagGroup elementName | Do we need this? Could the data tag details somehow be added directly to the element declaration? | [133] data tag group
<dataTagGroup paddingTemplate | | [137] data tag padding template
<dataTagTemplate | | [136] data tag template
<delimiterAssignment name literal | Do we need this? | [191] general delimiters
<delimiters | Do we need this? | [190] delimiter set
<element documentTypes | | [28] document type specification
<element end-character | Do we need this? | [18] NET-enabling start-tag
<element mixed | Do we need this? | [25] mixed content
<element omitStart | | [123] start-tag minimization
<element omitEnd | | [124] end-tag minimization
<element rankStem | Do we need this? | [120] rank stem
<element rankSuffix | Do we need this? | [121] rank suffix
<element unclosed | Do we need this? | [17] unclosed start-tag
<exclusions elementNames | | [140] exclusions
<explicitLink sourceDocType resultDocType | | [158] explicit link specification
<features | Do we need this? | [195] feature use
<functionChars | Do we need this? | [186] function character identification
<idLinkSet | | [168.1] ID link set declaration
<implicitlink sourceDocType | | [157] implicit link source
<inclusions elementNames | | [139] inclusions
<linkRule sourceElementNames | | [163.1] link rule [source element specification]
<linkRule resultElementNames | | [166.1] explicit link rule [result element specification]
<linkSet name | | [164] link set name
<linktype | | [154] link type declaration
<linktype name | | [155] link type name
<linktype href publicIdentifier | | [73] external identifier
<markedSection status | | [93] marked section declaration
<namingRules | Do we need this? | [189] naming rules
<quantities | Do we need this? | [194] quantity set
<reservedName changeFrom changeTo | Do we need this? | [193] reserved name use
<schema sgmlDeclaration | Do we need this? | [171] SGML declaration
<sgmlDeclaration name | Do we need this? | [171] SGML declaration
<shortRefDelimiters | Do we need this? | [192] short reference delimiters
<shortRefSet name | | [150] short reference mapping declaration
<shunnedChars useControls | Do we need this? | [184] shunned character number
<simpleLink | | [156] simple link specification
<syntax publicIdentifier | Do we need this? | [183] public concrete syntax
<useLink linkSetName postLinkSetName | | [165] source element specification [USELINK]
<useMap name elementNames | | [152] short reference use declaration