TITLE: ISO/IEC TR 9573-11 2nd edition
SOURCE: Project Editor
PROJECT: 15.07.02.11.00
EDITOR: Yushi Komachi and Samarin Alexander
STATUS: Text for DTR processing
ACTION: DTR processing
DATE: 2002-05-22
DISTRIBUTION: SC34 and Liaisons
REFER TO:
REPLY TO:
DTR text (2002-05-22) to
The 1st edition of ISO/IEC TR 9573-11:1992 was published on 1992-09-15. To reflect actual practice in the ISO system for standards development, the TR was later modified to become the "ITSIG exchange DTD" through the activities of the SGML group of ISO/ITSIG (Information Technology Strategies Implementation Group).
Responding to user requirements for interchanging standards documents in an XML environment, ITSIG instructed its XML project to develop an XML DTD based on the SGML DTD.
This 2nd edition of ISO/IEC PDTR 9573-11 includes both the SGML and XML DTDs. All clauses except clauses 2 and 13 and Annexes C, D and E are almost identical to the corresponding clauses of "ITSIG exchange DTD, version 0.93, 1998-03-03".
This document describes the harmonized set of ISO DTDs which are designed for use within the ISO system for standards development. The base DTD of this set was created by ITSCG AWG1 during 1994 and 1995. At present, the SGML group of ITSIG is responsible for maintenance of these DTDs.
The following standards contain provisions that, through reference in this text, constitute provisions of this Technical Report. At the time of publication, the editions indicated are valid. All standards are subject to revision, and parties to agreements based on this Technical Report are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below. Members of IEC and ISO maintain registers of currently valid International Standards.
ISO 8879:1986, Information processing — Text and office systems — Standard Generalized Markup Language (SGML)
ISO/IEC 10744:1997, Information technology — Hypermedia/Time-based Structuring Language (HyTime)
The lifecycle of standards comprises three phases: development, publication, and reuse & dissemination. Each phase has different requirements for handling the standards as electronic documents.
The development phase requires simplicity of tools for capturing the content. The experts outside the ISO/CS may use a wide variety of very different tools. The presentation of the standards is of secondary importance; the content and the structure (if feasible) are of primary importance.
The publication phase requires flexibility and automation of routine work. The result must be of a professional publishing quality and enriched for further electronic dissemination.
The reuse & dissemination phase requires that a document instance in electronic form be independent of the processing environment. No sophisticated application should be required to reprocess the document. SGML applications are already being used for standards production at the ISO/CS and by many member bodies, and also for the delivery of some commercial electronic products. Some working groups have good experience with SGML. It is therefore necessary to define DTDs for use as the interfaces between the various phases.
We know that it is not practical to provide a single DTD for everything. It was therefore decided to have a "common" DTD for the exchange of published standards and several "in-house" DTDs, i.e. the member bodies can use their own variations of the base DTD (see figure 1).
Figure 1 — SGML applications in our business process
Standards from different member bodies look different, and different publishing software is used. Preferably, the differences between the exchange DTD and the individual variations should be minimized. Equivalent transformations of DTD fragments will help us to overcome software limitations. For example, in some cases we can replace a "simple" element type by an attribute and vice versa. It is also possible to change the names of element types and attributes for the in-house DTDs, e.g. to translate them into German.
To simplify the maintenance of several closely related DTDs, most of them are implemented as derivations from one base DTD. The authoring DTDs are either simplified versions of the base DTD or widely-used DTDs such as HTML 3.2.
The guidelines are ordered by the priority (from the highest to the lowest) agreed during the ITSCG AWG1 meeting in Berlin:
The general guidelines are:
We expect that there will be a small number of differences between the base and in-house DTDs. For example, the base DTD uses the CALS table model, while an in-house DTD may use an application-dependent table model. We would like to handle such simple differences with one base DTD which is included, with some variations, in the in-house DTD. The base DTD is intended for the exchange of published standards. Of course, document instances shall be converted from the base DTD to the in-house DTD.
The SGML technique for making such simple modifications is described below. The following DTD contains only one element type abc, which is defined through the parameter entity abcname:
The following DTD uses the previous one, but redefines the name of the element type:
This technique is limited, but it is sufficient in most of our cases.
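The redefinition technique described above relies on the SGML rule that the first declaration of an entity is binding and later declarations of the same entity are ignored. The following Python sketch only simulates that rule on a tiny DTD fragment; it is not an SGML parser. The function name, the replacement name "xyz", the declared content and the use of plain string concatenation in place of a real external-entity inclusion are all assumptions made for illustration.

```python
import re

ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.]+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r'%([\w.]+);')


def resolve_parameter_entities(dtd_text):
    """Expand %name; references, honouring only the first declaration of each entity."""
    entities = {}
    output = []
    for line in dtd_text.splitlines():
        match = ENTITY_DECL.match(line.strip())
        if match:
            name, value = match.groups()
            # First declaration wins: a later declaration is silently ignored.
            entities.setdefault(name, value)
            continue
        output.append(ENTITY_REF.sub(lambda m: entities[m.group(1)], line))
    return "\n".join(output)


# Base DTD: one element type whose name comes from the parameter entity abcname.
BASE_DTD = '<!ENTITY % abcname "abc">\n<!ELEMENT %abcname; - - (#PCDATA)>'

# Hypothetical in-house DTD: declares abcname first, then includes the base DTD.
IN_HOUSE_DTD = '<!ENTITY % abcname "xyz">\n' + BASE_DTD

print(resolve_parameter_entities(BASE_DTD))
print(resolve_parameter_entities(IN_HOUSE_DTD))
```

In this sketch the base DTD on its own yields an element type named abc, while the in-house variant yields xyz without any change to the text of the base DTD.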
Replacing the table model (via the parameter entity std.table) is demonstrated in the following example:
International Standards are complex technical documents. Their structure, presentation and some guidelines on the contents are given in the ISO/IEC Directives, Part 3, 1997. Table 1 lists the components (small building blocks) that can be found in standards. As a rule, a component contains text, special characters, drawings, etc. as well as other components. For example, a paragraph and a highlighted phrase are components. A component is represented in the DTD as an element type.
Component | Display | Inline | Referential | Float |
---|---|---|---|---|
paragraph | + | |||
list items (entries) | + | |||
note | + | |||
example | + | |||
warning, remark | + | |||
phrase | + | |||
cross-reference | + | + | ||
formula | + | + | ||
footnote a) | + | + | + | + |
figure | + | + | ||
table | + | + |
To present the logical structure of the document, the components are placed into so-called "containers". For example, a list is an (ordered or unordered) group of list items (or entries). The clauses and subclauses are another type of container. They contain components and other containers, but no text.
A standard contains normative and informative provisions; the ISO/IEC Directives, Part 3 explain which is which. To reflect this business rule, some element types in the DTD have an attribute status which can be either normative or informative. If an element type does not have this attribute, then the status of the element is inherited from the parent element in the document instance.
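The inheritance rule for the status attribute can be sketched as a walk up the element tree until an element with an explicit status is found. The Python below is a minimal illustration only: the class, the function name and the element names "body", "h1", "p" and "infannex" are assumptions chosen for the example, not names confirmed by the DTD.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    name: str
    status: Optional[str] = None  # "normative", "informative", or None if not specified
    parent: Optional["Element"] = None
    children: List["Element"] = field(default_factory=list)

    def add(self, child):
        """Attach child to this element and return the child."""
        child.parent = self
        self.children.append(child)
        return child


def effective_status(element):
    """Return the element's own status, or the nearest ancestor's status."""
    node = element
    while node is not None:
        if node.status is not None:
            return node.status
        node = node.parent
    return None  # no element on the path declares a status


# Hypothetical document fragments.
body = Element("body", status="normative")
paragraph = body.add(Element("h1")).add(Element("p"))

annex = Element("infannex", status="informative")
annex_paragraph = annex.add(Element("p"))

print(effective_status(paragraph))        # inherited from the body
print(effective_status(annex_paragraph))  # inherited from the annex
```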
The first question is: which displayed components are allowed inside a paragraph? The same question arises for every other container. Table 2 attempts to answer all these questions.
Container | paragraph | list | note | example | warning | footnote | floats | displayed formula |
---|---|---|---|---|---|---|---|---|
paragraph | no | yes | yes | yes | no | yes | yes | yes |
list item | yes | yes | yes | yes | no | yes | yes | yes |
note a) | yes | yes | no | no | no | no | yes | no |
example | yes | yes | no | no | no | yes | yes | yes |
warning | yes | yes | no | no | no | yes | no | no |
footnote a) | yes | yes | no | no | no | no | no | no |
floats | yes | yes | yes | yes | yes | no | no | yes |
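The nesting rules of table 2 can be encoded as a simple lookup. The sketch below assumes that each row names a container and each column names a component that may or may not appear inside it; the variable and function names are illustrative only.

```python
COLUMNS = ["paragraph", "list", "note", "example", "warning",
           "footnote", "floats", "displayed formula"]

# One row per container, with the yes/no flags in column order (copied from table 2).
ROWS = {
    "paragraph": "no yes yes yes no yes yes yes",
    "list item": "yes yes yes yes no yes yes yes",
    "note":      "yes yes no no no no yes no",
    "example":   "yes yes no no no yes yes yes",
    "warning":   "yes yes no no no yes no no",
    "footnote":  "yes yes no no no no no no",
    "floats":    "yes yes yes yes yes no no yes",
}

# Map each container to the set of components it may contain.
ALLOWED = {
    container: {column for column, flag in zip(COLUMNS, flags.split()) if flag == "yes"}
    for container, flags in ROWS.items()
}


def may_contain(container, component):
    """Return True if table 2 allows component inside container."""
    return component in ALLOWED[container]


print(may_contain("paragraph", "list"))  # a list may appear inside a paragraph
print(may_contain("note", "footnote"))   # a note shall not contain footnotes
```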
(-//::ISO::CS//DTD std::base::0.93//EN)
The name of each element type is taken from ISO 8879:1986 and/or ISO/IEC TR 9573-11:1992 for simplicity. A possible approach is to follow architectural forms and "rename" an element type in the following way:
Most element types have an attribute "ID" for internal cross-references. The element types which generate additional text, e.g. by autonumbering, have an attribute "GTEXT" which keeps the generated text, e.g.
The top-level structure of a standard
The top-level structure depends heavily on in-house conventions. Imagine a national standard which is an approved European standard (euronorme), which in turn is an approved International Standard. Such a document may contain several forewords, endorsement notes, copyrights, etc. We decided to keep the top-level structure as simple as possible, while bearing in mind the interchange aspect.
This is the root element of a standard.
The attribute "LANGUAGE" specifies the language of the standard.
With two parameter entities (std.name and std.model) one can change the element type name and adjust its content model. See clause 8 for details.
This element type is described in clause 6.
This element type is used as a placeholder for the first cover page. No structure is defined for title pages, which are different for each organization.
The content model is simple text.
It is assumed that the title page is constructed from the information which is available in the profile. This is a mandatory element, i.e. a standard shall contain one title page.
This element type is used as a placeholder for the table of contents generated by the publishing system.
The content model is simple text.
The attribute "LEVEL" permits control over the level to which the subclauses are entered in the table of contents. The default value is "1", which means that only the clauses from the body of the standard and the annexes are entered.
This is an optional element, i.e. a standard may contain one table of contents.
This element type is used to identify the foreword clause of the standard.
The content model is rather simple. It cannot contain any subdivisions or any numbered components such as notes, figures, etc. Only paragraphs and lists are allowed in the foreword.
This is a mandatory element, i.e. a standard shall contain one foreword. Almost the whole foreword is a set of boilerplate texts.
This element type is used to identify the introduction clause of the standard.
The content model is rather simple. It can contain subdivisions (unfortunately, the ISO/IEC Directives do allow this), but it cannot contain any numbered components such as notes, figures, etc.
This is an optional element.
This element type is used to identify the body of the standard.
A body is structured by top-level (<H1>) titled numbered subdivisions of text, i.e. by clauses. There are four special clauses (<SCOPE>, <CONF>, <REFS> and <DEFS>). The body may contain a "warning statement" at the beginning.
The attribute "COLS" specifies the number of columns. Only one column and two column (default) page layouts are recommended for monolingual standards. The attribute "STATUS" has fixed value "NORMATIVE".
This is a mandatory element: a standard shall have a body. With the parameter entity body.model one can change the content model. See clause 8 for details.
This element type is used to identify a normative annex.
An annex contains either a mixture of basic components or at least two top-level subdivisions of text.
The optional attribute "COLS" specifies the number of columns. Only one column and two column (default) page layouts are recommended for monolingual standards. The attribute "STATUS" has fixed value "NORMATIVE".
This is an optional and repeatable element, i.e. a standard may contain several normative annexes.
This element type is used to identify an informative annex.
An annex contains either a mixture of basic components or at least two top-level subdivisions of text.
The optional attribute "COLS" specifies the number of columns. Only one column and two column (default) page layouts are recommended for monolingual standards. The attribute "STATUS" has fixed value "INFORMATIVE".
This is an optional and repeatable element, i.e. a standard may contain several informative annexes.
This element type is used to identify a special informative annex with the bibliography.
Some text and the list of bibliographical references are allowed.
The optional attribute "COLS" specifies the number of columns. Only one column and two column (default) page layouts are recommended for monolingual standards. The attribute "STATUS" has fixed value "INFORMATIVE".
This is an optional element. It is the last annex.
This element type is used as a placeholder for the index generated by the publishing system.
The content model is simple text.
This is an optional element, i.e. a standard may contain one index.
This element type is used as a placeholder for the last cover page. No structure is defined for last cover pages, which are different for each organization.
The content model is simple text.
It is assumed that the last cover page is constructed from the information which is available in the profile. This is a mandatory element, i.e. a standard shall contain one last cover page.
Six nested levels of subdivisions of text are allowed. The top level of subdivision is the clause. The other levels are subclauses. The subdivisions are always numbered. They may or may not contain a title.
These element types are used to identify the clauses and subclauses of the standard.
Each titled subdivision contains a title <HT>, an optional title comment <CT> (except for the dedicated clauses), and either a mixture of the basic elements (e.g. paragraphs, notes, various types of lists, figures, tables, etc.) or a lower level of text subdivision (either titled or untitled). These element types are typical containers.
The content model reflects several editorial rules. First, an empty subdivision is not allowed. Second, a lower-level subdivision shall appear at least twice. Third, hanging text (i.e. text without a title) is not allowed. In general, the content model looks like: (ht, ct?, ((text)+|(Hx,Hx+))).
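The three editorial rules can be checked mechanically against the general content model (ht, ct?, ((text)+|(Hx,Hx+))). The following Python sketch maps each child of a subdivision to one letter and matches the resulting string with a regular expression that mirrors the model. The token names and the function name are assumptions for illustration; "text" stands for any basic component and "hx" for a lower-level subdivision.

```python
import re

# One letter per kind of child element.
TOKENS = {"ht": "T", "ct": "C", "text": "x", "hx": "H"}

# Regular-expression transcription of (ht, ct?, ((text)+ | (Hx, Hx+))).
MODEL = re.compile(r"TC?(x+|HH+)")


def is_valid_subdivision(children):
    """Return True if the sequence of child kinds satisfies the content model."""
    encoded = "".join(TOKENS[child] for child in children)
    return MODEL.fullmatch(encoded) is not None


print(is_valid_subdivision(["ht", "text", "text"]))      # title followed by components
print(is_valid_subdivision(["ht", "ct", "hx", "hx"]))    # title, comment, two subclauses
print(is_valid_subdivision(["ht"]))                      # empty subdivision
print(is_valid_subdivision(["ht", "hx"]))                # only one lower-level subdivision
print(is_valid_subdivision(["ht", "text", "hx", "hx"]))  # hanging text before subclauses
```

The first two calls return True and the last three return False, one for each of the three rules.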
As we decided to remove inclusions, the float components (e.g. figure, table, index entry) have to be incorporated into the content model. The float components are allowed in the running text, between displayed components and between subdivisions of text. In the following example, there are three possible places for the float elements. All these positions are allowed.
There is a special case: a terminology list, which can be alone in a subdivision. As a result, the final content model is rather complex.
EXAMPLE 1
This element type is used to identify the title of titled subdivisions of text.
Only running text and simple (not exotic) inline components are allowed inside the title.
This is a mandatory element.
This element type is used to identify the comment to the title of titled subdivisions of text.
Only running text and simple (not exotic) inline components are allowed inside the title comment.
This is an optional element.
These element types are used to identify untitled numbered subdivisions which are very similar to the titled numbered subdivisions of text.
Each untitled subdivision contains a mixture of the displayed components (e.g. paragraphs, notes, various types of lists, figures, tables, etc.) as well as a lower level of untitled text subdivision. Again, these element types are containers.
Do not treat the untitled numbered subdivisions of text like "numbered paragraphs". The subdivisions are nested, while the paragraphs are sequential.
This element type is used to identify the special clause titled "Scope" in English. There are no other differences from the element type <H1>.
This element type is mandatory and is the first clause in the body.
This element type is used to identify the special clause titled "Conformance" in English. There are no other differences from the element type <H1>.
This element type is optional and follows the scope.
This element type is used to identify the special clause which contains the normative references used in the standard.
This element type consists of the boilerplate text followed by the list of references (see 5.5.15).
This element type is optional and is the next clause after the conformance, if any.
This element type is used to identify the special clause which contains the definitions used in the standard.
This element type consists of the boilerplate text followed by the terminology list (see 5.6.1).
This element type is optional and is the next clause after the normative references.
This element type is used for several purposes (an unfortunate inheritance): first, to identify the display component which is a block of text and, second, to identify a container for this text block and other display components. In the latter case it serves as an additional, lowest-level subdivision of text.
This element type can contain running text (with inline components), float components, and all displayed components except the paragraph itself.
This element type is used to identify the notes.
A note shall not contain footnotes.
The attribute "NUMBER" controls the numbering of a note. The attribute "STATUS" has the fixed value "NORMATIVE", which means that this element is always normative.
A note is attached to some part of a document, either to a displayed component such as a paragraph, or to a subdivision. In the first case the note shall be inside the paragraph. In the second case, the note shall be the last "child" in the subdivision.
This element type is used to identify examples.
This element may contain running text and other display components.
The attribute "NUMBER" controls the numbering of an example.
This element type is used to identify a displayed component with "manually formatted text" such as computer source code, a listing, etc.
It is recommended to use the proper entities to put non-breaking spaces and line breaks inside the text to achieve the desired formatting.
This element type is used to identify the display components with warnings, cautions, remarks, etc. which draw the reader's attention to a particular text. The reasons may differ (precautions, security, etc.) and are distinguished by the attribute "TYPE".
This element may contain running text and other display components.
The attribute "TYPE" has several values for the different presentations.
Such warnings are in fact attached to some piece of a document, e.g. a subdivision (if they are outside a displayed component), or even the entire document (a general warning at the beginning of the body of the document).
These element types are used to identify two traditional lists.
The list consists of a mixture of the list items and list paragraphs.
The attribute "FORMAT" controls the different presentations of lists. For <OL>:
alpha: items are enumerated by lower-case Latin characters;
arabic: items are enumerated by Arabic digits;
roman: items are enumerated by Roman numerals;
auto: automatic selection of the enumeration method.
For <UL>:
bullet: items are prefixed by a bullet symbol;
emdash: items are prefixed by an em dash symbol;
sl: items are not prefixed;
auto: automatic selection of the prefix.
The lists may be nested up to four levels.
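The explicit enumeration methods for <OL> can be sketched as a small labelling function. This is only an illustration under stated assumptions: the closing parenthesis follows the "a)", "b)" style used elsewhere in this document, lower-case Roman numerals are assumed for "roman", and the value "auto" is left out because its selection rule is not described here. The function names are hypothetical.

```python
def to_roman(number):
    """Convert a positive integer to a lower-case Roman numeral."""
    numerals = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
                (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
                (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    result = ""
    for value, symbol in numerals:
        while number >= value:
            result += symbol
            number -= value
    return result


def item_label(index, fmt):
    """Return the label of the index-th list item (1-based) for a FORMAT value."""
    if fmt == "alpha":
        if not 1 <= index <= 26:
            raise ValueError("alpha labels are limited to a..z in this sketch")
        return chr(ord("a") + index - 1) + ")"
    if fmt == "arabic":
        return str(index) + ")"
    if fmt == "roman":
        return to_roman(index) + ")"
    raise ValueError("unsupported FORMAT value: " + fmt)


for fmt in ("alpha", "arabic", "roman"):
    print(fmt, [item_label(i, fmt) for i in range(1, 5)])
```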
EXAMPLE The following markup
results in
a) the x-orientation ...;
b) the x-nominal surface stress;
but not on:
c) the y-orientation ...;
d) the y-nominal surface stress;
This element type is used to identify the individual items within ordered and unordered lists.
This element may contain running text and other display components.
This element type is used to identify a paragraph which interrupts a list.
This element may contain running text and other display components.
This element type is used to identify the description list which is a list of pairs of a term and a definition.
This element contains only description list entries.
The attribute "FORMAT" controls the different presentations of the list:
ol: presentation simulates an ordered or unordered list;
varl: presentation simulates an old variation list;
syml: presentation simulates an old symbols list;
auto: default presentation.
The description list can mimic complex cases of the ordered list. The interruption of an ordered list may be more complex, and part of the list may be repeated several times (so-called "branching").
EXAMPLE 1
resulting in
Tex | The batch typesetting system developed by Prof. D. Knuth. The batch typesetting system developed by Prof. D. Knuth. The batch typesetting system developed by Prof. D. Knuth. |
DCF | The batch typesetting system from IBM. The batch typesetting system from IBM. The batch typesetting system from IBM. The batch typesetting system from IBM. |
EXAMPLE 2 See ISO 75-1:1993, clause 11:
resulting in
h) The orientation ...;
i) The nominal stress;
resulting in
a) reference to this part ...
b) to m) see ISO 6721-1:1994, clause 12;
n) if a fixed ...
This element type is used to identify an entry in the description list. This is just a container.
This element contains a pair of a term and a definition.
This element may appear only in the description list.
This element type is used to identify a term inside a description list entry.
This element contains simple text.
This element type is used to identify a definition inside a description list entry.
This element contains running text or other display components.
This element type is used to identify the bibliography list.
This element contains only bibliographical entries.
This element should appear only in the bibliography.
This element type is used to identify an entry in the bibliographical list. This is just a container.
This element contains only references to external documents.
This element may appear only in the bibliography list.
This element type is used to identify the list of normative references.
Only reference entries are allowed inside this element type.
This element should appear only in the clause for normative references (see 5.4.8).
This element type is used to identify an entry in the reference list. This is just a container.
This element contains only references to external documents.
This element may appear only in the reference list.
EXAMPLE The reference list.
Bla-bla-bla
This element type is used to identify the terminology list. This element type is a simplified version of similar element type to be used for terminological standards. This element type follows ISO 10241:1992.
A terminology list is a set of concepts. The concepts may be organized in different ways. There are two types of relationships between concepts: hierarchy (a concept contains one or more subconcepts) and grouping (many concepts are grouped under one title). These two types of relationships may be intermixed. So, on each level of the hierarchy we can find a mixture of single concepts (element types <C1>, <C2>, <C3>, <C4> and <C5>) and groups of concepts (element types <CC1>, <CC2>, <CC3>, <CC4> and <CC5>).
This element contains several concepts (element type <C1>) or several concept groups (element type <CC1>).
This element may appear inside element <DEFS> (special clause for definitions), in a clause or in an annex.
Historically, the terminology list "shares" numbering with subdivisions of text. To avoid the duplication of numbers, we need to restrict the use of <TL>. A numbered subdivision of text can contain either deeper subdivisions or a terminology list.
The headings of subdivisions and the terms now have different presentations. This avoids confusion between them.
EXAMPLE 1 The terminology list without subdivision.
Bla-bla-bla
For a real example, see Annex A and Annex B, which contain the markup and presentation of the terminology for ISO/IEC Directives, Part 3, 1997.
This element type is used to identify a concept at the different levels of the hierarchy.
Each such element type contains one or several terms, their description (optionally) and deeper single concepts or concept groups.
This element type is used to identify a group of concepts under the same title at the different levels of the hierarchy.
Each such element type contains a group title, text (optionally) and deeper concepts or concept groups.
This element type is used to identify a term within the terminology list entry.
This element contains a mixture of running text and simple inline components.
This element may appear only in the concept.
This element type is used to identify a description of a term within the concept.
This element contains a mixture of running text, inline and displayed components.
This element may appear only in the concept.
This element type is used to identify the phrases which are highlighted by different typographical presentations.
This element may contain running text and all inline components.
The attribute "FORMAT" specifies the following presentations:
none: presentation as for running text;
bold: presentation for bold text;
italic: presentation for italic text;
boldit: presentation for bold italic text;
uline: presentation for underlined text;
oline: presentation for overlined text;
code: presentation for computer code (monospaced rather than proportional text);
This element type is used to identify a term.
This element may contain raw text.
The attribute "LANG" specifies the language of the term, e.g. Latin for biological terms.
This element may appear in the running text.
This element type is used to generate an internal (i.e. within a document) cross-reference such as "see table A.1". Of the different possibilities for the generated text, e.g. just "A.1" or "table A.1", the shortest form has been chosen.
This element is empty, because it is considered that a cross-reference will be generated by the authoring or publishing system.
The attribute "REFID" contains an ID of the element to be referenced. Some variations in the presentation are managed by the attribute "FORMAT". The attribute "GTEXT" contains the text generated by the publishing system.
It was decided to implement this element type as the HyTime architectural form "clink" (contextual link). It is very difficult to create a HyTime-compliant document: "only computer will like to write HyTime documents." At present there is no SGML-based publishing or authoring system which can work in this way. However, there is document delivery software which can handle HyTime (e.g. SoftQuad Explorer, Panorama and, soon, EBT DynaText). In any case, the HyTime constructs in our case are fairly simple.
EXAMPLE 1 An internal reference to a table:
EXAMPLE 2 To reference several elements we need to use indirect addressing via the architectural form "nameloc", which contains the IDs of all referenced elements, e.g. an internal reference to tables:
This element type is used to insert an external graphic file into the document. The external file is described as an external entity with a public identifier.
It is an empty element.
The attribute "NAME" specifies the entity name. The attribute "POSITION" specifies the position of the graphic relative to the insertion point. With the attribute value "INLINE", the lower left corner of the graphic is placed at the insertion point. With the attribute value "BELOW", the top edge of the graphic is placed below the insertion point.
Graphics may be used to mimic an unusual character inside running text or as a figure without a title.
The commonly agreed notations should be used to specify graphic formats, for example:
EXAMPLE 1 Inserting a file in Encapsulated PostScript format:
EXAMPLE 2 Insert a file
EXAMPLE 3 Insert a file
This element type is used to identify a footnote. It is allowed only in running text, because it produces a footnote reference (i.e. a marker) which has to be attached to something. A footnote is an informative (as opposed to normative) component of a standard.
A footnote may contain a mixture of running text and other displayed elements except notes and footnotes.
The attribute "STATUS" has fixed value "INFORMATIVE".
However, footnotes inside tables (table notes) and figures (figure notes) are different. The use of footnotes in figures and tables will be discussed later.
In the DTD we want to handle references to external documents carefully, in order to provide good input to on-line hypertext systems. All external documents (in most cases normative documents) are given with their titles and publication dates either in the "Normative references" clause or in the informative annex "Bibliography" (in this case we consider the reference as external). In the text such documents are indicated by their reference number only (in this case we consider the reference as internal). Such a separation allows some control to be imposed. To refer to other external documents we can use a bibliographical reference.
This element type is used to specify a bibliographical reference to an external document. Such a reference is just text with a document title, etc.
Just a running text with highlighted phrases.
This element may be used in reference and bibliographical entries as well as in the running text.
This element type is the hypertextual reference to an external document as a whole. A reference may be represented by some text, which may be generated from an in-house database. The real problem here is the assignment of public identifiers to all our documents.
Just a running text with highlighted phrases.
This element type is the HyTime contextual link (architectural form "clink"). The attribute "REFID" is always the indirect reference (i.e. via the name location architectural form "nameloc").
EXAMPLE 1 The element <EXTDOC> always refers to an element <NAMELOC> which must contain an external entity.
Results in: "... ISO 8879: 1986, Information processing - Text and office systems - Standard Generalized Markup Language (SGML) ..."
EXAMPLE 2 A reference to a multi-part standard as a whole should mention all its parts. In the following example the last element <NAMELOC> serves for such a reference. This element contains the IDs of all the elements <EXTDOC> and defines a so-called aggregated location. In this way the element <XREF> points simultaneously to several elements <EXTDOC>.
EXAMPLE 3 A reference to particular elements of an external document, e.g. "conformément à 3.1.1 de l'ISO 1234:1984", is more complicated. The element <NAMELOC ID=x1234> defines the reference to the external document and the element <EXTDOC> uses this reference. The element <NAMELOC ID=y1234> defines the reference to a particular element in this external document. The element <NAMELOC ID=z1234> defines the aggregated location which consists of the previous element <NAMELOC> and the element <EXTDOC>. The element <XREF> uses the last element <NAMELOC> to point simultaneously to the external document as a whole and to the particular element of this external document. This is a possible way to guarantee that a referenced external document is "mentioned" via an element <EXTDOC>.
This element type is used to identify an index entry which is attached to some place of a text.
Only raw text.
We hope that the index entries will be generated by an authoring or publishing system. Sometimes index entries may be complex to support multilevel indexes, etc. But we decided to consider them as text, for now.
We would like to keep figures as simple as possible, although ISO/IEC Directives, Part 3 do allow logical grouping of figures (two levels only). As a consequence, grouped figures such as those in ISO 10059-1:1992 cannot be implemented.
[Illustration: two drawings with the sub-captions "a) aaaaaa" and "b) bbbbbbbb" under the common title "Figure 4 - Bla-bla-bla"]
It is also not possible to group figures physically, i.e. to put them together with proper alignment.
The element contains a title and a body (i.e. the rest).
This element type is used as a container for a figure except its title. As a rule it contains a drawing and some text which annotates this drawing.
The element contains a mixture of running text, inline and displayed components except notes and footnotes.
This element type is used to identify the title of a figure.
The element contains a mixture of running text and inline components, except notes and footnotes.
This element type is used to identify footnotes inside a figure to annotate a drawing. Such footnotes (actually, figure notes) are special because they contain the normative information while the ordinary footnotes contain informative information.
The element contains a mixture of the running text, inline and displayed components except notes and footnotes.
Markers shall appear on the drawing and at the beginning of a figure note. These markers can be part of the drawing or may be attached to the drawing as annotations. In the first case we need to guarantee that we can use the same marker (e.g. a number in a circle) in the drawing and in the text. In the second case we are faced with overlaid components, one of which has to be described in SGML.
Tables are kept as simple as possible by using CALS tables. Their definition can be found in MIL-M-28001B (26 June 1993). This document is available on several FTP sites. The content model for cells allows figures within tables.
The element contains a title, a table header (optional), a table body, and a table footer (optional). This content model is a simplified CALS model for tables.
This element type is used to identify the title of a table.
The element contains a mixture of running text and inline components, except notes and footnotes.
This element type is used to identify footnotes inside a table. Such footnotes (actually, table notes) are useful for attaching more text to a cell. They contain normative information, whereas ordinary footnotes contain informative information.
The element contains a mixture of the running text, inline and displayed components except notes and footnotes.
These element types are used to identify mathematical formulae.
We decided to use SGML markup for formulae. It was agreed to adopt the DTD fragment for mathematics from ISO 12083:1994 as a replacement for the mathematics markup described in ISO/IEC TR 9573-11:1992.
This element type is used to identify tolerances. There is not yet a subDTD to mark up all possible tolerances as presented in the ISO/IEC Directives, Part 3. The examples below are just a proposal.
Of course, such markup is pretty awful, but we are not going to reuse these tolerances in other applications.
The examples are in the following table.
Result | <TOL>markup</TOL> |
---|---|
6,3 mm ± 12,5 % | 6,3 mm <symdev>12,5 %</symdev> |
6,3 × (1 ± 12,5 %) mm | 6,3 × (1 <symdev>12,5 %</symdev>) mm |
(30 ± 1,5) mm | (30 <symdev>1,5</symdev>) mm |
3 mm+0,2 mm -15 % | 3 mm <uppdev>+0,2 mm</uppdev><lowdev>-15 %</lowdev> |
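As a rough illustration of how the proposed markup relates to the printed result, the following Python sketch converts the content of a <TOL> element to plain text. It assumes that <symdev> is rendered as a "±" sign followed by its content and that <uppdev> and <lowdev> are simply printed one after the other. The function name render_tol is hypothetical, and real typesetting (e.g. stacked deviations) is out of scope.

```python
import re


def render_tol(markup: str) -> str:
    """Render the content of a <TOL> element as plain text (illustrative only)."""
    # Assumption: a symmetric deviation is printed as "± value".
    text = re.sub(r"<symdev>(.*?)</symdev>", r"± \1", markup)
    # Assumption: upper and lower deviations are printed inline, upper first,
    # separated by a space; a publishing system would stack them instead.
    text = re.sub(r"<uppdev>(.*?)</uppdev>", r"\1", text)
    text = re.sub(r"<lowdev>(.*?)</lowdev>", r" \1", text)
    return text


print(render_tol("(30 <symdev>1,5</symdev>) mm"))
```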
Chemistry shall be processed as formulae.
EXAMPLE 1
EXAMPLE 2
EXAMPLE 3
EXAMPLE 4
For the designation of internationally standardized items, see ISO/IEC Directives, Part 2. Generally, a designation (element type <DESIGN>) consists of a description (textual?), an International Standard Number Block and one or more Data Blocks. The current definition of the designation system forces a presentational model: for example, the number of the relevant part of a standard shall be indicated in a Data Block, while the number of the standard shall be put in the International Standard Number Block. The proposed content model joins this information in the element type <EXTDOC> so that the standard can be referenced. The visual result is the same.
Individual Data Blocks can be "linked" to their descriptions, if any.
See the markup of the example E.7.1 from the Directives, Part 2.
EXAMPLE
... main scale 58 °C to 82 °C:
In this designation the elements have the following meaning:
There is a need for a displayed text block component to cover some exotic cases, e.g. a long quotation.
It is proposed to add one or more attributes to hold the graphical presentation(s) of a formula.
It is proposed to define the figure body as a table.
In the ITSIG exchange DTD for Standards, the bibliographical, monitoring, and other information about a document (i.e. metadata) must be attached to the document instance. Roughly speaking, it is a catalogue-like description of a document. In the standards development business, such information appears in several types of applications:
As a rule, this information resides in a particular PMDB and is requested by other PMDBs and applications. The new ISONET manual describes more than a hundred fields that may potentially be exchanged. It gives two representations: conventional and SGML. The latter can be used in the ITSIG exchange DTD to encode metadata. For historical reasons, we call a document's metadata its "profile".
This article describes the implementation of the profile in the SGML project. As decided at the SGML group meeting in December 1997, we have to progress with the use of the new ISONET manual for the exchange of metadata. Regional and national SDOs have procedures by which an International Standard is endorsed as their own standard. It is very important that the profile of an International Standard can easily be reused by the Member bodies' SGML-based production systems.
We have already learned that it is not convenient to use the same profile in different applications. For example, an ISO reference number is usually presented as "ISO 1234-1:1994", while for an SGML publishing system it should be encoded as "ISO 1234-1 : 1994". As another example, a special procedure such as fast-track is just a flag in a PMDB, but for a publishing system it is a particular boilerplate text to be inserted on the cover page.
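The difference between the two presentations of a reference number can be sketched as a small conversion step. The Python function below is only an illustration under the assumption that the publishing form differs from the database form solely by the spaces around the colon before the year; the name to_publishing_form is hypothetical.

```python
import re


def to_publishing_form(reference: str) -> str:
    """Convert 'ISO 1234-1:1994' to the publishing form 'ISO 1234-1 : 1994'."""
    # Only a colon directly followed by a four-digit year at the end of the
    # reference is touched; any spaces already around it are normalized.
    return re.sub(r"\s*:\s*(\d{4})$", r" : \1", reference)


print(to_publishing_form("ISO 1234-1:1994"))
```

Applying the function to a reference that is already in publishing form leaves it unchanged, so the step can be repeated safely in a transformation pipeline.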
We therefore separate the profile extracted from databases (database-oriented), which has to be encoded in conformance with the new ISONET manual, from the profile used by the publishing system (document-oriented), which should be convenient for publishing needs. Easy extensibility of the profile should be taken into account in the implementation.
It is important to emphasize that only the database-oriented profile should be used for the interchange of data between the organizations involved. The document-oriented profile is an internal choice of each organization. Of course, in simple cases it may coincide with the database-oriented profile.
In the example of database-oriented profile only four groups are used (see clause A.1 for the DTD):
00 data elements for the file label;
10 data elements for projects;
20 data elements for products (documents in our case);
40 data elements for committee information.
EXAMPLE 1
ISO
1
:ST
ZX
ISO/TC
SC
There are several changes with respect to the latest edition of the ISONET manual:
As a rule, a database-oriented profile contains a lot of information about the main document and just a few elements for the referred documents. An example for a referred document is given below:
EXAMPLE 2
ISO
ST
The document-oriented profile contains the metadata about the document itself (i.e. the main document), and it may contain metadata about the documents that are referred to in the main document. The database-oriented profile for a particular document has to be embedded in the document-oriented profile. The structure of the document-oriented profile therefore looks as follows:
At present, the document-oriented profile used at ISO/CS is very simple. Only three elements are required: <FIELD> to keep a single piece of information, <ISOREF> to keep references like "ISO 1234:1999 - Dummy title", and <XP> to reuse the content of previous elements. A unique ID is assigned to each element.
An example of a document-oriented profile for a referred document is given below.
EXAMPLE 1
Note that all elements now have a unique SGML identifier for internal references. The additional element type <XP> is designed to "copy something from the profile by reference". For example, the publication year is reused by "... <XP TYPE=BODY REFID='R0.6.1'> ... ".
The element type <ISOREF> is used to define the different types of reference to an ISO Standard. The reference may be a reference number truncated in different ways, a title, or a combination of both. All of these are possible with the element type <ISOREF>, which has four attributes ("VARA", "VARB", etc.) for the different variations of the reference number and whose content is the title of the Standard (see below):
Note that in that definition the element <XP> is already used to "copy" the title of the standard. In fact, this element can refer not only to the content of an element in the profile but also to the attributes of an element. Any combination of these two options is also possible. Thus a reference such as "... <XP TYPE=ISOREF REFID="R0.E"> ... " produces " ... ISO 10013: 1995, Guidelines for developing quality manuals ...". Of course, this is a feature of the publishing system at ISO/CS. In another publishing system, implicit generation of all variations may be necessary.
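The copy-by-reference behaviour of <XP> can be sketched as a lookup in a table keyed by the unique IDs. The Python code below is an assumption-laden illustration, not the ISO/CS publishing system: the dictionary layout, the function name resolve_xp, the stored year value, and the choice of the "VARA" attribute for TYPE=ISOREF are all hypothetical.

```python
# Hypothetical in-memory form of a document-oriented profile: each entry is
# keyed by its unique ID and holds the element content plus its attributes.
PROFILE = {
    "R0.6.1": {"content": "1995", "attrs": {}},  # publication year (made-up value)
    "R0.E": {
        "content": "Guidelines for developing quality manuals",
        "attrs": {"VARA": "ISO 10013: 1995"},
    },
}


def resolve_xp(profile: dict, xp_type: str, refid: str) -> str:
    """Return the text that an <XP> reference would copy from the profile."""
    element = profile[refid]
    if xp_type == "BODY":
        # Copy only the content of the referenced element.
        return element["content"]
    if xp_type == "ISOREF":
        # Combine one reference-number variation (an attribute) with the title.
        return f'{element["attrs"]["VARA"]}, {element["content"]}'
    raise ValueError(f"unsupported XP type: {xp_type}")


print(resolve_xp(PROFILE, "ISOREF", "R0.E"))
```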
An ISO Standard contains many boilerplate texts in many variations. There is some logic in the presentation (English or French), in the boilerplate texts (JTC1 or not), etc. To provide conditional generation of a document in the SGML-based publishing system, we use so-called marked sections. The transformation procedure from the database profile to the document profile also generates several parameter entities, which are used for conditional processing of the document. Below is an example of a boilerplate text fragment for the foreword clause:
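As a minimal sketch of the mechanism only (not the actual ISO/CS transformation procedure), the Python function below generates parameter entity declarations whose values are the SGML keywords INCLUDE or IGNORE. The flag names lang.en, lang.fr and jtc1 are hypothetical.

```python
def conditional_entities(flags: dict) -> str:
    """Emit one parameter entity declaration per flag, set to INCLUDE or IGNORE."""
    lines = []
    for name, enabled in flags.items():
        keyword = "INCLUDE" if enabled else "IGNORE"
        lines.append(f'<!ENTITY % {name} "{keyword}">')
    return "\n".join(lines)


# Hypothetical flags derived from the database-oriented profile.
print(conditional_entities({"lang.en": True, "lang.fr": False, "jtc1": True}))
```

A boilerplate fragment would then be wrapped in a marked section such as `<![ %jtc1; [ ... ]]>`, so that the parser keeps or skips the text according to the generated entity value.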
The separation of profile into database-oriented and document-oriented has the following advantages:
The base DTD contains several groups of parameter entities that allow some modifications of the base DTD. Some entities are intended for the internal organization of the base DTD. A derived DTD is usually defined as some modifications of the parameter entities, the base DTD, and additional element types.
The parameter entity std.name is the general identifier (i.e. name) of the top element type in the base DTD.
The parameter entity std.model is the content model for the top element type in the base DTD.
The parameter entity body.model is the content model for the element type <BODY>.
The parameter entity std.profile is the complete definition of the element type <PROFILE>.
The parameter entity std.web is the complete definition of the element type <LINKS>.
The parameter entity std.entity is the definition of all character entities used in the base DTD. Two sets of character entities are predefined. Each of them contains several hundred character entities. The first set is based on ISO 8879:1986 and has the public id "-//ISO/CS//DTD std::entity:8879//EN". The second set is based on ISO/IEC TR 9573 and has the public id "-//ISO/CS//DTD std::entity:9573//EN".
The base DTD uses several groups (display, inline, include and float) of element types. Each group has a base part, which is the default definition, and a local part, which is a local addition. The combination of these two parts is used in the derived DTD.
The parameter entity base.list contains the element types for general list components.
The parameter entity base.note contains the element types for note components.
The parameter entity base.xmp contains the element types for example components.
The parameter entities base.display, local.display, and display are, respectively, base, local, and combined definitions of the group of element types for display components.
The parameter entities base.inline, local.inline, and inline are, respectively, base, local, and combined definitions of the group of element types for inline components.
The parameter entities base.float, local.float, and float are, respectively, base, local, and combined definitions of the group of element types for float components.
The parameter entities base.include, local.include, and include are, respectively, base, local, and combined definitions of the group of element types for include components.
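The base/local/combined pattern can be sketched as follows. This Python fragment is an illustration only: the element type names are hypothetical, and the actual groups are those defined in the base DTD. It joins a base list and a local list into a single SGML name group, which is what the combined entity (e.g. display) would contain.

```python
def combine_group(base: list[str], local: list[str]) -> str:
    """Join base and local element type names into one SGML name group."""
    names = []
    for name in base + local:
        if name not in names:  # keep the first occurrence, drop duplicates
            names.append(name)
    return " | ".join(names)


# Hypothetical element type names; the real groups are defined in the base DTD.
base_display = ["P", "NOTE", "LIST"]
local_display = ["WARNING"]
print(f'<!ENTITY % display "{combine_group(base_display, local_display)}">')
```

A derived DTD would normally change only the local part, leaving the base part and the combining declaration untouched.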
The parameter entity m.ph is the content model of "phrase". It is used for inline components.
The parameter entity m.par is the content model of "paragraph". It is used for display components.
The parameter entity m.pseq is the content model of "sequence of paragraphs". It is used for containers.
The parameter entity m.entry is the content model of "table cell". It is used in table subDTD.
The parameter entity std.page may be used to add attributes to the element types <BODY>, <ANNEXN>, <ANNEXI>, and <ANNEXBL>. Usually, an attribute for the page layout is added.
The parameter entity std.xref may be used to add an attribute to the element type <XREF>. Usually, an attribute for the cross-reference format is added.
The parameter entities base.df and base.f are used to specify the names of displayed and inline, respectively, element types for formulae component.
The parameter entity std.math is the definition of the formulae component.
The default definition of formulae component contains two text-only element types: <DFORMULA> and <FORMULA>.
The mathematics subDTD from ISO 12083:1994 has the public id "-//ISO/CS//DTD m12083//EN". To use this subDTD, the derived DTD should define the entity std.math, for example:
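As a hedged sketch, one plausible form of such a definition is an external parameter entity declared with the public identifier given above. The Python helper below only builds that declaration string; the helper name is hypothetical, and the way the base DTD then references std.math is an assumption not confirmed by this text.

```python
def external_parameter_entity(name: str, public_id: str) -> str:
    """Build an SGML declaration for an external parameter entity with a public identifier."""
    return f'<!ENTITY % {name} PUBLIC "{public_id}">'


print(external_parameter_entity("std.math", "-//ISO/CS//DTD m12083//EN"))
```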
The parameter entities fig.model, fig.include, and fig.exclude are used to specify the content model, inclusions, and exclusions, respectively, of element type <FIGURE> for figure component.
The parameter entity fig.title is the name of the element for figure title.
The default definition of figure component is the following:
The parameter entities tab.model, tab.include, and tab.exclude are used to specify the content model, inclusions, and exclusions, respectively, of element type <TABLE> for table component.
The parameter entity tab.title is the name of the element for table title.
The default definition of table component is based on CALS subDTD for tables. This subDTD has public identifier "-//USA-DOD//DTD MIL-M-28001B/table//EN".
To simplify the maintenance of the several very close DTDs, most of them are implemented as a derivation from one base DTD. Using the base part and local part of the base DTD, derivation of DTDs can be carried out.
The authoring DTD for native SGML tools is a simplified version of the base DTD. The following simplifications are proposed: fewer tags, no formulae, no exotic element types.
Another authoring DTD is designed for use with word processors that can save a document in SGML format (e.g. SGML Author for Word by Microsoft). Such a DTD is simplified twice. First, the number of element types is reduced to simplify authoring. Second, the structure is flattened, i.e. an element type for a display component may not contain another element type for a display component. Thus a paragraph may not contain a note. This simplification reflects the current functionality of word processors. A further simplification is the avoidance of nesting. Instead of a general element type for nested lists (ordered or unordered), four element types are introduced: list level 1, list level 2, etc.
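The replacement of nested lists by level-specific element types can be sketched as a flattening step. In the Python illustration below, the element type names LIST1 to LIST4 and the function name are hypothetical; only the limit of four levels is taken from the text.

```python
def flatten_list(items, level=1, max_level=4):
    """Turn a nested list into (element type name, text) pairs without nesting."""
    flat = []
    for item in items:
        if isinstance(item, list):
            # A sub-list becomes a run of items at the next level.
            if level >= max_level:
                raise ValueError("list nested deeper than the DTD allows")
            flat.extend(flatten_list(item, level + 1, max_level))
        else:
            flat.append((f"LIST{level}", item))
    return flat


print(flatten_list(["a", ["b", ["c"]], "d"]))
```

The flat sequence keeps the reading order, and the nesting depth survives only in the element type name, which matches what a word processor style sheet can express.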
HTML is now rich enough to be used for the presentation of standards. The implementation (i.e. mapping) of the base DTD in HTML 3.2 has to be defined.
Take into account ISO/DIS 12200 and ISO 10241:1992 although they contain ambiguities and drawbacks. Automatic separation of a multilingual version into monolingual versions.
We need to create a DTD for multilingual standards.
Automatic merging of monolingual versions into a single multilingual version.
NOTE There may also be a common part, e.g. illustrations.
Exactly the same SGML structure for each monolingual part (up to which level of detail?).
The exceptions for the previous requirement should be allowed.
Presentation requirements may affect the DTD.
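The separation and merging requirements above can be sketched with a simple data model. The Python code below assumes, purely for illustration, that a multilingual document is a sequence of units, each holding one text per language, and that every unit is present in every language; common parts and permitted exceptions are not handled. All names are hypothetical.

```python
def split_by_language(units):
    """Split multilingual units ({lang: text}) into one list of texts per language."""
    monolingual = {}
    for unit in units:
        for lang, text in unit.items():
            monolingual.setdefault(lang, []).append(text)
    return monolingual


def merge_languages(monolingual):
    """Merge per-language lists back into multilingual units; structures must match."""
    lengths = {len(texts) for texts in monolingual.values()}
    if len(lengths) > 1:
        raise ValueError("monolingual versions do not have the same structure")
    langs = list(monolingual)
    return [dict(zip(langs, texts)) for texts in zip(*monolingual.values())]


units = [
    {"en": "Scope", "fr": "Domaine d'application"},
    {"en": "Definitions", "fr": "Définitions"},
]
print(merge_languages(split_by_language(units)) == units)
```

The length check in merge_languages is a crude stand-in for the requirement that each monolingual part has exactly the same SGML structure.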
The public entities used in the set of ISO DTDs for standards are given in the following table.
Public Id | Comment |
---|---|
-//ISO/CS//DTD std::entity::8879//EN | All character entities in accordance with ISO 8879:1986 |
-//ISO/CS//DTD std::entity::9573//EN | All character entities in accordance with ISO/IEC TR 9573 |
-//ISO/CS//DTD std::in::92//EN | ISO/CS publishing DTD, version 0.92 |
-//ISO/CS//DTD std::base::92//EN | Base DTD, version 0.92 |
-//ISO/CS//DTD std::exchange::92//EN | ITSIG exchange DTD, version 0.92 |
-//ISO/CS//DTD std::in::93//EN | ISO/CS publishing DTD, version 0.93 |
-//ISO/CS//DTD std::base::93//EN | Base DTD, version 0.93 |
-//ISO/CS//DTD std::exchange::93//EN | ITSIG exchange DTD, version 0.93 |
-//ISO/CS//DTD m12083//EN | Mathematics subDTD from ISO 12083:1994 |
-//USA-DOD//DTD MIL-M-28001B/table//EN | Table subDTD from CALS |
-//ISO/CS//DTD isonet::0.01//E | ISONET based DTD for database profile |
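The public identifiers in the table follow the SGML formal public identifier pattern, in which the fields are separated by "//". As a small illustration, the Python function below splits such an identifier into its parts. The function name and the keys of the returned dictionary are hypothetical, and the sketch handles only identifiers with a leading "-" or "+" and exactly four fields.

```python
def parse_public_id(public_id: str) -> dict:
    """Split a formal public identifier into its main components."""
    registration, owner, text, language = public_id.split("//")
    # The text field starts with the public text class, e.g. "DTD".
    text_class, _, description = text.partition(" ")
    return {
        "registered": registration == "+",  # "-" marks an unregistered owner
        "owner": owner,
        "class": text_class,
        "description": description,
        "language": language,
    }


print(parse_public_id("-//ISO/CS//DTD std::exchange::93//EN"))
```

Note that the owner "ISO/CS" contains a single slash, which is why the split is done on the double slash only.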
The XML-DTD is derived from the above SGML structure.
Major modifications for XMLization are:
The XML-DTD is modularized for a more feasible DTD exchange. The modularization is based on the logical structure of the original SGML-DTD (see Figure 2).
ITSIG/stdex94.dtd (MATH, ARTWORK, Figure, Terminology)
+--- ITSIG/m12083.dtd (Formula)
+--- ITSIG/tl93a.dtd (Terminology)
+--- ITSIG/isonet10.dtd (ISONET)
+--- ITSIG/se9573.dtd (Entities)
| +--- ent9573/isolat1.ent, ent9573/isolat2.ent, ent9573/isonum.ent,
| +--- ent9573/isodia.ent, ent9573/isopub.ent, ent9573/isobox.ent,
| +--- ent9573/isotech.ent, ent9573/isogrk1.ent, ent9573/isogrk2.ent,
| +--- ent9573/isogrk3.ent, ent9573/isogrk4.ent, ent9573/isocyr1.ent,
| +--- ent9573/isocyr2.ent, ent9573/isoamsa.ent, ent9573/isoamsb.ent,
| +--- ent9573/isoamsc.ent, ent9573/isoamsn.ent, ent9573/isoamso.ent,
| +--- ent9573/isoamsr.ent, ent9573/isomfrk.ent, ent9573/isomopf.ent,
| +--- ent9573/isomscr.ent, ent9573/isocs.ent
+--- ITSIG/calstab.dtd (Table)
+--- ITSIG/stdb94.dtd (NOTATION, parameter entities, structure, displayed elements, terminology list, figure, table, formula)
All the XML-DTD module files are shown in Annex D. A file with the extension ".dtd" is a driver or a DTD module translated from the corresponding SGML-DTD file and a file with ".mod" is a new DTD file developed for the modularized XML-DTD representation.
(1) stdex.dtd [DTD Driver]
(2) stdex-model.mod [Model Module]
(3) stdex-profile.mod [Profile Module]
(4) isonet10.dtd [Isonet Module]
(5) se9573.dtd [Entity]
(6) stdex-base.mod [Base Element Module]
(7) stdex-notation.mod [Notation]
(8) stdex-tpage.mod [Title Page]
(9) stdex-lpage.mod [Last Cover Page]
(10) stdex-toc.mod [Table of Contents]
(11) stdex-index.mod [Index]
(12) stdex-foreword.mod [Foreword]
(13) stdex-intro.mod [Introduction]
(14) stdex-body.mod [Body]
(15) stdex-annex.mod [Annex]
(16) stdex-nest.mod [Nested Subdivisions]
(17) stdex-disp.mod [Displayed Components]
(18) stdex-tl-simple.mod [Terminology List Simple Module]
(19) stdex-tl.mod [Terminology List Module]
(20) stdex-inline.mod [Inline Components]
(21) stdex-artwork-simple.mod [Artwork Simple Module]
(22) stdex-artwork.mod [Artwork Module]
(23) stdex-ref.mod [Referential Components]
(24) stdex-float.mod [Float Components]
(25) stdex-figure-default.mod [Figure Default Module]
(26) stdex-figure.mod [Figure Module]
(27) stdex-table.mod [Table]
(28) calstab.dtd [Cals Table]
(29) stdex-specific.mod [Very Specific Components]
(30) stdex-math-simple.mod [Math Simple Module]
(31) stdex-math.mod [Math Module]
(32) stdex-math-extension.mod [Math Extension Module]
(33) stdex-tol.mod [Tolerance]
(34) stdex-chem.mod [Chemistry]
(35) stdex-listing.mod [Listing Module]
The referencing relationship of the modules is illustrated in Figure 3.
stdex.dtd [DTD Driver]
+--- stdex-model.mod [Model Module]
+--- stdex-profile.mod [Profile Module]
| +--- isonet10.dtd [Isonet Module]
| +--- se9573.dtd [Entity]
+--- stdex-base.mod [Base Element Module]
| +--- stdex-notation.mod [Notation]
| +--- stdex-tpage.mod [Title Page]
| +--- stdex-lpage.mod [Last Cover Page]
| +--- stdex-toc.mod [Table of Contents]
| +--- stdex-index.mod [Index]
| +--- stdex-foreword.mod [Foreword]
| +--- stdex-intro.mod [Introduction]
| +--- stdex-body.mod [Body]
| +--- stdex-annex.mod [Annex]
| +--- stdex-nest.mod [Nested Subdivisions]
| +--- stdex-disp.mod [Displayed Components]
| +--- stdex-tl.mod [Terminology List]
| +--- stdex-inline.mod [Inline Components]
| | +--- stdex-artwork.mod [Artwork]
| +--- stdex-ref.mod [Referential Components]
| +--- stdex-float.mod [Float Components]
| | +--- stdex-figure.dtd [Figure]
| | +--- stdex-table.dtd [Table]
| | +--- calstab.dtd [Cals Table]
| +--- stdex-specific.mod [Very Specific Components]
| | +--- stdex-math.mod [Math]
| | +--- stdex-math-extension.mod [Math Extension Module]
| | +--- stdex-tol.mod [Tolerance]
| | +--- stdex-chem.mod [Chemistry]
| +--- se9573.dtd [Entity]
+--- stdex-listing.mod [Listing Module]
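The referencing relationship can also be held as a nested structure and walked to obtain the order in which modules are first referenced. The Python sketch below covers only a subset of the modules and is an illustration; the dictionary layout and the function name reference_order are hypothetical.

```python
# A subset of the module tree, written as nested dictionaries (leaf = empty dict).
MODULES = {
    "stdex.dtd": {
        "stdex-model.mod": {},
        "stdex-profile.mod": {"isonet10.dtd": {}, "se9573.dtd": {}},
        "stdex-base.mod": {
            "stdex-notation.mod": {},
            "stdex-inline.mod": {"stdex-artwork.mod": {}},
            "se9573.dtd": {},
        },
        "stdex-listing.mod": {},
    }
}


def reference_order(tree: dict) -> list:
    """List every module once, in depth-first order of first reference."""
    seen = []

    def visit(node: dict) -> None:
        for name, children in node.items():
            if name not in seen:
                seen.append(name)
            visit(children)

    visit(tree)
    return seen


print(reference_order(MODULES))
```

The entity module se9573.dtd is referenced from two places in the tree but appears only once in the resulting list.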
NOTE The values of gtext attributes are generated by a particular processor. The generation and rendering of those attributes can be done, for example, by an XSL processor.
For a bibinfo description that is not based on ISONET, the following document profile is prepared.
This is a complete SGML document which contains the terminology from ISO/IEC Directives, Part 3, 1997.
Boiler-plate text..
This is an example for tagging of the English terminology from ISO/IEC Directives, Part 3, 1997.
Boiler-plate text..
For the purposes of this part of the ISO/IEC Directives, the terms and definitions given in ISO/IEC Guide 2 (some of which are repeated below for convenience) and the following apply.
document, established by consensus and approved by a recognized body, that provides, for common and repeated use, rules, guidelines or characteristics for activities or their results, aimed at the achievement of the optimum degree of order in a given context
[ISO/IEC Guide 2:1996, definition 3.2]
standard that is adopted by an international standardizing/standards organization and made available to the public
[ISO/IEC Guide 2:1996, definition 3.2.1.1]
publication for which the required support for approval as an International Standard cannot be obtained, or for which there is doubt on whether consensus has been achieved
publication of work still under technical development, or where for any other reason there is the future, but not immediate, possibility of agreement on an International Standard
informative publication containing collected data of a different kind from that which is normally published as an International Standard
publication containing material on general matters related to international standardization
those elements setting out the provisions to which it is necessary to conform in order to be able to claim compliance with the standard
those elements that identify the standard, introduce its content and explain its background, its development and its relationship with other standards
those elements that provide additional information intended to assist the understanding or use of the standard
element the presence of which in a standard is obligatory
element the presence of which in a standard is dependent on the provisions of the particular standard
expression in the content of a normative document, that takes the form of a statement, an instruction, a recommendation or a requirement
[ISO/IEC Guide 2:1996, definition 7.1]
provision that conveys information
[ISO/IEC Guide 2:1996, definition 7.2]
provision that conveys an action to be performed
[ISO/IEC Guide 2:1996, definition 7.3]
provision that conveys advice or guidance
[ISO/IEC Guide 2:1996, definition 7.4]
provision that conveys criteria to be fulfilled
[ISO/IEC Guide 2:1996, definition 7.5]
developed stage of technical capability at a given time as regards products, processes and services, based on the relevant consolidated findings of science, technology and experience
[ISO/IEC Guide 2:1996, definition 1.4]
This annex is the result of interpretation of the SGML-tagged example of terminology given in Annex B.
document, established by consensus and approved by a recognized body, that provides, for common and repeated use, rules, guidelines or characteristics for activities or their results, aimed at the achievement of the optimum degree of order in a given context
NOTE 1 Standards should be based on the consolidated results of science, technology and experience, and aimed at the promotion of optimum community benefits.
[ISO/IEC Guide 2:1996, definition 3.2]
standard that is adopted by an international standardizing/standards organization and made available to the public
[ISO/IEC Guide 2:1996, definition 3.2.1.1]
NOTE 1 International standards published by ISO and IEC are written with a capital "I" and "S", i.e. "International Standard".
publication for which the required support for approval as an International Standard cannot be obtained, or for which there is doubt on whether consensus has been achieved
NOTE 1 The content of a type 1 Technical Report, including its annexes, may include information that is of a normative nature, although the document itself is not of a normative nature.
publication of work still under technical development, or where for any other reason there is the future, but not immediate, possibility of agreement on an International Standard
NOTE 1 The content of a type 2 Technical Report, including its annexes, may include information that is of a normative nature, although the document itself is not of a normative nature.
informative publication containing collected data of a different kind from that which is normally published as an International Standard
NOTE 1 Such data may include, for example, data obtained from a survey carried out among the national bodies, data on work in other international organizations or data on the "state of the art" in relation to standards of national bodies on a particular subject.
publication containing material on general matters related to international standardization
those elements setting out the provisions to which it is necessary to conform in order to be able to claim compliance with the standard
those elements that identify the standard, introduce its content and explain its background, its development and its relationship with other standards
those elements that provide additional information intended to assist the understanding or use of the standard
element the presence of which in a standard is obligatory
element the presence of which in a standard is dependent on the provisions of the particular standard
expression in the content of a normative document, that takes the form of a statement, an instruction, a recommendation or a requirement
NOTE 1 These types of provision are distinguished by the form of wording they employ; e.g. instructions are expressed in the imperative mood, recommendations by the use of the auxiliary "should" and requirements by the use of the auxiliary "shall".
[ISO/IEC Guide 2:1996, definition 7.1]
provision that conveys information
[ISO/IEC Guide 2:1996, definition 7.2]
provision that conveys an action to be performed
[ISO/IEC Guide 2:1996, definition 7.3]
provision that conveys advice or guidance
[ISO/IEC Guide 2:1996, definition 7.4]
provision that conveys criteria to be fulfilled
[ISO/IEC Guide 2:1996, definition 7.5]
developed stage of technical capability at a given time as regards products, processes and services, based on the relevant consolidated findings of science, technology and experience
[ISO/IEC Guide 2:1996, definition 1.4]
See the attached file 1.
See the attached file 2.
See the attached file 3.