TITLE: | Report of the SC34 WG3 Meeting, Montreal, 1–4 August 2003 |
SOURCE: | Patrick Durusau, acting convenor |
PROJECT: | All WG3 projects |
PROJECT EDITOR: | All WG3 editors |
STATUS: | Working Group Report |
ACTION: | For information |
DATE: | 7 May 2003 |
DISTRIBUTION: | SC34 and Liaisons |
REFER TO: | Documents in internal references |
REPLY TO: | Dr. James David Mason (ISO/IEC JTC1/SC34 Chairman) Y-12 National Security Complex Bldg. 9113, M.S. 8208 Oak Ridge, TN 37831-8208 U.S.A. Telephone: +1 865 574-6973 Facsimile: +1 865 574-18964 Network: masonjd@y12.doe.gov http://www.y12.doe.gov/sgml/sc34/ ftp://ftp.y12.doe.gov/pub/sgml/sc34/ Mrs. Sara Desautels, ISO/IEC JTC 1/SC 34 Secretariat American National Standards Institute 25 West 43rd St - 4th floor New York, NY 10036 Tel: +1 212 642 4937 Fax: +1 212 840 2298 Email: sdesaute@ansi.org |
ISO/IEC JTC 1/SC34/WG3 met for four very productive days in Montreal, Canada, and tenders the following report of its activities:
A timetable for pending work was agreed upon by the WG and is set forth below.
Document | Responsible | Deadline |
---|---|---|
TMQL requirements 1.1 | Garshol | 2003-08-11 |
TMCL requirements & use cases | Moore, Nishikawa | 2003-08-11 |
CXTM requirements, final draft | Ahmed | 2003-08-18 |
ISO 13250-2 (DM), review draft | Garshol, Moore | 2003-09-01 |
ISO 13250-3 (XTM), committee draft | Garshol, Moore | 2003-09-15 |
ISO 13250-2 (DM), committee draft | Garshol, Moore | 2003-10-01 |
TMQL use cases, first draft | Barta | 2003-10-01 |
CXTM, editor's draft | Ahmed | 2003-10-15 |
RM requirements, next version | Durusau | 2003-10-15 |
ISO 13250, intro annex, editor's draft | Pepper, Naito-san | 2003-11-01 |
Query language survey | Garshol, Durusau | 2003-11-01 |
RM, editor's draft | Newcomb |
ISO 13250-2 and ISO 13250-3
Work to do: apply agreed issue resolutions from London and Montréal, rewrite to ISO style. Will be sent out for a brief review period (a few days), then sent to ISO as a committee draft for ballot.
TMQL documents
All three should be ready before the Philadelphia meeting, so that they can be used as a foundation for decisions about TMQL at that meeting, with the goal of having a real TMQL proposal in Q1 2004. The survey should list query languages, grouped by evaluation model, and list the main features supported, plus references to specs etc.
Editors need to write a first draft to be reviewed by the NBs, who should have comments ready for the Philadelphia meeting.
If this is only going to be an introduction to topic maps that does not actually standardize anything it cannot be a part of the standard, though it can be an annex. We agree that this is worthwhile to have in the standard, and so think it should become an annex rather than a full part.
(Added after the WG meeting; this does not constitute part of the WG activities.) Note: After this decision by the WG, Jim Mason confirmed that introductory material should appear in an annex and advised that ISO 8879 should be followed as a guide on this issue. ISO/IEC Directives Part 2: Rules for the structure and drafting of International Standards, Section 6.4.1.1 in particular, appears applicable to some of the content of the proposed part 1. It is suggested that, prior to the WG3 meeting in December, the editors of part 1 consider what material from the currently proposed part 1 meets ISO requirements to appear in part 1 and what should appear in an annex, and revise their proposal to WG3 accordingly. That could assist WG3 in resolving this issue at the meeting in December without further delay. (Added by Patrick Durusau, acting convenor for WG3 at this meeting.)
Proposed content:
Note: In all of the following where there are numbered steps it is sufficient to proceed only as far as the first step which results in a non-equal comparison. If all the steps are exhausted, the compared items are considered equal under the comparison algorithm.
Handling of NULL value
Ordering of Information Item Types and Basic Types
Comparison Algorithm For Strings
Comparison Algorithm For Collections
Comparison Order For Locator Items
Canonical Sort Order For Topic Items # SAM issue topic-identity-required.
Canonical Sort Order For TopicName Information Items # SAM issue items-parent-required
Canonical Sort Order For VariantName Information Items
Canonical Sort Order For Occurrence Information Items # SAM issue prop-parent
Canonical Sort Order For Association Information Items
Canonical Sort Order For AssociationRole Information Items
NOTE: Need examples that illustrate the different requirements.
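As an illustration of the note on numbered steps above, the following is a minimal Python sketch of a step-wise comparison that stops at the first step giving a non-equal result. The function names, the toy "name" items, and the two steps chosen (string value, then scope) are assumptions made for this example only; they are not the comparison steps that CXTM will define.

```python
from functools import cmp_to_key


def cmp(x, y):
    """Three-way comparison of two ordinary comparable values."""
    return (x > y) - (x < y)


def compare_by_steps(a, b, steps):
    """Run each comparison step in order.

    Stop at the first step that reports a difference. If all steps are
    exhausted without a difference, the items are considered equal.
    """
    for step in steps:
        result = step(a, b)
        if result != 0:
            return result
    return 0


# Hypothetical steps for a toy "name" item: compare the string value
# first, then the sorted list of scope identifiers.
name_steps = [
    lambda a, b: cmp(a["value"], b["value"]),
    lambda a, b: cmp(sorted(a["scope"]), sorted(b["scope"])),
]

names = [
    {"value": "Tosca", "scope": ["it"]},
    {"value": "Aida", "scope": ["it"]},
    {"value": "Tosca", "scope": ["en"]},
]

ordered = sorted(
    names,
    key=cmp_to_key(lambda a, b: compare_by_steps(a, b, name_steps)),
)
```

The second step is only consulted for the two "Tosca" names, because the first step already separates "Aida" from both of them.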
All reporting national bodies indicated there were no users that would be impacted by dropping HyTM from ISO 13250. Unfortunately, a report was not available from Canada so no decision is recommended at this time.
The meeting of WG3 for XML 2003 was discussed and it was concluded that it would take three (3) days to cover all the outstanding materials. To efficiently cover all the materials in the time allowed will require advance discussion and preparation by all participants. The estimated time for each segment and issues to be covered are as follows:
Whitespace is forbidden in URIs unless it is escaped.
When encountering a mergeMap only load the external topic map if that topic map has not been loaded before with the same set of added themes. For each XTM document, topicRef elements pointing to external documents for which there is no mergeMap reference in the same XTM document are considered to imply a mergeMap reference to that external document with no added themes.
XTM should follow RFC 2396.
The id of a member element only becomes a source locator if that member element only has a single player. If it has more than one player the id is ignored.
Keep the content model as it is.
This is not an error, instead it is treated as a reference to a non-existent topic.
Arbitrary XML markup is allowed inside the resourceData and baseNameString elements, and it must be preserved by XTM processors.
XTM processors are required to do namespace processing.
The #FIXED attributes will be declared as optional, but with specific required values, if given.
The RELAX-NG schema is the normative schema.
Full XPointer references are disallowed on all elements.
We don't ignore unknown elements.
XTM 1.1 uses the same namespace URI as XTM 1.0. Note that this is only acceptable so long as XTM 1.1 is backwards compatible with version 1.0.
A version attribute should be added to the topicMap element. If an XTM processor finds an XTM document in a version it does not support that is an error.
Add the parent property as a computed property.
Items are required to be reachable from the topic map item.
The name should be 'subject locator'.
XML-Data-Representation: the meeting decided to write up two different approaches, to be reviewed by the committee.
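The mergeMap resolution above can be sketched as a small bookkeeping structure. This is a hedged Python illustration, not processor code from any XTM implementation: the class and function names are hypothetical, URIs are compared as plain strings, and the caller is assumed to pass only topicRef values that already point to external documents.

```python
class MergeMapTracker:
    """Remember which (document, added themes) pairs have been loaded."""

    def __init__(self):
        self._loaded = set()

    def should_load(self, document_uri, added_themes=()):
        # The same document may be loaded again if the set of added
        # themes differs; the order of the themes does not matter.
        key = (document_uri, frozenset(added_themes))
        if key in self._loaded:
            return False
        self._loaded.add(key)
        return True


def implied_merge_maps(external_topic_refs, explicit_merge_map_uris):
    """Return documents that get an implied mergeMap with no added themes.

    external_topic_refs: hrefs of topicRef elements pointing outside the
    current XTM document (assumed to be pre-filtered by the caller).
    explicit_merge_map_uris: documents named by mergeMap elements in the
    same XTM document.
    """
    explicit = set(explicit_merge_map_uris)
    implied = []
    for ref in external_topic_refs:
        document = ref.split("#", 1)[0]  # drop the fragment identifier
        if document not in explicit and document not in implied:
            implied.append(document)
    return implied
```

A processor following this sketch would call `should_load(uri)` with no themes for each implied reference, so a document already merged without added themes is not loaded a second time.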
The following is drawn from slides prepared and reviewed by the WG3 group in Montreal.
Kal Ahmed also prepared a report on the issue of structured scope for the topic maps data model. That report is attached hereto as an appendix for the convenience of the reader.
Requirements document edited and issued by the WG
Sections 1, 2, and 3 were not reviewed at the meeting, being considered more appropriate for individual reading and commenting.
Section 4 was thought to be too general. Preferably it should be broken up into three use cases like the one in section 5, and each use case presented complete with ontology and example queries.
Section 4.1 was considered a good use case that offered something different from the other three.
Section 4.2 could perhaps be changed to "Visualization of arbitrary topic maps," and then be extended with specific queries stolen from TMNav and the Omnigator. Lars Marius Garshol volunteered to supply Omnigator queries.
Section 4.3 was considered too specialized, and should probably be changed to a different content generation use case. The Italian Opera Topic Map web site was suggested as one alternative. Lars Marius Garshol volunteered to supply the ontology and specific queries.
It was thought that the XTM making up the data should not be given verbatim in the document, but rather be made available through a link. The ontology would perhaps best be represented with a diagram showing the structure, or, if this is too difficult, as AsTMa= examples.
The list of queries in section 5.2 was thought to be very good in general. We still need more use cases and queries, but the diversity and range of the queries was considered very good. The number of queries for one use case was also thought to be about right.
In section 5.2 the results should be taken out, and instead specified in terms of the data model. For example, query 1 has a query result that is "a list of topic items".
Query 3 in 5.2 should be specified more precisely, and the same applies to most of the rest of the queries. What are the "heads" mentioned in query 5, and also in several of the later queries? A note should be added to query 11 that TMQL is unlikely to support this query. In query 14 there is a typo: "No are duplicates allowed".
Section 5.3 should be called "XML output". The DTDs should be cut, since they may be thought to be input to the query processor, which of course is not the case. Query 2 should be replaced with one that has namespaces and attributes, and where some of the attribute values are computed by the query. A third query with some conditional content should be added.
Section 5.4 should be called "topic map output" (see the updated requirements document for clarification). In query 1 the reference to fragments should be removed. We should also consider the relationship between TMQL and XTM Fragments more carefully.
The meeting was not sure what to make of this section, and decided to ask the author what the intent behind it was before forming any opinion on the section.
The hyphen in Ann Wrightson's name should be removed :-).
For more use cases and queries it might be worthwhile to look at the RDF query use cases and test cases and also the XML Query use cases.
The references section appears to be missing. Stylesheet bug?
The following discussion notes are rather cryptic without being present at the meeting or having access to the document under discussion. The discussion document was not available at the time of the posting of these notes.
TMCL - Use Cases and Requirements
2.1
2.2
SR2
SR3 data typing
SR4 by -> on. Also applies to general typing issues.
3.3 general : comment on the forcefulness of this section.
sr5 what can be scoping topic.
sr6 for example needs to be added. add 'no more than', 'topic map wide properties'
sr7 ditch.
sr8 keep and add Rnx. clearer definition of equality. refine last para to indicate that tmcl may create named complex conditions.
sr9 (sr10) permissive and (sr9) prescriptive validation. redo along lines of sr10.
sr11, sr12 ditch.
----------------------------
rnx tmcl shall define 2 levels (tmcl lite). One level makes use of tmql, the other level (tmcl lite) makes use of types for selectors.
the syntax of tmcl full extends the syntax of tmcl lite. A tmcl lite schema is also a tmcl full schema. There is more work here on defining tmcl lite.
graham shall encourage those tasked with actions from london to actually do something.
r1 re-phrase to say constraints upon instances of the TM DM.
rn1 tmcl will have a defined data model preferably the TM DM.
rn2 tmcl should enable alternative serialisations of a TMCL schema.
rn3 tmcl must define a) a syntax for expressing constraints b) a model for internal representation for these constraints, c) behaviour of TMCL validation.
r2 clarify use of violation. defn. causes an exception.
r3 composition of schema rather than 'merge'.
r4 relax must to should.
r7 a topic map.
r9 remove 'comparing and'
r10 change to type-hierarchy constraints. remove complex definitions.
r11 definitions changes to 'constraints on'
r12 thoughts : link between DSDL, TM DM issue for XML representation and TMQL basic types. - requirement to DSDL WG1 regarding use of DSDL in basic data typing. - tmcl should not specify which basic type system is used.
r12 XML Schema to Relax-NG note: try to colocate reqs under headings.
r13 re-word. use tm info items and literals.
r14 keep 1st sentence only.
r15 lose 2nd sentence.
r16 steal from LMG - same as TMQL.
R18 Anything retrievable by TMQL can be constrained.
r19 type. and re-write.
r20 sp. re-write to reference tmcl model.
New suggested requirement: ability to state evaluation heuristics against a set of constraints.
Scope of TMCL:
Contrasting TMRM and TMCL merging:
tm1 (references ->) tmcl_schema1
Requirement:
Process Issues:
This document was prepared to summarise the results of an informal working group within the ISO/IEC SC34 WG3 group. The group was given the task of reviewing the existing use of scope in ISO 13250:2003 and making a proposal for how scope should be treated in the next version of the standard. The primary context for this work was as input to the development of ISO 13250-2, Topic Maps Data Model but the working group also considered the requirements of and possibilities presented by the development of XTM (ISO 13250-3), TMQL (ISO 18408) and TMCL (ISO 19756).
The working group found that because the semantics of scope are underspecified in the existing version of ISO 13250, a number of interpretations have been applied by topic map users. These differ in the way in which a scope of a characteristic is evaluated against an application context and in the way in which a scoped characteristic may be treated as a result of such an evaluation.
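The common model for evaluating the scope of a characteristic is to define a process which takes as its input the set of topics which define the scope and a set of topics which define a user context against which the scope is to be evaluated. The process then applies simple set operations to determine whether the characteristic is in-scope or out-of-scope. The three most common processing algorithms are:
ALL: the characteristic is in-scope only if every topic in the scope is also in the user context.
ANY: the characteristic is in-scope if at least one topic in the scope is also in the user context.
EXACT: the characteristic is in-scope only if the scope and the user context contain exactly the same topics.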
For clarity, the following table shows the way in which a scoped characteristic would be evaluated against a user context in each of these three mechanisms.
Scope (input) | Context (input) | ALL | ANY | EXACT |
---|---|---|---|---|
A,B | A | out-of-scope | in-scope | out-of-scope |
A,B | A,B | in-scope | in-scope | in-scope |
A,B | A,B,C | in-scope | in-scope | out-of-scope |
A,B | C | out-of-scope | out-of-scope | out-of-scope |
A,B | EMPTY | out-of-scope | out-of-scope | out-of-scope |
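
The rows of the table can be reproduced with plain set operations. The following Python sketch is one reading of the three algorithms as inferred from the table itself; the function name `in_scope` is an assumption made for this example. The table does not cover an empty scope, so the sketch makes no claim about that case.

```python
def in_scope(scope, context, algorithm):
    """Evaluate a scope against a user context.

    scope and context are iterables of topic identifiers; algorithm is
    one of "ALL", "ANY" or "EXACT", as named in the table.
    """
    scope, context = set(scope), set(context)
    if algorithm == "ALL":
        # Every scoping topic must be present in the context.
        return scope <= context
    if algorithm == "ANY":
        # At least one scoping topic must be present in the context.
        return bool(scope & context)
    if algorithm == "EXACT":
        # Scope and context must contain exactly the same topics.
        return scope == context
    raise ValueError(f"unknown algorithm: {algorithm!r}")


# The five rows of the table, as (scope, context) pairs.
rows = [
    ({"A", "B"}, {"A"}),
    ({"A", "B"}, {"A", "B"}),
    ({"A", "B"}, {"A", "B", "C"}),
    ({"A", "B"}, {"C"}),
    ({"A", "B"}, set()),
]

results = [
    tuple(in_scope(s, c, alg) for alg in ("ALL", "ANY", "EXACT"))
    for s, c in rows
]
```

Each tuple in `results` holds the ALL, ANY and EXACT outcome for one row, with `True` standing for in-scope.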
The committee found that in practice, each of these approaches has benefits and drawbacks and that for maximally reliable interchange, a topic map author should be able to declare the model to be applied to the scopes that they create.
The committee found that while there is common agreement that a characteristic that is in scope is considered to be a valid statement about the subject represented by the topic that it is a characteristic of, there is not the same clarity for out-of-scope characteristics.
An out-of-scope characteristic may be treated under a negation or a no negation model. Under the negation model the processor infers for an out-of-scope characteristic that the inverse is true, so if topic T has characteristic C which is out-of-scope, this model would infer (NOT C) applies to T. Under the no negation model, the processor infers only that the characteristic does not apply.
The committee makes the following recommendations.
The Topic Maps Data Model must make it clear that when a characteristic is out of scope, the application should infer only that the characteristic does not apply and should not infer that the statement is negated.
The topic map author should be allowed to create a scope as a logical expression consisting of topics composed by AND and OR operators only.
The Topic Maps Data Model must define a processing model for evaluating a user context against a scope to determine if the scoped characteristic is in-scope or out-of-scope. This processing model may allow scope to be evaluated against either:
While the former option is the easier to implement, the latter option provides the greatest expressive power in the processing of scopes.
When scopes are compared for equality, they must be compared in a normalised form. The Topic Maps Data Model must define the algorithm for this normalisation process.
The following recommendations are made with regard to the XTM syntax:
The content model of the scope element must be altered. The scope element content model must allow references to topics and other nested scope elements to be freely mixed. The scope element must have an optional attribute used to specify the compositor used for the list of child elements (AND or OR).
<occurrence>
  <scope compositor="AND">
    <topicRef xlink:href="english"/>
    <scope compositor="OR">
      <topicRef xlink:href="beginner"/>
      <topicRef xlink:href="intermediate"/>
    </scope>
  </scope>
  <resourceRef xlink:href="...."/>
</occurrence>
The topicMap element should have an additional, optional attribute to specify the default compositor for a scope which consists of a list of themes only. This compositor would then be applied to all scope elements that have no explicitly set compositor.
This means that the following would be equivalent to the example given above:
<topicMap compositorDefault="AND">
  <topic>
    ...
    <occurrence>
      <scope>
        <topicRef xlink:href="english"/>
        <scope compositor="OR">
          <topicRef xlink:href="beginner"/>
          <topicRef xlink:href="intermediate"/>
        </scope>
      </scope>
      <resourceRef xlink:href="...."/>
    </occurrence>
    ...
  </topic>
</topicMap>
Because the default would apply to any scope element with no explicitly set compositor, the following would be equivalent to both of the preceding examples.
<topicMap compositorDefault="OR">
  <topic>
    ...
    <occurrence>
      <scope compositor="AND">
        <topicRef xlink:href="english"/>
        <scope>
          <topicRef xlink:href="beginner"/>
          <topicRef xlink:href="intermediate"/>
        </scope>
      </scope>
      <resourceRef xlink:href="...."/>
    </occurrence>
    ...
  </topic>
</topicMap>
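The equivalence of the three forms can be checked with a small evaluator. The Python sketch below rests on several assumptions that this report does not settle: a theme is treated as satisfied when its href appears in the user context, hrefs are compared as plain strings, the xlink namespace is declared on the fragment so that it parses, and the function name `evaluate_scope` is hypothetical. The processing model itself remains for the Topic Maps Data Model to define.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def evaluate_scope(scope_el, context, default_compositor):
    """Evaluate a (possibly nested) scope element against a user context.

    A scope without a compositor attribute uses default_compositor,
    mirroring the proposed compositorDefault attribute on topicMap.
    """
    compositor = scope_el.get("compositor", default_compositor)
    results = []
    for child in scope_el:
        if child.tag == "topicRef":
            # Assumption: a theme holds when it is in the user context.
            results.append(child.get(XLINK_HREF) in context)
        elif child.tag == "scope":
            results.append(evaluate_scope(child, context, default_compositor))
    if compositor == "AND":
        return all(results)
    if compositor == "OR":
        return any(results)
    raise ValueError(f"unknown compositor: {compositor!r}")


def scope_xml(outer_attr, inner_attr):
    """Build the scope fragment from the examples with given attributes."""
    return ET.fromstring(
        f'<scope xmlns:xlink="http://www.w3.org/1999/xlink"{outer_attr}>'
        '<topicRef xlink:href="english"/>'
        f"<scope{inner_attr}>"
        '<topicRef xlink:href="beginner"/>'
        '<topicRef xlink:href="intermediate"/>'
        "</scope></scope>"
    )


# (scope element, default compositor) for the three forms shown above.
variants = [
    (scope_xml(' compositor="AND"', ' compositor="OR"'), "AND"),
    (scope_xml("", ' compositor="OR"'), "AND"),
    (scope_xml(' compositor="AND"', ""), "OR"),
]

contexts = [
    {"english", "beginner"},
    {"english"},
    {"beginner", "intermediate"},
    {"english", "intermediate"},
]

outcomes = [
    [evaluate_scope(el, ctx, default) for el, default in variants]
    for ctx in contexts
]
```

Under these assumptions each inner list in `outcomes` contains three identical values, which is what the equivalence of the three examples requires.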
The following issues have not yet been addressed by the group.
Consider the following topic:
<topic>
  <baseName>
    <scope>
      <topicRef xlink:href="#A"/>
    </scope>
    <baseNameString>foo</baseNameString>
  </baseName>
  <baseName>
    <scope compositor="OR">
      <topicRef xlink:href="#A"/>
      <topicRef xlink:href="#B"/>
    </scope>
    <baseNameString>foo</baseNameString>
  </baseName>
</topic>
The two base names do not have the same scope, but the first is logically redundant because whenever it applies, the second also applies. In this case, should the first name be removed?
Author's Note: My feeling is that duplicate elimination rules should be concerned with the elimination of exact duplicates only and not with the elimination of redundant characteristics as this could lead to hard-to-explain side-effects for topic map authoring processes.