TITLE: | Topic Map Constraint Language |
SOURCE: | Mr. Graham Moore; Ms. Mary Nishikawa; Mr. Dmitry Bogachev |
PROJECT: | CD 19756: Information Technology - Topic Maps - Constraint Language (TMCL) |
PROJECT EDITOR: | Mr. Dmitry Bogachev; Mr. Graham Moore; Ms. Mary Nishikawa |
STATUS: | Editors' Draft |
ACTION: | For review and comment |
DATE: | 2004-10-16 |
DISTRIBUTION: | SC34 and Liaisons |
REFER TO: | N0548 - 2004-10-16 - Topic Map Constraint Language (TMCL) Requirements and Use Cases |
REPLY TO: |
Dr. James David Mason (ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada) Crane Softwrights Ltd. Box 266, Kars, ON K0A-2E0 CANADA Telephone: +1 613 489-0999 Facsimile: +1 613 489-0995 Network: jtc1sc34@scc.ca http://www.jtc1sc34.org |
Topic Map Constraint Language [TMCL] provides a means to express constraints on topic maps conforming to ISO/IEC 13250:2000 [13250]; these will be over and above the constraints currently defined in the Topic Map Data Model [DM].
TMCL is designed to allow users to constrain any aspect of the topic map data model [DM]. TMCL adopts TMQL [TMQLreq] as a means to express both the topic map constructs to be constrained and the topic map structures that must exist in order for the constraint to be met. TMCL defines two languages, TMCL-Schema and TMCL-Rule. TMCL-Schema provides a type-based model of constraints and is defined in terms of a more abstract model, TMCL-Model. TMCL-Rule provides a generalised model of constraint based on TMQL. For each language a model, semantics and syntax are defined.
Both TMCL-Rule and TMCL-Schema define sets of constraints. In general these constraints consist of terms that identify the parts of the topic map to be constrained and terms that define the predicate, or truth, that must hold for the topic map to be considered consistent.
TMCL-Rule and TMCL-Schema are used to constrain instances of the Topic Map Data Model. If the topic map is valid with respect to the constraints being tested then validation is said to have succeeded. More formally it can be said that:
Given:
  TopicMap : tm1
  Schema : sc1
Then:
  Validate(tm1, sc1) => (true, notifyItem*) | (false, conflictItem+, notifyItem*)
[NOTE 1] We might want to return a topic map here rather than true or false.
[gdm] Maybe, but not convincing. We should provide machinery that the result data model can be expressed as a topic map but not mandate it.
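The shape of the Validate result given above can be illustrated with a minimal Python sketch. It is not part of the draft: the names `ValidationResult`, `validate`, and the representation of a schema as a list of check functions are assumptions made only for illustration. The sketch shows that a successful validation carries zero or more notify items, while a failed one carries at least one conflict item.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ValidationResult:
    valid: bool
    conflicts: List[str] = field(default_factory=list)      # conflictItem+ when valid is False
    notifications: List[str] = field(default_factory=list)  # notifyItem* in both cases


# Hypothetical constraint shape: returns ("conflict" | "notify", message) or None.
Check = Callable[[dict], Optional[Tuple[str, str]]]


def validate(topic_map: dict, schema: List[Check]) -> ValidationResult:
    """Run every check; the topic map is valid only if no conflict was produced."""
    conflicts: List[str] = []
    notifications: List[str] = []
    for check in schema:
        outcome = check(topic_map)
        if outcome is None:
            continue
        kind, message = outcome
        (conflicts if kind == "conflict" else notifications).append(message)
    return ValidationResult(not conflicts, conflicts, notifications)


# Illustrative checks (assumed, not taken from the draft).
def needs_topics(tm: dict):
    return None if tm["topics"] else ("conflict", "topic map has no topics")


def reports_size(tm: dict):
    return ("notify", f"{len(tm['topics'])} topics checked")


failed = validate({"topics": []}, [needs_topics, reports_size])
passed = validate({"topics": ["puccini"]}, [needs_topics, reports_size])
```

Here `failed` corresponds to the `(false, conflictItem+, notifyItem*)` branch and `passed` to the `(true, notifyItem*)` branch.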
The Validate function is defined as follows:
The following model constructs describe the constraint language that underlies the TMCL-Schema section of 2.3. The model is split into three parts: the set of predicate types used to identify the thing to be constrained (TMQL core predicate set), the set of constraints for different TMDM constructs (extended TMQL predicates), and the schema constructs that use predicates and constraints to define a complete schema.
[gdm] The format of the following needs fixing.
The predicates define which TMDM construct is to be constrained. The constructs here are a set of TMQL core predicates, designed so that they can be composed together to form complex expressions. There is no OR or NOT. TMCL needs to define an extra schema rule for OR; NOT needs some consideration.
[ISSUE 1] Are the TMQL core predicates built into TMCL?
[mn] See section 4 in A Proposed Foundational Model for Topic Maps. Lars Marius Garshol. ISO/IEC JTC 1/SC 34N0529.
[NOTE 2] The set of TMQL core predicates is designed so that the predicates can be composed together to form complex expressions. There is no OR or NOT. TMCL needs to define an extra schema rule for OR. NOT needs some consideration.
topic-predicate(loc-predicate* srclocs, loc-predicate* resrefs, loc-predicate* subjinsd, basename-predicate* names, occurrence-predicate* occurrences, type-predicate* types)
loc-predicate(RegEx match-rule)
type-predicate(topic-predicate type)
basename-predicate(type-predicate type, scope-predicate scope, value-predicate value, variant-predicate variants*)
variant-predicate(scope-predicate scope, value-predicate value)
occurrence-predicate(type-predicate type, scope-predicate scope, value-predicate value | loc-predicate resref)
scope-predicate(topic-predicate* topics)
association-predicate(role-predicate* roles, type-predicate type, scope-predicate scope)
role-predicate(topic-predicate role-type, topic-predicate role-player)
value-predicate(RegEx match-rule)
[mn] There is value-predicate(RegEx match-rule) and loc-predicate(RegEx match-rule). Would the RegEx match-rule be different for these? Needs further explanation.
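How such predicates compose can be sketched in Python. This is only an illustration under stated assumptions: topics are modelled as plain dictionaries, the key names (`types`, `subjinds`) are hypothetical, and a loc-predicate is assumed to hold when at least one locator fully matches the regular expression. The draft does not fix any of these details. Composition is conjunction only, reflecting the absence of OR and NOT.

```python
import re
from typing import Callable, Dict, List

Topic = Dict[str, List[str]]
Predicate = Callable[[Topic], bool]


def loc_predicate(key: str, match_rule: str) -> Predicate:
    """Assumed semantics: true when some locator under `key` fully matches match_rule."""
    pattern = re.compile(match_rule)
    return lambda topic: any(pattern.fullmatch(loc) for loc in topic.get(key, []))


def type_predicate(type_id: str) -> Predicate:
    """True when the topic is an instance of the given type."""
    return lambda topic: type_id in topic.get("types", [])


def topic_predicate(*parts: Predicate) -> Predicate:
    """Conjunction only: every component predicate must hold (no OR, no NOT)."""
    return lambda topic: all(part(topic) for part in parts)


is_composer = topic_predicate(
    type_predicate("composer"),
    loc_predicate("subjinds", r"http://example\.org/psi/.*"),
)

puccini = {"types": ["composer"], "subjinds": ["http://example.org/psi/puccini"]}
tosca = {"types": ["opera"], "subjinds": ["http://example.org/psi/tosca"]}

selected = [is_composer(puccini), is_composer(tosca)]
```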
The following set of constraint constructs utilise the predicates defined above to complete the constraint model.
topic-constraint(loc-constraint* srclocs, loc-constraint* resrefs, loc-constraint* subjinsd, basename-constraint* names, occurrence-constraint* occurrences, type-constraint* types, topic-predicate* one-of, topic-predicate same-topic-as, play-role-constraint playsrole, role-constraint typesrole)
loc-constraint(RegEx match-rule, Int maxcard, Int mincard)
value-constraint(RegEx match-rule)
type-constraint(topic-predicate type)
basename-constraint(type-predicate type, scope-constraint scope, value-constraint value, variant-constraint variants*, Int cardMin, Int cardMax, String* OneOf)
variant-constraint(scope-constraint scope, value-constraint value)
occurrence-constraint(type-predicate type, scope-constraint scope, value-constraint value | loc-constraint resref Int cardMin, Int cardMax, String* OneOf)
scope-constraint(topic-predicate* topics)
association-constraint(role-constraint* roles, type-constraint type, scope-constraint)
role-constraint(topic-predicate role-type, topic-predicate* all-players-from, topic-predicate some-players-from, topic-predicate one-of, Int cardMin, Int cardMax)
play-role-constraint(topic-predicate role-type, topic-predicate association-type, scope-constraint scope, Int cardMin, Int cardMax, role-constraint other-players)
The following schema definitions allow constraints to be constructed consisting of predicates for selection and constraints for assertion.
topic-schema(topic-predicate, topic-constraint)
association-schema(association-predicate, association-constraint)
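The pairing of a predicate for selection with a constraint for assertion can be sketched as follows. This Python fragment is an assumption-laden illustration, not the draft's definition: topics are dictionaries, `LocConstraint` and `TopicSchema` are hypothetical class names, and the cardinality bounds are assumed to count the locators that match the regular expression.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

Topic = Dict[str, List[str]]


@dataclass
class LocConstraint:
    locators_key: str  # hypothetical key naming the locator list to inspect
    match_rule: str
    mincard: int
    maxcard: int

    def violations(self, topic: Topic) -> List[str]:
        """Assumed semantics: count locators that fully match, then check the bounds."""
        matching = [
            loc for loc in topic.get(self.locators_key, [])
            if re.fullmatch(self.match_rule, loc)
        ]
        if self.mincard <= len(matching) <= self.maxcard:
            return []
        return [
            f"{self.locators_key}: expected {self.mincard}..{self.maxcard} "
            f"matching locators, found {len(matching)}"
        ]


@dataclass
class TopicSchema:
    predicate: Callable[[Topic], bool]  # selects the topics to be constrained
    constraint: LocConstraint           # asserted on every selected topic

    def check(self, topics: List[Topic]) -> List[str]:
        problems: List[str] = []
        for topic in topics:
            if self.predicate(topic):
                problems.extend(self.constraint.violations(topic))
        return problems


topics = [
    {"types": ["composer"], "subjinds": ["http://example.org/psi/puccini"]},
    {"types": ["composer"], "subjinds": []},
    {"types": ["opera"], "subjinds": []},
]
schema = TopicSchema(
    predicate=lambda t: "composer" in t["types"],
    constraint=LocConstraint("subjinds", r"http://example\.org/psi/.+", 1, 1),
)
problems = schema.check(topics)
```

Only the second composer is reported; the opera topic is never tested because the predicate does not select it.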
[gdm] I really thought that we could simplify this further, but the model is so big that this seems like the minimum of constructs required. A reference model would greatly aid the process here.
This proposal is a variant of Ontopia Schema Language [OSL] and TMSchema [TMS].
[gdm] I see this more and more as one possible syntax, and think that the model above is both clearer and more expressive than what follows.
The following definition is a language that can be used to constrain classes of topics and associations.
Used to group together a collection of constraints.
TopicMapSchema:
  TopicSchema *
  AssociationSchema *
  SameTopicAs TMQLExp*
Topic Identification is used to identify exactly 1 topic.
TopicIdentification:
  SrcLocators # URI *
  SubjectIndicator # URI *
  SubjectAddress # URI
  TMQLExp # String
TopicSet: TopicIdentification *
TopicSchema:
  Type # TopicIdentification
  SubjectAddressSchema
  SubjectIndicatorSchema*
  BaseNameSchema *
  InternalOccurrenceSchema *
  ExternalOccurrenceSchema *
  OneOfSchema?
  SameTopicAsSchema?
  PlayRoleSchema *
  RoleSchema *
Standardized mechanism for capturing additional merging rules.
SameTopicAsSchema:
  Matches TMQLExp
  // select $A where connected-to($A, $area), connected-to($this, $area), $A\=$this,
  //   gets-benefits($A, $lots), gets-benefits($this, $lots), $B\=$this,
  //   has-doctor($A, ?Y) has-doctor($this, ?Y) $C\=$this
  // $this is already bound.
Constrains the cardinality and shape of subject indicator locator.
SubjectIndicatorSchema:
  cardMin # Integer
  cardMax # Integer
  match # Regular Expression
Constrains the cardinality and shape of subject address locator.
SubjectAddressSchema:
  cardMin # Integer
  cardMax # Integer
  match # Regular Expression
Constrains topic names.
BaseNameSchema:
  type # TopicIdentification
  scope # TopicSet
  cardMin # Integer
  cardMax # Integer
  dataType # xsd and custom xml schemas
  one of # String*
  match # Regular Expression*
Constrains internal occurrences.
InternalOccurrenceSchema:
  type # TopicIdentification
  scope # TopicSet
  cardMin # Integer
  cardMax # Integer
  dataType # xsd and custom schemas
  one of # String*
  match # Regular Expression *
Constrains external occurrences.
ExternalOccurrenceSchema:
  type # TopicIdentification
  scope # TopicSet
  cardMin # Integer
  cardMax # Integer
  one of # URI*
  match # Regular Expression *
Constrains classes of association.
AssociationSchema:
  type # TopicIdentification
  scope # TopicSet
  RoleSchema+
Constrains the nature of roles on associations of specific types.
RoleSchema:
  roleType # TopicIdentification
  cardMin # Integer
  cardMax # Integer
  allPlayersFrom # TopicSet // list of Types
  somePlayersFrom # TopicSet // list of Types
  oneOf # TopicSet // list of topics
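One possible reading of RoleSchema is sketched below in Python. The draft does not yet define the semantics, so everything here is an assumption: an association is modelled as a list of (role type, player) pairs, `cardMin` and `cardMax` are taken to bound the number of roles of the given type in one association, and `allPlayersFrom` is taken to mean that every player must be an instance of at least one listed type. `somePlayersFrom` and `oneOf` are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class RoleSchema:
    role_type: str
    card_min: int
    card_max: int
    # Allowed player types; an empty set is treated as "unconstrained" (assumption).
    all_players_from: Set[str] = field(default_factory=set)


def check_roles(
    roles: List[Tuple[str, str]],
    topic_types: Dict[str, Set[str]],
    schema: RoleSchema,
) -> List[str]:
    """Check one association's roles against a single RoleSchema."""
    players = [player for role_type, player in roles if role_type == schema.role_type]
    problems: List[str] = []
    if not schema.card_min <= len(players) <= schema.card_max:
        problems.append(
            f"role {schema.role_type}: expected {schema.card_min}..{schema.card_max} "
            f"players, found {len(players)}"
        )
    if schema.all_players_from:
        for player in players:
            if not topic_types.get(player, set()) & schema.all_players_from:
                problems.append(f"{player} is not an instance of any allowed type")
    return problems


# Illustrative data (hypothetical topic identifiers).
roles = [("composer", "puccini"), ("work", "tosca"), ("work", "milan")]
topic_types = {"puccini": {"person"}, "tosca": {"opera"}, "milan": {"city"}}
problems = check_roles(roles, topic_types, RoleSchema("work", 1, 1, {"opera"}))
```

The example yields two problems: the cardinality of the `work` role is exceeded, and `milan` is not an instance of `opera`.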
Constrains the nature of participation in associations.
PlayRoleSchema:
  roleType # TopicIdentification
  associationType # TopicIdentification
  scope # TopicSet
  cardMin # Integer
  cardMax # Integer
  otherPlayers # RoleSchema*
OneOf is used to define a controlled vocabulary.
OneOfSchema: one-of # TopicSet
[gdm] Once we agree the model looks ok and TMCL Schema is more solid we need to define this.
To be done...
TMCL-Rule allows a set of assertions about topic maps to be declared. It is a rule-based language that leverages TMQL constructs for specifying conditions and assertions.
Note: Relation to Schematron. TMCL-Rule is close to ISO/IEC 19757-3 (Document Schema Definition Languages (DSDL) - Part 3: Rule-based validation - Schematron) [DSDL]. Schematron allows validation rules to be defined for XML documents. TMCL-Rule leverages experience from other rule-based languages and allows constraints to be specified based on TMDM.
The RuleSchemaItem collects together a set of rules that can be used to validate a topic map. There is exactly one schema information item in each information set.
RuleSchemaItem:
  ID # defines schema ID
  Name? # defines schema Name
  RuleItem* # set of rules
  DiagnosticItem* # provides more specific details for assertions and reports
The RuleItem defines a set of assertions about a topic map. The RuleItem consists of an optional context item, optional let items and one or more assertion and/or report items.
RuleItem:
  ID # defines rule ID
  Name? # defines rule Name
  ContextItem? # locates topic map data model information items to be constrained
  LetItem* # introduces local variables which can be used in assertions and report items
  AssertItem* # if test is negative AssertItem generates ConflictItem
  ReportItem* # if test is positive ReportItem generates NotifyItem
The ContextItem is used to locate the topic map data model information items to be constrained. It allows assertions to be expressed in the form "forevery X,Y... where P(X,Y...) satisfies Q(X,Y,...)". Variables defined in the ContextItem can be used in LetItems, AssertItems and ReportItems. The ContextItem is optional. If a rule does not have a ContextItem then its assertions are evaluated in the context of the full topic map.
ContextItem:
  ForEvery+ # list of variables
  Where # TMQL predicate expression with free variables from the ForEvery list
The LetItem introduces a local variable that can be used in AssertItems and ReportItems.
LetItem:
  Variable # variable which receives value
  Where # TMQL predicate expression which generates value
If a rule has a ContextItem then an AssertItem is an assertion about the topic map information items located by the ContextItem. In this case the assertion can use variables defined in the ContextItem. If a rule does not have a ContextItem, its assertions are evaluated in the context of the full topic map. If the test is negative, the AssertItem generates a ConflictItem.
AssertItem:
  Test # TMQL expression which can include variables from ContextItem and LetItems and returns true or false
  Message # string which can include variables (and simple path expressions) from ContextItem and LetItems
  Diagnostics # list of DiagnosticItem IDs, used for detailed notification
Note 1:
Rules without a ContextItem allow constraints defined on the
full topic map to be expressed.
Example 1: The topic map must have more than 20 topics of type "musician".
Example 2: The topic map must have a topic for a composer who was born in Milan.
Note 2:
If a constraint can be formulated in the form "forevery X,Y...
where P(X,Y...) satisfies Q(X,Y,...)", the preferable form of the rule includes
an explicit ContextItem.
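The two rule forms described in these notes can be sketched in Python. This is an illustration only: Python callables stand in for TMQL expressions, the topic map is a plain dictionary, and the names `Rule`, `evaluate`, and the `born_in` key are hypothetical. A rule without a context is tested once against the full topic map; a rule with a context is tested once per variable binding, and each negative test generates a ConflictItem.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional

Binding = Dict[str, dict]


@dataclass
class ConflictItem:
    rule_id: str
    context_binding: Binding
    message: str


@dataclass
class Rule:
    rule_id: str
    test: Callable[[dict, Binding], bool]  # stand-in for the TMQL Test expression
    message: str                           # may refer to bound variables
    context: Optional[Callable[[dict], Iterable[Binding]]] = None  # ForEvery ... Where


def evaluate(rule: Rule, topic_map: dict) -> List[ConflictItem]:
    """Generate one ConflictItem per binding for which the test is negative."""
    bindings = rule.context(topic_map) if rule.context else [{}]
    return [
        ConflictItem(rule.rule_id, binding, rule.message.format(**binding))
        for binding in bindings
        if not rule.test(topic_map, binding)
    ]


topic_map = {
    "topics": [
        {"id": "puccini", "types": ["composer", "musician"], "born_in": "lucca"},
        {"id": "verdi", "types": ["composer", "musician"]},
    ]
}

# No ContextItem: the assertion concerns the full topic map (Example 1).
many_musicians = Rule(
    "r1",
    test=lambda tm, b: sum("musician" in t["types"] for t in tm["topics"]) > 20,
    message="topic map has 20 or fewer musicians",
)

# With ContextItem: forevery X where X is a composer, X satisfies "has a birthplace".
composer_birthplace = Rule(
    "r2",
    test=lambda tm, b: "born_in" in b["X"],
    message="composer {X[id]} has no birthplace",
    context=lambda tm: ({"X": t} for t in tm["topics"] if "composer" in t["types"]),
)

global_conflicts = evaluate(many_musicians, topic_map)
context_conflicts = evaluate(composer_birthplace, topic_map)
```

The context form reports which binding failed, which is one reason the explicit ContextItem is the preferable form when a constraint can be stated per item.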
If a rule has a ContextItem then a ReportItem is an assertion about the topic map information items located by the ContextItem. In this case the assertion can use variables defined in the ContextItem. If a rule does not have a ContextItem, its report assertions are evaluated in the context of the full topic map. If the test is positive, the ReportItem generates a NotifyItem.
ReportItem:
  Test # TMQL expression which can include variables from ContextItem and LetItems and returns true or false
  Message # string which can include variables (and simple path expressions) from ContextItem and LetItems
  Diagnostics # list of DiagnosticItem IDs (with 0 or more parameters), used for detailed notification
A DiagnosticItem provides more specific details for assertion and report notifications. A DiagnosticItem can include variables and simple path expressions. Variables receive values when the diagnostic item is called during rule evaluation. A DiagnosticItem can also provide recommendations for conflict resolution.
DiagnosticItem:
  Parameter* # list of variables
  Message # string scoped by language which can include variables and simple path expressions
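A DiagnosticItem with parameters and language-scoped messages might be modelled as in the following Python sketch. It is an assumption, not the draft's model: `string.Template` placeholders stand in for variables, language scoping is reduced to a dictionary keyed by language code, and simple path expressions are not covered.

```python
from dataclasses import dataclass
from string import Template
from typing import Dict, List


@dataclass
class DiagnosticItem:
    parameters: List[str]     # variables that must be bound when the item is called
    messages: Dict[str, str]  # language code -> message template (assumed representation)

    def render(self, language: str, **values: str) -> str:
        """Bind the parameters and produce the message in the requested language."""
        missing = [name for name in self.parameters if name not in values]
        if missing:
            raise ValueError(f"unbound diagnostic parameters: {missing}")
        return Template(self.messages[language]).substitute(values)


missing_birthplace = DiagnosticItem(
    parameters=["topic"],
    messages={"en": "Add a born-in association for $topic."},
)
text = missing_birthplace.render("en", topic="verdi")
```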
ConflictItem:
  RuleID # reference to rule which generates conflict
  TestMessage # string representing Test item from assertion
  ContextBinding* # defines binding for variables from ContextItem
  Message # string
  DiagnosticMessage* # string
NotifyItem:
  RuleID # reference to rule which generates report
  TestMessage # string representing Test item from assertion
  ContextBinding* # defines binding for variables from ContextItem
  Message # string
  DiagnosticMessage* # string
[gdm] To be done.
TMCL-Rule and TMCL-Schema expressions can be combined in the same TMCL schema. It is also possible to insert rules inside type descriptions; in this case the rules have a simplified syntax.
Schema Merging: Given two topic map schemas that constrain topics of the same type, the components of the merged schema are the union of all the schema components.
Explicit notion of conflict. Leave resolution open.
Define notion of conflict explicitly for each schema construct.
Need use cases for schema merging: ANDing, ORing, or some (selective) combination.
Connection from TM to Schema
SuperClass Subclass in Topic Map, constraints in schema
Dependency issues.
'Informative' connection between tm and schema.
Schema for connections between map topic and schema
Separate doc on how to connect them. Annex?
[mn] Seems too important for an annex.
[gdm] To be done.
TMCL enables topic map authors to specify a schema to which the topic map conforms. This is achieved by reifying the topic map with a topic and assigning to that topic an occurrence of type TMCLSchemaReference.
The following PSI is used to denote the occurrence type for schema references.
http://www.isotopicmaps.org/tmcl/tmcl.html#TMCLSchemaReference
This is used to type an occurrence on a topic that reifies the topic map in order to reference the schema for this topic map instance. The value of the occurrence must reference a valid TMCL XML representation or a topic of type Schema.
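Locating the schema reference from a topic map can be sketched as below. The PSI is the one given above; the dictionary layout (`reifier`, `occurrences`, `type`, `value`) and the example schema URL are assumptions made for illustration, not a defined interchange form.

```python
from typing import List

SCHEMA_REFERENCE_PSI = "http://www.isotopicmaps.org/tmcl/tmcl.html#TMCLSchemaReference"


def schema_references(topic_map: dict) -> List[str]:
    """Return the values of all TMCLSchemaReference occurrences on the reifying topic."""
    reifier = topic_map.get("reifier")
    if reifier is None:
        return []  # the topic map is not reified, so it references no schema
    return [
        occurrence["value"]
        for occurrence in reifier.get("occurrences", [])
        if occurrence["type"] == SCHEMA_REFERENCE_PSI
    ]


# Hypothetical topic map whose reifying topic carries two occurrences.
topic_map = {
    "reifier": {
        "occurrences": [
            {"type": SCHEMA_REFERENCE_PSI, "value": "http://example.org/schemas/opera.tmcl.xml"},
            {"type": "http://example.org/psi/description", "value": "Italian opera"},
        ]
    }
}
references = schema_references(topic_map)
```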
Schema composition is the ability to take two or more schemas and compose them into a single schema. Given that a schema consists of a set of constraints, schema composition merely takes all the constraints from all the schemas being composed and returns a single schema that consists of all of those constraints. Applications are free to identify and remove redundant constraints and to throw exceptions should any constraints be contradictory.
More formally:
Given
  Schema : s1, s2
  Constraint : c1, c2, c3, c4, c5
  s1 := {c1, c2, c3}
  s2 := {c4, c5}
That
  Compose(s1, s2) => s3
  s3 := {c1, c2, c3, c4, c5}
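The Compose operation above amounts to a union of constraint sets, which the following Python sketch illustrates. Constraints are represented by opaque hashable values (here strings), and the function name `compose` is an assumption. Dropping exact duplicates is the only redundancy removal shown; detecting contradictory constraints is left to the application and is not attempted here.

```python
from typing import Hashable, Iterable, List


def compose(*schemas: Iterable[Hashable]) -> List[Hashable]:
    """Union of all constraints, keeping first-seen order and dropping exact duplicates."""
    composed: List[Hashable] = []
    seen = set()
    for schema in schemas:
        for constraint in schema:
            if constraint not in seen:
                seen.add(constraint)
                composed.append(constraint)
    return composed


s1 = ["c1", "c2", "c3"]
s2 = ["c4", "c5"]
s3 = compose(s1, s2)

# A constraint shared by both inputs appears once in the result.
overlapping = compose(s1, ["c3", "c4"])
```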
[gdm] To be done.