ISO/IEC JTC 1/SC34 N0344
ISO/IEC JTC 1/SC34
Information Technology --
Document Description and Processing Languages
12 November 2002
This Reference Model for ISO 13250 Topic Maps
(RM4TM) provides a framework for the definitions of Topic Map Applications (TM
Applications). Diverse topic maps that conform to diverse TM Applications that
are defined in keeping with this framework can be interpreted and amalgamated
automatically by independently implemented systems, without losing information,
and with predictable results.
Many of the key advantages of the Topic Maps
paradigm derive from the achievement of its primary objective, the "Subject
Location Uniqueness Objective", which is to make everything known about every
subject in a topic space accessible from a single location within that space.
The achievement of the Subject Location Uniqueness Objective means that the
efficiency with which users can find information is maximized, not only because
the subject's single location, once found, acts as a comprehensive catalog of
the things that are known about it, but also because the subject's location can
be found in terms of any of its relationships to other subjects.
This RM4TM facilitates the development of TM
Applications and systems that can achieve the Subject Location Uniqueness
Objective with respect to all subjects, including those that are only implicit
in interchangeable topic map instances, as well as with respect to subjects that
are relationships (and aspects of relationships) among other subjects.
Moreover, this RM4TM facilitates the development of
TM Applications and implementations that can amalgamate the topic spaces
represented by topic maps that conform to diverse Topic Maps Applications into a
single resulting topic space in which each subject has a single location, there
is no redundant information, and all of the information represented by the
comprising topic maps is preserved.
This RM4TM provides definition requirements for
user-defined Topic Map Applications that allow such definitions to serve as
contracts between topic map creators, users, and system implementers, such that
when the interchange or amalgamation of topic maps fails due to nonconformance
to the definition of a Topic Maps Application, the nonconforming aspects of the
topic maps or system implementations can be identified.
This RM4TM defines:
-
an abstract graph structure for the representation of
relationships between subjects;
-
rules for defining Applications of the Topic Maps paradigm;
and
-
rules for processing the information contained in topic maps.
Note 1: |
See Annex A
for a brief informal overview of this RM4TM.
| |
Editor's Note 1: |
(The glossary hasn't been drafted yet.)
|
3.1 |
The common structural
abstraction for topic maps |
This RM4TM defines an abstract structure, called a
"topic map graph", in terms of which all kinds of topic maps can be uniformly
interpreted, regardless of their governing TM Applications, and regardless of
the TM Application-defined interchange syntaxes in which they may be
representable.
The "topic map graph" form of any given topic map
represents all of the subjects that participate in the topic map explicitly,
even if they were only implicitly represented in the interchangeable form of the
given topic map.
The following subclauses name and define the rules
and cases to which topic map graph components and entire topic map graphs must
conform in order to be considered "well formed", and the additional rules to
which topic map graphs must conform in order to be considered "fully merged".
Topic map graphs that are under construction may or may not be well-formed, but
only well-formed topic map graphs are eligible to become fully merged.
3.2 |
Topic map graphs consist
of nodes and arcs. |
A topic map graph consists of nodes and arcs. In a
well-formed topic map graph, every arc is a typed, oriented connectedness of two
nodes, and every node is one of the two endpoints of zero or more arcs.
Note 2: |
This RM4TM uses the neologism
"connectedness" in order to avoid implying that TM Applications must
be implemented in such a way that arcs are represented as a data
structure. For example, the arc abstraction can be fully honored by
the property values of the nodes that serve as its endpoints.
| |
Note 3: |
The reader's understanding of the
remainder of this clause 3
is likely to be aided by referring to the informative "Assertion
Diagrams" Annex B.
| |
An "arc" in a topic map graph is a two-ended
connectedness between nodes that satisfies all of the following criteria:
-
it has two different nodes serving as its two
endpoints, and
-
it is one of the eight forms of connectedness enumerated in 3.3.3
between the nodes that serve as its two endpoints. (This necessarily means
that it is one of the four arc types enumerated in 3.3.1.)
There are four arc types, named "AT", "AC", "CR",
and "Cx". The significance of each type of arc is different.
3.3.2 |
Names of arc types and arc
endpoint types |
The first letter of an arc type's name is the name
of one of its endpoint types. The second letter of the arc type's name is the
name of its other endpoint type. That is, an AT arc has two endpoints, one of
endpoint type "A" and the other of endpoint type "T".
Note 4: |
In a well-formed topic map graph, only
a-nodes serve as "A" endpoint types, only c-nodes serve as "C"
endpoint types, only r-nodes serve as the "R" endpoint types, and
only t-nodes serve as the "T" endpoint types. There is no such thing
as an "x-node", because all kinds of nodes are eligible to serve as
the x endpoints of Cx arcs. The exceptional
character of the x endpoints of Cx arcs is the reason
why "x" is the only endpoint type name that is always
rendered in lower case.
| |
3.3.3 |
Eight forms of
connectedness are possible |
In all instances of each type of arc, the
significance of a node's service as one of the endpoints is different from the
significance of a node's service as the other endpoint. Given two nodes, N1 and
N2, there are eight possible forms of connectedness between them, since each of
the four arc types can be oriented in two ways. They are enumerated in the
following subclauses.
Form 1: The connectedness of N1 and N2 is an instance of an
AT arc type in which N1 is the A endpoint, and N2 is the T endpoint.
Form 2: The connectedness of N1 and N2 is an instance of an
AT arc type in which N1 is the T endpoint, and N2 is the A endpoint. (This is
the reverse of Form 1.)
Form 3: The connectedness of N1 and N2 is an instance of an
AC arc type in which N1 is the A endpoint, and N2 is the C endpoint.
Form 4: The connectedness of N1 and N2 is an instance of an
AC arc type in which N1 is the C endpoint, and N2 is the A endpoint. (This is
the reverse of Form 3.)
Form 5: The connectedness of N1 and N2 is an instance of a
CR arc type in which N1 is the C endpoint, and N2 is the R endpoint.
Form 6: The connectedness of N1 and N2 is an instance of a
CR arc type in which N1 is the R endpoint, and N2 is the C endpoint. (This is
the reverse of Form 5.)
Form 7: The connectedness of N1 and N2 is an instance of a
Cx arc type in which N1 is the C endpoint, and N2 is the x
endpoint.
Form 8: The connectedness of N1 and N2 is an instance of a
Cx arc type in which N1 is the x endpoint, and N2 is the C
endpoint.
Note 5: |
The above list of Forms of
Connectedness can be represented in tabular form as follows:
|
Form | N1 | N2 |
1 | A | T |
2 | T | A |
3 | A | C |
4 | C | A |
5 | C | R |
6 | R | C |
7 | C | x |
8 | x | C | |
Note 6: |
The above enumeration of the Forms of
Connectedness serves two purposes in this RM4TM:
-
It establishes a name ("Form n", where n is an
integer in the sequence 1..8) for each of the Forms of
Connectedness that an arc can represent, as a convenience for use
elsewhere in this document, and possibly in the definitions of TM
Applications.
-
It establishes that the orientation of the
connectedness represented by an arc is an essential aspect of the
definition of "arc" in this RM4TM. For purposes of a TM
Application's definition of a "situation feature" (see 3.4.2),
for example, it is insufficient merely to say that two nodes are
connected by a certain type of arc. The specification of the arc
must also include information as to which node serves as which
endpoint type. In order to represent connectedness equivalent to
the connectedness represented by an RM4TM arc in some "directed
graph" paradigms, at least two directed graph arcs must be used,
plus whatever additional machinery may be required to associate
the two directed graph arcs in order to represent that both
represent different directional aspects of the same connectedness.
By contrast, RM4TM arcs are nondirectional, but
oriented.
| |
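The eight forms follow mechanically from the four arc type names and the two ways of assigning N1 and N2 to an arc's endpoints. The Python sketch below is illustrative only: the names `ARC_TYPES` and `forms_of_connectedness` are hypothetical and are not defined by this RM4TM.

```python
# Illustrative sketch only. ARC_TYPES and forms_of_connectedness are
# hypothetical names; this RM4TM does not define any programming interface.
ARC_TYPES = ["AT", "AC", "CR", "Cx"]  # the four arc types, in document order


def forms_of_connectedness():
    """Return {form number: (N1 endpoint type, N2 endpoint type)}."""
    forms = {}
    number = 1
    for arc_type in ARC_TYPES:
        first, second = arc_type[0], arc_type[1]
        forms[number] = (first, second)      # odd form: N1 takes the first letter
        forms[number + 1] = (second, first)  # even form: the reverse orientation
        number += 2
    return forms


FORMS = forms_of_connectedness()
for n in sorted(FORMS):
    print("Form %d: N1=%s, N2=%s" % (n, FORMS[n][0], FORMS[n][1]))
```

Because each form records which node serves as which endpoint type, a form number alone is enough to state both the arc type and its orientation.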
3.4.1 |
One subject for each node
|
In topic map graphs, only nodes can represent
subjects, and every node represents a single subject.
3.4.2 |
Situations and
subjects |
A node serves as one endpoint of zero or more arcs.
Note 7: |
A node that serves as an endpoint of
no arcs at all is not well-formed unless it has at least one
built-in SIDP value. (See 3.4.2.)
| |
A node that is the endpoint of zero arcs is said to
be "isolated." In a well-formed topic map graph, only "built-in" nodes (see
Clause 4)
can be isolated.
A node that is the endpoint of one or more arcs is
said to be "situated." A node's "situation" is its service as one of the
endpoints of all of the "connected paths" through the graph to all other nodes
accessible via such paths. (Given node n[0], a "connected path" is a finite
alternating sequence n[0], arc[1], n[1], arc[2], n[2], ... n[k] such that each
arc[i] in the sequence connects n[i-1] and n[i].)
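The definition of a connected path can be checked mechanically. The following Python sketch assumes a hypothetical in-memory representation in which an arc is a tuple naming its type and its two endpoint nodes; `Arc` and `is_connected_path` are illustrative names, and treating a lone node as a trivial path is an assumption of the sketch rather than a statement of this RM4TM.

```python
# Illustrative sketch only. Arc and is_connected_path are hypothetical names.
from collections import namedtuple

# An arc records its type and the two different nodes that are its endpoints.
Arc = namedtuple("Arc", ["arc_type", "first", "second"])


def is_connected_path(sequence):
    """Check an alternating list [node, arc, node, arc, ..., node]."""
    if len(sequence) % 2 == 0:
        return False  # a path must start and end with a node
    # Assumption: a single node with no arcs counts as a trivial path.
    for i in range(1, len(sequence), 2):
        arc = sequence[i]
        before, after = sequence[i - 1], sequence[i + 1]
        if {arc.first, arc.second} != {before, after}:
            return False  # this arc does not connect its two neighbours
    return True


ac = Arc("AC", "a1", "c1")
cr = Arc("CR", "c1", "r1")
print(is_connected_path(["a1", ac, "c1", cr, "r1"]))
```

The check ignores orientation on purpose: a path may traverse an arc from either endpoint, even though the arc itself is oriented.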
Except for the built-in values of the properties of
built-in nodes, all of the values of the properties of nodes are determined by
their situations. Thus, except for the built-in subjects of built-in nodes, the
subjects of all nodes are entirely determined by their situations.
Except for the restrictions on the subjects of
nodes that have special functions within assertion subgraphs (see 3.6.2.2),
TM Applications are free to define "situation features" (features of the
situations of nodes) and how those features, when they occur, affect the values
of the properties of the nodes whose situations include those situation
features. The values of all properties can be affected by such situation
features, including both Subject Identity Discriminating Properties (SIDPs) and
Other Properties (OPs), in accordance with the specifications provided in the
definition of the TM Application that defines the properties and the situation
features (see 4.7.2.2).
Note 8: |
The situation of a node in a topic map
graph is always and only as visible as the values of its properties
make it. See Clause 4.
| |
Note 9: |
The definition of a situation feature
can include, but is not limited to, the situated node's status as a
role player in one or more assertions. The definition of a situation
feature can also include the situated node's status as another kind
of assertion component node, such as an r-node component of one or
more assertions (see 3.6.2.2).
| |
3.5.1 |
Six cases of well-formed
nodes |
A node that satisfies all the criteria in the
subclauses of one of the six cases described in the following subclauses is well
formed. A node that does not satisfy the criteria of one of the six cases is not
well formed.
3.5.1.1.1 |
Defining Characteristics
of Case 1 nodes |
3.5.1.1.1.2 |
The node has at least one
built-in SIDP value (see Clause 4).
|
Case 1 nodes do not have a node type name.
The subjects of Case 1 nodes are not constrained by
this RM4TM.
3.5.1.2.1 |
Defining characteristics
of Case 2 nodes |
3.5.1.2.1.1 |
The node serves as one or
more of the x endpoints of any number of well-formed Cx
arcs. |
3.5.1.2.1.2 |
The node does not serve as
any other endpoint type of any instance of any arc type.
|
3.5.1.2.1.3 |
The node either has at least
one built-in SIDP value, or its situation as a role player causes at least
one SIDP value to be conferred upon it. |
Case 2 nodes do not have a node type name.
The subjects of Case 2 nodes are not constrained by
this RM4TM.
3.5.1.3 |
Well-formed node Case 3
("a-node") |
3.5.1.3.1 |
Defining characteristics
of Case 3 nodes |
3.5.1.3.1.1 |
The node serves as zero or
more of the x endpoints of any number of Cx arcs.
|
3.5.1.3.1.2 |
The node serves as the A
endpoint of two or more AC arcs. |
3.5.1.3.1.3 |
The node may or may not serve
as the A endpoint of one AT arc. |
3.5.1.3.1.4 |
The node does not serve as
any other endpoint of any instance of any arc type.
|
A Case 3 node is called an "a-node" (where "a"
stands for "assertion").
The subject of an a-node is always the relationship
that is specified via the assertion for which it serves as the unique nexus. The
relationship is an instance of the type of relationship which is the subject of
the node that serves as the T endpoint of the AT arc of which the a-node is the
A endpoint, if any. If the a-node is not the A endpoint of an AT arc, the type
of the relationship is unspecified.
3.5.1.4 |
Well-formed node Case 4
("c-node") |
3.5.1.4.1 |
Defining characteristics
of Case 4 nodes |
3.5.1.4.1.1 |
The node serves as zero or
more of the x endpoints of any number of Cx arcs.
|
3.5.1.4.1.2 |
The node serves as the C
endpoint of a single AC arc. |
3.5.1.4.1.3 |
The node serves as the C
endpoint of a single CR arc. |
3.5.1.4.1.4 |
The node may or may not serve
as the C endpoint of a single Cx arc.
|
3.5.1.4.1.5 |
The node does not serve as
any other endpoint of any instance of any arc type.
|
A Case 4 node is called a "c-node" (where "c"
stands for "casting").
Note 10: |
The term "casting" is consistent with
the theatrical metaphor invoked by the term "role player". In an
assertion, the role players are like the actors in a stage play.
Each c-node represents the "casting" of an actor (a role player) in
a specific role (a role type) in a specific stage production (a
specific assertion), which may or may not be a production of a
specific stage play (a specific assertion type). See 3.6.1.
| |
If a c-node serves as the C endpoint of a Cx
arc, then its subject is the playing of a specific role type by a specific
subject in a specific relationship.
If a c-node does not serve as the C endpoint of a
Cx arc, then its subject is the fact that a specific role type in a
specific relationship is not played by any subject.
3.5.1.5 |
Well-formed node Case 5
("r-node") |
3.5.1.5.1 |
Defining characteristics
of Case 5 nodes |
3.5.1.5.1.1 |
The node serves as zero or
more of the x endpoints of any number of Cx arcs.
|
3.5.1.5.1.2 |
The node serves as the R
endpoint of one or more CR arcs. |
3.5.1.5.1.3 |
The node does not serve as
any other endpoint of any instance of any arc type.
|
A Case 5 node is called an "r-node" (where "r"
stands for "role type").
The subject of an r-node is a role type that can be
played by subjects in relationships. The subjects of the c-nodes that serve as
the C endpoints of the CR arcs whose R endpoints are the r-node are the
role-player castings of role players that play the role type.
3.5.1.6.1 |
Defining characteristics
of Case 6 nodes ("t-node") |
3.5.1.6.1.1 |
The node serves as zero or
more of the x endpoints of any number of Cx arcs.
|
3.5.1.6.1.2 |
The node serves as the T
endpoint of one or more AT arcs. |
3.5.1.6.1.3 |
The node does not serve as
any other endpoint of any instance of any arc type.
|
A Case 6 node is called a "t-node" (where "t"
stands for assertion "type").
The subject of a t-node is a class of relationship,
including the roles that can be played in instances of the class, and the values
that are conferred on the properties of role players by virtue of their
situations as players of specific roles in instances of the class. The subjects
of all of the a-nodes that serve as the A endpoints of all of the AT arcs of
which a t-node serves as the T endpoint are instances of the class of
relationship that is the subject of the t-node.
Note 11: |
The above well-formedness requirements
for nodes can be summarized in tabular form as follows:
| |
Table 1: |
The Six Cases of Well-formed
Nodes |
Columns 8 to 1 are the Forms of Connectedness, labeled "(node N2) node N1". Each cell states how often node N1 may serve as the N1 endpoint of that form (see Legend).
|
Case | 8: (C) x | 7: (x) C | 6: (C) R | 5: (R) C | 4: (A) C | 3: (C) A | 2: (A) T | 1: (T) A | Node type name (if any) | Subject constraint (if any). Subject is: | Requires built-in SIDP value(s)? |
N1 Case 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | (none) | (unconstrained) | yes |
N1 Case 2 | 1+ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | (none) | (unconstrained) | no |
N1 Case 3 | 0+ | 0 | 0 | 0 | 0 | 2+ | 0 | 1? | "a-node" | assertion | no |
N1 Case 4 | 0+ | 1? | 0 | 1 | 1 | 0 | 0 | 0 | "c-node" | casting | no |
N1 Case 5 | 0+ | 0 | 1+ | 0 | 0 | 0 | 0 | 0 | "r-node" | role type | no |
N1 Case 6 | 0+ | 0 | 0 | 0 | 0 | 0 | 1+ | 0 | "t-node" | assertion type | no | |
Legend: |
0 | In order to conform to the well-formed node case described on this row, node N1 is not permitted to serve as the arc endpoint designated by this column. |
0+ | In order to conform to the well-formed node case described on this row, node N1 may serve as zero or more of the arc endpoints designated by this column. |
1 | In order to conform to the well-formed node case described on this row, node N1 must serve as exactly one of the arc endpoints designated by this column. |
1? | In order to conform to the well-formed node case described on this row, node N1 may serve as exactly one of the arc endpoints designated by this column. |
1+ | In order to conform to the well-formed node case described on this row, node N1 must serve as at least one of the arc endpoints designated by this column. |
2+ | In order to conform to the well-formed node case described on this row, node N1 must serve as at least two of the arc endpoints designated by this column.
| |
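The six cases can be read as a decision procedure over the number of times a node serves as each kind of endpoint. The Python sketch below is one possible reading, not a normative algorithm: `classify_node`, its `counts` argument, and the two SIDP flags are hypothetical, and the sketch assumes that `counts` contains only valid (arc type, endpoint type) pairs.

```python
# Illustrative sketch only. classify_node and its arguments are hypothetical.
def classify_node(counts, has_builtin_sidp=False, conferred_sidp=False):
    """Return the well-formed node case (1 to 6), or None if not well formed.

    counts maps an (arc type, endpoint type) pair such as ("AC", "A") to the
    number of arcs in which the node serves as that endpoint.
    """
    def n(arc_type, endpoint):
        return counts.get((arc_type, endpoint), 0)

    x = n("Cx", "x")
    at_a, at_t = n("AT", "A"), n("AT", "T")
    ac_a, ac_c = n("AC", "A"), n("AC", "C")
    cr_c, cr_r = n("CR", "C"), n("CR", "R")
    cx_c = n("Cx", "C")
    # Service as anything other than an x endpoint.
    others = at_a + at_t + ac_a + ac_c + cr_c + cr_r + cx_c

    if x == 0 and others == 0:
        return 1 if has_builtin_sidp else None            # Case 1: isolated
    if others == 0:
        return 2 if (has_builtin_sidp or conferred_sidp) else None  # Case 2
    if ac_a >= 2 and at_a <= 1 and others == ac_a + at_a:
        return 3                                          # Case 3: a-node
    if ac_c == 1 and cr_c == 1 and cx_c <= 1 and others == 2 + cx_c:
        return 4                                          # Case 4: c-node
    if cr_r >= 1 and others == cr_r:
        return 5                                          # Case 5: r-node
    if at_t >= 1 and others == at_t:
        return 6                                          # Case 6: t-node
    return None


print(classify_node({("AC", "A"): 2, ("AT", "A"): 1}))
```

In every case except Case 1, service as an x endpoint is permitted any number of times, which is why the sketch counts it separately from all other endpoint service.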
3.6.1 |
Introduction to assertions
|
Assertions are subgraphs of topic map graphs. In a
well-formed topic map graph, every arc is a specific component of a single
assertion, so well-formed topic map graphs consist entirely of assertions
(except, possibly, for isolated "built-in" nodes).
Each assertion represents (asserts the existence
of) a single strongly-typed relationship among the subjects that are its "role
players". Each role player is a subject that plays a specific role in the
relationship. The roles ("role types") themselves are subjects, and so is the
type of relationship of which the relationship is an instance.
The design of assertions in this RM4TM enables
diverse multiple topic map graphs to be amalgamated into a single topic map
graph, such that:
-
each of the original topic map graphs is a subgraph of the
result, and
-
each such subgraph is structurally identical to the
corresponding original, even when one of them makes assertions about
assertions in the other, about which the other made no assertions. Thus, the
integrity of the original topic map graphs is maintained as subgraphs of the
result.
Note 12: |
In order to maintain the integrity of
merged topic maps, it is necessary to establish a common structure
for all assertions. In this RM4TM, the decisions as to which aspects
of the structure of assertions should be "reified" as nodes, and
which aspects should remain "unreified" as arcs, were made by
distinguishing between the aspects of assertions that are
substantive with respect to the relationships that they assert (and
that could conceivably, therefore, need to become role players in
other assertions about those relationships), as opposed to the
aspects of assertions that nobody would want to make other
assertions about unless they were discussing the design of
assertions in general. In the structure of assertions set forth in
this RM4TM, the former aspects are represented by a-nodes and
c-nodes, while the latter aspects are represented as the four types
of arcs (the "eight forms of connectedness").
| |
3.6.2 |
Inventory of the
components of assertions |
An assertion is a subgraph of a topic map graph
that consists of certain arcs and the nodes that serve as their endpoints,
constructed in conformance to the rules set forth in this clause. Every node,
regardless of its node type, is eligible to be a role player (i.e., to serve as
the x endpoint of a Cx arc) in any number of assertions. Every arc
is a component of a single assertion. The entire significance of every arc is
its service as a unique component of a single assertion.
3.6.2.1 |
Inventory of the arcs in
an assertion |
The inventory of arcs that an assertion may have
is defined in the subclauses that follow.
Note 13: |
The assertion type of an assertion may
be specified or unspecified.
| |
Note 14: |
In every assertion, there must be at
least two role types, and therefore there must be at least two
casting nodes.
| |
3.6.2.1.3 |
Exactly as many CR arcs as
there are AC arcs |
Note 15: |
Every casting node must have a role
type, as well as belong to a single assertion.
| |
Note 16: |
Every assertion must have at least one
role player.
| |
3.6.2.2 |
Inventory of the nodes in
an assertion |
3.6.2.2.1 |
Nodes whose subjects are
never dependent on their situation with respect to a given assertion:
|
3.6.2.2.1.1 |
Assertion type nodes
(t-nodes; i.e., T endpoints of AT arcs) |
3.6.2.2.1.2 |
Role type nodes (r-nodes;
i.e., R endpoints of CR arcs) |
3.6.2.2.2 |
Nodes whose subjects are
always dependent on their situation with respect to a given assertion:
|
3.6.2.2.2.1 |
Assertion nodes (a-nodes;
i.e., A endpoints of AT and AC arcs) |
An assertion always includes a single well-formed
a-node which serves as its unique nexus. The a-node's subject is the
relationship that the assertion represents.
3.6.2.2.2.2 |
Casting nodes (c-nodes;
i.e., C endpoints of AC, CR, and Cx arcs)
|
An assertion always includes at least two c-nodes.
The subject of every c-node is that a specific role player (or that no role
player at all) plays a specific role type in a specific assertion.
3.6.2.2.3 |
Nodes whose subjects may
or may not be dependent on their situation with respect to a given
assertion (role player nodes): |
The governing TM Application defines situation
features and their effects on the values of the SIDPs of role players. Except in
cases where a subject (specified by a set of SIDP values) has been defined by
the governing TM Application as being built into a node, a node's subject
depends entirely on the features of its situation (its "situation features" -
see 3.4.2),
on account of which the governing TM Application requires values to be conferred
on one or more of its SIDPs. Therefore, the situations of nodes as
players of certain roles in instances of certain assertion types may or may not
determine their subjects.
Note 17: |
For example, the subject of a node may
be determined by its situation as a role player in a single
assertion, even though it is also a role player in many others. For
another example, the subject may be collectively determined by
multiple assertions, perhaps by virtue of playing a role type or set
of role types in a set of assertions, or perhaps by playing a role
in an assertion in which another role player's subject is
collectively determined.
| |
3.6.2.3 |
What's in and what's not
in an assertion |
The assertion of which a given a-node is the unique
nexus includes all of the nodes and arcs enumerated in the following subclauses,
and it does not include any other nodes and arcs:
3.6.2.3.1 |
All of the AC arcs of which
the given a-node serves as the A endpoint. |
3.6.2.3.2 |
The well-formed c-nodes that
serve as the C endpoints of the AC arcs identified in 3.6.2.3.1.
|
3.6.2.3.4 |
The well-formed r-nodes that
serve as the R endpoints of the CR arcs identified in 3.6.2.3.3. |
3.6.2.3.6 |
The well-formed nodes that
serve as the x endpoints of the Cx arcs identified in 3.6.2.3.5.
|
3.6.2.3.7 |
The AT arc, if any, of which
the given a-node serves as the A endpoint. |
3.6.2.3.8 |
The well-formed t-node that
serves as the T endpoint of the AT arc, if any, identified in 3.6.2.3.7.
|
3.6.3 |
Identity of
assertions |
Two assertions are always considered identical if
they have the same assertion type, and the same role players (or the absences of
role players) play the same roles. Two assertions are never considered
identical, even if they have the same role players playing the same roles, if
either or both of their assertion types are unspecified. This clause provides
the operational definitions of these concepts.
The identity of the relationship instance that is
the subject of an a-node is defined by that a-node's situation as the nexus of
an assertion subgraph. For all a-nodes, every TM Application is required to
define a situation feature and a set of one or more SIDPs that unambiguously,
comprehensively and exclusively reflects the combination of the following:
-
unless the assertion's type is unspecified, the t-node (whose
subject is the type of relationship of which the subject of the a-node is
an instance) attached to the a-node by an AT arc in which the a-node serves as
the A endpoint; and
-
the set of role-player castings that are the subjects of the
c-nodes that serve as the C endpoints of the AC arcs for which the a-node
serves as the A endpoint,
-
including the role player node attached to each c-node by a
Cx arc in which the c-node serves as the C endpoint, or the lack
thereof, and
-
including the r-node (whose subject is a role type)
attached to each c-node by a CR arc in which the c-node serves as the C
endpoint.
Note 18: |
One of the key features of this RM4TM
is that the merging process does not need to understand the
semantics of assertion types in order to merge identical assertions.
If two assertions have the same type, regardless of what it is, and
the same role players playing the same role types, regardless of
what they are, they can be seen to be identical and automatically
merged. | |
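The identity rule can be sketched as a comparison of keys built from an assertion's type and its castings, without any knowledge of what the type means. The Python below is a minimal illustration under stated assumptions: `assertion_key` and `same_assertion` are hypothetical names, and assertion types, role types, and role players are assumed to be already resolved to hashable subject identifiers.

```python
# Illustrative sketch only. assertion_key and same_assertion are hypothetical.
def assertion_key(assertion_type, castings):
    """Build a comparable key for an assertion, or None if it is untyped.

    castings maps each role type to its role player, or to None when the
    role type is unplayed.
    """
    if assertion_type is None:
        return None  # an untyped assertion is never identical to another
    return (assertion_type, frozenset(castings.items()))


def same_assertion(key_a, key_b):
    """Two assertions are identical only if both are typed and keys match."""
    return key_a is not None and key_a == key_b


k1 = assertion_key("employment", {"employer": "acme", "employee": "alice"})
k2 = assertion_key("employment", {"employee": "alice", "employer": "acme"})
k3 = assertion_key("employment", {"employer": "acme", "employee": None})
k4 = assertion_key(None, {"employer": "acme", "employee": "alice"})
print(same_assertion(k1, k2), same_assertion(k1, k3), same_assertion(k4, k4))
```

An unplayed role type is kept in the key as an explicit `None`, so an assertion in which a role is unplayed is distinguished from one in which the same role has a player.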
3.6.4 |
Assertion
semantics |
3.6.4.1 |
Semantics of assertion
typing |
3.6.4.1.1 |
When the assertion type is
specified |
A "typed" assertion is an assertion that specifies
its assertion type (i.e., that has an AT arc and t-node). The semantics of a
typed assertion are determined by the subject of its t-node, which is the
assertion type of which the typed assertion is an instance. The subject of the
t-node incorporates the semantics of all of the role types that can have role
players in instances of the assertion type, all of which must be specified in
the definition of the subject of the assertion type, either by reference or
inclusion.
The semantics of a typed assertion may determine or
affect the subjects of some or all of its role players, i.e., the existence of
such an assertion may affect the values assigned to the SIDPs of its role
players (see 4.7.2).
3.6.4.1.2 |
When the assertion type is
not specified |
An "untyped" assertion is an assertion that does
not specify its assertion type (i.e., that has no AT arc). The semantics of an
untyped assertion are determined by its role types, i.e., by the subjects of its
r-nodes. The semantics of its role types may be such that the players of the
role types have values conferred on their OPs (Other Properties -- see 4.4).
However, the role types of untyped assertions must not be defined in such a way
as to require values to be conferred upon the SIDPs of their players (see 5.2.5.3.2).
3.6.4.1.3 |
The subjects of assertion
types and role types are never affected by their instances
|
The existence of a given assertion never implies
anything about the subject which is the assertion type (if any) of which the
assertion is an instance, or about the subjects that are the assertion's role
types. No values can be conferred upon the SIDPs of assertion types or role
types by virtue of their situations, respectively, as the T endpoints of AT
arcs, or as the R endpoints of CR arcs.
Note 19: |
Like all other nodes, the t-node and
r-nodes that represent the subjects that are an assertion's type and
role types, respectively, may have their subjects (i.e., the values
of their SIDPs) built into them, or their subjects may be conferred
upon them by virtue of their situations as role players in other
assertions.
| |
Note 20: |
TM Applications may confer values on
the OPs of t-nodes and r-nodes by virtue of their situations as
t-nodes and r-nodes.
| |
3.6.4.2.1 |
No multiple role players
of a single role type |
In any given assertion, each role type is either
played by a single subject, represented by a single node, or the role type is
"unplayed", i.e., the role type has no role player. Multiple subjects cannot
play the same role in the same assertion.
Note 21: |
However, the subject of a role player
can be a group of subjects, if the governing TM Application defines
the assertion types required to allow the subjects of nodes to be
groups of subjects.
No grouping semantics of any kind are
defined by this RM4TM. This RM4TM requires all groups to be
explicitly represented as nodes. Any other approach would open the
possibility for knowledge about a group to fail to be connected to
the single node whose subject is the group, and that would be
contrary to the Subject Location Uniqueness Objective.
| |
3.6.4.2.2 |
Semantics of nodes'
situations as role players |
A node's situation as a role player in any given
assertion indicates that the subject represented by that node participates in
the relationship that is the subject of the assertion, as represented by the
assertion's a-node. In an asserted relationship, each role player plays a
distinct role; the nature of each role is the subject (called a "role type") of
one of the assertion's r-nodes. The relationship itself is an instance of the
kind of relationship that is the subject of the assertion's t-node, if any. If
the assertion has no t-node, the subject of which the relationship is an
instance is not specified.
3.6.4.2.3 |
All role types are always
represented in any assertion of a given
type |
In the topic map graph, the representation of every
assertion always includes the representation of all of the role types defined by
its assertion type's definition, regardless of whether they are played or
unplayed. (If the assertion type is unspecified, then the set of role types that
the assertion specifies is assumed to be comprehensive for that assertion.)
3.7 |
Well-formedness
constraints on Assertions |
An assertion that does not conform to all of the
following rules is not well-formed:
3.7.1 |
No two role types the
same; each has zero or one role player |
No two c-nodes that participate in the assertion
are connected to the same r-node via the CR arcs for which the c-nodes serve as
the C endpoints.
The role types that participate in any given
assertion instance must always constitute a set, i.e., within any single
assertion, no two role types can be the same. Each role type has a maximum of
one role player.
Note 22: |
If the governing Application defines
assertion types that allow nodes to have subjects that are groups of
subjects, such a group of subjects can be a role player. Even
in such cases, there is still only one role player: the group.
| |
3.7.2 |
There must be at least one
role player |
The set of arcs that specify the assertion must include at least one Cx arc.
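The two constraints of 3.7 can be checked over a simple list of castings. In the Python sketch below, the function name `assertion_violations` and the triple layout are hypothetical; each triple stands for one c-node, the r-node reached by its CR arc, and the node reached by its Cx arc (or `None` when the c-node has no Cx arc).

```python
# Illustrative sketch only. assertion_violations and the triple layout
# (c_node, r_node, role_player_or_None) are hypothetical.
def assertion_violations(castings):
    """Return a list of the 3.7 rules that the assertion violates."""
    problems = []
    r_nodes = [r_node for (_c, r_node, _player) in castings]
    if len(set(r_nodes)) != len(r_nodes):
        problems.append("3.7.1: two c-nodes are connected to the same r-node")
    if all(player is None for (_c, _r, player) in castings):
        problems.append("3.7.2: no Cx arc, so there is no role player")
    return problems


ok = [("c1", "employer", "acme"), ("c2", "employee", None)]
duplicate_role = [("c1", "employer", "acme"), ("c2", "employer", "initech")]
unplayed = [("c1", "employer", None), ("c2", "employee", None)]
print(assertion_violations(ok))
print(assertion_violations(duplicate_role))
print(assertion_violations(unplayed))
```

The sketch covers only the rules of 3.7; the node-level requirements of 3.5.1, such as an a-node serving as the A endpoint of two or more AC arcs, would be checked separately.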
3.8 |
Well-formedness
constraints on topic map graphs |
A topic map graph that conforms to the criteria
specified in both of the following clauses is well-formed. A topic map graph
that does not satisfy either or both criteria is not well-formed.
3.8.1 |
There is at least one node.
|
3.8.2 |
There are no arcs that do not
participate in a single well-formed assertion.
|
3.9 |
Well-formed and fully
merged topic map graphs |
When a topic map takes the form of a topic map
graph, all of the subjects that participate in the topic map are represented as
nodes.
In a well-formed topic map graph, every node
represents a single subject, but some subjects may be represented by more than
one node. In a fully merged topic map graph, every subject is represented by a
single node.
A well-formed topic map graph may or may not be
fully merged, but a fully merged topic map graph is always well-formed.
A topic map graph that does not meet this RM4TM's
criteria for well-formedness is not eligible to undergo the merging process.
Note 23: |
The process whereby well-formed topic
map graphs are converted into fully merged topic map graphs is
defined in Clause 6.
| |
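The difference between a well-formed graph and a fully merged one can be illustrated as follows. The sketch assumes, purely for illustration, that the subject of each node is already known and is given as a plain string; in practice, subject identity is detected from property values defined by TM Applications. The function name `unmerged_subjects` is hypothetical, and the sketch does not check well-formedness.

```python
from collections import defaultdict
from typing import Dict, List


def unmerged_subjects(subject_of: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each subject represented by more than one node to those nodes."""
    nodes_by_subject = defaultdict(list)
    for node, subject in subject_of.items():
        nodes_by_subject[subject].append(node)
    # Only subjects with two or more nodes still need merging.
    return {s: sorted(ns) for s, ns in nodes_by_subject.items() if len(ns) > 1}
```

Under this assumption, a well-formed graph is fully merged exactly when the returned mapping is empty.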
4.1 |
Only a common framework
for properties; no common properties |
This RM4TM defines a framework within which each TM
Application defines all of the properties of the nodes that it governs. The
framework is designed to constrain the definitions of TM Applications in such a
way that they can be implemented independently, with each implementation able to
demonstrate the conformance of its behavior to the definition of the TM
Application, and, therefore, with the behavior of all other conforming
implementations.
Note 24: |
This RM4TM defines no properties of
nodes. It does, however, impose certain constraints on the
definitions of such properties within the definitions of TM
Applications.
| |
4.2 |
Every property is governed
by a single TM Application |
All of the properties of nodes, their value types,
and the requirements for assigning values to them are defined by TM
Applications. Every property defined by a TM Application, and every node that
exhibits values for any of the properties defined by that TM Application, is
said to be "governed" by that TM Application. Every node must be governed by one
or more TM Applications. Every property is governed by a single TM Application.
4.3 |
Subject identity
discrimination properties ("SIDPs") |
4.3.1 |
Identical subjects must be
recognizably identical |
The fact that two nodes have the same subject must
be detectable in order to trigger the merging operations that transform a
well-formed topic map graph into a fully merged one. Therefore, at least one
property of every node must be defined by its governing TM Application for the
express purpose of allowing the subject of the node to be distinguishable from
all other subjects, and in order to allow the subjects of nodes, when they are
identical, to be recognizable as identical by the topic map graph merging
process. Such properties are called "Subject Identity Discrimination Properties"
(SIDPs). The values of SIDPs, and no other data of any kind, are used in TM
Application-defined calculations to determine whether any two nodes should be
merged.
4.3.2 |
Subject identity is the
values of SIDPs |
All merging rules defined by a TM Application must
serve the Subject Location Uniqueness Objective, and all must be expressed
entirely in terms of the values of the SIDPs defined by that TM Application. TM
Applications must define sufficient SIDPs, and constrain the calculations and
assignments of their values, in sufficient detail to support all of the merging
rules defined by the TM Application.
4.3.3 |
The merging of
nodes |
When two nodes ("predecessor nodes") governed by a
TM Application are merged:
-
the resulting single node ("result node") serves as the union
of the two sets of arc endpoints of the two predecessor nodes,
-
the resulting single node exhibits the union of the built-in
property values, if any, of the two predecessor nodes, and
-
all of the property values of the result node, and of all
other nodes whose situation features are changed as a result of the merger,
are adjusted in such a way as to reflect their new situations, in accordance
with the definition(s) of the TM Application(s) that govern the properties.
Note 25: |
Nodes never merge for any reason other
than the fact that they are regarded as having the same subject; all
merging operations must serve the Subject Location Uniqueness
Objective. However, TM Applications may require the application of
any number of rules for determining whether two nodes have the same
subject. Such merging rules may be based on diverse combinations of
subject property values, each of which may be based on a complex
situation feature definition, possibly involving intermediary
assertions and nodes through which the situated node is connected to
many other nodes.
| |
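The first two effects of a merger can be sketched as set unions. The `Node` class and its field names are assumptions made for this example only. The third effect, the adjustment of property values to reflect new situations, depends entirely on the governing TM Application(s) and is therefore not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Node:
    # Each entry names an arc and the end of it that this node serves as.
    arc_ends: Set[Tuple[str, str]] = field(default_factory=set)
    # Built-in property values, keyed by property name.
    built_in: Dict[str, Set[str]] = field(default_factory=dict)


def merge(a: Node, b: Node) -> Node:
    """Return the result node for predecessor nodes a and b."""
    # The result node serves as every arc endpoint served by either predecessor.
    result = Node(arc_ends=a.arc_ends | b.arc_ends)
    # The result node exhibits the union of the built-in property values.
    for name in a.built_in.keys() | b.built_in.keys():
        result.built_in[name] = a.built_in.get(name, set()) | b.built_in.get(name, set())
    return result
```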
4.3.4 |
RM4TM constrains the SIDPs
and SIDP values of a-nodes and c-nodes |
The subjects of a-nodes and c-nodes are
comprehensively and exclusively defined by this RM4TM in terms of their
situations in the assertions of which they are components. The properties and
value-assignment rules of TM Applications are not permitted to override,
obscure, add to, or fail to expose these subjects.
4.4 |
Other properties
("OPs") |
TM Applications may also define properties whose
values are not used for subject discrimination purposes; such properties are
called "OPs" (other properties). TM Applications define the purposes of OPs, and
the processes by which their values are calculated and assigned.
4.5 |
Names of properties of
nodes |
Each property has a name that is unique, within the
TM Application, among all the names of the properties, assertion types, and role
types defined by the TM Application. In a topic map graph, however, property
names may be defined by multiple TM Applications, so different TM Applications
may define the same property name. Therefore, each property name consists of two
fields, separated by the field separator symbol defined in 4.5.
The first field is the name of the TM Application itself, and the second field
is the property name which is unique within the TM Application.
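The two-field structure can be sketched as follows. Because this RM4TM has not yet selected the field separator symbol, the sketch takes the separator as a parameter; the function names and the "^" used in the usage note are hypothetical placeholders, not part of this RM4TM.

```python
from typing import Tuple


def qualified_name(app_name: str, local_name: str, sep: str) -> str:
    """Join a TM Application name and a local name into a two-field name."""
    for field_value in (app_name, local_name):
        # Neither field may contain the separator, or the name would be ambiguous.
        if sep in field_value:
            raise ValueError(f"{field_value!r} contains the separator {sep!r}")
    return f"{app_name}{sep}{local_name}"


def split_name(qualified: str, sep: str) -> Tuple[str, str]:
    """Split a two-field name back into (TM Application name, local name)."""
    app_name, found, local_name = qualified.partition(sep)
    if not found or sep in local_name:
        raise ValueError(f"{qualified!r} does not consist of exactly two fields")
    return app_name, local_name
```

For example, with the placeholder separator "^", `qualified_name("http://example.org/tma", "subjectIndicator", "^")` can be split again without ambiguity even though the first field contains colons.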
Editor's Note 2: |
TO DO: Select a field separator symbol, so everybody knows
what not to use in the name of a TM Application, property,
assertion type, or role type. It can't be a colon (":") if we expect
people to use IETF scheme names in their TM Application-name URIs, such as
"http:". |
4.6 |
Values of properties of
nodes |
The values of properties of nodes, the types of
their values, and the methods whereby their values are calculated and assigned,
are all defined by their governing TM Applications.
4.7 |
Assignment of values of
properties of nodes |
The values of the properties of nodes are assigned
in two ways. They are either:
-
"built-in" or
-
"conferred".
4.7.1 |
Built-in values of
properties of nodes |
For bootstrapping reasons, TM Applications must
define at least some nodes to be present in all topic map graphs that contain
nodes that are governed by the TM Application, regardless of whether they appear
explicitly in any interchangeable topic map governed by that TM Application.
Such nodes are called "built-in" nodes, and they must be defined as having
"built-in values" for at least one of their SIDPs.
A node's built-in property values cannot be
overridden by virtue of its situation in the topic map graph. It is a Reportable
TM Processing Error if a built-in node's situation requires any of its
properties that have built-in values to have values conferred upon them that are
different from their built-in values.
Note 26: |
Values can be conferred on properties
of built-in nodes that do not have built-in values.
| |
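A minimal sketch of this rule follows. The exception class, the function `confer`, and the use of dictionaries for property values are assumptions made for illustration; this RM4TM does not prescribe how a Reportable TM Processing Error is reported.

```python
class ReportableTMProcessingError(Exception):
    """Hypothetical stand-in for a Reportable TM Processing Error."""


def confer(built_in: dict, conferred: dict, prop: str, value) -> None:
    """Confer a value on a property unless it conflicts with a built-in value."""
    if prop in built_in:
        # A built-in value can never be overridden by the node's situation.
        if built_in[prop] != value:
            raise ReportableTMProcessingError(
                f"{prop!r}: conferred value {value!r} differs from "
                f"built-in value {built_in[prop]!r}"
            )
        return
    # Properties without built-in values accept conferred values (Note 26).
    conferred[prop] = value
```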
Note 27: |
The determination of the ontological
basis of a TM Application, how that ontological basis is
bootstrapped, and how self-documenting (in terms of the topic map)
the ontology is, are all in the realm of TM Application design. For
example, a TM Application may be designed in such a way that all of
its assertion types are represented by built-in nodes.
Alternatively, a TM Application may be designed in such a way that
only enough "bootstrap" assertion types (with built-in SIDPs) are
required to be present to allow external definitions of all other
assertion types to be used to confer the SIDP values of such
assertion type subjects upon the nodes that represent them.
| |
4.7.2 |
Conferred values of
properties of nodes |
The properties of nodes can have values that are
conferred upon them by their nodes' situations in the topic map graph. These
values are called "conferred" values.
4.7.2.1 |
Overview of requirements
governing definitions of conferred property values
|
With respect to the values conferred on the
properties of nodes, TM Applications must define:
-
the situation features of nodes that call for values to be
conferred upon the properties of such nodes,
-
the properties of such nodes to which the values are assigned,
-
the types of the property values, and
-
how the values are calculated.
Note 28: |
The definitions of the processing
steps involved in calculating property values are not constrained by
this RM4TM. Such processing may, for example, involve resolving
addresses and using whatever information is addressed in further
processing steps.
| |
4.7.2.2 |
Situation features that TM
Applications define as requiring values to be conferred on the properties
of nodes |
For all purposes of defining situation features
that require values to be conferred on the properties of nodes, such situation
features may be described in terms of whole assertions, or in terms of specific
nodes and arcs, or both. In any case, however, for a given node, a situation
feature is always fundamentally describable as the given node's service as the
endpoints of some set of paths whose characteristics are defined by the TM
Application as constituting a situation feature that requires values to be
conferred.
When a node's service as the x endpoint of
one or more Cx arcs (i.e., the node's situation as a role player) is
an aspect of a TM Application-defined situation feature that requires values to
be assigned to one or more of its properties, the definitions of such situation
features, the properties to which the values are assigned, the types of the
values, and how the values are calculated, must all be defined as part of, or at
least with respect to, the definition of the type of assertion of which the
assertion that has the node as a role player is an instance.
Note 29: |
For example, if the TM Application
defines an assertion type for the purpose of expressing set
memberships, in which one role is played by the node whose subject
is the set, and the other role is played by a node whose subject is
a member of the set, then the value of the corresponding property of
the node can be a node set which is the set of all the nodes whose
subjects are members of the set.
| |
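The set-membership example in Note 29 might be calculated as sketched below. The representation of each assertion instance as a (set node, member node) pair and the function name `confer_members` are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def confer_members(assertions: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """For each node playing the 'set' role, collect the nodes playing 'member'."""
    members = defaultdict(set)
    for set_node, member_node in assertions:
        # Each set-membership assertion contributes one member to the node set.
        members[set_node].add(member_node)
    return dict(members)
```

The value conferred on each set node is thus a node set, calculated from all of the set-membership assertions in which that node plays the set role.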
Note 30: |
Not all situation features that
require property values to be conferred are situations in which the
conferred-upon node is a role player. Some situation features are
within a single assertion subgraph. For example, all TM Applications
must define a property for all the a-nodes they govern, whose value
is the assertion type of the a-node; this property value is
conferred upon it on account of its service as the A endpoint of an
AT arc (see 4.3.4).
| |
4.7.2.3 |
SIDP values cannot be
conferred on a-nodes or c-nodes on account of their situations as role
players. |
The SIDP values that reflect the subjects of
a-nodes and c-nodes, and that, therefore, determine whether they should be
merged, can only be conferred upon them by virtue of their service as the A and
C endpoints of arcs. This RM4TM defines the merging rules for assertions (see 5.2.8.2),
and conforming TM Applications cannot violate these rules. Therefore, TM
Applications cannot require the values of the subject identity discrimination
properties (SIDPs) of a-nodes or c-nodes to be conferred upon them on the basis
of their situations as role players (i.e. on the basis of their service as the
x endpoints of Cx arcs).
4.7.2.4 |
SIDP values cannot be
conferred on either r-nodes or t-nodes on account of their situations as R
or T endpoints of CR or AT arcs,
respectively. |
The SIDP values that reflect the subjects of
r-nodes and t-nodes are not, and cannot be, conferred upon them by virtue of
their service as the R endpoints of any CR arcs, or the T endpoints of AT arcs,
respectively. SIDP values can only be conferred upon r-nodes and t-nodes by
virtue of their situations as role players (i.e., as the x endpoints of
Cx arcs). (Alternatively, their SIDP values can be built-in.)
4.8 |
Internal consistency of
the values of a node's SIDPs |
TM Applications must define consistency rules
regarding the combinations of values that any given node's SIDPs can exhibit in
order for that node to be regarded as exhibiting a valid combination of SIDP
values. Merging processes must be implemented in such a way as to detect and
report (as Reportable TM Processing Errors) conditions that violate these
consistency rules.
Note 31: |
For example, if one of a node's SIDP
values indicates that the node's subject is a name, and another SIDP
value indicates that the node's subject is a set of subjects, the
definition of the TM Application can require such a node to be
regarded as exhibiting an invalid combination of SIDP values. By
stating such a constraint, the TM Application's definition can
reflect its designers' conviction that there can never be a single
subject that is both a name and a set.
| |
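The consistency rule in Note 31 could be sketched as follows. The SIDP names "name_string" and "set_members" and the function `sidp_conflicts` are hypothetical; an actual TM Application would define its own SIDPs and its own consistency rules.

```python
from typing import Dict, List


def sidp_conflicts(sidps: Dict[str, object]) -> List[str]:
    """Return descriptions of invalid combinations of SIDP values."""
    conflicts = []
    # A subject cannot be both a name and a set of subjects.
    if sidps.get("name_string") is not None and sidps.get("set_members") is not None:
        conflicts.append("subject is indicated to be both a name and a set")
    return conflicts
```

A merging process would report each returned conflict as a Reportable TM Processing Error.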
5 |
Definitions
of TM Applications |
This RM4TM constrains the definitions of "Topic
Maps Applications (TM Applications)", establishing the criteria that such
definitions must meet in order to facilitate the achievement of the Subject
Location Uniqueness Objective, and to assure that topic maps can be
interchanged, understood, and amalgamated predictably, regardless of their
governing TM Applications, and regardless of the combinations of TM Applications
that may govern the subjects represented by any single topic map graph that may
result from amalgamating multiple topic maps.
5.1.1 |
Any participating subjects
|
This RM4TM does not constrain the nature or
properties of subjects that can participate in topic map graphs.
5.1.2 |
Most constraints are
imposed by TM Applications |
This RM4TM imposes minimal constraints on the
definitions of "Topic Maps Applications (TM Applications)," so that the
definition of each TM Application establishes a context within which the nature
of the topic map information being represented under its governance is
well-defined.
5.1.3 |
Purpose of TM Application
definition requirements |
This RM4TM does not define any specific TM
Applications, nor does it define any aspects of any specific TM Applications.
Instead, it imposes constraints on the definitions of conforming TM
Applications. The purpose of these constraints is to require TM Applications to
be defined in sufficient detail, and with sufficient rigor, so that:
5.1.3.1 |
conforming implementations
and conforming topic maps can be created by diverse and independent
creators and creative processes, |
5.1.3.2 |
given any conforming topic
map created by any conforming implementation, the interpretation of that
topic map by any other conforming implementation will be verifiably
consistent with the TM Application, and |
5.1.3.3 |
the effort and expense
involved in amalgamating the knowledge represented by topic maps that
conform to single and multiple TM Applications can be minimized, while the
consistency of the knowledge represented by the resulting amalgamated
topic maps can be maximized, without information loss, and with the
greatest possible achievement of the Subject Location Uniqueness Objective
by automatic means. |
5.1.4 |
Overview of required TM
Application definition components |
The definition of a conforming TM Application must
include all of the following:
-
A name that is different from the name of any other conforming
TM Application. (See 5.2.1.)
-
A set of definitions of the properties of nodes and their
value types, specifying which property values are intended to be used for
purposes of deciding whether nodes have identical subjects (i.e., specifying
which are SIDPs, and which are OPs). (See 5.2.2.)
-
The validity constraints on the values of the properties of
nodes. (See 5.2.3.)
-
A set of situation features other than service as the x
endpoints of Cx arcs, and the property values that must be conferred on
the nodes so situated. (The purpose of these property values is to enable arc
traversals within assertions. Not all intra-assertion arc traversals are
required to be enabled. See 5.2.4.)
-
A set of assertion types, the role types of each assertion
type, the validation constraints on their instances, and the property values
that must be conferred upon the role players of their instances. (See 5.2.5.)
-
Rules for determining whether the values of any given node's
subject identity discrimination properties (SIDPs) are consistent with each
other. (See 5.2.6.)
-
A set of built-in nodes, with built-in property values, that
must appear in every topic map graph that conforms to the TM Application. (See
5.2.7.)
-
The rules for merging nodes on the basis of their subject
identity discrimination properties (SIDPs). (See 5.2.8.)
-
The rules for combining the built-in values of the properties
of built-in nodes during merging, if the designers of the TM Application
anticipate the need for such combination. (See 5.2.9.)
-
If the TM Application defines one or more interchange
syntaxes, the procedures for constructing topic map graphs from instances of
each syntax ("Syntax Processing Models"), and "node demander" rules that allow
topic map graph nodes to be indirectly addressed by addressing their
corresponding syntactic constructs. (See 5.2.10.)
5.2 |
Constraints on definitions
of aspects of TM Applications |
The following subclauses specify the detailed
constraints governing each of the required aspects of the definitions of TM
Applications.
5.2.1 |
Definition of TM
Application name |
The name of the TM Application must be specified.
Care should be taken to select a name that is unlikely to be used as the name of
any other TM Application, including other versions and/or conformance levels of
an evolving or configurable TM Application. (Each version, conformance level, or
other configuration must be regarded as a distinct TM Application for purposes
of naming.) This name must be used as the first field of all of the property
names that it defines. The name must not include the "name field separator"
symbol shared by all TM Applications whose definitions conform to this RM4TM.
(See 4.5.)
Non-ISO-standard TM Applications are not permitted
to use names that begin with "IS", irrespective of the cases of the letters, in
the first field.
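The reserved-prefix rule can be checked mechanically, as in the following sketch. The function name and the `is_iso_standard` flag are assumptions made for illustration.

```python
def name_allowed(app_name: str, is_iso_standard: bool) -> bool:
    """Check the reserved 'IS' prefix rule for a TM Application name."""
    # The comparison ignores letter case, so "IS", "Is", "iS" and "is" all match.
    return is_iso_standard or not app_name.casefold().startswith("is")
```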
Note 32: |
One way to minimize the risk of
ambiguity that might result from coincidental use of identical names
for TM Applications created by different TM Application designers is
for designers to use, as their TM Application names, URIs that
address the internet domain names that the designers themselves
control, or that are registered names within controlled TM
Application namespaces within the internet domains of such standards
organizations as OASIS, the World Wide Web Consortium, IDEAlliance,
or such library service organizations as the Online Computer Library
Center (OCLC), the Library of Congress, etc.
| |
5.2.2 |
Definition of properties
and property values |
All properties of nodes should be explicitly
defined. All properties whose values are used to determine whether two nodes
have the same subject (i.e., all SIDPs) must be explicitly defined.
Each property definition must specify all of the
aspects described in the following subclauses:
The property definition must specify a name that is
unique among the names of all the properties, assertion types, and role types
defined by the TM Application. The name must not include the "name field
separator" symbol (see 4.5).
The property definition must specify the type of
value of which the value must be an instance, if the property exhibits a value.
Note 33: |
Property value types are not
constrained by this RM4TM. They can be simple and/or complex. They
can be data and/or nodes.
| |
5.2.2.3 |
Constraints on property
values |
The property definition may specify validity
constraints on the value of the property. During the process of converting a
well-formed topic map graph into a fully merged one, implementations of the TM
Application must validate all SIDP values for conformance to all of the validity
constraints defined for them. (See 6.4.)
5.2.2.4 |
Subject identity
discrimination properties (SIDPs) |
The property definition must indicate whether the
property being defined is a subject identity discrimination property (SIDP).
Each property definition should include an
explanation of the significance of the property and its values, including an
explicit indication, where appropriate, of the significance of the condition in
which no value is exhibited. If the property is a subject identity
discrimination property (SIDP), such an explanation must be provided.
5.2.3 |
Definitions of validity
constraints on the values of properties |
If, in order to be considered valid, a property
value must conform to certain constraints, the TM Application should define such
constraints for each such property, wherever possible.
5.2.4 |
Definition of assignment
of property values conferred on account of arc endpoint service other than
service as the x endpoints of Cx arcs
|
All TM Applications are required to define subject
identity discrimination properties (SIDPs) for a-nodes and c-nodes, and rules
for conferring values upon them, such that all a-nodes and c-nodes will exhibit
values for those properties that will support the merging of assertions in
conformance with the assertion merging rules specified in 5.2.8.2.
Note 34: |
This RM4TM does not require TM
Applications to define properties whose values reflect the internal
structure of assertions comprehensively.
| |
Note 35: |
See Annex C
for an informative example of a set of property definitions that
reflect the internal structure of assertions.
| |
5.2.5 |
Definitions of assertion
types |
The definition of each assertion type defined by a
TM Application must include all of the aspects specified in the following
subclauses.
5.2.5.1 |
Definitions of names of
assertion types |
For each assertion type, a name that is unique
among all the names of assertion types, role types, and properties defined by
the TM Application must be specified. The names of assertion types have two
fields, in the same manner as property names, with the name of the TM
Application in the first field, and the name of the assertion type in the second
field. The name must not include the "name field separator" symbol defined in 4.5.
5.2.5.2 |
Definition of the
semantics of the assertion type |
The semantics of each assertion type must be
explained.
A set of role types must be specified, each member
of which will always be represented in all instances of the assertion type in
the topic map graph, regardless of whether they have role players.
This RM4TM does not prohibit multiple assertion
types from incorporating the identical role type(s).
Note 36: |
The designs of TM Applications may be
inherently more robust if all of the role types defined as
components of their assertion types are regarded as unique
subjects, even when they share the same names. For example, the
father-daughter relationship type and the father-son relationship
type may, in some cultures, be different in character, and the role
of fatherhood may therefore also turn out to be different. If a TM
Application defines both the father-daughter and father-son
relationship types in such a way as to regard the role type of
"father" as the same subject in both relationship types, then no
distinction can ever be made between the two kinds of fatherhood,
other than by defining a new TM Application.
| |
Each role type definition includes all of the
aspects specified in the following subclauses.
For each role type, a name which is unique among
all the names of assertion types, role types, and properties defined by the TM
Application must be specified. The names of role types have two fields, in the
same manner as property names, with the name of the TM Application in the first
field, and the name of the role type in the second field. The name must not
include the "name field separator" symbol defined in 4.5.
5.2.5.3.2 |
Definitions of property
values conferred on role players of assertion instances
|
If, in instances of the assertion type being
defined, role players of the role being defined are required to have property
values conferred upon them, the procedure required to calculate such values
should be defined. It must be defined for subject identity discrimination
properties (SIDPs).
TM Applications must not allow values to be
conferred on the SIDPs of any of the role players of assertions whose assertion
types are unspecified.
5.2.5.3.3 |
Definition of semantics of
role type |
The semantics of each role type must be explained.
5.2.6 |
Definition of consistency
of the values of SIDPs of a node |
The rules for detecting conditions in which the
subject identity discrimination properties (SIDPs) of a node have conflicting
values must be defined.
5.2.7 |
Definitions of built-in
nodes and their built-in property values |
Some of the subjects defined by a Topic Maps
Application - at least enough to bootstrap at least some of its assertion types
and role types into existence - must be represented by "built-in" nodes that are
logically present in all topic map graphs at the moment that they begin to be
constructed.
These built-in nodes and their built-in subject
identity discrimination property values must be defined.
If there are any built-in assertions, the built-in
property values that correspond to their arcs must be defined, and their
built-in a-nodes and c-nodes must be provided with built-in values for their
subject identity discrimination properties (SIDPs) such that the merging of the
built-in assertions in conformance with the assertion merging rules specified in
5.2.8.2
will occur. The definitions of the properties that have built-in values in the
built-in nodes defined by the TM Application must be such that, when topic map
graphs governed by the TM Application are constructed, any assertions that are
implicit in the built-in property values will be unambiguously recognized, so
that they can be represented explicitly in the graph.
Note 37: |
Whenever two or more topic maps that
are governed by the same TM Application are merged, all of their
built-in nodes necessarily must merge.
| |
5.2.8 |
Definition of merging
rules |
5.2.8.1 |
Node merging is based only
on SIDP values |
TM Applications must define node merging rules that
determine whether any two nodes must be merged, and these rules must operate
solely on the basis of the values of subject identity discrimination properties
(SIDPs).
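As an illustration of a rule that operates solely on SIDP values, the following sketch accepts nothing but the SIDP values of the two nodes. The SIDP name "subject_indicators" and the rule itself (merge when the two nodes share at least one value) are hypothetical; each TM Application defines its own SIDPs and merging rules.

```python
from typing import Dict, FrozenSet


def same_subject(sidps_a: Dict[str, FrozenSet[str]],
                 sidps_b: Dict[str, FrozenSet[str]]) -> bool:
    """Decide whether two nodes must merge, using SIDP values only."""
    a = sidps_a.get("subject_indicators", frozenset())
    b = sidps_b.get("subject_indicators", frozenset())
    # Hypothetical rule: any shared value means the subjects are identical.
    return bool(a & b)
```

Because the function receives no other data, values of OPs cannot influence the decision.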
5.2.8.2 |
Merging rules for
assertions |
5.2.8.2.1 |
Definition of subject
identity of a-nodes |
In all conforming TM Applications, two assertions
are merged to become a single assertion when their respective a-nodes are deemed
to represent the same subject. All TM Applications are required to define
merging rules that apply uniformly to all assertions, such that they will always
be merged during the process of converting a well-formed topic map graph into a
fully merged topic map graph under the conditions described in the following
subclauses, and such that they will be automatically merged under no other
conditions and on no other basis:
5.2.8.2.1.1 |
Both assertions specify the
same assertion type. |
Note 38: |
If neither assertion specifies its
assertion type, it cannot be assumed that the lack of an assertion
type itself constitutes a specific assertion type which is the same
for both.
| |
5.2.8.2.1.2 |
Both assertions have the same
role player, or both have no role player, for each of the same role types.
|
When two assertions are merged, the two a-nodes
become a single a-node, and each pair of c-nodes that are connected to the same
r-node and a-node become a single c-node. (Nodes are merged as described in 4.3.3.)
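The two conditions can be sketched as a single test. The `Assertion` class and its fields are assumptions made for this example. The sketch also assumes that role types and role players are identified by names that are equal exactly when the corresponding nodes represent the same subject.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Assertion:
    assertion_type: Optional[str]        # None when the type is unspecified
    players: Dict[str, Optional[str]]    # role type -> role player, None if unplayed


def must_merge(a: Assertion, b: Assertion) -> bool:
    """Apply the conditions of 5.2.8.2.1.1 and 5.2.8.2.1.2."""
    # Note 38: an unspecified type is never "the same" as another type.
    if a.assertion_type is None or b.assertion_type is None:
        return False
    # Same assertion type, and for each role type the same player or no player.
    return a.assertion_type == b.assertion_type and a.players == b.players
```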
5.2.8.3 |
The human factor in
merging |
The merging rules defined by TM Applications are
intended to be exploited by creators of topic maps, so that the topic maps they
create can incorporate other topic maps by reference, and so that when such
references are resolved, the resulting merged topic map graph will be identical
to the one that the creator intended.
In all cases, and regardless of their governing
Application(s), when two nodes represent the same subject, they must be merged.
In other words, the Subject Location Uniqueness Objective always applies. It is
the responsibility of the creator of every topic map to see to it that all such
mergers will occur when the topic map is processed in conformance with the rules
defined by its governing TM Applications.
Topic map creators must accept responsibility for
the fully merged topic map graphs represented by the interchangeable topic maps
that they create, even when their interchangeable topic maps incorporate topic
maps that were created by others. When interchangeable topic maps incorporate
other topic maps by reference, they must also contain (or incorporate by
reference) subjects and assertions that cause the merging process to yield a
satisfactory result in which no two nodes have the same subject, even when, in
the absence of any special arrangements made by the creator of the topic map, no
governing TM Application would cause the two nodes to merge. It is the
responsibility of topic map creators to make such special arrangements, by
adding assertions that will cause the nodes that must be merged to have SIDP
values that will be recognized as requiring their merger. (See 7.4.)
Note 39: |
Such special arrangements may involve
indirectly addressing the nodes of the topic map graph represented
by the interchangeable forms of the topic maps that are incorporated
by reference, by addressing the syntactic "node demanders" of the
nodes that must be merged. See 5.2.10.3.
| |
5.2.9 |
Definitions of rules for
merging property values when merging
nodes |
5.2.9.1 |
Merging built-in property
values |
The Subject Location Uniqueness Objective may
demand that built-in nodes be merged, but the effect of merging their built-in
values cannot be determined by the situation features of the node that results
from their merger. Therefore, TM Applications must define rules for combining
the built-in values of built-in nodes.
5.2.9.2 |
Merging conferred property
values |
In order to optimize the merging process, TM
Applications may also define procedures for combining the conferred property
values of two nodes in the conferred property values of the single node that
results from merging them. All such procedures must be defined so that the result
of applying them is indistinguishable from the result of recalculating
the merged node's conferred property values on the basis of its new situation.
Note 40: |
In any case, whenever two nodes are
merged, the situations of other nodes may also be affected,
necessitating recalculation of their property values, as well.
| |
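The equivalence requirement can be illustrated with a hypothetical set-membership property whose conferred value is the set of member nodes. In the sketch below, the combining procedure is a set union, and its result is compared with a full recalculation after the merger. All names, and the representation of assertions as (set node, member node) pairs, are assumptions made for the example.

```python
from typing import List, Set, Tuple

Membership = Tuple[str, str]   # (set node, member node)


def recalc_members(assertions: List[Membership], node: str) -> Set[str]:
    """Recalculate the conferred value from the node's situation."""
    return {member for set_node, member in assertions if set_node == node}


def combine_members(a_value: Set[str], b_value: Set[str]) -> Set[str]:
    """Shortcut procedure: combine the predecessors' conferred values."""
    return a_value | b_value


def rename(assertions: List[Membership], old: Set[str], new: str) -> List[Membership]:
    """Redirect arcs from the predecessor nodes to the result node."""
    return [(new if s in old else s, new if m in old else m) for s, m in assertions]
```

If nodes "s1" and "s2" are merged into a result node "s", the combined value must equal `recalc_members(rename(assertions, {"s1", "s2"}, "s"), "s")`; a combining procedure for which this comparison can fail would not conform.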
5.2.10 |
Definitions of interchange
syntaxes |
The definition of a Topic Maps Application may or
may not define one or more syntaxes for the interchange of the topic maps it
governs. The constraints on the definitions of such syntaxes are specified in
the following subclauses.
The syntax itself must be defined in such a way
that instances of it can be validated for conformance with its syntactic rules
before any attempt is made to render them as topic map graphs.
A "Syntax Processing Model" must be defined that
specifies, in terms of the definition of each such syntax, how the information
represented by instances of the syntax must be comprehensively represented as
topic map graphs.
Note 41: |
In other words, a Syntax Processing
Model specifies how to construct topic map graphs from instances of
the syntax, without omitting any information represented in the
instances.
| |
5.2.10.3 |
Facilities for indirect
node addressing via syntactic
constructs |
A list of syntactic constructs ("node demanders")
whose instances can be unambiguously addressed within the instances of the
syntax must be provided. Each such node demander must be defined as being
associated with a specific node whose existence in the topic map graph that the
instance represents can reasonably be regarded as being "demanded" by the
existence of the demander.
The list of node demanders may or may not provide a
facility for comprehensively addressing every node in the topic map graph
constructed from a syntactic instance.
5.2.10.3.2 |
"Same subject as demanded
node" assertion type |
Each TM Application that defines one or more Syntax
Processing Models must also define at least one assertion type in which one of
the role types can be played by a node demander, and which confers one or more
SIDP values on the player of another of its role types, such that the subject of
that player will be recognized by the merging process as being the same as the
subject of the node whose existence is demanded by the node demander.
Note 42: |
The "node demander" facilities defined
for the interchange syntaxes of TM Applications allow
interchangeable topic maps to refer to each other in ways that
guarantee the merging of nodes that are separately demanded by each
of them. | |
TM Applications can include, as portions of
themselves, other TM Applications, by reference, but only in their entirety. The
names of borrowed properties, assertion types and role types are not affected by
being borrowed; each remains as defined in the definition of its TM Application
of origin.
6 |
Constructing
fully-merged topic map graphs from well-formed topic map
graphs |
This RM4TM is designed to allow all well-formed
topic map graphs, regardless of their governing TM Application(s), to be
processed in essentially the same way, in order to achieve the result of a
fully-merged topic map graph. The process is designed to allow modular
implementation of systems for processing topic maps that are governed by
multiple TM Applications.
Conforming implementations of tools that build
fully-merged topic map graphs are free to construct fully merged topic map
graphs from well-formed topic map graphs in any way that, in any instance,
results in a graph that is indistinguishable from the graph that would
theoretically result by applying the process described in the following
subclauses. The subclauses (and the paragraphs within them) appear in the order
in which the steps must be performed (at least theoretically, for purposes of
this RM4TM's definition of the merging process in terms of its required
results).
6.1 |
Construct the topic map
graph |
The first step is to construct a well-formed topic
map graph. The process of constructing well-formed topic map graphs is only
partly constrained by this RM4TM.
6.1.1 |
Endow the graph with
built-in nodes |
When constructing a new topic map graph, it must
first be endowed with all of the built-in nodes and arcs defined by the TM
Application(s) that govern the graph.
Note 43: |
Built-in arcs are implicitly
represented by the built-in property values that correspond to them.
See 5.2.7.
| |
6.1.2 |
Interpret interchangeable
topic map as topic map graph |
If the graph is being constructed from an instance
of an interchange syntax, the Syntax Processing Model defined by the governing
TM Application must be applied to the instance, with the output being added to
the well-formed topic map graph that is under construction.
6.1.3 |
Add nodes and
assertions |
This RM4TM does not constrain any other aspects of
the original construction of a well-formed topic map graph.
Note 44: |
The well-formed topic map graph can be
interactively constructed, or constructed from sources that are not
instances of interchange syntaxes of TM Applications, or in any
other way.
| |
Note 45: |
Any notation or schema for any kind of
information can have a TM Application built around it, so that, in
effect, it becomes a topic map interchange syntax.
| |
6.2 |
Validate assertion
instances for conformance to definitions |
All of the assertions must be validated for
conformance to the definitions of their assertion types specified by their
governing TM Applications. (See 5.2.5.)
6.3 |
Assign values to
properties of nodes |
All of the nodes that appear in situations that
have situation features that are defined by any of the governing TM Applications
as demanding that values be conferred upon their SIDPs must be discovered, and
the appropriate values must be calculated and assigned to the designated SIDPs,
as specified by the definition of the TM Application.
6.4 |
Validate the values of the
SIDPs of nodes |
Each SIDP value of each node must be examined
individually, to see whether it conforms to the constraints defined for it by
the definition of its governing TM Application. Any values that are not of the
defined type (see 5.2.2.2), or that do not conform to other constraints defined
for them by the governing TM Application (see 5.2.2.3), must be detected and
reported as Reportable Topic Map Processing Errors.
For each node, and for each TM Application that
governs it, all of the property values governed by that TM Application,
including properties defined in "borrowed" TM Applications, must be examined for
consistency with each other, as such consistency is defined by the governing TM
Application (see 5.2.6). If there are any inconsistencies among the values of
the node's SIDPs, they must be reported as Reportable Topic Map Processing Errors.
If any errors are reported, the conditions that
required the report must be changed in such a way as to rectify the problem, and
the merging process must (at least theoretically, for purposes of this RM4TM's
definition of the merging process in terms of its required results) be restarted
at the step described in 6.2.
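The per-value checks described in this subclause might be sketched as follows. This is a minimal sketch under stated assumptions: the property name `subjectAddress`, the representation of a property definition as a (type, constraint) pair, and the error message wording are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical property definition: (required value type, additional constraint).
PropertyDef = Tuple[type, Callable[[Any], bool]]


def validate_sidps(node_id: str,
                   values: Dict[str, Any],
                   defs: Dict[str, PropertyDef]) -> List[str]:
    """Return one message per Reportable Topic Map Processing Error found."""
    errors = []
    for name, value in values.items():
        if name not in defs:
            errors.append(f"{node_id}: property '{name}' is not defined")
            continue
        value_type, constraint = defs[name]
        if not isinstance(value, value_type):
            errors.append(f"{node_id}: '{name}' is not of type {value_type.__name__}")
        elif not constraint(value):
            errors.append(f"{node_id}: '{name}' violates its constraints")
    return errors


# Hypothetical definition: the value must be a string that looks like an HTTP address.
DEFS: Dict[str, PropertyDef] = {
    "subjectAddress": (str, lambda v: v.startswith("http://")),
}
```

If the returned list is non-empty, the conditions would be corrected and processing restarted at the step described in 6.2, as stated above.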
6.5 |
Merge nodes according to
the defined merging rules |
The values of the subject identity discrimination
properties (SIDPs) of each pair of nodes must be compared, and the merging rules
defined by each of the governing TM Applications must be used to determine
whether the two nodes should be merged. When a rule indicates that the nodes
should be merged, they must be merged in accordance with 4.3.3.
Assertions that represent the same relationships
must always be merged in accordance with 5.2.8.2.
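One pass of the pairwise comparison might look like the following sketch. It only groups nodes that the rules say should be merged; the merging operation itself (4.3.3) is not modeled. The rule shown (two nodes share a value of a hypothetical SIDP named `ids`) and all function names are assumptions for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, List, Set

Sidps = Dict[str, Set[str]]
Rule = Callable[[Sidps, Sidps], bool]


def find(parent: Dict[str, str], node: str) -> str:
    """Follow parent links to the representative of the group containing node."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]  # path halving
        node = parent[node]
    return node


def merge_pass(sidps: Dict[str, Sidps], rules: List[Rule]) -> List[Set[str]]:
    """Compare every pair of nodes and group those that any rule says to merge."""
    parent = {node: node for node in sidps}
    for a, b in combinations(sidps, 2):
        if any(rule(sidps[a], sidps[b]) for rule in rules):
            parent[find(parent, a)] = find(parent, b)
    groups: Dict[str, Set[str]] = {}
    for node in sidps:
        groups.setdefault(find(parent, node), set()).add(node)
    return sorted(groups.values(), key=sorted)


def share_identifier(p: Sidps, q: Sidps) -> bool:
    """Hypothetical merging rule: the two nodes share at least one identifier."""
    return bool(p["ids"] & q["ids"])
```

Each governing TM Application would contribute its own rules to the `rules` list; a pair is merged as soon as any one rule indicates it.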
6.6 |
Conditionally stop or
repeat |
If any nodes were merged in the step described in 6.5, then the steps described
in 6.3, 6.4, and 6.5 must be repeated. When this same sequence of steps has been
repeated and no merging occurs in the step described in 6.5, the topic map graph
has been fully merged, and processing must stop.
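The repeat-until-stable character of steps 6.3 through 6.6 can be sketched as a loop. This is a toy model, not the normative process: it assumes two hypothetical merging rules (nodes merge if they share an identifier, or if each is the only child of the same parent), and validation (6.4) is omitted.

```python
from typing import Dict, Set, Tuple


def fully_merge(identifiers: Dict[str, Set[str]],
                only_child_of: Dict[str, str]) -> Tuple[Dict[str, str], int]:
    """Return (original node -> surviving node, number of passes performed).

    identifiers: built-in identifier values of each node (hypothetical SIDP).
    only_child_of: hypothetical assertions "child is the only child of parent";
    every parent is assumed to be a key of identifiers.
    """
    rep = {node: node for node in identifiers}

    def resolve(node: str) -> str:
        while rep[node] != node:
            node = rep[node]
        return node

    passes = 0
    while True:
        passes += 1
        # 6.3: confer SIDP values on each surviving node from its current situation.
        sidp: Dict[str, Dict[str, Set[str]]] = {}
        for node in identifiers:
            entry = sidp.setdefault(resolve(node), {"ids": set(), "parents": set()})
            entry["ids"] |= identifiers[node]
            if node in only_child_of:
                entry["parents"].add(resolve(only_child_of[node]))
        # 6.5: compare each pair of surviving nodes and merge where a rule applies.
        merged = False
        survivors = sorted(sidp)
        for i, a in enumerate(survivors):
            for b in survivors[i + 1:]:
                ra, rb = resolve(a), resolve(b)
                if ra == rb:
                    continue
                if (sidp[a]["ids"] & sidp[b]["ids"]
                        or sidp[a]["parents"] & sidp[b]["parents"]):
                    rep[rb] = ra
                    merged = True
        # 6.6: stop only when a complete pass produced no merging.
        if not merged:
            return {node: resolve(node) for node in identifiers}, passes
```

In this toy model, merging two parent nodes changes the situations of their children, so the children can only be recognized as the same subject on a later pass. That cascade is why the steps are repeated until a pass produces no merging.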
7.1 |
Conforming TM Applications
|
Topic Maps Applications must not claim conformance
to this RM4TM if their designs are inconsistent, in any way, with the
constraints imposed by this RM4TM on the designs of conforming Topic Maps
Applications.
Each TM Application must have a conforming Topic
Map Application Definition (see 7.2).
7.2 |
Conforming TM Application
definitions |
Each conforming Topic Map Application Definition
must include comprehensive and explicit definitions of all of the components of
Topic Maps Applications, as specified by this RM4TM.
Note 46: |
If the design (ontology) of a TM
Application permits the subjects of nodes to be conferred upon them
by assertions that connect these nodes to pieces of addressable
information that are regarded as their "subject indicators" (the
Standard Application is an example of such a TM Application), then
it seems only natural to make the components of the TM Application's
design document(s) that define the TM Application's assertion types
and role types conveniently addressable, and to make the addresses
of these components the built-in values of the appropriate SIDPs of
some of the built-in nodes defined by the TM Application. In this
way, the topic maps governed by the TM Application can be
authoritatively self-documenting with respect to their assertion
types and role types.
| |
7.3 |
Conforming implementations
of TM Applications |
The behaviors of conforming implementations must be
consistent with all of the behavioral constraints imposed on them by this RM4TM
and by the TM Application definitions they claim to implement.
Implementations must report Reportable Topic Map
Processing Errors when they encounter assertion types, role types, or properties
that are not defined by their governing TM Applications, or for which they
cannot perform the property value calculations, and when they cannot apply the
property value calculations or merging rules required by those definitions.
7.4 |
Conforming interchangeable
topic maps |
Conforming interchangeable topic maps conform in
all respects to the syntactic and semantic constraints imposed by the
definitions of the TM Applications that govern them.
When interpreted in accordance with their governing
TM Applications, conforming topic maps yield topic map graphs in which all
subjects are represented as nodes, in which no node is treated as having, or
apparently has, more or less than a single subject, and in which the Subject
Location Uniqueness Objective is honored, i.e., in which no two nodes represent
the same subject.
Annex A |
Brief
informal overview (informative) |
A.1 |
The structure of topic spaces:
topic map graphs |
Every topic map defines a multidimensional "topic
space" -- a space in which the only locations are topics, and in which the
distances between topics are measurable in terms of the number of intervening
topics which must be visited in order to get from one topic to another, and the
kinds of relationships that define the path from one topic to another, if any,
through the intervening topics, if any.
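As an informal illustration of this notion of distance, the number of intervening topics on a shortest path can be computed with a breadth-first search. The adjacency-list representation and the function name are assumptions for illustration; the kinds of relationships along the path are not modeled here.

```python
from collections import deque
from typing import Dict, List, Optional


def intervening_topics(graph: Dict[str, List[str]],
                       start: str, goal: str) -> Optional[int]:
    """Return the fewest topics that must be visited between start and goal.

    Returns None if goal cannot be reached from start.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])  # (topic, number of topics visited after start)
    while queue:
        topic, visited = queue.popleft()
        for neighbour in graph.get(topic, []):
            if neighbour == goal:
                return visited
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, visited + 1))
    return None
```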
This RM4TM describes the abstract structure of
topic spaces, which it calls "topic map graphs". It allows Topic Map
Applications to be described in terms of this abstract structure. All topic
maps, regardless of the diversity of their ontologies, interchange syntaxes,
subject discrimination rules, implementation interfaces, etc., can be understood
in terms of this common abstraction.
A.2 |
One subject per node; one node
per subject |
In all topic maps, every topic represents a single
subject. In the topic space represented by a topic map, every location (in
Greek, every topos) represents exactly one subject; this is the case in
the "well-formed topic map graph" abstraction defined by this RM4TM. In a "fully
merged topic map graph," the Subject Location Uniqueness Objective has been
achieved; every subject has a single location. This RM4TM specifies the process
whereby a fully merged topic map graph is constructed from a well-formed topic
map graph.
Well-formed topic map graphs consist of subgraphs,
called "assertions," that represent relationships between subjects. (See Annex B
for a very brief introduction to assertions.)
A.3 |
All subjects are represented by
nodes |
Even though every interchangeable topic map is a
map of a topic space, there is a key difference between an interchangeable topic
map and the topic map graph that it represents: in a topic map graph, every
subject, in order to exist in the topic space, must be represented as a node. By
contrast, in an interchangeable topic map, some subjects are not explicitly
represented by syntactic constructs. Instead, these subjects are present only by
virtue of the implicit semantics that are built into the syntax, as defined by
the Topic Map Application that governs that syntax.
In order to eliminate ambiguity as to the contents
of the topic spaces they represent, this RM4TM requires the definitions of
conforming Topic Map Applications to define "Syntax Processing Models" for their
topic map interchange syntaxes. A Syntax Processing Model for a topic map
interchange syntax constrains the construction of topic map graphs such that all
subjects that participate, implicitly or explicitly, in instances of that syntax
are explicitly represented in the topic map graph by nodes.
A.4 |
Nodes have properties
|
The subjects (and all other characteristics) of
nodes are expressed by the values of their properties. The properties, their
value types, and the rules for conferring values on the properties are all
defined by TM Applications. The rules for conferring property values are
expressed in terms of the relationships in which the node participates in the
graph.
The values of the properties of nodes are used to
determine whether they represent the same subjects. The rules for comparing
property values, in order to make this determination, are defined by TM
Applications. These rules are applied when a fully merged topic map graph is
constructed from a well-formed topic map graph. Thus, there is a sense in which
the property values are determined by the graph structure, and a different sense
in which the graph structure is determined by the property values; the merging
process iteratively applies the two senses in sequence until no further merging
occurs.
Annex B |
Assertion
diagrams (informative) |
Figure 1:
This diagram shows an instance of an assertion.
Each of the eight participating subjects is shown as a black dot, and each arc
is shown as a colored stripe, each end of which is labeled with an endpoint type
name. For example, on the left, a Cx arc appears with its x
endpoint on the left end, and its C endpoint on the right end. The subject of
this assertion is the idea that George (the "role player" on the left) has an MD
degree from Harvard (the "role player" on the right). It is a relationship
between George and Harvard in which Harvard plays the role of a
degree-conferring institution (the "institution" role type), and George plays
the role of the person upon whom the degree is conferred (the "MD degree holder"
role type). The assertion is an instance of a "medical qualification" assertion
type.
In addition to the six different subjects already
discussed, there are still two more, each of which is shown as a black dot where
the C endpoints of three different arcs converge; these are called "casting"
nodes. The subject of the left-hand casting node is the fact that George plays
the "MD degree holder" role in this particular assertion. The subject of the
right-hand casting node is the fact that Harvard plays the "institution" role in
this particular assertion. Every assertion asserts a relationship among its role
players, which are always and only found at the x endpoints of Cx
arcs. Every node (here, every black dot) can play any number of roles in any
number of assertions. In the very small, single-assertion topic map graph
depicted here, there are only two role players (George and Harvard), and each of
them plays only one role in one assertion.
Figure 2:
This diagram shows the structure of all assertions
that have a specified assertion type, two role types, and two role players. The
structure of a 2-role, 2-role-player assertion with an unspecified
assertion type is the same, except that the AT arc and the t-node are not
present. The structure of a 2-role, 1-role-player assertion is the same except
that one of the Cx arcs, and the node at its x endpoint, are not
present. Assertions that have more than two role types have the same structure,
except that for each additional role type, there is an additional AC arc, an
additional c-node, an additional CR arc, an additional r-node, and possibly an
additional Cx arc with a role player node serving as its x
endpoint.
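The structure described for Figure 2 can be sketched as a small constructor. The representation of arcs as (arc type, first endpoint, second endpoint) triples, the generated casting node names, and the function name are assumptions for illustration only.

```python
from typing import List, Optional, Set, Tuple

Arc = Tuple[str, str, str]  # (arc type, first endpoint node, second endpoint node)


def build_assertion(a_node: str,
                    castings: List[Tuple[str, Optional[str]]],
                    t_node: Optional[str] = None) -> Tuple[Set[str], Set[Arc]]:
    """Build the nodes and arcs of one assertion.

    castings: one (role type node, role player node or None) pair per role.
    t_node: the assertion type node, or None if the type is unspecified.
    """
    nodes: Set[str] = {a_node}
    arcs: Set[Arc] = set()
    if t_node is not None:
        nodes.add(t_node)
        arcs.add(("AT", a_node, t_node))
    for index, (r_node, player) in enumerate(castings, start=1):
        c_node = f"{a_node}#casting{index}"  # hypothetical naming of casting nodes
        nodes.update({c_node, r_node})
        arcs.add(("AC", a_node, c_node))
        arcs.add(("CR", c_node, r_node))
        if player is not None:  # a role may have no player
            nodes.add(player)
            arcs.add(("Cx", c_node, player))
    return nodes, arcs


# The assertion of Figure 1: George has an MD degree from Harvard.
NODES, ARCS = build_assertion(
    "george-md",
    [("MD degree holder", "George"), ("institution", "Harvard")],
    t_node="medical qualification",
)
```

Built this way, the Figure 1 assertion has the eight nodes mentioned in the text (one a-node, one t-node, two c-nodes, two r-nodes, and two role players), and three arcs converge on each casting node.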
Annex C |
Sample
properties that reflect assertion structure (informative)
|
The following list of property definitions is
intended to illustrate how the internal structure of assertions could be
reflected in a set of property definitions within the definition of a TM
Application.
Editor's Note 3: |
Consider: should there be a DTD for TM Application
Definitions? If so, should it be normative or informative?
|
Editor's Note 4: |
Consider: How often will TM Applications borrow the
definitions provided here (or definitions like them)? If we anticipate
that they are going to be borrowed, should we present these definitions as
a normative TM Application? Should the SAM define them as a separate TM
Application module so that they can be borrowed by TM Applications that
don't want to borrow the entire SAM? If the SAM defines them (or something
like them), should they appear in the RM at all, even informatively?
On the other hand, maybe the SAM won't include such a
comprehensive set of properties for reflecting the structure of
assertions, with full traversability of all the arc types. In that case,
does it make more sense for these definitions to appear in the RM, as they
do here? |
* Properties for which only a-nodes can exhibit
values:
Name: roleCastings
Value type: node set
Constraints on values: Only a-nodes exhibit values for this
property, and all a-nodes must exhibit a value for this property.
The value must be a set of c-nodes.
SIDP or OP?: SIDP
Semantics: The value is the node set which is the set of c-nodes
that serve as the C endpoints of the set of AC arcs of which the
a-node serves as the A endpoint.
Name: assertionType
Value type: node
Constraints on values: Only a-nodes exhibit values for this
property. The value must be a t-node.
SIDP or OP?: SIDP
Semantics: The value is the node, if any, that serves as the T
endpoint of the AT arc of which the a-node serves as the A
endpoint. If no value is exhibited, the assertion type of the
assertion of which the a-node serves as the nexus is unspecified.
Name: roleTypes
Value type: node set
Constraints on values: Only a-nodes exhibit values for this
property. The value must be a set of r-nodes.
SIDP or OP?: OP
Semantics: The value is the node set which is the set of r-nodes
that serve as the R endpoints of the set of CR arcs of which the
set of c-nodes serve as the C endpoints, which set of c-nodes
serve as the C endpoints of the set of AC arcs of which the
a-node serves as the A endpoint.
Name: players
Value type: node set
Constraints on values: Only a-nodes exhibit values for this
property. (There are no other constraints; any nodes can be
members of the node set.)
SIDP or OP?: OP
Semantics: The value is the node set which is the set of nodes that
serve as the x endpoints of the set of Cx arcs of which the set
of c-nodes serve as the C endpoints, which set of c-nodes serve
as the C endpoints of the set of AC arcs of which the a-node
serves as the A endpoint.
* Properties for which only c-nodes can exhibit
values:
Name: rolePlayer
Value type: node
Constraints on values: Only c-nodes exhibit values for this
property. There are no other constraints; any node can be the
value.
SIDP or OP?: SIDP
Semantics: This property may or may not exhibit a value. If it
does, the value is the node, if any, that serves as the x
endpoint of the Cx arc of which the c-node serves as the C
endpoint.
Name: roleType
Value type: node
Constraints on values: Only c-nodes exhibit values for this
property, and all c-nodes must exhibit a value for this property.
The value must be an r-node.
SIDP or OP?: SIDP
Semantics: The value is the node that serves as the R endpoint of
the CR arc of which the c-node serves as the C endpoint.
Name: assertion
Value type: node
Constraints on values: Only c-nodes exhibit values for this
property, and all c-nodes must exhibit a value for this property.
The value must be an a-node.
SIDP or OP?: SIDP
Semantics: The value is the node that serves as the A endpoint of
the AC arc of which the c-node serves as the C endpoint.
* Properties for which only r-nodes can exhibit
values:
Name: castingsOfRole
Value type: node set
Constraints on values: Only r-nodes exhibit values for this
property. All members of the node set must be c-nodes.
SIDP or OP?: OP
Semantics: The value is the node set which is the set of c-nodes
that serve as the C endpoints of the set of CR arcs of which the
r-node serves as the R endpoint.
* Properties for which only t-nodes can exhibit
values:
Name: assertionsOfType
Value type: node set
Constraints on values: Only t-nodes exhibit values for this
property, and all t-nodes must (by definition) exhibit a value
for this property. All members of the node set must be a-nodes.
SIDP or OP?: OP
Semantics: The value is the node set which is the set of a-nodes
that serve as the A endpoints of the set of AT arcs of which the
t-node serves as the T endpoint.
* Properties for which all kinds of nodes
(including but not limited to a-nodes, c-nodes,
r-nodes, and t-nodes) can exhibit values:
Name: rolePlayings
Value type: node set
Constraints on values: All nodes in the set must be c-nodes.
SIDP or OP?: OP
Semantics: The node set whose members are the c-nodes at the C
endpoints of the Cx arcs whose x endpoints are the node. If no
value is exhibited, then the node plays no roles in any
assertions.
* Properties for which only a-nodes, c-nodes,
r-nodes, and t-nodes can exhibit values:
Name: nodeType
Value type: enumeration
Constraints on values: Value must be one of "assertion", "casting",
"roleType", or "assertionType".
SIDP or OP?: SIDP
Semantics: Exhibits a corresponding value ("assertion", "casting",
"roleType", or "assertionType") when the node is an a-node,
c-node, r-node or t-node. When it exhibits no value, the node is
neither an a-node, nor a c-node, nor an r-node, nor a t-node.
|
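To illustrate how the sample properties above reflect assertion structure, the following sketch derives several of them from a set of arcs. The arc triples (arc type, first endpoint, second endpoint), the sample data, and the Python function names are assumptions for illustration; only the property names in the comments come from the list above.

```python
from typing import Optional, Set, Tuple

Arc = Tuple[str, str, str]  # (arc type, first endpoint node, second endpoint node)

# Hypothetical sample data: a single two-role assertion.
ARCS: Set[Arc] = {
    ("AT", "a1", "medical qualification"),
    ("AC", "a1", "c1"), ("AC", "a1", "c2"),
    ("CR", "c1", "MD degree holder"), ("CR", "c2", "institution"),
    ("Cx", "c1", "George"), ("Cx", "c2", "Harvard"),
}


def second_ends(arc_type: str, first: str) -> Set[str]:
    """Nodes at the second endpoint of arcs of this type that start at `first`."""
    return {b for t, a, b in ARCS if t == arc_type and a == first}


def first_ends(arc_type: str, second: str) -> Set[str]:
    """Nodes at the first endpoint of arcs of this type that end at `second`."""
    return {a for t, a, b in ARCS if t == arc_type and b == second}


def role_castings(a_node: str) -> Set[str]:          # roleCastings
    return second_ends("AC", a_node)


def assertion_type(a_node: str) -> Optional[str]:    # assertionType
    return next(iter(second_ends("AT", a_node)), None)


def role_types(a_node: str) -> Set[str]:             # roleTypes
    return {r for c in role_castings(a_node) for r in second_ends("CR", c)}


def players(a_node: str) -> Set[str]:                # players
    return {x for c in role_castings(a_node) for x in second_ends("Cx", c)}


def role_playings(node: str) -> Set[str]:            # rolePlayings
    return first_ends("Cx", node)


def node_type(node: str) -> Optional[str]:           # nodeType
    if second_ends("AC", node) or second_ends("AT", node):
        return "assertion"
    if first_ends("AC", node) or second_ends("CR", node) or second_ends("Cx", node):
        return "casting"
    if first_ends("CR", node):
        return "roleType"
    if first_ends("AT", node):
        return "assertionType"
    return None  # no value exhibited: the node is none of the four kinds
```

The OP-style properties (roleTypes, players) are computed here by composing traversals of the SIDP-style properties, which mirrors the nested wording of their Semantics entries.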