From LJM@SLACVM.BITNET Thu Aug 27 21:23:53 1992
Received: from vm.uni-c.dk by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
	id AA02681; Thu, 27 Aug 92 21:23:53 +0200
Message-Id: <9208271923.AA02681@dkuug.dk>
Received: from vm.uni-c.dk by vm.uni-c.dk (IBM VM SMTP V2R2) with BSMTP id 2423;
   Thu, 27 Aug 92 21:24:07 DNT
Received: from SLACVM.SLAC.STANFORD.EDU by vm.uni-c.dk (Mailer R2.07) with
 BSMTP id 8535; Thu, 27 Aug 92 21:24:06 DNT
Received: by SLACVM (Mailer R2.08 R208004) id 0883;
          Thu, 27 Aug 92 12:21:53 PST
Date: Thu, 27 Aug 1992   12:21 -0800 (PST)
From: "Len Moss"                                     <LJM@SLACVM>
To: "SC22/WG5 Mailing List"                        <SC22WG5@dkuug.dk>
Subject: Re: (SC22WG5.187) Processing Words, Part XLIV
X-Charset: ASCII
X-Char-Esc: 29

In-Reply-To: walt@netcom.com -- 08/27/92 08:56

[Since Walt has been sending his contributions to the SC22WG5 list
rather than just the X3J3 list, I'll start doing the same, and I
suggest others follow suit.]

> [stuff omitted]
>
>Len and Dick were cheating, of course; there must be something
>in there that indicates that BNF names are in italics (because
>there is a mixture of fonts in the syntax rules).

No, I wasn't cheating (at least I don't think I was, though I must
admit my knowledge of SGML is limited).  Here's what I wrote for
a production with mixed fonts:

}   <* rule 842>
}   <bnfdef> stop-stmt
}   <bnfalt> <lit>STOP</> [ stop-code ]
}

A quick glance at Appendix D shows that BNF tends to contain a lot
more non-terminals than terminals, so it makes sense for the
tagging inside BNF productions to default to non-terminals, with
exceptions noted (the "<lit>STOP</>"), rather than the other way
around.  On the other hand, non-terminals are relatively rare inside
constraints (or straight text), so there a non-terminal must be
explicitly tagged (the "<bnf>...</>" inside the constraint).  In
this example I chose to use "<lit>" (for "literal") and "<bnf>" (for
"BNF non-terminal") because I thought they were compact and fairly
clear.

The actual declarations for the <bnfalt> tag would indicate that it
must be followed by a sequence of elements, each of which could be
a BNF non-terminal, a BNF metasymbol ("[", "]", or "]..."), or a
BNF terminal.  They would also set up defaults (and perhaps, for
the metasymbols, so-called data tags, i.e., data characters that
imply their own tags) to minimize the amount of explicit markup
required.  An SGML parser would, however, associate a tag (and a
set of attributes, such as font information) with each word or
symbol of the example.
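
As a rough illustration of that defaulting, here is a minimal Python
sketch.  It is not an SGML parser and uses no real DTD; the function
name tag_bnfalt, the tag name "meta", and the tokenizing rules are
all assumptions made for this example.  It only shows how untagged
words could default to "bnf", how the metasymbols could imply their
own tag, and how only the literal needs explicit markup.

```python
import re

# Hypothetical token pattern (an assumption, not taken from any DTD):
#   <lit>...</>        an explicitly tagged terminal
#   "]...", "[", "]"   BNF metasymbols, which imply their own tag
#   anything else      defaults to a BNF non-terminal
TOKEN = re.compile(
    r"<lit>(?P<lit>.*?)</>"
    r"|(?P<meta>\]\.\.\.|\[|\])"
    r"|(?P<bnf>[^\s\[\]]+)"
)


def tag_bnfalt(content):
    """Return a (tag, text) pair for each word or symbol of a <bnfalt>."""
    tagged = []
    for match in TOKEN.finditer(content):
        kind = match.lastgroup  # name of the alternative that matched
        tagged.append((kind, match.group(kind)))
    return tagged


print(tag_bnfalt("<lit>STOP</> [ stop-code ]"))
# [('lit', 'STOP'), ('meta', '['), ('bnf', 'stop-code'), ('meta', ']')]
```

A formatter could then choose a font from the tag alone, which is
the point of keeping explicit font markup out of the source.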

>                                                   I believe the
>source document must contain both "style" tags (headings, paragraphs,
>etc.) and some actual typographic information (because, for example,
>the mathematical integral sign is not an ascii character).

An integral sign is really document content rather than either
structural _or_ typographic information.  In SGML, it would probably
be indicated by a string like "&int;".  A variety of sets of such
"publicly declared entities" are available for import into an SGML
document (for example, this declaration for the integral sign is
part of a set identified by the string, "-//Addison-Wesley//ENTITIES
Maths Symbols//EN").
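
To make the idea of an entity reference concrete, here is a small
Python sketch of the substitution a formatter might perform.  The
table ENTITIES is a hypothetical stand-in for an imported public
entity set, and mapping "int" to the Unicode INTEGRAL character
(U+222B) is an assumption about the output device, not something
the entity set itself dictates.

```python
import re

# Hypothetical stand-in for an imported entity set.  Only the name
# "int" comes from the discussion above; the replacement text is an
# assumption about how a formatter would render it.
ENTITIES = {"int": "\u222b"}

# General entity reference: "&", a name, then ";".
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def expand_entities(text, entities=ENTITIES):
    """Replace known entity references; leave unknown ones untouched."""
    def replace(match):
        return entities.get(match.group(1), match.group(0))
    return ENTITY_REF.sub(replace, text)


print(expand_entities("&int; f(x) dx"))
```

The source text carries only the name "&int;"; what it looks like on
paper is decided when the entity is resolved.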

Some typographic information will, however, inevitably be necessary
(to fix "bad" page breaks, etc.).  Even so, I think it should be
possible to hold it to an absolute minimum until we're actually
preparing the camera-ready copy.

--
Leonard J. Moss <ljm@slac.stanford.edu>   | My views don't necessarily
Stanford Linear Accelerator Center, MS 97 | reflect those of SLAC,
Stanford, CA   94309                      | Stanford or the DOE
