Re: Dictionaries in HTML

Peter Murray-Rust (p.murray-rust@mail.cryst.bbk.ac.uk)
Wed, 7 Feb 1996 18:54:54 +0000 (GMT)

Bob,
Thanks very much for your recent message. I am very pleased to
see that you have put the topic of glossaries to the HTML-WG because
we are a virtual group that is also developing the use of glossaries.
(We have a mailing list (listproc@mail.cryst.bbk.ac.uk vsns-pps-glossary)
and I've copied your message to that).

Our primary purpose is to try to create a namespace for
glossaries and our approach is described at:
http://www.dl.ac.uk/CBMT/glossary/
We're concerned both with the namespace *inside* the glossary (we are
intending to use the very exciting new ISO standards, MARTIF and 12620)
and with the namespace for *accessing* the glossaries. I think
it's that area where your approach and ours come together.

I think that your approach and ours have started from slightly different
points but they overlap in the use of HTML and the hierarchical namespace.
We have suggested - as you will see - that all glossaries have a
namespace identifying (a) them as a glossary and (b) the type. For
example, "http://<host>/glossary-vhg/" is proposed as the leader of our
namespace and maybe you would like to suggest ways that our proposal and
yours come together here.
(BTW our whole approach is to share information so please see this as a
potential collaboration. Quite by chance I am frequently down at BBK so
we could have a drink together).

You have developed the use of HTML attributes much more than I have dared
since I am somewhat fearful of (say) using CLASS in one way and finding
out that it has been hijacked. I want to find what something *is* rather
than what colour of audible multimedia dynamic 3-D font it is rendered
in. :-) (Since I am developing a standard for chemistry I am sticking
precisely to HTML 2.x)

BTW we believe that it will be common for our community to use more than
one glossary in the same doc. I'd very much like to alias glossaries and
so I suggested this to Murray Maloney (who wrote the draft). He thought
this might be useful to post to the WG for comments. So the HTML might
look like (??)

<LINK REL="glossary_1" HREF="http://me.org/glossary-vhg/socsci/drugs">
<LINK REL="glossary_2" HREF="http://you.org/glossary-vhg/chem/drugs">

...
Amongst the list of <A REL="glossary_1" HREF="narcotics">narcotics</A>
is <A REL="glossary_2" HREF="heroin">heroin</A> ...

An advantage of this is that namespace changes in the *glossary address*
don't have to be transmitted through the whole document.
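
To make the aliasing idea concrete, here is a rough sketch in Python of
how a browser or script might expand an alias. It is only an illustration:
the function name resolve is made up, and joining the term to the glossary
address with "?" is my assumption, since the draft does not say how the
two are combined.

```python
from urllib.parse import quote_plus

# Alias table, as the two LINK elements above would declare it.
GLOSSARIES = {
    "glossary_1": "http://me.org/glossary-vhg/socsci/drugs",
    "glossary_2": "http://you.org/glossary-vhg/chem/drugs",
}


def resolve(alias, term_id, glossaries=GLOSSARIES):
    """Expand an (alias, term id) pair into a full lookup URL.

    Assumption: the term id is appended as a query string.
    """
    base = glossaries[alias]
    return base + "?" + quote_plus(term_id)


print(resolve("glossary_1", "narcotics"))
print(resolve("glossary_2", "heroin"))
```

If a glossary moves, only its entry in the alias table changes; every
anchor that names the alias stays as it was.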

On Tue, 6 Feb 1996, Bob Rosenberg wrote:

> >><!ENTITY % gloss "VAR | CITE | DFN | INFO">
>
> >I suggest you change INFO to SPAN, as proposed in the I18N and style
> >linking drafts.
>
> I don't think SPAN will work, as it does not imply any information
> markup. If anything, I'd rename INFO something more meaningful, like
> GLOSS, or just G.
>
> >Then create a DTD module ala Murray's work.
>
> In progress.
>
> >By the way: I don't see a description of the mapping to HTTP queries.
> >I assume that would go in the draft somewhere (ala the FORMS,
> >imagemap, and ISINDEX stuff in HTML 2.0).
>
> I forgot to mention that. The data is sent in the same way as if it were
> the data supplied to an ISINDEX query.
> Is the URL encoded in the same manner for non-http searches? ie gopher or ftp
>
> I haven't decided how the data should be sent by the POST method. I'm
> toying with the idea of:
> term=value&gloss=glossary_name

I think the way we see this is with every term having a unique id within
a glossary. In many cases (e.g. above) the term==the uniqueid. But
since we are also considering multilingual glossaries, clipped terms, etc
they may not always agree. My initial suggestion was

HREF=".../drugs?narcotic">

but yours looks more flexible. (We need a CGI-script anyway).
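
For comparison, the two forms could be generated roughly as below. This is
a sketch only: the function names are mine, the field names term and gloss
are taken from your message, and I am assuming ordinary form-style
encoding for both.

```python
from urllib.parse import quote_plus, urlencode


def isindex_url(base, term_id):
    """ISINDEX-style lookup: the encoded term id is the whole query string."""
    return f"{base}?{quote_plus(term_id)}"


def post_body(term, glossary_name):
    """Body for the POST form being considered: term=value&gloss=glossary_name."""
    return urlencode({"term": term, "gloss": glossary_name})


print(isindex_url("http://host/path", "The Iliad"))
print(post_body("www", "info.acronym.internet"))
```

The second form carries the glossary name alongside the term, so one
script could serve several glossaries.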

We are also excited about the use of Hyper-G for holding glossaries...

Looking forward to hearing.

P.

>
> Where value is the term to be searched (eg "www") and glossary_name is the
> name of the glossary (eg info.acronym.internet)
>
>
> The following is a summary of the proposal so far:
>
> NEW HTML
>
> Elements:
> <INFO> (possibly to be renamed <GLOSS> or <G>)
> -This is used for a generic glossary.
>
> Attributes:
> VALUE=cdata for INFO, DFN, VAR, and CITE
> -This specifies either the term to lookup, or it provides the
> definition of the term. The action depends on the CLASS
> attribute (see below)
>
> NEW HTML USAGE
>
> <LINK REL=Glossary TITLE=glossary_name HREF=url>
> This specifies the URL to use when looking up a term which is
> marked up by
> <tag CLASS=class.subclass>...</tag>
> glossary_name takes the format tag.class.subclass.
>
> <META NAME="Glossary-Default" CONTENT=glossary-name>
> This specifies the default glossary when looking up any term on a
> page not marked up by glossary tags.
> For example,
> CONTENT=dfn.internet
> would tell the browser to look up all unspecified terms in a
> dictionary of internet terms, as if they were inside
> <DFN CLASS=internet>..</DFN>
>
> <CITE CLASS=class>text</CITE>
>
> This tells the browser that the glossary cite.class should be used
> when looking up this instance of the term "text". If the CLASS
> attribute is not included, the glossary is simply "cite".
>
> (the elements CITE, DFN, INFO, and VAR all have the same
> format. In this discussion, CITE is used for the sake of simplicity)
>
> <CITE CLASS=class VALUE="alternative">text</CITE>
> This tells the browser to use "alternative" instead of "text" when
> looking up this instance of the term "text". This is to be used to
> make the displayed text more readable, if the glossary requires a
> less readable format. For example:
> <CITE CLASS=movie VALUE="Untouchables, The">The Untouchables</CITE>
>
> <CITE CLASS=class. VALUE="Displayed definition">text</CITE>
> The browser displays "Displayed definition" to the user when asked
> to look up this instance of the term "text". This behavior is
> done only when the CLASS attribute ends with a period (full stop).
> This is to save the trouble of looking up a term which can easily
> be defined in a document. For example:
> <VAR class=bnf. VALUE="$ | - | _ | @ | . | &amp; | + | -">safe</VAR>
>
>
> MANDATORY BROWSER BEHAVIOR
>
> All browsers which support glossaries must have the following behavior:
>
> -All behavior described in the sections NEW HTML and NEW HTML USAGE
> should be adhered to.
>
> -Glossary names are case insensitive and are constructed from tags as:
> <TAG CLASS=class> -> tag.class
>
> -Whitespace in classes should be collapsed into periods when generating
> glossary names. For example:
> <CITE CLASS="movie documentary"> -> cite.movie.documentary
> Whitespace at the end of the CLASS will translate to a name ending in a
> period, meaning that the VALUE should be used as the definition.
>
> -Any markup on a glossary term should be ignored in generating the search
> value. For example:
> <CITE>The <b>Iliad</b></CITE> -> "The Iliad"
>
> -The ALT attribute for images should be used for the search value when
> inside glossary tags. For example:
> <CITE>The <IMG SRC="Iliad.gif" ALT="Iliad"></CITE> -> "The Iliad"
>
> -The search value should be URL encoded in the standard manner for ISINDEX
> searches. For example:
> "The Iliad" -> http://host/path?The+Iliad
>
> -The browser must have some way to display the definition to the user.
>
> -The browser must have some way to let the user look up a term
>
> -The browser must be able to handle terms with multiple glossaries
>
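
As I read the naming and search-value rules above, they could be sketched
in Python roughly as follows. The names glossary_name, is_inline_definition
and search_value are invented for this illustration, and normalising
whitespace in the search value to single spaces is my assumption, not
something the proposal states.

```python
import re
from html.parser import HTMLParser


def glossary_name(tag, css_class=""):
    """Build a case-insensitive glossary name, e.g. 'cite.movie.documentary'."""
    name = tag.lower()
    if css_class:
        # Each run of whitespace, including a trailing one, becomes one period.
        name += "." + re.sub(r"\s+", ".", css_class.lstrip().lower())
    return name


def is_inline_definition(name):
    """A name ending in a period means VALUE holds the definition itself."""
    return name.endswith(".")


class _TextCollector(HTMLParser):
    """Collects text content, substituting ALT text for images."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        self.parts.append(data)


def search_value(fragment):
    """Strip markup from a glossary term and return the text to look up."""
    collector = _TextCollector()
    collector.feed(fragment)
    collector.close()
    # Assumption: runs of whitespace are reduced to single spaces.
    return " ".join("".join(collector.parts).split())


print(glossary_name("CITE", "movie documentary"))
print(search_value('<CITE>The <IMG SRC="Iliad.gif" ALT="Iliad"></CITE>'))
```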
>
> OPTIONAL BROWSER BEHAVIOR
>
> -Browsers may allow the user to define a set of glossaries. These would be
> combined with the glossaries defined by the page's author in some logical
> manner.
>
> -Browsers may allow users to have local glossaries. These are glossaries
> which exist on the user's system. One method of doing this would be to have a
> series of HTML files with definition lists. If a <DT> term matches the
> search term, the associated <DD> is returned.
>
> -Browsers may allow the user to choose the search glossary when a term has
> multiple glossaries.
>
> -Browsers can search multiple glossaries in a logical order to find the search
> term. One method would be to search all parent glossaries for the term
> (eg dfn.computer.unix -> dfn.computer -> dfn). This is called forward cascading.
> Another method would be to search all defined child glossaries for the term
> (eg cite.movie -> cite.movie.comedy. -> cite.movie.drama. -> ...)
> This is called reverse cascading.
>
> -Browsers can maintain a list of locally (on the page) defined definitions
> and apply it to all instances of the term. For example:
> <INFO CLASS=acronym. VALUE="World Wide Web">WWW</INFO>
> would define the term "www" for info.acronym, so that if
> <INFO CLASS=acronym>WWW</INFO>
> appears, the browser acts as if the VALUE was defined for it as well.
>
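
The forward cascading described in the list above amounts to walking up
the dotted name one level at a time. A small Python sketch, with the
function name forward_cascade invented for the purpose; dropping a
trailing period before splitting is my assumption:

```python
def forward_cascade(name):
    """List a glossary name followed by each of its parents, nearest first."""
    # Assumption: a trailing period (inline-definition marker) is ignored here.
    parts = name.rstrip(".").split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


for candidate in forward_cascade("dfn.computer.unix"):
    print(candidate)
```

A browser would try each name in turn and stop at the first glossary that
knows the term.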
>
> ------------
> Written by Bob Rosenberg
> B.Rosenberg@cs.uc.ac.uk
> http://www.cs.ucl.ac.uk/staff/B.Rosenberg/lex/
>
>

Peter Murray-Rust, Glaxo Research & Dev. (pmr1716@ggr.co.uk); (BioMOO: PeterMR)
Birkbeck College, ubcg09q@cryst.bbk.ac.uk, CBMT/Daresbury mbglx@seqnet.dl.ac.uk
http://www.cryst.bbk.ac.uk/PPS/index.html, http://www.dl.ac.uk/CBMT/HOME.html