Re: ACTION re: HTML 3: Too many tags!

Ian Graham (igraham@hprc.utoronto.ca)
Thu, 27 Jul 95 09:45:00 EDT

>
> On Wed, 26 Jul 1995, Ian Graham wrote:
> > I do not agree. HTML has a history of being a semantic markup
> > language, with legacy markup for physical rendering (B,I, TT, etc).
> > Although this is not always popular, it serves an important use, allowing
> > other document formats (RTF, PS, Word, etc) to be easily converted
> > to HTML. If you eliminate B,I,TT -- and yes, S, U, BIG and SMALL you make
> > this type of conversion extremely difficult, and in fact wrong -- to get
> > the formatting you want you would have to arbitrarily assign semantic
> > meaning to a string without knowing if that meaning is correct, or else
> > drop the formatting information entirely. I therefore argue that
> > these physical styles should be retained.
>
> But what if they were to be made available, as some have suggested, via
> attributes on (an)other element(s). This would allow for conversion but
> would make specification of the physical appearance auxiliary rather than
> it being the primary meaning of the element itself.
>
>
> James K. Tauber <jtauber@tartarus.uwa.edu.au>

My point is that, when converting legacy documents, there is often *no*
primary meaning -- you only have the physical style. In principle, you
could go through a document by hand and try to work out the context, but
this is often not tenable -- imagine trying to convert, by hand,
thousands of pages of LaTeX, Word or other documents.
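
To make the conversion point concrete, here is a minimal sketch in Python.
It is an illustration only: the (text, style) run format and the style
names are hypothetical and do not come from any real converter. It shows
that a converter which sees only physical styles can map them mechanically
onto B, I and TT, without guessing at meaning.

```python
from html import escape

# Hypothetical mapping from a legacy format's physical styles to HTML
# physical elements. The style names are assumptions made for this sketch.
PHYSICAL_TAGS = {"bold": "B", "italic": "I", "monospace": "TT"}


def runs_to_html(runs):
    """Convert (text, style) runs to HTML using physical elements only."""
    out = []
    for text, style in runs:
        tag = PHYSICAL_TAGS.get(style)
        body = escape(text)  # escape &, < and > in the source text
        # Unknown or missing styles are emitted as plain text.
        out.append(f"<{tag}>{body}</{tag}>" if tag else body)
    return "".join(out)


runs = [
    ("See ", None),
    ("foo.c", "monospace"),
    (" for the ", None),
    ("only", "italic"),
    (" copy.", None),
]
print(runs_to_html(runs))
```

Mapping "italic" to EM or CITE instead would force the converter to pick a
meaning that the source document never recorded.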

Therefore this mechanism would require a new element (say FONT, a la
Netscape) that is declared to have no semantic meaning and which would
take an attribute to indicate physical style. But -- B, I and TT are
already in common use, so those cannot be deprecated and replaced by
FONT. The same is true of U, to some degree. That means that you could
only use FONT for S, BIG and SMALL, which seems a small gain. Since we are
stuck with the legacy, it seems easier to just add three new physical
elements (*if they are deemed necessary!*) than to try to rework
an already implemented system.

This is a nontrivial debate, considering the kinds of things Netscape has
tried to slip in through their FONT element. (I used this name for a
reason!) So, the question becomes -- where does one stop with physical
formatting elements?

From what I know, this is not a problem with other SGML languages, as they
were, in most cases, designed for specific semantic markup purposes,
where every document would be tagged with the correct semantic elements.

Ian