Re: New Topic: HTML and the Visually Impaired [long]

yuri@sq.com (Yuri Rubinsky)
Date: Thu, 8 Sep 94 00:54:48 EDT
Message-id: <m0qibQS-000ES4C@sq.com>
Reply-To: yuri@sq.com
Originator: html-wg@oclc.org
Sender: html-wg@oclc.org
Precedence: bulk
From: yuri@sq.com (Yuri Rubinsky)
To: Multiple recipients of list <html-wg@oclc.org>
Subject: Re: New Topic: HTML and the Visually Impaired [long]
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
X-Comment: HTML Working Group (Private)

I hope this is actually of interest to more than just Terry
and me. At any rate, I think we're moving towards a clearer explanation
of what would actually be required. In practical terms, all I'm hoping is
to add five elements to the Proposed list. Everything else is up to the
browser implementors; we could recommend the aliasing of isomorphic
elements from ICADD to HTML, that's about all. I thank Terry for the fact
that I now have this much more succinct view of what's needed to enable
the print-disabled population to successfully interoperate with the Web.

It's probably about the right time to call for a show of hands. If enough
vote in favour, I'll create the appropriate text and when the draft spec
is in my hands, add the bits before sending it on to the next editor.

I apologise for the delay; I will try to finish the tables stuff off tomorrow.

(Responding to Terry's responses is kind of a full-time job!)

(meant in the best possible way...;-)


===================


Terry writes, quoting my previous mail:

> | In effect, since ICADD-tagged files
> | can be created from *any DTD with the fixed attributes* this would
> | allow any documents conforming to such DTDs to be rendered using
> | WWW browsers without having to convert them *both* into HTML and
> | ICADD. (The latter is what UCLA now does with its campus-wide
> 
> But if the HTML DTD has these atts, then conversion into HTML only
> would suffice.

Almost. The difficulty comes from the fact that if the HTML isn't rich enough
to support elements that are in the ICADD DTD but not HTML, then there
are problems for the Braille. (This is why, in a sense, we need so many
DTDs on the planet.)

For instance: Let's flog this sidebar a bit more. A textbook publisher
has a document with sidebars. There is a concept of a sidebar in the
Braille output as well. But we're asking them (if I understand Terry's next
comment correctly) to encode the file in HTML rather than ICADD.
So the fact that such-and-such is a sidebar gets lost: it is turned into
a paragraph perhaps (for display only), perhaps rendered as a pointer to a
separate file, perhaps as text set off by HR elements before and
after. However, the critical thing from the point of view of the
Braille renderer (and the large print version for that matter) is that
this was a sidebar and that fact is now lost. We've down-translated
too early, so to speak.

Most often we down-translate to HTML, and expect that that is the
last point of transformation; this is where the display is going to
happen so it doesn't matter if the fact that this was a <TASK> or
a <PARTNUM> gets lost. Same with ICADD. But each of them
has a richness that the other doesn't have (or need) in certain 
areas.
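
To make "down-translating too early" concrete, here is a minimal Python
sketch. It is not part of any ICADD or HTML tooling; the tables, the
function names and the sample text are all invented for illustration, and
only SIDEBAR and BOX come from this discussion.

```python
# Hypothetical down-translation tables. Element names other than SIDEBAR
# and BOX are illustrative and not taken from any particular publisher DTD.
HTML_TODAY = {"TITLE": "H1", "PARA": "P", "SIDEBAR": "P"}
HTML_WITH_BOX = {"TITLE": "H1", "PARA": "P", "SIDEBAR": "BOX"}


def down_translate(elements, table):
    """Rename each (tag, text) pair; tags missing from the table become P."""
    return [(table.get(tag, "P"), text) for tag, text in elements]


def collapsed_tags(table):
    """Return source tags that share a target and so cannot be told apart."""
    by_target = {}
    for source, target in table.items():
        by_target.setdefault(target, []).append(source)
    return sorted(s for group in by_target.values() if len(group) > 1 for s in group)


# Invented sample document as a flat list of (tag, text) pairs.
textbook = [
    ("TITLE", "Levers"),
    ("PARA", "A lever multiplies force."),
    ("SIDEBAR", "Archimedes and the lever."),
]

print(down_translate(textbook, HTML_TODAY))
print(collapsed_tags(HTML_TODAY))     # ['PARA', 'SIDEBAR']
print(collapsed_tags(HTML_WITH_BOX))  # []
```

With the first table the Braille renderer downstream can no longer tell the
sidebar from an ordinary paragraph; with a BOX element in the target it can.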
> 
> | information service.) Many books, particularly textbooks,
> | need to be transformed into the ICADD tagset in order to easily
> | be printed in Braille or fed into synthesized voice readers (such
> | as IBM's Book Manager which does a great job for visually impaired
> | people). Accordingly, since that text exists in that form, it seems
> | to me to make sense to be able to distribute those files in
> | electronic form for use with free browsers.
> 
> Right, but this could be in HTML instead of ICADD proper.  In
> other words, if HTML has ICADD fixed atts, it's a better presentation
> format than ICADD-DTD-encoding.
> 
A better presentation format for the Web perhaps, but not for
the other requirements -- unless we add that handful of elements
that will make HTML a true superset of ICADD! In which case,
people creating files for Braille directly can use ICADD and those
files will browse in a WWW browser. People creating files in
HTML will have them down-translated into ICADD for Braille
production and things like MENU, DIR, PLAINTEXT, META
and so forth will get translated into Braille-appropriate formats
using ICADD tagnames.

> (Collapsing the argument a bit, as I understand it better:)
> So you want to be able to run ICADD files through Mosaic, and
> need a few extra elements, such as BOX.

Yes. That's all it is really.
> 
> | > | This is a sidebar. Remember that most ICADD usage is for textbooks
> | > | which, in the modern style, are sidebar-rich. I don't think we
> | > | can actually leave it out if we want to support ICADD files. That
> | > | is, we have to do something with a file that has a SIDEBAR in it,
> | > | rather than just format it as a paragraph. HTML is pretty specific
> | > | about online presentation already. I'm not convinced that this
> | > | <HR> approach is so out of keeping.
> 
> I'm still uneasy about sidebars.  You are suggesting, I think,
> that the aliasing be just to HR/(P|UL|etc)+/HR; doesn't that mean
> that we can get by without an actual BOX element in HTML?  
> 
We're mixing apples and potatoes here. I was thinking of an actual
new element in HTML called BOX or SIDEBAR. But I was also
thinking of how implementors might choose to represent that using
existing capability (HR/(P|UL|etc)+/HR). However, there's nothing to
prevent them going much further with the concept and
implementing SIDEBAR as a cross-reference to text elsewhere,
or whatever.

The reason we can't get by without the actual BOX element in
HTML is that we're trying to avoid having to do a transformation
from ICADD. I'm talking about having a raw ICADD file readable
in WWW browsers just as if it were HTML -- because in fact
it would be a true subset of HTML (if you count the aliasing effects,
but that's purely browser implementation). (That is, it's not really
a true subset, but the effect is exactly the same. A browser would
read an ICADD file, if that file's extension was .html, as if it
were HTML. Perhaps it should be, in fact.)
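
The aliasing idea can be sketched roughly as follows, in Python. This is
only an illustration of what a browser implementor might do, under several
assumptions: the alias table is my own guess at the isomorphic pairs and
should be checked against the ICADD DTD, the input is assumed to be fully
tagged with no SGML minimization, attributes are simply dropped, and the
names ALIASES, IcaddAsHtml and icadd_to_html are invented for the example.
The HR-before-and-after rendering of BOX is just one possible choice.

```python
from html.parser import HTMLParser

# Assumed alias table for isomorphic elements (ICADD name -> HTML name).
# Illustrative only; verify the ICADD names against the actual tagset.
ALIASES = {"para": "p", "list": "ul", "litem": "li", "it": "i"}


class IcaddAsHtml(HTMLParser):
    """Re-emit an ICADD-tagged stream using HTML element names."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lower case; attributes are ignored.
        if tag == "box":
            self.out.append("<hr>")  # rule before the sidebar content
        else:
            self.out.append("<%s>" % ALIASES.get(tag, tag))

    def handle_endtag(self, tag):
        if tag == "box":
            self.out.append("<hr>")  # rule after the sidebar content
        else:
            self.out.append("</%s>" % ALIASES.get(tag, tag))

    def handle_data(self, data):
        self.out.append(data)


def icadd_to_html(text):
    """Return the input with aliased element names and BOX shown as rules."""
    parser = IcaddAsHtml()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)


sample = "<para>Main text.</para><box><para>A sidebar.</para></box>"
print(icadd_to_html(sample))
```

The point of the sketch is that the table lives in the browser, not in the
DTD: the DTD only has to admit BOX as an element.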

> Here's what makes me uneasiest:
> 
> | If there's nothing in the HTML DTD that matches the concept, then it
> | doesn't get used, that's all. But if it is, then it lets HTML authors
> | use the sidebar concept (since parallel text of this sort does
> | exist and is useful); it lets browser-makers implement, no doubt in
> | interesting ways, such a construct; and, when an ICADD file comes
> | along that uses BOX, it lets that be displayed.
> 
> I see HTML as (potentially, at least) a very useful set of presentation
> semantics for online rendering.  I can imagine an online sidebar 
> as another node/file linked to the original, and would rather
> map it that way.  If the element is put into the DTD people will start 
> using it and making up renderings for it when they may be able to get the
> same result more straightforwardly.  I don't use the term "Tag
> Abuse" often, but this would be a case of it.
>
If there is a stated semantic for BOX, then that's how it would be used.
I don't believe there is an equivalent right now for that result; if there
were I'd suggest a straight alias to that from the ICADD BOX.

I think "Tag Abuse" syndrome is actually the opposite. In my mind it's
the situation in which one uses the wrong element to achieve some
sort of visual effect. I'm saying there is an effect we want -- the concept
of the sidebar -- and concocting it any way but through a specific element
would be a case of Tag Abuse.
 
> | The State of Texas has established that textbook publishers must supply
> | texts (by a certain date) only in SGML. They prefer the AAP Book DTD
> | since it was designed by book publishers for trade books -- not because
> | it has the fixed attributes. Alternatively, a publisher can use *any*
> | DTD, insert the fixed attributes, and deliver files with ICADD markup.
> 
> Then we're home free.  As fixed atts are added to more and more 
> common DTDs, the need, even desirability, of encoding docs in 
> ICADD-DTD markup disappears.  

That's very true. ICADD markup is intended to be a temporary phase
that a document passes through on its way to a Braille printer etc. It is
absolutely *not* rich enough for storage and retrieval of valuable
information.

>                                  Do I understand correctly from this
> and your previous postings that while the ICADD DTD is an ISO
> standard it's not considered fully cooked (or more kindly, early
> revisions are considered useful)?  Rather different from most
> ISO standards.
> 
It's an "informative" part of the standard and appears in an Annex.
It intentionally leaves uncooked the table handling and the math,
but otherwise is considered to be stable. People have built Braille
software that reads ICADD files and people are using it in production
today. It seems to work just fine for its stated purposes.

> | All I'm hoping will happen is for HTML to include the handful of ICADD
> | elements that don't map directly to existing HTML elements; for 
> | browser-makers to alias the handful of isomorphic elements; and for
> | agreement from everyone (and comments thus far suggest that this *is*
> | agreed) that we can publish HTML 2.0 with the SDA attributes built in.
> | (That's work I'll do at whatever moment seems right.)
> 
> I now understand much better, Yuri; let's mull over this mapping
> problem some more and see if we can get a better result.  Would it
> be too difficult to map-on-the-fly a BOX to a new HTML instance
> with a link to it that has "See Sidebar" as its hot spot?  After
> all, a sidebar is really only out-of-flow material, often allowed
> to float in the context of page composition; we don't need to
> worry about floats online, and we can reproduce the effect of
> being out of flow by linking.

No, I don't think it would be too difficult, but my position is that this
should be a question for a browser implementor, not for a DTD writer.
That's presentation; as far as I'm concerned, SIDEBAR is a *logical
construct* which, along with LIST HEADING, AUTHOR, INK PRINT
PAGE and PAGE REFERENCE, makes sense in a forthcoming HTML
-- even if only in an ICADD marked section which is defaulted to
INCLUDE.
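
For what it's worth, Terry's map-on-the-fly suggestion, treated as one
browser-side presentation choice among several, might look something like
the Python sketch below. Everything in it is hypothetical: the function
name, the flat (tag, text) representation of a document, and the
sidebar1.html naming scheme are invented for the example, and text is not
escaped.

```python
def split_sidebars(elements, base="sidebar"):
    """Move each BOX into its own document and leave a link behind.

    `elements` is a flat list of (tag, text) pairs. Returns the main
    document as a string plus a dict of file name -> sidebar content.
    """
    main, extras = [], {}
    for tag, text in elements:
        if tag == "BOX":
            # Invented naming convention: sidebar1.html, sidebar2.html, ...
            name = "%s%d.html" % (base, len(extras) + 1)
            extras[name] = "<P>%s</P>" % text
            main.append('<P><A HREF="%s">See Sidebar</A></P>' % name)
        else:
            main.append("<%s>%s</%s>" % (tag, text, tag))
    return "\n".join(main), extras


doc = [("P", "Main text."), ("BOX", "A sidebar."), ("P", "More main text.")]
main_html, sidebars = split_sidebars(doc)
print(main_html)
print(sidebars)
```

Note that this only works because the input still says BOX; the decision to
link rather than rule off the text is made at display time.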



Yuri