This is getting off topic, but I fail to see how metadata could be used
for data that _is_ in the document itself, unless an author just copies
text from the document into a META section, which seems silly.
Personally, I would like to be able to "type" parts of my documents. A
trivial example would be:
<TYPE name="location/city">Hope</TYPE>
If a list of semi-standardized types (similar to MIME types) were agreed on,
search engines could then allow searches like "Where has Dr. Cromwell
mentioned cities?" Many words (like "Philadelphia") would have an
obvious primary type that could be listed in a table available to the
indexing software, but ambiguous words like "Hope" in the example above
would still need an explicit way to be disambiguated (unless AI
context-based discovery systems are much better now than those I've seen).
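
As a rough illustration of the indexing idea above, here is a minimal Python sketch. It assumes the hypothetical <TYPE name="..."> element from the example, and a made-up DEFAULT_TYPES table standing in for the list of obvious primary types; neither is part of any existing standard. Explicit markup is indexed first, and the table is consulted only for untagged words.

```python
import re
from collections import defaultdict

# Hypothetical table of "obvious" primary types for unambiguous words.
DEFAULT_TYPES = {"Philadelphia": "location/city"}

# Matches the hypothetical <TYPE name="...">...</TYPE> element.
TYPE_TAG = re.compile(r'<TYPE name="([^"]+)">(.*?)</TYPE>', re.DOTALL)


def index_types(text):
    """Map each type name to the set of terms carrying that type."""
    index = defaultdict(set)
    # Explicit author markup is taken at face value.
    for type_name, term in TYPE_TAG.findall(text):
        index[type_name].add(term.strip())
    # Remove the tagged spans, then fall back to the default table
    # for the remaining untagged words.
    untagged = TYPE_TAG.sub(" ", text)
    for word in re.findall(r"[A-Za-z]+", untagged):
        if word in DEFAULT_TYPES:
            index[DEFAULT_TYPES[word]].add(word)
    return index


doc = ('I drove from <TYPE name="location/city">Hope</TYPE> '
       'to Philadelphia, full of hope.')
cities = sorted(index_types(doc)["location/city"])
print(cities)  # ['Hope', 'Philadelphia']
```

The untagged lowercase "hope" is not indexed as a city, because only the explicitly typed occurrence and the table entry are. A real indexer would need a more careful tokenizer and a proper markup parser instead of a regular expression.
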
Unfortunately, I don't know enough about the area to speak with confidence,
but something like the above mechanism should be available to authors or
organizations that want to allow context-rich searches on documents.
It also seems desirable from a structural markup perspective to be able to
mark up by type with a style sheet. Rather than saying "this is emphasized
text" and elsewhere saying "I want to embolden emphasized text", you could
have real content markup by saying "this is a city" and elsewhere saying
"cities are italicized".
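
The "cities are italicized" idea could be sketched along these lines in Python. The STYLE_SHEET mapping and the apply_styles function are invented for illustration, and the <TYPE> element is the hypothetical one from the example earlier in this message. The document only says what a span is; the separate mapping decides how it looks.

```python
import re

# Hypothetical style sheet: content type -> presentation element.
STYLE_SHEET = {"location/city": "I"}

# Matches the hypothetical <TYPE name="...">...</TYPE> element.
TYPE_TAG = re.compile(r'<TYPE name="([^"]+)">(.*?)</TYPE>', re.DOTALL)


def apply_styles(text, styles):
    """Replace each typed span with the presentation its type maps to."""
    def render(match):
        type_name, content = match.group(1), match.group(2)
        tag = styles.get(type_name)
        if tag is None:
            # No rule for this type: emit the content unstyled.
            return content
        return f"<{tag}>{content}</{tag}>"
    return TYPE_TAG.sub(render, text)


source = 'We stopped in <TYPE name="location/city">Hope</TYPE>.'
print(apply_styles(source, STYLE_SHEET))  # We stopped in <I>Hope</I>.
```

Changing the presentation of every city then means editing one entry in the mapping rather than touching the documents.
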
Steven
-- Steven Fought UW Madison Computer Science Webmaster Computer Systems Lab