More on Indexing and Moving one higher than HTML etc

Paul Wain
Thu, 4 Aug 1994 16:57:14 +0200


Well there were a fair few responses the first time around, so thanks to all
the people who responded. Unfortunately (and please don't take this the wrong
way), there weren't really a lot of answers to what I asked :) So at the risk
of being repetitive, I'll start off with a quick summary and then move on to a
few further thoughts and questions.

Well after reading through all those messages it appears to me that a fair few
people have experienced, are experiencing, or are about to experience the same
problems that we are, and that there is really no common ground on the answers.

On going to a higher level of markup, the only specific answer I got was that
maybe we could use some form of word processor and convert down to HTML from
that. This is fine, except that for the amount of information we would be
looking at it isn't very practical. We have probably in excess of 2000
documents that would be going up. We really don't have enough disk space to
keep 2 copies (one word processor/one HTML), so the conversion would need to be
done on the fly, but that said, think of the poor CPU :) Okay, so it's a
hypothetical worst case, but that's what I am employed to come up with at the
moment.

There also seemed to be mixed views on ALIWEB but more on that in a moment.

Apart from this, however, there were very few actual answers :)

Some more thoughts and questions
I talked about wanting to keep certain information in a file that may or may
not be transparent (author, owner, keywords etc) and the more I think about it
the more I can see that people **won't** add this information to the files.
After all, why should they? It won't show up at the page view level, so people
tend not to look at the wider implications. How can we find a way, without
inventing a submissions system, to make people supply this information?
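
As a rough illustration of the kind of hidden information meant here, the
sketch below assumes it is stored in META elements in the document head. That
choice, the field names in REQUIRED_FIELDS, and the function names are all
assumptions made for the example, not something settled in this thread. It
reads the fields out of a document and reports which ones are missing, which is
the check any enforcement step would have to make.

```python
from html.parser import HTMLParser

# Hypothetical set of fields every document would be expected to carry.
REQUIRED_FIELDS = ("author", "owner", "keywords")


class MetaCollector(HTMLParser):
    """Collect NAME/CONTENT pairs from META elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands over tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.fields[name.lower()] = content.strip()


def extract_metadata(html_text):
    """Return a dict of the META fields found in html_text."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return collector.fields


def missing_fields(fields, required=REQUIRED_FIELDS):
    """List the required fields that are absent or empty."""
    return [name for name in required if not fields.get(name)]
```

A check like this only detects missing information; it does not answer the
question of how to get people to supply it in the first place.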

I'm fairly sure now that we will need to come up with our own indexing system.
Again this is due to the number of documents we are looking at. It would need
to be able to run on the files themselves rather than the HTTP output, it would
need to automatically update the files (so the users don't need to run it when
they add a file in), and as such it must be able to understand how to arrive at
the URL for the file. Is this doable? I can't see a way unless I can get around
the problem in the previous paragraph.
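
To make the "arrive at the URL for the file" step concrete, here is a minimal
sketch, assuming the simplest possible server setup: every document lives under
a single document root that maps directly onto one base URL. The names
path_to_url and build_index, the suffix list, and the example host below are
hypothetical. Per-user directories and server aliases are not handled and would
need their own mapping rules.

```python
import os
from urllib.parse import quote


def path_to_url(file_path, document_root, base_url):
    """Map a file under document_root to the URL it would be served at."""
    root = os.path.abspath(document_root)
    path = os.path.abspath(file_path)
    # Refuse files outside the served tree; they have no URL in this model.
    if os.path.commonpath([root, path]) != root:
        raise ValueError(f"{file_path} is not under {document_root}")
    relative = os.path.relpath(path, root)
    # Escape each path component separately so the "/" separators survive.
    url_path = "/".join(quote(part) for part in relative.split(os.sep))
    return base_url.rstrip("/") + "/" + url_path


def build_index(document_root, base_url, suffixes=(".html", ".htm")):
    """Walk the files themselves and return sorted (url, path) pairs."""
    index = []
    for directory, _subdirs, filenames in os.walk(document_root):
        for filename in filenames:
            if filename.lower().endswith(suffixes):
                full_path = os.path.join(directory, filename)
                url = path_to_url(full_path, document_root, base_url)
                index.append((url, full_path))
    return sorted(index)
```

Run periodically over the whole tree (for instance
build_index("/www/docs", "http://www.example.ac.uk/"), both values being
placeholders), this would pick up new files without the users having to do
anything, at the cost of rescanning everything each time.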

(I was going to include my bit on ALIWEB here but I can't access its home page
right now - timed out - but I think that the above should answer questions as
to why we think we can't use it here...)

Also there were very few ideas on how to track author and ownership. Does this
mean that no one has looked at this issue?

I must apologise for trying to push this discussion along, but we are currently
stuck for a lot of answers, and the structure we are going to end up with if we
can't resolve some of these issues is going to be horrendous. (The biology
skeleton pages went up today on our test server... they didn't really contain
much information but came in at around 50 pages, so I'm told. If we have 20 or
so departments doing this, I can't see that we can easily control the structure
of things AFTER they have happened, so we need answers now.)

Thanks for listening,


.--------Paul Wain ( X.500 Project Engineer and WWW Person at Brunel)---------.
| Brunel WWW Support: MPhil Email: |
| Work Email (default): (Brunel internal extn: 2391) |
| or |
`-------------------So much to fit in, and so little space!-------------------'