Re: searchable index of the web
Tony Sanders <sanders@bsdi.com>
Errors-To: sanders@bsdi.com
Message-id: <9306302030.AA09977@austin.BSDI.COM>
To: www-talk@nxoc01.cern.ch
Subject: Re: searchable index of the web
In-Reply-To: Your message of Wed, 30 Jun 93 15:59:37 EDT.
Reply-To: sanders@bsdi.com
Organization: Berkeley Software Design, Inc.
Date: Wed, 30 Jun 1993 15:30:47 -0500
From: Tony Sanders <sanders@bsdi.com>
> > I have written a perl script that wanders the WWW collecting URLs, keeping
> > tracking of where it's been and new hosts that it finds. Eventually,
Darn, I wanted to do that. So, how "big" is the Web? Can you figure out
stuff like "width" (distance between documents)?
Wouldn't it be better if you could just ask each server for its
connectivity? Seems like this would make things run a **lot** faster.
Since each server has local access to all the information it could
just find all the HREFs real quick, unique them and report to
someone else.
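
A rough sketch of what the server-side step might look like, in Python.
This is only an illustration under assumptions: the names HrefCollector
and unique_hrefs are made up here, and the documents are assumed to be
already read into strings. How a server would report the result to
"someone else" is left open.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the HREF value of every anchor tag in one document."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...>
        # and <a href=...> are handled the same way.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.add(value)


def unique_hrefs(documents):
    """Return the sorted, de-duplicated HREFs found in the given HTML strings."""
    found = set()
    for text in documents:
        parser = HrefCollector()  # fresh parser per document
        parser.feed(text)
        parser.close()
        found |= parser.hrefs
    return sorted(found)
```

The set does the "unique them" part; a real server would presumably walk
its own document tree and feed each file through the same function.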
>>>> Dale & TimBL <<<<
This would be a good topic to cover at the workshop.
I was shocked to see how few home pages I've visited. I really need to
get out more often :-) Then I noticed that they all have the :port
which means it's not the same. Marc, when doing annotations and
checking the "visited" list maybe you should ignore :80 on http:
servers?
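
Something along these lines, as a Python sketch. The function name
normalize_visited is hypothetical, and this is not a claim about how any
browser actually keeps its "visited" list. It only drops an explicit :80
from http: URLs and leaves everything else alone.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_visited(url):
    """Drop an explicit default port (:80) from http: URLs before comparing."""
    parts = urlsplit(url)
    netloc = parts.netloc
    # Only the default port on http: is removed; other ports and other
    # schemes are left untouched.
    if parts.scheme == "http" and netloc.endswith(":80"):
        netloc = netloc[: -len(":80")]
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Comparing normalize_visited(a) == normalize_visited(b) would then treat
a URL with :80 and the same URL without it as the same entry.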
We need to do something anyway. With annotations you can get really
lost in the Web.
--sanders