Re: converting URLs in .html files

dcmartin@library.ucsf.edu (David C. Martin)
Message-id: <199308311631.AA20114@library.ucsf.edu>
From: dcmartin@library.ucsf.edu (David C. Martin)
Organization: UCSF Center for Knowledge Management
Email: dcmartin@ckm.ucsf.edu
Phone: 415/476-6111
Fax: 415/476-4653
To: kevin@scic.intel.com (Kevin Altis)
Cc: www-talk@nxoc01.cern.ch
In-reply-to: Your message of Mon, 30 Aug 1993 17:24:56 -0800
	<9308310027.AA24018@rs042.scic.intel.com> 
Subject: Re: converting URLs in .html files 
Date: Tue, 31 Aug 1993 09:30:39 PDT
Sender: dcmartin@library.ucsf.edu
Do you mean something like an automatic "tar" of a hierarchy, with
controls for depth, network hops, etc.?

dcm
--------
Kevin Altis writes:

Has anyone dealt with automatically converting the URLs within HTML files
so that you could take a set of files like the Library of Congress Vatican
Exhibit and use them off a local HTTP server rather than across the
Internet? This would be especially useful for doing WWW demos which are
time constrained and thus best run on a local Ethernet. This would also be
the simplest way to change the locations of sets of HTML, GIF images,
PostScript, etc. when the "exhibit" is moved to another server or directory
location, even if it is in the same domain as the original. URLs could also
be changed to use "file" instead of "http" in most cases so that a file set
could be tested on a local disk (say a PC or Mac) without the need for a
server or TCP/IP connection. For HTML created for CD-ROM or some other
static medium, conversion from "file" to "http" or whatever would be
necessary if the files are made net accessible.

I suspect it would still be necessary to recreate the folder hierarchy of
the source server to some extent, but that would be part of the conversion
process. Maybe a utility (perl script anyone?) could be written to match a
set of files, change URLs appropriately, then tar all the files. Or given a
tar archive, change all the URLs after the files were extracted.
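
A minimal sketch of such a utility follows, written in Python rather than
Perl. It is only an illustration under stated assumptions: the function
names rewrite_urls and convert_tree are hypothetical, only quoted HREF and
SRC attribute values are handled, and a URL is changed only when it begins
with the given old prefix. The example.org addresses in the usage note are
placeholders, not real exhibit locations.

```python
import re
import tarfile
from pathlib import Path

# Matches quoted HREF="..." or SRC="..." values. Unquoted attribute values
# are not handled; that is a simplification made for this sketch.
_ATTR = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''',
                   re.IGNORECASE | re.DOTALL)


def rewrite_urls(html, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix at the start of each HREF/SRC value."""
    def swap(match):
        attr, quote, url = match.groups()
        if url.startswith(old_prefix):
            url = new_prefix + url[len(old_prefix):]
        return attr + quote + url + quote
    return _ATTR.sub(swap, html)


def convert_tree(root, old_prefix, new_prefix, archive):
    """Rewrite every .html file under root in place, then tar the tree.

    The archive path should lie outside root so the tar file does not
    try to include itself.
    """
    root_path = Path(root).resolve()
    for path in sorted(root_path.rglob("*.html")):
        # latin-1 maps every byte to a character, so the files round-trip
        # unchanged apart from the rewritten URLs.
        text = path.read_bytes().decode("latin-1")
        path.write_bytes(
            rewrite_urls(text, old_prefix, new_prefix).encode("latin-1"))
    with tarfile.open(archive, "w") as tar:
        tar.add(str(root_path), arcname=root_path.name)
```

For a local demo one might call, for instance,
convert_tree("exhibit", "http://example.org/exhibit/",
"file:///demo/exhibit/", "exhibit.tar"). Links that point outside the old
prefix are left untouched, so they would still go out across the network.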


Kevin Altis
Intel Corporation
Supercomputer Systems Division
Internet: kevin@scic.intel.com