Re: HTML -> ASCII?

neuss@igd.fhg.de
From: neuss@igd.fhg.de
X-Mailer-Igd: ## IGD.FHG.DE ## Tue, 9 Nov 93 14:14:56 +0100
Date: Tue, 9 Nov 93 14:14:52 +0100
Message-id: <9311091314.AA00965@wildturkey.igd.fhg.de>
To: www-talk@nxoc01.cern.ch
Subject: Re: HTML -> ASCII?
Dear fellow Webbers,

dale@ora.com (Dale Dougherty) wrote:
> The simplest approach is a sed script that removes HTML tags,
> that is, anything between a pair of angle brackets.
> s/<.[^>]*>//g

Yes, but that does not always work. For example, angle brackets can appear
inside a comment (opened with "<!--") or inside a <LISTING> foo </LISTING>
block. Some of us out here (including me; I'm working on an indexing tool)
need to strip out HTML markup. Ari Luotonen from CERN has proposed using
the command line browser for this purpose, but something smaller would
be preferable...

Does something like this exist? 
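
To make the two failure cases concrete, here is a minimal sketch of what
such a smaller filter might look like. It is written in Python rather than
sed, purely for illustration, and the names strip_html and TOKEN are made
up for this example. It assumes comments end at the first "-->" and that
<LISTING> blocks are not nested; it does not decode entities such as &lt;.

```python
import re

# One pattern, tried left to right at each position:
#   1. a comment, which is dropped together with everything inside it
#   2. a LISTING block, whose body is kept verbatim (brackets included)
#   3. any other tag, which is dropped
TOKEN = re.compile(
    r"<!--.*?-->"
    r"|<listing[^>]*>(?P<literal>.*?)</listing\s*>"
    r"|<[^>]*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_html(text):
    """Return text with tags and comments removed, LISTING bodies kept."""
    def replace(match):
        literal = match.group("literal")
        # Only the LISTING alternative sets "literal"; all else is deleted.
        return literal if literal is not None else ""
    return TOKEN.sub(replace, text)


if __name__ == "__main__":
    print(strip_html("<H1>Title</H1>"))
    print(strip_html("a <!-- <B>hidden</B> --> b"))
    print(strip_html("<LISTING> if (a < b && c > d) </LISTING>"))
```

A plain "delete everything between angle brackets" pass would leave
"hidden -->" behind in the second case and would delete "< b && c >" in
the third; trying the comment and LISTING patterns first avoids both.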

Cheers, Chris