Re: An Anchor attribute question:

ccoprmm@oit.gatech.edu (Michael Mealling)
Errors-To: listmaster@www0.cern.ch
Date: Thu, 2 Jun 1994 18:41:29 +0200
Message-id: <199406021639.AA22281@oit.gatech.edu>
Reply-To: ccoprmm@oit.gatech.edu
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: ccoprmm@oit.gatech.edu (Michael Mealling)
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: An Anchor attribute question:
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Type: text/plain; charset=US-ASCII
Mime-Version: 1.0
X-Mailer: ELM [version 2.4 PL23]
Daniel W. Connolly said this:
> In message <199406021522.AA19445@oit.gatech.edu>, Michael Mealling writes:
> >Daniel W. Connolly said this:
> >> Actually, now that I think about it, If you're not going to include
> >> a redundant URL, why don't you just write:
> >> 
> >> 	<A HREF="URN:IANA:IETF:rfc/822"> ...</a>
> >> 
> >> ???
> >
> >This would work also. I would like to be able to make this distinction
> >in HTML though. Simply to keep in the spirit. There also seems to be
> >something in HTParse.c that is causing that example URN to be invalid
> >since HTParse.c: scan() function makes the assumption (which may be
> >a correct one according to the current URL spec) that no other colon
> >should exist beyond the first one. This is causing HTParse() to turn
> >the above into "URN:rfc/822" by basically looking at the first colon
> >READING BACKWARDS.
> >
> >Is this correct?
> 
> Well... it depends on how you want to look at it. The URI working
> group's definition of URL is
> 	scheme:anything
> 
> The WWW definition of URI (the contents of the HREF attribute) is:
> 	scheme://hostport/dir1/dir2;param=value?search#fragment
> where all the parts are optional, but only certain combinations
> make sense. (See
> http://info.cern.ch/hypertext/WWW/Addressing/URL/URI_Overview.html
> for details)
> 
> So any WWW URI is an IETF URL, but the converse isn't true.
> HTParse.c assumes you're handing it a URI.
> 
> Now if you define the syntax of URN to be:
> 	URN:anything
> then any URN is a URL, but it's not a URI.
> 
> It would make more sense to me to define the syntax of URNs
> such that they are also URIs. So instead of:
> 
>  	HREF="URN:IANA:IETF:rfc/822"
> you would write:
>  	HREF="URN://IETF.IANA/rfc/822"
> 
> 
> It's just an expedient measure to hasten deployment. The syntaxes
> have equivalent expressive power.

Ok, I've added a couple of lines to HTParse.c that fix this and a few
other things that the current URL spec breaks:

In scan(), I added these two lines just before the line
"after_access = name;":

    if(!strncmp(name,"URL:",4))
        name=name+4;

This handles the current URL spec's requirement of a "URL:" prefix in
front of a URL. Ordinary WWW URLs without the prefix still work as before.
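
For anyone who wants to try the prefix handling outside of HTParse.c, here
is a minimal standalone sketch of the same check. The helper name
skip_url_prefix() is hypothetical and does not exist in the library; it
only mirrors the two lines above. Note that the comparison is
case-sensitive, so "url:" would not be stripped.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of HTParse.c: returns a pointer just past
 * a leading "URL:" prefix, or the original pointer if there is none. */
static const char *skip_url_prefix(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "URL:", 4) == 0)
        return name + 4;        /* skip the four characters of "URL:" */
    return name;                /* no prefix: leave the string alone */
}
```

Because the helper only advances a pointer, the caller's string is never
modified.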

Next, in the first for loop, which scans for the scheme, I added a 'break;'
as illustrated:

    for(p=name; *p; p++) {
        if (*p==':') {
            *p = 0;
            parts->access = name;   /* Access name has been specified */
            after_access = p+1;
            break;                  /* <-- added: stop at the first colon */
        }
        if (*p=='/') break;
        if (*p=='#') break;
    }

This fixes the apparent small bug that causes URN:bla:bla: to get fouled up.
Everything else seems to work normally.
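
To make the effect of the 'break;' easier to check in isolation, here is a
minimal sketch of the scheme-splitting step on its own. The function
split_scheme() is a hypothetical stand-in, not code from HTParse.c, and it
assumes the caller passes a writable buffer. As far as I can tell, without
the early exit the loop keeps going and moves after_access past each later
colon, which is how "URN:IANA:IETF:rfc/822" ends up as "URN:rfc/822".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheme-scanning loop in scan().
 * Splits `name` in place at the first ':' that appears before any '/'
 * or '#'. Stores the scheme in *access and returns the remainder after
 * the colon. If no scheme is found, *access is set to NULL and `name`
 * is returned unchanged. */
static char *split_scheme(char *name, char **access)
{
    char *p;

    assert(name != NULL && access != NULL);
    *access = NULL;
    for (p = name; *p; p++) {
        if (*p == ':') {
            *p = '\0';          /* terminate the scheme string */
            *access = name;
            return p + 1;       /* stop at the FIRST colon */
        }
        if (*p == '/' || *p == '#')
            break;              /* no scheme before a path or fragment */
    }
    return name;
}
```

Called on a buffer holding "URN:IANA:IETF:rfc/822", this should leave the
scheme as "URN" and return a pointer to "IANA:IETF:rfc/822", so the later
colons survive.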

Can anyone see anything wrong with these two changes?

-MM
-- 
------------------------------------------------------------------------------
<HR><A HREF="http://www.gatech.edu/michael.html">
<ADDRESS>Michael Mealling</ADDRESS>
<ADDRESS>michael.mealling@oit.gatech.edu</ADDRESS></A>