libwww2 bug

George Phillips <phillips@cs.ubc.ca>
Date: 30 Aug 93 21:20 -0700
From: George Phillips <phillips@cs.ubc.ca>
To: <www-talk@nxoc01.cern.ch>
Message-id: <6204*phillips@cs.ubc.ca>
Subject: libwww2 bug
Here's a little more info on the libwww2 bug I talked about in my previous
message.  When it gets a reply from an HTTP server, the library has to
read some bytes and parse the HTTP reply header, if any.  Of course, it
may read beyond the header, in which case it must pass that extra
data to the specific content-type handler.  Unfortunately, it uses something
equivalent to strcpy() to do this, so the copy stops at the first '\0'.
If the extra data happens to have a '\0' in it, a little piece near the
start of the data gets dropped.  Believe me, this plays havoc with GIF
and JPEG images :-).
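
To make the failure concrete, here is a minimal sketch of the two ways of
handing leftover bytes to a handler.  The struct and function names are
made up for illustration; they are not the libwww2 API, only stand-ins for
a string-style put and a length-based put.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sink standing in for a content-type handler. */
struct sink {
    unsigned char data[256];
    size_t len;
};

/* String-style put: relies on strlen(), so it stops at the first '\0'. */
void sink_put_string(struct sink *s, const char *str)
{
    size_t n = strlen(str);
    assert(s->len + n <= sizeof s->data);
    memcpy(s->data + s->len, str, n);
    s->len += n;
}

/* Block-style put: copies exactly n bytes, embedded '\0' bytes included. */
void sink_put_block(struct sink *s, const void *buf, size_t n)
{
    assert(s->len + n <= sizeof s->data);
    memcpy(s->data + s->len, buf, n);
    s->len += n;
}
```

Feeding both functions the first ten bytes of a GIF whose width field
contains a zero byte should show the string-style put delivering only the
bytes before the '\0', while the block-style put delivers all ten.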

I've got a completely untested patch for the problem.  I gather that
Marc A. probably has a patch too.  Either way, lots of clients already
have the bug and good servers can keep them working by putting a
TCP/IP packet boundary in after the header.  Under UNIX, this means
that the header must end on a write() boundary.  If you're using stdio,
you'll want to do something like this to output a header:

    printf("HTTP/1.0 200 Document follows\r\n");
    printf("MIME-Version: 1.0\r\nContent-Type: image/gif\r\n");
    printf("\r\n");
    fflush(stdout);

or, if you're a perl hacker:

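    print "HTTP/1.0 200 Document follows\r\n";
    print "MIME-Version: 1.0\r\nContent-Type: image/gif\r\n";
    $| = 1;         # turn on autoflush for the selected handle
    print "\r\n";   # ends the header; flushed right away
    $| = 0;         # back to normal buffering for the body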
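
If the server talks to the socket directly instead of going through
stdio, the same idea might look like the sketch below.  The helper names
write_all() and send_reply() are hypothetical, and the header text is just
the one from the examples above.  Note that a separate write() only makes
a packet boundary likely; the TCP stack has the final say.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Write all n bytes, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;

    assert(buf != NULL || n == 0);
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Send the header and the body with separate write() calls, so the
 * header ends on a write() boundary. */
int send_reply(int fd, const void *body, size_t body_len)
{
    static const char header[] =
        "HTTP/1.0 200 Document follows\r\n"
        "MIME-Version: 1.0\r\n"
        "Content-Type: image/gif\r\n"
        "\r\n";

    if (write_all(fd, header, strlen(header)) < 0)
        return -1;
    return write_all(fd, body, body_len);
}
```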

Here's my untested patch.  It may not work, but I think it will give
you a good idea of how to really fix the problem.


*** HTTP.c.orig	Mon Aug 30 21:00:12 1993
--- HTTP.c	Mon Aug 30 21:04:28 1993
***************
*** 381,387 ****
  **	We have to remember the end of the first buffer we just read
  */
      if (format_in != WWW_HTML) {
!         (*target->isa->put_string)(target, start_of_data);
  	HTCopy(s, target);
  	
      } else {   /* ascii text with CRLFs :-( */
--- 381,388 ----
  **	We have to remember the end of the first buffer we just read
  */
      if (format_in != WWW_HTML) {
!         (*target->isa->put_block)(target, start_of_data,
! 		length - (start_of_data - line_buffer));
  	HTCopy(s, target);
  	
      } else {   /* ascii text with CRLFs :-( */
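
The length arithmetic in the patch can be tried out in isolation.  This is
only a sketch: the variable names line_buffer, length and start_of_data are
borrowed from the patch, but the two functions are hypothetical and assume
the header ends with a blank CRLF line, which is not necessarily how
HTTP.c locates the start of the data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the first byte after the blank line that ends the
 * header, or NULL if the buffer holds no complete header. */
const char *find_start_of_data(const char *line_buffer, size_t length)
{
    size_t i;

    assert(line_buffer != NULL);
    for (i = 0; i + 4 <= length; i++) {
        if (memcmp(line_buffer + i, "\r\n\r\n", 4) == 0)
            return line_buffer + i + 4;
    }
    return NULL;
}

/* Number of bytes already read past the header: the same expression the
 * patch passes to put_block. */
size_t leftover_length(const char *line_buffer, size_t length,
                       const char *start_of_data)
{
    assert(start_of_data >= line_buffer);
    assert((size_t)(start_of_data - line_buffer) <= length);
    return length - (size_t)(start_of_data - line_buffer);
}
```

The point of the design is that the count comes from how many bytes were
read, never from scanning the data for a terminator, so a '\0' in the
image bytes cannot shorten it.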