GIF replacement action from comp.graphics

Robert S. Thau (rst@ai.mit.edu)
Tue, 10 Jan 1995 04:30:10 +0100

Some notes from the USENET side of things:

While www-talk was (so far as I could tell) adjourned through the
first week of January, there was quite a bit of discussion of
alternatives to GIF on various USENET newsgroups; comp.graphics seems
to be getting most of the interesting traffic. The two alternatives
about which there is most talk are:

1) Modified GIF, with LZW replaced by an unencumbered compression
scheme. (A company from Telegrafix has even offered to make code
for GIF-LZW+LZ77 (gzip compression) freely available).

2) A new format, tentatively called PBF (although there's been a whole
lot of hoo-hah about alternative names), being spec'ed out by Tom
Boutell. I'm including the most recent draft spec that's arrived
here below.

The Boutell proposal in particular has seen a *lot* of comment; people
discussing other alternatives here should probably at least be aware
of it.

One further note from the general discussion --- in answer to why yet
*another* new format, the general response seems to be that no other
common format has *all* the features of GIF that matter to people
(transparency, interlacing, good compression for icons, etc.), so it's
either a modified GIF or something totally new; we might as well see
what the latter would look like. I'm not endorsing this position
myself --- just letting you all know that it's out there.

Your faithful (well, here's hoping) scribe
rst

PBF draft spec included below:
----------------------------------------------------------------

From boutell@netcom.com Mon Jan 9 20:29:30 EST 1995
Article: 41724 of comp.graphics
Newsgroups: comp.graphics,comp.compression,comp.infosystems.www.providers
Path: ai-lab!bloom-beacon.mit.edu!gatech!howland.reston.ans.net!ix.netcom.com!netcom.com!boutell
From: boutell@netcom.com (Thomas Boutell)
Subject: PBF (Portable Bitmap Format): Second Draft
Message-ID: <boutellD21trC.EnG@netcom.com>
Followup-To: comp.graphics
Organization: Nerdsholm
Date: Sat, 7 Jan 1995 18:52:24 GMT
Lines: 418
Xref: ai-lab comp.graphics:41724 comp.compression:13903 comp.infosystems.www.providers:13699

PBF (Portable Bitmap Format) Specification, Second Draft

By Thomas Boutell, boutell@netcom.com, 1/7/1995

This is the second draft of the PBF specification discussion
document, replacing the first draft and taking many suggestions into
account. However, I have refrained from adding many suggested
capabilities in order to keep the format simple and
reasonably easy to implement. There are many significant
changes from the first draft, particularly the addition
of truecolor.

This draft proposes use of the inflate/deflate compression scheme,
an LZ77 derivative which is used in zip, gzip, pkzip and
related programs, because extensive research has been done
supporting its legality. However, there is still room for
discussion of the specifics, and for the proposal of
other schemes. The author is not a compression
maven, and is not in a position to personally write
reference inflate/deflate code. inflate.c in the gzip
package is not GPLed, but deflate.c is, so
commercial vendors of PBF-creating programs will want to
roll their own. Of course, an unrestricted specification of the
compression scheme is provided in the pkzip package, so there is no
barrier to doing this. (See algorithm.doc in the gzip package
for references.)

(Does anyone know of a completely copyrightless version of
deflate.c? It would lower the ante.)

This draft is intended solely to generate comments and
does not represent the final standard.

Hello, Compuserve folks! This draft will be posted to Compuserve as
well as to comp.graphics.

Data Representation Note

All integers which are not 1 byte integers will be in
network byte order, which is to say the most significant
byte comes first, and the less significant bytes in
descending order of significance (simply MSB LSB
for two-byte integers, B3 B2 B1 B0 for 4-byte
integers). References to bit 7 refer to the
highest bit (128) of a byte; references to
bit 0 refer to the lowest bit (1) of a byte.
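
As an illustration of the byte and bit conventions above, here is a
minimal Python sketch (not part of the draft). The helper name bit()
is made up for this example; struct is the Python standard library
module.

```python
import struct

# Network byte order: most significant byte first.
two_byte = struct.pack(">H", 0x1234)        # MSB LSB
four_byte = struct.pack(">I", 0x0A0B0C0D)   # B3 B2 B1 B0

def bit(byte, n):
    """Return bit n of a byte; bit 7 is the 128s place, bit 0 the 1s place."""
    return (byte >> n) & 1
```

Here two_byte holds the bytes 0x12 0x34 in that order, and
bit(128, 7) is 1.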

The Format

The Identification Header

The first four bytes always contain the following
ASCII characters:

.PBF

(The dot is included to avoid confusion with files
such as this one which discuss PBF as opposed to
being PBF files themselves.)

The Main Section

The remainder of the file consists of a series of
chunks, where each chunk consists of a 2-byte,
UNSIGNED chunk type (ranging from 0 to 65535),
a 4-byte, UNSIGNED length (not including itself or the
chunk type), and the data bytes appropriate to that
chunk, if any. Note that this provides for a chunk
to be skipped even if the implementation does not
recognize that particular chunk type.
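
The header and chunk layout described so far can be read with a few
lines of code. The following is a rough Python sketch under the
assumptions stated in this draft only; the function name read_chunks
and the error handling are illustrative choices, not part of the
specification.

```python
import struct

MAGIC = b".PBF"  # the four-byte identification header

def read_chunks(stream):
    """Yield (chunk_type, data) pairs from a binary stream holding a PBF file."""
    if stream.read(4) != MAGIC:
        raise ValueError("missing .PBF identification header")
    while True:
        header = stream.read(6)
        if not header:
            return  # clean end of file
        if len(header) < 6:
            raise ValueError("truncated chunk header")
        # 2-byte unsigned type, then 4-byte unsigned length, both big-endian.
        chunk_type, length = struct.unpack(">HI", header)
        data = stream.read(length)
        if len(data) < length:
            raise ValueError("truncated chunk data")
        yield chunk_type, data
```

Because the length is always read before the data, a reader can step
over a chunk without knowing what its type means.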

Chunk Ordering

Chunks must appear in ascending order (by chunk type) in
the file. This enforces the common-sense requirement that
dimensions, etc., appear before image data, while
simultaneously leaving room for extensions that are
also important or necessary to know about before the image data
itself arrives (note that I'm allowing for streaming data).
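
A reader could verify the ordering rule with a check like the Python
sketch below. The draft does not say whether two chunks of the same
type may appear in one file, so this sketch ASSUMES repeats are
allowed and only rejects a type that decreases; the function name is
hypothetical.

```python
def check_chunk_order(chunk_types):
    """Return True if the chunk types never decrease from one chunk to the next.

    Assumption: equal neighbouring types are accepted, since the draft
    does not rule out repeated chunks.
    """
    types = list(chunk_types)
    return all(a <= b for a, b in zip(types, types[1:]))
```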

Necessary and Ancillary Chunks

Even-numbered chunks are necessary in order to properly
display the contents of the file. If an implementation
encounters an even-numbered chunk type it does not know
how to handle, it must indicate this to the user and
either not display the image or warn the user that
the file contains an extension the program does not
understand. Even-numbered chunks are referred to as
"necessary" chunks from now on. The dimensions chunk
is an example of a necessary chunk. A hypothetical
vector-graphics chunk would also be a necessary
chunk, since without rendering it the image would appear
to be blank, or would contain a background bitmap
with no other information.

Odd-numbered chunks are ancillary information that enhances
the image in some fashion, but without which the image
can still be successfully displayed.
Examples are the comment, copyright and gamma-
correction chunks.
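
The even/odd rule reduces to a parity test. A small Python sketch
follows; the function names and the "warn"/"skip" return strings are
invented for illustration and are not defined by the draft.

```python
def is_necessary(chunk_type):
    """Even-numbered chunk types are necessary; odd-numbered ones are ancillary."""
    return chunk_type % 2 == 0

def action_for_unknown_chunk(chunk_type):
    """Decide what a reader does with a chunk type it has no handler for.

    "warn" stands for telling the user (and either refusing to display
    the image or displaying it with a warning); "skip" means the chunk
    can be ignored silently.
    """
    return "warn" if is_necessary(chunk_type) else "skip"
```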

Proprietary Chunks

If you want others outside your organization to understand
a chunk type that you invent, CONTACT THE AUTHOR
OF THE PBF SPECIFICATION (boutell@netcom.com) and
specify the format of the chunk's data and your
preferred chunk type. The author will assign a permanent,
unique chunk type. The chunk type will be publicly listed
in an appendix of extended chunk types which can be
optionally implemented. In the event that Mr. Boutell
is unable to maintain the specification, the task will
be passed on to a qualified volunteer.

If you do not require that others outside your
organization understand the chunk type, you may
use a chunk type between 49152 and 65535. Values
in this range will never be assigned in the
public specification. Please note that if you
want to use these chunks for information that is
not essential to view the image, and have any
desire whatsoever that others not using your
internal software be able to view the image,
you should use ancillary (odd) chunk types.
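
A sketch in Python of how an implementation might pick and recognize
private chunk types. The range comes from the text above; the helper
pick_private_type and its index parameter are hypothetical
conveniences, not part of the draft.

```python
PRIVATE_FIRST = 49152  # lowest chunk type reserved for private use
PRIVATE_LAST = 65535   # highest chunk type reserved for private use

def is_private(chunk_type):
    """True if the chunk type lies in the range never assigned publicly."""
    return PRIVATE_FIRST <= chunk_type <= PRIVATE_LAST

def pick_private_type(index, essential):
    """Return the index-th private chunk type of the requested kind.

    Essential chunks get even numbers, non-essential ones odd numbers,
    so that outside viewers can still skip the non-essential ones.
    """
    chunk_type = PRIVATE_FIRST + 2 * index + (0 if essential else 1)
    if not is_private(chunk_type):
        raise ValueError("index is outside the private range")
    return chunk_type
```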

Required Chunks

All PBF implementations must understand the following
chunk types in order to be considered
PBF-compliant. All implementations must understand
and successfully render the even-numbered (necessary)
chunks below. Standalone image viewers
should also be capable of displaying the ancillary
chunks below, such as the copyright notice,
but this is not necessary for applications in which
many images may be displayed at once (i.e.,
WWW browsers).

Chunk Type Description

0 Dimensions (data is x and y, 2 bytes apiece;
maximum dimensions are 65535x65535)

8 Bit depth. Data is one byte, containing the value
1, 2, 4, 8 or 24. The first four will have palettes
unless they are grayscale; 24 signifies truecolor,
in which each pixel will have a complete color specification.
Any bit depth other than 1, 2, 4, 8 or 24 is an error
(see note #1).

16 Grayscale. If this chunk is present and the bit depth
is 1, 2, 4 or 8, no palette chunk will follow, and
the bitmap should be interpreted as a linear grayscale,
where 0 is considered black and (2^bitdepth)-1 is
considered white. If the bit depth is 24, then this
chunk should not appear and should be regarded as
an error if it does (the program may of course
warn the user and display the image anyway).

24 Palette
This chunk appears only for 1, 2, 4, and 8-bit depths,
and only when the grayscale chunk does not appear.
The number of entries in the palette will not exceed
2^bitdepth, and may be smaller than that value.
(Determine the size of the palette by dividing
the chunk length by 3.) This chunk consists of
a series of RGB values, consisting of a red byte
(0 represents no red, 255 represents full red),
a green byte, and a blue byte for each entry.
For optimum compression, colors which are
similar should have adjacent palette values,
but this is not a requirement for compliance
with the PBF standard. Any value appearing in the
bitmap data that is beyond the highest index in the
palette is an error.
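
The palette rules above can be sketched in Python as follows. The
function name parse_palette is made up for this example, and treating
a length that is not a multiple of 3 as an error is an assumption
(the draft does not discuss it).

```python
def parse_palette(data, bit_depth):
    """Split palette chunk data into a list of (red, green, blue) tuples."""
    if len(data) % 3 != 0:
        # Assumption: a partial trailing entry is treated as an error.
        raise ValueError("palette length is not a multiple of 3")
    entries = [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
    if len(entries) > 2 ** bit_depth:
        raise ValueError("palette has more entries than the bit depth allows")
    return entries

def check_index(pixel_value, palette):
    """Raise if a pixel refers to an entry beyond the end of the palette."""
    if pixel_value >= len(palette):
        raise ValueError("pixel value beyond the highest palette index")
```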

32 Interlace
When this chunk is present, the image data will
be interlaced. This means that rows will be stored
in the following order:

Every eighth row starting from row 0, then every eighth
row starting from row 4, then likewise from rows 2, 6,
1, 3, 5 and 7.

For example, if the image contains 23 rows, they
will be stored in the following order (considering
the first row to be row 0 and the last row to
be row 22):
0 8 16 4 12 20 2 10 18 6 14 22 1 9 17
3 11 19 5 13 21 7 15

The purpose of this feature is to allow images
to "fade in" in a simple fashion that does
minimal damage to compression efficiency
(there is some loss of compression
efficiency, however).
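
The stored row order can be generated mechanically. A short Python
sketch (the function name is illustrative only):

```python
def interlaced_row_order(height):
    """Return the row numbers in the order an interlaced image stores them."""
    order = []
    # Each pass takes every eighth row, starting from a different offset.
    for offset in (0, 4, 2, 6, 1, 3, 5, 7):
        order.extend(range(offset, height, 8))
    return order
```

For a 23-row image, interlaced_row_order(23) reproduces the sequence
listed above, from 0 8 16 through 7 15.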

48 Transparency. (Note that this chunk is necessary;
without understanding it, your program will fail
to read truecolor images with alpha channels!)

FOR PALETTE AND GRAYSCALE IMAGES:
The chunk data consists of a
1-byte index into the palette for palette-color
images, or the 1-byte grayscale level to be
regarded as transparent for grayscale images.

FOR 24-BIT TRUECOLOR IMAGES:
No chunk data (length 0). The presence of this
chunk indicates that the image data will include
an alpha channel in addition to the red, green
and blue channels. Note that this chunk is
necessary, since it affects the encoding of
the truecolor data.

73 Copyright notice. The notice will consist of
ASCII text and will not be null-terminated.
New lines should be denoted by a single
line feed (ascii 10 decimal).

89 Comment. The comment will consist of
ASCII text and will not be null-terminated.
New lines should be denoted by a single
line feed (ascii 10 decimal).

16384 Uncompressed image data
(This chunk is intended for use only in applications
which require high image-reading speed and in which
communications bandwidth is NOT a factor.)

The chunk data consists of the pixel data for the
image.

PALETTE AND GRAYSCALE IMAGES (bitdepths 1, 2, 4, and 8)

For 1-bit palette or grayscale images,
each horizontal line of pixels is represented
by a stream of bits, in which bit 7 (128) is the
leftmost pixel in the byte and bit 0 (1) is the
rightmost. CONSECUTIVE LINES NEVER SHARE A BYTE.
That is, if the last pixel of the line falls
in bit 4 of a byte, the first pixel of the next
line is stored in bit 7 of the next byte, NOT
in bit 3 of the same byte. The pixel value
is an index into the palette, unless the
grayscale chunk is present.

NOTE: rows appear consecutively from the top row
unless the interlace chunk is present; see the discussion
of the interlace chunk for more information about the
order of interlaced rows. You must be able
to understand interlaced images to comply with the
specification, and the scheme is simple.

For 2-bit palette or grayscale images,
the same scheme is followed, except that
each pixel is represented by a 2-bit portion
of a byte, with the leftmost bit being most
significant. For instance, the first pixel
of the line is represented by bits 7 (128) and
6 (64) of the byte. Again, consecutive lines
do NOT share bytes (see the description of
1-bit images, and note the possibility
of interlacing).

For 4-bit palette or grayscale images,
the same scheme is followed, except that
each pixel is represented by a 4-bit portion
of a byte, with the leftmost bit being most
significant. For instance, the first pixel
of the line is represented by bits 7 (128),
6 (64), 5 (32) and 4 (16) of the byte. Again,
consecutive lines do NOT share bytes (see the
description of 1-bit images, and note the
possibility of interlacing).

For 8-bit palette or grayscale images,
each pixel is represented by a single byte.
(Note the possibility of interlacing and
its effect on the order of lines.)
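
The packing rules for 1, 2, 4 and 8-bit rows can be sketched in
Python as below. The names row_stride and unpack_row are invented for
this example.

```python
def row_stride(width, bit_depth):
    """Bytes used by one row; rows are padded so no two rows share a byte."""
    return (width * bit_depth + 7) // 8

def unpack_row(row_bytes, width, bit_depth):
    """Return the pixel values of one row for bit depths 1, 2, 4 or 8."""
    per_byte = 8 // bit_depth
    mask = (1 << bit_depth) - 1
    pixels = []
    for x in range(width):
        byte = row_bytes[x // per_byte]
        # The leftmost pixel occupies the highest bits of the byte.
        shift = 8 - bit_depth * (x % per_byte + 1)
        pixels.append((byte >> shift) & mask)
    return pixels
```

For instance, a 4-pixel-wide 1-bit row still takes a whole byte, and
the next row starts at bit 7 of the following byte.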

TRUECOLOR IMAGES (bitdepth 24)

For truecolor images, each row of pixels is represented
by either three or four consecutive bands of bytes
(one band per channel), depending on the presence or
absence of the transparency chunk.

For example, if the image is 80 pixels across,
the first (top) row consists of:

80 consecutive bytes representing the red values for each pixel
80 consecutive bytes representing the green values for each pixel
80 consecutive bytes representing the blue values for each pixel

And, if the transparency chunk is present,

80 consecutive bytes representing the alpha channel values
for each pixel (see below)

Rows appear consecutively beginning from the top row,
unless the interlace chunk is present (see the description
of the interlace chunk).
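
Regrouping one uncompressed truecolor row into per-pixel tuples could
look like the following Python sketch. The function name and the
has_alpha parameter (true when the transparency chunk is present) are
assumptions made for the example.

```python
def split_truecolor_row(row_bytes, width, has_alpha):
    """Turn one band-ordered row into a list of (r, g, b) or (r, g, b, a) tuples."""
    bands = 4 if has_alpha else 3
    if len(row_bytes) != bands * width:
        raise ValueError("row length does not match width and band count")
    # Each band is `width` consecutive bytes: red, green, blue, then alpha.
    planes = [row_bytes[i * width:(i + 1) * width] for i in range(bands)]
    return list(zip(*planes))
```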

How and When to Interpret the Alpha Channel

Standalone image viewers can ignore the alpha channel,
provided that they properly skip over it in order to
be in the right position to read the next row.

World Wide Web browsers and the like should regard any pixel
with an alpha channel value of zero as transparent (the pixel
should be given the background color of the browser), and
any pixel with an alpha channel value greater than zero
as non-transparent.

Applications which display several images overlaid with
one another should interpret the pixel with the highest
alpha channel value as being in front and display that value
in preference to the others. Note that PBF does not
specify overlays within the format, in order to ensure
that streaming, interlaced display remains possible.
See note #2 for a suggested HTML syntax.
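
The two alpha rules above (browsers and overlaying applications) are
sketched below in Python. Both function names are hypothetical, and
the handling of ties between equal alpha values is an assumption,
since the draft does not address it.

```python
def browser_pixel(rgba, background):
    """Return the color a browser shows: background where alpha is zero."""
    r, g, b, a = rgba
    return background if a == 0 else (r, g, b)

def overlay_pixel(candidates):
    """Pick the front pixel among overlaid images: the highest alpha wins.

    Assumption: on a tie, the earliest image in the list is kept.
    """
    r, g, b, _ = max(candidates, key=lambda p: p[3])
    return (r, g, b)
```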

16400 inflate/deflate-compressed image data

The compressed image chunk takes advantage of the
inflate/deflate compression employed in the
widely known gzip software. Extensive legal
research has been done to support the belief
that gzip compression is safe with regard to patents.
inflate/deflate derives from LZ77. deflate can
be performed in a streaming manner without
the need to buffer up a significant quantity
of data.
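
To illustrate the streaming claim, here is a Python sketch using the
standard zlib module, which implements deflate. NOTE: the draft does
not say whether the deflate stream carries a zlib or gzip wrapper or
is raw; this sketch simply uses zlib's default wrapper as an
assumption, and the function name is made up.

```python
import zlib

def deflate_stream(pieces):
    """Compress an iterable of byte strings piece by piece, yielding output."""
    comp = zlib.compressobj()
    for piece in pieces:
        out = comp.compress(piece)
        if out:
            yield out  # output becomes available as input arrives
    yield comp.flush()  # emit whatever is still buffered
```

The caller never has to hold the whole image in memory; each row can
be handed to the compressor as it is produced.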

Prior to compression, and following decompression,
the data is formatted exactly as specified for
the uncompressed image data chunk (see above),
except for truecolor images (see below).

EXCEPTION: truecolor (24 bitdepth) images are
formatted as follows in order to facilitate
effective compression:

Each byte for a particular color element (or for
the alpha channel) of a particular pixel is a SIGNED byte
storing the difference between itself and the previous
pixel, where the previous pixel is initially
considered to have had the value zero. If the byte
contains the value -128 or +127, then THE NEXT BYTE
MUST ALSO BE ADDED to compute the value for that pixel,
and so on (in rare cases this could be needed twice
to represent one pixel).

This allows sharp transitions such as that from 0 to 255,
which would be represented by the byte sequence
127 127 1.

In the great majority of truecolor images, such
transitions are rare, and a single byte will suffice
for most pixels.

As a result, the number of uncompressed bytes
used to represent the red portion of the first line
of an 80-pixel-wide image will be at least
80 bytes, but probably slightly more.

WHY IS THIS SCHEME USED?

Because, by storing the differences between pixels
and not the pixels themselves, we give the deflate compression
algorithm the opportunity to do a much better job.

IMPORTANT: at the beginning of each band (red, green,
blue or alpha channel), the previous pixel value is considered
to be the last pixel value of the previously stored line for THAT
BAND (zero for the first row).
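
One reading of the difference scheme, sketched in Python. The names
encode_band and decode_band are invented, and the choice never to mix
+127 and -128 escapes within a single pixel is this sketch's own
assumption; the draft only requires that the bytes add up.

```python
def encode_band(values, previous=0):
    """Encode one band of 0..255 values as signed differences with escapes."""
    out = bytearray()
    for value in values:
        delta = value - previous
        while delta >= 127:       # escape byte +127: more bytes follow
            out.append(127)
            delta -= 127
        while delta <= -128:      # escape byte -128: more bytes follow
            out.append(0x80)      # -128 as a two's-complement byte
            delta += 128
        out.append(delta & 0xFF)  # final byte, strictly between -128 and +127
        previous = value
    return bytes(out)

def decode_band(data, count, previous=0, pos=0):
    """Decode `count` pixel values; return (values, position after them)."""
    values = []
    while len(values) < count:
        while True:
            step = data[pos] - 256 if data[pos] > 127 else data[pos]
            pos += 1
            previous += step
            if step not in (-128, 127):
                break             # not an escape, so this pixel is complete
        values.append(previous)
    return values, pos
```

With previous set to 0, encode_band([255]) gives the bytes 127 127 1,
matching the example above. For the next row of the same band, pass
the last decoded value of the previous row as `previous`.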

WON'T COMPRESSION LOSE EFFICIENCY WHEN SEPARATE BANDS
ARE FED TO IT IN A SINGLE STREAM?

Yes, although no more so than for interlaced images.
I am still examining this issue.

* * *

Notes

#1:

Odd bit depths do not compress well; the savings associated with them
are largely illusory. Palettes larger than 8 bits begin to overwhelm
the image data size itself. 16-bit truecolor just doesn't look good,
and 24-bit truecolor can easily be used to represent it.

#2:

For the World Wide Web and similar environments, I suggest the following
extended HTML syntax:

<IMG SRC="image1.pbf" SRC="image2.pbf">

to display two overlaid images, taking advantage of the alpha channel
to determine what is in front at a given pixel. Due to the caching
capabilities of web browsers, this scheme could be terrific for
the display of often-changing, simple graphs (financial, scientific,...)
on top of rarely-changing, complex backgrounds. Keeping overlays
out of the PBF format itself saves the grief of downloading an
often-used background over and over.

-T

-- 
The ouzo of human kindness.

<URL:http://sunsite.unc.edu/boutell/index.html>