SURVEY: size distribution of HTTP transfers

Markus Peuhkuri (puhuri@snakemail.hut.fi)
Fri, 23 Sep 1994 09:18:59 +0200

I'm sending this message to this list because I'm in a bit of a hurry
to get some input and results. (It was originally posted to
comp.infosystems.www.providers; the script is not included here, but a
URL is given instead.)

I'm doing a small survey of HTTP transfer sizes, and for that I need
your help. If you run an HTTP server (or otherwise have access to its
logs), please run the program available at
http://www.hut.fi/~puhuri/WWW-bytes.gz (gzipped, 1318 bytes) and mail
the resulting file to me <Markus.Peuhkuri@hut.fi>.

The program is a Perl script (it works with at least Perl 4.0). Gunzip
it; this produces a file named 'WWW-bytes', which you can run with
./WWW-bytes (or with "perl WWW-bytes").

It reads the log file(s) given as command-line arguments or, if none
are given, reads from stdin. It expects the files to be in "COMMON"
format. (It does not come up with anything useful if this is not the
case.) It writes its result to stdout, so redirection is needed.
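
As a rough illustration of what tallying transfer sizes from a
"COMMON" format log involves, here is a minimal Python sketch. It is
not the WWW-bytes script (which is only available at the URL above);
the names CLF_LINE, bucket_for and tally_sizes and the power-of-two
size buckets are assumptions made for this example.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_LINE = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]*)\] "(.*)" (\d{3}|-) (\d+|-)\s*$'
)


def bucket_for(size):
    """Return the smallest power of two >= size (hypothetical bucket scheme)."""
    bound = 1
    while bound < size:
        bound *= 2
    return bound


def tally_sizes(lines):
    """Count transfers per size bucket.

    Lines that do not match the Common Log Format, or whose byte
    count is "-", are skipped.
    """
    counts = Counter()
    for line in lines:
        match = CLF_LINE.match(line)
        if match is None or match.group(7) == "-":
            continue
        counts[bucket_for(int(match.group(7)))] += 1
    return counts
```

To mimic the "files on the command line, else stdin" behaviour
described above, the lines could come from Python's standard
fileinput.input(), e.g. tally_sizes(fileinput.input()).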

So, run it like this:

    ./WWW-bytes /usr/local/etc/httpd/logs/access_log > bytes.out
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ replace with your log
                                                     file location

or, if you have files archived like 'access_log.19940901.gz':

    zcat /usr/local/etc/httpd/logs/access_log.*.gz | ./WWW-bytes > bytes.out

or, if you want to include both current and archived logs:

    zcat /usr/local/etc/httpd/logs/access_log.*.gz | ./WWW-bytes > bytes.out
    ./WWW-bytes /usr/local/etc/httpd/logs/access_log >> bytes.out

(It does not matter if the results end up in separate tables.)
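
For anyone who prefers to feed archived and current logs through a
single pass instead of two runs, the following Python sketch shows one
way to read plain and gzipped files alike. It is only an illustration
and has nothing to do with how WWW-bytes works; open_log and read_logs
are hypothetical helper names, and latin-1 is assumed as the log
encoding.

```python
import gzip


def open_log(path):
    """Open a log file as text, decompressing it when the name ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="latin-1")
    return open(path, "r", encoding="latin-1")


def read_logs(paths):
    """Yield every line from the given logs, in the order the paths are listed."""
    for path in paths:
        with open_log(path) as handle:
            for line in handle:
                yield line
```

Listing the archived files first and the current access_log last keeps
the lines in roughly chronological order.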

Then mail the resulting file to me. If you like, you can give the name
of your host and a short description of it, for example if you run
some special service that causes a lot of traffic.

I can't guarantee that the program does not break anything, but I have
had no intention of doing so, and I've tried to make it clear (it even
has some comments...). Maybe it could also be faster, but at least it
works... It only reads the log files you name (or stdin) and writes to
stdout and stderr; no other files or programs are used.

Nor does it give out any sensitive information (IMHO), but of course
you can check the output (and the source). (The output contains no
text other than what is in the program itself, plus numbers.)

At the least, I will report something of the results to those who
submit information. If I get enough results, I'll also send them here
or to some other suitable place.

Thank you for your responses.

-- 
Markus Peuhkuri        ! internet: Markus.Peuhkuri@hut.fi
HUT/Telecomm. lab.     ! X.400: G=Markus/S=Peuhkuri/O=hut/ADMD=fumail/C=fi
02150 ESPOO            ! http://www.hut.fi/~puhuri/