Re: Accurate user-based log file analysis

Terry Myerson (tmyerson@iserver.interse.com)
Mon, 17 Jul 1995 17:43:23 -0700

>If you send marketing stuff like this to a technical list like www-talk,

Yes. There has been tremendous publicity regarding hits, and we believe that
our user algorithm is reasonably accurate.

>I'd like to see your definition of "users". For a sufficiently small
>site, letting users=unique hosts *might* be acceptable. For large sites,
>there could be 20 people coming from behind sgigate.sgi.com or
>www.hensa.ac.uk or wwwproxy.edu.au or *.proxy.aol.com - how does your
>software divine those individuals? What assumptions are being made about
>visitors? Is the analysis happening on standard common log files, or
>are other types of logfiles generated and used?

We support all of the NCSA httpd log file formats (including the CLFF).
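
For readers unfamiliar with the common log file format, here is a minimal Python sketch of reading one line of it. This is only an illustration and not Interse' code; the function name parse_clf_line, the regular expression, and the sample host are all assumptions made for the example.

```python
import re
from datetime import datetime

# Common log file format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = CLF_PATTERN.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Timestamps look like 17/Jul/1995:17:43:23 -0700
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    # A "-" in the size field means no body was sent.
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec

# Made-up sample line, for illustration only.
sample = 'host.example.com - - [17/Jul/1995:17:43:23 -0700] "GET /index.html HTTP/1.0" 200 2048'
record = parse_clf_line(sample)
```

The host, timestamp, and requested path recovered this way are the raw material for any per-user grouping.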

The first thing Interse' market focus does is group the requests on
"Differentiating Characteristics" (DCs). These DCs are entries within
the log files that stay constant throughout a user session, but differ
between distinct sessions.

Next, we walk through the request stream within each DC group. A new session
is demarcated when an object is requested again even though it should have
been cached, and there is a large time gap in the request stream within the
DC group.
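
The two steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual Interse' algorithm: the DC key is reduced to a single opaque value (for example the client host), "should have been cached" is approximated as "already requested earlier in the same session", and the 30-minute gap is an arbitrary placeholder for "a large time gap".

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed threshold; the description only says "a large time gap".
SESSION_GAP = timedelta(minutes=30)

def split_sessions(requests, gap=SESSION_GAP):
    """Split (dc_key, timestamp, path) tuples into per-DC sessions."""
    # Step 1: group requests by their DC key.
    by_dc = defaultdict(list)
    for dc_key, ts, path in requests:
        by_dc[dc_key].append((ts, path))

    # Step 2: walk each group's request stream in time order.
    sessions = []
    for dc_key, stream in by_dc.items():
        stream.sort(key=lambda r: r[0])
        current, seen, last_ts = [], set(), None
        for ts, path in stream:
            refetched = path in seen  # object should already be cached
            long_gap = last_ts is not None and ts - last_ts > gap
            if current and refetched and long_gap:
                # Both signals present: close the session, start a new one.
                sessions.append((dc_key, current))
                current, seen = [], set()
            current.append((ts, path))
            seen.add(path)
            last_ts = ts
        if current:
            sessions.append((dc_key, current))
    return sessions

# Made-up requests, for illustration only.
t0 = datetime(1995, 7, 17, 9, 0, 0)
example = [
    ("hostA", t0, "/index.html"),
    ("hostA", t0 + timedelta(minutes=1), "/logo.gif"),
    ("hostA", t0 + timedelta(hours=2), "/index.html"),  # refetch after a gap
    ("hostB", t0, "/index.html"),
    ("hostB", t0 + timedelta(hours=2), "/other.html"),  # gap, but no refetch
]
result = split_sessions(example)
```

In this sketch hostA is split into two sessions because both signals occur together, while hostB stays in one session because the gap alone is not treated as sufficient.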

>> The software includes the Intersé Internet database, which contains most
>> U.S. Internet domains indexed by city, state, and zip code, combined with
>> other Internet demographic information. Coupling this information with your
>> log files, the software translates Internet addresses into actual
>> organization names and produces geographic analyses of your site's specific
>> user community.
>
>There's going to be a whole lotta hits coming from Vienna, Virginia,
>White Plains, NY, and Columbus, Ohio!

Indeed, the online services do lead all other organizations in bringing users
to the web. Of course, this software will confirm whether this is true of your
web site's user community.

After you get past CompuServe, AOL, NETCOM, and Prodigy, the results
are very interesting. For example, our web site is consistently dominated by
organizations in Northern California. One beta tester located in New Orleans
had more users from the southeastern United States. Another beta tester,
1-800-DEDICATE (www.800dedicate.com), purchased advertising on
www.sjmercury.com because San Jose was the leading city connecting to his site.

>If it's getstats++, say so. But don't say it provides counts of "users"
>without also stating the assumptions used to generate that number.

It's not getstats. The reports are very comprehensive, well integrated, and
understandable. I recommend downloading one from our web site
(http://www.interse.com).
One big difference is that we've put tremendous effort into showing trends
(i.e., how things are changing over time). The report has been designed to
answer the questions our marketing research has determined are currently
going unanswered:

How many users (not "hits" or "hosts") are visiting your site? How are users
finding your site? How many users are turned away by your registration
screen? Geographically, where are your users located? Are AOL users
interacting differently than corporate users? How did your most recent changes
to your site affect user interactivity? How much bandwidth is your site using
during the workday? Are users downloading au or wav files? Who is
tomato.interse.com?

This isn't meant as pure marketing hyperbole. People are pouring a lot of money
into the web without the answers to these questions. A vital piece of
technology is missing from the web infrastructure.

We've busted our butts to put together a software package which can answer
these questions, conveniently and cost-effectively.

-Terry

-----------------------------------
Terry Myerson
Interse' Corporation
408 732-0932 x-230
408 732-7038 fax
tmyerson@interse.com
http://www.interse.com
-----------------------------------