Re: Accurate user-based log file analysis

Marc Hedlund (hedlund@best.com)
Tue, 18 Jul 1995 00:39:06 -0700

At 11:33 PM 7/17/95, Terry Myerson wrote [re: log info for id'ing users]:
>As for other aspects of the algorithm, two other factors you have not
>touched upon include:
>- The request media mix

Since a large majority of users will not change their accept values
(assuming their browser changes "accept" based on their configuration; can
anyone name a popular browser that does not?), this provides little
information beyond "user-agent."

>- The requests relative to main entry pages (ie the home page)
>You are focused upon individual lines of the log, and not sequences of
>lines.

A little better, but this still doesn't distinguish between the twenty
deprived Netcom users hitting first your home page and then your what's new
page with the same copy of Lynx in the same hour (or whatever timeout you
set). Nor does it recognize the workstation user who hits your what's new
page, leaves the browser running, and goes home; then comes back the next
morning and keeps browsing. Nor does it account for the browser that
reloads the home page when the user backs up the history list.

You can add up the clues and get better guesses, but they're still guesses.
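
To make the failure cases above concrete, here is a minimal sketch of the
kind of heuristic under discussion: treat each distinct (host, user-agent)
pair as one visitor and start a new visit after a timeout. The function
name, the one-hour timeout, the hostnames, and the sample hits are all
assumptions made up for illustration; this is not the algorithm of any
particular product.

```python
from datetime import datetime, timedelta


def count_visits(hits, timeout=timedelta(hours=1)):
    """Count visits by (host, user-agent), splitting on idle gaps.

    `hits` is an iterable of (timestamp, host, user_agent) tuples.
    The one-hour default timeout is an arbitrary assumption.
    """
    last_seen = {}
    visits = 0
    for when, host, agent in sorted(hits):
        key = (host, agent)
        previous = last_seen.get(key)
        # A key never seen before, or seen too long ago, starts a new visit.
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[key] = when
    return visits


# Hypothetical case 1: twenty people on one shared host, same browser,
# each requesting the home page and then the what's new page.
noon = datetime(1995, 7, 17, 12, 0)
shared_host_hits = []
for person in range(20):
    arrival = noon + timedelta(minutes=2 * person)
    shared_host_hits.append((arrival, "shell.example.com", "Lynx"))
    shared_host_hits.append(
        (arrival + timedelta(minutes=1), "shell.example.com", "Lynx")
    )

# Hypothetical case 2: one workstation user who leaves the browser
# running overnight and resumes the next morning.
overnight_hits = [
    (datetime(1995, 7, 17, 17, 0), "ws1.example.com", "Mosaic"),
    (datetime(1995, 7, 18, 9, 0), "ws1.example.com", "Mosaic"),
]

print(count_visits(shared_host_hits))  # twenty people, counted as 1
print(count_visits(overnight_hits))    # one person, counted as 2
```

Tuning the timeout only trades one error for the other: a longer window
merges more distinct users on shared hosts, and a shorter one splits more
single users into several.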

Brian Behlendorf wrote:
>>I've said too much on the subject... next!

Myerson responded:
>You use strong words to disregard a quality piece of software (not the
>first time), that will help people make more informed Internet decisions.
>If you have a solution, then we're all ears.
>
>You quickly disregard what you don't understand.

Brian's interrogatories may be frustrating for you, but they do _not_
exhibit a lack of understanding. I am more than happy to chime in with his
frustration at having to talk down businesses that have heard charming
promises and are ready to throw money at those most charming. If Brian's
words seem strong to you, you need to get out more often; based on your
claim, "Intersé market focus provides you with an accurate tally of Web
site users," he could have simply levelled a charge of false advertising
and left it at that. Instead, he asked for more explanation.

Few on this list will disregard a quality piece of software. The problem
you are facing is that many on this list have already tried to achieve that
quality, and we have already run up against barriers you are claiming to
have overcome. Your explanations so far have not convinced me that you
have "solved" the problem of accurate user counts.

Marc Hedlund <hedlund@best.com>