CGI and Perl

Parsing the RefererLog

To find out how many times visitors were referred to your site by a particular Web page, scan the RefererLog. This shows you where your visitors are coming from. Here is an example of a record from the RefererLog:

http://vader/sales.html -> /technical.html

You can search this log using the same code as with the access log. The difference is that you are counting the number of times you have been referred from a given page. To do this, modify the regular expression in the grep statement so that the pattern includes the -> string, like this:

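my $srchStr = "$referer ->";
# Scalar context makes grep return the number of matching lines;
# \Q...\E keeps the dots and slashes in the URL from acting as metacharacters.
my $count = grep(/\Q$srchStr\E/, @lines);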
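
If you want totals for every referring page rather than for a single one, a hash can do the counting in one pass. The following is a minimal sketch, not a definitive implementation: the sample records in @lines and the %referrals hash are assumptions made for illustration, and it assumes every record has the "referer -> target" form shown above.

```perl
use strict;
use warnings;

# Hypothetical sample records in the "referer -> target" format shown above.
my @lines = (
    "http://vader/sales.html -> /technical.html",
    "http://vader/sales.html -> /index.html",
    "http://vader/index.html -> /technical.html",
);

# Count how many referrals came from each referring page.
my %referrals;
foreach my $line (@lines) {
    # $1 is the referring URL, $2 is the local page that was requested.
    if ($line =~ /^(\S+) -> (\S+)/) {
        $referrals{$1}++;
    }
}

# Print the referring pages, busiest first (ties in alphabetical order).
foreach my $referer (sort { $referrals{$b} <=> $referrals{$a} || $a cmp $b } keys %referrals) {
    print "$referer\t$referrals{$referer}\n";
}
```

In a real script, @lines would be read from the RefererLog file instead of being written out by hand.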

Parsing the AgentLog

The AgentLog is useful for finding out what kinds of browsers are accessing your Web site. The most popular Web browser today is Netscape, which is known in the agent log as Mozilla. Depending on the browser, you may also be able to determine what platform the browser was running on. This can be useful if you want to know how many of your visitors use Windows and how many use Macintosh. Here is an example of a record from the AgentLog:

Mozilla/1.1N (Macintosh; I; 68K)  via proxy gateway  CERN-HTTPD/3.0 libwww/2.17

Unfortunately, not all browsers emit this information the same way. The first part of the line can be used to determine the browser. For example, Netscape shows Mozilla, and Mosaic shows NCSA Mosaic. It is interesting to note that Microsoft's Internet Explorer also announces itself as Mozilla, with a qualifier in parentheses (compatible; MSIE 3.0), so that it is sent the same HTML that Netscape Navigator would be sent. This lets Internet Explorer display pages that use the Netscape extensions it supports. The following regular expression might provide useful information for determining the user agent:

if ($line =~ /^([^(]*)\(([^)]*)\)(.*)$/) {
    my $browser  = $1;  # text before the opening parenthesis
    my $platform = $2;  # text inside the parentheses
    my $proxy    = $3;  # anything after the closing parenthesis
}
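
To turn the captured fields into the browser and platform counts described above, you could tally them in hashes. This is only a sketch under stated assumptions: the second and third sample records are hypothetical, the %browsers and %platforms hashes are names chosen for illustration, and the platform test is a simple substring match that will not recognize every agent string.

```perl
use strict;
use warnings;

# Sample records; only the first comes from the text, the others are hypothetical.
my @lines = (
    "Mozilla/1.1N (Macintosh; I; 68K)  via proxy gateway  CERN-HTTPD/3.0 libwww/2.17",
    "Mozilla/2.0 (compatible; MSIE 3.0; Windows 95)",
    "NCSA Mosaic/2.0 (Windows x86)",
);

my (%browsers, %platforms);
foreach my $line (@lines) {
    # Skip records that do not have a parenthesized section.
    next unless $line =~ /^([^(]*)\(([^)]*)\)/;
    my ($agent, $details) = ($1, $2);

    # Keep only the browser name that precedes the version number.
    my ($browser) = $agent =~ m{^\s*([^/]+)};
    $browser = "Unknown" unless defined $browser;
    $browser =~ s/\s+$//;

    # Internet Explorer identifies itself inside the parentheses.
    $browser = "MSIE" if $details =~ /MSIE/;

    # Rough platform guess based on the parenthesized details.
    my $platform = $details =~ /Mac/ ? "Macintosh"
                 : $details =~ /Win/ ? "Windows"
                 :                     "Other";

    $browsers{$browser}++;
    $platforms{$platform}++;
}

foreach my $name (sort keys %browsers) {
    print "$name\t$browsers{$name}\n";
}
foreach my $name (sort keys %platforms) {
    print "$name\t$platforms{$name}\n";
}
```

Checking for MSIE before counting keeps Internet Explorer from being lumped in with Netscape, since both begin their agent string with Mozilla.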