Introduction
Hit counters come in many forms. You can have a plain ASCII text counter, or you can get creative and make a graphical counter. A common approach is to imitate the odometer on a car. I will show you how to obtain the number in this example and give you one example of how to make the display graphical.
The number of accesses is not the only type of counter you can provide. You can also find out how many times your page was reached from another page and what types of browsers are accessing your page.
Setting Up the Web Server to Log Access
This section describes how to set up the NCSA httpd Web server for logging access to your Web site. This mechanism also applies to the Apache Web server and a few others that are based on httpd. Windows- and Macintosh-based servers usually provide a GUI front-end to these server options.
The NCSA httpd server has a configuration file called httpd.conf. This is an ASCII text file that is used to configure the server options. Within this configuration file, four variables are used to define where certain logs are kept. ErrorLog defines where the Web server should redirect STDERR; TransferLog defines where the page accesses are logged; AgentLog defines where the client information is logged; and RefererLog defines where the referring pages are logged. ErrorLog isn't something you would worry about in this example, although it is a very useful file to be aware of. This example focuses on TransferLog, AgentLog, and RefererLog.
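As a rough illustration, the four settings might appear in httpd.conf as shown below. Only the server root /usr/etc/httpd and the logs/access_log value come from this example; the other three file names are assumptions chosen to match it, so check your own configuration for the actual values.

```
# Fragment of httpd.conf; relative paths are resolved against ServerRoot.
ServerRoot /usr/etc/httpd
# The error, agent, and referer file names below are assumed for illustration.
ErrorLog logs/error_log
TransferLog logs/access_log
AgentLog logs/agent_log
RefererLog logs/referer_log
```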
Parsing the Access Log
To determine how many times a certain page has been visited, you need to scan the TransferLog. First, find out where your log file is kept. Let's assume that the Web server is installed in /usr/etc/httpd and that you have set TransferLog to logs/access_log. The file that you need to look at is /usr/etc/httpd/logs/access_log. Each line in this file pertains to one hit on a single object in your Web site. By using a Perl regular expression, you can search for the page in question and return the number of times that page has been found in the access log. The following line is an example of a record from the TransferLog.
www-proxy - - [06/Dec/1995:13:40:52 -0800] "GET /index.html HTTP/1.0" 200 638
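To make the layout of such a record concrete, here is a minimal sketch that splits the sample line into named fields. The subroutine name parseRecord and the field names are hypothetical, chosen only for this illustration, and the pattern assumes every record follows the same layout as the sample above.

```perl
use strict;
use warnings;

# Split one access log record into named fields.
# Returns an empty list if the line does not follow the expected layout.
sub parseRecord {
    my ($line) = @_;
    # host, two fields that are skipped, [timestamp], "request", status, byte count
    return unless $line =~
        /^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)\s*$/;
    my ($host, $time, $request, $status, $bytes) = ($1, $2, $3, $4, $5);
    # The request itself is "METHOD path protocol".
    my ($method, $path, $protocol) = split ' ', $request;
    return (host => $host, time => $time, method => $method,
            path => $path, status => $status, bytes => $bytes);
}

my %rec = parseRecord(
    'www-proxy - - [06/Dec/1995:13:40:52 -0800] "GET /index.html HTTP/1.0" 200 638');
print "$rec{host} requested $rec{path} (status $rec{status}, $rec{bytes} bytes)\n";
```

For the sample record, this prints the host www-proxy, the path /index.html, the status 200, and the size 638.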
The Perl program in Listing 7.5 opens the access log and uses a regular expression to count the records that match a given page. The page to search for is passed in as an argument to this function.
Listing 7.5. Perl subroutine to count the number of hits on a given page.
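sub pageCount {
    my($page) = @_;
    # Prepend the GET method to limit the search scope. \Q...\E keeps
    # regex metacharacters in the path (such as ".") from being interpreted.
    my($srchStr) = "GET \Q$page\E";
    open(IN, "< /usr/etc/httpd/logs/access_log") ||
        die "Cannot open access log! $!\n";
    # grep in list context reads every line; scalar() yields the match count.
    my($count) = scalar(grep(/$srchStr/, <IN>));
    close(IN);
    return($count);
}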
This code can be included in your CGI script to display the number of hits on a given page. It can also be used outside of the Web site to provide statistics. $page is defined as a document path, relative to the document root of your server.
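The sketch below shows one way to try the counting idea without a real access log. It is a hypothetical variant, not the listing itself: the name countHits is invented, the routine takes an already opened filehandle rather than a fixed path, and the sample records other than the first are made up for illustration. It also adds a trailing space to the pattern so that /index.html does not also match a longer path such as /index.html.bak, which is a design choice you may or may not want.

```perl
use strict;
use warnings;

# Hypothetical variant of the counting routine that reads from any open
# filehandle, so it can be exercised against in-memory data.
sub countHits {
    my ($fh, $page) = @_;
    # \Q...\E quotes regex metacharacters in the path; the trailing space
    # requires the path to end where the requested page name ends.
    my $pattern = qr/"GET \Q$page\E /;
    return scalar(grep { /$pattern/ } <$fh>);
}

# Sample records; the second and third are invented for illustration.
my $log = <<'END';
www-proxy - - [06/Dec/1995:13:40:52 -0800] "GET /index.html HTTP/1.0" 200 638
www-proxy - - [06/Dec/1995:13:41:07 -0800] "GET /images/logo.gif HTTP/1.0" 200 2048
dialup-12 - - [06/Dec/1995:13:45:10 -0800] "GET /index.html HTTP/1.0" 200 638
END

# Open the string as a read-only filehandle.
open(my $in, '<', \$log) || die "Cannot open in-memory log: $!\n";

# A CGI script would send a header before any output.
print "Content-type: text/plain\n\n";
print "Hits on /index.html: ", countHits($in, '/index.html'), "\n";
close($in);
```

With the three sample records above, the count reported for /index.html is 2.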
This routine can also be used in conjunction with some images to display a graphical hit counter. Suppose, for example, you have an image for each digit. You could take the resulting number from this function, treat it as a string, and use each digit value to locate the image associated with the digit, as shown in Listing 7.6.