CGI and Perl

CPAN Layout

Once you've connected to a CPAN archive, there's quite a bit of hierarchy to navigate, and finding your way around can be intimidating at first. Believe it or not, the hierarchy is relatively well thought out. Let's take a look at it and try to give a more formal description of how to get around. When you first connect to any given CPAN site, you'll need to get to the top-level CPAN directory; if you're using a Web browser, you can just feed it a URL to get there. Connect to the site closest to you, get to the top-level CPAN directory, and follow along here.

At the top level, you'll find the following directories:

authors   Directory for all individually written submissions of modules, scripts, etc.
doc       Documentation and other informative bits
indices   Indexes, in ls -lR format
misc      Miscellaneous stuff, such as Emacs libraries
modules   Specific modules by name, by category, and by author
ports     Ports of Perl to various architectures
scripts   Original scripts area; specific tools and toys
src       Perl sources, versions 4 and 5

Along with the above directories, you'll find the following files in the top level, by default:

CPAN            Textual description of the archive, its sites, contributions, and maintenance.
CPAN.html       HTML version of the CPAN file.
ENDINGS         Filename extensions in use within the CPAN, and what they imply in terms of MIME types and specific applications.
MIRRORED.BY     Publicly accessible sites that mirror the CPAN and are part of the Comprehensive Perl Archive Network.
MIRRORING.FROM  Sites that the master CPAN site, at FUNET in Finland, mirrors from to create the master CPAN hierarchy.
README          Notes on the intent and principle of the CPAN from Jarkko Hietaniemi, the "Self-Appointed Master Librarian (OOK!) of the CPAN".
README.html     HTML version of the README.
RECENT          A listing of the most recent submissions and uploads to the CPAN.
RECENT.html     HTML version of the RECENT text file.
ROADMAP         Simple overview of the CPAN hierarchy.
ROADMAP.html    HTML version of the ROADMAP. Useful for navigating the archive, if you don't mind waiting for repeated FTP connections through the Web browser.

As you can see, each of the files has a specific intent, and some of them provide a view into the archive.

For the sake of brevity, we won't walk through each file. Instead, let's take a closer look at the intent of the top-level directories.

authors This directory is really the foundation of the newer CPAN hierarchy. It has come into being during the last couple of years, specifically to archive works, in any form, by specific people. All of the numerous modules and extensions that can be used with Perl5 are found under here. Alongside the id directory, it holds numerous symlinks, each named for an author's full name.
authors/id The id directory, within the authors directory, contains the individual author directories. For each author there is a symlink in authors, named with the author's full name, which points at the CPAN userid directory allotted to that author. These authors/id/authorid directories are the most important in the CPAN, and form the foundation of all of the other views of the hierarchy. Thus, if you wish to obtain a module, and you know it was written by Dean Roehrich, for instance, you'd change to the authors/Dean_Roehrich directory, which in turn points to the authors/id/DMR directory. Dean himself controls what lives in the DMR directory through the automated features of the CPAN master site: he can update items as he makes new releases and delete older releases of his works.
doc The top-level doc directory is where the various forms of documentation for Perl live. Here you'll find all of the Perl pods and their associated manpages, HTML, and PostScript files. Also located in the doc directory are Tom Christiansen's suite of FMTEYEWTK documents, the Perl and TKPerl FAQs, the annotated reference guides, and other presentations, slide series, and miscellaneous bits of information related to Perl. You'll also find the various pod2 converters, which convert the Perl pods into other documentation formats. We'll be discussing pod later in Chapter 2. Note that this directory isn't always current with the documentation pods that come with the latest release of Perl5; grab those from the pod directory within the Perl source itself.

Note: FMTEYEWTK stands for Far More Than Everything You Ever Wanted To Know. It is Tom's suite of discourses on specific, usually advanced, topics related to Perl.

modules This directory is really more like a switchboard. It contains several subdirectories, each providing a different view of the archive's modules. These subdirectories in turn contain many symlinks back to the latest (hopefully) version of whatever module you're interested in.
modules/by-author This one is simply a symlink back to the top-level authors directory.
modules/by-category This directory contains a number of subdirectories that give you a view of the modules by category. The directories include:
02_Perl_Core_Modules/
03_Development_Support/
04_Operating_System_Interfaces/
05_Networking_Devices_Inter_Process/
06_Data_Type_Utilities/
07_Database_Interfaces/
08_User_Interfaces/
09_Interfaces_to_Other_Languages/
10_File_Names_Systems_Locking/
11_String_Processing_Language_Text_Proce/
12_Option_Argument_Parameter_Processing/
14_Authentication_Security_Encryption/
15_World_Wide_Web_HTML_HTTP_CGI/
16_Server_and_Daemon_Utilities/
17_Archiving_and_Compression/
18_Images_Pixmap_Bitmap_Manipulation/
19_Mail_and_Usenet_News/
20_Control_Flow_Utilities/
21_File_Handle_Input_Output/
22_Microsoft_Windows_Modules/
23_Miscellaneous_Modules/
99_Not_In_Modulelist/
Each of these contains symlinks back to the specific author's directory and module, as deemed appropriate by the CPAN maintainer.
modules/by-module This directory contains a view of all of the Perl library directories, as they are created under the directories in @INC when you install modules and extensions. Each of these directories, in turn, has a symlink to the appropriate version of the module(s) or extension(s) that populate that library directory. Thus, if you knew you needed the HTML::Element module (and you will later), you could look in modules/by-module/HTML and find the symlink libwww-perl-5.02.tar.gz, which points back to the file ../../../authors/id/GAAS/libwww-perl-5.02.tar.gz in the directory of Gisle Aas, who wrote and maintains it. Pretty nifty, eh? Only one copy of any given module exists at any time, but there are a number of ways to get to it via symlinks.
ports This directory contains Perl ports, in source and binary form, for many architectures and operating systems. Some are older than others, and both Perl4 and Perl5 ports exist. If you are on a platform other than UNIX, you may need to grab your Perl from this directory.
scripts We mention this particular area of the archive mostly because it will be going away pretty soon. The scripts area is the author's own collection of things from USENET and all over, begun in late 1991, and it has just about outlived its usefulness. In its day, it saw something on the order of 10,000 retrievals per week, but with the newer Perl5 modules and authors hierarchy, it's pretty much there just for posterity now. It contains specific examples of scripts and tools, some very old, each implementing a given task or tasks, within a simple hierarchy.
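
The symlink views described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not a description of how the CPAN master site works: the SYMLINKS table and the resolve helper are hypothetical, the stored link targets for authors/Dean_Roehrich and modules/by-author are assumed (the text only says where they point), and only paths named in the text are modeled.

```python
import posixpath

# Hypothetical in-memory model of a few CPAN symlinks. Keys are link paths
# relative to the CPAN root; values are link targets, relative to the
# directory that holds the link. Only the libwww-perl target is quoted from
# the text; the other two targets are assumptions for illustration.
SYMLINKS = {
    "authors/Dean_Roehrich": "id/DMR",
    "modules/by-author": "../authors",
    "modules/by-module/HTML/libwww-perl-5.02.tar.gz":
        "../../../authors/id/GAAS/libwww-perl-5.02.tar.gz",
}


def resolve(path, max_hops=16):
    """Follow modeled symlinks until no link is a prefix of the path."""
    path = posixpath.normpath(path)
    for _ in range(max_hops):
        for link, target in SYMLINKS.items():
            if path == link or path.startswith(link + "/"):
                rest = path[len(link):]            # part after the link
                base = posixpath.dirname(link)     # directory holding it
                path = posixpath.normpath(
                    posixpath.join(base, target) + rest)
                break
        else:
            return path                            # nothing left to follow
    raise RuntimeError("too many levels of symbolic links")


if __name__ == "__main__":
    for view in SYMLINKS:
        print(view, "->", resolve(view))
```

Resolving modules/by-author/Dean_Roehrich with this helper follows two links in turn and ends at authors/id/DMR, which is the sense in which the authors/id directories underlie every other view of the archive.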

CPAN Sites

A large number of sites mirror the CPAN hierarchy described above. The CPAN multiplexer at http://www.perl.com/perl will usually point you to an appropriate one. The perl.com archive is Tom Christiansen's creation and contains plenty of other useful Perl stuff.

Summary

So, now that you've familiarized yourself with the resources you're going to need to work through the examples in this tutorial, you're ready to continue. Just turn the page and dig in.