CheckLink maps out a web-tree whose root is a specific HTML document (the starter-URL). CheckLink can also be used to examine and traverse the URLs that comprise a web-tree, and to create a hierarchical index of the web-tree.
From this page you can:
"Linkage" files are created when you create a web-tree (see the description below for details).
| Option | Description |
|---|---|
| Descriptive Name | The descriptive name is used only as a title. If you do not enter one, a descriptive name will be created from the starter-URL. |
| Check off-site URLs / Do NOT check off-site URLs | CheckLink can attempt to verify the existence of resources residing off-site (where off-site means "with an IP address different than the starter-URL's IP address"). Alternatively, you can suppress this check, in which case off-site URLs will not be queried. |
| | If you select the "under the starter-URL" option, then only documents in (or under) the directory of the starter-URL will be processed for recursive links. If you select "only process the starter-URL", then only the starter-URL will be read and processed. |
| Create & save descriptions: No / html documents / html and plain text documents | CheckLink can create and save short descriptions of html (text/html) and plain text (text/plain) documents. |
| Return results as one long document / Return results in a multi-part document / Return results in two separate documents | CheckLink can return results in several ways. The simplest is to send run-time status information first, with the results immediately following it (one long document). Using a multi-part document (or two separate documents) is visually more appealing: the "results" part will overwrite the "status" portion (the status portion's main purpose is to prevent server time-outs!). |
| Exclusion list | To avoid invoking addons, scripts, and other dynamic or otherwise complicated resources, CheckLink compares the selector of each link against each word in the space-delimited exclusion list. If any of these words matches the selector (a word may contain multiple * wildcards), then the link will not be checked. |
| Types of tables | This space-delimited list of codes specifies which results should be reported. For each code in this list, two separate tables (one for IMaGes and one for Anchors) are created. Valid codes are: OK NOSITE NOURL OFFSITE EXCLUDED ALL |
| (optional) linkage file | Note: if you want to create a linkage file, enter a filename only -- do not include path information. To avoid overwriting a pre-existing linkage file, include ? marks in the file name. For example: LFILE?? will cause unique names to be used, starting with LFILE01. |

As well as creating tables that list the various URLs that comprise a web-tree, you can also use CheckLink to examine and traverse the web-tree. That is, for each URL in the web-tree, CheckLink will retain "linkage" information, including information on all text/html documents (in the web-tree) that contain this URL. In addition, for text/html documents CheckLink will retain a list of all the links in the document.
In order to do this, you must create a "linkage" file. If you specify a linkage file, you can then use the ? links in the results tables, or you can invoke the "examine and traverse" option above.
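The ? naming rule for linkage files can be illustrated with a short Python sketch. This is not CheckLink's own code; the function name `unique_linkage_name`, the idea of passing in a list of existing names, and the case-insensitive comparison are all assumptions made for illustration. It also assumes that the ? marks form one contiguous run and that the counter is zero-padded to the number of ? marks, which is what the LFILE?? to LFILE01 example suggests.

```python
import re


def unique_linkage_name(pattern, existing_names):
    """Return the first unused name for a pattern such as 'LFILE??'.

    `existing_names` is a hypothetical list of linkage files already present.
    """
    # The help text asks for a filename only, with no path information.
    if "/" in pattern or "\\" in pattern:
        raise ValueError("enter a filename only, without path information")

    marks = re.search(r"\?+", pattern)
    if marks is None:
        # No ? marks: the name is used as given (and may overwrite a file).
        return pattern

    width = len(marks.group())
    # Assumption: names are compared without regard to case.
    taken = {name.upper() for name in existing_names}

    # Counting starts at 1, so 'LFILE??' yields LFILE01, LFILE02, ...
    for counter in range(1, 10 ** width):
        number = str(counter).zfill(width)
        candidate = pattern[:marks.start()] + number + pattern[marks.end():]
        if candidate.upper() not in taken:
            return candidate
    raise ValueError("all names for this pattern are already in use")


print(unique_linkage_name("LFILE??", []))
print(unique_linkage_name("LFILE??", ["LFILE01", "LFILE02"]))
```

With no existing files the sketch returns LFILE01; once LFILE01 and LFILE02 exist it returns LFILE03.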
Perhaps the use of the term "web-tree" is misleading -- it's more of a web-network, web-graph, or (dare we say it?) a web-web. The point is that a tree implies a bottom-to-top branching structure, with a clearly defined set of precedences. In contrast, a web site is defined by a network of links, with each node connecting to a wide variety of other nodes. Although most web sites do have some sort of hierarchy (i.e., there is usually one or several "home pages"), this is usually loosely defined, with lots of cross-cutting links.
Nevertheless, for reasons of brevity CheckLink uses the term "web-tree" to refer to "the network of resources, as referred to by URLs, that may be reached from a single starting point". Although this single starting point (the "starter-URL") is really just a point of entry, one usually chooses a "starter-URL" that is somehow more fundamental -- say, a home page. Hence, this "starter-URL" is often referred to as the "root of the web-tree".
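To make the "network reachable from a single starting point" idea concrete, here is a minimal Python sketch of a breadth-first walk over an in-memory link graph. It is an illustration only, not how CheckLink is implemented: the function `build_web_tree`, the `links_in` dictionary, and the sample URLs are hypothetical. The walk records, for every reachable URL, which documents contain it, which is the kind of "linkage" information described in the options table.

```python
from collections import deque


def build_web_tree(starter_url, links_in):
    """Map every URL reachable from `starter_url` to the documents linking to it.

    `links_in` is a hypothetical dict: text/html URL -> list of URLs it contains.
    URLs that are not text/html documents simply have no entry in `links_in`.
    """
    contained_in = {starter_url: []}
    queue = deque([starter_url])
    while queue:
        document = queue.popleft()
        for target in links_in.get(document, []):
            if target not in contained_in:
                # First time this URL is seen: it joins the web-tree.
                contained_in[target] = []
                queue.append(target)
            if document not in contained_in[target]:
                contained_in[target].append(document)
    return contained_in


# A small site with a cross-cutting link back to the home page.
site = {
    "/index.html": ["/a.html", "/b.html"],
    "/a.html": ["/b.html", "/index.html"],
    "/b.html": ["/logo.gif"],
    "/orphan.html": ["/index.html"],
}
tree = build_web_tree("/index.html", site)
for url, parents in tree.items():
    print(url, "is contained in", parents)
```

Because visited URLs are tracked, the cycle between /index.html and /a.html does not cause endless traversal, and /orphan.html never appears because nothing reachable from the starter-URL links to it.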
Note that to use a multi-part document you must have a browser that supports Connection:maintain (such as Netscape 2.0 and above). If you select "multi-part" but your browser does not support Connection:maintain, then "two separate documents" will be returned.
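The exclusion-list option in the table above relies on wildcard matching, which can be sketched in Python as follows. This is a guess at the behavior, not CheckLink's actual matching code: the function name `is_excluded` and the sample selectors are hypothetical, and the sketch assumes that a word must match the whole selector and that case is ignored.

```python
import re


def is_excluded(selector, exclusion_list):
    """Return True if any word of the space-delimited list matches the selector."""
    for word in exclusion_list.split():
        # Only * is a wildcard; every other character is matched literally.
        pattern = ".*".join(re.escape(part) for part in word.split("*"))
        # Assumptions: whole-selector match, case-insensitive.
        if re.fullmatch(pattern, selector, flags=re.IGNORECASE):
            return True
    return False


exclusions = "cgi-bin/* *.exe */search*/*"
print(is_excluded("cgi-bin/counter?page=1", exclusions))
print(is_excluded("docs/index.html", exclusions))
```

Escaping each literal piece with `re.escape` keeps characters such as `.` and `?` in a selector from being treated as pattern syntax, so only the * wildcards expand.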
In general:
* The NOURL links are the most interesting (they should be reachable,
but aren't).
* If you are not "checking off-site links", you should not
display the NOSITE links, but you should display OFFSITE links.
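As an illustration of how the "Types of tables" list expands into tables, the following Python sketch validates a space-delimited code list and pairs each code with the two link kinds. The names `requested_tables`, `VALID_CODES`, and `LINK_KINDS` are hypothetical, and accepting lower-case codes is an assumption. ALL is treated as just another code here, since its exact meaning is not spelled out above.

```python
VALID_CODES = ("OK", "NOSITE", "NOURL", "OFFSITE", "EXCLUDED", "ALL")
LINK_KINDS = ("IMG", "A")  # one table for images, one for anchors


def requested_tables(code_list):
    """Expand a space-delimited code list into (code, link kind) pairs."""
    tables = []
    for code in code_list.upper().split():
        if code not in VALID_CODES:
            raise ValueError("unknown table code: " + code)
        for kind in LINK_KINDS:
            # Skip duplicates if a code is listed more than once.
            if (code, kind) not in tables:
                tables.append((code, kind))
    return tables


# Following the advice above for a run that does not check off-site links.
for code, kind in requested_tables("NOURL OFFSITE"):
    print(code, kind)
```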