Copyright Notice
This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Linux Magazine. However, the version you are reading here is as the author originally submitted the article for publication, before their editors applied their creativity.
Linux Magazine Column 52 (Oct 2003)
[suggested title: ``Using Class::DBI for a Link Checker'']
Web-site maintenance often includes the rather mundane but important task of making sure that carefully placed page links actually go somewhere useful. Over my many years as a Perl columnist, I've written many different versions of a link checker, each evolving as my Perl mastery gets better and as the tools get more powerful.
I was recently inspired by a lot of buzz I've been hearing about the
rapidly developing Class::DBI
module: a simple-but-powerful
mechanism by which objects can be mapped to SQL-based databases, and
the DBD::SQLite
module, which puts a full SQL-based
transaction-supporting database into a single file of my choosing,
without the complexity of running a separate server.
These two modules looked like the perfect components for the next
major rewrite of my ever-improving link checker.
After a dozen hours or so trying to get my head around these new modules (including debugging things at about seven layers of indirection), I have a workable alpha-version of a new link checker. The technology is similar to the version I wrote for this very column some three years ago, so you might want to refer back to the Sep 2000 issue for comparison. The key features include:
- Managed parallel processes to overlap the I/O and DNS lookups
- On-disk storage of partial state, to resume after an interruption
- Proper management of if-modified-since so that unchanged pages are not refetched and reparsed for links
- Flexible reporting control
- Flexible mapping of known aliased URLs, such as index.html being replaced by merely a trailing slash
- Flexible decisions for URLs, deciding whether to descend, check for existence, merely note, or just ignore
And all this in the 443-line program provided in [listing one, below]. Because the program is fairly long, I'll be hitting only the highlights instead of my usual rambling line-by-line description.
Lines 1 through 7 provide my standard program preamble, including
pulling in the expected modules and setting up constants needed for
the configuration section. The sigtrap
module is used here to
provide a death handler for any signals, which is needed so that DBI
shuts down properly.
Lines 9 through 61 provide all the configuration options. From here, you decide the behavior and limits of the link checking. The comments describe most of the specific values, but let me point out some of the more interesting parts.
If $REPORT
is -1, we get a complete dump of all pages seen and
their cross-referencing information (inbound and outbound links) and
status. When it's 0, as it is here, we get a dump of only those pages
that didn't return a successful HTTP status. We can also
increase this to a number of seconds (say 1 * DAYS
), in which case
we'll only start hearing about a bad page when it's been bad for at
least that long. This helps with intermittent server flakiness.
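The threshold test itself (line 222 of the listing) boils down to a simple time comparison. Here's a tiny self-contained sketch of that rule with made-up timestamps; the helper name report_worthy is mine, not the program's:

```perl
#!/usr/bin/perl
use strict;
use constant DAYS => 86400;

my $REPORT = 1 * DAYS;    # report pages that have been bad for a day or more

# A page is "bad enough" when its last successful fetch is more than
# $REPORT seconds older than its most recent check (cf. line 222).
sub report_worthy {
    my ($last_checked, $last_good) = @_;
    return $last_checked > $last_good + $REPORT;
}

my $now = time;
print report_worthy($now, $now) ? "report\n" : "skip\n";             # skip
print report_worthy($now, $now - 2 * DAYS) ? "report\n" : "skip\n";  # report
```

A page that just succeeded is skipped; one that hasn't been good for two days is reported.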
The @CHECK
array provides the starting point (or points) for this
web walk. If all pages we want to check are reachable from a given
URL, we need only list it here, and all the others will be found.
(I've obscured the real URL here: I think it'll be obvious what I was
testing against.)
The HACK_URL
subroutine takes a URI
object and returns a
possibly-modified URI
object to represent any known aliasing.
Here, I'm removing index.html
if present, so that those links will
appear only as a trailing slash. Obviously, this is very server
dependent, and should be constructed with care. When in doubt, start
by merely returning the input parameter, and if you notice the same
page being fetched twice under different names, use the subroutine to
rewrite one into the other.
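If you prefer to see the rewrite rule in isolation, here's a string-only sketch of the index.html case (the real HACK_URL works on URI objects and also canonicalizes the result; strip_index is a hypothetical name of my own):

```perl
#!/usr/bin/perl
use strict;

# Rewrite ".../index.html" to the bare trailing-slash form, so that both
# spellings of the same page collapse to one canonical URL.
sub strip_index {
    my $url = shift;
    $url =~ s{/index\.html\z}{/};
    return $url;
}

print strip_index("http://www.perl.borg/foo/index.html"), "\n";
# http://www.perl.borg/foo/
print strip_index("http://www.perl.borg/foo/other.html"), "\n";
# http://www.perl.borg/foo/other.html
```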
The PARSE
subroutine takes a hacked URL in a URI object, and
decides how far to go with it. If PARSE
returns 2, the page is
fetched, and if the response is HTML, the HTML is parsed for further
links. (If you did this to every URL, you'd eventually see the entire
web, but it will take quite a while!) If PARSE
returns 1, then the
URL is ``pinged'' by fetching just enough to figure out if it's good or
not. If PARSE
returns 0, then the URL is noted for the
cross-reference table, but is otherwise ignored. If PARSE
returns
-1, then the URL is completely ignored. Be very careful writing this
routine: scanning through a lot of dynamically generated content is
guaranteed to upset the webmaster.
Lines 63 to 166 implement the table classes and base classes as
suggested in the Class::DBI
documentation. My::DBI
(in lines 65
to 99) defines the database linkage itself, and defines methods
inherited by all table classes.
Lines 69 and 70 set the database. Here, I'm using DBD::SQLite
to
create a single file at $DATABASE
, which is set earlier to the file
.linkchecker
in my home directory. I also override AutoCommit
so that most transactions are executed on a statement-by-statement
basis.
Lines 72 to 78 create the tables of the database if they don't already exist, by calling methods within each table class. If the tables already exist, the error is trapped and ignored. Because the table-creation methods are kept within the program, I can start this script with no advance preparation (other than installing all the necessary modules).
Lines 80 to 98 are derived from the Class::DBI
documentation as a
mechanism to wrap a code reference within a transaction, automatically
aborting the transaction if something goes wrong. I added the ability
for the subroutine to return back a response value, which simplified
some of the code, as shown later.
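Stripped of the DBI specifics, the wrapper has this shape: run the coderef inside an eval, commit on success, roll back and re-throw on failure, and hand the coderef's return value back to the caller. A minimal stand-in, with commit and rollback reduced to prints, looks like this:

```perl
#!/usr/bin/perl
use strict;

# Sketch of the atomically() control flow; the real version calls
# dbi_commit/dbi_rollback and localizes AutoCommit instead of printing.
sub atomically_sketch {
    my ($action) = @_;
    my $result = eval {
        my $r = $action->();
        print "commit\n";
        $r;
    };
    if ($@) {
        my $error = $@;
        print "rollback\n";
        die $error;          # re-throw after rolling back
    }
    return $result;
}

print atomically_sketch(sub { 42 }), "\n";          # commits, returns 42
eval { atomically_sketch(sub { die "oops\n" }) };   # rolls back, re-throws
print "caught: $@";
```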
Lines 101 to 116 define My::Link
, an object class that is mapped to
the link
table within the database, as declared in line 105. The
link
table records the links between pages from the src
URL to
the dst
URL, creating a many-to-many relationship amongst the
pages.
The SQL to create the link
table is defined in lines 106 through
112. __TABLE__
is replaced with link
, but by doing it this way,
we can derive from this class and inherit this constructor.
Lines 113 through 115 define the columns for Class::DBI
's benefit,
including the proper object type for retrieval, as well as creating
accessor methods for the columns.
Lines 118 to 165 define My::Page
, mapped to the page
table (as
defined in line 123). Most of the heavy lifting happens here.
Using the enum
pragma (found in the CPAN), we get four integer
constants within this package for the various states in which a page
record might exist. State_todo
means we need to process this page.
State_working
means that one of the child processes has started
working on the page, but has not yet completed it. State_done
means we're done. State_unseen
is for a page that has been left
over from the previous run, retaining its modification stamp and
outbound links to minimize the work on each new run. At the end of a
new run, all unseen pages are finally deleted.
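If you don't have enum installed, its effect here can be approximated with the core constant pragma; the four states are simply the integers 0 through 3 in declaration order. This equivalent is my sketch, not part of the listing:

```perl
#!/usr/bin/perl
use strict;

# What `use enum qw(:State_ unseen todo working done);` boils down to:
use constant {
    State_unseen  => 0,   # left over from the previous run
    State_todo    => 1,   # needs processing
    State_working => 2,   # a child process has claimed it
    State_done    => 3,   # finished this pass
};

print State_working, "\n";   # 2
```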
Lines 124 to 133 define the SQL to create the table. Note the
interpolation of the State_unseen
into the double-quoted string.
Lines 134 and 135 declare the table columns to Class::DBI
. The
first such column is automatically recognized as the primary key.
Lines 137 and 138 define an inverse relationship, namely that link
records are referencing page
records. By declaring this
explicitly, many joins are automatically generated for us.
Additionally, deleting a particular My::Page
record also has the
``cascading delete'' feature of deleting any inbound or outbound links
automatically, minimizing the need to write explicit code for that.
Lines 140 to 149 change a page
from State_todo
to
State_working
within a transaction. This is needed because all of
the children will be looking at the same ``todo'' list, and picking an
item to do. Only one child can be successful at this, and because
we've wrapped it in a transaction, it works nicely. The subroutine
returns a true/false value to indicate whether it successfully snagged
the particular My::Page
object for this process to work on.
Lines 151 to 164 similarly add a page to the todo list, again within a
transaction to prevent other children from interfering. Only new
records or previously unseen records (left over from a previous pass)
can be set to State_todo
. The new (or reused) My::Page
object
is returned.
Lines 167 to 175 create an LWP::UserAgent
virtual browser to fetch
the pages, including giving it a distinguished agent string so we can
filter it out of our web hit logs.
The main code begins in line 177. Line 180 creates the tables if needed.
Lines 183 to 186 reset any working pages to todo pages. This happens when the program is interrupted before completing a pass, and will re-trigger any requests that might have been aborted before completion.
Lines 188 to 194 ``prime the pump''. If there aren't any todo or done
items, then we need to start somewhere, namely at the @CHECK
URLs.
Lines 196 to 213 use the kids_do
subroutine (defined later) to
perform the task given by the first coderef, in parallel by forking up
to a maximum of the lesser of $KIDMAX
or whatever number the second
coderef returns. Each kid repeatedly grabs all the to-do records,
picks one at random, and tries to ``own'' it. If the ``owning'' is
successful, do_one_page
is called (defined later). The cycle is
repeated until nothing shows up in the to-do list, at which point the
child exits.
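The claim-at-random loop can be seen in miniature with a plain hash standing in for the page table; claim here is my stand-in for make_working_atomically, minus the real transaction:

```perl
#!/usr/bin/perl
use strict;

my %state = (a => 'todo', b => 'todo', c => 'todo');

# Stand-in for make_working_atomically: succeed only if still todo.
sub claim {
    my $page = shift;
    return 0 unless $state{$page} eq 'todo';
    $state{$page} = 'working';
    return 1;
}

while (my @todo = grep { $state{$_} eq 'todo' } keys %state) {
    my $page = $todo[rand @todo];   # pick one at random, as each kid does
    next unless claim($page);       # another kid may have grabbed it first
    $state{$page} = 'done';         # stand-in for do_one_page()
}

print join(",", sort keys %state), " all done\n";   # a,b,c all done
```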
Once the kids have done their work, lines 215 and 216 remove any pages that weren't seen on this pass.
Lines 218 to 248 display the report. Pages that aren't ``bad enough'' are rejected from being displayed in line 222. Information about the page itself gets printed in lines 223 through 230, including the status and timestamps.
Lines 232 through 246 print the cross-reference table, showing all the
inbound links (useful for finding where an unsuccessful page was
referenced) and the outbound links (useful when we're editing a
particular page and want to find all of its bad links). The links are generally
relative, unless they go up two or more levels, in which case they're
made absolute from /
instead, which seemed to be a good compromise
for display.
Lines 250 to 254 set all the State_done
pages to State_unseen
,
for the next pass. And that's all there is for the main program.
Lines 260 to 291 set up the processing of a particular page. Most of
the logic decides whether this is a ``parse'' or a ``ping'', and how much
information to use (if any) from a previous pass. Note how the
mapping of the $page
object to the table eliminates any need for me
to write SQL code: I'm simply using $page
as if it was an in-memory
object.
Lines 293 to 378 form the core of the actions. As this code is very
similar to this column's September 2000 code, I'll skip the detailed
description. The page is fetched (using an LWP callback),
content-type and status is noted, links are added, and the page record
is updated. The two big differences between this code and previous
code are that I'm using HTML::LinkExtor
to derive the links, rather
than my handwritten code from before, and I'm using objects instead of
hashes to store the data.
When a link is seen, add_link
(lines 380 to 397) is called to
record the link in the database. If a ``base'' is provided, it's used
to make a relative link into an absolute link. The links are ``hacked''
to make them canonical, and any ignorable link is ignored. As the
links are getting created, there might be multiple links between the
same pair of pages, causing the create
to throw an exception, as
triggered by the SQLite's constraint on the table.
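That eval-and-filter idiom, tolerating only the duplicate-key error while re-throwing anything else, can be demonstrated with a hash playing the part of the link table; insert_link and its error text are my stand-ins for the real create call and SQLite's message:

```perl
#!/usr/bin/perl
use strict;

my %link;

# Insert a link, dying with a "uniqueness constraint" error on a
# duplicate, the way My::Link->create does when SQLite rejects a
# repeated (src, dst) pair.
sub insert_link {
    my ($src, $dst) = @_;
    die "uniqueness constraint failed\n" if $link{"$src -> $dst"}++;
}

for (1 .. 2) {   # the second insertion is a duplicate
    eval { insert_link("index.html", "about.html") };
    die $@ if $@ and $@ !~ /uniqueness constraint/;   # ignore only dups
}

print scalar(keys %link), " link recorded\n";   # 1 link recorded
```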
Lines 399 to 440 manage the kids, using code lifted mostly from the August 2000 column, where it is described in more detail. The job here is greatly simplified, however, because we don't need to hand the kids any work directly after starting them: they figure out their own tasks from the database.
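The reap-and-launch skeleton of kids_do, minus the link-checking payload, looks like this (a Unix-only sketch where each child simply exits immediately):

```perl
#!/usr/bin/perl
use strict;
use POSIX qw(WNOHANG);

my %kids;

# Launch a few children; each would normally loop over the todo table.
for (1 .. 3) {
    my $kid = fork;
    die "cannot fork: $!" unless defined $kid;
    if ($kid) { $kids{$kid} = 1 }   # parent: remember the child
    else      { exit 0 }            # child: do the task, then exit
}

# Reap with WNOHANG until every child is accounted for.
while (keys %kids) {
    while ((my $kid = waitpid(-1, WNOHANG)) > 0) {
        delete $kids{$kid};
    }
    select undef, undef, undef, 0.05;   # brief nap between polls
}

print "all kids reaped\n";
```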
Although I've run up against a deadline for this issue (in both space
and time), I can see that I really need to create a ``count todo''
method for My::Page
, as I'm doing it the hard way in more than one
place. Also, I'm sure some of the code could stand further
fine-tuning, but at least this code runs as-is and runs well. Have fun
keeping your website clean of all those nasty broken links. Until
next time, enjoy!
Listing
=1= #!/usr/bin/perl -w
=2= use strict;
=3= $| = 1;
=4= use sigtrap qw(die untrapped);
=5=
=6= use URI;
=7= use constant DAYS => 86400; # for specifications
=8=
=9= ## configuration constants
=10=
=11= my $DATABASE = (glob "~/.linkchecker")[0];
=12=
=13= my $VERBOSE = 1; # 0 = quiet, 1 = noise, 2 = lots of noise
=14=
=15= my $RECHECK = 0.1 * DAYS; # seconds between rechecking any URL
=16= my $RECHECK_GOOD = 1 * DAYS; # seconds between rechecking good URLs
=17= my $REPORT = 0 * DAYS; # seconds before bad enough to report
=18=
=19= my $FOLLOW_REDIRECT = 1; # follow a redirect as if it were a link
=20= my $TIMEOUT = 30; # timeout on fetch (hard timeout is twice this)
=21= my $MAXSIZE = 1048576; # max size for fetch (undef if fetch all)
=22=
=23= my $KIDMAX = 5; # how many kids to feed
=24=
=25= my @CHECK = # list of initial starting points
=26=   qw(http://www.perl.borg/);
=27=
=28= sub PARSE {
=29=   ## return 2 to parse if HTML
=30=   ## return 1 to merely verify existence
=31=   ## return 0 to not even verify existence, but still xref
=32=   ## return -1 to ignore entirely
=33=   my $url = shift; # URI object (absolute)
=34=   for ($url->scheme) {
=35=     return 0 unless /^http$/;
=36=   }
=37=   for ($url->query) {
=38=     return -1 if /^C=[DMNS];O=[AD]/; # silly mod_index
=39=   }
=40=   for ($url->host) {
=41=     if (/www\.perl\.borg$/) {
=42=       for ($url->path) {
=43=         return 0 if /images|photos/; # boring
=44=         return 0 if /^\/(tpc|yapc)\/.*(199[89]|200[012])/; # old
=45=       }
=46=       return 2; # default www.perl.borg
=47=     }
=48=
=49=     return 0 if /use\.perl\.borg$/;
=50=
=51=   }
=52=   return 1; # ping the world
=53= }
=54=
=55= sub HACK_URL {
=56=   my $url = shift; # URI object
=57=   $url->path("$1") if $url->path =~ /^(.*\/)index\.html$/s;
=58=   $url->canonical;
=59= }
=60=
=61= ## end configuration constants
=62=
=63= ### internally-defined classes
=64=
=65= {
=66=   package My::DBI;
=67=   use base 'Class::DBI';
=68=
=69=   __PACKAGE__->set_db('Main', "dbi:SQLite:dbname=$DATABASE", undef, undef,
=70=                       {AutoCommit => 1});
=71=
=72=   sub CONSTRUCT {
=73=     my $class = shift;
=74=     for (qw(My::Page My::Link)) {
=75=       eval { $_->sql_CONSTRUCT->execute };
=76=       die $@ if $@ and $@ !~ /already exists/;
=77=     }
=78=   }
=79=
=80=   sub atomically {
=81=     my $class = shift;
=82=     my $action = shift; # coderef
=83=     local $class->db_Main->{AutoCommit}; # turn off AutoCommit for this block
=84=
=85=     my @result;
=86=     eval {
=87=       @result = wantarray ? $action->() : scalar($action->());
=88=       $class->dbi_commit;
=89=     };
=90=     if ($@) {
=91=       warn "atomically got error: $@";
=92=       my $commit_error = $@;
=93=       eval { $class->dbi_rollback };
=94=       die $commit_error;
=95=     }
=96=     die $@ if $@;
=97=     wantarray ? @result : $result[0];
=98=   }
=99= }
=100=
=101= {
=102=   package My::Link;
=103=   our @ISA = qw(My::DBI);
=104=
=105=   __PACKAGE__->table('link');
=106=   __PACKAGE__->set_sql(CONSTRUCT => <<'SQL');
=107= CREATE TABLE __TABLE__ (
=108=   src TEXT,
=109=   dst TEXT,
=110=   PRIMARY KEY (src, dst)
=111= )
=112= SQL
=113=   __PACKAGE__->columns(Primary => qw(src dst));
=114=   __PACKAGE__->has_a(src => 'My::Page');
=115=   __PACKAGE__->has_a(dst => 'My::Page');
=116= }
=117=
=118= {
=119=   package My::Page;
=120=   our @ISA = qw(My::DBI);
=121=   use enum qw(:State_ unseen todo working done);
=122=
=123=   __PACKAGE__->table('page');
=124=   __PACKAGE__->set_sql(CONSTRUCT => <<"SQL");
=125= CREATE TABLE __TABLE__ (
=126=   location TEXT PRIMARY KEY,
=127=   state INT DEFAULT @{[State_unseen]},
=128=   last_status TEXT,
=129=   last_checked INT,
=130=   last_good INT,
=131=   last_modified INT
=132= )
=133= SQL
=134=   __PACKAGE__->columns(All => qw(location state last_status
=135=                                  last_checked last_good last_modified));
=136=
=137=   __PACKAGE__->has_many(inbound => 'My::Link', 'dst', { sort => 'src' });
=138=   __PACKAGE__->has_many(outbound => 'My::Link', 'src', { sort => 'dst' });
=139=
=140=   sub make_working_atomically {
=141=     my $self = shift;
=142=
=143=     $self->atomically(sub {
=144=       $self->state == State_todo or return undef;
=145=       $self->state(State_working);
=146=       $self->update;
=147=       return 1;
=148=     });
=149=   }
=150=
=151=   sub create_or_make_todo {
=152=     my $class = shift;
=153=     my $location = shift;
=154=
=155=     $class->atomically(sub {
=156=       my $item = $class->find_or_create({location => $location});
=157=       if ((not defined($item->state)
=158=            or $item->state == State_unseen)) {
=159=         $item->state(State_todo);
=160=         $item->update;
=161=       }
=162=       $item;
=163=     });
=164=   }
=165= }
=166=
=167= {
=168=   use LWP::UserAgent;
=169=   my $AGENT = LWP::UserAgent->new;
=170=   $AGENT->agent("linkchecker/0.42 " . $AGENT->agent);
=171=   $AGENT->env_proxy;
=172=   $AGENT->timeout($TIMEOUT);
=173=
=174=   sub fetch { $AGENT->simple_request(@_) }
=175= }
=176=
=177= ### main code begins here
=178=
=179= ## initialize database if needed
=180= My::DBI->CONSTRUCT;
=181=
=182= ## reset all working to todo
=183= for my $page (My::Page->search(state => My::Page::State_working)) {
=184=   $page->state(My::Page::State_todo);
=185=   $page->update;
=186= }
=187=
=188= ## unless any are todo or finished, prime the pump
=189= unless (() = My::Page->search(state => My::Page::State_todo)
=190=         or () = My::Page->search(state => My::Page::State_done)) {
=191=   print "Starting a new run...\n";
=192=   My::Page->create_or_make_todo(HACK_URL(URI->new($_))->as_string)
=193=     for @CHECK;
=194= }
=195=
=196= ## main loop, done by kids:
=197= kids_do(sub { # the task
=198=   srand; # spin random number generator uniquely
=199=   while (my @todo = My::Page->search(state => My::Page::State_todo)) {
=200=     my $page = $todo[rand @todo]; # pick one at random
=201=     unless ($page->make_working_atomically) {
=202=       # someone else got it
=203=       print "$$ wanted ", $page->location, "\n" if $VERBOSE;
=204=       next;
=205=     }
=206=     ;
=207=     print "$$ doing ", $page->location, "\n" if $VERBOSE > 1;
=208=     do_one_page($page);
=209=   }
=210= },
=211= sub { # max kids needed
=212=   scalar(() = My::Page->search(state => My::Page::State_todo));
=213= });
=214=
=215= ## clean out any unseen at this point (no longer needed)
=216= $_->delete for My::Page->search(state => My::Page::State_unseen);
=217=
=218= ## display report
=219= print "*** BEGIN REPORT ***\n";
=220= for my $page (My::Page->search(state => My::Page::State_done,
=221=                                {order_by => 'location'})) {
=222=   next if $page->last_checked <= $page->last_good + $REPORT;
=223=   my $url = URI->new($page->location);
=224=   print "$url:\n";
=225=   print " Status: ", $page->last_status, "\n";
=226=   for (qw(checked good modified)) {
=227=     my $method = "last_$_";
=228=     my $value = $page->$method() or next;
=229=     print " \u\L$_\E: ".localtime($value)."\n";
=230=   }
=231=
=232=   for my $inbound ($page->inbound) {
=233=     my $inbound_page = $inbound->src;
=234=     my $inbound_url = URI->new($inbound_page->location);
=235=     my $rel = $inbound_url->rel($url);
=236=     $rel = $inbound_url->path_query if $rel =~ /^\.\.\/\.\./;
=237=     print " from $rel\n";
=238=   }
=239=   for my $outbound ($page->outbound) {
=240=     my $outbound_page = $outbound->dst;
=241=     my $outbound_url = URI->new($outbound_page->location);
=242=     my $rel = $outbound_url->rel($url);
=243=     $rel = $outbound_url->path_query if $rel =~ /^\.\.\/\.\./;
=244=     my $outbound_status = $outbound_page->last_status;
=245=     print " to $rel: $outbound_status\n";
=246=   }
=247= }
=248= print "*** END REPORT ***\n";
=249=
=250= ## reset for next pass
=251= for my $page (My::Page->search(state => My::Page::State_done)) {
=252=   $page->state(My::Page::State_unseen);
=253=   $page->update;
=254= }
=255=
=256= exit 0;
=257=
=258= ### subroutines
=259=
=260= sub do_one_page {
=261=   my $page = shift; # My::Page
=262=
=263=   my $url = URI->new($page->location);
=264=   my $parse = PARSE($url);
=265=   if ($parse >= 2) {
=266=     print "Parsing $url\n" if $VERBOSE;
=267=     if (time < ($page->last_checked || 0) + $RECHECK or
=268=         time < ($page->last_good || 0) + $RECHECK_GOOD) {
=269=       print "$url: too early to reparse\n" if $VERBOSE;
=270=       ## reuse existing links
=271=       My::Page->create_or_make_todo($_->dst->location) for $page->outbound;
=272=     } else {
=273=       parse_or_ping($page, $url, "PARSE");
=274=     }
=275=   } elsif ($parse >= 1) {
=276=     print "Pinging $url\n" if $VERBOSE;
=277=     if (time < ($page->last_checked || 0) + $RECHECK or
=278=         time < ($page->last_good || 0) + $RECHECK_GOOD) {
=279=       print "$url: too early to reping\n" if $VERBOSE;
=280=       $_->delete for $page->outbound; # delete any existing stale links
=281=     } else {
=282=       parse_or_ping($page, $url, "PING");
=283=     }
=284=   } else {
=285=     print "Skipping $url\n" if $VERBOSE;
=286=     $page->last_status("Skipped");
=287=     $page->last_checked(0);
=288=   }
=289=   $page->state(My::Page::State_done);
=290=   $page->update;
=291= }
=292=
=293= sub parse_or_ping {
=294=   my $page = shift; # My::Page
=295=   my $url = shift; # URI
=296=   my $kind = shift; # "PARSE" or "PING"
=297=
=298=   use HTML::LinkExtor;
=299=
=300=   ## create the request
=301=   my $request = HTTP::Request->new(GET => "$url");
=302=   $request->if_modified_since($page->last_modified) if $page->last_modified;
=303=
=304=   ## fetch the response
=305=   my $content;
=306=   my $content_type;
=307=   my $res = fetch
=308=     ($request,
=309=      sub {
=310=        my ($data, $response, $protocol) = @_;
=311=        unless ($content_type) {
=312=          if ($content_type = $response->content_type) {
=313=            if ($kind eq "PING") {
=314=              print "aborting $url for ping\n" if $VERBOSE > 1;
=315=              die "ping only";
=316=            }
=317=            if ($content_type ne "text/html") {
=318=              print "aborting $url for $content_type\n" if $VERBOSE > 1;
=319=              die "content type is $content_type";
=320=            }
=321=          }
=322=        }
=323=        $content .= $data;
=324=        if ($MAXSIZE and length $content > $MAXSIZE) {
=325=          print "aborting $url for content length\n" if $VERBOSE > 1;
=326=          die "content length is ", length $content;
=327=        }
=328=      }, 8192);
=329=   $res->content($content); # stuff what we got
=330=
=331=   ## analyze the results
=332=   if ($res->is_success) {
=333=     my $now = time;
=334=     $page->last_checked($now);
=335=     $page->last_good($now);
=336=     $page->last_modified($res->last_modified || $res->date);
=337=     $_->delete for $page->outbound; # delete any existing stale links
=338=
=339=     if ($content_type eq "text/html") {
=340=       if ($kind eq "PARSE") {
=341=         print "$url: parsed\n" if $VERBOSE;
=342=         $page->last_status("Verified and parsed");
=343=         my %seen;
=344=         HTML::LinkExtor->new
=345=           (sub {
=346=              my ($tag, %attr) = @_;
=347=              $seen{$_}++ or add_link($page, $_) for values %attr;
=348=            }, $res->base)->parse($res->content);
=349=       } else { # presume $kind = PING
=350=         print "$url: good ping\n" if $VERBOSE;
=351=         $page->last_status("Verified (contents not examined)");
=352=       }
=353=     } else {
=354=       print "$url: content = $content_type\n" if $VERBOSE;
=355=       $page->last_status("Verified (content = $content_type)");
=356=     }
=357=   } elsif ($res->code == 304) { # not modified
=358=     print "$url: not modified\n" if $VERBOSE;
=359=     my $now = time;
=360=     $page->last_checked($now);
=361=     $page->last_good($now);
=362=     ## reuse existing links
=363=     My::Page->create_or_make_todo($_->dst->location) for $page->outbound;
=364=   } elsif ($res->is_redirect) {
=365=     my $location = $res->header("Location");
=366=     print "$url: redirect to $location\n" if $VERBOSE;
=367=     $_->delete for $page->outbound; # delete any existing stale links
=368=     add_link($page, $location, $res->base) if $FOLLOW_REDIRECT;
=369=     $page->last_status("Redirect (status = ".$res->code.") to $location");
=370=     $page->last_checked(time);
=371=   } else {
=372=     print "$url: not verified: ", $res->code, "\n" if $VERBOSE;
=373=     $_->delete for $page->outbound; # delete any existing stale links
=374=     $page->last_status("NOT Verified (status = ".($res->code).")");
=375=     $page->last_checked(time);
=376=   }
=377=   $page->update;
=378= }
=379=
=380= sub add_link {
=381=   my $page = shift; # My::Page
=382=   my $url_string = shift; # string
=383=   my $base = shift; # maybe undef
=384=
=385=   my $url = $base
=386=     ? URI->new_abs($url_string, URI->new($base))
=387=     : URI->new($url_string);
=388=   $url->fragment(undef); # blow away any fragment
=389=   $url = HACK_URL($url);
=390=   return if PARSE($url) < 0; # skip any links to non-xref pages
=391=   print "saw $url\n" if $VERBOSE > 1;
=392=
=393=   my $newpage = My::Page->create_or_make_todo("$url");
=394=   ## the following might die if there's already one link there
=395=   eval { My::Link->create({src => $page, dst => $newpage}) };
=396=   die $@ if $@ and not $@ =~ /uniqueness constraint/;
=397= }
=398=
=399= sub kids_do {
=400=   my $code_task = shift;
=401=   my $code_count = shift;
=402=
=403=   use POSIX qw(WNOHANG);
=404=
=405=   my %kids;
=406=
=407=   while (keys %kids or $code_count->()) {
=408=     ## reap kids
=409=     while ((my $kid = waitpid(-1, WNOHANG)) > 0) {
=410=       ## warn "$kid reaped"; # trace
=411=       delete $kids{$kid};
=412=     }
=413=     ## verify live kids
=414=     for my $kid (keys %kids) {
=415=       next if kill 0, $kid;
=416=       warn "*** $kid found missing ***"; # shouldn't happen
=417=       delete $kids{$kid};
=418=     }
=419=     ## launch kids
=420=     if (keys %kids < $KIDMAX
=421=         and keys %kids < $code_count->()) {
=422=       ## warn "forking a kid"; # trace
=423=       my $kid = fork;
=424=       if (defined $kid) { # good parent or child
=425=         if ($kid) { # parent
=426=           $kids{$kid} = 1;
=427=         } else {
=428=           $code_task->(); # the real task
=429=           exit 0;
=430=         }
=431=       } else {
=432=         warn "cannot fork: $!"; # hopefully temporary
=433=         sleep 1;
=434=         next; # outer loop
=435=       }
=436=     }
=437=     print "[", scalar keys %kids, " kids]\n" if $VERBOSE;
=438=     sleep 1;
=439=   }
=440= }
=441=
=442= sub URI::mailto::host { ""; } # workaround bug in LWP
=443= sub URI::mailto::authority { ""; } # workaround bug in LWP