Copyright Notice
This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Linux Magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Linux Magazine Column 36 (May 2002)
[suggested title: Filling a pipe]
Recently, I attended a presentation at the Portland Linux User's Group (http://www.pdxlinux.org/) by Michael Rasmussen (http://www.patch.com/) about building Linux clusters using open-source technology (http://www.linuxvirtualserver.org/).

Michael mentioned at one point that he needed to analyze the traffic of his company's already active web server, and was surprised that many of the commercial and freely available web traffic tools report nothing satisfactory about bytes per second or hits per second during bursts. But, being reasonably fluent in Perl, Michael wrote a quick script to crawl through the text web log, and got the data he needed.
Well, that got me thinking about how much (or little) traffic I was shooting out of the webserver for www.stonehenge.com. And since I'm already storing the data into a MySQL database (via DBI and mod_perl), I thought it shouldn't be tough to analyze not only the bytes and hits per second (average and burst) as he had done, but also the CPU usage per second, which I'm also recording for each hit in my log.
To make the result more realistic, I decided to spread out the total bytes and CPU usage over the duration of the hit, which I'm also recording as wall-clock seconds. For example, if 100K has been downloaded over 10 seconds, each second in the span should get 10K of that allocation. And because this is a quick-and-dirty program to get some typical stats, I decided to handle this by simply having an array with a separate slot per second, limiting the reasonable size of analysis to just a few days. But that was enough to cover a useful time period, and actually gave me some information about usage that I didn't have from my other reports.
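That apportioning scheme can be sketched in a few lines of Perl (the names and numbers here are made up for illustration, not taken from the final listing):

```perl
#!/usr/bin/perl
use strict;

my @bytes;      # one slot per second of the analysis window

# Hypothetical hit: 100K transferred over 10 wall-clock seconds,
# beginning at relative second 5.
my ($start, $wall, $size) = (5, 10, 100 * 1024);

# Spread the total evenly across the span; the +1 covers both
# endpoints and also protects against a zero-second hit.
my $per_second = $size / ($wall + 1);
$bytes[$_] += $per_second for $start .. $start + $wall;
```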
So, let's look at how I did all that in [listing one, below].
Lines 1 through 3 start nearly every program I write, enabling warnings and compiler restrictions (no barewords, symbol references, or undeclared variables), and disabling the normal buffering of standard output.
Line 5 brings in the standard DBI module. For this program, I also had to have the DBD::mysql module installed to access the database. Line 6 pulls in the Statistics::Descriptive module, just so I don't have to remember how to do the mean and max values, and because I was also considering the standard deviation, but ended up not using it. All these modules are found in the CPAN.
Lines 10 through 12 provide the configuration constants. In particular, the $DSN value defines the DBI connection information, while $DB_AUTH gives the username and password for the particular database. Obviously, these values have been slightly smudged for security reasons. And $DAYS gives the number of integer days that this report will process. We're making and processing three arrays of this value times 86400 (seconds in a day), so don't get too anxious to crank it up to a large value.
Lines 16 and 17 set up the database connection, including enabling automatic ``die on error'' handling, so I don't have to check the valid return status of each and every database call. I routinely enable MySQL's big table mode, because I've been burned when I didn't do that, although it might not make much of a difference for this program.
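The effect of RaiseError can be seen in a minimal sketch (the connection values here are placeholders, of course):

```perl
use strict;
use DBI ();

# With RaiseError enabled, any failed DBI call throws an exception,
# so there's no need for "or die $dbh->errstr" after each statement.
my $dbh = DBI->connect('dbi:mysql:somedb', 'someuser', 'somepass',
                       { RaiseError => 1 });

# A typo in this query would kill the program with MySQL's error
# message, instead of quietly returning undef for me to ignore:
my $count = $dbh->selectrow_array('SELECT COUNT(*) FROM requests');
```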
Line 18 is for debugging. When I'm not sure what I'm passing to DBI, or what values are getting returned, I can uncomment this line to get a tracing of operations.
Lines 20 to 25 prepare the SQL query to fetch the data of interest. The return value is a ``statement handle'', which allows me to carry out the calculation and obtain the results. The parameter is a double-quoted string, using the qq{...} notation so I can easily balance the brackets nicely with my text editor. The one value that is interpolated is the number of days to fetch backwards in time. MySQL's date functions are a bit odd to wrangle, and took me a few tries to get right.
For my web server log, the when column is a date/time stamp of the transaction, wall is the wall-clock seconds (as an integer) from start to end, and the four cpu values represent the user, system, child user, and child system CPU as floating point numbers.

The request grabs the Unix timestamp value (seconds since the beginning of 1970 in London), the wall clock seconds, and the sum of the cpu usage in seconds, for all entries that have been posted in the past $DAYS number of days. The result is ordered for no apparent reason, except that during testing I was adding things like LIMIT 10 and wanted a consistently understandable output.
Line 27 computes the Unix timestamp corresponding to the beginning of the result. Note that if the clocks of the database server and the machine this process runs on are out of sync, this is probably not going to work well. Luckily, I was actually doing this on the very same machine, so there's no chance that the values were out of sync.
Line 29 starts the database query humming, and may even fetch all the results into a temporary area.
Line 30 declares the result to have four output values for the columns, stored directly into the four named scalars. This keeps the interface simple, as well as very fast.
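The bind_columns technique is worth a standalone sketch (the table and column names here are invented for illustration):

```perl
use strict;
use DBI ();

my $dbh = DBI->connect('dbi:mysql:somedb', 'someuser', 'somepass',
                       { RaiseError => 1 });
my $sth = $dbh->prepare('SELECT id, name FROM widgets');
$sth->execute;

# Each output column is bound to a named scalar; every fetch()
# then updates the scalars in place, with no per-row list copying.
$sth->bind_columns(\my ($id, $name));
print "$id: $name\n" while $sth->fetch;
```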
Lines 32 to 34 declare the three accumulating arrays, for number of hits, total bytes transferred, and total CPU seconds used. These arrays will grow to be 86400 times $DAYS elements each. Since @bytes and @hits are always integers, I considered playing with a vec()-based data structure, but decided to optimize along those lines only if the straight solution didn't fit in memory. Never optimize prematurely, especially when you are facing deadlines.
Lines 36 to 46 process the database response data. As each row is returned (indicated by a true value back from fetch), the values are placed directly into the four scalar variables named in line 30. The next step is to apportion the hits, bytes, and CPU seconds across the wall-clock spread. First, line 37 computes the initial array index. Line 38 scales down the hit count to 1 more than the wall count. Since we're counting integer seconds, we've actually got a rounding error here, but a kludge of simply flattening out the request over a slightly larger rounded-up time period was satisfactory to me. And the +1 prevented a potential divide-by-zero error nicely.
Lines 39 and 40 adjust the bytes and CPU seconds to their per-second values, making it easier for lines 41 to 45 to accumulate that value across the spread of the range of seconds for this hit. I actually don't like the repetition of the code: I'm saying the same thing with different variables far too often here, and could probably have factored that out with a bit more effort. I certainly would have done that if there had been three more parameters per second to deal with, but the payback for just the items I have here didn't seem quite worth it.
Once the database scan is complete, we disconnect in line 48 to let the server know we're done there.
Lines 51 through 53 fix the undefs that appear in the arrays of data for the timespans when there's no activity on my web server. Leaving these undef values in the array will trigger warnings later, as the undefs are treated as zero. Right result, but noisy, so a quick search-and-destroy takes care of it.
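The cleanup idiom relies on $_ being an alias into the array, so the assignment really does edit the elements in place:

```perl
use strict;

my @hits = (0.2, undef, 1.4, undef);
defined $_ or $_ = 0 for @hits;   # $_ aliases each element in turn
# @hits is now (0.2, 0, 1.4, 0), so later math is warning-free
```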
Lines 55 to 60 dump out the raw data in a pretty format. Note that the code is inside an if (0) control, so there's no execution of this code in the final program, but while I was testing, this code was handy, and I tend to leave my paint scaffolding lying around in case there's a chance I'll be doing more painting later. Note the use of the scalar value of localtime to quickly turn a Unix timestamp into a human-readable string.
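In list context, localtime returns nine separate values; forcing scalar context gives the familiar ctime-style string instead. The exact text depends on your timezone, so take the comment below as illustrative:

```perl
use strict;

my @fields = localtime(1020000000);        # (sec, min, hour, mday, mon, ...)
my $stamp  = scalar localtime(1020000000); # e.g. "Sun Apr 28 13:20:00 2002"
print "$stamp\n";
```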
The real work happens in the final loop, in lines 62 to 68. This loop is executed three times (and yes, it was originally written as a cut-n-paste nightmare where I had the same thing three times with three different variables). Each time through the loop, $pair is a reference to a two-element array holding a label and its associated arrayref. Inside the loop, we create a new statistics object (line 63), push the data into the object (line 64), and then compute the mean and the maximum by calling two different methods against the statistics object. The method name in $thing doubles as the label for each output line, printed alongside the label of the item being processed from the outer loop. The use of printf here ensures that the output is nicely columnized.
And there you have it. The output for my webserver as I just ran this looks like:
mean bytes/sec   2014.457
 max bytes/sec 207162.000
mean  hits/sec      0.198
 max  hits/sec     25.524
mean   cpu/sec      0.008
 max   cpu/sec      1.212
showing that at peak, I'm nearly filling a T-1 with the amount of data I'm squirting out, and yet on average, I'm using about 1% of the CPU for web service. That last value shouldn't be greater than one: there is obviously some discarded data about the specific allocation of CPU to particular seconds during the day, yielding a slightly bogus max CPU seconds per second.
I hope you enjoyed this discussion of how to make sure you're filling your pipes with your webserver. Until next time, enjoy!
Listing
=1=     #!/usr/bin/perl -w
=2=     use strict;
=3=     $|++;
=4=
=5=     use DBI ();
=6=     use Statistics::Descriptive;
=7=
=8=     ## BEGIN CONFIG ##
=9=
=10=    my $DSN = 'dbi:mysql:stonehenge_httpd';
=11=    my $DB_AUTH = 'do_not:try_this_at_home';
=12=    my $DAYS = 2;
=13=
=14=    ## END CONFIG ##
=15=
=16=    my $dbh = DBI->connect($DSN, (split ':', $DB_AUTH), { RaiseError => 1 });
=17=    $dbh->do("SET OPTION SQL_BIG_TABLES = 1");
=18=    ## $dbh->trace(2);
=19=
=20=    my $sth = $dbh->prepare(qq{
=21=      SELECT unix_timestamp(when), wall, bytes, cpuuser + cpusys + cpucuser + cpucsys
=22=      FROM requests
=23=      WHERE when >= DATE_SUB(NOW(), INTERVAL $DAYS DAY)
=24=      ORDER BY when
=25=    });
=26=
=27=    my $offset = time - 86400 * $DAYS;
=28=
=29=    $sth->execute();
=30=    $sth->bind_columns(\my($when, $wall, $bytes, $cpu));
=31=
=32=    my @hits;
=33=    my @bytes;
=34=    my @cpu;
=35=
=36=    while ($sth->fetch) {
=37=      $when -= $offset;
=38=      my $hits = 1 / ($wall + 1);
=39=      $bytes *= $hits;
=40=      $cpu *= $hits;
=41=      for ($when..($when + $wall)) {
=42=        $hits[$_] += $hits;
=43=        $bytes[$_] += $bytes;
=44=        $cpu[$_] += $cpu;
=45=      }
=46=    }
=47=
=48=    $dbh->disconnect;
=49=
=50=    ## fix the undef's
=51=    defined $_ or $_ = 0 for @bytes;
=52=    defined $_ or $_ = 0 for @hits;
=53=    defined $_ or $_ = 0 for @cpu;
=54=
=55=    if (0) { # dump raw data
=56=      for (0..$#bytes) {
=57=        printf "%30s %10.1f %4.1f %4.1f\n",
=58=          scalar(localtime($offset+$_)), $bytes[$_], $hits[$_], $cpu[$_];
=59=      }
=60=    }
=61=
=62=    for my $pair ([bytes => \@bytes], [hits => \@hits], [cpu => \@cpu]) {
=63=      my $stat = Statistics::Descriptive::Full->new;
=64=      $stat->add_data(@{$pair->[1]});
=65=      for my $thing (qw(mean max)) {
=66=        printf "%4s %5s/sec %10.3f\n", $thing, $pair->[0], $stat->$thing();
=67=      }
=68=    }