Jan 20 2010

Simple PHP Profiling Class

datchley

In my current corporate environment we don’t have any useful PHP profiling tools on our web servers. Unfortunately, getting our server support group to install and integrate something like APD or Xdebug is a bureaucratic pain in the ass. We do have some very useful tools on the client side, like FireBug, and we’ve started using the FirePHP integration in FireBug, which provides a number of nice debugging features, but nothing specific to profiling. Given the situation, I decided to implement my own simple profiling class that uses FireBug/FirePHP to present the output in a useful format. More could be done with this, to be sure, but for the time being it has taken care of my needs.

There’s no shortage of existing profiling functions and classes by other developers on the Internet, but I figured I could use some creativity and play around with a class of my own. Plus, none of the ones I’ve seen out there used FireBug/FirePHP for their output, which is something I wanted to try out. So, with some spare cycles while debugging, I came up with the fbProfile class (see download at end of the article).

The fbProfile class is very simple and straightforward. From a usage standpoint, it provides only 3 methods: start(), stop() and display(). We’ll go into these in a bit more detail later, but my goals for the class and its usage are summed up as follows:

  1. be able to nest calls to start() and stop() to produce a visible call chain
  2. be able to tag profiled tasks with a descriptive name for reference in the output
  3. track the total page time (or a close approximation)
  4. track how many times a block of profiled code was called, along with profiling information for each call
  5. optionally, be able to watch a set of variables on each subsequent call to a profiled block of code, seeing the values over time
  6. determine some basic results like total task time, average task time (per call), percentage of total page time, etc.
  7. show the output using FireBug’s console using FirePHP to take advantage of the work area and abilities it provides.

Using the fbProfile Class

Because we’re using FireBug/FirePHP you’ll need to have both of these installed and enabled for the site you’re using fbProfile on. At the top of your PHP code, do the usual setup for FirePHP and afterwards include the fbProfile.php file.

<?php
	ob_start();
	require_once($_SERVER['DOCUMENT_ROOT']."/firephp/lib/FirePHPCore/FirePHP.class.php");
	$firephp = new FirePHP;
	$firephp->registerErrorHandler();
	$firephp->registerExceptionHandler();

	include_once($_SERVER['DOCUMENT_ROOT'] . "/fbProfile.php");

The point where fbProfile.php is included marks the page start timestamp the class uses to keep track of total page load/processing time. It’s not as accurate as a real profiler like APD or Xdebug, but if you put the include close to the top of the page (which is where it would normally fall to begin with) it’s an accurate enough approximation of page start time.

From this point, just wrap the sections or blocks of code you would like to benchmark in calls to start() and stop() respectively. A call to start() takes a string name, which is used as the label for that section of code when the results are output to the FireBug console, so use something descriptive. These calls can also be nested, like so:

	...
	fbProfile::start('Main Query Execution');
	$rs = $tdcsdb->Execute($sql);
	fbProfile::stop();
	...
	fbProfile::start('Main Processing Loop');
	while ($row = $rs->FetchRow()) {
		...
		fbProfile::start('Build Record');
		...
		update_output_file($row);
		...
		fbProfile::stop();	// Build Record
	}
	fbProfile::stop();	// Main Processing Loop

	...

	function update_output_file($rec) {
		fbProfile::start('update_output_file()');
		...
		fbProfile::stop();
	}

The above shows simple usage. Each call to start() has a matching call to stop(), and each wrapped block/section of code is given a descriptive name. We have nested calls and even calls inside a function to profile function execution. fbProfile will track each call to the wrapped blocks of code and keep track of processing time per call and in relation to the entire page processing time.

At the end of your PHP script, do the necessary FirePHP close up because of output buffering and call fbProfile::display() to show the output of the profiling in your FireBug console. fbProfile::display takes a reference to your FirePHP instance object – the one that we created at the top of the page.

	fbProfile::display($firephp);
	ob_end_flush();
?>

Optionally, if you would like to see the individual profiling times for each call to a section of code and/or track one or more variable values related to that block, fbProfile::start() accepts additional optional parameters after the string label. The first is a boolean that tells fbProfile whether to include this detailed information in the output when display() is called (it defaults to false). Any parameters after that are the variables whose values you would like to see for each call (shown in the detail records). So, to see the detail and watch some values, call fbProfile::start() as follows:

fbProfile::start('My Label', true, $var1, $var2, ...);

Below is some output from a run of fbProfile on some code I have been working on; the detail records are collapsed in this screenshot. If a task has detail records, you’ll see the little ‘table’ icon on the left side of the task summary, since we’re using FirePHP’s table() function to format the output.

And the next picture shows the output display of detailed call records for a task that was called numerous times. Just click the task summary with the table icon to expand the detail records display.

Hopefully, this class will prove useful for someone as-is, or even as a jumping-off point for creating their own. Again, this is not hardened code and definitely not as clean as it could be, but it’s the kind of thing that developers do when they need a fix and are limited on resources.

Download link: fbProfile.php

Enjoy!


Dec 28 2009

Block Record Processing in Perl

datchley

In some work I’m doing at the moment, I need to take a rather large data file (about 1.5 million records) and create a smaller sample data file to work with.  The larger, original file had records, one per line, and the file was already sorted in ASCII order. The file in question was the movies.list file from the open IMDB database.  It contains all the movies in IMDB’s database, sorted on the movie title.  What I wanted was a block of records from each alphabetical block (a-z) giving me a smaller sampling of the file for testing purposes.

Now, not only are there movies starting with the letters A through Z, but also titles starting with symbols, such as ‘#’ or ‘$’ as well as numbers 1-9 and so forth.  Quite a smattering of possibilities.  What I wanted to do was have a block of 20 records from each set, only based on the first character of the movie title.  Here’s the quick script I came up with to generate the sample data file:

#! /usr/bin/perl
use warnings;
use strict;

my $BLOCKSIZE = 20;
my $n = 0;

FILE:
while ($_ = <>)  {
	# get character that defines this block
	my $block = quotemeta substr($_, 0, 1);

	while (m/^$block/) {
		print if $n++ < $BLOCKSIZE;
		last FILE unless defined($_ = <>);
		# eat up records numbers over $BLOCKSIZE
	}
	print;			# first record of next block, so print it now
	$n = 1;			# reset our block counter to 1, not 0 (we got that one already)
}

The original file was rather large and I didn’t want to read the entire thing into memory, so I needed to process it in one pass. The script is small and fairly quick, but let’s look at some of the primary points.

       my $block = quotemeta substr($_, 0, 1);

This gets the first character of the current input line so that I know which block we’re starting. We’ll use this character in our regular expression as we go through the rest of the matching records in this block. The quotemeta part is there because of the non-alphanumeric characters I mentioned occurring in the titles earlier; it escapes them so we can safely use them in the following regular expression.

	while (m/^$block/) {
		print if $n++ < $BLOCKSIZE;
		last FILE unless defined($_ = <>);
		# eat up records numbers over $BLOCKSIZE
	}

Here, we use that character in our expression to test the current line (which obviously matches the first time through) and keep cycling through lines until we hit one that doesn’t start with the character we found previously. The "last FILE unless defined($_ = <>)" statement reads each following line and leaves the outer loop entirely at end of file. Since we only want 20 records of each block for our sample, we only print the matching lines while the record count is less than our block size, using $n to keep track of the count. All other records that match the block but are OVER our block size just get eaten up.

Lastly, when the inner loop ends we’ll have a record in $_ that doesn’t match our previous pattern but starts the next block. Since we want the first 20 records of each block, we print this one right away (the bare print after the loop) and restart our record counter at 1, instead of zero, for the next block processing loop (the $n = 1 assignment).
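For comparison, the same one-pass idea can be sketched with awk wrapped in a small shell function. This is an alternative take, not the script above: the function name sample_blocks and the output file name in the usage line are made up for illustration, and it assumes an awk that supports the -v option and input that is already sorted. Each line is examined exactly once, and the counter resets whenever the first character changes.

```shell
#!/bin/sh
# sample_blocks N [FILE...]
# Print at most N lines from each run of consecutive lines that share the
# same first character. Reads standard input when no FILE is given.
# (Hypothetical helper; assumes sorted input and an awk with -v support.)
sample_blocks() {
	limit=$1
	shift
	awk -v limit="$limit" '
		{
			first = substr($0, 1, 1)
			if (first != current) {   # first character changed: new block
				current = first
				count = 0
			}
			if (count < limit) {      # still under the per-block limit
				print
				count++
			}
		}
	' "$@"
}
```

It would be used along the lines of: sample_blocks 20 movies.list > sample.list. Comparing the first character as a plain string means no regular expression is built from the data, so nothing needs escaping.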

You got Better Ideas? I’d love to hear how you would tackle this problem and even get some examples or short scripts from readers. Feel free to comment on my own script as I’d love to hear about ways to improve it!


Dec 18 2009

chext: Batch rename file extensions

datchley

Most Linux systems come with a rename command today; but some of the commercial Unixes like AIX and HP-UX don’t have many of the command sets that Linux users have come to rely on.  This isn’t quite the same as rename, but it’s a script I put together a while back because changing the extension on a number of files was something I was doing quite frequently on those types of systems.  Here’s the script, feel free to copy/paste and use.

#!/bin/sh
# chext - batch rename files by changing the file extension
# Author: Dave Atchley <dave@tuxz0r.net>
#----------------------------------------------------------------------

usage() {
	echo "Usage: $0 [-R] OLD NEW"
	echo
	exit 1
}

case "$1" in
# -R recursive option
-R) 	if [ $# -ne 3 ]; then
		echo "error: missing arguments"; usage
	fi
	ext=$2
	new=$3
	# double quotes, so the shell expands $2 before find sees the pattern
	files=`find . -name "*$2"`
	;;
-h)	echo "chext: change the extension on multiple files"; usage
	;;
*) 	if [ $# -ne 2 ]; then
		echo "error: missing arguments"; usage
	fi
	ext=$1
	new=$2
	files=`echo *$1`
	;;
esac

for f in $files
do
	# work out the new name first, then quote both names for mv
	target=`echo "$f" | sed 's/'"$ext"'$/'"$new"'/'`
	mv "$f" "$target"
done
exit 0

Basically the script will by default handle files in the current directory, but using the -R option will allow it to do so recursively.  You can call it from the command line as

$ chext .old .new

or recursively as

$ chext -R .old .new
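
As a variation, here is a minimal sketch of the renaming step done with shell parameter expansion instead of sed. It is not part of chext: the function name change_ext is made up for illustration, and it assumes a POSIX shell that supports the ${var%suffix} form. Because the file names are passed as arguments and kept quoted, names containing spaces survive, and the extension is matched literally rather than as a regular expression (so the dot in .old is just a dot).

```shell
#!/bin/sh
# change_ext OLD NEW FILE...
# Rename every FILE whose name ends in OLD so that it ends in NEW instead.
# Files that do not end in OLD are left alone.
# (Hypothetical helper; assumes a POSIX shell with ${var%suffix} support.)
change_ext() {
	old=$1
	new=$2
	shift 2
	for f in "$@"
	do
		case "$f" in
		*"$old")
			# strip the literal suffix, then append the new one
			base=${f%"$old"}
			mv -- "$f" "$base$new"
			;;
		esac
	done
}
```

In the current directory this would be called as: change_ext .old .new *.old. A recursive variant could feed it names from find, which is left out here to keep the sketch short.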