Staging Unit Tests using rsync
Everyone has their own development process and tools. In most software processes, though, there is a spot for “unit testing” by the developer on the given feature or defect they are working on. Since I’m primarily involved in web development using PHP these days, this involves only a couple of steps:
- design and code the feature/bug fix
- push those changes out to our development server and test
The second part is what we’ll address in this article: namely, how I currently get the files I’m coding up to the development environment for testing in an efficient manner. A warning: if you are working in a compiled language such as Java, rather than a scripting language like Ruby or PHP, this method will not work for you as is. However, feel free to continue reading.
As a developer, I want to be able to quickly push my coding changes out for testing during the course of development. This lets me break the given feature or bug fix down into more digestible parts and get continuous feedback on how my design is holding up and whether I need to make any changes or corrections. Unit testing is not final “Acceptance Testing” of a feature, but a way to ensure that the given code builds and runs without any glaring errors or faults.
At our office we distribute our web applications much like the open source community distributes its software: using GNU autotools to build a tar.gz package. This makes installs simple, especially since another group in our company does the actual installations (and frankly, sometimes even this method isn’t simple enough for them, but I won’t mention any names). However, if I need to push files out quickly to development so I can unit test, I don’t want to take the time to build a completely new autotools package of the system I’m working on just to install one or two files with a few changed lines of code. Seems like overkill. Remember, good coders are inherently lazy. I’m pretty sure that’s a Larry Wall quote. I’d link you to it, but you can google just as well as me.
Luckily, for a number of our systems the code in our code repository is structured just like the actual installation would be. So, if I were working on a project, we might have php/, js/, css/, img/ and other assorted directories in our repository, and this is exactly the same structure we’d have in our installation. Since we’re set up this way, I can use a command called rsync to script the pushing of files out to development for testing, without worrying about making a brand new package.
If you are familiar with ssh or rsh, and you like them, you will love rsync. The rsync command is installed by default on most Linux systems, but if your flavor doesn’t have it, it is free to download and install, and the great thing is you don’t have to be root to install it or get it working. You will, however, have to have it installed on each machine you want to rsync between. In most cases, rsync will use ssh for the remote file transfers, but this can be set up differently if you want. This means that if you pass around ssh keys to your servers, then rsync will take advantage of that when it’s copying files.
The important things to note about rsync from our standpoint are that it allows us to copy files to a remote server, takes advantage of ssh (which we like) and does so quickly using a delta-transfer algorithm that sends only the differences between the source files and what is already on the server. Basically, we’re using rsync to “mirror” my working development folder directly on the server. For one of my projects, here is the “sync” shell script that I use to push files out for testing using this method (names changed to protect the innocent):
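#!/bin/bash
#----------------------------------------------------------------------
# Push the working tree to a remote target using rsync over ssh.
# Run from the top level of the working repository.
#----------------------------------------------------------------------
# Default target to something useful if none is given on the command line
TARGET=${1:-username@dev.company.com:/home/username/public_html/exh/}

# Build the exclude list in a temporary file; remove it however we exit
EXFILE=$(mktemp /tmp/excludefiles.XXXXXX) || exit 1
trap 'rm -f "$EXFILE"' EXIT

cat >"$EXFILE" <<'EOF'
- configure
- Makefile
- **.am
- **.in
- **.cache
- **.log
- .git/***
- m4/***
- build-aux/***
- ext-2.0.2/***
- ext-2.1/***
- sql/***
EOF

echo "Syncing files to location: $TARGET .........."
if rsync --exclude-from="$EXFILE" --delete -ravv -e ssh ./ "$TARGET" 2>&1; then
    echo "ok"
else
    # Report the failure to the caller instead of always exiting 0
    echo "failed"
    exit 1
fi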
Let’s talk about this in a bit more detail. The rsync command takes a destination parameter much like ssh and scp, of the form
username@host:/path/to/file/or/folder
This is what we set up at the beginning of the script: I can pass in an arbitrary destination on the command line, or, if I don’t, the script defaults to pushing the files out to my development environment. I should also mention that this script lives in the top level of my working code repository, so when I run it from there it will copy my entire directory structure using rsync.
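The "argument or default" behavior comes down to shell parameter expansion. Here is a minimal sketch of just that piece; the function name resolve_target and the second host name are made up for illustration, and the default is the same placeholder destination used in the script:

```shell
#!/bin/bash
# Hypothetical helper: print the first argument if one was given,
# otherwise fall back to the default development destination.
resolve_target() {
    printf '%s\n' "${1:-username@dev.company.com:/home/username/public_html/exh/}"
}

# No argument: falls back to the default destination
resolve_target

# With an argument (a made-up host): that destination is used instead
resolve_target "username@qa.company.com:/tmp/exh/"
```

The ${1:-default} form substitutes the default when the first argument is unset or empty, which is what lets the script run with no arguments at all.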
Now that we have a destination target for rsync to use, we also want to tell rsync to ignore certain files and directories. In my case, I don’t want to copy any of the autotools related files (Makefile.am, configure.in, etc.) or certain subdirectories which are only development and configuration related and not actual working code. We do this by creating a listing of the items we want to exclude in a temporary file. Here’s a synopsis of the syntax we use to list those file and directory name patterns for rsync. The ‘-’ at the beginning of the line tells rsync to ‘exclude’ the file during syncing, and a ‘+’ would do the opposite and include it.
- if the pattern starts with a / then it is anchored to a particular spot in the hierarchy of files, otherwise it is matched against the end of the pathname. This is similar to a leading ^ in regular expressions. Thus “/foo” would match a name of “foo” at either the “root of the transfer” (for a global rule) or in the merge-file’s directory (for a per-directory rule).
- if the pattern ends with a / then it will only match a directory, not a regular file, symlink, or device.
- rsync chooses between doing a simple string match and wildcard matching by checking if the pattern contains one of these three wildcard characters: ‘*’, ‘?’, and ‘[’.
- a ‘*’ matches any path component, but it stops at slashes.
- use ‘**’ to match anything, including slashes.
- a ‘?’ matches any character except a slash (/).
- a ‘[’ introduces a character class, such as [a-z] or [[:alpha:]].
- in a wildcard pattern, a backslash can be used to escape a wildcard character, but it is matched literally when no wildcards are present.
- if the pattern contains a / (not counting a trailing /) or a “**”, then it is matched against the full pathname, including any leading directories. If the pattern doesn’t contain a / or a “**”, then it is matched only against the final component of the filename. (Remember that the algorithm is applied recursively so “full filename” can actually be any portion of a path from the starting directory on down.)
- a trailing “dir_name/***” will match both the directory (as if “dir_name/” had been specified) and everything in the directory (as if “dir_name/**” had been specified). This behavior was added in version 2.6.7.
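To see a few of these rules in action without touching a real server, here is a small sketch that syncs between two throwaway local directories. The file names are invented for the example, and the rule list is a trimmed version of the one in the script above; it assumes rsync and mktemp are available on your machine.

```shell
#!/bin/bash
# Build a throwaway source tree (made-up file names) and an empty destination.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/php" "$src/sql" "$src/m4"
touch "$src/php/index.php" "$src/php/debug.log" \
      "$src/Makefile.am" "$src/configure" \
      "$src/sql/schema.sql" "$src/m4/macros.m4"

# A trimmed-down exclude list using the same syntax as the script.
rules=$(mktemp)
cat >"$rules" <<'EOF'
- configure
- **.am
- **.log
- sql/***
- m4/***
EOF

# Local-to-local copy, so no ssh is involved here.
rsync -a --exclude-from="$rules" "$src/" "$dst/"

# List the regular files that actually arrived.
synced=$(cd "$dst" && find . -type f | sort)
echo "$synced"

rm -rf "$src" "$dst" "$rules"
```

With these rules, only php/index.php should make it across: configure matches by its final name component, the ‘**’ patterns catch the .am and .log files at any depth, and the ‘/***’ patterns drop the sql and m4 directories along with their contents.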
Once we have this file setup, we simply call the rsync command telling it our target and passing in our exclude list. Then, we remove the temporary exclude list we built and we’re done. The options I’m using on rsync here are as follows:
- --delete: I want rsync to remove any files on the “receiving side” that aren’t on my “sending side.”
- --recursive or -r: I want rsync to recurse into directories and subdirectories to make the copy. duh? (Archive mode already implies this, so it is redundant here but harmless.)
- --archive or -a: I want to preserve the usual properties of the files I’m copying: symlinks, perms, times, group, owner, etc. It is shorthand for -rlptgoD, and it does not preserve hard links. (use to taste)
- --verbose or -v: I use this twice, as the more times I specify it, the more gratuitous rsync’s output on what it’s doing will be.
- --rsh or -e: specify the remote shell to use when copying (we want ssh)
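Because --delete removes files on the receiving side, it can be worth previewing a sync first. rsync has a --dry-run (-n) option for this, which the script above does not use; the sketch below shows the idea with two throwaway local directories and invented file names standing in for the working tree and the server.

```shell
#!/bin/bash
# Two throwaway local directories stand in for the working tree and the server.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/keep.php"
touch "$dst/keep.php" "$dst/stale.php"   # stale.php exists only on the receiving side

# -n (--dry-run): rsync reports what it would transfer and delete, but changes nothing.
rsync -rav --delete -n "$src/" "$dst/"
if [ -e "$dst/stale.php" ]; then after_dry="present"; else after_dry="deleted"; fi

# Same command without -n: now the stale file really is removed.
rsync -rav --delete "$src/" "$dst/"
if [ -e "$dst/stale.php" ]; then after_real="present"; else after_real="deleted"; fi

echo "after dry run:  stale.php $after_dry"
echo "after real run: stale.php $after_real"

rm -rf "$src" "$dst"
```

Running the dry run against a new target before the real thing is a cheap way to catch a mistyped destination path before --delete has a chance to clean it out.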
Again, rsync has more options than I could shake a stick at so please check out the man page and do some reading.
In using rsync, I’ve given myself a quick and easily configurable way to copy code and files out to an environment for testing without having to build entire packages. Hopefully this is useful to you; but if your working repository doesn’t mirror the structure of your actual installation, this might not be the best approach for your situation. This is just what works for me, on my current projects, and I’m sure that will change in the future too.
Dave Atchley is a working software engineer with 15 years experience in systems programming, data warehousing, web design and web applications development. This is his professional blog where he rambles about all things software related, organizes his personal projects and (hopefully) deftly plugs his services as a freelance/contract developer for hire.




