29 August 2007

Awkage: Subversion processing

Awk is one of those command-line tools that is sadly underused. To help with that, I'm going to blog about any useful application of awk I come across.

Today's short one-liner uses the output of svn status to revert any changes and remove any extraneous files from a working copy. You could easily modify it to do more complex things with the files in a certain status. Here's the script:

svn status | awk '$1 ~ /\?/ { system("rm -r " $2) } $1 ~ /M/ { system("svn revert " $2) }'

The first thing to know about awk is that it automatically splits each line of input on whitespace and stores the resulting fields in $1, $2, $3, etc. The Subversion output I was trying to process looked like this, which is prime awkage material:

$ svn status
?      pom.xml.releaseBackup
?      release.properties
?      confluence-rpc-plugin/pom.xml.releaseBackup
M      confluence-rpc-plugin/pom.xml
?      conf-webapp/pom.xml.releaseBackup
M      conf-webapp/pom.xml
?      confluence/pom.xml.releaseBackup
?      confluence/plugins
?      confluence/temp
?      confluence/null
?      confluence/attachments
?      confluence/bundled-plugins
M      confluence/pom.xml
?      conf-acceptance-test/pom.xml.releaseBackup
M      conf-acceptance-test/pom.xml
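
To see the field splitting in action, here's a minimal sketch that feeds one status line to awk and prints the two fields. The split_fields name and the status=/path= labels are made up for illustration; they aren't part of the one-liner.

```shell
# Hypothetical helper: show how awk splits one line of svn status output.
split_fields() {
  # $1 is the status character, $2 is the path (runs of spaces count as one separator)
  printf '%s\n' "$1" | awk '{ print "status=" $1; print "path=" $2 }'
}

split_fields 'M      conf-webapp/pom.xml'
# prints:
#   status=M
#   path=conf-webapp/pom.xml
```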

My project included a failed run of the Maven release plugin, and I wanted to get my working copy back to a pristine checkout. The awk command does the following with each line of the input:

  1. Checks the first token, the Subversion status, to see if it is a '?'. If it is, awk launches the system command 'rm -r ' followed by the second token, the file name. This removes files and directories that aren't under version control.
  2. Checks the first token to see if it is an 'M'. If it is, awk launches the system command 'svn revert ' followed by the file name. This reverts any locally modified files and directories.
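
One caveat with $2: it only holds the first whitespace-separated word of the path, so a file name containing spaces would be cut short. Here's a rough sketch of the same two rules that keeps the whole path. It assumes every line starts with the status character in the first column, as in the output above, and it only prints what it would do instead of running anything. The classify_status name and the remove:/revert: labels are hypothetical.

```shell
# Hypothetical helper: read svn status lines on stdin and report the action
# each one would trigger, keeping paths that contain spaces intact.
classify_status() {
  awk '
    {
      status = $1
      path = $0
      # Drop the status token and the padding after it; the rest is the path.
      sub(/^[^ ]+ +/, "", path)
    }
    status == "?" { print "remove: " path }
    status == "M" { print "revert: " path }
  '
}

printf '%s\n' '?      my notes.txt' 'M      conf-webapp/pom.xml' | classify_status
# prints:
#   remove: my notes.txt
#   revert: conf-webapp/pom.xml
```

Turning this into real commands would also need the path quoted for the shell, which the sketch leaves out.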

Hopefully it's easy to see in this short script how awk goes about processing each line of the Subversion status output and running the appropriate command.

When I write these scripts, I start out by putting 'echo' in front of all the commands I'm going to run. This lets me do a dry run first that just prints every command that would be run. This is especially important when running commands like 'rm' automatically.
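
As a sketch of that dry run, here's the one-liner from above with 'echo' added in front of both commands. It's wrapped in a function named dry_run (a name chosen just for this example) and fed two sample lines with printf so it can be tried without a working copy; in real use the input would come from svn status.

```shell
# Dry run: same rules as the one-liner, but each command is echoed, not executed.
dry_run() {
  awk '$1 ~ /\?/ { system("echo rm -r " $2) } $1 ~ /M/ { system("echo svn revert " $2) }'
}

printf '%s\n' '?      release.properties' 'M      confluence/pom.xml' | dry_run
# prints:
#   rm -r release.properties
#   svn revert confluence/pom.xml
```

Once the printed commands look right, removing the two 'echo' words gives back the real script.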

My previous articles related to scripting and Unix are: