Synchronizing configuration files with Puppet and Subversion

This post grew out of a concrete problem we ran into when setting up a new Linux cluster to host our Subversion and Maven repositories. For Maven, we use Nexus to manage the repository and its artifacts; for Subversion, we use CollabNet Subversion. The challenge for both was to keep their configuration files in sync across all cluster nodes and, in addition, to keep track of changes made to those files either manually or by the tools themselves. Nexus, for example, stores most of its configuration in a file called nexus.xml. Every repository created by a user ends up in this file, but Nexus also uses it to store the dates of scheduled tasks. So there are actually two parties changing this file. On top of that, the file must always be identical on all cluster nodes in case of a fail-over. We therefore also needed a way to commit changes to Subversion and to update the configuration files from Subversion.

Everything had to be done automatically without human intervention. Even when setting up a new cluster node we didn’t want to care about having the most recent configuration in place.

So, in summary, we were facing these use cases:

  1. A new cluster node has to be provided with the most recent configuration.
  2. Local changes to configuration files (done by humans or machines) have to be committed automatically.
  3. Changes to configuration files have to be updated on all cluster nodes.
  4. Subversion conflicts due to concurrent changes have to be detected.

This post introduces the solution I implemented using Puppet, Subversion and some bash scripts. I will only show the setup for Nexus’ configuration files, but the approach can be applied to any other system which relies on one or more configuration files. We also use it to keep the configurations for Subversion (users and groups) and Gitblit (also users and groups) in sync.

There’s no weird stuff here. Just simple Puppet resources and some bash scripts. It’s the combination that does the job. If you’re interested, take your time to read this post in order to get the whole picture.

Basics

In short, there is one Puppet manifest containing a define which does the whole job. It combines several chained Puppet resources that detect the current state of the configuration files and act accordingly. The define is called synchronize_configuration; its header looks like this:

define synchronize_configuration ($base_dir, $svn_args, $svn_url, $creates_dir, $user, $bin_dir, $status_file) {
...
}

The content of it will be shown later. For now just a short explanation of the parameters:

  • base_dir: The base directory which serves as the parent directory for checking out the configuration files.
  • svn_args: The arguments which must be given to any Subversion command. Example: --non-interactive --no-auth-cache --username nexus --password nexus --config-dir /home/nexus/.subversion
  • svn_url: The Subversion URL which contains the configuration files to be managed.
  • creates_dir: The directory which will be created the first time the configuration files are checked out.
  • user: A valid Unix account which will be the owner of the configuration files.
  • bin_dir: The directory that will contain the bash scripts which actually do the Subversion jobs (check out, commit, …). The directory must exist. The bash scripts will be checked out or updated as necessary.
  • status_file: Part of this configuration is also to send emails with the current state and – if any – changes made to the configuration files. All changes are written to this file and sent via a cron job to a dedicated email address. So just give the name (including path) of such a file here.

The following sections describe the individual parts of this Puppet define in detail. The complete define is shown at the very end of this post, which should make it easy to copy, paste and adapt it to one’s particular needs.

Provide most recent configuration on a new cluster node

The very first step for providing the configuration to a new cluster node is of course to check out all the configuration files. This is the Puppet exec resource which actually checks out everything:

    exec { "checkout-conf-${name}":
      require => File[$base_dir],
      cwd     => $base_dir,
      command => "/usr/bin/svn co ${svn_args} ${svn_url}",
      creates => $creates_dir,
      timeout => 60, 
      user    => $user,
    }

Note the creates attribute: it names the directory that the checkout creates. Because Puppet skips the command once that directory exists, the checkout is performed only once.
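For readers less familiar with Puppet, the effect of creates can be pictured as a plain shell guard. The following is only an illustrative sketch, not what Puppet runs internally: run_once is a hypothetical helper name, and mkdir stands in for the svn co call.

```shell
#!/bin/bash
# Illustrative sketch: mimic the guard provided by Puppet's "creates".
# run_once is a hypothetical helper, not part of the scripts in this post.
run_once() {
    local creates_path="$1"
    shift
    if [ -e "$creates_path" ]; then
        # Target already exists: do nothing, like Puppet skipping the exec.
        echo "skipped"
        return 0
    fi
    # Target missing: run the given command (here a stand-in for "svn co").
    "$@" && echo "executed"
}

# Usage: the first call creates the directory, the second one is a no-op.
demo_dir=$(mktemp -d)
run_once "$demo_dir/conf" mkdir "$demo_dir/conf"
run_once "$demo_dir/conf" mkdir "$demo_dir/conf"
```

The guard makes the checkout safe to repeat on every Puppet run, which is what allows a brand-new node and an existing node to share the same manifest.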

Check local modifications and commit

The next step after checking out the configuration files is to check regularly whether there were any modifications. As mentioned above, those changes might be made by a Unix user, by some web interface (e.g. for user management) or, at least in Nexus’ case, by the program which uses the configuration files itself. The check is done with every Puppet run and looks like this:

    exec { "check-local-modifications-${name}":
      onlyif  => "/usr/bin/test -d ${creates_dir}",
      cwd     => $bin_dir,
      path    => [$bin_dir],
      require => File[["${bin_dir}/svn-commit.sh", "${bin_dir}/svn-commons.sh"]],
      command => "./svn-commit.sh ${creates_dir} \"${svn_args}\" ${status_file}",
      user    => $user,
    }

The onlyif attribute ensures that this is only done if the configuration has been checked out before.
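The escaped quotes around ${svn_args} in the command matter: they make the whole option string arrive in the script as one positional parameter, and the script later expands $SVN_ARGS unquoted so that the shell splits it back into separate options. A minimal sketch of this behaviour, using a hypothetical count_words helper and example options instead of a real svn call:

```shell
#!/bin/bash
# count_words is a hypothetical helper that reports how many arguments it got.
count_words() {
    echo "$#"
}

# Example value only; the real options come from the Puppet variable svn_args.
SVN_ARGS="--non-interactive --no-auth-cache --username nexus"

# Quoted: one single argument, which is how the string reaches the script as $2.
count_words "$SVN_ARGS"

# Unquoted: the shell splits on whitespace, so svn would see separate options.
count_words $SVN_ARGS
```

One consequence of this design is that individual option values must not contain whitespace, since the unquoted expansion would split them as well.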

These are the contents of svn-commit.sh:

. ./svn-commons.sh

check_args
set_args "$1" "$2" "$3"

$SVN_CMD stat $WORKING_COPY | $GREP_CMD ^M
MODIFIED=$?
DATE=`/bin/date`

if [ $MODIFIED -eq 0 ]; then
    $LOGGER "Found local modifications in working copy $WORKING_COPY. Going to commit now."
    $SVN_CMD ci $SVN_ARGS -m "${SVN_KEY_CODEBASE}Auto commit of local changes in $WORKING_COPY. Date: ${DATE}" $WORKING_COPY >> $STATUS_FILE 2>&1
    COMMIT_RC=$?    # save the commit's exit status before the echo overwrites $?
    echo "\n" >> $STATUS_FILE

    if [ $COMMIT_RC -eq 0 ]; then
        REVISION=`$SVN_CMD stat -u $SVN_ARGS $WORKING_COPY | $GREP_CMD "^Status against revision:.*" | /bin/cut -d: -f2`
        $LOGGER "Committed revision $REVISION of working copy $WORKING_COPY."
    else
        $LOGGER "Failed committing local changes in $WORKING_COPY!"
        send_mail "Failed committing local changes in $WORKING_COPY on host $HOSTNAME.\nPlease check manually." "Failed committing local changes on $HOSTNAME. Please check."
    fi  
else
    $LOGGER "No SVN modifications in working copy $WORKING_COPY"
fi

exit $EXIT_OK

If a commit is done, Subversion’s output is written to a status file. This status file is later used by a cron job to detect that there were changes and to send an email to the user.
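One shell detail is worth keeping in mind when adapting the script: $? always holds the exit status of the most recently executed command, so the result of the commit has to be recorded before any other command runs. A minimal sketch of that pattern, with a hypothetical fake_commit function standing in for svn ci:

```shell
#!/bin/bash
# fake_commit is a hypothetical stand-in for "svn ci"; it fails on purpose.
fake_commit() {
    return 1
}

STATUS_FILE=$(mktemp)

# Record the exit status as part of the same command list.
COMMIT_RC=0
fake_commit >> "$STATUS_FILE" 2>&1 || COMMIT_RC=$?

# This echo succeeds, so from here on $? no longer describes the commit.
echo "" >> "$STATUS_FILE"
AFTER_ECHO_RC=$?

if [ "$COMMIT_RC" -eq 0 ]; then
    RESULT="commit succeeded"
else
    RESULT="commit failed"
fi
echo "$RESULT (saved status: $COMMIT_RC, status after echo: $AFTER_ECHO_RC)"
rm -f "$STATUS_FILE"
```

Testing $? only after the echo would always report success, no matter what happened to the commit.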

The contents of svn-commons.sh, which is sourced at the beginning of the script, are shown later.

Update configuration changes to all nodes

Updates are necessary when changes made on one cluster node must be synchronized to the others, or when a user commits a change from their own Subversion working copy of the files. This exec resource, which is also part of the define, does the job:

    exec { "update-conf-${name}":
      onlyif  => "/usr/bin/test -d ${creates_dir}",
      cwd     => $creates_dir,
      command => "/usr/bin/svn up --force ${svn_args}",
      timeout => 60, 
      before  => [Exec["check-local-modifications-${name}"], Exec["check-local-conflicts-${name}"]],
      user    => $user,
    }

Note the before attribute: this resource is applied before the checks for local modifications and for Subversion conflicts (the latter is shown below), so both checks always work on an up-to-date working copy. And again, because of the onlyif attribute, the update only runs if the configuration files have already been checked out.
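The sequence within one Puppet run can be pictured as a small shell function. This is only a sketch under simplifying assumptions: update_conf, commit_changes and check_conflicts are hypothetical stubs for svn up, svn-commit.sh and svn-check.sh, and the define itself only fixes that the update comes first, not the relative order of the two checks.

```shell
#!/bin/bash
# Hypothetical stubs; in the real setup these are "svn up", svn-commit.sh
# and svn-check.sh, each started by its own Puppet exec resource.
update_conf()     { echo "update"; }
commit_changes()  { echo "commit"; }
check_conflicts() { echo "check"; }

# One pass over a working copy, with the update running first.
sync_once() {
    local working_copy="$1"
    # Same guard as onlyif => "/usr/bin/test -d ${creates_dir}".
    if [ ! -d "$working_copy" ]; then
        echo "no working copy"
        return 0
    fi
    update_conf
    commit_changes
    check_conflicts
}

sync_once "$(mktemp -d)"
```

Updating first means that a later commit is less likely to be rejected because the working copy is out of date.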

Detect Subversion conflicts

Subversion conflicts cannot be resolved automatically and need human intervention. So, at the very least, we need to detect conflicts and notify someone who can take care of them. Here’s the resource for it:

    exec { "check-local-conflicts-${name}":
      onlyif  => "/usr/bin/test -d ${creates_dir}",
      cwd     => $bin_dir,
      path    => [$bin_dir],
      require => File[["${bin_dir}/svn-check.sh", "${bin_dir}/svn-commons.sh"]],
      command => "./svn-check.sh ${creates_dir} \"${svn_args}\"",
      user    => $user,
    }

And the content of the script svn-check.sh looks like this:

. ./svn-commons.sh

check_args
set_args "$1" "$2"

$SVN_CMD stat $SVN_ARGS $WORKING_COPY | $GREP_CMD ^C
CONFLICT_FOUND=$?

$SVN_CMD stat $SVN_ARGS $WORKING_COPY | $GREP_CMD ^!
PROBLEM_FOUND=$?

if [ $CONFLICT_FOUND -eq 0 -o $PROBLEM_FOUND -eq 0 ]; then

    $LOGGER "Found SVN conflict / problem in working copy $WORKING_COPY!"
    send_mail "Found SVN conflict in working copy at $WORKING_COPY on host $HOSTNAME. You have to check manually!" "SVN conflict on $HOSTNAME configuration found!"
else
    $LOGGER "No SVN conflicts found in working copy $WORKING_COPY"
fi

exit $EXIT_OK

Note that the script checks not only for conflicts (line 6) but also for other problems (line 9), which svn stat flags with an exclamation mark for missing or incomplete items. Basically, it greps the first column of the svn stat output. There are more status codes that could be checked, but after several months in production this check has turned out to be sufficient.
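The first-column check can also be done in a single pass over the status output. The following sketch is an alternative formulation, not the script used above; classify_status is a hypothetical helper, and the sample lines are written by hand in the format of svn status output.

```shell
#!/bin/bash
# classify_status is a hypothetical helper: it reads "svn status"-style
# lines on stdin and reports what the first column indicates.
classify_status() {
    local line
    local result="clean"
    while IFS= read -r line; do
        case "$line" in
            C*)   result="conflict" ;;
            '!'*) [ "$result" = "conflict" ] || result="problem" ;;
        esac
    done
    echo "$result"
}

# A modified file alone is not a problem; a conflicted file is.
printf '%s\n' 'M       conf/nexus.xml' 'C       conf/security.xml' | classify_status
```

Reading the output once avoids calling svn twice, at the cost of a slightly longer script.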

The common stuff

Every bash script mentioned so far sources the file svn-commons.sh. Here is its content:

EXIT_ERR=1
EXIT_OK=0
SVN_CMD="/usr/bin/svn"
GREP_CMD="/bin/grep"
BASENAME_CMD="/bin/basename"
LOGGER="/usr/bin/logger"
MAIL_ADDR="your-email@here.com"
NUM_ARGS=$#
SVN_KEY_CODEBASE="SVN_AUTO_COMMIT_CODEBASE: "

check_args() {
	if [ `$BASENAME_CMD "$0"` = "svn-check.sh" ]; then
		if [ $NUM_ARGS -ne 2 ]; then
			echo "Wrong number of arguments given!"
			echo "Try: $0 /path/to/svn/working/copy <SVN_ARGS>"
			echo "with SVN_ARGS something like \"--no-auth-cache --username nexus --password nexus\""
			exit $EXIT_ERR
		fi
	elif [ `$BASENAME_CMD "$0"` = "svn-cron.sh" ]; then
		if [ $NUM_ARGS -ne 2 ]; then
			echo "Wrong number of arguments given!"
			echo "Try: $0 /path/to/svn/working/copy <STATUS_FILE>"
			echo "with <STATUS_FILE> some unique file name like /tmp/svn.status.nexus.txt"
			exit $EXIT_ERR
		fi
	elif [ `$BASENAME_CMD "$0"` = "svn-commit.sh" ]; then
		if [ $NUM_ARGS -ne 3 ]; then
			echo "Wrong number of arguments given!"
			echo "Try: $0 /path/to/svn/working/copy <SVN_ARGS> <STATUS_FILE>"
			echo "with SVN_ARGS something like \"--no-auth-cache --username nexus --password nexus\""
			echo "and <STATUS_FILE> some unique file name like /tmp/svn.status.nexus.txt"
			exit $EXIT_ERR
		fi
	fi
}

set_args() {

	WORKING_COPY=$1
	echo "Working copy set to $WORKING_COPY"

	if [ `$BASENAME_CMD "$0"` = "svn-check.sh" ]; then
		SVN_ARGS=$2
		echo "Subversion arguments set to \"$SVN_ARGS\""
	elif [ `$BASENAME_CMD "$0"` = "svn-cron.sh" ]; then
		STATUS_FILE=$2
		echo "Status file set to $STATUS_FILE"
	elif [ `$BASENAME_CMD "$0"` = "svn-commit.sh" ]; then
		SVN_ARGS=$2
		echo "Subversion arguments set to \"$SVN_ARGS\""
		STATUS_FILE=$3
		echo "Status file set to $STATUS_FILE"
	fi
}

send_mail() {
	/usr/bin/printf "%b" "$1" | /bin/mail -s "$2" $MAIL_ADDR
}

Basically, that’s what the script does:

  1. Provide common variables.
  2. Check and initialize variables based on the actual script that is called. (check_args() and set_args())
  3. Provide simple function to send emails. (send_mail())
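The dispatch on the calling script's name could also be written as a case statement. This is just a compact sketch of the same idea; expected_arg_count is a hypothetical helper and the path in the usage line is made up.

```shell
#!/bin/bash
# expected_arg_count is a hypothetical helper that maps a script name to
# the number of arguments check_args() accepts for it.
expected_arg_count() {
    case "$(basename "$1")" in
        svn-check.sh)  echo 2 ;;   # working copy, SVN_ARGS
        svn-cron.sh)   echo 2 ;;   # working copy, STATUS_FILE
        svn-commit.sh) echo 3 ;;   # working copy, SVN_ARGS, STATUS_FILE
        *)             echo "unknown" ;;
    esac
}

# Usage with a made-up installation path.
expected_arg_count /opt/nexus/bin/svn-commit.sh
```

Keeping the argument counts in one place makes it easier to add a further script later.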

Have all bash scripts in place

Of course, all the bash scripts must be in place before they are executed. This is not a big issue since it only takes a simple Puppet file resource, which should be straightforward for anyone familiar with Puppet. For the sake of completeness, here is the resource for svn-commons.sh; the other scripts are handled the same way.

    file { "${directory}/svn-commons.sh":
      ensure  => file,
      mode    => '0755',
      owner   => $user,
      group   => $group,
      source  => 'puppet:///modules/codebase/sync_conf/svn-commons.sh',
      require => File[$directory],
    }

Notifying the user

The define also contains a cron job which reads the status file written during the automatic commits and reports its content to the user via email. Easy job here. That’s the Puppet cron resource:

    cron { "send-svn-status-${name}":
      ensure  => present,
      command => "${bin_dir}/svn-cron.sh ${creates_dir} ${status_file}",
      user    => $user,
      hour    => 2,
      minute  => 0,
    }

The bash script svn-cron.sh looks like this:

BIN_DIR=`dirname $0`
. $BIN_DIR/svn-commons.sh

check_args
set_args "$1" "$2"

if [ -f $STATUS_FILE  ]; then
    $LOGGER "Found status file $STATUS_FILE. Going to send mail now."
    MAIL_HEADER="SVN actions during the last 24 hours\nCurrent date is: `date`\n"
    MAIL_MSG=`cat $STATUS_FILE`
    MAIL_CONTENT="${MAIL_HEADER}\n${MAIL_MSG}"
    send_mail "$MAIL_CONTENT" "Configuration changes status on $HOSTNAME. Please check."
    /bin/rm -f $STATUS_FILE
else
    $LOGGER "Found no status file $STATUS_FILE!"
fi

exit $EXIT_OK

So basically the script checks whether a status file exists. If it does, the script uses its content as the body of the mail and deletes the file afterwards. It also puts the hostname into the subject so one always knows on which host the changes were made. This is an example of what such an email may look like:

From: nexus@codebase.my.domain [mailto:nexus@codebase.my.domain] 
Sent: Friday, 5 April 2013 04:00
To: Team ReleaseManagement
Subject: Configuration changes status on codebase.my.domain. Please check.

SVN actions during the last 24 hours
Current date is: Fri Apr  5 02:00:01 UTC 2013

Sending        repository/sonatype-work/nexus/conf/nexus.xml
Transmitting file data .
Committed revision 8198.

This mail lists every changed file along with the resulting Subversion revision number. If necessary, that number can be used to inspect every detail of the changes.
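The line breaks in the mail come from the printf "%b" call in send_mail(), which expands the \n sequences stored in the header string. The sketch below shows how the body is assembled, without actually sending anything; build_mail_body is a hypothetical helper and the status file content is sample data.

```shell
#!/bin/bash
# build_mail_body is a hypothetical helper; it prints the text that
# send_mail() would pipe into /bin/mail.
build_mail_body() {
    local status_file="$1"
    local header="SVN actions during the last 24 hours\n"
    local content
    content=$(cat "$status_file")
    # %b expands backslash escapes such as \n in its argument.
    printf "%b" "${header}\n${content}\n"
}

# Usage with a temporary status file holding one sample line.
demo_status=$(mktemp)
echo "Committed revision 8198." > "$demo_status"
build_mail_body "$demo_status"
rm -f "$demo_status"
```

Note that %b also interprets any backslash sequence that happens to be in the status file itself, which is usually harmless for Subversion output but worth knowing.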

The overall picture

You have now seen all the parts of the Puppet define mentioned above. Here is the whole define with all the resources explained so far:

  define synchronize_configuration ($base_dir, $svn_args, $svn_url, $creates_dir, $user, $bin_dir, $status_file) {
    #######################################################################
    # Checkout or update configuration.
    # Checkout and update are mutually exclusive.
    #
    exec { "checkout-conf-${name}":
      require => File[$base_dir],
      cwd     => $base_dir,
      command => "${codebase::sync_conf::svn_cmd} co ${svn_args} ${svn_url}",
      creates => $creates_dir,
      timeout => 60,
      user    => $user,
    }

    exec { "update-conf-${name}":
      onlyif  => "/usr/bin/test -d ${creates_dir}",
      cwd     => $creates_dir,
      command => "${codebase::sync_conf::svn_cmd} up --force ${svn_args}",
      timeout => 60,
      before  => [Exec["check-local-modifications-${name}"], Exec["check-local-conflicts-${name}"]],
      user    => $user,
    }

    #######################################################################
    # Check if there are local changes and commit if so
    #
    exec { "check-local-modifications-${name}":
      onlyif  => "/usr/bin/test -d ${creates_dir}",
      cwd     => $bin_dir,
      path    => [$bin_dir],
      require => File[["${bin_dir}/svn-commit.sh", "${bin_dir}/svn-commons.sh"]],
      command => "./svn-commit.sh ${creates_dir} \"${svn_args}\" ${status_file}",
      user    => $user,
    }

    #######################################################################
    # Check if there are conflicts and report to team
    #
    exec { "check-local-conflicts-${name}":
      onlyif  => "/usr/bin/test -d ${creates_dir}",
      cwd     => $bin_dir,
      path    => [$bin_dir],
      require => File[["${bin_dir}/svn-check.sh", "${bin_dir}/svn-commons.sh"]],
      command => "./svn-check.sh ${creates_dir} \"${svn_args}\"",
      user    => $user,
    }

    #######################################################################
    # Cron job for sending status mail.
    #
    cron { "send-svn-status-${name}":
      ensure  => present,
      command => "${bin_dir}/svn-cron.sh ${creates_dir} ${status_file}",
      user    => $user,
      hour    => 2,
      minute  => 0,
    }
  }

As an example, this is how this define is called for Nexus:

    synchronize_configuration { 'nexus':
      base_dir    => $nexus_base_dir,
      svn_args    => $svn_args_nexus,
      svn_url     => $nexus_svn_conf_url,
      creates_dir => "${nexus_base_dir}/conf",
      user        => 'nexus',
      bin_dir     => $nexus_exec_dir,
      status_file => $svn_status_file_nexus,
    }

finally{}

That’s it. As mentioned above, all this is very simple Puppet stuff. No voodoo at all. Nevertheless, it works very reliably on our cluster nodes. I did not show every single detail, nor the content of every variable, but I hope the explanations are sufficient for anyone interested. I’d appreciate your feedback.
