Vimcram - making testing vim scripts suck less
14 Apr 2012
A while ago I was reading a blog post with tips on writing vim plugins. There’s a lot of good information there, and if you find yourself writing any vim scripts or plugins, it’s well worth a read. One point surprised me though: the section on testing. My vimtodo plugin has a large number of regression tests, and they are my safety net for making sure that I haven’t horribly broken something with my latest change. I don’t subscribe to the test driven development philosophy of writing a test and coding only until it passes, but I do find it useful to have a few tests to guard against breaking something.
So when I read that testing sucks so much in Vim that you should avoid it, I was a little surprised, and certainly disagreed with the sentiment. My tests work fine, and I don’t recall them being particularly hard to implement. So I took a look at the shell script testing tool mentioned in the blog post, cram, to see what was so special about it. I looked back at my tests in vimtodo, then at the example on the cram page, and back at my tests. The cram example showed a test that basically looked like a transcript of a shell session, and definitely not like the mess of code that comprised my tests. I thought that it couldn’t be too hard to implement something like that for vim, and the guy who wrote the blog post offered to buy a nice bottle of scotch for anyone who did so. One can never drink too much scotch (this may or may not be true), and so I got cracking.
The result is vimcram, which is now up on github. There are still a few features I’d like to add, but it’s pretty usable, and I’m currently converting all of my tests on vimtodo over to using it.
Tests in vimcram look like this:
Test substituting text:
> Add some text
> Add some more text
:%s/some //
Add text
Add more text
Test normal mode commands
@ggdW
@jdW
text
more text
Which, while not exactly a transcript of a vim session (it’s kind of hard to do with a visual editor and multiple modes), is pretty straightforward.
If you have a vim plugin and don’t have any tests for it, give vimcram a try. It might make test writing easy enough that you actually write them!
Mivok.net is now hosted on github with jekyll
27 Jan 2012
Ever since I saw Jekyll, I liked the idea of having a statically generated site for something as simple as a blog where you really don’t need dynamic content (except for comments where, as you can see below, I cheated and used intense debate). But I hadn’t touched anything in ruby before (at the time it probably wasn’t even installed on my web server), was suffering from not-invented-here syndrome, and decided to write my own version in Python. It was clunky, but it worked, for the most part. Two years later however, it’s seen no love, was in dire need of some maintenance/improvement, and I finally realized that pretty much everything I wanted to do was already done in Jekyll.
The site was simple to convert (the original program being an attempted clone), and it now means that I can punt on hosting and let github deal with things. And if github turns out not to be a good choice, then it’s simple to host anywhere. It is after all, a static site.
Backups with bup
27 Jan 2012
I’m thinking about backups once more, and thought I would take a look at bup. Bup’s claim to fame (and the reason I first heard about it) is that it’s a git based backup system, or rather it uses the git repository format, with its own tools to make it deal with large files effectively. The more I looked at it, the more I realized that bup being git-based isn’t the main feature. Bup has a rolling checksum algorithm similar to rsync and splits up files so that only changes are backed up, even in the case of large files. This also has a nice side effect: you get deduplication of data for free. This includes space efficient backups of VM images, and files across multiple computers (the OS files are almost identical). I have two laptops with the same data (git repositories, photos, other work) on both of them, and multiple VM images used for testing, so the ability to have block level deduplication in backups sounded ideal.
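To get a feel for why block-level deduplication saves so much space, here is a rough sketch that splits files into fixed 4 KB chunks and counts how many of them are distinct. It is not how bup works internally: bup picks chunk boundaries with a rolling checksum, so matches survive insertions, while this toy version uses fixed offsets. The `chunk_stats` name and the chunk size are made up for illustration.

```shell
# Illustration only: fixed-size chunks, not bup's rolling-checksum splitting.
# chunk_stats is a hypothetical helper name.
chunk_stats() {
    local workdir n total unique f
    workdir=$(mktemp -d) || return 1
    n=0
    for f in "$@"; do
        n=$((n + 1))
        # Split each file into 4096-byte pieces named <n>.aaaaaa, <n>.aaaaab, ...
        split -a 6 -b 4096 "$f" "$workdir/$n."
    done
    total=$(find "$workdir" -type f | wc -l)
    # cksum prints "checksum size name"; checksum plus size identifies a chunk here.
    unique=$(find "$workdir" -type f -exec cksum {} + |
        awk '{print $1, $2}' | sort -u | wc -l)
    rm -rf "$workdir"
    echo "total=$((total)) unique=$((unique))"
}
```

Two identical files report the same `unique` count as one of them alone, which is the effect bup exploits when the same data lives on several machines.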
Bup can also generate par2 files for error recovery, and has commands to verify/recover corrupted backups. This is a useful feature given that bup goes to great lengths to ensure that each piece of data is only stored once.
My old backups were with rsnapshot, and as it happened, bup has a conversion tool for this, so the first step was to move them over to bup. The command to do this is bup import-rsnapshot, but this didn’t quite work for me and gave an error when running bup save. Thankfully there is a dry-run option which prints out the commands that bup uses, and because rsnapshot stores its backups as direct copies of the files, what bup does is basically back up the backup. So I ended up running:
export BUP_DIR=/bup
/usr/bin/bup index -ux -f bupindex.rigel.tmp manual.0/rigel/
/usr/bin/bup save --strip --date=1314714851 -f bupindex.rigel.tmp \
-n rigel manual.0/rigel/
The two bup commands were directly output from the import-rsnapshot command, and I did this multiple times for each backup I had.
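The repetition could be scripted with a loop along these lines. This is a sketch, not the commands from the original import: the `import_snapshots` name and the `BUP` override are made up, and it assumes the modification time of each snapshot directory (read with GNU `stat -c %Y`) is the date you want recorded for that backup.

```shell
# Hypothetical helper: replay the index/save pair for every rsnapshot snapshot.
# Set BUP=echo to print the commands instead of running them.
import_snapshots() {
    local root=$1 host=$2 snap stamp
    for snap in "$root"/*/"$host"/; do
        [ -d "$snap" ] || continue
        # Assumption: the snapshot directory's mtime is the backup date (GNU stat).
        stamp=$(stat -c %Y "${snap%/*/}") || return 1
        "${BUP:-bup}" index -ux -f "bupindex.$host.tmp" "$snap" || return 1
        "${BUP:-bup}" save --strip --date="$stamp" -f "bupindex.$host.tmp" \
            -n "$host" "$snap" || return 1
    done
}
```

Running `BUP=echo import_snapshots /path/to/rsnapshot rigel` first is a cheap way to check the generated commands before letting bup touch anything.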
Next was to take the initial backup from my laptop. This was actually a different laptop from the one I took the rsnapshot backups with, but I’d copied over a lot of the data and wanted to see how well the dedup feature worked. As can be seen with the rsnapshot import, taking a backup is actually two steps, bup index followed by bup save. The index command generates a list of files to back up, while the save command actually does it. The documentation gives a couple of reasons for splitting this into two steps, mainly that it allows you to use a different method (such as inotify) to generate and update the index, and it also allows you to only generate the list of files once if you are backing up to multiple locations. This separation of duties appeals to the tinkerer in me, but it would still have been nice to have a shortcut ‘just back it up’ command, similar to how git pull is a combination of git fetch and git merge.
The commands to take a backup are:
export BUP_DIR=/bup
bup index -ux --exclude=/bup /
bup save -n procyon /
First, I set the backup directory to /bup. What I’m doing here is backing up locally (and copying to an external hard drive later), but you can also pass the -r option to back up directly to a remote server via ssh.
I also pass the -x option to bup index to limit it to one filesystem, and also exclude the backup directory itself from the backup.
Next, the bup save command actually performs that backup. I passed in the hostname of my laptop (procyon) as the name of the backup set. Multiple backups can have the same name, and they show up as multiple git commits, so a hostname is a good choice for the name of the backup set.
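If you want the ‘just back it up’ shortcut anyway, the two commands can be wrapped in a small shell function. The following is only a sketch: `bup_backup` and the `BUP` override are invented names, and it uses nothing beyond the index and save options shown above.

```shell
# Hypothetical "just back it up" wrapper around bup index and bup save.
# Set BUP=echo to print the commands instead of running them.
bup_backup() {
    local name=$1 path=$2
    # bup reads the repository location from the (exported) BUP_DIR variable.
    if [ -z "${BUP_DIR:-}" ]; then
        echo "BUP_DIR is not set" >&2
        return 1
    fi
    # Stay on one filesystem and skip the repository itself.
    "${BUP:-bup}" index -ux --exclude="$BUP_DIR" "$path" || return 1
    "${BUP:-bup}" save -n "$name" "$path"
}
```

With `export BUP_DIR=/bup` in place, `bup_backup procyon /` runs the same pair of commands as before.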
As I mentioned above, bup can make use of par2 to generate parity files. This is a separate step, and is done using the bup fsck command:
bup fsck -g -j4
The -g option generates the par2 files, and the -j4 option means run up to 4 par2 jobs at the same time. Generating parity files is CPU intensive, so I set it to twice the number of CPUs in my system. I have hyperthreading turned on, and it saturated all 4 ‘virtual’ CPUs. Once this was done, I ended up with several .par2 files in the /bup/objects/pack directory (this is a git repository, and all data is stored in the objects/ dir).
And the results? Bup used 30GB for 2 original backups from rsnapshot (rsnapshot used 26GB and 37GB for the first and second backups, and this was taking into account identical files). Then, when I backed up my 2nd laptop (with approx 40GB used at the time) the size of the bup backup increased by only 4GB. This backup of my laptop included a 5GB ubuntu VM image that didn’t exist in the previous snapshots, so bup must have been able to deal with the duplicate data from the image and the live OS.
All of this sounds amazing, but of course there are a few downsides, all of which are spelled out pretty plainly in the bup README:
- no metadata - these are backups of my personal laptop, and I’ll be restoring either single files, or reinstalling and copying over files as I need them, so losing permissions/file ownership etc. isn’t a big deal for me. However, this feature is supposed to be coming soon.
- no way to prune old backups - this is another feature that is coming soon, but given that I’m a pack rat, rarely deleting old data, and the dedup feature, I’m not too concerned for the moment.
- bup is relatively new and immature. This shows in the possible bugs I encountered above, the lack of what some might consider essential features, and the somewhat low level command usage (separate index, save and fsck commands). This is easily worked around however, and is likely to improve in future.
That said, if you can live with the above limitations, and want incredible space savings for your backups (especially across multiple computers), then I would suggest giving bup a try.
From Solaris to FreeBSD
31 Mar 2010
Less than one week after I switched my hosting over to Solaris 10, with all its ZFS/dtrace goodness, Oracle quietly makes a license change that everybody dealing with Solaris is likely now familiar with, and Solaris 10 is no longer free to use. Emails to Sun/Oracle’s licensing department result only in form letters repeating instructions on the website, and then nothing.
I can’t really blame Oracle for this. Sun didn’t make enough money to survive, and Oracle has this radical idea that you need to actually charge people in order to make money. I can blame them for not providing more clarity regarding the issue (so far they haven’t announced anything), and for leaving customers unsure about what’s going to happen next. However, this is mostly beside the point. I now needed to look into a good alternative.
OpenSolaris is the obvious candidate, and I’ve played around with it a little previously, but I can’t make myself like some of the changes made to it. The biggest annoyances relate to the new packaging system and some of the poor choices made in its design (e.g. no --nodeps option). That is an entire post (or rather, rant) in itself however. In addition to this, I can’t help but believe that Oracle is going to make some change to OpenSolaris that makes it an unrealistic option.
This is where FreeBSD comes in. With release 8.0, ZFS has become a fully supported filesystem. It has dtrace support, jails (just like zones), even virtual networking so you can have a full network stack inside the jail.
For my personal server, the main feature I was interested in was ZFS, specifically ZFS root/boot. With ZFS it is trivial to set up mirrored drives, and I wanted to avoid doing software raid with UFS as well as ZFS. Thankfully there is extensive documentation on how to do this. It isn’t in the standard install, but if you need a repeatable procedure for many servers, it’s a (relatively) simple matter to script the installation, and you would probably want to do this anyway for an automated install.
There were a few gotchas, as with any new system you’re not familiar with, but so far it looks quite nice. I’ll be looking further into jails (especially the vimage jails) and other nice features. Hopefully, FreeBSD will turn out to be a good replacement for Solaris.
Gitosis - manage git repositories sanely
05 Mar 2010
I’ve finally made all my projects available publicly via git at http://git.mivok.net/ thanks to gitosis. Before that, I had just thrown everything in a git directory under my home directory and accessed it over ssh, which worked fine for private repositories, but fell flat whenever I wanted to make something available to somebody else.
Gitosis promised to make it easy to add new repositories and set up access for new people as needed, and once everything is set up, it is really easy - everything is contained in a config file inside a git repository, so you can make changes locally and push. You also have the benefit that your changes themselves are under version control. However, there were a few hiccups along the way, so I’m going to describe what I did in case others try and hit the same problems I did.
Gitosis uses python and setuptools, which I already had available. I’m running Ubuntu, so installing any requirements is as simple as running:
aptitude install python python-setuptools
Of course, git itself is a requirement. For now we’ll use the Ubuntu package, but it’s a good idea to build from source if you want the latest version:
aptitude install git-core
Next, get the gitosis source:
git clone git://eagain.net/gitosis.git
and install:
cd gitosis
sudo python ./setup.py install
So far, everything is pretty straightforward. Next we need to add a user that everyone will connect as in order to access repositories. Gitosis works by having a single user that everyone connects as over ssh. Logins are only allowed via ssh keys, and anyone who connects is restricted to running gitosis commands, preventing them from accessing anything they shouldn’t.
sudo adduser --system --shell /bin/sh --gecos 'git user' \
--group --disabled-password --home /srv/git git
Here I’ve set the home directory to /srv/git. This directory will hold all repositories and gitosis files. Next we need to initialize this directory with all of the gitosis configuration files:
sudo -H -u git gitosis-init < your-ssh-key.pub
(The -H option to sudo sets the HOME variable to the home directory of the user you are running commands as, in this case /srv/git.)
The your-ssh-key.pub file should be your ssh public key for the computer you are working on now. You will use this to access the administration repository and any other repositories you create later. If you don’t have an ssh key set up already, make one now and copy the id_rsa.pub file to the server before running the above command:
ssh-keygen -t rsa -b 4096
For more information on ssh-keys, see the ssh-keygen man page.
Note: by default, gitosis takes the comment field of your ssh key to be your username. In my case, it was mark@laptop, and I would have had to use mark@laptop as my username whenever editing permissions. If you want something nicer, edit the copy of your public key before running the gitosis-init command and change the comment field to the username you prefer.
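Rather than editing the key by hand, the comment can be swapped with a one-line awk filter. This is a sketch: `set_key_comment` is a made-up name, and it assumes the key file holds a plain "type key comment" line with no leading options.

```shell
# Hypothetical helper: print a public key with its comment replaced.
# Assumes a plain "type base64 [comment]" line with no leading options.
set_key_comment() {
    local keyfile=$1 name=$2
    awk -v name="$name" 'NF >= 2 { print $1, $2, name }' "$keyfile"
}
```

For example, `set_key_comment ~/.ssh/id_rsa.pub mark > your-ssh-key.pub` writes a copy whose comment field is simply mark, leaving the original key file untouched.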
Now you have the basic server set up. To edit the configuration, clone the administration repository:
git clone git@your-server.example.com:gitosis-admin.git
Then you can edit the gitosis.conf file, commit it, and push back to the server.
At this point I hit my first snag. Any changes pushed back to the server didn’t take effect. The magic updating of settings wasn’t working. After some hunting around (read: typing stuff into Google and clicking frantically), I found that all of the magic is done via a hook on the gitosis-admin repository. For some reason, the hook script wasn’t executable, and so never ran. Before committing any configuration changes, make sure to fix the permissions on the repository hook:
sudo chmod +x /srv/git/repositories/gitosis-admin.git/hooks/post-update
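If you would rather check before changing anything, a small guard along these lines does the job. It is a sketch with a made-up function name, `ensure_hook_executable`; it needs to run as a user allowed to change the file, so in practice you would run it as root or as the git user.

```shell
# Hypothetical check: make a hook executable only if it is not already.
ensure_hook_executable() {
    local hook=$1
    [ -f "$hook" ] || { echo "no such hook: $hook" >&2; return 1; }
    if [ ! -x "$hook" ]; then
        chmod +x "$hook" && echo "fixed: $hook"
    fi
}
```

Pointing it at /srv/git/repositories/gitosis-admin.git/hooks/post-update prints a line only when the permissions actually had to be fixed.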
The basic gitosis setup at this point is complete. Aside from adding repositories and new users, the remaining steps are optional. However, we are talking about making repositories publicly accessible, and the two remaining steps (setting up git:// access via git-daemon and setting up gitweb) will do this.
First though, here’s a quick overview on adding users/repositories.
To add a new user, get a copy of their ssh public key (ssh keys are what makes the whole thing work), and copy it to your gitosis-admin checkout inside the keydir directory. Name the file USERNAME.pub, replacing the username with the name of the user you wish to add - this is the username you will use when setting permissions. For example, if you add joe.pub, then you will use joe as the username in the configuration below.
To add a repository, you just give somebody permission to access it and then push. This involves editing the gitosis.conf and adding a few lines:
[group foo]
writable = myrepository
members = joe
This allows user joe to write to myrepository.git. You then add this as a remote in your local repository and push to create the repository on the server:
cd path/to/my-repository
git remote add origin git@your-server.example.com:myrepository.git
git push origin master
This assumes you actually have something to push. In practice this isn’t an issue - you start with a blank local repository (using git init), commit your first changes, and push. The first person to push actually creates the repository.
Setting up git:// access
This part allows people to clone a repository without needing to authenticate, and without having to generate ssh keys. You can’t push to repositories in this way however - you have to use ssh if you want to push back to a repository. Chances are, you don’t want all repositories to be public, and gitosis allows you to pick and choose which you make public and which you make private using (wait for it…) the gitosis.conf file.
Setting up git:// access is as simple as running the git-daemon command:
sudo -u git git-daemon --base-path=/srv/git/repositories/
If you’re running Ubuntu however, gitosis comes with a nice script that you just drop into /etc/event.d, edit to change the path, and it will start the git-daemon automatically on boot:
sudo cp gitosis/etc-event.d-local-git-daemon /etc/event.d/local-git-daemon
sudo sed -i s+/srv/example.com/git+/srv/git+ /etc/event.d/local-git-daemon
sudo initctl start local-git-daemon
The initctl command starts the daemon without rebooting, which is usually a good thing.
By default, no repositories are made public. To make them public, you need to add a daemon = yes option to your gitosis.conf:
[repo myrepository]
daemon = yes
Here we have made a new repo section for myrepository. Save the gitosis.conf file, commit, push, and you should be able to clone myrepository using git://your-server.example.com/myrepository.git.
Gitweb - making everything look pretty
The final step is getting gitweb working. For this you need a copy of gitweb.cgi and associated files. I built git from source, and gitweb.cgi was built as part of this, but if you didn’t do this, there is an Ubuntu package available called gitweb. I also use lighttpd on my server, with pages stored under /srv/www/domain.example.com/pages/ so I’ll be describing a configuration for that server and layout.
First, copy gitweb.cgi, gitweb.css, and all of the images to /srv/www/git.your-server.example.com/. I put the css files and images inside a pages subdirectory (the document root), and put gitweb.cgi inside a separate cgi-bin directory outside of the document root.
Next, configure lighttpd. I have simple-vhost set up which sets the document root based on the domain name requested, so we only need to do special set up for the git/cgi parts:
$HTTP["host"] =~ "^git\.your-server\.example\.com$" {
    url.redirect = (
        "^/$" => "/gitweb/",
        "^/gitweb$" => "/gitweb/"
    )
    alias.url = (
        "/gitweb/" => "/srv/www/git.your-server.example.com/cgi-bin/gitweb.cgi",
    )
    setenv.add-environment = (
        "GITWEB_CONFIG" => "/srv/www/git.your-server.example.com/gitweb.conf",
    )
    $HTTP["url"] =~ "^/gitweb/" { cgi.assign = ("" => "") }
}
Gitosis does provide a config file for lighttpd, but it wasn’t appropriate for my setup. Note that the above needs the following modules loaded: mod_alias, mod_cgi, mod_redirect, mod_setenv.
Gitweb.cgi needs editing slightly for the above configuration. By default it looks for the css file in the same location as the gitweb.cgi file (i.e. in a /gitweb/ dir), but it is stored at the root of the site. Open up gitweb.cgi and search for gitweb.css. Add a slash before the filename and save the file.
Next is creating a gitweb.conf file. Again, gitosis helps out here, providing a gitweb.conf file that just needs some tweaking with the right paths. Copy the gitweb.conf file from the gitosis source distribution to /srv/www/git.your-server.example.com/, and open it up for editing.
Edit the $projects_list, $projectroot, and the @git_base_url_list lines and save:
$projects_list = '/srv/git/gitosis/projects.list';
$projectroot = "/srv/git/repositories";
@git_base_url_list = ('git://your-server.example.com');
By default, gitosis creates repositories that are only accessible by the git user and users in the git group, so we need to give the web server permissions to access repositories. If this isn’t done, gitweb will say that there are no repositories available even when you configure web access in gitosis for the repository.
sudo usermod -a -G git www-data
You will need to restart the web server after this in order for the group change to take effect. If your web server runs as a user other than www-data, change the above command appropriately.
Finally, to give access to a repository via gitweb, the process is similar to setting up git:// access. Edit gitosis.conf, and add a gitweb = yes line next to the daemon = yes line for the repository. Commit, push, and the repository should now show up in gitweb. In the default configuration, you need to have both daemon = yes and gitweb = yes for a repository to be made available via gitweb. See the gitweb.conf file if you want to change this.
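The resulting section can also be appended from the shell inside your gitosis-admin checkout. This is a sketch with a made-up helper name, `publish_repo`; it only handles the case where the repository has no [repo ...] section yet, and you still need to commit and push afterwards.

```shell
# Hypothetical helper: append a public [repo ...] section to gitosis.conf
# unless a section for that repository already exists.
publish_repo() {
    local conf=$1 repo=$2
    if grep -qF "[repo $repo]" "$conf"; then
        echo "section [repo $repo] already exists; edit it by hand" >&2
        return 1
    fi
    # Both options are needed for gitweb in the default configuration.
    printf '\n[repo %s]\ndaemon = yes\ngitweb = yes\n' "$repo" >> "$conf"
}
```

Calling `publish_repo gitosis.conf myrepository` adds a section with both daemon = yes and gitweb = yes, which is what the default gitweb.conf expects.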
Bash function renaming and overriding
20 Sep 2009
One annoyance I found when writing bash scripts is the lack of function references. This became apparent when I wanted to override a function but change its behavior only slightly. I had a library of functions, and wanted to add some commands before the start of the function, and some cleanup code immediately after it finished.
This being a library function that was called elsewhere, I couldn’t edit the function in the library itself. Nor could I edit the calling code and add the steps before and after - the calling code was itself another library function. This left the option of copying and pasting the entire function, and adding my extra code to the beginning and end.
In python (and many other languages), I would have done something like the following:
old_foo = foo
def foo():
    initialization_code()
    old_foo()
    cleanup_code()
but bash doesn’t seem to support function references in that manner. After much searching however, I finally found a way to save a function under a new name, which gives the same kind of functionality using bash’s declare builtin.
The declare command prints out the values of declared variables, and more importantly, declared functions - declare -f foo will print out the code for function foo. So all you need to do is execute the output of the declare -f command, after substituting the name of the function. The following bash function does just this:
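save_function() {
    # Grab the source of function $1 as printed by declare -f
    local ORIG_FUNC=$(declare -f "$1")
    # Swap the leading function name for the new name $2
    local NEWNAME_FUNC="$2${ORIG_FUNC#"$1"}"
    # Define the renamed copy
    eval "$NEWNAME_FUNC"
}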
Add that to your scripts, and you have a simple way to copy/rename a function, and a simple way to add a step before/after an existing function. To copy the python example above:
save_function foo old_foo
foo() {
    initialization_code
    old_foo
    cleanup_code
}
Now any code calling foo in the script will get the new behavior.
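Putting the pieces together, a complete script might look like the following sketch. It requires bash, and the `greet` and `original_greet` names are invented for the example. The wrapper forwards its arguments with "$@" and returns the exit status of the saved function, two details worth keeping when the wrapped function takes parameters.

```shell
# Requires bash (declare -f is a bash builtin).
save_function() {
    local orig_func
    orig_func=$(declare -f "$1")
    # Replace the leading function name with the new name and define the copy.
    eval "$2${orig_func#"$1"}"
}

# Stand-in for a library function we cannot edit (hypothetical name).
greet() { echo "hello, $1"; }

# Keep the original under a new name, then override it.
save_function greet original_greet

greet() {
    echo "before"
    original_greet "$@"
    local status=$?
    echo "after"
    return "$status"
}
```

Calling `greet world` now runs the extra steps around the original body, without the original having been copied and pasted anywhere.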
Bash quoting and whitespace
18 Apr 2009
A common thing when writing shell scripts is to allow the user to specify options to commands in a variable. Something like the following:
$ OPTS="--some-option --some-other-option"
$ my_command $OPTS
We can set my_command to the following script to see exactly what gets passed:
#!/bin/bash
for t; do
    echo "'$t'"
done
Running the above prints the following output:
'--some-option'
'--some-other-option'
This works fine, until you want to include options with whitespace in them:
$ OPTS="--libs='-L/usr/lib -L/usr/local/lib'"
$ my_command $OPTS
'--libs='-L/usr/lib'
'-L/usr/local/lib''
This output clearly isn’t what we want. We want a single parameter passed with the entire content of $OPTS. The culprit here is Word Splitting. Bash will split the value of $OPTS into individual parameters based on whitespace. One way to get around this is to put $OPTS in double quotes:
$ OPTS="--libs='-L/usr/lib -L/usr/local/lib'"
$ my_command "$OPTS"
'--libs='-L/usr/lib -L/usr/local/lib''
$ OPTS="--libs=-L/usr/lib -L/usr/local/lib"
$ my_command "$OPTS"
'--libs=-L/usr/lib -L/usr/local/lib'
Putting $OPTS in double quotes suppresses word splitting. In the first command, the single quotes were passed through to the command as part of the parameter, which isn’t what we wanted, so the second command leaves them out. So far, so good. The problem, as you may have spotted by the removal of the single quotes, comes when we want to pass more than one parameter in $OPTS:
$ OPTS="--cflags=O3 --libs=-L/usr/lib -L/usr/local/lib"
$ my_command "$OPTS"
'--cflags=O3 --libs=-L/usr/lib -L/usr/local/lib'
Here, the entire $OPTS variable gets passed as a single parameter, which isn’t what we want. We want --cflags to be passed as one parameter, and --libs (and everything that comes with it) to be passed as another parameter. Adding more quotes, backslash escaped or not, does nothing to help.
The solution? Use bash arrays:
$ OPTS=("--cflags=O3" "--libs=-L/usr/lib -L/usr/local/lib")
$ my_command "${OPTS[@]}"
'--cflags=O3'
'--libs=-L/usr/lib -L/usr/local/lib'
Perfect. But what about backward compatibility? If you have hundreds of scripts that use a string for $OPTS, how does it work if you change to using arrays? Let’s try it out:
$ OPTS="--some-option --some-other-option"
$ my_command "${OPTS[@]}"
'--some-option --some-other-option'
So it works if your old scripts only pass a single option, but if multiple options are needed, the scripts will need to be changed to use arrays instead. This however seems to be the best option for passing multiple arguments with whitespace.
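For scripts that cannot all be converted at once, one possible stopgap is to convert a legacy string into an array before it is used. This is a sketch, not a general solution: `normalize_opts` is a made-up name, it requires bash, it assumes the default IFS and a single-line value, and it splits on whitespace just as an unquoted $OPTS would, so old strings still cannot carry options that contain spaces.

```shell
# Requires bash. Hypothetical shim: turn a legacy string OPTS into an array.
normalize_opts() {
    # Already an array? Then there is nothing to do.
    if [[ "$(declare -p OPTS 2>/dev/null)" == "declare -a"* ]]; then
        return 0
    fi
    # Split the old string on whitespace, as unquoted $OPTS used to do
    # (minus filename globbing). Only the first line is read.
    local legacy=${OPTS-}
    read -r -a OPTS <<< "$legacy"
}
```

After calling `normalize_opts`, `my_command "${OPTS[@]}"` passes '--some-option' and '--some-other-option' separately for old string values, while values that are already arrays pass through unchanged.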