Friday, September 20, 2013

Interesting new technologies

A couple of things I heard about at Postgres Open that deserve closer inspection are Docker, a nice container toolkit that leverages, among other things, Linux LXC, and OpenShift, another containerization platform. Today I discovered that while I was travelling home yesterday (and taking far too long about it), a major announcement was made of a collaboration between Docker and Red Hat, the principal sponsor of OpenShift. This is pretty cool stuff. OpenShift already has a "cartridge" for PostgreSQL, and I am going to play with it the first opportunity I get.


  1. As I understand it, Docker uses aufs, a union (layered) filesystem. Some benchmarks and further exploration would be really interesting.

  2. Yeah, I haven't explored it much yet. But I do think this will be important in the future of virtualization - lightweight containers that can be stood up easily from scratch in seconds are quite a big advance.

  3. I've been playing with docker this weekend, and so far it's pretty cool. They emphasize the disk-space savings and copy-on-write to show how 'cheap' dockers can be to turn up, but that means they haven't spent much time making them useful as permanent virtual-machine replacements (for, say, development environments). A few examples:

    1) Its default base image (ubuntu) has no non-root user, no root password, no port 22 exposed, no openssh, etc. You have to explicitly tell docker to run sshd as the container's command (not just daemonize it) to make that work. If you just run a docker backgrounded, it won't actually run - it *must* have a command, even if it's something silly like a wait command in a bash script. The point is you have a chicken-and-egg problem if you want to turn it up and then let a deployment solution (puppet, salt, ansible) manage things like your access.
    2) To build an image, you write a 'Dockerfile' with commands to get it up and running (install a few basic packages like openssh, put in authorized keys, etc.). Each command makes its own docker container built on top of the previous one. I know that's useful for container development, but it's not so practical for regular use, especially when they have a 42-container limit. Worse, there are still known bugs where it sometimes won't let you delete those (so I've got ~12 of them stuck). You can put all the docker commands into a single shell script to try to avoid that, but it's just not ready for prime time.
    3) It's only a single process at a time, so there are issues with backgrounding, such as ubuntu tools assuming upstart is running. Upstart actually doesn't work without a hack buried in a Git issue by the core team, and it took a while to figure out how to apply that from my Ansible scripts, which assumed upstart was running.
    4) You can only send files to docker from subdirectories of your Dockerfile configuration, even though it all runs as root. That's again an issue if you're using a deployment tool and want to copy authorized_keys from a different location than your config template.

    As a virtual machine alternative, it's still got some rough spots. I think docker currently is a great option for running single-process functions in parallel (long-running build/teardown unit tests) and for quick experiments (letting anonymous users test your app in a short-lived docker). The community seems to be growing fast, so things will get ironed out.

    So far, my recommendation is to consider Docker for things that are a single process and not a collection of processes. I'm specifically thinking about something like node.js running in cluster mode (multicore) plus a few cron jobs, and what the implications would be with docker.

  4. This comment has been removed by a blog administrator.
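
As a rough illustration of two workarounds mentioned in the third comment (collapsing the setup steps into one command so they do not each produce their own container, and giving the container a foreground command so it stays running), here is a minimal Python sketch that renders a Dockerfile as text. The helper name `render_dockerfile`, the package list, and the choice of `sshd -D` as the foreground process are assumptions made for the example, not something taken from the post or the comments, and the sketch only builds a string; it does not call docker.

```python
import json


def render_dockerfile(base_image, setup_commands, foreground_command, port=22):
    """Render Dockerfile text with all setup steps in one RUN instruction.

    Hypothetical helper for illustration only: joining the steps with
    '&&' means the build records a single RUN step instead of one per
    command.
    """
    if not setup_commands:
        raise ValueError("at least one setup command is required")
    # One RUN line for every setup step, stopping at the first failure.
    run_line = "RUN " + " && ".join(setup_commands)
    # JSON (exec) form of CMD: the process the container keeps in the foreground.
    cmd_line = "CMD " + json.dumps(list(foreground_command))
    lines = [f"FROM {base_image}", run_line, f"EXPOSE {port}", cmd_line]
    return "\n".join(lines) + "\n"


# Assumed example values: an ubuntu base image with openssh installed.
dockerfile = render_dockerfile(
    "ubuntu",
    [
        "apt-get update",
        "apt-get install -y openssh-server",
        "mkdir -p /var/run/sshd",
    ],
    ["/usr/sbin/sshd", "-D"],  # -D keeps sshd in the foreground
)
print(dockerfile)
```

Whether a single RUN step is the right trade-off depends on how much you rely on cached intermediate steps during image development; the sketch simply shows the shape of the workaround.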