In this post we describe the Capstan tool, which is being extended within the MIKELANGELO project. Capstan is a member of the Lightweight Execution Environment Toolbox (LEET) and is responsible for managing composable application packages and the OSv unikernel lifecycle. This post explains how these packages are prepared so that they are 100% compatible with the OSv kernel and with each other. The source code of the approach described here is available in the GitHub repository.
How does Capstan work
Capstan is a tool for rapidly composing OSv unikernels that was initially implemented by ScyllaDB. For the last three years, Capstan has been under active development by the MIKELANGELO consortium, which introduced support for precompiled packages that significantly simplifies the process of building OSv unikernels.
Capstan does not compile anything from source code but rather relies on precompiled pieces of software, MPM packages, that are downloadable from a public S3 repository. To prepare an OSv unikernel for your specific application, Capstan performs the following three steps:
- downloads the unikernel base containing the OSv kernel along with some other prerequisites and boots it
- downloads packages required by your application and uploads them to the unikernel
- uploads user application files to the unikernel
What are MPM packages
An MPM (MIKELANGELO Package Manager) package is a compressed directory that contains precompiled software, libraries and configuration files. When composing a unikernel, Capstan extracts the content of an MPM package and copies it as is to the unikernel filesystem, effectively composing the target unikernel out of smaller building blocks. Precompiled software (usually files whose names end with .so) is then loaded into memory on demand by the ELF dynamic linker, as described in this academic article, and used by the application as a library.
Preparing a package requires knowledge of the application itself as well as of OSv’s limitations. As an example, the following recipe builds the Python 2.7 runtime and packages it into an MPM package.
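The "extract and copy as is" step can be pictured with a minimal Python sketch. It assumes only what this post states elsewhere, namely that an .mpm file is a .tar.gz archive. The function names and the demo archive are hypothetical stand-ins, not Capstan's actual code.

```python
import io
import tarfile
from pathlib import Path


def make_demo_mpm(path):
    """Create a tiny stand-in archive (hypothetical content, not a real package)."""
    files = {"libdemo.so": b"\x7fELF", "meta/package.yaml": b"name: demo\n"}
    with tarfile.open(path, "w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))


def extract_mpm(mpm_path, target_root):
    """Unpack a gzip-compressed tar archive into target_root, keeping its layout.

    Returns the relative paths of all files found under target_root afterwards.
    """
    target_root = Path(target_root)
    target_root.mkdir(parents=True, exist_ok=True)
    with tarfile.open(mpm_path, "r:gz") as archive:
        archive.extractall(target_root)
    return sorted(
        p.relative_to(target_root).as_posix()
        for p in target_root.rglob("*")
        if p.is_file()
    )
```

Extracting several packages into the same target directory, one after another, mirrors the idea of composing a unikernel filesystem out of building blocks.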
A - Build python from source
$ git clone https://github.com/python/cpython.git
$ cd cpython
$ git checkout -b 2.7 origin/2.7
# We have to enable building of shared libraries
# (configure expects an absolute path for --prefix)
$ ./configure --enable-shared --prefix="$PWD/install"
$ make
# We also have to ensure python binary is a shared object (dynamic library)
$ sed -i 's/^LDFLAGS=.*$/LDFLAGS=-shared/g' Makefile
$ make python
$ make install
B - Copy build result into a new directory
# ${RES} must point to the (not yet existing) directory that will hold the package content
$ mkdir ${RES}
$ cp ./install/bin/python ${RES}
$ cp ./install/lib/libpython* ${RES}
$ mkdir -p ${RES}/pyenv/lib
$ cp -r ./install/lib/python2.7 ${RES}/pyenv/lib
$ cp /lib/x86_64-linux-gnu/libreadline.so* ${RES}
$ cp /lib/x86_64-linux-gnu/libtinfo.so* ${RES}
C - Add metadata and compress package
$ cd ${RES}
$ capstan package init --name python --title "Python 2.7"
$ capstan package build
After step C we obtain a compressed directory, the python MPM package, containing the following directories and files:
ROOT/
├── libpython2.7.so
├── libpython2.7.so.1.0
├── libreadline.so.6
├── libreadline.so.6.3
├── libtinfo.so.5
├── libtinfo.so.5.9
├── meta
│   └── package.yaml
├── pyenv
│   └── lib
└── python
This package can now be used locally to compose unikernels for running Python applications, or uploaded to a public S3 repository where any user can download it and use it when composing their own unikernel. However, when using an MPM package that was prepared by someone else, you can quickly run into problems, in particular when combining it with your own precompiled software in the same unikernel. The cause is the incompatibility issues presented next.
Incompatibility issues
As mentioned earlier, Capstan is a tool for composing unikernels rapidly. It doesn’t build any source code but rather downloads precompiled MPM packages from the S3 repository and uploads them to the unikernel filesystem, as if it were assembling a jigsaw puzzle. But we must be careful not to combine incompatible pieces, i.e. pieces that were compiled with different compiler versions or against different library versions. Two kinds of incompatibility can arise:
- incompatibility between MPM package and OSv kernel
- incompatibility between different MPM packages
If the OSv kernel, for example, was built on an Ubuntu 14.04 system and the MPM package on an Ubuntu 16.04 system, the pieces won’t fit. When the unikernel is run, it will crash as soon as the application attempts to access a symbol that exists on the Ubuntu 16.04 system (where the MPM package was built) but does not exist on the Ubuntu 14.04 system (where the OSv kernel was built).
A GitHub issue has been open for this for quite a while now, but the problem seems to be a hard nut to crack and is therefore not solved yet. In this post we therefore present our workaround for it.
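The failure mode described above boils down to a set difference: symbols a package needs minus symbols the kernel side provides. The Python sketch below illustrates this on text in the style of `nm -D` output. The sample listings and the symbol name new_libc_call are made up for illustration, and the parsing is deliberately simplified (weak undefined symbols are ignored); it is not how OSv itself resolves symbols.

```python
def parse_nm(output):
    """Split `nm -D`-style text into (undefined, defined) sets of symbol names."""
    undefined, defined = set(), set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        kind, name = fields[-2], fields[-1]
        if kind == "U":
            undefined.add(name)      # needed, must come from somewhere else
        elif kind in ("w", "v"):
            continue                 # weak undefined: optional, skipped here
        else:
            defined.add(name)        # provided by this object
    return undefined, defined


def missing_symbols(package_nm, kernel_nm):
    """Return symbols the package needs that the kernel listing does not provide."""
    needed, _ = parse_nm(package_nm)
    _, provided = parse_nm(kernel_nm)
    return sorted(needed - provided)


# Hypothetical listings: the package was built against a newer system.
PACKAGE_NM = """\
                 U memcpy
                 U new_libc_call
0000000000001139 T PyRun_SimpleString
"""
KERNEL_NM = """\
0000000000402000 T memcpy
0000000000402100 T printf
"""

if __name__ == "__main__":
    print(missing_symbols(PACKAGE_NM, KERNEL_NM))  # ['new_libc_call']
```

An empty result does not prove compatibility, but a non-empty one points directly at the kind of mismatch that crashes the unikernel at run time.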
Docker containers to the rescue
We’ve just described that MPM packages won’t play well with each other and with the OSv kernel unless they are built on the same platform. The solution is actually quite simple: we must build them all on the same platform. If the mountain will not come to Muhammad, then Muhammad must go to the mountain, as they say! So we’ve prepared a Docker container inside which we first compile the OSv kernel and then all the MPM packages. This way the packages are always 100% compatible with the kernel and with each other.
The Dockerfile for mikelangelo/capstan-packages is available in the GitHub repository, while the Docker image is available on DockerHub.
The mikelangelo/capstan-packages container
One can think of the mikelangelo/capstan-packages container as a transportable platform for building MPM packages. With Docker containers, there are two phases to take into account: the building phase and the running phase. The result of the building phase is a permanent container image that can be run multiple times, while the result of the running phase is one or more MPM packages. Once the container is stopped and removed, its filesystem is gone and all its resources are reclaimed; only host directories that were mounted into the container persist.
Here is how mikelangelo/capstan-packages works. During the building phase, all tools and libraries needed for preparing MPM packages are installed (curl, Git, Go, Java, Capstan etc.). The OSv source code and its submodules (a.k.a. “osv-apps”) are also downloaded, and the OSv kernel is compiled in this phase. The last step of the building phase specifies the container’s command:
CMD python /capstan-packages.py; \
echo "\n--- Script exited, container will now sleep ---\n"; \
sleep infinity
Notice that when a container based on this image is run (we are now talking about the running phase), the Python script capstan-packages.py is invoked. When the script finishes, the container sleeps indefinitely until the user hits CTRL+C. As soon as this happens, the running phase is over and the container is discarded.
There is a “recipe” provided for each MPM package. A recipe is a Bash script (just like the one with steps A, B, C for the Python package shown above) that is run by the capstan-packages.py script inside the container in the running phase. In other words, when the mikelangelo/capstan-packages container is run, the capstan-packages.py script is executed; it iterates over all available recipes and runs them. Running a single recipe yields a single MPM package. For example, if there is a recipe called “python-2.7”, then running it will yield the “python-2.7” MPM package.
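The recipe loop can be sketched in a few lines of Python. This is not the actual capstan-packages.py; it is a minimal illustration that assumes each recipe is a directory holding a build.sh script and that the output of each run is captured in one log file per recipe. The directory layout, function name and parameters are assumptions made for the example.

```python
import subprocess
from pathlib import Path


def run_recipes(recipes_dir, log_dir, shell="bash"):
    """Run build.sh of every recipe directory; return {recipe name: succeeded}.

    stdout and stderr of each script go to <log_dir>/<recipe name>.log.
    """
    recipes_dir, log_dir = Path(recipes_dir), Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for recipe in sorted(p for p in recipes_dir.iterdir() if p.is_dir()):
        script = recipe / "build.sh"
        if not script.is_file():
            continue  # not a recipe, skip it
        with open(log_dir / f"{recipe.name}.log", "wb") as log:
            proc = subprocess.run(
                [shell, str(script.resolve())],
                cwd=recipe,
                stdout=log,
                stderr=subprocess.STDOUT,
            )
        # A failing recipe is recorded but does not stop the remaining ones.
        results[recipe.name] = proc.returncode == 0
    return results
```

Recording failures instead of aborting is a design choice of this sketch: one broken recipe should not prevent the other packages from being built.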
How to use the container
The simplest way to use the container is to skip the building phase by pulling the image from DockerHub and jump directly to the running phase. Keep in mind, though, that OSv is built in the building phase, so when you download the image from DockerHub you are bound to the OSv version that is baked into the image. If you need the latest OSv version, you will have to perform the building phase yourself. Here is how you pull the container image from DockerHub:
$ docker pull mikelangelo/capstan-packages:2017-08-02_9aba80a
And here is how you run the container using the image that we’ve just pulled:
$ mkdir ./result
$ docker run -it --volume="$PWD/result:/result" mikelangelo/capstan-packages:2017-08-02_9aba80a
As mentioned earlier, when the running phase of the Docker container ends, the container is discarded. Therefore we need to mount a directory from the host, in our case ./result, to the /result directory on the container filesystem. This way we won’t lose the result when the container stops, as the compiled MPM packages will be waiting for us in the ./result directory on the host.
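If you prefer to start the container from a script, the same invocation can be assembled in Python. The sketch below only prepares the host directory and builds the argument list; the function name is hypothetical, and the image tag is the one pulled above. It resolves the host path to an absolute one, which is what the $PWD prefix achieves in the shell command.

```python
from pathlib import Path

# Image tag taken from the `docker pull` command in this post.
IMAGE = "mikelangelo/capstan-packages:2017-08-02_9aba80a"


def docker_run_command(result_dir, image=IMAGE):
    """Create the host result directory and return the `docker run` argv."""
    host_dir = Path(result_dir).resolve()   # bind mounts need an absolute host path
    host_dir.mkdir(parents=True, exist_ok=True)
    return ["docker", "run", "-it", f"--volume={host_dir}:/result", image]
```

The returned list can be handed to `subprocess.run(docker_run_command("./result"), check=True)` on a machine where Docker is installed.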
When the container is run, it performs the following tasks (see also the figure above):
- copy mike/osv-loader into result directory (it was built during the building phase)
- run each recipe to obtain MPM package
- test each MPM package by composing actual unikernel with it and examining the stdout
- copy MPM packages to the host
The whole process currently takes an hour or so, but when it completes, you can be sure that the packages are compatible with the kernel and with each other, because they were all built on the same platform. The container provides the following content in the ./result directory:
$ tree -L 2 result
result
├── intermediate
│   ├── erlang-7.0
│   ├── mysql-5.6.21
│   ├── node-4.4.5
│   └── ...
├── log
│   ├── erlang-7.0.log
│   ├── mysql-5.6.21.log
│   ├── node-4.4.5.log
│   └── ...
├── mike
│   └── osv-loader
│       ├── index.yaml
│       ├── osv-loader.qemu
│       └── osv-loader.qemu.gz
└── packages
    ├── erlang-7.0.mpm
    ├── erlang-7.0.yaml
    ├── mysql-5.6.21.mpm
    ├── mysql-5.6.21.yaml
    ├── node-4.4.5.mpm
    ├── node-4.4.5.yaml
    └── ...
Where:
- the intermediate directory contains uncompressed packages. As the name suggests, these are not final results, but they come in handy if you need to peek into a package’s content.
- the log directory contains one file per package that was built. The file holds the redirected stdout and stderr of the recipe’s build.sh script. In other words, when building a recipe fails, this is where you find out what went wrong.
- the mike directory contains the compiled OSv kernel, packaged into a small QEMU image. Copy this whole directory into your $CAPSTAN_ROOT/repository and Capstan will be able to compose images that are based on mike/osv-loader.
- the packages directory contains the MPM packages. There are two files for each package: <package-name>.mpm and <package-name>.yaml. The former contains the actual package files (in .tar.gz format, in case you were wondering) while the latter contains the package metadata. Copy this whole directory into your $CAPSTAN_ROOT and Capstan will be able to compose images that require these packages.
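The two copy steps mentioned for the mike and packages directories can be scripted. The following Python sketch assumes the Capstan root directory is passed in explicitly (for example, the value of your CAPSTAN_ROOT environment variable); the function name is hypothetical and the code is only an illustration of the copy layout described above.

```python
import shutil
from pathlib import Path


def install_results(result_dir, capstan_root):
    """Copy the container's results into a Capstan root directory.

    result/mike     -> <capstan_root>/repository/mike
    result/packages -> <capstan_root>/packages
    """
    result_dir, capstan_root = Path(result_dir), Path(capstan_root)
    shutil.copytree(
        result_dir / "mike",
        capstan_root / "repository" / "mike",
        dirs_exist_ok=True,   # merge into an existing repository (Python 3.8+)
    )
    shutil.copytree(
        result_dir / "packages",
        capstan_root / "packages",
        dirs_exist_ok=True,
    )
```

Existing files with the same name are overwritten, so packages from an earlier run are replaced by the freshly built ones.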
Notice how we obtained everything that Capstan needs to compose unikernels: the mike/osv-loader image (a.k.a. the “base unikernel”) and all the MPM packages. And it’s guaranteed that they were all compiled on the same platform and are thus compatible. There is also a script available that uploads all the packages and the mike/osv-loader image to the public S3 repository, where Capstan users can download them. This requires S3 credentials and a properly configured s3cmd tool.
Final thoughts
This post describes how incompatibility between precompiled MPM packages can be addressed. Our approach is to provide a common platform, in the form of a Docker container, where everyone can compile their packages and thus avoid incompatibility issues with others. The idea is that each user provides not a pre-built MPM package with their software, but rather a “recipe” that teaches the platform how to build it. The owner of Capstan’s S3 repository then runs the new recipes and uploads the resulting MPM packages to make them publicly available.