Accelerate Buildout
We use zc.buildout 2.13.6 and setuptools 38.7.0 to assemble our Zope-based product. Buildout is a wonderful build tool: simple and easy to use. We have been using it for over a decade without any problems.
Because of business needs, we automated our egg releases, and the package server started filling up. We then began to experience slow buildout runs. On investigation, we noticed that the more eggs a package had on the server, the longer it took to download its pinned egg. easy_install and setuptools have a lot of checks built in, and for a package with many eggs a good amount of index-page data is transferred from the package server. This is overkill, and buildout was getting progressively slower.
My first idea was to reduce the frequency of egg releases and lighten the package server by removing unwanted eggs, but I soon realized that this was not a viable option.
So I decided to find a different way to make Buildout faster. Buildout caches eggs in a download-cache folder. I tried caching a few eggs and ran Buildout, but soon realized this is not an option either: even when the download-cache has the eggs required for the build, Buildout still hits the package server for a lookup.
Next I considered an intermediate package server holding a subset of the main package server. It would have only the required eggs at their pinned versions. This is a good solution, but we would end up building many package servers and adding many more processes to manage these subsets of eggs.
Then I tried a buildout recipe to download the eggs. I put together a simple Python script, and buildout supplies it with the name and version of each pinned egg. We need to extend all the configuration files. This very simple recipe gave me the result I was looking for. The script first tries direct downloads with the py2.7, tar.gz and zip extensions. If that lookup returns nothing, it fetches the index page of the package and grabs all eggs of the pinned version. This takes care of finding the right extension.
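The lookup order described above can be sketched roughly as follows. This is only an illustration, not the actual script: the function names, the `-py2.7.egg` suffix spelling and the matching rule for index links are my assumptions, and the network calls are left out so that only the name handling is shown.

```python
from html.parser import HTMLParser

# Suffixes tried for a direct download, in order. "-py2.7.egg" is an
# assumed spelling of the "py2.7" lookup mentioned in the text.
SUFFIXES = ("-py2.7.egg", ".tar.gz", ".zip")
ARCHIVE_EXTS = (".tar.gz", ".zip", ".egg")


def candidate_filenames(name, version):
    """File names to try directly before falling back to the index page."""
    return ["%s-%s%s" % (name, version, suffix) for suffix in SUFFIXES]


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def links_for_version(index_html, name, version):
    """Fallback: pick the links on a package index page for one version."""
    collector = _LinkCollector()
    collector.feed(index_html)
    prefix = ("%s-%s" % (name, version)).lower()
    matches = []
    for href in collector.hrefs:
        # Drop any "#md5=..." fragment and directory part, then compare
        # case-insensitively.
        filename = href.split("#", 1)[0].rsplit("/", 1)[-1].lower()
        rest = filename[len(prefix):]
        # Accept "<name>-<version>-..." or "<name>-<version><ext>", so that
        # 1.2 matches neither 1.20 nor 1.2.1.
        if filename.startswith(prefix) and (
                rest.startswith("-") or rest in ARCHIVE_EXTS):
            matches.append(href)
    return matches
```

A caller would try each name from `candidate_filenames` against the package server first and only download and parse the index page when none of them exists.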
This downloads all the eggs to the file system in about 5 minutes.
I created separate configuration files for caching eggs, leaving the existing configuration files untouched.
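For readers who have not written a recipe before, here is a minimal skeleton of what such a caching recipe could look like. It is a sketch under assumptions, not my actual recipe: the class name `EggCache`, the `target` option and the `versions` section name are hypothetical, and the download step is only logged.

```python
import logging
import os


class EggCache(object):
    """Hypothetical recipe that mirrors pinned eggs into a local directory."""

    def __init__(self, buildout, name, options):
        self.log = logging.getLogger(name)
        # Assumes the pins live in a section called "versions".
        self.pins = dict(buildout["versions"])
        # "target" is an assumed option naming the cache directory.
        self.target = options.get("target", "egg-cache")

    def install(self):
        if not os.path.isdir(self.target):
            os.makedirs(self.target)
        for name, version in sorted(self.pins.items()):
            self.fetch(name, version)
        # Return no paths, so buildout does not remove the cache on uninstall.
        return ()

    # Re-run the same logic when the part is updated.
    update = install

    def fetch(self, name, version):
        """Placeholder for the download step; here it only logs the pin."""
        self.log.info("would download %s==%s into %s",
                      name, version, self.target)
```

Buildout calls the constructor with the whole configuration, the part name and the part's options, which is how the recipe learns the name and version of every pinned egg.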
Now it was time to test the new file-system-based local package server. I overrode the configured package server with a command-line parameter. For the first test I ran it in verbose mode to see everything that was happening. I was very satisfied with the initial result: the buildout run came down from 6 hours to just 40 minutes. Yes, this simple trick saves 5 hours and 20 minutes on every fresh buildout run.
One thing I noticed here was that my recipe downloaded eggs sequentially, which took about 5 minutes. I later made the recipe use multiple processes to speed this up, and it now rebuilds the entire package server in under 2 minutes. This works only on *nix, though. For Windows users the eggs are still downloaded sequentially. They should be happy using Windows.
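The parallel download with a sequential fallback can be sketched like this. It is an illustration only, written for Python 3: the function names and the `(url, target)` job shape are assumptions, and I assume here that the *nix restriction comes from relying on fork-based worker processes.

```python
import multiprocessing
from urllib.request import urlretrieve


def fetch_egg(job):
    """Download one egg; job is a (url, target_path) pair."""
    url, target = job
    urlretrieve(url, target)
    return target


def download_all(jobs, fetch=fetch_egg, processes=None):
    """Run fetch over all jobs, in parallel where fork is available."""
    jobs = list(jobs)
    can_fork = "fork" in multiprocessing.get_all_start_methods()
    if processes is None:
        processes = multiprocessing.cpu_count() if can_fork else 1
    if not can_fork or processes <= 1 or len(jobs) <= 1:
        # Windows, or nothing to parallelize: download one egg at a time.
        return [fetch(job) for job in jobs]
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes) as pool:
        # map() keeps the results in the same order as the jobs.
        return pool.map(fetch, jobs)
```

Since downloads are mostly waiting on the network, more workers than CPUs may still help; the default of one worker per CPU is just a starting point.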
I had to handle ‘_’ [underscores] and case sensitivity issues. A package server takes care of these when you use a URL, but when you refer to the file system, the file path has to match exactly, including case.
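One way to deal with this is to normalize names before comparing them with what is on disk. The sketch below uses the normalization rule from PEP 503, which may be broader than what my recipe does; `find_package_dir` and the one-directory-per-package layout are assumptions.

```python
import os
import re


def normalize(name):
    """Fold case and treat '-', '_' and '.' alike (the PEP 503 rule)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_package_dir(root, name):
    """Return the directory under root whose name matches, or None."""
    wanted = normalize(name)
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if normalize(entry) == wanted and os.path.isdir(path):
            return path
    return None
```

With this, a request for `my-package` finds a directory called `My_Package` without the file system having to be case-insensitive.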
In a few places we use the zc.recipe.cmmi recipe. It does not pass the ‘-j’ flag to `make`. This flag lets `make` run several jobs in parallel and so use the available CPUs. I am writing a patch for it. Until then you can export the MAKEFLAGS environment variable. On Windows this will not be honored, but on *nix it works well.
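If buildout is started from a Python wrapper rather than a shell, the same workaround can be applied to the child environment. A minimal sketch, assuming one `make` job per CPU is what you want; the helper name is hypothetical.

```python
import multiprocessing
import os


def env_with_parallel_make(jobs=None, base=None):
    """Copy an environment and set MAKEFLAGS so make runs jobs in parallel."""
    env = dict(os.environ if base is None else base)
    if jobs is None:
        jobs = multiprocessing.cpu_count()
    # Note: this replaces any MAKEFLAGS value that was already set.
    env["MAKEFLAGS"] = "-j%d" % jobs
    return env
```

The result can be passed as the `env` argument of `subprocess.call` when launching buildout, so every `make` started by the build inherits the setting.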
So this recipe got buildout running at blazing speed.