Importing PyPI packages into Baserock

This guide shows how to use the import tool to import a package hosted on PyPI into Baserock. For a more general and comprehensive guide to the import tool, see How to use the Baserock Import tool to import Ruby on Rails.

This guide assumes you have the import tool installed on your system. If not, see the quickstart guide.

A simple import: Flask

Here we will use the import tool to generate a set of definitions that we can use to build Flask in Baserock. We will not lorry any repositories; for details on how to use lorry with the import tool, see How to use the Baserock Import tool to import Ruby on Rails.

First clone the definitions repository:

git clone git://git.baserock.org/baserock/baserock/definitions /src/definitions

Then run the importer. Note the use of the --use-local-sources flag: it tells the importer that we are building from our local source repos rather than from source repos on a trove. If you plan to build from sources stored on a trove, do not use this flag.

mkdir /src/baserock-import
cd /src/baserock-import
baserock-import python Flask --use-local-sources

01/12/15 18:09:08 UTC: Import of python Flask started
Not updating existing Git checkouts or existing definitions
Flask: calling python.to_lorry to generate lorry
Lorrying http://github.com/mitsuhiko/flask/
...
MarkupSafe 0.23: calling python.to_chunk to generate chunk morph
MarkupSafe 0.23: calling python.find_deps to calculate dependencies
Generating stratum morph for Flask
01/12/15 18:09:53 UTC: Import of python Flask ended (took 44 seconds)

The import tool has successfully imported Flask and its dependencies and generated a set of definitions in baserock/baserock/definitions. You can copy these definitions into your definitions repo and add them to a system as you would normally.

The import tool has also generated a set of lorries in lorries/python.lorry which can be used to import Flask and all of its dependencies into a trove.

cp definitions/strata/Flask.morph ../definitions/strata
cp -r definitions/strata/Flask ../definitions/strata

cd /src/definitions
git add strata/Flask/*.morph # don't want to add the foreign-dependencies files
git add strata/Flask.morph

Use your favourite editor to add the Flask stratum to the system you want to add Flask to. In this case we are adding Flask to the devel system.

vim systems/devel-system-x86_64-generic.morph
git diff

diff --git a/systems/devel-system-x86_64-generic.morph b/systems/devel-system-x86_64-generic.morph
index 143ceb8..18df350 100644
--- a/systems/devel-system-x86_64-generic.morph
+++ b/systems/devel-system-x86_64-generic.morph
@@ -34,6 +34,8 @@ strata:
   morph: strata/nfs.morph
 - name: python-tools
   morph: strata/python-tools.morph
+- name: Flask
+  morph: strata/Flask.morph
 configuration-extensions:
 - set-hostname
 - add-config-files

Stage the system morph, and commit changes:

git add systems/devel-system-x86_64-generic.morph
git commit -m 'Add Flask to devel system'

The devel-system-x86_64-generic system is now ready to be built and deployed.

A more involved import: coverage

Often an import will not be quite as simple as the example above: dependency information may be missing, repositories may not be tagged, and there may be version conflicts. In any of these cases manual intervention is needed. We will begin by running the import command as usual. Since this will not be a trivial import, it is useful to specify a logfile for later debugging.

cd /src/baserock-import
baserock-import python coverage --log=coverage-import.log

01/12/15 17:51:40 UTC: Import of python coverage started
Not updating existing Git checkouts or existing definitions
coverage: calling python.to_lorry to generate lorry
Lorrying https://pypi.python.org/packages/source/c/coverage/coverage-3.7.1.tar.gz
coverage master: using checkouts/python_coverage-tarball ref master (commit f85bcf60d2bf5c87428c5922ee39d24f8d19c25b)
coverage master: calling python.to_chunk to generate chunk morph
coverage master: calling python.find_deps to calculate dependencies
...
pylint: calling python.to_lorry to generate lorry
Lorrying http://www.pylint.org
Could not find ref for pylint-1.4.0, required by: coverage-master.
...
snowballstemmer: calling python.to_lorry to generate lorry
Lorrying https://github.com/shibukawa/snowball_py
Could not find ref for snowballstemmer-1.2.0, required by: Sphinx-1.3b2.
...
tox: calling python.to_lorry to generate lorry
python.to_lorry failed with code 1: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/baserockimport/exts/python.to_lorry", line 239, in <module>
    PythonLorryExtension().run()
  File "/usr/lib/python2.7/site-packages/baserockimport/exts/python.to_lorry", line 229, in run
    if 'home_page' in info else None)
  File "/usr/lib/python2.7/site-packages/baserockimport/exts/python.to_lorry", line 55, in find_repo_type
    status_code = requests.get(url).status_code
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 360, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 479, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 147, in resolve_redirects
    allow_redirects=False,
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 463, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 331, in send
    raise SSLError(e)
requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
...

Errors encountered, not generating a stratum morphology.
See the README files for guidance.
01/12/15 17:56:06 UTC: Import of python coverage ended (took 265 seconds)

When the import tool encounters errors, it does not generate a stratum by default. Several errors prevented the coverage stratum from being generated:

  • Could not find ref for snowballstemmer-1.2.0, required by: Sphinx-1.3b2
  • Could not find ref for pylint-1.4.0, required by: coverage-master
  • SSL certificate error importing tox
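Because a logfile was specified, failures like these can also be collected from the log instead of the console output. The following is a minimal sketch, not part of the import tool: it assumes log lines have the "date time LEVEL message" layout seen in the log excerpts in this guide, and find_errors is a hypothetical helper name.

```python
import re

# Assumed log line layout: "2015-01-07 14:24:37 ERROR message text"
LOG_LINE = re.compile(
    r'^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) ([A-Z]+) (.*)$')


def find_errors(lines):
    """Return the message of every ERROR-level line, in log order."""
    errors = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip())
        if match and match.group(3) == 'ERROR':
            errors.append(match.group(4))
    return errors


# Sample lines copied from the import log shown in this guide.
sample = [
    '2015-01-07 14:24:32 DEBUG Found package pylint',
    '2015-01-07 14:24:37 ERROR Command failed: '
    'git rev-parse --verify pylint-1.4.0^{commit}',
    '2015-01-07 14:24:37 ERROR Could not find ref for pylint-1.4.0, '
    'required by: coverage-master.',
]

for message in find_errors(sample):
    print(message)
```

To run it against a real log, pass an open file instead of the sample list, for example find_errors(open('coverage-import.log')). Errors that appear only as a traceback on the console, such as the tox failure, may not be logged in this form.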

Fixing: Could not find ref for snowballstemmer-1.2.0

In most cases a "Could not find ref for" error is due to missing tags or tags that deviate from the forms the import tool expects. In the case of snowballstemmer the repository is simply untagged. We can see from the most recent commit that the version is 1.2.0:

cd checkouts/python_snowballstemmer
git show | grep version
-      version='1.1.0',
+      version='1.2.0',

So we can safely tag this as 1.2.0:

git tag -a 1.2.0 -m 1.2.0
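Before tagging by hand, it can help to check which refs already resolve in a checkout. The sketch below is not the import tool's own logic. The tag spellings in candidate_refs are an assumption for illustration: the only form confirmed by the log in this guide is name-version, and the bare version is what this guide tags. Both function names are hypothetical. The check itself is the same git rev-parse --verify call that appears in the import log.

```python
import subprocess


def candidate_refs(name, version):
    """Tag spellings to try. This list is an assumption, not the tool's."""
    return [version, 'v' + version, '%s-%s' % (name, version)]


def find_ref(checkout, name, version):
    """Return the first candidate that resolves to a commit, else None."""
    for ref in candidate_refs(name, version):
        status = subprocess.call(
            ['git', 'rev-parse', '--verify', '--quiet', ref + '^{commit}'],
            cwd=checkout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL)
        if status == 0:
            return ref
    return None


# Example call (requires git and an existing checkout):
#   find_ref('checkouts/python_snowballstemmer', 'snowballstemmer', '1.2.0')
print(candidate_refs('snowballstemmer', '1.2.0'))
```

A result of None from find_ref means none of the assumed spellings exists, which is the situation that calls for a manual tag like the one above.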

Fixing: Could not find ref for pylint-1.4.0

This error seems to be caused by the absence of a pylint-1.4.0 ref. Let's change into the checkout for pylint to investigate.

cd ../python_pylint
ls
assets           humans.txt       image            pylint.svg       styles
crossdomain.xml  ico              index.html       robots.txt

This does not look like the source for pylint, it looks like the source for a website.

Searching the log for the word pylint shows that the import tool detected www.pylint.org as a mercurial (hg) repo and lorried it:

grep pylint /src/baserock-import/coverage-import.log

2015-01-07 14:24:31 INFO pylint: calling python.to_lorry to generate lorry
2015-01-07 14:24:31 DEBUG Running /usr/lib/python2.7/site-packages/baserockimport/exts/python.to_lorry [u'pylint']
2015-01-07 14:24:32 DEBUG Found package pylint
2015-01-07 14:24:32 DEBUG Treating pylint as pylint
2015-01-07 14:24:32 DEBUG "GET /pypi/pylint/json HTTP/1.1" 301 0
2015-01-07 14:24:32 DEBUG "GET /pypi/pylint/json HTTP/1.1" 200 19998
2015-01-07 14:24:32 DEBUG Getting 'http://www.pylint.org' ...
2015-01-07 14:24:32 DEBUG Starting new HTTP connection (1): www.pylint.org
2015-01-07 14:24:32 DEBUG 200 OK for http://www.pylint.org
2015-01-07 14:24:32 DEBUG Finding repo type for http://www.pylint.org
2015-01-07 14:24:32 DEBUG Cloning into 'www.pylint.org'...
2015-01-07 14:24:32 DEBUG fatal: repository 'http://www.pylint.org/' not found
2015-01-07 14:24:36 DEBUG destination directory: www.pylint.org
2015-01-07 14:24:36 DEBUG http://www.pylint.org is a hg repo
2015-01-07 14:24:36 DEBUG run external command: [['git', 'rev-parse', '--verify', u'pylint-1.4.0^{commit}']]
2015-01-07 14:24:37 ERROR Command failed: git rev-parse --verify pylint-1.4.0^{commit}
2015-01-07 14:24:37 ERROR Could not find ref for pylint-1.4.0, required by: coverage-master.

Packages on PyPI sometimes use the home_page metadata field to store the URL of the project's source repository. Sadly there is no standard attribute for storing the project's source repo, which makes it difficult to find that repo automatically. Nonetheless, the import tool checks whether the URL in the home_page field refers to a repository. If it does, the import tool lorries from this repository. If it does not, the import tool does a tarball import using the tarball URLs the PyPI package provides.
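To see what the import tool has to work with, the relevant fields can be pulled out of a package's PyPI JSON metadata (the log above shows the tool requesting /pypi/pylint/json). The following is a minimal sketch under stated assumptions: the home page URL is stored under info.home_page, and source tarballs are the entries of the urls list whose packagetype is sdist. The function name is hypothetical, and the sample is hand-written, with a made-up tarball URL, rather than real PyPI output.

```python
import json


def source_candidates(metadata):
    """Return (home page URL or None, list of sdist tarball URLs)."""
    home_page = metadata.get('info', {}).get('home_page') or None
    tarballs = [entry['url'] for entry in metadata.get('urls', [])
                if entry.get('packagetype') == 'sdist']
    return home_page, tarballs


# Hand-written sample in the assumed layout. The tarball URL is a
# placeholder, not the real location of any pylint release.
sample = json.loads('''
{
    "info": {"name": "pylint", "home_page": "http://www.pylint.org"},
    "urls": [
        {"packagetype": "sdist",
         "url": "https://example.org/pylint-1.4.0.tar.gz"},
        {"packagetype": "bdist_wheel",
         "url": "https://example.org/pylint-1.4.0-py2-none-any.whl"}
    ]
}
''')

home_page, tarballs = source_candidates(sample)
print(home_page)
print(tarballs)
```

Nothing in these fields says whether the home page is a source repository, which is why the tool has to probe the URL and can be misled by a website that happens to be served from a repository.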

The mercurial repo at www.pylint.org is not the source repository for pylint, but the import tool has no way of knowing this, at least not while PyPI metadata lacks a standard field for a package's source repo URL. To fix this, we must find the correct repo URL (https://bitbucket.org/logilab/pylint) and amend the lorry file that the import tool generated.

Edit /src/baserock-import/lorries/python.lorry, replacing the following generated entry:

"python/pylint": {
    "type": "hg",
    "url": "http://www.pylint.org",
    "x-products-python": [
        "pylint"
    ]
},

with:

"python/pylint": {
    "type": "hg",
    "url": "https://bitbucket.org/logilab/pylint",
    "x-products-python": [
        "pylint"
    ]
},

Then remove the existing checkout for pylint and the pylint directory from the lorry-working-dir, since the tool does not overwrite existing checkouts by default:

cd /src/baserock-import
rm -r checkouts/python_pylint/
rm -r lorry-working-dir/python_pylint/
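The lorry edit can also be scripted, which avoids hand-editing mistakes in the JSON. This is a minimal sketch, assuming python.lorry is a single JSON object keyed by lorry name, as the entries above suggest. set_lorry_source and amend_lorry_file are hypothetical helpers, not part of the import tool or lorry.

```python
import json


def set_lorry_source(lorries, name, repo_type, url):
    """Point an existing lorry entry at a different repository, in place."""
    entry = lorries[name]  # KeyError if the tool generated no such entry
    entry['type'] = repo_type
    entry['url'] = url
    return lorries


def amend_lorry_file(path, name, repo_type, url):
    """Load a lorry file, update one entry, and write the file back."""
    with open(path) as lorry_file:
        lorries = json.load(lorry_file)
    set_lorry_source(lorries, name, repo_type, url)
    with open(path, 'w') as lorry_file:
        json.dump(lorries, lorry_file, indent=4)
        lorry_file.write('\n')


# The generated pylint entry from this guide, as a Python dict.
generated = {
    'python/pylint': {
        'type': 'hg',
        'url': 'http://www.pylint.org',
        'x-products-python': ['pylint'],
    }
}

set_lorry_source(generated, 'python/pylint', 'hg',
                 'https://bitbucket.org/logilab/pylint')
print(json.dumps(generated, indent=4))
```

For the real file the call would be amend_lorry_file('lorries/python.lorry', 'python/pylint', 'hg', 'https://bitbucket.org/logilab/pylint'), run from /src/baserock-import. Other keys in the entry, such as x-products-python, are left untouched.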

Fixing: SSL certificate error importing tox

Now to deal with the final error. From the output above we see that the import tool fails to generate a lorry for tox. Grepping the log for tox shows that the address we are failing to lorry from is http://tox.testrun.org, the website for tox, which seems to use a self-signed certificate. The repository for tox is in fact https://bitbucket.org/hpk42/tox. In this case the simplest option is to add the entry for tox manually.

Edit /src/baserock-import/lorries/python.lorry and add:

"python/tox": {
    "type": "hg",
    "url": "https://bitbucket.org/hpk42/tox",
    "x-products-python": [
        "tox"
    ]
}
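A hand-edited lorry file is easy to break with a missing comma or a dropped key, so it is worth checking before rerunning the tool. This is a minimal sketch, assuming the file is a single JSON object keyed by lorry name and that every entry needs at least "type" and "url", as all the entries shown in this guide have. check_lorries is a hypothetical helper, and the required keys are an assumption rather than lorry's documented schema.

```python
import json

# Assumed minimum set of keys, based on the entries shown in this guide.
REQUIRED_KEYS = ('type', 'url')


def check_lorries(text):
    """Parse lorry JSON text and return a list of problems (empty if none)."""
    try:
        lorries = json.loads(text)
    except ValueError as error:  # json.JSONDecodeError is a ValueError
        return ['not valid JSON: %s' % error]
    if not isinstance(lorries, dict):
        return ['top level is not a JSON object']
    problems = []
    for name, entry in sorted(lorries.items()):
        for key in REQUIRED_KEYS:
            if key not in entry:
                problems.append('%s: missing "%s"' % (name, key))
    return problems


# The tox entry from this guide, wrapped in the enclosing object.
good = '''
{
    "python/tox": {
        "type": "hg",
        "url": "https://bitbucket.org/hpk42/tox",
        "x-products-python": [
            "tox"
        ]
    }
}
'''

print(check_lorries(good))
```

To check the real file, pass its contents, for example check_lorries(open('lorries/python.lorry').read()). An empty list means the file parsed and every entry has the assumed keys.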

Now rerun the tool. The import tool should generate a stratum morph and complete successfully.

cd /src/baserock-import
baserock-import python coverage --log=coverage-import.log

01/13/15 10:46:19 UTC: Import of python coverage started
Not updating existing Git checkouts or existing definitions
coverage master: using checkouts/python_coverage-tarball ref master (commit cc295617f479f655460d46dcd769cb0255c37273)
...
Generating stratum morph for coverage
01/14/15 10:42:26 UTC: Import of python coverage ended (took 50 seconds)

Note that to build a system with the newly generated coverage definitions you will need to add the lorries generated by the tool to your trove and merge the definitions generated by the tool with some existing set of definitions. As with the Flask example, you can skip the lorrying step and build from local copies of the repos by using --use-local-sources. The import tool will not overwrite an existing stratum morph by default, so either remove the existing coverage.morph or run the tool with --force-stratum-generation.