Services
BiotechFintechAutonomous Vehicles
Open sourceContactCareersTeamResearchBlog
12 August 2020 — by Adam Hoese
Developing Python with Poetry & Poetry2nix: Reproducible flexible Python environments
nixpython

Most Python projects are in fact polyglot. Indeed, many popular libraries on PyPi are Python wrappers around C code. This applies particularly to popular scientific computing packages, such as scipy and numpy. Normally, this is the terrain where Nix shines, but its support for Python projects has often been labor-intensive, requiring lots of manual fiddling and fine-tuning. One of the reasons for this is that most Python package management tools do not give enough static information about the project, not offering the determinism needed by Nix.

Thanks to Poetry, this is a problem of the past — its rich lock file offers more than enough information to get Nix running, with minimal manual intervention. In this post, I will show how to use Poetry, together with Poetry2nix, to easily manage Python projects with Nix. I will show how to package a simple Python application both using the existing support for Python in Nixpkgs, and then using Poetry2nix. This will both show why Poetry2nix is more convenient, and serve as a short tutorial covering its features.

Our application

We are going to package a simple application, a Flask server with two endpoints: one returning a static string “Hello World” and another returning a resized image. This application was chosen because:

  1. It can fit into a single file for the purposes of this post.
  2. Image resizing using Pillow requires the use of native libraries, which is something of a strength of Nix.

The code for it is in the imgapp/__init__.py file:

from flask import send_file
from flask import Flask
from io import BytesIO
from PIL import Image
import requests


app = Flask(__name__)


IMAGE_URL = "https://farm1.staticflickr.com/422/32287743652_9f69a6e9d9_b.jpg"
IMAGE_SIZE = (300, 300)


@app.route('/')
def hello():
    return "Hello World!"


@app.route('/image')
def image():
    r = requests.get(IMAGE_URL)
    if not r.status_code == 200:
        raise ValueError(f"Response code was '{r.status_code}'")

    img_io = BytesIO()

    img = Image.open(BytesIO(r.content))
    img.thumbnail(IMAGE_SIZE)
    img.save(img_io, 'JPEG', quality=70)

    img_io.seek(0)

    return send_file(img_io, mimetype='image/jpeg')


def main():
    app.run()


if __name__ == '__main__':
    main()

The status quo for packaging Python with Nix

There are two standard techniques for integrating Python projects with Nix.

Nix only

The first technique uses only Nix for package management, and is described in the Python section of the Nix manual. While it works and may look very appealing on the surface, it uses Nix for all package management needs, which comes with some drawbacks:

  1. We are essentially tied to whatever package version Nixpkgs provides for any given dependency. This can be worked around with overrides, but those can cause version incompatibilities. This happens often in complex Python projects, such as data science ones, which tend to be very sensitive to version changes.
  2. We are tied to using packages already in Nixpkgs. While Nixpkgs has many Python packages already packaged up (around 3000 right now) there are many packages missing — PyPi, the Python Package Index has more than 200000 packages. This can of course be worked around with overlays and manual packaging, but this quickly becomes a daunting task.
  3. In a team setting, every team member wanting to add packages needs to buy in to Nix and at least have some experience using and understanding Nix.

All these factors lead us to a conclusion: we need to embrace Python tooling so we can efficiently work with the entire Python ecosystem.

Pip and Pypi2Nix

The second standard method tries to overcome the faults above by using a hybrid approach of Python tooling together with Nix code generation. Instead of writing dependencies manually in Nix, they are extracted from the requirements.txt file that users of Pip and Virtualenv are very used to. That is, from a requirements.txt file containing the necessary dependencies:

requests
pillow
flask

we can use pypi2nix to package our application in a more automatic fashion than before:

nix-shell -p pypi2nix --run "pypi2nix -r requirements.txt"

However, Pip is not a dependency manager and therefore the requirements.txt file is not explicit enough — it lacks both exact versions for libraries, and system dependencies. Therefore, the command above will not produce a working Nix expression. In order to make pypi2nix work correctly, one has to manually find all dependencies incurred by the use of Pillow:

nix-shell -p pypi2nix --run "pypi2nix -V 3.8 -E pkgconfig -E freetype -E libjpeg -E openjpeg -E zlib -E libtiff -E libwebp -E tcl -E lcms2 -E xorg.libxcb -r requirements.txt"

This will generate a large Nix expression, that will indeed work as expected. Further use of Pypi2nix is left to the reader, but we can already draw some conclusions about this approach:

  1. Code generation results in huge Nix expressions that can be hard to debug and understand. These expressions will typically be checked into a project repository, and can get out of sync with actual dependencies.
  2. It’s very high friction, especially around native dependencies.

Having many large Python projects, I wasn’t satisfied with the status quo around Python package management. So I looked into what could be done to make the situation better, and which tools could be more appropriate for our use-case. A potential candidate was Pipenv, however its dependency solver and lock file format were difficult to work with. In particular, Pipenv’s detection of “local” vs “non-local” dependencies did not work properly inside the Nix shell and gave us the wrong dependency graph. Eventually, I found Poetry and it looked very promising.

Poetry and Poetry2nix

The Poetry package manager is a relatively recent addition to the Python ecosystem but it is gaining popularity very quickly. Poetry features a nice CLI with good UX and deterministic builds through lock files.

Poetry uses pip under the hood and, for this reason, inherited some of its shortcomings and lock file design. I managed to land a few patches in Poetry before the 1.0 release to improve the lock file format, and now it is fit for use in Nix builds. The result was Poetry2nix, whose key design goals were:

  1. Dead simple API.
  2. Work with the entire Python ecosystem using regular Python tooling.
  3. Python developers should not have to be Nix experts, and vice versa.
  4. Being an expert should allow you to “drop down” into the lower levels of the build and customise it.

Poetry2nix is not a code generation tool — it is implemented in pure Nix. This fixes many of problems outlined in previous paragraphs, since there is a single point of truth for dependencies and their versions.

But what about our native dependencies from before? How does Poetry2nix know about those? Indeed, Poetry2nix comes with an extensive set of overrides built-in for a lot of common packages, including Pillow. Users are encouraged to contribute overrides upstream for popular packages, so everyone can have a better user experience.

Now, let’s see how Poetry2nix works in practice.

Developing with Poetry

Let’s start with only our application file above (imgapp/__init__.py) and a shell.nix:

{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {

  buildInputs = [
    pkgs.python3
    pkgs.poetry
  ];

}

Poetry comes with some nice helpers to create a project, so we run:

$ poetry init

And then we’ll add our dependencies:

$ poetry add requests pillow flask

We now have two files in the folder:

  • The first one is pyproject.toml which not only specifies our dependencies but also replaces setup.py.
  • The second is poetry.lock which contains our entire pinned Python dependency graph.

For Nix to know which scripts to install in the bin/ output directory, we also need to add a scripts section to pyproject.toml:

[tool.poetry]
name = "imgapp"
version = "0.1.0"
description = ""
authors = ["adisbladis <adisbladis@gmail.com>"]

[tool.poetry.dependencies]
python = "^3.7"
requests = "^2.23.0"
pillow = "^7.1.2"
flask = "^1.1.2"

[tool.poetry.dev-dependencies]

[tool.poetry.scripts]
imgapp = 'imgapp:main'

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"

Packaging with Poetry2nix

Since Poetry2nix is not a code generation tool but implemented entirely in Nix, this step is trivial. Create a default.nix containing:

{ pkgs ? import <nixpkgs> {} }:
pkgs.poetry2nix.mkPoetryApplication {
  projectDir = ./.;
}

We can now invoke nix-build to build our package defined in default.nix. Poetry2nix will automatically infer package names, dependencies, meta attributes and more from the Poetry metadata.

Manipulating overrides

Many overrides for system dependencies are already upstream, but what if some are lacking? These overrides can be manipulated and extended manually:

poetry2nix.mkPoetryApplication {
    projectDir = ./.;
    overrides = poetry2nix.overrides.withDefaults (self: super: {
      foo = foo.overridePythonAttrs(oldAttrs: {});
    });
}

Conclusion

By embracing both modern Python package management tooling and the Nix language, we can achieve best-in-class user experience for Python developers and Nix developers alike.

There are ongoing efforts to make Poetry2nix and other Nix Python tooling work better with data science packages like numpy and scipy. I believe that Nix may soon rival Conda on Linux and MacOS for data science.

Python + Nix has a bright future ahead of it!

If you enjoyed this article, you might be interested in joining the Tweag team.
This article is licensed under a Creative Commons Attribution 4.0 International license.
Interested in working at Tweag?Join us
See our work
  • Biotech
  • Fintech
  • Autonomous vehicles
  • Open source
Tweag
Tweag HQ → 207 Rue de Bercy — 75012 Paris — France
hello@tweag.io
© Tweag I/O Limited. All rights reserved
Privacy Policy