Announcing the 0.9 Release of PyOxidizer

October 18, 2020 at 10:00 PM | categories: Python, PyOxidizer

I have decided to make up for the 6 month lull between PyOxidizer's 0.7 and 0.8 releases by releasing PyOxidizer 0.9 just 1 week after 0.8!

The full 0.9 changelog is found in the docs. First time user? See the Getting Started documentation.

While the 0.9 release is far smaller in terms of features compared to 0.8, it is an important release because of progress closing compatibility gaps.

Build a python Executable

PyOxidizer 0.8 quietly shipped the ability to build executables that behave like python executables via enhancements to the configurability of embedded Python interpreters.

PyOxidizer 0.9 made some minor changes to make this scenario work better and there is even official documentation on how to achieve this. So now you can emit a python executable next to your application's executable. Or you could use PyOxidizer to build a highly portable, self-contained python executable and ship your Python scripts next to it, using PyOxidizer's python in your #!.

Support Packaging Files as Files for Maximum Compatibility

There is a long-tail of Python packages that don't just work with PyOxidizer. A subset of these packages don't work because of bugs with how PyOxidizer attempts to classify files as specific types of Python resources.

The way that normal Python works is you materialize a bunch of files on the filesystem and at run-time the filesystem-based importer stat()s a bunch of paths until it finds a candidate file satisfying the import request. This works of course. But it is inefficient. Since PyOxidizer has awareness of every resource being packaged at build time, it attempts to index all known resources and serialize them to an efficient data structure so finding and loading a resource can be extremely quick (effectively just a hashmap lookup in Rust code to resolve the memory address of data).

PyOxidizer's approach does work in the majority of cases. But there are edge cases. For example, NumPy's binary wheels have installed file paths like numpy.libs/libopenblasp-r0-ae94cfde.3.9.dev.so. The numpy.libs directory is not a valid Python package directory since it has a . and since it doesn't have an __init__.py[c] file. This is a case where PyOxidizer's code for turning files into resources is currently confused.

It is tempting to argue that file layouts like NumPy's are wrong. But there doesn't seem to be any formal specification preventing the use of such layouts. The arbiter of truth here is what Python packaging tools accept and the current code for installing wheels gladly accepts file layouts like these. So I've accepted that PyOxidizer is just going to have to support edge cases like this. (I've captured more details about this particular issue in the docs).

Anyway, PyOxidizer 0.9 ships a new, simpler mode for handling files: files mode. In files mode, PyOxidizer disables its code for classifying files as typed Python resources (like module sources and extension modules) and instead treats a file as... a file.

When in files mode, actions that invoke Python packaging tools return files objects instead of classified resources. If you then add these files for packaging, those files are materialized on the filesystem next to your built executable. You can then use Python's standard filesystem importer to load these files at run-time.

This allows you to use PyOxidizer with packages like NumPy that were previously incompatible due to bugs with file/resource classification. In fact, getting NumPy working with PyOxidizer is now in the official documentation!

Files mode is still in its infancy. There exists code for embedding files data in the produced executable. I plan to eventually teach PyOxidizer's run-time code to extract these embedded files to a temporary directory, SquashFS FUSE filesystem, etc. This is the approach that other Python packaging tools like PyInstaller and XAR use. While it is less efficient, this approach is highly compatible with Python code in the wild since you sidestep issues with __file__ and other assumptions about installed file layouts. So it makes sense for PyOxidizer to provide support for this so you can still achieve the friendliness of a self-contained executable without worrying about compatibility. Look for improvements to files mode in future releases.

And to help debug issues with PyOxidizer's file handling and resource classification, the new pyoxidizer find-resources command can be used to invoke PyOxidizer's code for scanning and classifying files. Hopefully this makes it easier to diagnose bugs in this critical component of PyOxidizer!

Some Important Bug Fixes

PyOxidizer 0.8 shipped with some pretty annoying bugs and behavior quirks.

The ability to set custom sys.path values via Starlark was broken. How I managed to ship that, I'm not sure. But it is fixed in 0.9.

Another bug I can't believe I shipped was the PythonExecutable.read_virtualenv() Starlark method being broken due to a typo. You can read from virtualenvs again in PyOxidizer 0.9.

Another important improvement is in the default Python interpreter configuration. We now automatically initialize Python's locales configuration by default. Without this, the encoding of filesystem paths and sys.argv may not have been correct. If someone passed a non-ASCII argument, the Python str value was likely mangled. PyOxidizer built binaries should behave reasonably by default now. The issue is a good read if the subtle behaviors of how encodings work in Python and on different operating systems is interesting to you.

Better Binary Portability Documentation

The documentation on binary portability has been overhauled. Hopefully it is much more clear about the capabilities of PyOxidizer to produce a binary that just works on other machines.

I eventually want to get PyOxidizer to a point where users don't have to think about binary portability. But until PyOxidizer starts generating installers and providing the ability to run builds in deterministic and reproducible environments, it is sadly a problem that is being externalized to end users.

In Conclusion

PyOxidizer 0.9 is a small release representing just 1 week of work. But it contains some notable features that I wanted to get out the door.

As always, please report any issues or feedback in the GitHub issue tracker or the users mailing list.