PyOxidizer 0.7

April 09, 2020 at 09:00 PM | categories: Python, PyOxidizer

I am very pleased to announce the 0.7 release of PyOxidizer, a modern Python application packaging tool.

There are a host of notable new features in this release. You can read all about them in the project history.

I want to use this blog post to call out the more meaningful ones.

I started PyOxidizer as a science experiment of sorts: I sat out to prove the hypothesis that it was possible to produce high performance single file executables embedding Python and all of its resources (Python modules, non-module resource files, compiled extensions, etc). PyOxidizer has achieved this on Windows, Linux, and macOS since its very earliest releases. Hypothesis confirmed!

In order to actually achieve single file executables, you have to fundamentally change aspects of Python's behavior. Some of these changes invalidate deeply rooted assumptions about how Python works, such as the existence of __file__ in modules. As you can imagine, these broken assumptions translated to numerous compatibility issues and PyOxidizer didn't work with many popular Python packages.

With the science experiment phase of PyOxidizer out of the way, I have been making a concerted effort to broaden the user base of PyOxidizer. While single file executables can be an amazing property, it isn't critical for many use cases and the issues it was causing were preventing people from exploring PyOxidizer.

This brings us to what I think are the major new features in PyOxidizer 0.7.

Better Support for Loading Extension Modules

Earlier versions of PyOxidizer insisted that you compile Python (C) extension modules from source and statically link them into a produced binary. This requirement prevented the use of pre-built extension modules (commonly found in Python binary wheels available on PyPI) with PyOxidizer, forcing people to compile them locally. While this often just worked for many extension modules, it frequently failed on complex extension modules and it frequently failed on Windows.

PyOxidizer now supports loading compiled extension modules from standalone files (typically .so or .pyd files, which are actually shared libraries). There are still some sharp edges and known deficiencies. But in many cases, if you tell PyOxidizer to run pip install and package the result, pre-built wheels can be installed and PyOxidizer will pick up the standalone files.

On Windows, PyOxidizer even supports embedding the shared library data into the produced .exe and loading the .pyd/DLL directly from memory.

Loading Resources from the Filesystem

Binaries built with PyOxidizer contain a blob holding an index of available Python resources along with their data.

Earlier versions of PyOxidizer only allowed you to define resources as in-memory. If the resource was defined in this blob, it was imported from memory. Otherwise it wasn't known to PyOxidizer. You could still install files next to the produced binary and tell PyOxidizer to enable Python's default filesystem-based importer. But PyOxidizer didn't explicitly know about these files on the filesystem.

In PyOxidizer 0.7, the blob index of Python resources is able to express different locations for that resource. Currently, a resource can have its data made available in-memory or filesystem-relative. in-memory works as before: the raw data is embedded next to the next in memory and loaded from there (using 0-copy). filesystem-relative encodes a filesystem path to the resource. During packaging, PyOxidizer will place the resource next to the executable (using a typical Python file layout scheme) and store the relative path to that resource in the resources index.

The filesystem-relative resource indexing feature has a few implications for PyOxidizer.

First, it is more standard. When PyOxidizer loads a Python module from the filesystem, it sets __file__, __path__, etc and the module semantics should behave as if the file were imported by Python's standard importer. This means that if a package is having issues with in-memory importing, you can simply fall back to filesystem-relative to get standard Python behavior and everything should just work.

Second, PyOxidizer's filesystem resource loading is faster than Python's! When Python's standard importer goes to import a module, it needs to stat() various paths to first locate the file. It then performs some sanity checking and other minor actions before actually importing the module. All of this has overhead. Since the goal of PyOxidizer is to produce standalone applications and applications should be immutable, PyOxidizer can avoid most of this overhead. PyOxidizer simply tries to open() and read() the relative path baked into the resource index at build time. If that works, the resource is loaded. Else there is a failure. The code path in PyOxidizer to locate a Python resource is effectively a lookup in a Rust HashMap<&str, T>.

I thought it would be interesting to isolate the performance benefits of this new feature. I ran Mercurial's test harness with different variants of hg on Linux on my Ryzen 3950X.

  • traditional - A hg script with a #!/path/to/python3.7 shebang.
  • oxidized - A hg executable built with PyOxidizer, without PyOxidizer's custom module importer.
  • filesystem - A hg executable built with PyOxidizer using the new filesystem-relative resource index.
  • in-memory - A hg executable built with PyOxidizer with all resources loaded from memory (how PyOxidizer has traditionally worked).

The results are quite clear:

VariantCPU Time (s)Delta (s)% Orig
traditional11,287-552100
oxidized10,735-55295.1
filesystem10,186-1,10190.2
in-memory9,883-1,40487.6

We see a nice win just from using a native executable built with PyOxidizer (traditional to oxidized).

Then from oxidized to filesystem we see another jump of ~5%. This difference is attributed to using PyOxidizer's Rust-powered importer with an index of resources available on the filesystem. In other words, all that work that Python's standard importer is doing to discover files and then operate on them is non-trivial!

Finally, the smaller jump from filesystem to in-memory isolates the benefits of importing resource data from memory instead of involving filesystem I/O. (Filesystems are generally slow.) While I haven't measured explicitly, I hypothesize that macOS and Windows will see a bigger jump between these two variants, as the filesystem performance on these platforms generally isn't as good as it is on Linux.

PyOxidizer's Future

With PyOxidizer now supporting a couple of much-needed features to support a broader set of users, I'm hoping that future releases of PyOxidizer continue to broaden the utility of PyOxidizer.

The over-arching goal of PyOxidizer is to solve large aspects of the Python application packaging and distribution problem. So far a lot of focus has been spent on the former. PyOxidizer in its current form can materialize files on the filesystem that you can copy or package up manually and distribute. But I want these processes to be part of PyOxidizer: I want it to be possible for PyOxidizer to emit a Windows MSI installer, a macOS dmg, a Debian package, etc for a Python application.

In order to support the aforementioned marquee features of this PyOxidizer release, I had to pay down a lot of technical debt in the code base left over from the science experiment phase of PyOxidizer's inception.

In the short term, I plan to continue shoring up the code base and rounding out support for features requested in the issue tracker on GitHub. The next release of PyOxidizer will also likely require Python 3.8, as this will improve run-time control over the embedded Python interpreter and enable PyOxidizer to better support package metadata (importlib.metadata), enabling support for features like entry points.

I've also been thinking about extracting PyOxidizer's custom module importer to be usable as a standalone Python extension module. I think there's some value in publishing a pyoxidizer_importer package on PyPI that you can easily add to your installed packages to speed up Python's standard filesystem importer by a few percent. If nothing else, this may drum up interest in the larger Python community for standardizing a format for serializing Python resources in a single file. Perhaps we can get other Python packaging tools producing the same packed resources data blob that PyOxidizer uses so we can all standardize on a more efficient mechanism for loading Python modules. Time will tell.

Enjoy the new release. File issues at https://github.com/indygreg/PyOxidizer as you encounter them.


C Extension Support in PyOxidizer

June 30, 2019 at 04:40 PM | categories: Python, PyOxidizer

The initial release of PyOxidizer generated a bit of excitement across the Internet! The post was commented on heavily in various forums and my phone was constantly buzzing from all the Twitter activity. There has been a steady stream of GitHub Issues for the project, which I consider a good sign. So thank you everybody for the support and encouragement! And especially thank you to everyone who filed an issue or submitted a pull request!

While I don't usually read the comments, I was looking at various forums posting about PyOxidizer to see what reactions were like. A common negative theme was the lack of C extension support in PyOxidizer. People seemed dismissive of PyOxidizer because it didn't support C extensions. Despite the documentation stating that the feature was planned and that I had an idea for how to implement it, people seemed pessimistic. Perhaps I didn't adequately communicate that making C extensions work is actually a subset of the already-solved single file executable problem and therefore was already a solved problem at the technical level (only the integration with the Python and PyOxidizer build systems was missing). So in my mind C extension support was only a matter of time and the only open question was how many hacks would be needed to make it work, not whether it would work.

Well, I'm pleased to report that the just-released version 0.2 of PyOxidizer supports C extensions on Windows, macOS, and Linux. If you install a Python package through a pip-install-simple, pip-requirements-file, or setup-py-install packaging rule, C extensions will be compiled in a special way that enables them to be embedded in the same binary containing Python itself. I've tested it with the zope.interface, zstandard, and mercurial packages and it seems to work (although Mercurial has other issues that prevent it from being packaged as a PyOxidizer application - but the C extensions do compile).

There are some limitations to the support, however. I'm pretty confident the limitations can be eliminated given enough time. Given how many people were hung up on the lack of C extensions and were seemingly writing off PyOxidizer thinking it was snake oil or something, I wanted to deliver basic C extension support to curtail this line of criticism. Perfect is the enemy of good and hopefully basic C extension support is good enough to ease concerns about PyOxidizer's viability.

Also in PyOxidizer 0.2 are some minor new features, like the --pip-install and --python-code flags to pyoxidizer init. These allow you to generate a pyoxidizer.toml file pre-configured to install some packages from pip and run custom Python code. So now applications can be created and built with a one-liner without having to edit a pyoxidizer.toml file!

The full release notes are available. As always, please keep filing issues. I'm particularly interested in hearing about packages whose C extensions don't work properly.


Building Standalone Python Applications with PyOxidizer

June 24, 2019 at 09:00 AM | categories: Python, PyOxidizer, Rust

Python application distribution is generally considered an unsolved problem. At their PyCon 2019 keynote talk, Russel Keith-Magee identified code distribution as a potential black swan - an existential threat for longevity - for Python. In their words, Python hasn't ever had a consistent story for how I give my code to someone else, especially if that someone else isn't a developer and just wants to use my application. I completely agree. And I want to add my opinion that unless your target user is a Python developer, they shouldn't need to know anything about Python packaging, Python itself, or even the existence of Python in order to use your application. (And you can replace Python in the previous sentence with any programming language or software technology: most end-users don't care about the technical implementation, they just want to get stuff done.)

Today, I'm excited to announce the first release of PyOxidizer (project, documentation), an open source utility that aims to solve the Python application distribution problem! (The installation instructions are in the docs.)

Standalone Single File, No Dependencies Executable Python Applications

PyOxidizer's marquee feature is that it can produce a single file executable containing a fully-featured Python interpreter, its extensions, standard library, and your application's modules and resources. In other words, you can have a single .exe providing your application. And unlike other tools in this space which tend to be operating system specific, PyOxidizer works across platforms (currently Windows, macOS, and Linux - the most popular platforms for Python today). Executables built with PyOxidizer have minimal dependencies on the host environment nor do they do anything complicated at run-time. I believe PyOxidizer is the only open source tool to have all these attributes.

On Linux, it is possible to build a fully statically linked executable. You can drop this executable into a chroot or container where it is the only file and it will just work. On macOS and Windows, the only library dependencies are on always-present or extremely common libraries. More details are in the docs.

At execution time, binaries built with PyOxidizer do not do anything special to run the Python interpreter. (Other tools in this space do things like create a temporary directory or SquashFS filesystem and extract Python to it.) PyOxidizer loads everything from memory and there is no explicit I/O being performed. When you import a Python module, the bytecode for that module is being loaded from a memory address in the executable using zero-copy. This makes PyOxidizer executables faster to start and import - faster than a python executable itself!

Current Release and Future Roadmap

Today's release of PyOxidizer is just the first release milestone in what I envision is a long and successful project history. While my over-arching goal with PyOxidizer is to solve vast swaths of the Python application distribution problem, I want to be clear that this first release comes nowhere close to doing so. I toiled with what features must be in the initial release. I ultimately decided that PyOxidizer's current functionality is extremely valuable to some audiences and that the project has matured to the point where more eyeballs and users would substantially help its development. (I could definitely use some help prioritizing which features to work on and for that I need users and user feedback.)

In today's release, PyOxidizer is good at producing executables embedding Python. It doesn't yet venture too far into the distribution part of the problem (I want it to be trivial to produce MSI installers, DMG images, deb/rpm packages, etc). But on Linux, this is already a huge step forward because PyOxidizer makes it easy (hopefully!) to produce binaries that should just work on other machines. (Anyone who has attempted to distribute Linux applications will tell you how painful this problem can be.)

Despite its limitations, I believe today's release of PyOxidizer to be a viable tool for some applications. And I believe PyOxidizer can start to replace existing tools in this space. (See the Comparisons to Other Tools document for how PyOxidizer compares to other Python packaging and distribution tools.)

Using today's release of PyOxidizer, larger user-facing applications using Python (like Dropbox, Kodi, MusicBrainz Picard, etc) could use PyOxidizer to produce self-contained executables. This would likely cut down on installer size, decrease install/update time (fewer files means faster operations), and hopefully make packaging simpler for application maintainers. Maintainers of Python utilities could produce self-contained executables, making their utilities faster to start and easier to package and distribute.

New Possibilities and Reliability for Python

By enabling support for self-contained, single file Python applications, PyOxidizer opens exciting new doors for Python. Because Python has historically required an explicit, separate runtime not part of the executable, Python was not viable (or was a hinderance) in many domains. For example, if you wanted to use Python to bootstrap a fresh server or empty container environment, you had a chicken-and-egg problem because you needed to install Python before you could use it.

Let's take Ansible for example. One of Ansible's features is that it remotes into a machine and runs things. The way it does this is it dynamically generates Python scripts locally, uploads them to the remote machine, and tells the remote to execute them. Those Python scripts require the existence of a Python interpreter on the remote machine. This means you need to install Python on a machine before you can control it with Ansible. Furthermore, because the remote's Python isn't under Ansible's control, you can assume very little about its behavior and capabilities, making interaction a bit brittle.

Using PyOxidizer, projects like Ansible could produce a self-contained executable containing a Python interpreter. They could transfer that single binary to the remote machine and execute it, instantly giving the remote machine access to a fully-featured and modern Python interpreter. From there, the sky is the limit. In Ansible's case, the executable could contain the full Ansible runtime, along with any 3rd party Python packages they wanted to leverage. This would allow execution to occur (possibly mostly independently) on the remote machine. This architecture is simpler, scales better, would likely result in faster operations, and would probably improve the quality of life for everyone involved, from application developers to its end users.

Self-contained Python applications built with PyOxidizer essentially solve the Python interpreter bootstrapping and reliability problems. By providing a Python interpreter and a known set of Python modules, you provide a highly deterministic and reliable execution environment for your application. You don't need to fret about which version of Python is installed: you know which version of Python you are using. You don't need to worry about which Python packages are installed: you control explicitly which packages are available. You don't need to worry about whether you are running in a virtualenv, what sys.path is set to, whether .pth files come into play, whether various PYTHON* environment variables can mess up your application, whether some Linux distribution packaged Python differently, what to put in your script's shebang, etc: executables built with PyOxidizer behave as you have instructed them to because they are compiled that way.

All of the concerns in the previous paragraph contribute to a larger problem in the eyes of application maintainers that can be summarized as Python isn't reliable. And because Python isn't reliable, many people reach the conclusion that Python shouldn't be used (this is the black swan that was referred to earlier). With PyOxidizer, the Python environment is isolated and highly deterministic making the reliability problem largely go away. This makes Python a more viable technology choice. And it enables application maintainers to aggressively adopt modern Python versions, utilize third party packages fearlessly, and spend far less time chasing an extremely long tail of issues related to Python environment variance. Succinctly, application developers can focus on building great applications instead of toiling with Python environment problems.

Project Status

PyOxidizer is still in its relative infancy. While it is far from feature complete, I'm mentally committed to working on the remaining major functionality. The Status document lists major missing functionality, lesser missing functionality, and potential future value-add functionality.

I want PyOxidizer to provide a Python application packaging and distribution experience that just works with minimal cognitive effort from Python application maintainers. I have spent a lot of effort documenting PyOxidizer. I care passionately about user experience and want everything about PyOxidizer to be simple and frustration free. I know things aren't there yet. The problems that PyOxidizer is attempting to solve are hard (that's a reason nobody has solved them well yet). I know there's details floating around in my head that haven't been added to the documentation yet. I know there's missing features and bugs in PyOxidizer. I know there are Packaging Pitfalls yet to be discovered.

This is where you come in.

I need your help to make PyOxidizer great. I encourage Python application maintainers reading this to head over to Getting Started and the Packaging User Guide and try to package your applications with PyOxidizer. If things don't work, let me know by filing an issue. If you are confused by lack of or unclear documentation, file an issue. If something frustrates you, file an issue. If you want to suggest I work on a certain feature or fix a bug, file an issue! Tweet to @indygreg to engage with me there. Join the pyoxidizer-users mailing list. While I feel PyOxidizer is usable today (that's why I'm announcing it), I need your feedback to help guide future prioritization.

Finally, I know PyOxidizer has significant implications for some companies and projects that use Python. While I'm not looking to enrich myself or make my livelihood from PyOxidizer, if PyOxidizer is useful to you and you'd like to send money my way as appreciation, you can do so on Patreon or PayPal. If not, that's totally fine: I wouldn't be making PyOxidizer open source if I didn't want to share it with the world for free! And I am financially well off as well. I just feel like there should be more financial contribution to open source because it would improve the health of the ecosystem and I can help achieve that end by advocating for it and giving myself.

Leveraging Rust

The oxidize part of PyOxidizer comes from Rust (See the Wikipedia Rust article - for the chemical not the programming language - to understand where oxidize comes from.) The build time packaging and building functionality is implemented in Rust. And the binary that embeds and controls the Python interpreter in built applications is Rust code. Rationale for these decisions is explained in the FAQ.

This is my first non-toy project using Rust and I have to say that Rust is... incredible! I may have to author a dedicated blog post extolling the virtues of Rust. In short, Rust is now my go-to language for systems level projects. Unless you need the target platform versatility, I don't think C or C++ are defensibles choices in 2019 given their security deficiencies. Languages like Go, Java, and various JVM or CLR languages are acceptable if you can tolerate having a garbage collector and/or a larger runtime. But what makes Rust superior in my mind is the ability for the compiler to prevent large classes of software bugs (especially those that turn into CVEs) and inefficiencies that have plagued our industry for decades. Rust is the first programming language I've used where I feel like the language itself, the compiler, the tools around it (cargo, rustfmt, clippy, rustup, etc), and the community surrounding it all actually care about and assist me with writing high quality software. Nothing else I've used comes even close.

What I've been most surprised about Rust is how high level it feels for a systems level language that isn't garbage collected. When you program lower-level languages like C or C++, compared to a higher level language like Python, you have to type a lot more and be more explicit in nearly everything you do. While Rust is certainly not as expressive or compact as say Python, it is far, far closer to Python than I was expecting it to be. Yes, you do have to type more and think more about your code to appease the Rust compiler's constraints. But the return on that investment is the compiler preventing entire classes of bugs and C/C++ levels of performance. When I started PyOxidizer, the build time logic was implemented in Python and only the run-time pieces were in Rust. After learning a bit more Rust and realizing the obvious code quality benefits, I ditched Python and adopted Rust for the build time logic. And as the code base has grown and gone through various refactorings, I am so glad I did so! The Rust compiler has caught dozens of would-be bugs in Python. Granted, many of these can be attributed to having strong typing and compile time type checking and Rust is little different than say Java on this front. But a significant number of prevented bugs covered invariants in the code because of the way Rust's type system often intersects with control flow. e.g. match arms must be exhaustive, so you can't have unhandled values/types and unchecked Result instances result ina compiler warning. And clippy has been just fantastic helping to guide me towards writing more acceptable code following community accepted best practices.

Even though PyOxidizer is implemented in Rust, most end-users shouldn't have to care (beyond having to install a Rust compiler and build PyOxidizer from source). The existence of Rust should be abstracted away from Python packagers. I did this on purpose because I believe that users of an application shouldn't have to care about the technical implementation of that application. It is a bit unfortunate that I force users to install Rust before using PyOxidizer, but in my defense the target audience is technically savvy developers, bootstrapping Rust is easy, and PyOxidizer is young, so I think it is acceptble for now. If people get hung up on it, I can provide pre-compiled pyoxidizer executables.

But if you do know Rust, PyOxidizer being implemented in Rust opens up some exciting possibilities!

One exciting possibility with PyOxidizer is the ability to add Rust code to your Python application. PyOxidizer works by generating a default Rust application (main.rs) that simply instantiates and runs an embedded Python interpreter then exits. It essentially does what python or a Python script would do. The key takeaway here is your Python application is technically a Rust application (in the same way that python is technically a C application). And being a Rust application means you can add Rust code to that application. You can modify the autogenerated main.rs to do things before, during, and after the embedded Python interpreter runs. It's a regular Rust program and can do anything that Rust programs can do!

Another possibility - and variant of above - is embedding Python in existing Rust projects. PyOxidizer's mechanism for embedding a Python interpreter is implemented as a standalone Rust crate. One can add the pyembed crate to an existing Rust project and a little of build system magic later, your Rust project can now embed and run a Python interpreter!

There's a lot of potential for hybrid Rust + Python programs. And I am very excited about the possibilities.

If you are a Rust programmer, PyOxidizer allows you to easily embed Python in your Rust application. If you are a Python programmer, PyOxidizer allows your to easily leverage Rust in your Python application. In short, the package ecosystem of the other becomes available to you. And if you aren't familiar with Rust, there are some potentially crazy possibilities. For example, Alacritty is a GPU accelerated terminal emulator written in Rust and Servo is an entire web browser engine written in Rust. With PyOxidizer, you could integrate a terminal emulator or browser engine as part of your Python application if you really wanted to. And, yes, Rust's packaging tools are so good that stuff like this tends to just work. As a concrete example, the pyoxidizer CLI tool contains libgit2 for performing in-process interactions with Git repositories. Adding this required a single line change to a Cargo.toml file and it just worked on Linux, macOS, and Windows. Stuff like this often takes hours to days to integrate in C/C++. It is quite ridiculous how easy it is to add (complex) components to Rust projects!

For years, Python projects have implemented extensions in C to realize performance wins. If your Python application is a Rust executable, then implementing this functionality in Rust (rather than C) seems rationale. So we may see oxidized Python applications have their performance critical pieces slowly be rewritten in Rust. (Honestly, the Rust crates to interface between Rust and the CPython API still leave a bit to be desired, so the experience of writing this Rust code still isn't great. But things will certainly improve over time.)

This type of inside-out split language work has been practiced in Python for years. What PyOxidizer brings to the table is the ability to more easily port code outside-in. For example, you could implement performance-criticial, early application logic such as config file parsing and command line argument parsing in Rust. You could then have Rust service some application functionality without Python. Why would you want this? Performance is a valid reason. Starting a Python interpreter, importing modules, and running code can consume several dozen or even hundreds of milliseconds. If you are writing performance sensitive applications, the existence of any Python can add enough latency that people no longer perceive the interaction as instananeous. This added latency can make Python totally inappropriate for some contexts, such as for programs that run as part of populating your shell's prompt. Writing such code in Rust instead of Python dramatically increases the probability that the code is fast and likely delivers stronger correctness guarantees courtesy of Rust's compile time validation as well!

An extreme practice of outside-in porting of Python to Rust would be to incrementally rewrite an entire Python application in Rust. Rust's ergonomics are exceptional and I do think we'll see people choose Rust where they previously would have chosen Python. I've done this myself with PyOxidizer and feel it is a very defensible decision to reach! I feel a bit conflicted releasing a tool which may undermine Python's popularity by encouraging use of Rust over Python. But at the end of the day, PyOxidizer increases the utility of both Python and Rust by giving each more readily accessible access to the other and PyOxidizer improves the overall utility of Python by improving the application distribution story. I have no doubt PyOxidizer is a net benefit for the Python ecosystem, even if it does help usher in more people choosing Rust over Python. If I have an ulterior motive in developing PyOxidizer, it is to enable Mercurial's official distribution to be a Rust executable and for some functionality (like hg status) to be runnable without Python (for performance reasons).

Another possible use of PyOxidizer is as a library. All the build time functionality of PyOxidizer exists in a Rust crate. So, you can add the pyoxidizer crate to your own Rust project and use its code to do things like build a library containing Python, compile Python source modules to bytecode, or walk a directory tree and find Python resources within. The code is still heavily geared towards PyOxidizer and there's no promise of API stability. But this potential for library usage exists and if others want to experiment with building custom Python binaries not using the pyoxidizer CLI tool, using PyOxidizer as a library might save you a lot of time.

Standalone Python Distributions

One of the most time consuming parts of building PyOxidizer was figuring out how to build self-contained Python distributions. Typically, a Python build consists of a library, shared libraries for various extension modules, shared libraries required by the prior items, and a hodgepodge of other files, such as .py files implementing the Python standard library. The python-build-standalone project was created to automate creating special builds of Python which are self-contained and distributable. This requires doing dirty things with build systems. But I don't want to inflict the details on you here. What I do think is worth mentioning is how those Python distributions are distributed. The output of the build is a tarball containing the Python installation, build artifacts that can be used to link a custom libpython, and a PYTHON.json file describing the contents of the distribution. PyOxidizer reads the PYTHON.json file and learns how it should interact with that distribution. If you produce a Python distribution conforming to the format that python-build-standalone defines, you can use that Python with PyOxidizer.

While I have no urgency to do so at this time, I could see a future where this Python distribution format is standardized. Then maintainers of various Python distributions (CPython, PyPy, etc) would independently produce their own distributable artifacts conforming to this standard, in turn allowing machine consumers of Python distributions (such as PyOxidizer) to easily consume different Python distributions and do interesting things with them. You could even imagine these Python distribution archives being readily available as packages in your system's package manager and their locations exposed via the sysconfig Python module, making it easy for tools (like PyOxidizer) to find and use them.

Over time, I could see PyOxidizer's functionality rolling up into official packaging tools like pip, which would know how to consume the distribution archives and produce an executable containing a Python interpreter, required Python modules, etc.

Getting PyOxidizer's functionality rolled into official Python packaging tools is likely years away (if it ever happens). But I think standardizing a format describing a Python distribution and (optionally) contains build artifacts that can be used to repackage it is a prerequisite and would be a good place to start this journey. I would certainly love for Python distributions (like CPython) to be in charge of producing official repackagable distributions because this is not something I want to be in the business of doing long term (I'm lazy, less equipped to make the correct decisions, and there are various trust and security concerns). And while I'm here, I am definitely interested in upstreaming some of the python-build-standalone functionality into the existing CPython build system because coercing CPython's build system to produce distributable binaries is currently a major pain and I'd love to enable others to do this. I just haven't had time nor do I know if the patches would be well received. If a CPython maintainer wants to get in touch, I'd love to have a conversation!

Conclusion

I started hacking on PyOxidizer in November 2018. After months of chipping away at it, I think I finally have a useful utility for some audiences. There's still a lot of missing features and some rough edges. But the core functionality is there and I'm convinced that PyOxidizer or its underlying technology could be an integral part of solving Python's application distribution black swan problem. I'm particularly proud of the hacks I concocted to coerce Python into importing module bytecode from memory using zero-copy. Those are documented in this blog post and in the pyembed crate docs.

So what are you waiting for? Head on over to the documentation, install PyOxidizer, and let me know how it goes by filing issues!

I hope you enjoy oxidizing your Python applications!


PyOxidizer Support for Windows

January 06, 2019 at 10:00 AM | categories: Python, PyOxidizer, Rust

A few weeks ago I introduced PyOxidizer, a project that aims to make it easier to produce completely self-contained executables embedding a Python interpreter (using Rust). A few days later I observed some PyOxidizer performance benefits.

After a few more hacking sessions, I'm very pleased to report that PyOxidizer is now working on Windows!

I am able to produce a standalone Windows .exe containing a fully featured CPython interpreter, all its library dependencies (OpenSSL, SQLite, liblzma, etc), and a copy of the Python standard library (both source and bytecode data). The binary weighs in at around 25 MB. (It could be smaller if we didn't embed .py source files or stripped some dependencies.) The only DLL dependencies of the exe are vcruntime140.dll and various system DLLs that are always present on Windows.

Like I did for Linux and macOS, I produced a Python script that performs ~500 import statements for the near entirety of the Python standard library. I then ran this script with both the official 64-bit Python distribution and an executable produced with PyOxidizer:

# Official CPython 3.7.2 Windows distribution.
$ time python.exe < import_stdlib.py
real    0m0.475s

# PyOxidizer with non-PGO CPython 3.7.2
$ time target/release/pyapp.exe < import_stdlib.py
real    0m0.347s

Compared to the official CPython distribution, a PyOxidizer executable can import almost the entirety of the Python standard library ~125ms faster - or ~73% of original. In terms of the percentage of speedup, the gains are similar to Linux and macOS. However, there is substantial new process overhead on Windows compared to POSIX architectures. On the same machine, a hello world Python process will execute in ~10ms on Linux and ~40ms on Windows. If we remove the startup overhead, importing the Python standard library runs at ~70% of its original time, making the relative speedup on par with that seen on macOS + APFS.

Windows support is a major milestone for PyOxidizer. And it was the hardest platform to make work. CPython's build system on Windows uses Visual Studio project files. And coercing the build system to produce static libraries was a real pain. Lots of CPython's build tooling assumes Python is built in a very specific manner and multiple changes I made completely break those assumptions. On top of that, it's very easy to encounter problems with symbol name mismatch due to the use of __declspec(dllexport) and __declspec(dllimport). I spent several hours going down a rabbit hole learning how Rust generates symbols on Windows for extern {} items. Unfortunately, we currently have to use a Rust Nightly feature (the static-nobundle linkage kind) to get things to work. But I think there are options to remove that requirement.

Up to this point, my work on PyOxidizer has focused on prototyping the concept. With Windows out of the way and PyOxidizer working on Linux, macOS, and Windows, I have achieved confidence that my vision of a single executable embedding a full-featured Python interpreter is technically viable on major desktop platforms! (BSD people, I care about you too. The solution for Linux should be portable to BSD.) This means I can start focusing on features, usability, and optimization. In other words, I can start building a tool that others will want to use.

As always, you can follow my work on this blog and by following the python-build-standalone and PyOxidizer projects on GitHub.


Faster In-Memory Python Module Importing

December 28, 2018 at 12:40 PM | categories: Python, PyOxidizer, Rust

I recently blogged about distributing standalone Python applications. In that post, I announced PyOxidizer - a tool which leverages Rust to produce standalone executables embedding Python. One of the features of PyOxidizer is the ability to import Python modules embedded within the binary using zero-copy.

I also recently blogged about global kernel locks in APFS, which make filesystem operations slower on macOS. This was the latest wrinkle in a long battle against Python's slow startup times, which I've posted about on the official python-dev mailing list over the years.

Since I announced PyOxidizer a few days ago, I've had some productive holiday hacking sessions!

One of the reached milestones is PyOxidizer now supports macOS.

With that milestone reached, I thought it would be interesting to compare the performance of a PyOxidizer executable versus a standard CPython build.

I produced a Python script that imports almost the entirety of the Python standard library - at least the modules implemented in Python. That's 508 import statements. I then executed this script using a typical python3.7 binary (with the standard library on the filesystem) and PyOxidizer-produced standalone executables with a module importer that loads Python modules from memory using zero copy.

# Homebrew installed CPython 3.7.2

# Cold disk cache.
$ sudo purge
$ time /usr/local/bin/python3.7 < import_stdlib.py
real   0m0.694s
user   0m0.354s
sys    0m0.121s

# Hot disk cache.
$ time /usr/local/bin/python3.7 < import_stdlib.py
real   0m0.319s
user   0m0.263s
sys    0m0.050s

# PyOxidizer with non-PGO/non-LTO CPython 3.7.2
$ time target/release/pyapp < import_stdlib.py
real   0m0.223s
user   0m0.201s
sys    0m0.017s

# PyOxidizer with PGO/non-LTO CPython 3.7.2
$ time target/release/pyapp < import_stdlib.py
real   0m0.234s
user   0m0.210s
sys    0m0.019

# PyOxidizer with PTO+LTO CPython 3.7.2
$ sudo purge
$ time target/release/pyapp < import_stdlib.py
real   0m0.442s
user   0m0.252s
sys    0m0.059s

$ time target/release/pyall < import_stdlib.py
real   0m0.221s
user   0m0.197s
sys    0m0.020s

First, the PyOxidizer times are all relatively similar regardless of whether PGO or LTO is used to build CPython. That's not too surprising, as I'm exercising a very limited subset of CPython (and I suspect the benefits of PGO/LTO aren't as pronounced due to the nature of the CPython API).

But the bigger result is the obvious speedup with PyOxidizer and its in-memory importing: PyOxidizer can import almost the entirety of the Python standard library ~100ms faster - or ~70% of original - than a typical standalone CPython install with a hot disk cache! This comes out to ~0.19ms per import statement. If we run purge to clear out the disk cache, the performance delta increases to 252ms, or ~64% of original. All these numbers are on a 2018 6-core 2.9 GHz i9 MacBook Pro, which has a pretty decent SSD.

And on Linux on an i7-6700K running in a Hyper-V VM:

# pyenv installed CPython 3.7.2

# Cold disk cache.
$ time ~/.pyenv/versions/3.7.2/bin/python < import_stdlib.py
real   0m0.405s
user   0m0.165s
sys    0m0.065s

# Hot disk cache.
$ time ~/.pyenv/versions/3.7.2/bin/python < import_stdlib.py
real   0m0.193s
user   0m0.161s
sys    0m0.032s

# PyOxidizer with PGO CPython 3.7.2

# Cold disk cache.
$ time target/release/pyapp < import_stdlib.py
real   0m0.227s
user   0m0.145s
sys    0m0.016s

# Hot disk cache.
$ time target/release/pyapp < import_stdlib.py
real   0m0.152s
user   0m0.136s
sys    0m0.016s

On a hot disk cache, the run-time improvement of PyOxidizer is ~41ms, or ~78% of original. This comes out to ~0.08ms per import statement. When flushing caches by writing 3 to /proc/sys/vm/drop_caches, the delta increases to ~178ms, or ~56% of original.

Using dtruss -c to execute the binaries, the breakdown in system calls occurring >10 times is clear:

# CPython standalone
fstatfs64                                      16
read_nocancel                                  19
ioctl                                          20
getentropy                                     22
pread                                          26
fcntl                                          27
sigaction                                      32
getdirentries64                                34
fcntl_nocancel                                106
mmap                                          114
close_nocancel                                129
open_nocancel                                 130
lseek                                         148
open                                          168
close                                         170
read                                          282
fstat64                                       403
stat64                                        833

# PyOxidizer
lseek                                          10
read                                           12
read_nocancel                                  14
fstat64                                        16
ioctl                                          22
munmap                                         31
stat64                                         33
sysctl                                         33
sigaction                                      36
mmap                                          122
madvise                                       193
getentropy                                    315

PyOxidizer avoids hundreds of open(), close(), read(), fstat64(), and stat64() calls. And by avoiding these calls, PyOxidizer not only avoids the userland-kernel overhead intrinsic to them, but also any additional overhead that APFS is imposing via its global lock(s).

(Why the PyOxidizer binary is making hundreds of calls to getentropy() I'm not sure. It's definitely coming from Python as a side-effect of a module import and it is something I'd like to fix, if possible.)

With this experiment, we finally have the ability to better isolate the impact of filesystem overhead on Python module importing and preliminary results indicate that the overhead is not insignificant - at least on the tested systems (I'll get data for Windows when PyOxidizer supports it). While the test is somewhat contrived (I don't think many applications import the entirety of the Python standard library), some Python applications do import hundreds of modules. And as I've written before, milliseconds matter. This is especially true if you are invoking Python processes hundreds or thousands of times in a build system, when running a test suite, for scripting, etc. Cumulatively you can be importing tens of thousands of modules. So I think shaving even fractions of a millisecond from module importing is important.

It's worth noting that in addition to the system call overhead, CPython's path-based importer runs substantially more Python code than PyOxidizer and this likely contributes several milliseconds of overhead as well. Because PyOxidizer applications are static, the importer can remain simple (finding a module in PyOxidizer is essentially a Rust HashMap<String, Vec<u8> lookup). While it might be useful to isolate the filesystem overhead from Python code overhead, the thing that end-users care about is overall execution time: they don't care where that overhead is coming from. So I think it is fair to compare PyOxidizer - with its intrinsically simpler import model - with what Python typically does (scan sys.path entries and looking for modules on the filesystem).

Another difference is that PyOxidizer is almost completely statically linked. By contrast, a typical CPython install has compiled extension modules as standalone shared libraries and these shared libraries often link against other shared libraries (such as libssl). From dtruss timing information, I don't believe this difference contributes to significant overhead, however.

Finally, I haven't yet optimized PyOxidizer. I still have a few tricks up my sleeve that can likely shave off more overhead from Python startup. But so far the results are looking very promising. I dare say they are looking promising enough that Python distributions themselves might want to look into the area more thoroughly and consider distribution defaults that rely less on the every-Python-module-is-a-separate-file model.

Stay tuned for more PyOxidizer updates in the near future!

(I updated this post a day after initial publication to add measurements for Linux.)


« Previous Page -- Next Page »