Packaging Python Files
Python files are arguably the most important packaged resource type: source modules, bytecode modules, extension modules, package resources, etc.
For PyOxidizer to recognize these Python resources as Python resources
(as opposed to regular files), you need to use the methods on the
PythonExecutable Starlark type. These methods use the settings of the
thing being built to scan for resources, possibly performing a Python
packaging action (such as invoking pip install) along the way.
This documentation covers the available methods and how they can be used.
PythonExecutable Python Resources Methods
The PythonExecutable
Starlark type has the following methods that
can be called to perform an action and obtain an iterable of objects
representing discovered resources:
PythonExecutable.pip_download()
Invokes pip download with specified arguments and collects resources discovered from downloaded Python wheels.
PythonExecutable.pip_install()
Invokes pip install with specified arguments and collects all resources installed by that process.
PythonExecutable.read_package_root()
Recursively scans a filesystem directory for Python resources in a typical Python installation layout.
PythonExecutable.setup_py_install()
Invokes python setup.py install for a given path and collects resources installed by that process.
PythonExecutable.read_virtualenv()
Reads Python resources present in an already populated virtualenv.
Typically, the Starlark types resolved by these method calls are
passed into a method that adds the resource to a to-be-generated
entity, such as the PythonExecutable
Starlark type.
The following sections demonstrate common use cases.
Packaging an Application from a PyPI Package
In this section, we’ll show how to package the pyflakes program using a published PyPI package. (Pyflakes is a Python linter.)
First, let’s create an empty project:
$ pyoxidizer init-config-file pyflakes
Next, we need to edit the configuration file to tell
PyOxidizer about pyflakes. Open the pyflakes/pyoxidizer.bzl
file in your
favorite editor.
Find the make_exe()
function. This function returns a
PythonExecutable
instance which defines
a standalone executable containing Python. This function is a registered
target, which is a named entity that can be individually built or run.
By returning a PythonExecutable
instance, this function/target is saying
build an executable containing Python.
The PythonExecutable type holds all state needed to package and run
a Python interpreter. This ranges from low-level interpreter configuration
settings to which Python resources (like source and bytecode modules)
are embedded in that executable binary. This type exposes a
PythonExecutable.add_python_resources() method, which adds an iterable
of objects representing Python resources to the set of embedded resources.
Elsewhere in this function, the dist variable holds an instance of
PythonDistribution. This type represents a Python distribution,
which is a fancy way of saying an implementation of Python.
Two of the methods exposed by PythonExecutable are
PythonExecutable.pip_download() and PythonExecutable.pip_install(), which
invoke pip commands with settings to target the built executable.
To add a new Python package to our executable, we call one of these
methods, then add the results to our PythonExecutable instance. This
is done like so:
exe.add_python_resources(exe.pip_download(["pyflakes==2.2.0"]))
# or
exe.add_python_resources(exe.pip_install(["pyflakes==2.2.0"]))
When called, these methods will effectively run pip download pyflakes==2.2.0
or pip install pyflakes==2.2.0, respectively. Actions are performed in
a temporary directory. After pip runs, PyOxidizer collects all the
downloaded/installed resources (like module sources and bytecode data) and
returns them as an iterable of Starlark values. The
exe.add_python_resources() call then teaches the built executable
binary about the existence of these resources. Many resource types will be
embedded in the binary and loaded from the binary. But some resource types
(notably compiled extension modules) may be installed next to the built
binary and loaded from the filesystem.
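To make the distinction between resource kinds concrete, here is a minimal Python sketch that sorts collected files into rough categories by file suffix. It is not PyOxidizer's implementation: the `classify_resource` and `may_need_filesystem` helpers, the category labels, and the simplified suffix list are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Suffixes commonly used by compiled extension modules (simplified assumption).
EXTENSION_SUFFIXES = (".so", ".pyd")


def classify_resource(path: str) -> str:
    """Return a rough resource kind for a file collected after a pip action."""
    suffix = PurePosixPath(path).suffix
    if suffix == ".py":
        return "source module"
    if suffix == ".pyc":
        return "bytecode module"
    if suffix in EXTENSION_SUFFIXES:
        return "extension module"
    # Anything else that lives inside a package is treated as package data.
    return "package resource"


def may_need_filesystem(path: str) -> bool:
    """Flag files that may have to live next to the binary (hypothetical rule)."""
    return classify_resource(path) == "extension module"
```

In this sketch only compiled extension modules are flagged as candidates for filesystem installation; the actual decision depends on the packaging policy and build target.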
Next, we tell PyOxidizer to run pyflakes
when the interpreter is executed:
python_config.run_command = "from pyflakes.api import main; main()"
This says to effectively run the Python code
exec("from pyflakes.api import main; main()")
when the embedded interpreter starts.
The new make_exe()
function should look something like the following (with
comments removed for brevity):
def make_exe(dist):
    policy = dist.make_python_packaging_policy()
    policy.extension_module_filter = "all"
    policy.include_distribution_sources = True
    policy.include_distribution_resources = True
    policy.include_test = False
    config = dist.make_python_interpreter_config()
    config.run_command = "from pyflakes.api import main; main()"
    exe = dist.to_python_executable(
        name="pyflakes",
        packaging_policy=policy,
        config=config,
    )
    exe.add_python_resources(exe.pip_install(["pyflakes==2.2.0"]))
    return exe
With the configuration changes made, we can build and run a pyflakes
native executable:
# From outside the ``pyflakes`` directory
$ pyoxidizer run --path /path/to/pyflakes/project -- /path/to/python/file/to/analyze
# From inside the ``pyflakes`` directory
$ pyoxidizer run -- /path/to/python/file/to/analyze
# Or if you prefer the Rust native tools
$ cargo run -- /path/to/python/file/to/analyze
By default, pyflakes
analyzes Python source code passed to it via
stdin.
Packaging an Application from an Existing Virtualenv
This scenario is very similar to the above example, so we’ll only briefly describe what to do:
$ pyoxidizer init-config-file /path/to/myapp
Now edit the pyoxidizer.bzl file so the make_exe() function looks like the
following:
def make_exe(dist):
    policy = dist.make_python_packaging_policy()
    policy.extension_module_filter = "all"
    policy.include_distribution_sources = True
    policy.include_distribution_resources = False
    policy.include_test = False
    config = dist.make_python_interpreter_config()
    config.run_command = "from myapp import main; main()"
    exe = dist.to_python_executable(
        name="myapp",
        packaging_policy=policy,
        config=config,
    )
    exe.add_python_resources(exe.read_virtualenv("/path/to/virtualenv"))
    return exe
Of course, you need a populated virtualenv:
$ python3.8 -m venv /path/to/virtualenv
$ /path/to/virtualenv/bin/pip install -r /path/to/requirements.txt
Once all the pieces are in place, simply run pyoxidizer
to build and
run the application:
$ pyoxidizer run --path /path/to/myapp
Warning
When consuming a pre-populated virtualenv, there may be compatibility differences between the Python distribution used to populate the virtualenv and the Python distribution used by PyOxidizer at build and application run time.
For best results, it is recommended to use a packaging method like
pip_install(...)
or setup_py_install(...)
to use PyOxidizer’s
Python distribution to invoke Python’s packaging tools.
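One way to spot such a mismatch early is to compare Python versions before building. The following is a minimal sketch, assuming the virtualenv was created with python -m venv, which records a `version = X.Y.Z` line in its pyvenv.cfg file. The helper names `read_venv_version` and `same_minor_version` are hypothetical, and this check is not something PyOxidizer performs for you.

```python
from pathlib import Path
from typing import Optional


def read_venv_version(venv_path: str) -> Optional[str]:
    """Return the `version` value recorded in a virtualenv's pyvenv.cfg, if any."""
    cfg = Path(venv_path) / "pyvenv.cfg"
    if not cfg.is_file():
        return None
    for line in cfg.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "version":
            return value.strip()
    return None


def same_minor_version(venv_version: str, target_version: str) -> bool:
    """Compare only the major.minor components, e.g. '3.8.10' vs '3.8.6'."""
    return venv_version.split(".")[:2] == target_version.split(".")[:2]
```

A matching major.minor version does not guarantee binary compatibility of extension modules; it only rules out the most obvious mismatch.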
Packaging an Application from a Local Python Package
Say you have a Python package/application in a local directory. It follows
the typical Python package layout and has a setup.py
file and Python
files in sub-directories corresponding to the package name. e.g.:
setup.py
mypackage/__init__.py
mypackage/foo.py
You have a number of choices as to how to proceed here. Again, the
workflow is very similar to what was explained above. The main difference
is the content of the pyoxidizer.bzl
file and the exact
method to call
to obtain the Python resources.
You could have pip process a local filesystem path, as in
pip install <local path>:
exe.add_python_resources(exe.pip_install(["/path/to/local/package"]))
If the pyoxidizer.bzl file is in the directory you want to process, you can
derive the absolute path to this directory via the CWD Starlark variable:
exe.add_python_resources(exe.pip_install([CWD]))
If you don’t want to use pip and want to run setup.py directly,
you can do so:
exe.add_python_resources(exe.setup_py_install(package_path=CWD))
Or, if you don’t want to run a Python packaging tool at all and just want to scan a directory tree for Python files:
exe.add_python_resources(exe.read_package_root(CWD, ["mypackage"]))
Note
In this mode, all Python resources must already be in place in their
final installation layout for things to work correctly. Many setup.py
files perform additional actions such as compiling Python extension
modules, installing additional files, dynamically generating some files,
or changing the final installation layout.
For best results, use a packaging method that invokes a Python packaging
tool (like pip_install(...) or setup_py_install(...)).
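To illustrate what scanning a directory tree that is already in its final layout involves, here is a minimal Python sketch that maps source files under a root directory to dotted module names, keeping only the requested top-level packages. It is an approximation for illustration, not PyOxidizer's scanner, and the `scan_package_root` helper is hypothetical.

```python
import os
from typing import Dict, List


def scan_package_root(root: str, packages: List[str]) -> Dict[str, str]:
    """Map dotted module names to source file paths found under `root`.

    Only modules whose top-level name is listed in `packages` are kept.
    """
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".py"):
                continue
            full_path = os.path.join(dirpath, filename)
            relative = os.path.relpath(full_path, root)
            parts = relative[: -len(".py")].split(os.sep)
            if parts[-1] == "__init__":
                # A package's __init__.py is named after the package itself.
                parts = parts[:-1]
            if not parts or parts[0] not in packages:
                continue
            found[".".join(parts)] = full_path
    return found
```

For the layout shown earlier, `scan_package_root(root, ["mypackage"])` would pick up `mypackage` and `mypackage.foo` and skip `setup.py`. Nothing is built or generated along the way, which is why the files must already be in place.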
Choosing Which Packaging Method to Call
There are a handful of different methods for obtaining Python resources that can be added to a resource collection. Which one should you use?
There are so many methods because the answer is: it depends.
Each method for obtaining resources has its niche use cases. That being said,
the preferred method for obtaining Python resources is pip_download().
However, pip_download()
may not work in all cases, which is why other
methods exist.
PythonExecutable.pip_download()
runs pip download
and
attempts to fetch Python wheels for specified packages, requirements files,
etc. It then extracts files from inside the wheel and converts them to
Python resources which can be added to resource collectors.
Important
pip_download()
will only work if a compatible Python wheel package
(.whl
file) is available. If the configured Python package repository
doesn’t offer a compatible wheel for the specified package or any of its
dependencies, the operation will fail.
Many Python packages do not yet publish wheels (only .tar.gz
archives)
or don’t publish at all to Python package repositories (this is common in
corporate environments, where you don’t want to publish your proprietary
packages on PyPI or you don’t run a Python package server).
Important
Not all build targets support pip_download()
for all published packages.
For example, when targeting Linux musl libc, built binaries are fully static
and aren’t capable of loading Python extension modules (which are shared
libraries). So pip_download()
only supports source-only Python wheels
in this configuration.
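Whether a wheel is source-only can usually be read from its filename, which ends in a python tag, an ABI tag, and a platform tag. The following minimal sketch assumes the standard wheel naming convention and treats an ABI tag of `none` together with a platform tag of `any` as meaning "pure Python". The `is_pure_python_wheel` helper is hypothetical, and the filenames in the usage note are examples only.

```python
def is_pure_python_wheel(filename: str) -> bool:
    """Guess from a wheel filename whether it contains only Python code."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    # Expected: name, version, optional build tag, python tag, abi tag, platform tag.
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: %r" % filename)
    _python_tag, abi_tag, platform_tag = parts[-3:]
    return abi_tag == "none" and platform_tag == "any"
```

For example, `pyflakes-2.2.0-py2.py3-none-any.whl` is reported as pure Python, while a name such as `example-1.0-cp38-cp38-manylinux1_x86_64.whl` is not.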
Another advantage of pip_download() is that it supports cross-compiling.
Unlike pip install, pip download supports arguments that tell it
which Python version, platform, implementation, etc. to download packages
for. PyOxidizer automatically tells pip download to download wheels
that are compatible with the target environment you are building for. This
means you can do things like download wheels containing Windows binaries
when building on Linux.
Note
Cross-compiling is not yet fully supported by PyOxidizer and likely
doesn’t work in many cases. However, this is a planned feature (at least
for some configurations) and pip_download()
is likely the most
future-proof mechanism to support installing Python packages when
cross-compiling.
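For reference, the pip download arguments in question can be sketched as follows. This helper only assembles a command line from pip's documented options (--platform, --python-version, --implementation, --abi, --only-binary, --dest). The `pip_download_args` function and its parameter values are hypothetical, and the result is not necessarily the exact command PyOxidizer runs.

```python
import sys
from typing import List, Optional, Sequence


def pip_download_args(
    requirements: Sequence[str],
    dest: str,
    platform: str,
    python_version: str,
    implementation: str = "cp",
    abi: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a `pip download` command for a foreign target."""
    args = [
        sys.executable, "-m", "pip", "download",
        # pip requires binary-only downloads when a foreign target is given.
        "--only-binary=:all:",
        "--dest", dest,
        "--platform", platform,
        "--python-version", python_version,
        "--implementation", implementation,
    ]
    if abi is not None:
        args += ["--abi", abi]
    return args + list(requirements)
```

Calling `pip_download_args(["pyflakes==2.2.0"], "wheels", "win_amd64", "3.8", abi="cp38")` yields a command that asks pip for Windows-compatible wheels regardless of the host platform; it could then be passed to `subprocess.run`.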
A potential downside with pip_download()
is that it only supports
classical Python binary loading/shipping techniques. If you are trying
to produce a statically linked executable containing custom Python
extension modules, pip_download()
won’t work for you.
After pip_download(), PythonExecutable.pip_install() and
PythonExecutable.setup_py_install() are the next most-preferred
packaging methods.
Both of these work by locally running a Python packaging action
(pip install or python setup.py install, respectively) and then
collecting resources installed by that action.
The advantage over pip download is that a pre-built Python wheel
does not have to be available and published on a Python package repository
for these commands to work: you can run either against, say, a local version
control checkout of a Python project and it should work.
The main disadvantage compared to pip download is that you are running
Python packaging operations on the local machine as part of building
an executable. If your package contains just Python code, this should
just work. But if you need to compile extension modules, there’s a
good chance your local machine may either not be able to build them
properly or will build those extension modules in such a way that
they aren’t compatible with other machines you want to run them on.
The final options for obtaining Python resources are
PythonExecutable.read_package_root() and
PythonExecutable.read_virtualenv(). Both of these methods
rely on traversing a filesystem tree that is already populated with Python
resources. This should just work if only pure Python resources are in play.
But if there are compiled Python extension modules, all bets are off and
there is no guarantee that found extension modules will be compatible with
PyOxidizer or will have binary compatibility with other machines. These
resource discovery mechanisms also rely on state not under the control of
PyOxidizer and therefore packaging results may be highly inconsistent and
not reproducible across runs. For these reasons, read_package_root()
and read_virtualenv() are the least preferred methods for Python resource
discovery.