Controlling Python From Rust Code
PyOxidizer can be used to embed Python in a Rust application.
This page documents what that looks like from a Rust code perspective.
Interacting with the pyembed Crate
When writing Rust code to interact with a Python interpreter, your
primary area of contact will be with the pyembed crate.
The pyembed crate is a standalone crate maintained as part of the
PyOxidizer project. It provides the core run-time functionality
for PyOxidizer, such as the implementation of PyOxidizer’s custom
importer. It also exposes a high-level API for initializing a Python
interpreter and running code in it.
Under the hood, the pyembed crate uses the cpython and python3-sys
crates to interact with Python’s C APIs. pyembed exposes the Python
object from cpython, which means that once you’ve initialized a Python
interpreter with pyembed, you can use all the functionality in cpython
to interact with that interpreter.
Initializing a Python Interpreter
Initializing an embedded Python interpreter in your Rust process is as simple
as calling
pyembed::MainPythonInterpreter::new(config: OxidizedPythonInterpreterConfig).
The hardest part is constructing the
pyembed::OxidizedPythonInterpreterConfig instance.
Using the Default OxidizedPythonInterpreterConfig
If the pyembed crate is configured to emit build artifacts (the default),
its build script will generate a Rust source file containing a
fn default_python_config() -> pyembed::OxidizedPythonInterpreterConfig
that returns a pyembed::OxidizedPythonInterpreterConfig built from the
settings in your PyOxidizer configuration file.
Assuming you are using the boilerplate Cargo.toml and build.rs script
generated with pyoxidizer init-rust-project, the path to this generated
source file will be in the PYOXIDIZER_DEFAULT_PYTHON_CONFIG_RS
environment variable.
This means that to use the auto-generated
pyembed::OxidizedPythonInterpreterConfig instance with your Rust application,
you simply need to do something like the following:
include!(env!("PYOXIDIZER_DEFAULT_PYTHON_CONFIG_RS"));

// Note: the single-parameter Result assumes an alias with a suitable
// error type is in scope.
fn create_interpreter() -> Result<pyembed::MainPythonInterpreter> {
    // Calls the function from the include!()'d file.
    let config: pyembed::OxidizedPythonInterpreterConfig = default_python_config();
    pyembed::MainPythonInterpreter::new(config)
}
Using a Custom OxidizedPythonInterpreterConfig
If you don’t want to use the default
pyembed::OxidizedPythonInterpreterConfig instance, that’s fine too! However,
this will be slightly more complicated.
First, if you use an explicit OxidizedPythonInterpreterConfig, the
PythonInterpreterConfig Starlark type defined in your PyOxidizer
configuration file doesn’t matter that much. The primary purpose of this
Starlark type is to derive the default OxidizedPythonInterpreterConfig
Rust struct. So if you are using your own custom
OxidizedPythonInterpreterConfig instance, you can ignore most of the
arguments when creating the PythonInterpreterConfig instance.
An exception to this is the raw_allocator argument/field. If you
are using jemalloc, you will need to enable a Cargo feature when building
the pyembed crate or else you will get a run-time error that jemalloc
is not available.
pyembed::OxidizedPythonInterpreterConfig::default() can be used to
construct a new instance, pre-populated with default values for each field.
The defaults should match what the PythonInterpreterConfig Starlark
type would yield.
The main catch to constructing the instance manually is that the custom
meta path importer won’t be able to service Python import requests
unless you populate a few fields. In fact, if you just use the defaults,
things will blow up pretty hard at run-time:
$ myapp
Fatal Python error: initfsencoding: Unable to get the locale encoding
ModuleNotFoundError: No module named 'encodings'
Current thread 0x00007fa0e2cbe9c0 (most recent call first):
Aborted (core dumped)
What’s happening here is that Python interpreter initialization hits a fatal
error because it can’t import encodings (because it can’t locate the
Python standard library) and Python’s C code aborts the process. Rust
doesn’t even get the chance to handle the error, which is why we’re seeing
the process abort with a core dump.
The reason we can’t import encodings is twofold:
1. The default filesystem importer is disabled by default.
2. No Python resources are being registered with the
   OxidizedPythonInterpreterConfig instance.
Fixing either one resolves the error.
To enable the default filesystem importer:
let mut config = pyembed::OxidizedPythonInterpreterConfig::default();
config.filesystem_importer = true;
config.sys_paths.push("/path/to/python/standard/library");
As long as the default filesystem importer is enabled and sys.path
can find the Python standard library, you should be able to
start a Python interpreter.
Hint
The sys_paths field expands the special token $ORIGIN to the
directory of the running executable. So if the Python standard library is
in, e.g., the lib directory next to the executable, you can do something
like config.sys_paths.push("$ORIGIN/lib").
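To make the $ORIGIN behavior concrete, here is a minimal sketch of what such an expansion could look like. The helpers expand_origin and executable_dir are hypothetical names introduced for illustration and are not pyembed's actual implementation; they only show the idea of replacing the token with the directory of the running executable.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: replace every `$ORIGIN` token in `entry`
/// with the textual form of `origin`.
fn expand_origin(entry: &str, origin: &Path) -> String {
    entry.replace("$ORIGIN", &origin.display().to_string())
}

/// Hypothetical helper: directory containing the running executable.
fn executable_dir() -> io::Result<PathBuf> {
    let exe = env::current_exe()?;
    exe.parent().map(Path::to_path_buf).ok_or_else(|| {
        io::Error::new(
            io::ErrorKind::NotFound,
            "executable has no parent directory",
        )
    })
}
```

Under these assumptions, expand_origin("$ORIGIN/lib", &executable_dir()?) would yield the lib directory next to the executable, while entries without the token pass through unchanged.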
If you want to use the custom PyOxidizer Importer to import Python resources, you will need to update a handful of fields:
let mut config = pyembed::OxidizedPythonInterpreterConfig::default();
config.packed_resources = ...;
config.use_custom_importlib = true;
The packed_resources field defines a reference to packed resources
data (a &[u8]). This is a custom serialization format for expressing
resources to make available to a Python interpreter. See
Python Packed Resources for more. The easiest way to obtain this
data blob is by using PyOxidizer and consuming the packed-resources
build artifact/file, likely through include_bytes!.
The OxidizedFinder Python type can also be used to produce these data
structures.
Finally, setting use_custom_importlib = true is necessary to enable
the custom bytecode and meta path importer to be used at run-time.
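The two fixes described in this section can be summarized as a pre-flight check. The sketch below uses a hypothetical stand-in struct, ConfigSketch, that mirrors only the field names mentioned above. Its field types are assumptions made for illustration, and it is not the real pyembed::OxidizedPythonInterpreterConfig. The check is also only a heuristic: it cannot confirm that the standard library actually exists at the configured location.

```rust
/// Hypothetical stand-in mirroring the field names discussed above.
/// The field types are assumptions, not pyembed's real definitions.
#[derive(Default)]
struct ConfigSketch<'a> {
    filesystem_importer: bool,
    sys_paths: Vec<String>,
    packed_resources: Option<&'a [u8]>,
    use_custom_importlib: bool,
}

/// Returns true if at least one import mechanism looks configured.
fn can_locate_stdlib(config: &ConfigSketch) -> bool {
    // Route 1: filesystem importer enabled and at least one search path.
    let via_filesystem = config.filesystem_importer && !config.sys_paths.is_empty();
    // Route 2: custom importer enabled and non-empty packed resources data.
    let via_packed = config.use_custom_importlib
        && config.packed_resources.map_or(false, |data| !data.is_empty());
    via_filesystem || via_packed
}
```

With all-default values neither route is available, which corresponds to the fatal start-up error shown earlier.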
Using a Python Interpreter
Once you’ve constructed a pyembed::MainPythonInterpreter instance, you
can obtain a cpython::Python instance via .acquire_gil() and then
use it:
fn do_it(interpreter: &pyembed::MainPythonInterpreter) {
    // Acquire the GIL to obtain a cpython::Python handle.
    let py = interpreter.acquire_gil().unwrap();
    match pyembed::run_code(py, "print('hello, world')") {
        Ok(_) => println!("python code executed successfully"),
        Err(e) => println!("python error: {:?}", e),
    }
}
The pyembed crate exports various run_* functions for
performing high-level evaluation of various primitives (files, modules,
code strings, etc). See the pyembed crate’s documentation for more.
Since CPython’s API relies on static variables (sadly), if you really wanted
to, you could call out to CPython C APIs directly (probably via the
bindings in the python3-sys crate) and they would interact with the
interpreter started by the pyembed crate. This is all unsafe, of course,
so tread at your own peril.
Finalizing the Interpreter
pyembed::MainPythonInterpreter implements Drop, and its drop
implementation calls Py_FinalizeEx(). So to terminate the Python
interpreter, simply have the MainPythonInterpreter instance go out of
scope or drop it explicitly.
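Because finalization is tied to Drop, ordinary Rust scoping rules decide when the interpreter shuts down. The sketch below uses a hypothetical stand-in type, InterpreterSketch, with a counter in place of the real call to Py_FinalizeEx(). It illustrates only the scoping behavior and is not pyembed's implementation.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for an interpreter handle; the counter replaces
/// the real finalization call.
struct InterpreterSketch<'a> {
    finalized: &'a Cell<u32>,
}

impl Drop for InterpreterSketch<'_> {
    fn drop(&mut self) {
        // The real type would finalize the Python interpreter here.
        self.finalized.set(self.finalized.get() + 1);
    }
}

/// Creates two stand-in handles: one dropped implicitly at the end of a
/// scope, one dropped explicitly.
fn demo_drop_points(counter: &Cell<u32>) {
    {
        let _scoped = InterpreterSketch { finalized: counter };
        // `_scoped` is dropped when this block ends.
    }
    let explicit = InterpreterSketch { finalized: counter };
    drop(explicit); // Dropped immediately, before the function returns.
}
```

In both cases the drop implementation runs exactly once per handle, so there is no separate shutdown method to remember to call.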
A Note on the pyembed APIs
The pyembed crate is highly tailored towards PyOxidizer’s default use
cases and the APIs are not considered extremely well polished.
While the functionality should work, the ergonomics may not be great.
It is a goal of the PyOxidizer project to support Rust programmers who want
to embed Python in Rust applications. So contributions to improve the quality
of the pyembed crate will likely be greatly appreciated!