API Reference¶
Module Level Functions¶
- oxidized_importer.decode_source(io_module, source_bytes) str ¶
Decodes Python source code bytes to a str.
This is effectively a reimplementation of importlib._bootstrap_external.decode_source().
- oxidized_importer.find_resources_in_path(path) List ¶
This function scans the specified filesystem path and returns an iterable of objects representing found resources. Each object is one of the types documented in oxidized_importer Python Resource Types.
Only directories can be scanned.
- oxidized_importer.register_pkg_resources()¶
Enables pkg_resources integration.
This function effectively does the following:
- Calls pkg_resources.register_finder() to map OxidizedPathEntryFinder to pkg_resources_find_distributions().
- Calls pkg_resources.register_loader_type() to map OxidizedFinder to OxidizedPkgResourcesProvider.
It is safe to call this function multiple times, as behavior should be deterministic.
- oxidized_importer.pkg_resources_find_distributions(finder: OxidizedPathEntryFinder, path_item: str, only=False) list ¶
Resolve pkg_resources.Distribution instances given an OxidizedPathEntryFinder and search criteria.
This function is what is registered with pkg_resources for distribution resolution and you likely don’t need to call it directly.
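As a rough illustration of what decode_source() (documented earlier in this section) does, the sketch below uses the standard library's public importlib.util.decode_source(), which re-exports the private helper named above. It assumes oxidized_importer.decode_source() yields equivalent output; note that the standard library function takes only the source bytes and has no io_module argument.

```python
import importlib.util

# Source bytes carrying a PEP 263 encoding cookie and Windows line endings.
raw = "# -*- coding: latin-1 -*-\r\nname = 'caf\u00e9'\r\n".encode("latin-1")

# Decoding honors the declared encoding and normalizes newlines to "\n".
text = importlib.util.decode_source(raw)
print(repr(text))
```

The encoding is detected from the cookie on the first line, so the non-ASCII character survives even though the bytes are not valid UTF-8.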
The OxidizedFinder Class¶
- class oxidized_importer.OxidizedFinder¶
A meta path finder that resolves indexed resources. See OxidizedFinder Meta Path Finder for more high-level documentation.
This type implements the following interfaces:
importlib.abc.MetaPathFinder
importlib.abc.Loader
importlib.abc.InspectLoader
importlib.abc.ExecutionLoader
See the importlib.abc documentation for more on these interfaces.
In addition to the methods on the above interfaces, the following methods defined elsewhere in importlib are exposed:
get_resource_reader(fullname: str) -> importlib.abc.ResourceReader
find_distributions(context: Optional[DistributionFinder.Context]) -> [Distribution]
ResourceReader is documented alongside other importlib.abc interfaces. find_distributions() is documented in importlib.metadata.
Instances have additional functionality beyond what is defined by importlib. This functionality allows you to construct, inspect, and manipulate instances.
- multiprocessing_set_start_method¶
(Optional[str]) Value to pass to multiprocessing.set_start_method() on import of the multiprocessing module. None means the method won’t be called.
- origin¶
(str) The path this instance is using as the anchor for relative path references.
- path_hook_base_str¶
(str) The base path that the path hook handler on this instance will respond to.
This value is often the same as sys.executable but isn’t guaranteed to be that exact value.
- pkg_resources_import_auto_register¶
(bool) Whether this instance will be registered via pkg_resources.register_finder() upon this instance importing the pkg_resources module.
- __new__(cls, relative_path_origin: Optional[os.PathLike]) OxidizedFinder ¶
Construct a new instance of OxidizedFinder.
New instances of OxidizedFinder can be constructed like normal Python types:
    finder = OxidizedFinder()
The constructor takes the following named arguments:
relative_path_origin
A path-like object denoting the filesystem path that should be used as the origin value for relative path resources. Filesystem-based resources are stored as a relative path to an anchor value. This is that anchor value. If not specified, the directory of the current executable will be used.
See the python_packed_resources Rust crate for the specification of the binary data blob defining packed resources data.
Important
The packed resources data format is still evolving. It is recommended to use the same version of the oxidized_importer extension to produce and consume this data structure to ensure compatibility.
- index_bytes(data: bytes) None ¶
This method parses any bytes-like object and indexes the resources within.
- index_file_memory_mapped(path: pathlib.Path) None ¶
This method parses the file at the given path-like argument and indexes the resources within. Memory mapped I/O is used to read the file. Rust manages the memory map via the memmap crate: this does not use the Python interpreter’s memory mapping code.
- index_interpreter_builtins() None ¶
This method indexes Python resources that are built-in to the Python interpreter itself. This indexes built-in extension modules and frozen modules.
- index_interpreter_builtin_extension_modules() None ¶
This method will index Python extension modules that are compiled into the Python interpreter itself.
- index_interpreter_frozen_modules() None ¶
This method will index Python modules whose bytecode is frozen into the Python interpreter itself.
- indexed_resources() List[OxidizedResource] ¶
This method returns a list of resources that are indexed by the instance. It allows Python code to inspect what the finder knows about.
Any mutations to returned values are not reflected in the finder.
See OxidizedResource for more on the returned type.
- add_resource(resource: OxidizedResource)¶
This method registers an OxidizedResource instance with the finder, enabling the finder to use it to service lookups.
When an OxidizedResource is registered, its data is copied into the finder instance. So changes to the original OxidizedResource are not reflected on the finder. (This is because OxidizedFinder maintains an index and it is important for the data behind that index to not change out from under it.)
Resources are stored in an invisible hash map where they are indexed by the name attribute. When a resource is added, any existing resource under the same name has its data replaced by the incoming OxidizedResource instance.
If you have source code and want to produce bytecode, you can do something like the following:
    import marshal
    from oxidized_importer import OxidizedResource

    def register_module(finder, module_name, source):
        # compile() accepts bytes, which is also what in_memory_source holds.
        code = compile(source, module_name, "exec")
        # Raw marshalled code object, without a .pyc header.
        bytecode = marshal.dumps(code)
        resource = OxidizedResource()
        resource.name = module_name
        resource.is_module = True
        resource.in_memory_bytecode = bytecode
        resource.in_memory_source = source
        # The finder copies the resource data when it is added.
        finder.add_resource(resource)
- add_resources(resources: List[OxidizedResource])¶
This method is syntactic sugar for calling add_resource() for every item in an iterable. It is exposed because function call overhead in Python can be non-trivial and it can be quicker to pass in an iterable of OxidizedResource than to call add_resource() potentially hundreds of times.
- serialize_indexed_resources(ignore_builtin=True, ignore_frozen=True) bytes ¶
This method serializes all resources currently indexed by the instance into an opaque bytes instance. The returned data can be fed into a separate OxidizedFinder instance by passing it to OxidizedFinder.index_bytes().
Arguments:
ignore_builtin (bool)
Whether to omit built-in extension modules from the serialized data. Default is True.
ignore_frozen (bool)
Whether to omit frozen modules from the serialized data. Default is True.
Entries for built-in and frozen modules are ignored by default because they aren’t portable, as they are compiled into the interpreter and aren’t guaranteed to work from one Python interpreter to another. The serialized format does support expressing them. Use at your own risk.
- path_hook(path: Union[str, bytes, os.PathLike[AnyStr]]) OxidizedPathEntryFinder ¶
Implements a path hook for obtaining a PathEntryFinder from a sys.path entry. See Paths Hooks Compatibility for details.
Raises ImportError if the given path isn’t serviceable. The exception should have __cause__ set to an inner exception with more details on why the path was rejected.
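The ImportError contract described for path_hook() can be handled as in the following minimal sketch. The helper try_path_hook and the stand-in rejecting_hook are hypothetical names introduced for illustration; in real use the hook argument would be the path_hook method of an OxidizedFinder instance.

```python
def try_path_hook(hook, path):
    """Return (entry_finder, None) on success, or (None, reason) on rejection."""
    try:
        return hook(path), None
    except ImportError as exc:
        # Prefer the chained inner exception, which carries the detailed reason.
        cause = exc.__cause__
        reason = repr(cause) if cause is not None else str(exc)
        return None, reason


def rejecting_hook(path):
    """Stand-in path hook (not part of oxidized_importer) that rejects every path."""
    try:
        raise ValueError(f"{path!r} is not under the base path")
    except ValueError as inner:
        raise ImportError("path not serviceable") from inner


finder, reason = try_path_hook(rejecting_hook, "/elsewhere")
print(finder, reason)
```

Reading __cause__ rather than the ImportError message is what surfaces why a given sys.path entry was rejected.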
The OxidizedDistribution Class¶
- class oxidized_importer.OxidizedDistribution¶
Represents the metadata of a Python package. Comparable to importlib.metadata.Distribution. Instances of this type are emitted by OxidizedFinder.find_distributions.
- from_name(cls, name: str) OxidizedDistribution ¶
(classmethod) Resolve the instance for the given package name.
- discover(cls, **kwargs) list[OxidizedDistribution] ¶
(classmethod) Resolve instances for all known packages.
- property metadata¶
Return the parsed metadata for this distribution.
- property entry_points¶
Resolve entry points for this distribution package.
- property files¶
Not implemented. Always raises when called.
- property requires¶
Generated requirements specified for this distribution.
The OxidizedPathEntryFinder Class¶
- class oxidized_importer.OxidizedPathEntryFinder¶
A path entry finder that can find resources contained in an associated OxidizedFinder instance.
Instances are created via OxidizedFinder.path_hook.
Direct use of OxidizedPathEntryFinder is generally unnecessary: OxidizedFinder is the primary interface to the custom importer.
See Paths Hooks Compatibility for more on path hook and path entry finder behavior in oxidized_importer.
- find_spec(fullname: str, target: Optional[types.ModuleType] = None) Optional[importlib.machinery.ModuleSpec] ¶
Search for modules visible to the instance.
- invalidate_caches() None ¶
Invoke the same method on the OxidizedFinder instance with which the OxidizedPathEntryFinder instance was constructed.
- iter_modules(prefix: str = '') List[pkgutil.ModuleInfo] ¶
Iterate over the visible modules. This method complies with pkgutil.iter_modules’s protocol.
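For readers unfamiliar with the pkgutil.iter_modules protocol mentioned for iter_modules(), the sketch below shows the shape of the values it yields. It uses the standard library's filesystem finder against a temporary directory rather than an OxidizedPathEntryFinder, so it only illustrates the ModuleInfo records (module_finder, name, ispkg), not oxidized_importer itself.

```python
import pkgutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # One plain module and one package.
    (root / "alpha.py").write_text("")
    (root / "beta").mkdir()
    (root / "beta" / "__init__.py").write_text("")

    # Each yielded item is a pkgutil.ModuleInfo named tuple.
    found = {info.name: info.ispkg for info in pkgutil.iter_modules([str(root)])}

print(found)
```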
The OxidizedPkgResourcesProvider Class¶
- class oxidized_importer.OxidizedPkgResourcesProvider¶
A pkg_resources.IMetadataProvider and pkg_resources.IResourceProvider enabling pkg_resources to access package metadata and resources.
All members of the aforementioned interfaces are implemented. Divergence from pkg_resources defined behavior is documented next to the method.
- run_script(script_name: str, namespace: Any)¶
Always raises NotImplementedError.
Please leave a comment in #384 if you would like this functionality implemented.
- get_resource_filename(manager, resource_name: str)¶
Always raises NotImplementedError.
This behavior appears to be allowed given code in pkg_resources. However, it means that pkg_resources.resource_filename() will not work. Please leave a comment in #383 if you would like this functionality implemented.
- get_resource_stream(manager, resource_name: str) io.BytesIO ¶
The OxidizedResource Class¶
- class oxidized_importer.OxidizedResource¶
Represents a resource that is indexed by an OxidizedFinder instance.
Each instance represents a named entity with associated metadata and data. For example, an instance can represent a Python module with associated source and bytecode.
New instances can be constructed via OxidizedResource(). This will return an instance whose name = "" and all properties will be None or False.
- is_module¶
A bool indicating if this resource is a Python module. Python modules are backed by source or bytecode.
- is_builtin_extension_module¶
A bool indicating if this resource is a Python extension module built-in to the Python interpreter.
- is_frozen_module¶
A bool indicating if this resource is a Python module whose bytecode is frozen into the Python interpreter.
- is_extension_module¶
A bool indicating if this resource is a Python extension module.
A bool indicating if this resource is a shared library.
- name¶
The str name of the resource.
- is_package¶
A bool indicating if this resource is a Python package.
- is_namespace_package¶
A bool indicating if this resource is a Python namespace package.
- in_memory_source¶
bytes or None holding Python module source code that should be imported from memory.
- in_memory_bytecode¶
bytes or None holding Python module bytecode that should be imported from memory.
This is raw Python bytecode, as produced from the marshal module. .pyc files have a header before this data that will need to be stripped should you want to move data from a .pyc file into this field.
- in_memory_bytecode_opt1¶
bytes or None holding Python module bytecode at optimization level 1 that should be imported from memory.
This is raw Python bytecode, as produced from the marshal module. .pyc files have a header before this data that will need to be stripped should you want to move data from a .pyc file into this field.
- in_memory_bytecode_opt2¶
bytes or None holding Python module bytecode at optimization level 2 that should be imported from memory.
This is raw Python bytecode, as produced from the marshal module. .pyc files have a header before this data that will need to be stripped should you want to move data from a .pyc file into this field.
bytes or None holding native machine code defining a Python extension module shared library that should be imported from memory.
- in_memory_package_resources¶
dict[str, bytes] or None holding resource files to make available to the importlib.resources APIs via in-memory data access. The name of this object will be a Python package name. Keys in this dict are virtual filenames under that package. Values are raw file data.
- in_memory_distribution_resources¶
dict[str, bytes] or None holding resource files to make available to the importlib.metadata API via in-memory data access. The name of this object will be a Python package name. Keys in this dict are virtual filenames. Values are raw file data.
bytes or None holding a shared library that should be imported from memory.
list[str] or None holding the names of shared libraries that this resource depends on. If this resource defines a loadable shared library, this list can be used to express what other shared libraries it depends on.
- relative_path_module_source¶
pathlib.Path or None holding the relative path to Python module source that should be imported from the filesystem.
- relative_path_module_bytecode¶
pathlib.Path or None holding the relative path to Python module bytecode that should be imported from the filesystem.
- relative_path_module_bytecode_opt1¶
pathlib.Path or None holding the relative path to Python module bytecode at optimization level 1 that should be imported from the filesystem.
- relative_path_module_bytecode_opt2¶
pathlib.Path or None holding the relative path to Python module bytecode at optimization level 2 that should be imported from the filesystem.
pathlib.Path or None holding the relative path to a Python extension module that should be imported from the filesystem.
- relative_path_package_resources¶
dict[str, pathlib.Path] or None holding resource files to make available to the importlib.resources APIs via filesystem access. The name of this object will be a Python package name. Keys in this dict are filenames under that package. Values are relative paths to files from which to read data.
- relative_path_distribution_resources¶
dict[str, pathlib.Path] or None holding resource files to make available to the importlib.metadata APIs via filesystem access. The name of this object will be a Python package name. Keys in this dict are filenames under that package. Values are relative paths to files from which to read data.
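The in_memory_bytecode fields above expect raw marshalled bytecode, so data taken from a .pyc file needs its header removed first. The following is a minimal sketch of that step using only the standard library. It assumes CPython 3.7 or newer, where the .pyc header is 16 bytes (PEP 552); the helper name raw_bytecode_from_pyc is hypothetical.

```python
import marshal
import py_compile
import tempfile
from pathlib import Path

# Assumption: CPython 3.7+ layout (4-byte magic, 4-byte flags, 8 bytes of
# mtime/size or source hash). Older interpreters use a shorter header.
PYC_HEADER_SIZE = 16


def raw_bytecode_from_pyc(pyc_path):
    """Return the marshalled code object stored after the .pyc header."""
    return Path(pyc_path).read_bytes()[PYC_HEADER_SIZE:]


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "demo.py"
    src.write_text("ANSWER = 6 * 7\n")
    pyc = py_compile.compile(str(src), cfile=str(Path(tmp) / "demo.pyc"), doraise=True)
    bytecode = raw_bytecode_from_pyc(pyc)

# The stripped data loads directly with marshal, which is the form that a
# field such as resource.in_memory_bytecode is documented to hold.
namespace = {}
exec(marshal.loads(bytecode), namespace)
print(namespace["ANSWER"])
```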
The OxidizedResourceCollector Class¶
- class oxidized_importer.OxidizedResourceCollector¶
Provides functionality for turning instances of Python resource types into a collection of OxidizedResource for loading into an OxidizedFinder instance.
- __new__(cls, allowed_locations: list[str])¶
Construct an instance by defining locations that resources can be loaded from.
The accepted string values are in-memory and filesystem-relative.
- allowed_locations¶
(list[str]) Exposes allowed locations where resources can be loaded from.
- add_in_memory_resource(resource)¶
Adds a Python resource type (PythonModuleSource, PythonModuleBytecode, etc) to the collector and marks it for loading via in-memory mechanisms.
- add_filesystem_relative(prefix, resource)¶
Adds a Python resource type (PythonModuleSource, PythonModuleBytecode, etc) to the collector and marks it for loading via a relative path next to some origin path (as specified to the OxidizedFinder). That relative path can have a prefix value prepended to it. If no prefix is desired and you want the resource placed next to the origin, use an empty str for prefix.
- oxidize() tuple[list[OxidizedResource], list[tuple[pathlib.Path, bytes, bool]]] ¶
Takes all the resources collected so far and turns them into data structures to facilitate later use.
The first element in the returned tuple is a list of OxidizedResource instances.
The second is a list of 3-tuples containing the relative filesystem path for a file, the content to write to that path, and whether the file should be marked as executable.
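One way to consume the second element returned by oxidize() is to materialize each 3-tuple under a destination directory, as in this minimal sketch. The function name install_files and the sample data are hypothetical; only the (relative path, content, executable flag) tuple shape comes from the description above.

```python
import stat
import tempfile
from pathlib import Path


def install_files(dest_dir, file_installs):
    """Write (relative_path, data, executable) tuples beneath dest_dir."""
    dest = Path(dest_dir)
    written = []
    for rel_path, data, executable in file_installs:
        target = dest / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        if executable:
            # Add execute permission for user, group, and others.
            mode = target.stat().st_mode
            target.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        written.append(target)
    return written


# In real use the list would come from: resources, file_installs = collector.oxidize()
with tempfile.TemporaryDirectory() as tmp:
    installs = [(Path("lib/pkg/__init__.py"), b"VALUE = 1\n", False)]
    written = install_files(tmp, installs)
    contents = written[0].read_bytes()
    relative = written[0].relative_to(tmp).as_posix()

print(relative, contents)
```

The destination directory would normally be the origin path (or a path beneath it) that the consuming OxidizedFinder uses to resolve relative paths.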
The OxidizedResourceReader Class¶
- class oxidized_importer.OxidizedResourceReader¶
An implementation of importlib.abc.ResourceReader to facilitate resource reading from an OxidizedFinder.
See Support for ResourceReader for more.
The OxidizedZipFinder Class¶
- class oxidized_importer.OxidizedZipFinder¶
A meta path finder that operates on zip files.
This type attempts to be a pure Rust reimplementation of the Python standard library zipimport.zipimporter type.
This type implements the following interfaces:
importlib.abc.MetaPathFinder
importlib.abc.Loader
importlib.abc.InspectLoader
- from_zip_data(cls, source: bytes, path: Union[bytes, str, pathlib.Path, None] = None) OxidizedZipFinder ¶
Construct an instance from zip archive data.
The source argument can be any bytes-like object. A reference to the original Python object will be kept and zip I/O will be performed against the memory tracked by that object. It is possible to trigger an out-of-bounds memory read if the source object is mutated after being passed into this function.
The path argument denotes the path to the zip archive. This path will be advertised in __file__ attributes. If not defined, the path of the current executable will be used.
- from_path(cls, path: Union[bytes, str, pathlib.Path]) OxidizedZipFinder ¶
Construct an instance from a filesystem path.
The path argument is the path to a file containing zip archive data. The file will be opened using Rust file I/O. The content of the file will be read lazily.
If you don’t already have a copy of the zip data and the zip file will be immutable for the lifetime of the constructed instance, this method may yield better performance than opening the file, reading its content, and calling OxidizedZipFinder.from_zip_data() because it may incur less overall I/O.
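Zip archive data suitable for the source argument of from_zip_data() can be produced in memory with the standard library, as in this minimal sketch. The helper build_module_zip is a hypothetical name, and the archive layout (one .py file per module, packages as directories) follows the usual zipimport convention rather than anything specific stated above.

```python
import io
import zipfile


def build_module_zip(modules):
    """Build zip archive bytes from a mapping of archive member name to source text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for member_name, source in modules.items():
            archive.writestr(member_name, source)
    # getvalue() returns immutable bytes, so the data cannot be mutated later.
    return buffer.getvalue()


zip_data = build_module_zip({
    "hello.py": "GREETING = 'hi'\n",
    "pkg/__init__.py": "",
})

# In real use: finder = OxidizedZipFinder.from_zip_data(zip_data)
members = zipfile.ZipFile(io.BytesIO(zip_data)).namelist()
print(members)
```

Passing an immutable bytes object sidesteps the mutation hazard noted for from_zip_data().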
The PythonModuleSource Class¶
- class oxidized_importer.PythonModuleSource¶
Represents Python module source code. e.g. a .py file.
- module¶
(str) The fully qualified Python module name. e.g. my_package.foo.
- source¶
(bytes) The source code of the Python module.
Note that source code is stored as bytes, not str. Most Python source is stored as utf-8, so you can .encode("utf-8") or .decode("utf-8") to convert between bytes and str.
- is_package¶
(bool) Whether this module is a Python package.
The PythonModuleBytecode Class¶
- class oxidized_importer.PythonModuleBytecode¶
Represents Python module bytecode. e.g. what a .pyc file holds (but without the header that a .pyc file has).
- module¶
(str) The fully qualified Python module name.
- bytecode¶
(bytes) The bytecode of the Python module.
This is what you would get by compiling Python source code via something like marshal.dumps(compile(source, filename, "exec")). The bytecode does not contain a header, like what would be found in a .pyc file.
- optimize_level¶
(int) The bytecode optimization level. Either 0, 1, or 2.
- is_package¶
(bool) Whether this module is a Python package.
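To make the bytecode and optimize_level attributes concrete, the sketch below produces header-less bytecode at each optimization level using the built-in compile() and marshal. The helper names compile_to_bytecode and has_docstring are hypothetical; the optimize keyword of compile() is standard Python.

```python
import marshal

SOURCE = '''
def greet():
    "Say hello."
    return "hello"

assert greet() == "hello"
HAS_DOC = greet.__doc__ is not None
'''


def compile_to_bytecode(source, filename, optimize=0):
    """Return raw marshalled bytecode (no .pyc header) at the given level."""
    code = compile(source, filename, "exec", optimize=optimize)
    return marshal.dumps(code)


def has_docstring(bytecode):
    """Execute the bytecode and report whether greet() kept its docstring."""
    namespace = {}
    exec(marshal.loads(bytecode), namespace)
    return namespace["HAS_DOC"]


# Level 1 drops assert statements; level 2 additionally drops docstrings.
levels = {level: compile_to_bytecode(SOURCE, "demo.py", level) for level in (0, 1, 2)}
docstrings = {level: has_docstring(data) for level, data in levels.items()}
print(docstrings)
```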
The PythonPackageResource Class¶
- class oxidized_importer.PythonPackageResource¶
Represents a non-module resource file. These are files that live next to Python modules that are typically accessed via the APIs in importlib.resources.
- package¶
(str) The name of the leaf-most Python package this resource is associated with.
With OxidizedFinder, an importlib.abc.ResourceReader associated with this package will be used to load the resource.
- name¶
(str) The name of the resource within its package. This is typically the filename of the resource. e.g. resource.txt or child/foo.png.
- data¶
(bytes) The raw binary content of the resource.
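The name and data attributes correspond naturally to a mapping from resource name to file content. The sketch below gathers such a mapping from a package directory on disk using only the standard library. The helper collect_package_resources is hypothetical, and it makes a simplifying assumption: every non-module file beneath the directory is attributed to that one package, even where a subdirectory is itself a package.

```python
import tempfile
from pathlib import Path


def collect_package_resources(package_dir):
    """Map POSIX-style resource names (relative to the package) to file bytes."""
    root = Path(package_dir)
    resources = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if not path.is_file() or "__pycache__" in relative.parts:
            continue
        if path.suffix in {".py", ".pyc"}:
            continue  # modules are not package resources
        resources[relative.as_posix()] = path.read_bytes()
    return resources


with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "pkg"
    (pkg / "child").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "resource.txt").write_bytes(b"hello")
    (pkg / "child" / "foo.png").write_bytes(b"\x89PNG")
    resources = collect_package_resources(pkg)

print(sorted(resources))
```

A dict of this shape also matches the dict[str, bytes] form documented for OxidizedResource.in_memory_package_resources.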
The PythonPackageDistributionResource Class¶
- class oxidized_importer.PythonPackageDistributionResource¶
Represents a non-module resource file living in a package distribution directory (e.g. <package>-<version>.dist-info or <package>-<version>.egg-info).
These resources are typically accessed via the APIs in importlib.metadata.
- package¶
(str) The name of the Python package this resource is associated with.
- version¶
(str) Version string of Python package this resource is associated with.
- name¶
(str) The name of the resource within the metadata distribution. This is typically the filename of the resource. e.g. METADATA.
- data¶
(bytes) The raw binary content of the resource.
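When data holds the content of a METADATA file, the package name and version can be read back out of it, since core metadata uses an email-header style format. The following minimal sketch does this with the standard library's email parser; the helper parse_name_and_version and the sample bytes are hypothetical.

```python
from email import message_from_bytes


def parse_name_and_version(metadata_bytes):
    """Extract the Name and Version headers from METADATA file content."""
    message = message_from_bytes(metadata_bytes)
    return message["Name"], message["Version"]


# Sample content for illustration only.
METADATA = (
    b"Metadata-Version: 2.1\n"
    b"Name: example-package\n"
    b"Version: 1.2.3\n"
    b"\n"
    b"Long description goes here.\n"
)

name, version = parse_name_and_version(METADATA)
print(name, version)
```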
The PythonExtensionModule Class¶
- class oxidized_importer.PythonExtensionModule¶
Represents a Python extension module. This is a shared library defining a Python extension implemented in native machine code that can be loaded into a process and defines a Python module. Extension modules are typically defined by .so, .dylib, or .pyd files.
Note
Properties of this type are read-only.
Footnotes