Technical Notes¶
CPython Initialization¶
Most code lives in pylifecycle.c
.
Call tree with Python 3.7:
``Py_Initialize()``
``Py_InitializeEx()``
``_Py_InitializeFromConfig(_PyCoreConfig config)``
``_Py_InitializeCore(PyInterpreterState, _PyCoreConfig)``
Sets up allocators.
``_Py_InitializeCore_impl(PyInterpreterState, _PyCoreConfig)``
Does most of the initialization.
Runtime, new interpreter state, thread state, GIL, built-in types,
Initializes sys module and sets up sys.modules.
Initializes builtins module.
``_PyImport_Init()``
Copies ``interp->builtins`` to ``interp->builtins_copy``.
``_PyImportHooks_Init()``
Sets up ``sys.meta_path``, ``sys.path_importer_cache``,
``sys.path_hooks`` to empty data structures.
``initimport()``
``PyImport_ImportFrozenModule("_frozen_importlib")``
``PyImport_AddModule("_frozen_importlib")``
``interp->importlib = importlib``
``interp->import_func = interp->builtins.__import__``
``PyInit__imp()``
Initializes ``_imp`` module, which is implemented in C.
``sys.modules["_imp"} = imp``
``importlib._install(sys, _imp)``
``_PyImportZip_Init()``
``_Py_InitializeMainInterpreter(interp, _PyMainInterpreterConfig)``
``_PySys_EndInit()``
``sys.path = XXX``
``sys.executable = XXX``
``sys.prefix = XXX``
``sys.base_prefix = XXX``
``sys.exec_prefix = XXX``
``sys.base_exec_prefix = XXX``
``sys.argv = XXX``
``sys.warnoptions = XXX``
``sys._xoptions = XXX``
``sys.flags = XXX``
``sys.dont_write_bytecode = XXX``
``initexternalimport()``
``interp->importlib._install_external_importers()``
``initfsencoding()``
``_PyCodec_Lookup(Py_FilesystemDefaultEncoding)``
``_PyCodecRegistry_Init()``
``interp->codec_search_path = []``
``interp->codec_search_cache = {}``
``interp->codec_error_registry = {}``
# This is the first non-frozen import during startup.
``PyImport_ImportModuleNoBlock("encodings")``
``interp->codec_search_cache[codec_name]``
``for p in interp->codec_search_path: p[codec_name]``
``initsigs()``
``add_main_module()``
``PyImport_AddModule("__main__")``
``init_sys_streams()``
``PyImport_ImportModule("encodings.utf_8")``
``PyImport_ImportModule("encodings.latin_1")``
``PyImport_ImportModule("io")``
Consults ``PYTHONIOENCODING`` and gets encoding and error mode.
Sets up ``sys.__stdin__``, ``sys.__stdout__``, ``sys.__stderr__``.
Sets warning options.
Sets ``_PyRuntime.initialized``, which is what ``Py_IsInitialized()``
returns.
``initsite()``
``PyImport_ImportModule("site")``
CPython Importing Mechanism¶
Lib/importlib
defines importing mechanisms and is 100% Python.
Programs/_freeze_importlib.c
is a program that takes a path to an input
.py
file and path to output .h
file. It initializes a Python interpreter
and compiles the .py
file to marshalled bytecode. It writes out a .h
file with an inline const unsigned char _Py_M__importlib
array containing
bytecode.
Lib/importlib/_bootstrap_external.py
compiled to
Python/importlib_external.h
with _Py_M__importlib_external[]
.
Lib/importlib/_bootstrap.py
compiled to
Python/importlib.h
with _Py_M__importlib[]
.
Python/frozen.c
has _PyImport_FrozenModules[]
effectively mapping
_frozen_importlib
to importlib._bootstrap
and
_frozen_importlib_external
to importlib._bootstrap_external
.
initimport()
calls PyImport_ImportFrozenModule("_frozen_importlib")
,
effectively import importlib._bootstrap
. Module import doesn’t appear
to have meaningful side-effects.
importlib._bootstrap.__import__
is installed as interp->import_func
.
C implemented _imp
module is initialized.
importlib._bootstrap._install(sys, _imp
is called. Calls
_setup(sys, _imp)
and adds BuiltinImporter
and FrozenImporter
to sys.meta_path
.
_setup()
defines globals _imp
and sys
. Populates __name__
,
__loader__
, __package__
, __spec__
, __path__
, __file__
,
__cached__
on all sys.modules
entries. Also loads builtins
_thread
, _warnings
, and _weakref
.
Later during interpreter initialization, initexternal()
effectively calls
importlib._bootstrap._install_external_importers()
. This runs
import _frozen_importlib_external
, which is effectively
import importlib._bootstrap_external
. This module handle is aliased to
importlib._bootstrap._bootstrap_external
.
importlib._bootstrap_external
import doesn’t appear to have significant
side-effects.
importlib._bootstrap_external._install()
is called with a reference to
importlib._bootstrap
. _setup()
is called.
importlib._bootstrap._setup()
imports builtins _io
, _warnings
,
_builtins
, marshal
. Either posix
or nt
imported depending
on OS. Various module-level attributes set defining run-time environment.
This includes _winreg
. SOURCE_SUFFIXES
and EXTENSION_SUFFIXES
are updated accordingly.
importlib._bootstrap._get_supported_file_loaders()
returns various
loaders. ExtensionFileLoader
configured from _imp.extension_suffixes()
.
SourceFileLoader
configured from SOURCE_SUFFIXES
.
SourcelessFileLoader
configured from BYTECODE_SUFFIXES
.
FileFinder.path_hook()
called with all loaders and result added to
sys.path_hooks
. PathFinder
added to sys.meta_path
.
sys.modules
After Interpreter Init¶
Module |
Type |
Source |
---|---|---|
|
|
|
|
builtin |
|
|
builtin |
|
|
frozen |
|
|
frozen |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
|
py |
|
|
builtin |
|
|
py |
|
|
py |
|
|
py |
|
|
py |
|
|
py |
|
|
py |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
|
builtin |
|
Modules Imported by site.py
¶
_collections_abc
_sitebuiltins
_stat
atexit
genericpath
os
os.path
posixpath
rlcompleter
site
stat
Random Notes¶
Frozen importer iterates an array looking for module names. On each item, it
calls _PyUnicode_EqualToASCIIString()
, which verifies the search name is
ASCII. Performing an O(n) scan for every frozen module if there are a large
number of frozen modules could contribute performance overhead. A better frozen
importer would use a map/hash/dict for lookups. This //may// require CPython
API breakages, as the PyImport_FrozenModules
data structure is documented
as part of the public API and its value could be updated dynamically at
run-time.
importlib._bootstrap
cannot call import
because the global import
hook isn’t registered until after initimport()
.
importlib._bootstrap_external
is the best place to monkeypatch because
of the limited run-time functionality available during importlib._bootstrap
.
It’s a bit wonky that Py_Initialize()
will import modules from the
standard library and it doesn’t appear possible to disable this. If
site.py
is disabled, non-extension builtins are limited to
codecs
, encodings
, abc
, and whatever encodings.*
modules
are needed by initfsencoding()
and init_sys_streams()
.
An attempt was made to freeze the set of standard library modules loaded
during initialization. However, the built-in extension importer doesn’t
set all of the module attributes that are expected of the modules system.
The from . import aliases
in encodings/__init__.py
is confused
without these attributes. And relative imports seemed to have issues as
well. One would think it would be possible to run an embedded interpreter
with all standard library modules frozen, but this doesn’t work.