I have a full time job as a software engineer and my open source work is effectively a side job. (Albeit one I try very hard to not let intersect with my day job.)
Historically, my biggest contributions to my open source projects have come when I'm not working full time.
When working full time, my time to contribute to open source has been carved out of weekday nights and weekends, especially in the winter months. I believe that code is an art form and programming a form of creative expression. My open source contributions provide a relaxing avenue for me to express my artistic creativity, when able.
My open source contributions reflect my personal priorities of where and what to spend my free time on.
The only constant in life is change.
In the middle of 2022, I switched job roles and found myself reinvigorated by my new role - Infrastructure Performance - which is at the intersection of some of my strongest technical and professional skills. I found myself willingly pouring more energy and time into my day job. That had the side effect of reducing my open source contributions.
In 2023Q1 I got married. In the months leading up to and after, I chose to prioritize spending time with my now wife and all the time commitments that entails. This also reduced the amount of time available for open source contributions.
In 2023Q4 I became a father to a beautiful baby girl. While on my employer's generous-for-the-United-States fourteen week paternity leave, I somehow found some time to contribute to open source. As refreshing as that was, it didn't last. My man cave where my desktop computer resides has been converted into a nursery. And for the past few months it has been occupied by my mother-in-law, who has generously been serving as what is effectively a live-in nanny. Even when I'm able to sit down at my desktop, it's hard to get into a state of flow due to the added entropy from the additional three people now living with me.
After realizing the new normal in 2024Q1, I purchased a Wahoo KICKR MOVE bicycle trainer and now spend considerable time doing virtual bicycle rides on Zwift because it's one of the few leisure activities I can do at home without drawing scrutiny from my wife and mother-in-law (but 98% my mother-in-law because I've observed that my wife is effectively infallible). I now get excited about virtually summiting famous climbs instead of contributing to open source. (Today's was Mont Ventoux - an absolute beast of a climb that reminded me a lot of my real world ride up Pike's Peak in 2020.)
Various changes in the past eighteen or so months have created additional time constraints and prioritization changes that have resulted in my open source contributions withering.
In addition, my technical interests have been shifting.
I've always gravitated to more systems-level areas of computers. My degree is in Computer Engineering and I have a stereotypical engineer mindset: I have an insatiable curiosity about how things work and interact and I want to always be tinkering. I prefer to be closer to hardware instead of abstracted far away from it. I enjoy interacting with the building blocks of software ecosystems: operating systems, filesystems, runtimes, file formats, compilers, etc.
Historically, my open source contributions to my preferred areas of computing were limited. Again, to me open source is an enjoyable form of creative expression. That means I do it for fun. Historically, the systems-level programming space was limited to languages like C and C++, which I consider frustrating and painful to use. If I'm going to subject myself to misery when programming, you are going to have to pay me well to do it.
As part of creating PyOxidizer, I learned Rust.
When I became proficient in Rust, I realized that Rust unlocks all kinds of systems-level problems that were effectively off-limits for my open source contributions. Would I implement Debian packaging primitives in Python? Or a tool to bulk analyze Linux packages and peek inside ELF binaries for insights about what compiler/linker features are used in the wild in Python/C/C++? Not unless you pay me to do it!
As I learned Rust, I also found myself being drawn away from Python, my prior go-to language. As I wrote in Rust is for Professionals, Rust feels surprisingly high level. It isn't as terse as Python but it is a lot closer than I thought it would be. And Rust gives you vastly stronger compile-time guarantees and run-time performance than Python. I felt like Rust's tooling ecosystem was supporting me instead of standing in my way. I felt that when you consider the overall software development lifecycle - not just the edit-build-run loop that people tend to fixate on, likely because it is the easiest to measure - Rust was vastly more productive and a joy to work with than Python. All those countless hours debugging, fixing, and authoring tests for TypeError and ValueError Python exceptions you see in production just don't happen with Rust and that time can be better spent iterating on core functionality, which is what actually matters.
On top of the Rust undercurrents, I've also become somewhat disenchanted with the Python ecosystem. As I wrote in 2020's Mercurial's Journey to and Reflections on Python 3, the Python 3 transition was bungled and resulted in years - if not a full decade - of lost opportunity. As I wrote in 2023's My User Experience Porting Off setup.py, the Python packaging story feels as discombobulated and frustrating as ever. PyOxidizer additionally brushed up against several limitations in how Python is designed and implemented, many of which are not trivially fixable. As a systems-level guy, I am frequently questioning various aspects of the Python ecosystem which I have contrasting opinions on, including the importance of correctness and performance.
Starting in 2021, I started gravitating towards writing more Rust code and solving problems in the systems domain that were previously off-limits to me, like Apple code signing. Initially the work was in support of PyOxidizer: I was going to implement all these packaging primitives in pure Rust and enable people to distribute Python applications without requiring access to a Windows or macOS machine! Over time, this work consumed me. Apple code signing turned into a major time sink because of its complexity and the fact I was having to reverse engineer a lot of its internals. But I was having a ton of fun doing it: more fun than swimming upstream against decades of encrusted technical debts in the Python ecosystem.
By late 2021, I realized I had made a series of mistakes with PyOxidizer.
I started PyOxidizer as a science experiment to see if it was possible to achieve a single file executable Python application without requiring a temporary filesystem at run-time. I succeeded. But the cost was compatibility with the larger pre-built Python package ecosystem. I built all this complexity into PyOxidizer to allow people to tweak how Python resources are packaged so they could choose to build a single file application if they wanted. This ballooned into a hot mess and was obviously not user-friendly. It violated various personal principles about optimizing for end-user experience.
Armed with knowledge of all the pitfalls, I realized that there was a 90% use case for Python application packaging that was simple for end users and technically achievable using all the code primitives - like the pyembed Rust crate - that I built out for PyOxidizer.
Thus the PyOxy project was born and released in May 2022.
While I believe PyOxy is already a generally useful primitive to have in the Python ecosystem, I had bigger goals in mind.
My intent with PyOxy was to build in a simplified and opinionated PyOxidizer lite mode. The pyoxy executable is already a chameleon: if you rename it to python it behaves like a python executable. I wanted to extend this so you could do something like pyoxy build-app and it would collect all dependencies, assemble a Python packed resources blob, and embed that in a copy of the pyoxy binary as an ELF, Mach-O, or PE segment. Then at run-time, the variant executable binary would load the application configuration and Python resources metadata from its own binary and execute the application. Essentially, PyOxy would evolve into a self-packaging Python application. I just needed to evolve the Python packed resources format, implement a very crude ELF, Mach-O, and PE linker to append resources data to an executable, and teach pyembed to read resources data from an ELF, Mach-O, or PE segment. All within my sphere of technical competency. And I was excited to build it and forever alter people's perceptions of how easy it could be to produce a distributable Python application.
Then the roller coaster of my personal life took over. I felt newly invigorated with a new job role. I got engaged and married. I became a father.
By early 2023, it was clear my ability to contribute to open source would be vastly diminished for the foreseeable future. PyOxidizer and PyOxy fell into a state of neglect. Weeks went by without me even tinkering on my local computer, much less push commits or publish a release. Weeks turned into months. Months into quarters. At this point, I haven't pushed a commit to indygreg/PyOxidizer since January 2023. And I'm not sure when I next will, if ever.
In my limited open source contribution time, I've prioritized other projects over PyOxidizer.
python-build-standalone has gained a life outside PyOxidizer. It is now used by rye, Bazel's rules_python, briefcase, and a myriad of other consumers. The release assets have been downloaded over 23 million times and the download rate appears to be accelerating. I still actively support python-build-standalone and intend for the project to be actively supported for the indefinite future: it has become too important to abandon. I'm actively recruiting assistance to help maintain the project and I'm not concerned about its future.
Apple code signing still actively draws my engagement. What I love about the project is it either works or it doesn't: there are limited extra features we can add to it since Apple mostly dictates the feature set. And I perceive the current project to be mostly done.
python-zstandard is downloaded ~8 million times per month. The project is long overdue for some modernization. I'm sitting on a pile of commits to improve it, but progress has been slow. I just learned this weekend that the maintainer of the other popular zstandard Python package deleted their GitHub account recently and now users are looking to onboard to my package. Nothing quite like unanticipated distractions!
That's a very long-winded way of saying that PyOxidizer and all the projects under its umbrella are effectively in a zombie state. I'm hesitant to say dead because if I suddenly found myself with lots of free time I'd love to brush off the cobwebs and bring the projects back to life. But who am I kidding: they are effectively dead at the moment because with everything happening in my personal life, I don't see where I find the time to resuscitate the project. And that assumes I even want to: again, I've become somewhat disenchanted by the state of Python. The main thing that draws me to it is the size of the community and the potential for impact. But to realize that impact I feel like I'd be pushing Python in directions it isn't well-equipped to go in. Quite frankly - and, yes, selfishly - I don't want to subject myself to the misery unless I'm being well paid to do it. Again, I view my open source contributions as a fun outlet for my creative expression and nudging Python packaging in directions it is obviously ill-equipped to go in just isn't fun.
If anyone reading has an interest in taking ownership or maintenance responsibilities of PyOxidizer, any projects under its umbrella, or any of my other open source projects, I'm receptive to proposals. Send me an email or create an issue or discussion on GitHub if you want to do it publicly.
But I'm going to assume that PyOxidizer is going to wither and die - or at least incur some massive backwards incompatible breaks if it continues to live. I've already filed issues against python-build-standalone - such as removing Windows static builds - to make the project easier to support and less work for future maintainers.
If I have one regret about how this has played out, it is my failure to communicate developments in my open source commitments / expectations in a timely manner. I knew the future was bleak in early 2023 but didn't publicly say anything. I still thought there was a chance that things were going to change and I didn't want to make a hard decision prematurely. Writing this post has been on my mind since the middle of 2023 but I just couldn't bring myself to write it. And - surprise - having a newborn at home is a giant time and mental commitment! I'm writing this now because people are (finally!) noticing my lack of contributions to PyOxidizer and asking questions. And I'm home alone for a few days and actually have time to sit down and compose this post. (Yes, I'm that stretched for time in my personal life.)
In 2023, I struggled with the idea of letting people down by declaring PyOxidizer dead. But when I wake up every morning, walk into the nursery, and cause my daughter to smile and flail her arms and legs with unbridled excitement when she sees me, I'd have it no other way. When it comes to choosing between open source and family, I choose family.
It feels appropriate to end this post with a link to XKCD 2347: Dependency. But I'm not the random person in Nebraska: I'm a husband and father.
This blog post is purposefully verbose and contains a very lightly edited stream of my mental thoughts. Think of it as a self-assessed user experience study of Python packaging.
I'm no stranger to the Python ecosystem or Python packaging. I've been programming Python for 10+ years. I've even authored a Python application packaging tool, PyOxidizer.
When programming, I strive to understand how things work. I try to not blindly copy-paste or cargo cult patterns unless I understand how they work. This means I often scope bloat myself and slow down velocity in the short term. But I justify this practice because I find it often pays dividends in the long term because I actually understand how things work.
I also have a passion for security and supply chain robustness. After you've helped maintain complex CI systems for multiple companies, you learn the hard way that it is important to do things like transitively pin dependencies and reduce surface area for failures so that build automation breaks in reaction to code changes in your version control, not spooky-action-at-a-distance when state on a third party server changes (e.g. a new package version is uploaded).
I've been aware of the emergence of pyproject.toml. But I've largely sat on the sidelines and held off adopting it, mainly for "if it isn't broken, don't fix it" reasons. Plus, my perception has been that the tooling still hasn't stabilized: I'm not going to incur work now if it is going to invite churn that could be avoided by sitting on my hands a little longer.
Now, on to my user experience of adding Python 3.12 to python-zstandard and the epic packaging yak shave that entailed.
When I attempted to run CI against Python 3.12 on GitHub Actions, running python setup.py complained that setuptools couldn't be imported.
Huh? I thought setuptools was installed in pretty much every Python distribution by default? It was certainly installed in all previous Python versions by the actions/setup-python GitHub Action. I was aware distutils was removed from the Python 3.12 standard library. But setuptools and distutils are not the same! Why did setuptools disappear?
I look at the CI logs for the passing Python 3.11 job and notice a message:
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
I had several immediate reactions:
pyproject.toml and moving away from python setup.py. Maybe the missing setuptools in the 3.12 CI environment is a side-effect of this policy shift?
pypa/build and pypa/installer? I've never heard of them. I know pypa is the Python Packaging Authority (I suspect most Python developers don't know this). Are these GitHub org/repo identifiers?
python setup.py
I open https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html in my browser and see a 4,000+ word blog post. Oof. Do I really want/need to read this? Fortunately, the author included a tl;dr and linked to a summary section telling me a lot of useful information! It informs me (my commentary in parentheses):
pyproject.toml.)
python setup.py solution for 10+ years, this workflow is now quasi deprecated, and the recommended replacement is "it's complicated"?! I'm just trying to get my package modernized. Why does that need to be complicated?)
Then I look at the table mapping old ways to new ways. In the new column, it references the following tools: build, pytest, tox, nox, pip, and twine. That's quite the tooling salad! (And that build tool must be the pypa/build referenced in the setuptools warning message. One mystery solved!)
I scroll back to the top of the article and notice the date: October 2021. Two years old. The summary section also mentioned that there's been a lot of activity around packaging tooling occurring. So now I'm wondering if this blog post is outdated. Either way, it is clear I have to perform some additional research to figure out how to migrate off python setup.py so I can be compliant with the new world order.
pyproject.toml and Build Systems
I had pre-existing knowledge of pyproject.toml as the modern way to define build system metadata. So I decide to start my research by Googling pyproject.toml. The first results are:
I click pip's documentation first because pip is known to me and it seems a canonical source. Pip's documentation proceeds to link to PEP-518, PEP-517, PEP-621, and PEP-660 before telling me how projects with pyproject.toml are built, without giving me - a package maintainer - much useful advice for what to do or how to port from setup.py. This seems like a dead end.
Then I look at the Stack Overflow link. Again, telling me a lot of what I don't really care about. (I've somewhat lost faith in Stack Overflow and only really skimmed this page: I would much prefer to get an answer from a first party source.)
I click on the Poetry link. It documents TOML fields. But only for the [tool.poetry] section. While I've heard about Poetry, I know that I probably don't want to scope bloat myself to learn how Poetry works so I can use it. (No offence meant to the Poetry project here but I don't perceive my project as needing whatever features Poetry provides: I'm just trying to publish a simple library package.) I go back to the search results.
I click on the setuptools link. I'm using setuptools via setup.py so this content looks promising! It gives me a nice example TOML of how to configure a [build-system] and [project] metadata. It links to PyPA's Declaring project metadata content, which I open in a new tab, as the content seems useful. I continue reading setuptools documentation. I land on its Quickstart documentation, which seems useful. I start reading it and it links to the build tool documentation. That's the second link to the build tool. So I open that in a new tab.
At this point, I think I have all the documentation on pyproject.toml. But I'm still trying to figure out what to replace python setup.py with. The build tool certainly seems like a contender since I've seen multiple references to it. But I'm still looking for modern, actively maintained documentation pointing me in a blessed direction.
The next Google link is A Practical Guide to Setuptools and Pyproject.toml. I start reading that. I'm immediately confused because it is recommending I put setuptools metadata in setup.cfg files. But I just read all about defining this metadata in pyproject.toml files in setuptools' own documentation! Is this blog post out of date? March 12, 2022. Seems pretty modern. I look at the setuptools documentation again and see the pyproject.toml metadata pieces are in version 61.0.0 and newer. I go to https://github.com/pypa/setuptools/releases/tag/v61.0.0 and see version 61.0.0 was released on March 25, 2022. So the fifth Google link was seemingly obsoleted 13 days after it was published. Good times. I pretend I never read this content because it seems out of date.
The next Google link is https://towardsdatascience.com/pyproject-python-9df8cc092f61. I click through. But Medium wants me to log in to read it all and it is unclear it is going to tell me anything important, so I back out.
build Tool
I give up on Google for the moment and start reading up on the build tool from its docs.
The only usage documentation for the build tool is on its root documentation page. And that documentation basically prints what python -m build --help would print: it says what the tool does but doesn't give any guidance on where I should be using it or how to replace existing tools (like python setup.py invocations).
Yes, I can piece the parts together and figure out that python -m build can be used as a replacement for python setup.py sdist and python setup.py bdist_wheel (and maybe pip wheel?). But should it be the replacement I choose? I make use of python setup.py develop and the aforementioned blog post recommended replacing that with python -m pip install -e. Perhaps I can use pip as the singular replacement for building source distributions and binary wheels so I have N-1 packaging tools? I keep researching.
I had previously opened https://packaging.python.org/en/latest/specifications/declaring-project-metadata/ in a browser tab without really looking at it. On second glance, I see it is part of a broader Python Packaging User Guide. Oh, this looks promising! A guide on how to do what I'm seeking maintained by the Python Packaging Authority (PyPA), the group who I know to be the, well, authorities on Python packaging. It is published under the canonical python.org domain. Surely the answer will be here.
I immediately click on the link to Packaging Python Projects to hopefully see what the PyPA folks are recommending.
I skim through. I see recommendations to use a pyproject.toml with a [build-system] to define the build backend. This matches my expectations. But they are using Hatchling as their build backend. Another tool I don't really know about. I click through some inline links and eventually arrive at https://github.com/pypa/hatch. (I'm kind of confused why the PyPA tutorial said Hatchling when the project and tool is apparently named Hatch. But whatever.)
I skim Hatch's GitHub README. It looks like a unified packaging tool. Build system. Package uploading/publishing. Environment management (sounds like a virtualenv alternative?). This tool actually seems quite nice! I start skimming the docs. Like Poetry, it seems like this is yet another new tool that I'd need to learn and would require me to blow up my existing setup.py in order to adopt. Do I really want to put in that effort? I'm just trying to get python-zstandard back on the paved road and avoid seemingly deprecated workflows: I'm not looking to adopt new tooling stacks.
I'm also further confused by the existence of Hatch under the PyPA GitHub Organization. That's the same GitHub organization hosting the Python packaging tools that are known to me, namely build, pip, and setuptools. Those three projects are pinned repositories. (The other three pinned repositories are virtualenv, wheel, and twine.) Hatch is seemingly a replacement for pip, setuptools, virtualenv, twine, and possibly other tools. But it isn't a pinned repository. Yet it is the default tool used in the PyPA maintained Packaging Python Projects guide. (That guide also suggests using other tools like setuptools, flit, and pdm. But the default is Hatch and that has me asking questions. Also, I didn't initially notice that Creating pyproject.toml has multiple tabs for different backends.)
While Hatch looks interesting, I'm just not getting a strong signal that Hatch is sufficiently stable or warrants my time investment to switch to. So I go back to reading the Python Packaging User Guide.
As I click around the User Guide, it is clear the PyPA folks really want me to use pyproject.toml for packaging. I suppose that's the future and that's a fair ask. But I'm still confused how I should migrate my setup.py to it. What are the risks with replacing my setup.py with pyproject.toml? Could I break someone installing my package on an old Linux distribution or old virtualenv using an older version of setuptools or pip? Will my adoption of build, hatch, poetry, whatever constitute a one way door where I lock out users in older environments? My package is downloaded over one million times per month and if I break packaging someone is likely to complain.
I'm desperately looking for guidance from the PyPA at https://packaging.python.org/ on how to manage this migration. But I just... can't find it. Guides surprisingly has nothing on the topic.
Finally I find Tool recommendations in the PyPA User Guide. Under Packaging tool recommendations it says:
Finally, some canonical documentation from the PyPA that comes out and suggests what to use!
But my relief immediately turns to questioning whether this tooling recommendations documentation is up to date:
[build-system] backend? The existence of define seemingly implies using setup.py or setup.cfg to define metadata. But I thought these distutils/setuptools specific mechanisms were deprecated in favor of the more generic pyproject.toml?
distutils as if it is still a modern practice. No mention that it was removed from the standard library in Python 3.12.
build tool is referenced and that tool is relatively new. So the docs have to be somewhat up-to-date, right?
Sadly, I reach the conclusion that this Tool recommendations documentation is inconsistent with newer documentation and can't be trusted. But it did mention the build tool and we now have multiple independent sources steering me in the direction of the build tool (at least for source distribution and wheel building), so it seems like we have a winner on our hands.
build
So let's use the build tool. I remember docs saying to invoke it with python -m build, so I try that:
$ python3.12 -m build --help
No module named build.__main__; 'build' is a package and cannot be directly executed
So the build package exists but it doesn't have a __main__. Ummm.
$ python3.12R
Python 3.12.0 (main, Oct 23 2023, 19:58:35) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import build
>>> build.__spec__
ModuleSpec(name='build', loader=<_frozen_importlib_external.NamespaceLoader object at 0x10d403bc0>, submodule_search_locations=_NamespacePath(['/Users/gps/src/python-zstandard/build']))
Oh, it picked up the build directory from my source checkout because sys.path has the current directory by default. Good times.
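The shadowing above can be reproduced in isolation. What follows is a minimal sketch, not anything taken from the build tool itself: it assumes a throwaway temporary directory standing in for the source checkout, and it asks the standard library's path-based finder to resolve the name build against only that directory. A directory with no __init__.py resolves as a namespace package, which is why the import succeeds but there is no __main__ to execute.

```python
import importlib.machinery
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as checkout:
    # Hypothetical source checkout holding a leftover `build/` output directory.
    (Path(checkout) / "build").mkdir()

    # Resolve `build` as if `checkout` were the only entry on sys.path
    # (the role the current directory plays in the session above).
    spec = importlib.machinery.PathFinder.find_spec("build", [checkout])

    # A directory without __init__.py resolves as a namespace package: it has
    # search locations but no file origin, and no __main__ for `-m` to run.
    is_namespace_package = spec is not None and spec.origin is None
    search_locations = list(spec.submodule_search_locations)

print(is_namespace_package)
print(search_locations)
```

Running the tool from a directory that has no build/ subdirectory, as in the next command, sidesteps the shadowing.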
$ (cd ~ && python3.12 -m build)
/Users/gps/.pyenv/versions/3.12.0/bin/python3.12: No module named build
I guess build isn't installed in my Python distribution / environment. You used to be able to build packages using just the Python standard library. I guess this battery is no longer included in the stdlib. I shrug and continue.
build
I go to the Build installation docs. It says to pip install build. (I thought I read years ago that one should use python3 -m pip to invoke pip. Strange that a PyPA maintained tool is telling me to invoke pip directly since I'm pretty sure a lot of the reasons to use python -m to invoke tools are still valid. But I digress.)
I follow the instructions, installing it to the global site-packages because I figure I'll use this tool a lot and I'm not a virtual environment purist:
$ python3.12 -m pip install build
Collecting build
  Obtaining dependency information for build from https://files.pythonhosted.org/packages/93/dd/b464b728b866aaa62785a609e0dd8c72201d62c5f7c53e7c20f4dceb085f/build-1.0.3-py3-none-any.whl.metadata
  Downloading build-1.0.3-py3-none-any.whl.metadata (4.2 kB)
Collecting packaging>=19.0 (from build)
  Obtaining dependency information for packaging>=19.0 from https://files.pythonhosted.org/packages/ec/1a/610693ac4ee14fcdf2d9bf3c493370e4f2ef7ae2e19217d7a237ff42367d/packaging-23.2-py3-none-any.whl.metadata
  Downloading packaging-23.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pyproject_hooks (from build)
  Using cached pyproject_hooks-1.0.0-py3-none-any.whl (9.3 kB)
Using cached build-1.0.3-py3-none-any.whl (18 kB)
Using cached packaging-23.2-py3-none-any.whl (53 kB)
Installing collected packages: pyproject_hooks, packaging, build
Successfully installed build-1.0.3 packaging-23.2 pyproject_hooks-1.0.0
That downloads and installs wheels for build, packaging, and pyproject_hooks.
At this point the security aware part of my brain is screaming because we didn't pin versions or SHA-256 digests of any of these packages anywhere. So if a malicious version of any of these packages is somehow uploaded to PyPI that's going to be a nightmare software supply chain vulnerability having similar industry impact as log4shell. Nowhere in build's documentation does it mention this or say how to securely install build. I suppose you have to just know about the supply chain gotchas with pip install in order to mitigate this risk for yourself.
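For reference, pip does offer a hash-checking mode (--require-hashes, with --hash=sha256:... entries in a requirements file) that refuses artifacts whose digests don't match. The idea behind it can be sketched in a few lines of Python. This is only an illustration of digest pinning, not how pip implements it: the helper names, the file name, and the stand-in file contents are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path: Path, pinned_sha256: str) -> bool:
    """True only if the file's digest equals the digest we pinned earlier."""
    return sha256_of(path) == pinned_sha256.strip().lower()


# Stand-in for a downloaded wheel. A real check would point at the actual
# artifact and compare against a digest recorded in version control.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "example-0.0.0-py3-none-any.whl"  # hypothetical name
    artifact.write_bytes(b"hello")
    pinned = hashlib.sha256(b"hello").hexdigest()

    pin_ok = matches_pin(artifact, pinned)         # digest matches the pin
    tampered_ok = matches_pin(artifact, "0" * 64)  # any other digest is rejected
```

With the digest stored next to the code, a changed upload on a third party server shows up as a failed comparison rather than a silently different build.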
build Are Promising
After getting build installed, python3.12 -m build --help works now and I can build a wheel:
$ python3.12 -m build --wheel .
* Creating venv isolated environment...
* Installing packages in isolated environment... (setuptools >= 40.8.0, wheel)
* Getting build dependencies for wheel...
...
* Installing packages in isolated environment... (wheel)
* Building wheel...
running bdist_wheel
running build
running build_py
...
Successfully built zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl
That looks promising! It seems to have invoked my setup.py without me having to define a [build-system] in my pyproject.toml! Yay for backwards compatibility.
cffi Package
But I notice something.
My setup.py script conditionally builds a zstandard._cffi extension module if import cffi succeeds. Building with build isn't building this extension module.
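To make the conditional concrete, here is a rough sketch of that kind of check. It is not the actual python-zstandard setup.py: the function names are made up for illustration, and the only details taken from this post are the zstandard._cffi module name and the zstandard.backend_c name that appears in the built wheel. The point is that the outcome depends entirely on which interpreter environment executes the build script.

```python
import importlib.util


def optional_backend_available(module_name: str = "cffi") -> bool:
    """True if `module_name` is importable by the interpreter running the build."""
    return importlib.util.find_spec(module_name) is not None


def extensions_to_build() -> list:
    """Names of extension modules a build script like this would compile."""
    names = ["zstandard.backend_c"]
    # The CFFI-based extension is only attempted when cffi is present in the
    # environment that executes the build script.
    if optional_backend_available("cffi"):
        names.append("zstandard._cffi")
    return names


print(extensions_to_build())
```

If the build runs somewhere cffi is not installed, the second extension silently drops out of the list and nothing fails.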
Before using build, I had to run setup.py using a python having the cffi package installed, usually a project-local virtualenv. So let's try that:
$ venv/bin/python -m pip install build cffi
...
$ venv/bin/python -m build --wheel .
...
And I get the same behavior: no CFFI extension module.
Staring at the output, I see what looks like a smoking gun:
* Creating venv isolated environment...
* Installing packages in isolated environment... (setuptools >= 40.8.0, wheel)
* Getting build dependencies for wheel...
...
* Installing packages in isolated environment... (wheel)
OK. So it looks like build is creating its own isolated environment (disregarding the invoked Python environment having cffi installed), installing setuptools >= 40.8.0 and wheel into it, and then executing the build from that environment.
So build sandboxes builds in an ephemeral build environment. This actually seems like a useful feature to help with deterministic and reproducible builds: I like it! But at this moment it stands in the way of progress. So I run python -m build --help, spot a --no-isolation argument and do the obvious:
$ venv/bin/python -m build --wheel --no-isolation .
...
building 'zstandard._cffi' extension
...
Success!
And I don't see any deprecation warnings either. So I think I'm all good. But obviously I've ventured off the paved road here, as we had to violate the default constraints of build to get things to work. I'll get back to that later.
pip
Just for good measure, let's see if we can use pip wheel to produce wheels, as I've seen references that this is a supported mechanism for building wheels.
$ venv/bin/python -m pip wheel .
Processing /Users/gps/src/python-zstandard
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: zstandard
  Building wheel for zstandard (pyproject.toml) ... done
  Created wheel for zstandard: filename=zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl size=407841 sha256=a2e1cc1ad570ab6b2c23999695165a71c8c9e30823f915b88db421443749f58e
  Stored in directory: /Users/gps/Library/Caches/pip/wheels/eb/6b/3e/89aae0b17b638c9cdcd2015d98b85ee7fb3ef00325bb44a572
Successfully built zstandard
That output is a bit terse, since the setuptools build logs are getting swallowed. That's fine. Rather than run with -v to get those logs, I manually inspect the built wheel:
$ unzip -lv zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl
Archive:  zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
    7107  Defl:N     2490  65% 10-23-2023 08:36 7bb42fff  zstandard/__init__.py
   13938  Defl:N     2498  82% 10-23-2023 08:36 8d8d1316  zstandard/__init__.pyi
  919352  Defl:N   366631  60% 10-26-2023 08:28 3aeefc48  zstandard/backend_c.cpython-312-darwin.so
  152430  Defl:N    32528  79% 10-26-2023 05:37 fc1a3c0c  zstandard/backend_cffi.py
       0  Defl:N        2   0% 12-26-2020 16:12 00000000  zstandard/py.typed
    1484  Defl:N      784  47% 10-26-2023 08:28 facba579  zstandard-0.22.0.dev0.dist-info/LICENSE
    2863  Defl:N      847  70% 10-26-2023 08:28 b8d80875  zstandard-0.22.0.dev0.dist-info/METADATA
     111  Defl:N      106   5% 10-26-2023 08:28 878098e6  zstandard-0.22.0.dev0.dist-info/WHEEL
      10  Defl:N       12 -20% 10-26-2023 08:28 a5f38e4e  zstandard-0.22.0.dev0.dist-info/top_level.txt
     841  Defl:N      509  40% 10-26-2023 08:28 e9a804ae  zstandard-0.22.0.dev0.dist-info/RECORD
--------          -------  ---                            -------
 1098136           406407  63%                            10 files
(Python wheels are just zip files with certain well-defined paths having special meanings. I know this because I wrote Rust code for parsing wheels as part of developing PyOxidizer.)
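Since wheels are zip files, the same inspection can be scripted with the standard library instead of unzip. This is a small sketch; the `extension_modules` helper is a hypothetical name, and matching on `.so` / `.pyd` suffixes is a simplifying assumption about what counts as a compiled extension.

```python
import zipfile


def extension_modules(wheel_path):
    """Return the compiled extension modules (.so / .pyd) inside a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name for name in wheel.namelist() if name.endswith((".so", ".pyd"))
        )
```

Calling `extension_modules("zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl")` on the wheel above would be a quick way to assert in CI that both backends made it into the archive.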
Looks like the zstandard/_cffi.cpython-312-darwin.so extension module is missing.
Well, at least pip is consistent with build! Although somewhat confusingly I don't see any reference to a separate build environment in the pip output. But I suspect it is there because cffi is installed in the virtual environment I invoke pip from!
Reading pip help output, I find the relevant argument to not spawn a new environment and try again:
$ venv/bin/python -m pip wheel --no-build-isolation .
<same exact output except the wheel size and digest changes>
$ unzip -lv zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl
...
 1002664  Defl:N   379132  62% 10-26-2023 08:33 48afe5ba  zstandard/_cffi.cpython-312-darwin.so
...
(I'm happy to see build and pip agreeing on the no isolation terminology.)
OK, so I got build and pip to behave nearly identically. I feel like I finally understand this!
I also run pip -v wheel and pip -vv wheel to peek under the covers and see what it's doing. Interestingly, I don't see any hint of a virtual environment or temporary directory until I go to -vv. I find it interesting that build presents details about this by default but you have to put pip in very verbose mode to get it. I'm glad I used build first because the ephemeral build environment was the source of my missing dependency and pip buried this important detail behind a ton of other output in -vv, making it much harder to discover!
setuptools Gets Installed
When looking at pip's verbose output, I also see references to installing the setuptools and wheel packages:
Processing /Users/gps/src/python-zstandard
  Running command pip subprocess to install build dependencies
  Collecting setuptools>=40.8.0
    Using cached setuptools-68.2.2-py3-none-any.whl.metadata (6.3 kB)
  Collecting wheel
    Using cached wheel-0.41.2-py3-none-any.whl.metadata (2.2 kB)
  Using cached setuptools-68.2.2-py3-none-any.whl (807 kB)
  Using cached wheel-0.41.2-py3-none-any.whl (64 kB)
  Installing collected packages: wheel, setuptools
  Successfully installed setuptools-68.2.2 wheel-0.41.2
  Installing build dependencies ... done
There's that setuptools>=40.8.0 constraint again. (We also saw it in build.)
I rg 40.8.0 my source checkout (note: each . in there is a wildcard character since 40.8.0 is interpreted as a regexp, so this could over match) and come up with nothing.
If it's not coming from my code, where is it coming from?
In the pip documentation, Fallback behaviour says that a missing [build-system] from pyproject.toml is implicitly translated to the following:
[build-system]
requires = ["setuptools>=40.8.0", "wheel"]
build-backend = "setuptools.build_meta:__legacy__"
For build, I go to the source code and discover that similar functionality was added in May 2020.
I'm not sure if this default behavior is specified in a PEP or what. But build and pip seem to be agreeing on the behavior of adding setuptools>=40.8.0 and wheel to their ephemeral build environments and invoking setuptools.build_meta:__legacy__ as the build backend as implicit defaults if your pyproject.toml lacks a [build-system]. OK.
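The implicit default can be modeled in a few lines. The sketch below is a simplified illustration of the fallback described above, not the code pip or build actually run; `effective_build_system` and `LEGACY_BUILD_SYSTEM` are hypothetical names, and it only handles the case where the whole [build-system] table is absent.

```python
import copy

try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:
    # Assumption: the tomli backport is installed on older Pythons.
    import tomli as tomllib

# The fallback values quoted from the pip documentation.
LEGACY_BUILD_SYSTEM = {
    "requires": ["setuptools>=40.8.0", "wheel"],
    "build-backend": "setuptools.build_meta:__legacy__",
}


def effective_build_system(pyproject_text):
    """Return the explicit [build-system] table, or the implicit fallback."""
    data = tomllib.loads(pyproject_text)
    if "build-system" in data:
        return data["build-system"]
    # No [build-system] table: the frontend silently assumes the legacy one.
    return copy.deepcopy(LEGACY_BUILD_SYSTEM)
```

Running this against a pyproject.toml with no [build-system] table returns the same requirements that showed up in the build and pip logs, which is a handy way to see what a frontend will assume on your behalf.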
Perhaps I should consider defining [build-system] and being explicit about things? After all, the tools aren't printing anything indicating they are assuming implicit defaults and for all I know the defaults could change in a backwards incompatible manner in any release and break my build. (Although I would hope to see a deprecation warning before that occurs.)
So I modify my pyproject.toml accordingly:
[build-system]
requires = [
    "cffi==1.16.0",
    "setuptools==68.2.2",
    "wheel==0.41.2",
]
build-backend = "setuptools.build_meta:__legacy__"
I pinned all the dependencies to specific versions because I like determinism and reproducibility. I really don't like when the upload of a new package version breaks my builds!
pyproject.toml
When I pinned dependencies in [build-system] in pyproject.toml, the security part of my brain started screaming over the lack of SHA-256 digest pinning.
How am I sure that we're using well-known, trusted versions of these dependencies? Are all the transitive dependencies even pinned?
Before pyproject.toml, I used pip-compile from pip-tools to generate a requirements.txt containing SHA-256 digests for all transitive dependencies. I would use python3 -m venv to create a virtualenv, venv/bin/python -m pip install -r requirements.txt to materialize a (highly deterministic) set of packages, then run venv/bin/python setup.py to invoke a build in this stable and securely created environment. (Some) software supply chain risks averted! But, uh, how do I do that with pyproject.toml build-system.requires? Does it even support pinning SHA-256 digests?
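For readers unfamiliar with digest pinning, the idea boils down to refusing to install an artifact unless its content hash matches a value recorded ahead of time. The sketch below shows that check in isolation; `matches_pinned_digest` is a hypothetical helper, not part of pip or pip-tools, which perform an equivalent comparison when a requirements file carries hashes.

```python
import hashlib


def matches_pinned_digest(path, expected_sha256):
    """Return True if the file at path has the pinned SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large wheels and sdists don't need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()
```

A version pin like setuptools==68.2.2 only constrains the name and version; a digest pin constrains the exact bytes, which is the property missing from build-system.requires.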
I skim the PEPs related to pyproject.toml and don't see anything. Surely I'm missing something.
In desperation I check the pip-tools project and sure enough they document pyproject.toml integration. However, they tell you how to feed requirements.txt files into the dynamic dependencies consumed by the build backend: there's nothing on how to securely install the build backend itself.
As far as I can tell pyproject.toml has no facilities for securely installing (read: pinning content digests for all transitive dependencies) the build backend itself. This is left as an exercise to the reader. But, um, the build frontend (which I was also instructed to download insecurely via python -m pip install) is the thing installing the build backend. How am I supposed to subvert the build frontend to securely install the build backend? Am I supposed to disable default behavior of using an ephemeral environment in order to get secure backend installs? Doesn't the ephemeral environment give me additional, desired protections for build determinism and reproducibility? That seems wrong.
It kind of looks like pyproject.toml wasn't designed with software supply chain risk mitigation as a criterion. This is extremely surprising for a build system abstraction designed in the past few years. I shrug my shoulders and move on.
python setup.py develop Invocations
Now that I figure I have a working pyproject.toml, I move onto removing python setup.py invocations.
First up is a python setup.py develop --rust-backend invocation.
My setup.py performs very crude scanning of sys.argv looking for command arguments like --system-zstd and --rust-backend as a way to influence the build. We just sniff these special arguments and remove them from sys.argv so they don't confuse the setuptools options parser. (I don't believe this is a blessed way of doing custom options handling in distutils/setuptools. But it is simple and has worked since I introduced the pattern in 2016.)
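The sniff-and-remove pattern looks roughly like the following. This is a sketch of the general technique rather than the project's actual setup.py: the flag names come from the text above, while `pop_custom_flags` and `CUSTOM_FLAGS` are hypothetical.

```python
# Flag names are taken from the post; the helper itself is hypothetical.
CUSTOM_FLAGS = ("--system-zstd", "--rust-backend")


def pop_custom_flags(argv, known=CUSTOM_FLAGS):
    """Split argv into (which custom flags were present, remaining arguments)."""
    found = {flag: flag in argv for flag in known}
    # Everything that isn't one of our flags is left for setuptools to parse.
    remaining = [arg for arg in argv if arg not in known]
    return found, remaining
```

In a setup.py this would be applied to sys.argv before calling setup(), for example `flags, sys.argv[1:] = pop_custom_flags(sys.argv[1:])`. The catch, as the rest of this post explores, is that the approach only works when something actually puts those flags into sys.argv.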
--global-option the Answer?
With python setup.py invocations going away and a build frontend invoking setup.py, I need to find an alternative mechanism to pass settings into my setup.py.
Why you shouldn't invoke setup.py directly tells me I should use pip install -e. I'm guessing there's a way to instruct pip install to pass arguments to setup.py.
$ venv/bin/python -m pip install --help
...
  -C, --config-settings <settings>
                              Configuration settings to be passed to the PEP 517
                              build backend. Settings take the form KEY=VALUE.
                              Use multiple --config-settings options to pass
                              multiple keys to the backend.
  --global-option <options>   Extra global options to be supplied to the setup.py
                              call before the install or bdist_wheel command.
...
Hmmm. Not really sure which of these to use. But --global-option mentions setup.py and I'm using setup.py. So I try that:
$ venv/bin/python -m pip install --global-option --rust-backend -e .

Usage:
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] <requirement specifier> [package-index-options] ...
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] -r <requirements file> [package-index-options] ...
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] [-e] <vcs project url> ...
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] [-e] <local project path> ...
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] <archive url/path> ...

no such option: --rust-backend
Oh, duh, --rust-backend looks like an argument and makes pip's own argument parsing ambiguous as to how to handle it. Let's try that again with --global-option=--rust-backend:
$ venv/bin/python -m pip install --global-option=--rust-backend -e .
DEPRECATION: --build-option and --global-option are deprecated. pip 24.0 will enforce this behaviour change. A possible replacement is to use --config-settings. Discussion can be found at https://github.com/pypa/pip/issues/11859
WARNING: Implying --no-binary=:all: due to the presence of --build-option / --global-option.
Obtaining file:///Users/gps/src/python-zstandard
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: zstandard
  WARNING: Ignoring --global-option when building zstandard using PEP 517
  Building editable for zstandard (pyproject.toml) ... done
  Created wheel for zstandard: filename=zstandard-0.22.0.dev0-0.editable-cp312-cp312-macosx_14_0_x86_64.whl size=4379 sha256=05669b0a5fd8951cac711923d687d9d4192f6a70a8268dca31bdf39012b140c8
  Stored in directory: /private/var/folders/dd/xb3jz0tj133_hgnvdttctwxc0000gn/T/pip-ephem-wheel-cache-6amdpg21/wheels/eb/6b/3e/89aae0b17b638c9cdcd2015d98b85ee7fb3ef00325bb44a572
Successfully built zstandard
Installing collected packages: zstandard
Successfully installed zstandard-0.22.0.dev0
I immediately see the three DEPRECATION and WARNING lines (which are color highlighted in my terminal, yay):
DEPRECATION: --build-option and --global-option are deprecated. pip 24.0 will enforce this behaviour change. A possible replacement is to use --config-settings. Discussion can be found at https://github.com/pypa/pip/issues/11859
WARNING: Implying --no-binary=:all: due to the presence of --build-option / --global-option.
WARNING: Ignoring --global-option when building zstandard using PEP 517
Yikes. It looks like --global-option is deprecated and will be removed in pip 24.0. And, later it says --global-option was ignored. Is that true?!
$ ls -al zstandard/*cpython-312*.so
-rwxr-xr-x  1 gps  staff  1002680 Oct 27 11:35 zstandard/_cffi.cpython-312-darwin.so
-rwxr-xr-x  1 gps  staff   919352 Oct 27 11:35 zstandard/backend_c.cpython-312-darwin.so
Not seeing a backend_rust library like I was expecting. So, yes, it does look like --global-option was ignored.
This behavior is actually pretty concerning to me. It certainly seems like at one time --global-option (and a --build-option which doesn't exist on the pip install command I guess) did get threaded through to setup.py. However, it no longer does.
I find an entry in the pip 23.1 changelog: "Deprecate --build-option and --global-option. Users are invited to switch to --config-settings. (#11859)". Deprecate. What is pip's definition of deprecate? I click the link to #11859. An open issue with a lot of comments. I scan the issue history to find referenced PRs and click on #11861. OK, it is just an advertisement. Maybe --global-option never got threaded through to setup.py? But its help usage text clearly says it is related to setup.py! Maybe the presence of [build-system] in pyproject.toml is somehow engaging different semantics that result in --global-option not being passed to setup.py? The warning message did say "Ignoring --global-option when building zstandard using PEP 517".
I try commenting out the [build-system] section in my pyproject.toml and trying again. Same result. Huh? Reading the pip install --help output, I see --no-use-pep517 and try it:
$ venv/bin/python -m pip install --global-option=--rust-backend --no-use-pep517 -e .
...
$ ls -al zstandard/*cpython-312*.so
-rwxr-xr-x  1 gps  staff  1002680 Oct 27 11:35 zstandard/_cffi.cpython-312-darwin.so
-rwxr-xr-x  1 gps  staff   919352 Oct 27 11:35 zstandard/backend_c.cpython-312-darwin.so
-rwxr-xr-x  1 gps  staff  2727920 Oct 27 11:53 zstandard/backend_rust.cpython-312-darwin.so
Ahh, so pip's default PEP-517 build mode is causing --global-option to get ignored. So I guess older versions of pip honored --global-option and when pip switched to PEP-517 build mode by default --global-option just stopped working and emitted a warning instead. That's quite the backwards incompatible behavior break! I really wish tools would fail fast when making these kinds of breaks or at least offer a --warnings-as-errors mode so I can opt into fatal errors when these kinds of breaks / deprecations are introduced. I would 100% opt into this since these warnings are often the figurative needle in a haystack of CI logs and easy to miss. Especially if the build environment is non-deterministic and new versions of tools like pip get installed randomly without a version control commit.
Pip's allowing me to specify --global-option but then only issuing a warning when it is ignored doesn't sit well with me. But what can I do?
It is obvious --global-option is a non-starter here.
--config-setting
Fortunately, pip's deprecation message suggests a path forward:
A possible replacement is to use --config-settings. Discussion can be found at https://github.com/pypa/pip/issues/11859
First, kudos for actionable warning messages. However, the wording says possible replacement. Are there other alternatives I didn't see in the pip install --help output?
Anyway, I decide to go with that --config-settings suggestion.
$ venv/bin/python -m pip install --config-settings=--rust-backend -e .

Usage:
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] <requirement specifier> [package-index-options] ...
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] -r <requirements file> [package-index-options] ...
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] [-e] <vcs project url> ...
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] [-e] <local project path> ...
  /Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] <archive url/path> ...

Arguments to --config-settings must be of the form KEY=VAL
Hmmm. Let's try adding a trailing =?
$ venv/bin/python -m pip install --config-settings=--rust-backend= -e .
Obtaining file:///Users/gps/src/python-zstandard
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: zstandard
  Building editable for zstandard (pyproject.toml) ... done
  Created wheel for zstandard: filename=zstandard-0.22.0.dev0-0.editable-cp312-cp312-macosx_14_0_x86_64.whl size=4379 sha256=619db9806bc4c39e973c3197a0ddb9b03b49fff53cd9ac3d7df301318d390b5e
  Stored in directory: /private/var/folders/dd/xb3jz0tj133_hgnvdttctwxc0000gn/T/pip-ephem-wheel-cache-gtsvw78d/wheels/eb/6b/3e/89aae0b17b638c9cdcd2015d98b85ee7fb3ef00325bb44a572
Successfully built zstandard
Installing collected packages: zstandard
  Attempting uninstall: zstandard
    Found existing installation: zstandard 0.22.0.dev0
    Uninstalling zstandard-0.22.0.dev0:
      Successfully uninstalled zstandard-0.22.0.dev0
Successfully installed zstandard-0.22.0.dev0
No warnings or deprecations. That's promising. Did it work?
$ ls -al zstandard/*cpython-312*.so
-rwxr-xr-x  1 gps  staff  1002680 Oct 27 12:11 zstandard/_cffi.cpython-312-darwin.so
-rwxr-xr-x  1 gps  staff   919352 Oct 27 12:11 zstandard/backend_c.cpython-312-darwin.so
No backend_rust extension module. Boo. So what actually happened?
$ venv/bin/python -m pip -v install --config-settings=--rust-backend= -e .
I don't see --rust-backend anywhere in that log output. I try with more verbosity:
$ venv/bin/python -m pip -vvvvv install --config-settings=--rust-backend= -e .
Still nothing!
Maybe that -- prefix is wrong?
$ venv/bin/python -m pip -vvvvv install --config-settings=rust-backend= -e .
Still nothing!
I have no clue how --config-settings= is getting passed to setup.py nor where it is seemingly getting dropped on the floor.
--config-settings?
This must be documented in the setuptools project. So I open those docs in my web browser and do a search for settings. I open the first three results in separate tabs:
That first link has docs on the deprecated setuptools commands and how to invoke python setup.py directly. (Note: there is a warning box here saying that python setup.py is deprecated. I guess I somehow missed this document when looking at setuptools documentation earlier! In hindsight, it appears to be buried at the figurative bottom of the docs tree as the last item under a Backward compatibility & deprecated practice section. Talk about burying the lede!) These docs aren't useful.
The second link also takes me to deprecated documentation related to direct python setup.py command invocations.
The third link is also useless.
I continue opening search results in new tabs. Surely the answer is in here.
I find an Adding Arguments section telling me that "Adding arguments to setup is discouraged as such arguments are only supported through imperative execution and not supported through declarative config." I think that's an obtuse way of saying that sys.argv arguments are only supported via python setup.py invocations and not via setup.cfg or pyproject.toml? But the example only shows me how to use setup.cfg and doesn't have any mention of pyproject.toml. So is this documentation even relevant to pyproject.toml?
Eventually I stumble across Build System Support. In the Dynamic build dependencies and other build_meta tweaks section, I notice the following example code:
from setuptools import build_meta as _orig
from setuptools.build_meta import *


def get_requires_for_build_wheel(config_settings=None):
    return _orig.get_requires_for_build_wheel(config_settings) + [...]


def get_requires_for_build_sdist(config_settings=None):
    return _orig.get_requires_for_build_sdist(config_settings) + [...]
config_settings=None. OK, this might be the --config-settings values passed to the build frontend getting fed into the build backend.
I Google get_requires_for_build_wheel. One of the top results is PEP-517, which I click on.
I see that the Build backend interface consists of a handful of functions that are invoked by the build frontend. These functions all seem to take a config_settings=None argument. Great, now I know the interface between build frontends and backends at the Python API level. Where was I in this yak shave?
I remember from pyproject.toml that one of the lines is build-backend = "setuptools.build_meta:__legacy__". That setuptools.build_meta:__legacy__ bit looks like a Python symbol reference. Since the setuptools documentation didn't answer my question on how to thread --config-settings into setup.py invocations, I open the build_meta.py source code. (Aside: experience has taught me that when in doubt on how something works, consult the source code: code doesn't lie.)
I search for config_settings. I immediately see
class _ConfigSettingsTranslator: whose purported job is "Translate config_settings into distutils-style command arguments. Only a limited number of options is currently supported." Oh, this looks relevant. But there's a fair bit of code in here. Do I really need to grok it all? I keep scanning the source.
In a def _build_with_temp_dir() I spot the following code:
sys.argv = [
    *sys.argv[:1],
    *self._global_args(config_settings),
    *setup_command,
    "--dist-dir",
    tmp_dist_dir,
    *self._arbitrary_args(config_settings),
]
Ahh, cool. It looks to be calling self._global_args() and self._arbitrary_args() and adding the arguments those functions return to sys.argv before evaluating setup.py in the current interpreter.
I look at the definition of _arbitrary_args() and I'm onto something:
def _arbitrary_args(self, config_settings: _ConfigSettings) -> Iterator[str]:
    """
    Users may expect to pass arbitrary lists of arguments to a command
    via "--global-option" (example provided in PEP 517 of a "escape hatch").
    ...
    """
    args = self._get_config("--global-option", config_settings)
    global_opts = self._valid_global_options()
    bad_args = []

    for arg in args:
        if arg.strip("-") not in global_opts:
            bad_args.append(arg)
            yield arg

    yield from self._get_config("--build-option", config_settings)

    if bad_args:
        SetuptoolsDeprecationWarning.emit(
            "Incompatible `config_settings` passed to build backend.",
            f"""
            The arguments {bad_args!r} were given via `--global-option`.
            Please use `--build-option` instead,
            `--global-option` is reserved for flags like `--verbose` or `--quiet`.
            """,
            due_date=(2023, 9, 26),  # Warning introduced in v64.0.1, 11/Aug/2022.
        )
It looks to peek inside config_settings and handle --global-option and --build-option specially. But we clearly see --global-option is deprecated in favor of --build-option.
So is the --config-settings key name --build-option and its value the setup.py argument we want to insert?
I try that:
$ venv/bin/python -m pip install --config-settings=--build-option=--rust-backend -e .
...
$ ls -al zstandard/*cpython-312*.so
-rwxr-xr-x  1 gps  staff  1002680 Oct 27 12:54 zstandard/_cffi.cpython-312-darwin.so
-rwxr-xr-x  1 gps  staff   919352 Oct 27 12:53 zstandard/backend_c.cpython-312-darwin.so
-rwxr-xr-x  1 gps  staff  2727920 Oct 27 12:54 zstandard/backend_rust.cpython-312-darwin.so
It worked!
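To keep the layers straight, here is a simplified model of how a --config-settings value appears to travel from the frontend into setup.py's sys.argv. It is a sketch under stated assumptions, not pip's or setuptools' real code: `parse_config_setting` and `build_argv` are hypothetical names, splitting string values shell-style is my reading of the backend's behavior, and the model deliberately ignores --global-option, --dist-dir, and the verbosity handling.

```python
import shlex


def parse_config_setting(text):
    """Split a frontend KEY=VAL argument on the first '=' only."""
    key, sep, value = text.partition("=")
    if not sep:
        raise ValueError("Arguments to --config-settings must be of the form KEY=VAL")
    return key, value


def _get_config(key, config_settings):
    """Return the values stored under key as a list of arguments."""
    value = (config_settings or {}).get(key) or []
    # Assumption: a plain string is split like a shell command line.
    return shlex.split(value) if isinstance(value, str) else list(value)


def build_argv(setup_command, config_settings=None):
    """Model the argv that setup.py sees for a given setup command."""
    return [
        "setup.py",
        *setup_command,
        *_get_config("--build-option", config_settings),
    ]
```

With this model, `--config-settings=--build-option=--rust-backend` parses to the key `--build-option` with the value `--rust-backend`, which lands after the setup command where the crude sys.argv scanning can find it. By contrast, `--config-settings=--rust-backend=` parses to a key the backend never looks up, which would explain why that attempt silently did nothing.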
--config-settings=--build-option=
But, um, --config-settings=--build-option=--rust-backend. We've triple encoded command arguments here. This feels exceptionally weird. Is that really the supported/preferred interface? Surely there's something simpler.
def _arbitrary_args()'s docstring mentioned escape hatch in the context of PEP-517. I open PEP-517 and search for that term, finding Config settings. Sure enough, it is describing the mechanism I just saw the source code to. And its pip example is using pip install's --global-option and --build-option arguments. So this all seems to check out. (Although these pip arguments are deprecated in favor of -C/--config-settings.)
Thinking I missed some obvious documentation, I search the setuptools documentation for --build-option. The only hits are in the v64.0.0 changelog entry. So you are telling me this feature of passing arbitrary config settings into setup.py via PEP-517 build frontends is only documented in the changelog?!
Ok, I know my setup.py is abusing sys.argv. I'm off the paved road for passing settings into setup.py. What is the preferred pyproject.toml era mechanism for passing settings into setup.py? These settings can't be file based because they are dynamic. There must be a config_settings mechanism to thread dynamic settings into setup.py that doesn't rely on these magical --build-option and --global-option settings keys.
I stare and stare at the build_meta.py source code looking to find an answer. But all I see is the def _build_with_temp_dir() calling into self._global_args() and self._arbitrary_args() to append arguments to sys.argv. Huh? Surely this isn't the only solution. Surely there's a simpler way. The setuptools documentation said Adding arguments to setup is discouraged, seemingly implying a better way of doing it. And yet the only code I'm seeing in build_meta.py for passing custom config_settings values in is literally via additional setup.py process arguments. This can't be right.
I start unwinding my mental stack and browser tabs trying to come across something I missed.
I again look at Dynamic build dependencies and other build_meta tweaks and see its code is defining a custom [build-system] backend that does a from setuptools.build_meta import * and defines some custom build backend interface APIs (which receive config_settings) and then proxy into the original implementations. While the example is related to build metadata, I'm thinking do I need to implement my own setuptools wrapping build backend that implements a custom def build_wheel() to intercept config_settings? Surely this is avoidable complexity.
I keep unwinding context and again notice pip's warning message telling me A possible replacement is to use --config-settings. Discussion can be found at https://github.com/pypa/pip/issues/11859. I open pip issue #11859. Oh, that's the same issue tracking the --global-option deprecation I encountered earlier. I again scan the issue timeline. It is mostly references from other GitHub projects. Telltale sign that this deprecation is creating waves.
The issue is surprisingly light on comments for how many references it has. The comment with the most emoji reactions says:
Is there an example showing how to use --config-settings with setup.py and/or newer alternatives? The setuptools documentation is awful and the top search results are years/decades out-of-date and wildly contradictory.
I don't know who you are, @alexchandel, but we're on the same wavelength.
Then the next comment says:
Something like this seems to work to pass global options to setuptools.
pip -vv install --config-setting="--global-option=--verbose" .
Passing --build-option in the same way does not work, as setuptools attempts to pass these to the egg_info command where they are not supported.
So there it seemingly is, confirmation that my independently derived solution of --config-settings=--build-option=-... is in fact the way to go. But this commenter says to use --global-option, which appears to be deprecated in modern setuptools. Oof.
The next comment links to pypa/setuptools#3896 where apparently there's been an ongoing conversation since April about how setuptools should design and document a stable mechanism to pass config_settings to PEP517 backend.
If I'm interpreting this correctly, it looks like distutils/setuptools - the primary way to define Python packages for the better part of twenty years - doesn't have a stable mechanism for passing configuration settings from modern pyproject.toml [build-system] frontends. Meanwhile pip is deprecating long-working mechanisms to pass options to setup.py and forcing people to use a mechanism that setuptools doesn't explicitly document much less say is stable. This is all taking place six years after PEP-517 was accepted.
I'm kind of at a loss for words here. I understand pip's desire to delete some legacy code and standardize on the new way of doing things. But it really looks like they are breaking backwards compatibility for setup.py a bit too eagerly. That's a questionable decision in my mind, so I write a detailed comment on the pip issue explaining how the interface works and asking the pip folks to hold off on deprecation until setuptools has a stable, documented solution. Time will tell what happens.
What an adventure that Python packaging yak shave was! I feel like I just learned a whole lot of things that I shouldn't have needed to learn in order to keep my Python package building without deprecation warnings.
Yes, I scope bloated myself to understanding how things worked because that's my ethos. But even without that extra work, there's a lot here that I feel I shouldn't have needed to do, like figure out the undocumented --config-settings=--build-option= interface.
Despite having ported my python setup.py invocation to modern, PEP-517 build frontends (build and pip) and gotten rid of various deprecation messages and warnings, I'm still not sure of the implications of that transition. I really want to understand the trade-offs for adopting pyproject.toml and using the modern build frontends for doing things. But I couldn't find any documentation on this anywhere! I don't know basic things like whether my adoption of pyproject.toml will break end-users stuck on older Python versions or what. I still haven't ported my project metadata from setup.py to pyproject.toml because I don't understand the implications. I feel like I'm flying blind and am bound to make mistakes with undesirable impacts to end-users of my package.
But at least I was able to remove deprecation warnings from my packaging CI with just several hours of work.
I recognize this post is light on constructive feedback and suggestions for how to improve matters.
One reason is that I think a lot of the improvements are self-explanatory - clearer warning messages, better documentation, not deprecating things prematurely, etc. I prefer to just submit PRs instead of long blog posts. But I just don't know what is appropriate in some cases: one of the themes of this post is I just don't grok the state of Python packaging right now.
This post did initially contain a few thousand words expanding on what all I thought was broken and how it should be fixed. But I stripped the content because I didn't want my (likely controversial) opinions to distract from the self-assessed user experience study documented in this post. This content is probably better posted to a PyPA mailing list anyway, otherwise I'm just another guy complaining on the Internet.
I've posted a link to this post to the packaging category on discuss.python.org so the PyPA (and other subscribed parties) are aware of all the issues I stumbled over. Hopefully people with more knowledge of the state of Python packaging see this post, empathize with my struggles, and enact meaningful improvements so others can port off setup.py with a fraction of the effort it took me.
apple-codesign
crate / library and its rcodesign
CLI executable.
(Documentation /
GitHub project /
crates.io).
As of that most recent post in April, I was pretty happy with the relative
stability of the implementation: we were able to sign, notarize, and staple
Mach-O binaries, directory bundles (.app
, .framework
bundles, etc), XAR
archives / flat packages / .pkg
installers, and DMG disk images. Except for
the known limitations,
if Apple's official codesign
and notarytool
tools support it, so do we.
This allows people to sign, notarize, and release Apple software from non-Apple
operating systems like Linux and Windows. This opens up new avenues for
Apple platform access.
A major limitation in previous versions of the apple-codesign
crate was our
reliance on Apple's Transporter
tool for notarization. Transporter is a Java application made available for macOS,
Linux, and Windows that speaks to Apple's servers and can upload assets to their
notarization service. I used this tool at the time because it seemed to
be officially supported by Apple and the path of least resistance to standing
up notarization. But Transporter was a bit wonky to use and an extra
dependency that you needed to install.
At WWDC 2022, Apple announced
a new Notary API as
part of the App Store Connect API. In what felt like a wink directly at me,
Apple themselves even calls out the possibility for leveraging this API to
notarize from Linux! I knew as soon as I saw this that it was only a matter
of time before I would be able to replace Transporter with a pure Rust client
for the new HTTP API. (I was already thinking about using the unpublished HTTP
API that notarytool
uses. And from the limited reversing notes I have from
before WWDC it looks like the new official Notary API is very similar - possibly
identical to - what notarytool
uses. So kudos to Apple for opening up this
access!)
I'm very excited to announce that we now have a pure Rust implementation
of a client for Apple's Notary API in the apple-codesign
crate. This means we
can now notarize Apple software from any machine where you can get the Rust
crate to compile. This means we no longer have a dependency on the 3rd party
Apple Transporter application. Notarization, like code signing, is 100% open
source Rust code.
As excited as I am to announce this new feature, I'm even more excited that it was largely implemented by a contributor, Robin Lambertz / @roblabla! They filed a GitHub feature request while WWDC 2022 was still ongoing and then submitted a PR a few days later. It took me a few months to get around to reviewing it (I try to avoid computer screens during summers), but it was a fantastic PR given the scope of the change. It never ceases to bring joy to me when someone randomly contributes greatness to open source.
So, as of the just-released 0.17 release
of the apple-codesign
Rust crate and its corresponding rcodesign
CLI tool, you can now
rcodesign notary-submit
to speak to Apple's Notary API using a pure Rust client. No
more requirements on 3rd party, proprietary software. All you need to sign and
notarize Apple applications is the self-contained rcodesign
executable and a Linux,
Windows, macOS, BSD, etc machine to run it on.
I'm stoked to finally achieve this milestone! There are probably thousands of companies and individuals who have wanted to release Apple software from non-macOS operating systems. (The existence and popularity of tools like fastlane seems to confirm this.) The historical lack of an Apple code signing and notarization solution that worked outside macOS has prevented this. Well, that barrier has officially fallen.
Release notes, documentation, and (self-signed) pre-built executables of the
rcodesign
executable for major platforms are available on the
0.17 release page.
(Yes, I used my pure Rust Apple code signing implementation to remotely sign the macOS binaries from GitHub Actions using a YubiKey plugged into my Windows desktop: that experience still feels magical to me.)
PyOxy is all of the following:
python
driver providing more control over the
interpreter than what python
itself provides.
Read the following sections for more details.
pyoxy
Acts Like python
The pyoxy
executable has a run-python
sub-command that will essentially
do what python
would do:
$ pyoxy run-python
Python 3.9.12 (main, May 3 2022, 03:29:54)
[Clang 14.0.3 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
A Python REPL. That's familiar!
You can even pass python
arguments to it:
$ pyoxy run-python -- -c 'print("hello, world")'
hello, world
When a pyoxy
executable is renamed to any filename beginning with python
,
it implicitly behaves like pyoxy run-python --
.
$ mv pyoxy python3.9
$ ls -al python3.9
-rwxrwxr-x 1 gps gps 120868856 May 10 2022 python3.9
$ ./python3.9
Python 3.9.12 (main, May 3 2022, 03:29:54)
[Clang 14.0.3 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
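The rename rule can be pictured as a small dispatch on the executable's file name. This is only an illustrative sketch in Python, not PyOxy's actual (Rust) implementation; the function name `effective_args` is hypothetical.

```python
import os


def effective_args(argv):
    """Return the argument list after applying the rename rule described above."""
    exe_name = os.path.basename(argv[0])
    if exe_name.startswith("python"):
        # Behave as if invoked as `pyoxy run-python -- <original args>`.
        return ["pyoxy", "run-python", "--"] + list(argv[1:])
    # Any other file name: leave the arguments untouched.
    return list(argv)


print(effective_args(["./python3.9", "-c", "print(1)"]))
print(effective_args(["./pyoxy", "run-yaml", "app.yaml"]))
```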
The official pyoxy
executables are built with PyOxidizer and leverage the
Python distributions provided by my
python-build-standalone
project. On Linux and macOS, a fully featured Python interpreter and its library
dependencies are statically linked into pyoxy
. The pyoxy
executable also embeds
a copy of the Python standard library and imports it from memory using the
oxidized_importer
Python extension module.
What this all means is that the official pyoxy
executables can function as
single file CPython distributions! Just download a pyoxy
executable, rename it
to python
, python3
, python3.9
, etc and it should behave just like a normal
python
would!
Your Python installation has never been so simple. And fast: pyoxy
should be
a few milliseconds faster to initialize a Python interpreter mostly because of
oxidized_importer
and it avoiding filesystem overhead to look for and load
.py[c]
files.
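To make the "import from memory" idea concrete, here is a minimal sketch of a meta path finder that serves module source from a dictionary instead of the filesystem. It uses only the standard `importlib` machinery and is not how `oxidized_importer` is implemented; the class name `InMemoryFinder` and the module name `hello_mem` are made up for illustration.

```python
import importlib.abc
import importlib.util
import sys


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve module source from a dict instead of the filesystem."""

    def __init__(self, sources):
        self._sources = sources  # module name -> source text

    def find_spec(self, fullname, path=None, target=None):
        # Only claim modules we hold in memory; defer everything else.
        if fullname in self._sources:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        source = self._sources[module.__name__]
        code = compile(source, f"<memory:{module.__name__}>", "exec")
        exec(code, module.__dict__)


# Register the finder first so no filesystem lookup happens for this module.
sys.meta_path.insert(0, InMemoryFinder({"hello_mem": "GREETING = 'hello from memory'\n"}))

import hello_mem

print(hello_mem.GREETING)
```

Because the finder answers before the default path-based finders run, no directory scanning or file reads are needed for modules it owns, which is the kind of overhead the paragraph above refers to.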
The pyoxy run-yaml
command takes the path to a YAML file defining the
embedded Python interpreter configuration and then launches that Python
interpreter in-process:
$ cat > hello_world.yaml <<EOF
---
allocator_debug: true
interpreter_config:
  run_command: 'print("hello, world")'
...
EOF
$ pyoxy run-yaml hello_world.yaml
hello, world
Under the hood, PyOxy uses the
pyembed Rust crate to manage embedded
Python interpreters. The YAML document that PyOxy uses is simply deserialized
into a
pyembed::OxidizedPythonInterpreterConfig
Rust struct, which pyembed
uses to spawn a Python interpreter. This Rust struct
offers near complete control over how the embedded Python interpreter behaves: it
even allows you to tweak settings that are impossible to change from environment
variables or python
command arguments! (Beware: this power means you can
easily cause the interpreter to crash if you feed it a bad configuration!)
pyoxy run-yaml
ignores all file content before the YAML ---
start document
delimiter. This means that on UNIX-like platforms
you can create executable YAML files defining your Python application. e.g.
$ mkdir -p myapp
$ cat > myapp/__main__.py << EOF
print("hello from myapp")
EOF
$ cat > say_hello <<"EOF"
#!/bin/sh
"exec" "`dirname $0`/pyoxy" run-yaml "$0" -- "$@"
---
interpreter_config:
  run_module: 'myapp'
  module_search_paths: ["$ORIGIN"]
...
EOF
$ chmod +x say_hello
$ ./say_hello
hello from myapp
This means that to distribute a Python application, you can drop a copy
of pyoxy
in a directory then define an executable YAML file masquerading
as a shell script and you can run Python code with as little as two files!
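The "ignore everything before `---`" behavior is what lets the shell preamble coexist with the YAML. A rough Python sketch of that rule, assuming the delimiter must sit alone on its line (the helper name `yaml_payload` is hypothetical and this is not PyOxy's own code):

```python
def yaml_payload(text):
    """Return the text starting at the first line that is exactly `---`."""
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.rstrip("\r\n") == "---":
            return "".join(lines[index:])
    # No delimiter found: treat the whole input as YAML (an assumption).
    return text


script = (
    "#!/bin/sh\n"
    '"exec" "`dirname $0`/pyoxy" run-yaml "$0" -- "$@"\n'
    "---\n"
    "interpreter_config:\n"
    "  run_module: 'myapp'\n"
    "...\n"
)
print(yaml_payload(script))
```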
PyOxy is very young. I hacked it together on a weekend in September 2021. I wanted to shore up some functionality before releasing it then. But I got perpetually sidetracked and never did the work. I figured it would be better to make a smaller splash with a lesser-baked product now than wait even longer. Anyway...
As part of building PyOxidizer I've built some peripheral technology:
oxidized_importer
).
I conceived PyOxy as a vehicle to enable people to leverage PyOxidizer's technology without imposing PyOxidizer onto them. I feel that PyOxidizer's broader technology is generally useful and too valuable to be gated behind using PyOxidizer.
PyOxy is only officially released for Linux and macOS for the moment.
It definitely builds on Windows. However, I want to improve the single file
executable experience before officially releasing PyOxy on Windows. This
requires an extensive overhaul to oxidized_importer
and the way it
serializes Python resources to be loaded from memory.
I'd like to add a sub-command to produce a
Python packed resources
payload. With this, you could bundle/distribute a Python application as
pyoxy
plus a file containing your application's packed resources alongside
YAML configuring the Python interpreter. Think of this as a more modern and
faster version of the venerable zipapp
approach. This would enable PyOxy to
satisfy packaging scenarios provided by tools like Shiv, PEX, and XAR.
However, unlike Shiv and PEX, pyoxy
also provides an embedded Python
interpreter, so applications are much more portable since there isn't
reliance on the host machine having a Python interpreter installed.
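For comparison, this is roughly what the existing `zipapp` approach looks like using only the Python standard library. Note that the resulting archive is run with the host's interpreter (`sys.executable` here), which is exactly the dependency an embedded interpreter would remove. The directory and file names are arbitrary.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp


def build_and_run_zipapp():
    """Package a one-file app with zipapp and run it with the host interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = pathlib.Path(tmp) / "myapp"
        app_dir.mkdir()
        (app_dir / "__main__.py").write_text('print("hello from myapp")\n')

        archive = pathlib.Path(tmp) / "myapp.pyz"
        zipapp.create_archive(app_dir, archive)

        # A zipapp still needs a Python interpreter on the host to run it.
        result = subprocess.run(
            [sys.executable, str(archive)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


print(build_and_run_zipapp(), end="")
```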
I'm really keen to see how others want to use pyoxy
.
The YAML based control over the Python interpreter could be super useful for testing, benchmarking, and general Python interpreter configuration experimentation. It essentially opens the door to things previously only possible if you wrote code interfacing with Python's C APIs.
I can also envision tools that hide the existence of Python wanting to
leverage the single file Python distribution property of pyoxy
. For
example, tools like Ansible could copy pyoxy
to a remote machine to provide
a well-defined Python execution environment without having to rely on what
packages are installed. Or pyoxy
could be copied into a container or
other sandboxed/minimal environment to provide a Python interpreter.
And that's PyOxy. I hope you find it useful. Please file any bug reports or feature requests in PyOxidizer's issue tracker.
But first, some background on why we're here.
(Skip this section if you just want to get to the technical bits.)
Apple runs some of the largest and most profitable software application ecosystems in existence. Gaining access to these ecosystems has traditionally required the use of macOS and membership in the Apple Developer Program.
For the most part this makes sense: if you want to develop applications for Apple operating systems you will likely utilize Apple's operating systems and Apple's official tooling for development and distribution. Sticking to the paved road is a good default!
But many people want more... flexibility. Open source developers, for example, often want to distribute cross-platform applications with minimal effort. There are entire programming language ecosystems where the operating system you are running on is abstracted away as an implementation detail for many applications. By creating a de facto requirement that macOS, iOS, etc development require direct access to macOS and (often above market priced) Apple hardware, the distribution requirements imposed by Apple's software ecosystems are effectively exclusionary and prevent interested parties from contributing to the ecosystem.
One of the aspects of software distribution on Apple platforms that trips a lot of people up is code signing and notarization. Essentially, you need to:
Historically, these steps required Apple proprietary software run exclusively from macOS. This means that even if you are in a software ecosystem like Rust, Go, or the web platform where you can cross-compile apps without direct access to macOS (testing is obviously a different story), you would still need macOS somewhere if you wanted to sign and notarize your application. And signing and notarization is effectively required on macOS due to default security settings. On mobile platforms like iOS, it is impossible to distribute applications that aren't signed and notarized unless you are running a jailbroken device.
A lot of people (myself included) have grumbled at these requirements. Why should I be forced to involve an Apple machine as part of my software release process if I don't need macOS to build my application? Why do I have to go through a convoluted dance to sign and notarize my application at release time - can't it be more streamlined?
When I looked at this space last year, I saw some obvious inefficiencies and room to improve. So as I said then, I foolishly set out to reimplement Apple code signing so developers would have more flexibility and opportunity for distributing applications to Apple's ecosystems.
The ultimate goal of this work is to expand Apple ecosystem access to more developers. A year later, I believe I'm delivering a product capable of doing this.
Foremost, I'm excited to announce release of
rcodesign 0.14.0.
This is the first time I'm publishing pre-built binaries (Linux, Windows, and macOS)
of rcodesign
. This reflects my confidence in the relative maturity of the
software.
In case you are wondering, yes, the macOS rcodesign
executable is self-signed:
it was signed by a GitHub Actions Linux runner using a code signing certificate
exclusive to a YubiKey. That YubiKey was plugged into a Windows 11 desktop next to
my desk. The rcodesign
executable was not copied between machines as part of the
signing operation. Read on to learn about the sorcery that made this possible.
A lot has changed in the apple-codesign project / Rust crate in the last year! Just look at the changelog!
The project was renamed from tugger-apple-codesign
.
(If you installed via cargo install
, you'll need to
cargo install --force apple-codesign
to force Cargo to overwrite the rcodesign
executable with one from a different crate.)
The rcodesign
CLI executable is still there and more powerful than ever.
You can still sign Apple applications from Linux, Windows, macOS, and any other
platform you can get the Rust program to compile on.
There is now Sphinx documentation for the project. This is published on readthedocs.io alongside PyOxidizer's documentation (because I'm using a monorepo). There's some general documentation in there, such as a guide on how to selectively bypass Gatekeeper by deploying your own alternative code signing PKI to parallel Apple's. (This seems like something many companies would want but for whatever reason I'm not aware of anyone doing this - possibly because very few people understand how these systems work.)
There are bug fixes galore. When I look back at the state of rcodesign
when I first blogged about it, I think of how naive I was. There were a myriad
of applications that wouldn't pass notarization because of a long tail of bugs.
There are still known issues. But I believe many applications will
successfully sign and notarize now. I consider failures novel and worthy of
bug reports - so please report them!
Read on to learn about some of the notable improvements in the past year (many of them occurring in the last two months).
.pkg
Installers
When I announced this project last year, only Mach-O binaries and trivially
simple .app
bundles were signable. And even then there were a ton of subtle
issues.
rcodesign sign
can now sign more complex bundles, including many nested
bundles. There are reports of iOS app bundles signing correctly! (However, we
don't yet have good end-user documentation for signing iOS apps. I will gladly
accept PRs to improve the documentation!)
The tool also gained support for signing .dmg
disk image files and .pkg
flat package installers.
Known limitations with signing are now documented in the Sphinx docs.
I believe rcodesign
now supports signing all the major file formats used
for Apple software distribution. If you find something that doesn't sign
and it isn't documented as a known issue with an existing GitHub issue tracking
it, please report it!
Apple publishes a Java tool named Transporter that enables you to upload artifacts to Apple for notarization. They make this tool available for Linux, Windows, and of course macOS.
While this tool isn't open source (as far as I know), usage of this tool enables you to notarize from Linux and Windows while still using Apple's official tooling for communicating with their servers.
rcodesign
now has support for invoking Transporter and uploading artifacts
to Apple for notarization. We now support notarizing bundles, .dmg
disk
images, and .pkg
flat installer packages. I've successfully notarized all
of these application types from Linux.
(I'm capable of implementing
an alternative uploader in pure Rust but without assurances that Apple won't
bring down the ban hammer for violating terms of use, this is a bridge I'm
not yet willing to cross. The requirement to use Transporter is literally the
only thing standing in the way of making rcodesign
an all-in-one single
file executable tool for signing and notarizing Apple software and I really
wish I could deliver this user experience win without reprisal.)
With support for both signing and notarizing all application types, it is now possible to release Apple software without macOS involved in your release process.
I try to use my YubiKeys as much as possible because a secret or private key stored on a YubiKey is likely more secure than a secret or private key sitting around on a filesystem somewhere. If you hack my machine, you can likely gain access to my private keys. But you will need physical access to my YubiKey and to compel or coerce me into unlocking it in order to gain access to its private keys.
rcodesign
now has support for using YubiKeys for signing operations.
This does require an off-by-default smartcard
Cargo feature. So if
building manually you'll need to e.g.
cargo install --features smartcard apple-codesign
.
The YubiKey integration comes courtesy of the amazing
yubikey Rust crate. This crate will speak
directly to the smartcard APIs built into macOS and Windows. So if you have an
rcodesign
build with YubiKey support enabled, YubiKeys should
just work. Try it by plugging in your YubiKey and running
rcodesign smartcard-scan
.
YubiKey integration has its own documentation.
I even implemented some commands to make it easy to manage the code signing
certificates on your YubiKey. For example, you can run
rcodesign smartcard-generate-key --smartcard-slot 9c
to generate a new private
key directly on the device and then
rcodesign generate-certificate-signing-request --smartcard-slot 9c --csr-pem-path csr.pem
to export that certificate to a Certificate Signing Request (CSR), which you can
exchange for an Apple-issued signing certificate at developer.apple.com. This
means you can easily create code signing certificates whose private key was
generated directly on the hardware device and can never be exported.
Generating keys this way is widely considered to be more secure than storing
keys in software vaults, like Apple's Keychains.
The feature I'm most excited about is what I'm calling remote code signing.
Remote code signing allows you to delegate the low-level cryptographic signature operations in code signing to a separate machine.
It's probably easiest to just demonstrate what it can do.
Earlier today I signed a macOS universal Mach-O executable from a GitHub-hosted Linux GitHub Actions runner using a YubiKey physically attached to the Windows 11 machine next to my desk at home. The signed application was not copied between machines.
Here's how I did it.
I have a GitHub Actions workflow that calls rcodesign sign --remote-signer
.
I manually triggered that workflow and started watching the near real time
job output with my browser. Here's a screenshot of the job logs:
rcodesign sign --remote-signer
prints out some instructions (including a
wall of base64 encoded data) for what to do next. Importantly, it requests that
someone else run rcodesign remote-sign
to continue the signing process.
And here's a screenshot of me doing that from the Windows terminal:
This log shows us connecting and authenticating with the YubiKey along with some status updates regarding speaking to a remote server.
Finally, here's a screenshot of the GitHub Actions job output after I ran that command on my Windows machine:
Remote signing enabled me to sign a macOS application from a GitHub Actions runner operated by GitHub while using a code signing certificate securely stored on my YubiKey plugged into a Windows machine hundreds of kilometers away from the GitHub Actions runner. Magic, right?
What's happening here is that the two rcodesign
processes are communicating
with each other via websockets bridged by a central relay server.
(I operate a
default server free of charge.
The server is open source and a Terraform module is available if you want
to run your own server with hopefully just a few minutes of effort.)
When the initiating machine wants to create a signature, it sends a
message back to the signer requesting a cryptographic signature. The
signature is then sent back to the initiator, who incorporates it.
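The division of labor can be sketched in a few lines of Python. This is a toy model, not the rcodesign protocol: the relay and websockets are replaced by a plain function call, the payload being signed is assumed to be a SHA-256 digest, and HMAC stands in for a real asymmetric signature because the standard library has none. The names `Signer` and `initiator_sign` are hypothetical.

```python
import hashlib
import hmac


class Signer:
    """Holds the secret key; stands in for the machine with the YubiKey."""

    def __init__(self, key):
        self._key = key  # never leaves this object

    def sign_digest(self, digest):
        # HMAC is only a stand-in for a real asymmetric signature (assumption).
        return hmac.new(self._key, digest, hashlib.sha256).digest()


def initiator_sign(content, request_signature):
    """Hash content locally and ask the signer to sign only the digest."""
    digest = hashlib.sha256(content).digest()
    # In the real system this request would travel through the relay server.
    signature = request_signature(digest)
    return {"digest": digest, "signature": signature}


signer = Signer(b"secret key material")
result = initiator_sign(b"bytes of the application being signed", signer.sign_digest)
print(len(result["digest"]), len(result["signature"]))
```

The point of the sketch is that the content stays with the initiator and the key stays with the signer; only a small request and the resulting signature cross the boundary.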
I designed this feature with automated releases from CI systems (like GitHub Actions) in mind. I wanted a way where I could streamline the code signing and release process of applications without having to give a low trust machine in CI ~unlimited access to my private signing key. But the more I thought about it the more I realized there are likely many other scenarios where this could be useful. Have you ever emailed or Dropboxed an application for someone else to sign because you don't have an Apple issued code signing certificate? Now you have an alternative solution that doesn't require copying files around! As long as you can see the log output from the initiating machine or have that output communicated to you (say over a chat application or email), you can remotely sign files on another machine!
At this point, I'm confident the more security conscious among you have been grimacing for a few paragraphs now. Websockets through a central server operated by a 3rd party?! Giving remote machines access to perform code signing against arbitrary content?! Your fears and skepticism are 100% justified: I'd be thinking the same thing!
I fully recognize that a service that facilitates remote code signing makes for a very lucrative attack target! If abused, it could be used to coerce parties with valid code signing certificates to sign unwanted code, like malware. There are many, many, many wrong ways to implement such a feature. I pondered for hours about the threat modeling and how to make this feature as secure as possible.
Remote Code Signing Design and Security Considerations captures some of my high level design goals and security assessments. And Remote Code Signing Protocol goes into detail about the communications protocol, including the crypto (actual cryptography, not the fad) involved. The key takeaways are the protocol and server are designed such that a malicious server or man-in-the-middle can not forge signature requests. Signing sessions expire after a few minutes and 3rd parties (or the server) can't inject malicious messages that would result in unwanted signatures. There is an initial handshake to derive a session ephemeral shared encryption key and from there symmetric encryption keys are used so all meaningful messages between peers are end-to-end encrypted. About the worst a malicious server could do is conduct a denial of service. This is by design.
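As a generic illustration of deriving a per-session symmetric key from a handshake's shared secret, here is HKDF (RFC 5869) with SHA-256 written against the Python standard library. Whether the remote signing protocol uses HKDF, and with which inputs, is not stated here; the shared secret, salt, and `session_id` values below are placeholders, and the linked protocol documentation is the authority on the real scheme.

```python
import hashlib
import hmac


def hkdf_sha256(shared_secret, salt, info, length=32):
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    if length > 255 * 32:
        raise ValueError("requested key is too long for HKDF-SHA256")
    # Extract: concentrate the secret's entropy into a fixed-size key.
    prk = hmac.new(salt, shared_secret, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output bytes exist.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Placeholder inputs: in a real handshake the secret comes from key agreement.
shared_secret = b"result of an ephemeral key agreement"
session_id = b"example-session"
key_a = hkdf_sha256(shared_secret, b"salt", session_id)
key_b = hkdf_sha256(shared_secret, b"salt", session_id)
print(key_a == key_b, len(key_a))
```

Both peers compute the same key from the same secret, while a relay that never learns the secret cannot, which is the property end-to-end encryption relies on.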
As I argue in Security Analysis in the Bigger Picture, I believe that my implementation of remote signing is more secure than many common practices because common practices today entail making copies of private keys and giving low trust machines (like CI workers) access to private keys. Or files are copied around without cryptographic chain-of-custody to prove against tampering. Yes, remote signing introduces a vector for remote access to use signing keys. But practiced as I intended, remote signing can eliminate the need to copy private keys or grant ~unlimited access to them. From a threat modeling perspective, I think the net restriction in key access makes remote signing more secure than the private key management practices by many today.
All that being said, the giant asterisk here is I implemented my own cryptosystem to achieve end-to-end message security. If there are bugs in the design or implementation, that cryptosystem could come crashing down, bringing defenses against message forgery with it. At that point, a malicious server or privileged network actor could potentially coerce someone into signing unwanted software. But this is likely the extent of the damage: an offline attack against the signing key should not be possible since signing requires presence and since the private key is never transmitted over the wire. Even without the end-to-end encryption, the system is arguably more secure than leaving your private key lingering around as an easily exfiltrated CI secret (or similar).
(I apologize to every cryptographer I worked with at Mozilla who beat into me the commandment that thou shall not roll their own crypto: I have sinned and I feel remorseful.)
Cryptography is hard. And I'm sure I made plenty of subtle mistakes. Issue #552 tracks getting an audit of this protocol and code performed. And the aforementioned protocol design docs call out some of the places where I question decisions I've made.
If you would be interested in doing a security review on this feature, please get in touch on issue #552 or send me an email. If there's one immediate outcome I'd like from this blog post it would be for some white hat^Hknight to show up and give me peace of mind about the cryptosystem implementation.
Until then, please assume the end-to-end encryption is completely flawed. Consider asking someone with security or cryptographer in their job title for their opinion on whether this feature is safe for you to use. Hopefully we'll get a security review done soon and this caveat can go away!
If you do want to use this feature, Remote Code Signing contains some usage documentation, including how to use it with GitHub Actions. (I could also use some help productionizing a reusable GitHub Action to make this more turnkey! Although I'm hesitant to do it before I know the cryptosystem is sound.)
That was a long introduction to remote code signing. But I couldn't in good faith present the feature without addressing the security aspect. Hopefully I didn't scare you away! Traditional / local signing should have no security concerns (beyond the willingness to run software written by somebody you probably don't know, of course).
As of today's 0.14 release we now have early support for signing with code signing certificates stored in Apple Keychains! If you created your Apple code signing certificates in Keychain Access or Xcode, this is probably where your code signing certificates live.
I held off implementing this for the longest time because I didn't perceive
there to be a benefit: if you are on macOS, just use Apple's official tooling.
But with rcodesign
gaining support for remote code signing and some other
features that could make it a compelling replacement for Apple tooling on
all platforms, I figured we should provide the feature so we stop encouraging
people to export private keys from Keychains.
This integration is very young and there's still a lot that can be done, such as automatically using an appropriate signing certificate based on what you are signing. Please file feature request issues if there's a must-have feature you are missing!
Apple's code signing is complex. It is easy for there to be subtle differences
between Apple's tooling and rcodesign
.
rcodesign
now has print-signature-info
and diff-signatures
commands to
dump and compare YAML metadata pertinent to code signing to make it easier to
compare behavior between code signing implementations and even multiple
signing operations.
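The comparison step amounts to a text diff of two metadata dumps. The sketch below uses Python's `difflib` on two made-up dumps; the field names (`identifier`, `digest_type`) are hypothetical and are not the actual output of `print-signature-info`.

```python
import difflib


def diff_dumps(left, right, left_name="apple", right_name="rcodesign"):
    """Return a unified diff of two text dumps as a list of lines."""
    return list(
        difflib.unified_diff(
            left.splitlines(),
            right.splitlines(),
            fromfile=left_name,
            tofile=right_name,
            lineterm="",
        )
    )


# Hypothetical dumps from two signing implementations.
apple_dump = "identifier: com.example.app\ndigest_type: sha1\n"
rcodesign_dump = "identifier: com.example.app\ndigest_type: sha256\n"

for line in diff_dumps(apple_dump, rcodesign_dump):
    print(line)
```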
The documentation around debugging and reporting bugs now emphasizes using these tools to help identify bugs.
I now believe rcodesign
to be generally usable. I've thrown a lot of
random software at it and I feel like most of the big bugs and major missing
features are behind us.
But I also feel it hasn't yet received wide enough attention to have confidence in that assessment.
If you want to help the development of this tool, the most important actions you can take are to attempt signing / notarization operations with it and report your results.
Does rcodesign
spark joy? Please leave a comment in the
GitHub discussion for the latest release!
Does rcodesign
not work? I would very much appreciate a bug report!
Details on how to file good bugs are
in the docs.
Have general feedback? UI is confusing? Documentation is insufficient? Leave a comment in the aforementioned discussion. Or create a GitHub issue if you think it is actionable. I can't fix what I don't know about!
Have private feedback? Send me an email.
I could write thousands of words about all I learned from hacking on this project.
I've learned way too much about too many standards and specifications in the
crypto space. RFCs 2986, 3161, 3280, 3281, 3447, 4210, 4519, 5280, 5480,
5652, 5869, 5915, 5958, and 8017 plus probably a few more. How cryptographic
primitives are stored and expressed: ASN.1, OIDs, BER, DER, PEM, SPKI,
PKCS#1, PKCS#8. You can show me the raw parse tree for an ASN.1 data structure
and I can probably tell you what RFC defines it. I'm not proud of this. But
I will say actually knowing what every field in an X.509 certificate does
or the many formats that cryptographic keys are expressed in seems empowering.
Before, I would just search for the openssl
incantation to do something.
Now, I know which ASN.1 data structures are involved and how to manipulate
the fields within.
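To give a flavor of those layers, here is a small Python sketch that removes PEM armor to get DER bytes and then reads the tag and length of the first ASN.1 element. It handles single-byte tags only and uses a tiny hand-built structure rather than a real certificate; the function names are made up for this example.

```python
import base64


def pem_to_der(pem_text):
    """Strip the BEGIN/END armor lines and base64-decode the body."""
    body = []
    for line in pem_text.splitlines():
        line = line.strip()
        if line and not line.startswith("-----"):
            body.append(line)
    return base64.b64decode("".join(body))


def read_der_header(der):
    """Return (tag, content_length, header_length) for the first DER element."""
    tag = der[0]
    first = der[1]
    if first < 0x80:  # short form: the length fits in 7 bits
        return tag, first, 2
    count = first & 0x7F  # long form: the next `count` bytes hold the length
    length = int.from_bytes(der[2 : 2 + count], "big")
    return tag, length, 2 + count


# SEQUENCE (0x30) of length 3 containing INTEGER (0x02) of length 1, value 5.
sample_der = bytes([0x30, 0x03, 0x02, 0x01, 0x05])
sample_pem = (
    "-----BEGIN EXAMPLE-----\n"
    + base64.b64encode(sample_der).decode("ascii")
    + "\n-----END EXAMPLE-----\n"
)

print(read_der_header(pem_to_der(sample_pem)))
```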
I've learned way too much about the minutiae of how Apple code signing actually works. The mechanism is way too complex for something in the security space. There was at least one high profile Gatekeeper bug in the past year allowing improperly signed code to run. I suspect there will be more: the surface area to exploit is just too large.
I think I'm proud of building an open source implementation of Apple's code signing. To my knowledge nobody else has done this outside of Apple. At least not to the degree I have. Then factor in that I was able to do this without access (or willingness) to look at Apple source code and much of the progress was achieved by diffing and comparing results with Apple's tooling. Hours of staring at diffoscope and comparing binary data structures. Hours of trying to find the magical settings that enabled a SHA-1 or SHA-256 digest to agree. It was tedious work for sure. I'll likely never see a financial return on the time equivalent it took me to develop this software. But, I suppose I can nerd brag that I was able to implement this!
But the real reward for this work will be if it opens up avenues to more (open source) projects distributing to the Apple ecosystems. This has historically been challenging for multiple reasons and many open source projects have avoided official / proper distribution channels to avoid the pain (or in some cases because of philosophical disagreements with the premise of having a walled software garden in the first place). I suspect things will only get worse, as I feel it is inevitable Apple clamps down on signing and notarization requirements on macOS due to the rising costs of malware and ransomware. So having an alternative, open source, and multi-platform implementation of Apple code signing seems like something important that should exist in order to provide opportunities to otherwise excluded developers. I would be humbled if my work empowers others. And this is all the reward I need.
So, I built Linux Package Analyzer to facilitate answering questions like this.
Linux Package Analyzer is a Rust crate providing the lpa
CLI tool. lpa
currently
supports importing Debian and RPM package repositories (the most popular Linux
packaging formats) into a local SQLite database so subsequent analysis can be
efficiently performed offline. In essence:
lpa import-debian-repository
or lpa import-rpm-repository
and point
the tool at the base URL of a Linux package repository..deb
and .rpm
files are downloaded.The LPA-built database currently stores the following:
.dynamic
section, etc).Using a command like lpa import-debian-repository --components main,multiverse,restricted,universe
--architectures amd64 http://us.archive.ubuntu.com/ubuntu impish
, I can import
the (currently) ~96 GB of package data from 63,720 packages defining Ubuntu 21.10
to a local ~12 GB SQLite database and answer tons of random questions. Interesting
insights yielded so far include:
usr/lib/x86_64-linux-gnu/libmshr.so.2019.2.0.dev0 provided by the libmshr2019.2 package.
strlen appears to be the most (recognized) widely used libc symbol. It even bests memcpy (52,726) and free (42,603).
MOV is the most frequent x86 instruction, followed by CALL. (I could write an entire blog post about observations about x86 instruction use.)
There's a trove of data in the SQLite database and the lpa commands only expose a fraction of it. I reckon a lot of interesting tweets, blog posts, research papers, and more could be derived from the data that lpa assembles.
lpa does all of its work in-process using pure Rust. The Debian and RPM repository interaction is handled via the debian-packaging and rpm-repository crates (which I wrote). ELF file parsing is handled by the (amazing) object crate. x86 disassembly is handled by the iced-x86 crate. Many tools similar to lpa call out to other processes to interface with .deb/.rpm packages, parse ELF files, disassemble x86, etc. Doing this in pure Rust makes life so much simpler as all the functionality is self-contained and I don't have to worry about run-time dependencies for random tools. This means that lpa should just work from Windows, macOS, and other non-Linux environments.
Linux Package Analyzer is very much in its infancy. And I don't really have a grand vision for it. (I built it and some of the packaging code it is built on in support of some even grander projects I have cooking.) Please file bugs, feature requests, and pull requests on GitHub. The project is currently part of the PyOxidizer repo (because I like monorepos). But I may pull it and other os/packaging/toolchain code into a new monorepo since target audiences are different.
I hope others find this tool useful!
apt in their name to manage system packages? If so, your system packages are using Debian packaging.
Most tools interfacing with Debian packages (.deb files) and repositories use functionality provided by the apt repository. This repository provides libraries like libapt as well as tools like apt-get and apt. Most of the functionality is implemented in C++.
I wanted to raise awareness that I've begun implementing Debian packaging primitives in pure Rust. The debian-packaging crate is published on crates.io. For now, it is developed inside the PyOxidizer repository (because I like monorepos).
So far, a handful of useful features are implemented:
.deb files.
Hopefully the documentation contains all you would want to know for how to use the crate.
The crate is designed to be used as a library so any Rust program can (hopefully) easily tap the power of the Debian packaging ecosystem.
As with most software, there are likely several bugs and many features not yet implemented. But I have bulk downloaded the entirety of some distributions' repositories without running into obvious parse/reading failures. So I'm reasonably confident that important parts of the code (like control file parsing, repository indices file handling, and .deb file reading) work as advertised.
Hopefully someone out there finds this work useful!
Here are my reasons for not using Git LFS.
Git LFS was developed outside the official Git project to fulfill a very real market need: Git didn't/doesn't handle large files very well.
I believe it is inevitable that Git will gain better support for handling of large files, as this seems like a critical feature for a popular version control tool.
If you make this long bet, LFS is only an interim solution and its value proposition disappears after Git has better native support for large files.
LFS as a stop gap solution would be tolerable except for the fact that...
The adoption or removal of Git LFS in a repository is an irreversible decision that requires rewriting history and losing your original commit SHAs.
In some contexts, rewriting history is tolerable. In many others, it is an extremely expensive proposition. My experience maintaining version control in professional contexts aligns with the opinion that rewriting history is expensive and should only be considered a measure of last resort. Maybe if tools made it easier to rewrite history without the negative consequences (e.g. GitHub would redirect references to old SHA1 in URLs and API calls) I would change my opinion here. Until that day, the drawbacks of losing history are just too high to stomach for many.
The reason adoption or removal of LFS is irreversible is due to the way Git LFS works. What LFS does is change the blob content that a Git commit/tree references. Instead of the content itself, it stores a pointer to the content. At checkout and commit time, LFS blobs/records are treated specially via a mechanism in Git that allows content to be rewritten as it moves between Git's core storage and its materialized representation. (The same filtering mechanism is responsible for normalizing line endings in text files. Although that feature is built into the core Git product and doesn't work exactly the same way. But the principles are the same.)
Since the LFS pointer is part of the Merkle tree that a Git commit derives from, you can't add or remove LFS from a repo without rewriting existing Git commit SHAs.
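To make the Merkle tree argument concrete, here is a minimal sketch. It is a toy model, not Git's actual object format: the hash is Rust's DefaultHasher standing in for SHA-1, and the function names (blob_id, tree_id, commit_id, lfs_pointer) are hypothetical. The pointer text follows the general shape of an LFS pointer file, with a placeholder oid.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Toy object IDs. Real Git hashes a typed, length-prefixed object with
// SHA-1 (or SHA-256); DefaultHasher is only a stand-in for illustration.
fn blob_id(content: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(b"blob");
    h.write(content);
    h.finish()
}

// A tree's ID depends on the name and blob ID of every entry.
fn tree_id(entries: &[(&str, u64)]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(b"tree");
    for (name, id) in entries {
        h.write(name.as_bytes());
        h.write(&id.to_le_bytes());
    }
    h.finish()
}

// A commit's ID depends on its tree and (optionally) its parent commit.
fn commit_id(tree: u64, parent: Option<u64>) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(b"commit");
    h.write(&tree.to_le_bytes());
    if let Some(p) = parent {
        h.write(&p.to_le_bytes());
    }
    h.finish()
}

// Text stored in the repository in place of the real file content.
fn lfs_pointer(oid: &str, size: usize) -> String {
    format!(
        "version https://git-lfs.github.com/spec/v1\noid sha256:{}\nsize {}\n",
        oid, size
    )
}

fn main() {
    let video = vec![0u8; 1024]; // stands in for a large binary file

    // Commit that stores the file content directly.
    let inline = commit_id(tree_id(&[("demo.mp4", blob_id(&video))]), None);

    // Same logical file, but the blob is now an LFS pointer.
    let pointer = lfs_pointer("placeholder-oid", video.len());
    let with_lfs = commit_id(tree_id(&[("demo.mp4", blob_id(pointer.as_bytes()))]), None);

    // Different blob -> different tree -> different commit ID.
    assert_ne!(inline, with_lfs);
    println!("inline: {:016x}\nlfs:    {:016x}", inline, with_lfs);
}
```

Because every descendant commit hashes its parent's ID, the change also ripples forward through all later history, which is why the switch amounts to a full rewrite.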
I want to explicitly call out that even if a rewrite is acceptable in the short term, things may change in the future. If you adopt LFS today, you are committing to a) running an LFS server forever, b) incurring a history rewrite in the future in order to remove LFS from your repo, or c) ceasing to provide an LFS server and locking out people from using older Git commits. I don't think any of these are great options: I would prefer if there were a way to offboard from LFS in the future with minimal disruption. This is theoretically possible, but it requires the Git core product to recognize LFS blobs/records natively. There's no guarantee this will happen. So adoption of Git LFS is a one-way door that can't be easily reversed.
LFS is more complex for Git end users.
Git users have to install, configure, and sometimes know about the existence of Git LFS. Version control should just work. Large file handling should just work. End-users shouldn't have to care that large files are handled slightly differently from small files.
The usability of Git LFS is generally pretty good. However, there's an upper limit on that usability as long as LFS exists outside the core Git product. And LFS will likely never be integrated into the core Git product because the Git maintainers know that LFS is only a stop gap solution. They would rather solve large file storage correctly than ~forever carry the legacy baggage of having to support LFS in the core product.
LFS is more complexity for Git server operators as well. Instead of a self-contained Git repository and server to support, you now have to support a likely separate HTTP server to facilitate LFS access. This isn't the hardest thing in the world, especially since we're talking about key-value blob storage, which is arguably a solved problem. But it's another piece of infrastructure to support and secure and it increases the surface area of complexity instead of minimizing it. As a server operator, I would much prefer if the large file storage were integrated into the core Git product and I simply needed to provide some settings for it to just work.
Since I'm a maintainer of the Mercurial version control tool, I thought I'd throw out how Mercurial handles large file storage better than Git. Mercurial's large file handling isn't great, but I believe it is strictly better with regard to the trade-offs of adopting large file storage.
In Mercurial, use of LFS is a dynamic feature that server/repo operators can choose to enable or disable whenever they want. When the Mercurial server sends file content to a client, presence of external/LFS storage is a flag set on that file revision. Essentially, the flag says "the data you are receiving is an LFS record, not the file content itself," and the client knows how to resolve that record into content.
Conceptually, this is little different from Git LFS records in terms of content resolution. However, the big difference is this flag is part of the repository interchange data, not the core repository data as it is with Git. Since this flag isn't part of the Merkle tree used to derive the commit SHA, adding, removing, or altering the content of the LFS records doesn't require rewriting commit SHAs. The tracked content SHA - the data now stored in LFS - is still tracked as part of the Merkle tree, so the integrity of the commit / repository can still be verified.
In Mercurial, the choice of whether to use LFS and what to use LFS for is made by the server operator and settings can change over time. For example, you could start with no use of LFS and then one day decide to use LFS for all file revisions larger than 10 MB. Then a year later you lower that to all revisions larger than 1 MB. Then a year after that Mercurial gains better native support for large files and you decide to stop using LFS altogether.
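The idea can be sketched as follows. This is a toy model under stated assumptions, not Mercurial's actual storage or wire format: the FileRevision and Wire types, the to_wire method, and the byte thresholds are all hypothetical, and DefaultHasher stands in for the real content hash. The point it illustrates is that the revision's identity is derived from the real content only, so the server can change its LFS policy without changing any identifiers.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// How a file revision is delivered to a client in this toy model.
#[derive(Debug, PartialEq)]
enum Wire {
    // The content itself travels inline.
    Inline(Vec<u8>),
    // Only a record is sent; the client fetches the content elsewhere.
    External { size: usize },
}

struct FileRevision {
    // Identity derived from the real content; never from the wire form.
    node: u64,
    content: Vec<u8>,
}

impl FileRevision {
    fn new(content: Vec<u8>) -> Self {
        let mut h = DefaultHasher::new();
        h.write(&content);
        FileRevision { node: h.finish(), content }
    }

    // The server's current policy decides the wire form. `None` means
    // external storage is disabled altogether.
    fn to_wire(&self, lfs_threshold: Option<usize>) -> Wire {
        match lfs_threshold {
            Some(limit) if self.content.len() > limit => {
                Wire::External { size: self.content.len() }
            }
            _ => Wire::Inline(self.content.clone()),
        }
    }
}

fn main() {
    let rev = FileRevision::new(vec![7u8; 2_000]);

    // The policy changes over time; the node stays the same throughout.
    for policy in [None, Some(10_000), Some(1_000), None] {
        let form = match rev.to_wire(policy) {
            Wire::Inline(_) => "inline",
            Wire::External { .. } => "external",
        };
        println!("policy {:?}: node {:016x} sent {}", policy, rev.node, form);
    }
}
```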
Also in Mercurial, it is possible for clients to push a large file inline as part of the push operation. When the server sees that large file, it can be like "this is a large file: I'm going to add it to the blob store and advertise it as LFS." Because the large file record isn't part of the Merkle tree, you can have nice things like this.
I suspect it is only a matter of time before Git's wire protocol learns the ability to dynamically advertise remote servers for content retrieval and this feature will be leveraged for better large file handling. Until that day, I suppose we're stuck with having to rewrite history with LFS and/or funnel large blobs through Git natively, with all the pain that entails.
This post summarized reasons to avoid Git LFS. Are there justifiable scenarios for using LFS? Absolutely! If you insist on using Git and insist on tracking many large files in version control, you should definitely consider LFS. (Although, if you are a heavy user of large files in version control, I would consider Plastic SCM instead, as they seem to have the most mature solution for large files handling.)
The main point of this post is to highlight some drawbacks with using Git LFS because Git LFS is most definitely not a magic bullet. If you can stomach the short and long term effects of Git LFS adoption, by all means, use Git LFS. But please make an informed decision either way.
codesign tool does) in pure Rust.
I wanted to quickly announce on this blog the existence of the project and the news that as of a few minutes ago, the tugger-apple-codesign crate implementing the code signing functionality is now published on crates.io!
So, you can now sign Apple binaries and bundles on non-Apple hardware by doing something like this:
$ cargo install tugger-apple-codesign
$ rcodesign sign /path/to/input /path/to/output
Current features include:
rcodesign extract can be used to extract various signature data in raw or human readable form.
CodeResources XML file will be produced.
The most notable missing features are:
csreq -b and convert it back to this DSL. But we don't parse the human friendly language.
All of these could likely be implemented. However, I am not actively working on any of these features. If you would like to contribute support, make noise in the GitHub issue tracker.
The Rust API, CLI, and documentation are still a bit rough around the edges. I haven't performed thorough QA on aspects of the functionality. However, the tool is able to produce signed binaries that Apple's canonical codesign tool says are well-formed. So I'm reasonably confident some of the functionality works as intended. If you find bugs or missing features, please report them on GitHub. Or even better: submit pull requests!
As part of this project, I also created and published the cryptographic-message-syntax crate, which is a pure Rust partial implementation of RFC 5652, which defines the cryptographic message signing mechanism. This RFC is a bit dated and seems to have been superseded by RPKI. So you may want to look elsewhere before inventing new signing mechanisms that use this format.
Finally, it appears the Windows code signing mechanism (Authenticode) also uses RFC 5652 (or a variant thereof) for cryptographic signatures. So by implementing Apple code signatures, I believe I've done most of the legwork to implement Windows/PE signing! I'll probably implement Windows signing in a new crate whenever I hook up automatic code signing to PyOxidizer, which was the impetus for this work (I want to make it possible to build distributable Apple programs without Apple hardware, using as many open source Rust components as possible).
Programmers rely on various tools to author software. Arguably the most important and consequential choice of tool is the programming language.
In this post, I will articulate why I believe Rust is a highly compelling choice of a programming language for software professionals. I will state my case that Rust disposes software to a lower defect rate, reduces total development and deployment costs, and is exceptionally satisfying to use. In short, I hope to convince you to learn and deploy Rust.
Before I go too far, I'm targeting this post towards professional programmers - people who program (or support programming through roles like management) as their primary line of work or who spend sufficient time programming outside of work. I consider myself a professional programmer both because I am a full-time engineer in the software industry and because I contribute to some significant open source projects outside of my day job.
The statement "Rust is for Professionals" does not imply any logical variant thereof. e.g. I am not implying Rust is not for non-professionals. Rather, the subject/thesis merely defines the audience I want to speak to: people who spend a lot of time authoring, maintaining, and supporting software and are invested in its longer-term outcomes.
I think opinion pieces about programming languages benefit from knowing the author's experience with programming. I first started hacking on code in the late 1990's. I've been a full-time software developer since 2007 after graduating with a degree in Computer Engineering (after an aborted attempt at Biomedical Engineering - hence my affinities for hardware and biological sciences). I've programmed in the following languages: C, C++ (only until C++11), C#, Erlang, Go, JavaScript, Java, Lua, Perl, PHP, Python, Ruby, Rust, shell, SQL, and Verilog. Notably missing from this list is a Lisp and a Haskell/Scala type language. Of these languages, I've spent the most time with C, C#, JavaScript, Perl, PHP, Python, and Rust.
I'm not that strong in computer science or language theory: many colleagues can talk circles around me when it comes to describing computer science and programming language concepts like algorithms, type theory, and common terms used to describe languages. (I have failed many technical interviews because of my limitations here.) In contrast, I perceive my technical strengths as applying an engineering rigor and practicality to problem solving. I care vastly more about how/why things work the way they do and the practical consequences of decisions/choices we make when it comes to software. I find that I tend to think about 2nd and 3rd order effects and broader or longer-term consequences more often than others. Some would call this systems engineering.
I've programmed all kinds of different software. Backend web services, desktop applications, web sites, Firefox browser internals, the Mercurial version control tool, build systems, system/machine management. Notably missing are mobile programming (e.g. iOS/Android) and serious embedded systems (I've hacked around with Raspberry Pis and Arduinos, but those seem very friendly compared to other embedded devices). My strongest affinity is probably towards systems software and general purpose tools: I enjoy building software that other people use to build things. Infrastructure if you will.
Finally, I am expressing my personal opinion in this post. I do not speak for any employer, present or former. While I would love to see more Rust at my current employer, this post is not an attempt to influence what happens behind my employer's walls: there are better ways to conduct successful nemawashi / 根回し than a public blog post. I am not affiliated with the Rust Project in any capacity beyond a very infrequent code contributor and issue filer: I view myself as a normal Rust user. I did work at Mozilla - the company that bankrolled most of Rust's initial development. I even briefly worked in the same small Vancouver office as Graydon Hoare, Rust's primary credited inventor! While I was keen for Rust to succeed because it was affiliated with my then employer, I was most definitely not a Rust evangelist or fan boy while at Mozilla. I have little to personally benefit from this post: I'm writing it because I enjoy writing and I believe the message is important.
With that out of the way, let's talk about Rust!
When I look back at my professional self when I was in my 20s, I feel like I was young and dumb and overly exuberant about computers, technology, new software, and the like. Now an older, more grizzled professional, I accept the reality that it is a miracle computers and software work as well as they do as often as they do. Point at any common task on a computer and an iceberg of complexity and nuance lingers under the surface. Our industry abounds in the repetition of proven sub-optimal ideas. You see practices cargo culted across the decades (like the 80 character terminal/line width and null-terminated strings, which can both be traced back to Hollerith punchcards from the late 19th century). You witness cycles of pendulum swings, the same fads and trends, just with different labels (microservices are the new SOA, YAML is the new XML, etc). I can definitely relate to people in this industry who want to drop everything and move to a farm or something (but I grew up in Indiana and had cows living down the street, so I know this lifestyle isn't for me).
Rust is the first programming language I've encountered in years that makes me excited. And not just normal excited: irrationally excited. Like the kind of excitement you have for something when you are naive about its limitations and don't know any better (like many blockchain/cryptocurrency advocates). I feel like the discovery of Rust is transporting me back to my younger self, before I discovered the ugly realities of how computers and software work, and is giving me hope that better tools, better ways of building software could actually exist. To channel my inner Marie Kondo: Rust sparks joy.
When I started learning Rust in earnest in 2018, I thought this was a fluke. It is just the butterflies you get when you think you fall in love, I told myself. Give it time: your irrational excitement will fade. But after using Rust for ~2.5 years now, my positive feelings about it have only grown stronger. There's a reason Rust has claimed the top spot in Stack Overflow's most loved languages survey for 5 years and running. And not by the skin of its teeth: Rust is blowing the competition out of the water. 19% over TypeScript and Python. 23% over Kotlin and Go. If this were a Forrester report for a company-offered product, Rust would be the clear market leader and marketers and salespeople would be using this result to sign up new customers in droves and print money hand over fist.
Let me tell you why Rust excites me.
After you've learned enough programming languages, you start to see common patterns. Manual versus garbage collected memory management. Control flow primitives like if, else, do, while, for, unless. Nullable types. Variable declaration syntax. The list goes on.
To me, Rust introduced a number of new concepts, like match for control flow, enums as algebraic types, the borrow checker, the Option and Result types/enums and more. There were also behaviors of Rust that were different from languages I knew: variables are immutable by default, Result values must be checked for errors to avoid a compiler warning, the compiler refuses to build code with detectable memory access issues, and tons more.
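For readers who haven't seen these features, here is a small sketch of several of them together: an enum used as an algebraic type, match for control flow, and Option and Result in place of nulls and exceptions. The Shape type and the function names are made up for illustration.

```rust
// An enum as an algebraic data type: each variant carries its own data.
#[derive(Debug, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

// `match` must handle every variant; adding a new variant later turns
// any match that forgot it into a compile error.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

// Failure is part of the return type instead of an exception.
fn parse_side(text: &str) -> Result<f64, std::num::ParseFloatError> {
    text.trim().parse::<f64>()
}

// "Maybe absent" is spelled Option<T>; there is no null reference.
fn first_word(text: &str) -> Option<&str> {
    text.split_whitespace().next()
}

fn main() {
    let unit_circle = Shape::Circle { radius: 1.0 };
    // `unit_circle = ...` would not compile here: bindings are immutable
    // unless declared with `let mut`.
    println!("circle area: {:.3}", area(&unit_circle));

    // Calling parse_side("3.5") and ignoring the returned Result would
    // trigger an `unused_must_use` warning, so the value gets handled.
    match parse_side("3.5") {
        Ok(side) => {
            let square = Shape::Rect { width: side, height: side };
            println!("square area: {}", area(&square));
        }
        Err(err) => println!("not a number: {}", err),
    }

    if let Some(word) = first_word("hello world") {
        println!("first word: {}", word);
    }
}
```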
Many of the new concepts weren't novel to Rust. But considering I've had exposure to many popular programming languages, the fact many were new to me means these aren't common features in mainstream languages. Learning Rust felt like fresh air to me: here was a language designed to be general purpose and make inroads into industry adoption while also willing to buck many of the trends of conventional language design from the last several decades.
When going against conventional practice, it is very easy to unintentionally alienate yourself from potential users. Design a programming language too unlike anything in common use and you are going to have a difficult time attracting users. This is a problem with many academic/opinionated programming languages (or so I hear). Rust does venture away from the tried and popular. And that does contribute to a steeper learning curve. However, there is enough familiarity in Rust's core language to give you a foothold when learning Rust. (And Rust's official learning resources are terrific.)
I feel like Rust's language designers set out to take a first principles approach to the language using modern ideas and ignoring old, disproven ones, realized they needed to ground the language in familiarity to achieve market penetration, and produced reasonable compromises to yield something that was new and novel but familiar enough to not completely alienate its large potential user base.
If you don't like being exposed to new ideas and ways of working, Rust's approach is probably a negative to you. But if you are like me and enjoy continuously expanding your knowledge and testing new ideas, Rust's novelty and willingness to be different is a much welcomed attribute.
It used to be that programming languages were just compilers or interpreters. In recent years, we've seen more and more programming languages bundled with other tools, such as build/packaging tools, code formatters, linters, documentation generators, language servers, centralized package repositories, and more.
I'm not sure what spurred this trend (maybe it was Go?), but I think it is a good move. Programming languages are ecosystems and the compiler/interpreter is just one part of a complex system. If you care about end-user experience and adoption (especially if you are a new language), you want as turnkey an on-boarding experience as possible. I think that's easier to pull off when you offer a cohesive, multi-tool strategy to attract and retain users.
We refer to programming languages with a comprehensive standard library as batteries included. I'm going to refer to programming languages with additional included tools beyond the compiler/interpreter as toolbox included.
Rust is very much a toolbox included language. (Unless you are installing it via your Linux distribution: in that case Linux packagers have likely unbundled all the tools into separate packages, making the experience a bit more end-user hostile, as Linux packagers tend to do for reasons that merit their own blog post. If you want to experience Rust the way its maintainers intended - the Director's Cut if you will - install Rust via rustup.)
In addition to the Rust compiler (rustc) and the Rust standard library, the following components are all officially developed and offered as part of the Rust programming language on GitHub:
As an end-user, having all these tools and resources at my fingertips, maintained by the official Rust project is an absolute joy.
For the local tools, rustup ensures they are upgraded as a group, so I don't have to worry about managing them. I periodically run rustup update to ensure my Rust toolbox is up-to-date and that's all I have to do.
Contrast with say Node.js, Python, and Ruby, where the package manager is on a separate release cadence from the core language and I have to think about managing multiple tools. (Rust will likely have to cross this bridge once there are multiple implementations of Rust or multiple popular package managers. But until then, things are very simple.)
Further contrast with languages like JavaScript/Node.js, Python, and Ruby, where tools like a code formatter, linter, and documentation generator aren't always developed under the core project umbrella. As an end-user, you have to know to seek out these additional value-add tools. Furthermore, you have to know which ones to use and how to configure them. The fragmentation also tends to yield varying levels of quality and end-user experience, to the detriment of end-users. The Rust toolbox, by contrast, feels simple and polished.
Rust's toolbox included approach enables me to follow unified practices (arguably best practices) while expending minimal effort. As a result, the following tend to be very similar across nearly every Rust project you'll run into:
rustfmt.)
clippy.)
rustdoc.)
Cargo could warrant its own dedicated section. But I'll briefly touch on it here.
Cargo is Rust's official package manager and build system. With cargo, you can:
rustdoc).
As a build system, Cargo is generally a breeze to work with. Configuration files are TOML. Adding dependencies is often a 1 line addition to a Cargo.toml file. Dependencies often just work on the first try. It's not like say C/C++, where taking on a new dependency can easily consume a day or two to get it integrated in your build system and compatible with your source code base. I can't emphasize enough how much joy it brings to be able to leverage an "it just works" build tool for systems-level programming: I'm finding myself doing things in Rust like parsing ELF, PE, and Mach-O binaries because it is so easy to integrate low-level functionality like this into any Rust program. Cargo is boring. And when it comes to build systems, that's a massive compliment!
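As a rough illustration of the one-line dependency claim, a Cargo.toml for a program that parses binaries might look like the following. The package name is hypothetical and the version number is only illustrative; the single line under [dependencies] is all it takes to pull in the object crate.

```toml
[package]
name = "binary-inspector"   # hypothetical project name
version = "0.1.0"
edition = "2021"

[dependencies]
# One line per dependency; Cargo resolves, downloads, and builds it.
object = "0.36"             # illustrative version
```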
No other language I've used has as comprehensive and powerful of a toolbox as Rust does. This toolbox is highly leveraged by the Rust community, resulting in remarkable consistency across projects. This consistency makes it easier to understand, use, and contribute back to other Rust projects. Contrast this with say C/C++, where large code bases often employ multiple tools in the same space on different parts of the same code base, leading to cognitive dissonance and overhead.
As a professional programmer, Rust's powerful and friendly toolbox enables me to build Rust software more easily than with other languages. I spend less time wrangling tools and more time coding. That translates to less overhead delivering value through software. Other languages would be wise to emulate aspects of Rust's model.
Of all the programming languages I've used, Rust seems to empathize with its users the most.
There are a few facets to this.
A lot of care seems to have gone into the end-user experience of the Rust toolbox.
The Rust compiler often gives extremely actionable error and warning messages. If something is wrong, it tells me why it is wrong, often pointing out exactly where in source code the problem resides, drawing carets to the source code where things went wrong. In many cases, the compiler will emit a suggested fix, which I can incorporate automatically by pressing a few keys in my IDE. Contrast this with C/C++ and even Go, which tend to have either too-terse-to-be-actionable or too-verbose-to-make-sense-of feedback. By comparison, output from other compilers often comes across as condescending, as if they are saying "git gud, idiot." Rust's compiler output tends to come across as "I'm sorry you had a problem: how can I help?" I feel like the compiler actually cares about my [valuable] time and satisfaction. It wants to keep me in flow.
Then there's Clippy, a Rust linter maintained as part of the Rust project.
One thing I love about Clippy is - like the compiler - many of the lints contain suggestions, which I can incorporate automatically through my IDE. So many other linters just tell you what is wrong and don't seem to go the extra mile to be respectful of my time by offering to fix it for me.
Another aspect of Clippy I love is it is like having an invisible Rust mentor continuously providing constructive feedback to help me level-up my Rust. I don't know how many times I've written Rust code similarly to how I would write code in other languages and Clippy suggests a more Rustic solution. Most of the time I'm like "oh, I didn't know about that: that's a much better pattern/solution than what I wrote!"
Do I agree with Clippy all the time? Nope. But I do find its signal to noise ratio is exceptionally high compared to other linters I've used. And Clippy is trivial to configure and override, so disagreements are easy to manage. Like the Rust compiler, I feel that Clippy is respectful of my time and has the long term maintainability and correctness of my software at heart.
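As a small, hedged example of both points: the first function below is written the way one might in C, which Clippy's needless_range_loop lint typically flags with a suggestion to iterate over the slice directly. The #[allow] attribute shows one way to override a lint locally when you disagree. The second function is one idiomatic alternative, not necessarily the exact rewrite Clippy would propose. Both function names are made up for illustration.

```rust
// Index-based loop in a C-like style. The attribute silences Clippy's
// `needless_range_loop` lint for this one function only.
#[allow(clippy::needless_range_loop)]
fn sum_indexed(values: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

// An iterator-based version: no manual indexing to get wrong.
fn sum_idiomatic(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let values = [1, 2, 3, 4];
    // Both styles compute the same result.
    assert_eq!(sum_indexed(&values), sum_idiomatic(&values));
    println!("sum = {}", sum_idiomatic(&values));
}
```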
Then there's the Rust Community - the people behind the core Rust projects. The Rust Community is one of the most professional and welcoming I've seen. Their Code of Conduct is sufficiently comprehensive and actionable. They have their vigorous debates like any other community. But the conversation is civil. Bad apples are discarded when they crop up.
At a talk I gave about PyOxidizer at a Rust meetup a few years back, I mentioned in passing a negative comment I encountered on a Rust subreddit. After the talk, a moderator of that subreddit who was in the audience (unbeknownst to me) approached me for more information so they could investigate, which they did.
I once tweeted about a somewhat confusing, not-very-actionable compiler error I encountered. A few minutes later, some compiler developers were conversing in replies. A few hours later, a pull request was created and a much better error message was merged in short order. I'm not a special one-off here either: I've stumbled across Stack Overflow questions and other forums where Rust core developers see that someone is encountering a confusing issue, question the process that got them to that point, and then make refinements to keep it from happening in the future. The practice is very similar to what empathetic product managers and user experience designers do.
Not many other communities (or companies for that matter) seem to demonstrate such a high level of compassion and empathy for their users. To be honest, I'm not sure how Rust manages to pull it off, as this tends to be very expensive in terms of people time and it can be very easy to not prioritize. One thing is for certain: the Rust Community is loaded with empathetic people who care about the well-being of users of their products. And it shows from the interaction in forums to the software tools they produce. To everyone who has contributed in the Rust Community: thank you for all that you have done and for setting an example for the rest of us to live up to.
One of the reasons I avoided learning Rust for years is that I perceived it was too low level and therefore tedious. Rust was being advertised as a systems programming language and you would hear stories of fighting the borrow checker. I assumed I'd need to be thinking a lot about memory and ownership. I assumed the cost to author and maintain Rust code would be high. I thought Rust would be a safer C/C++, with many of the software development lifecycle caveats that apply. And for the software I was writing at the time, the value proposition of Rust seemed weak. I thought a combination of C and say Python was good enough. When I started writing PyOxidizer, I initially thought only the run-time code calling into the Python interpreter C APIs would be written in Rust and the rest would be Python.
How wrong I was!
When I actually started coding Rust, I was shocked at how high-level it felt. Now, depending on the space of your software, Rust code can be very low-level and tedious (not unlike C/C++). However, for the vast majority of code I author, Rust feels more like Python than C. And even the lower-level code feels much higher level than C or even C++.
In my mind, the expressiveness of Rust comes very close to higher-level, dynamic languages (like JavaScript, Python, and Ruby) while maintaining the raw speed of C/C++ all without sacrificing low-level control for cases when you need it. And it does all of this while maintaining strong safety guarantees (unlike say Go, which has the billion dollar mistake: null references).
I had a mental Venn diagram of the properties of programming languages (gc versus non-gc, static versus dynamic typing, compiled versus interpreted, etc) and which traits (like execution speed, development time, etc) would be possible and Rust invalidated large parts of that model!
You often don't need to think about memory management in Rust: once you understand the rules the borrow checker enforces, memory is largely something that exists but is managed for you by the language, just like in garbage collected languages. Of course there are scenarios where you should absolutely be thinking about memory and should have a grasp on what Rust is doing under the hood. But in my experience, most code can be blissfully ignorant of what is actually happening at the memory level. (However, awareness of value ownership when programming Rust does add overhead, so it's not like the cognitive load required for reasoning about memory disappears completely.)
Rust has both a stack and a heap. But when programming you often don't need to distinguish these locations. You can do things in Rust like return a reference to a stack allocated value and pass this reference around to other functions. This would be a CVE factory in C/C++. But because of Rust's borrow checker, this is safe (and a common practice) in Rust. It also predisposes the code towards better performance! Often in C/C++ you will allocate on the heap because you need to return a reference to memory and returning a reference to a stack allocated value is extremely dangerous. This heap allocation incurs run-time overhead. So Rust allowing you to do the fast thing safely is a nice mini win.
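Here is a minimal sketch of what this looks like in practice. Point, farther_from_origin, and x_of are hypothetical names made up for illustration. One nuance worth spelling out: a function cannot return a reference to one of its own local variables (the compiler rejects that), but it can return a reference borrowed from a value living on the caller's stack, and the lifetime annotation is how the compiler keeps track of it.

```rust
/// A plain value type; the instances created in `main` live on the stack.
struct Point {
    x: i64,
    y: i64,
}

/// Borrows two points and returns a reference to one of them.
/// The lifetime `'a` ties the returned reference to the inputs, so the
/// compiler knows it cannot outlive the values it points at.
fn farther_from_origin<'a>(a: &'a Point, b: &'a Point) -> &'a Point {
    let dist_a = a.x * a.x + a.y * a.y;
    let dist_b = b.x * b.x + b.y * b.y;
    if dist_a >= dist_b { a } else { b }
}

/// Returns a reference into the borrowed value: no heap allocation involved.
fn x_of(p: &Point) -> &i64 {
    &p.x
}

fn main() {
    let p1 = Point { x: 1, y: 2 };
    let p2 = Point { x: 3, y: 4 };
    let winner = farther_from_origin(&p1, &p2);
    println!("farther point: ({}, {})", winner.x, winner.y);
    println!("x of p1: {}", x_of(&p1));
}
```

Nothing here touches the heap, and every reference is checked at compile time to be valid for as long as it is used.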
In many statically typed languages, I feel like my programming speed is substantially reduced by having to repeatedly spell out or think about type names. In C, it feels like I'm always writing type names so I can perform casting. Newer versions of C++ and Java have improved matters significantly (e.g. the auto keyword). However, I haven't programmed them enough recently to know how they compare to Rust on this front. All I know is that I'm writing type names a lot less frequently in Rust than I thought I would be and that my programming output isn't limited by my typing speed as much as it historically was in C/C++.
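To give a feel for how few type names end up on the page, here is a small sketch. The function word_counts is a hypothetical example, not something from a real project. Types are spelled out in the function signature and almost nowhere else.

```rust
use std::collections::HashMap;

/// Counts how often each word occurs. The only type names written out are
/// in the function signature; everything in the body is inferred.
fn word_counts(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new(); // key and value types inferred from use
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_counts("the quick fox jumps over the lazy dog");
    // `Vec<_>` asks the compiler to fill in the element type.
    let mut repeated: Vec<_> = counts.iter().filter(|(_, n)| **n > 1).collect();
    repeated.sort();
    println!("{:?}", repeated);
}
```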
Despite being compiled down to assembly and exposing extremely low-level control, Rust often feels like a higher-level language. Equivalent functionality in Rust is often more concise and/or readable than in C/C++, while performing similarly, all while having substantially stronger safety guarantees. As a professional programmer, the value proposition is blinding: Rust enables me to do more with less, achieve a lower defect rate, and not sacrifice on performance.
The operation of computers and operating systems is exceptionally complex.
All programming languages justifiably attempt to abstract away aspects of this complexity to make it easier to deliver value through software. For example:
These abstractions often have undesirable consequences/trade-offs:
In other words, there are trade-offs with nearly every decision in programming language and [standard] library design. There are usually no obviously correct and undesirable consequence-free decisions.
And we further have to consider the fallibility of people and the inevitability that mistakes will be made, that bugs and regressions will be introduced and will need addressing. As an industry, we generally accept that mistakes occur and bugs are an unavoidable aspect of software development. If new features and enhancements are value, bugs and defects are anti-value. Like financial debt, existence of bugs and sub-optimal code can be tolerated to varying extents. But this is a highly nuanced topic and different people, companies, and projects will have different perspectives on it. We can all agree that bugs are an inevitable fact of software.
We also need to confront the reality that as an industry we have very little empirical data that says much of significance about topics like static versus dynamic typing. Although we do know some things. As Alex Gaynor informs us in What science can tell us about C and C++'s security, the result of ~2/3 of security vulnerabilities being caused by memory unsafety seems to reproduce against a sufficiently diverse set of projects and companies. That result and the implications of it are worth paying attention to!
With that being said, let's dive into my take on the matter.
Of all the programming languages I've used, I feel that Rust has the strongest disposition towards authoring and maintaining correct, high-quality software. It does this by offering a myriad of features that are designed to prevent (or at least minimize) defects. In addition, I believe Rust shifts the detection of defects to earlier in the software development lifecycle, greatly reducing the cost to mitigate defects and therefore develop software.
(As an aside, every time the topic of Rust's safety and correctness comes up, random people on the Internet rush to their keyboards to say things along the lines of C/C++ and other languages can be made to be just as safe as Rust: it's the bad programmers who are using C/C++ wrong. To those people: please stop. Your belief implies the infallibility of people and machines and that mistakes won't be made. If things like memory unsafety bugs in C/C++ could be prevented, industry titans like Apple, Google, and Microsoft would have found a way. These companies are likely taking many more measures to prevent security vulnerabilities than you are and yet the ~2/3 of security vulnerabilities being caused by memory unsafety (read: humans and machines failing to reason about run-time behavior) result still occurs. To the wiser among us, I urge you to call out perpetrators of this good programmers don't create bugs myth when you see it, just like you would/should if you encounter racist, sexist, or other non-inclusive behaviors. The reason is that belief in this myth can lead to physical or emotional harm, just like non-inclusive -isms. Security bugs, for example, can lead to disclosure of private or sensitive data, which can result in real world harm. Think a stalker or abusive former partner learning where you now live. Or a memory unsafety error in a medical device leading to device malfunction, injuring or killing a patient. Because this is a sensitive topic, I want to be clear that I'm not trying to compare the relative harms incurred by racism, sexism, other -isms, or the mythical perfect programmer. Rather, all I'm saying is each of these surpass the minimum threshold of harm incurred that justifies calling out and stopping the harmful behavior. I believe that as professionals we have an ethical and professional obligation to actively squash the mythical perfect programmer fallacy when we encounter it. 
Debates on the merits and limits of tools to prevent/find defects are fine: belief in the perfect programmer is not. Sorry for the mini rant: I just get upset by people who think software exists in a vacuum and doesn't have real-world implications for people's safety.)
In the sections below, I'll outline some of Rust's features and behaviors that support my assertion that Rust is biased towards correct and higher quality outcomes and lowers total development cost.
To the uninitiated, the borrow checker is perhaps Rust's most novel contribution to programming. It is a compile time mechanism that enforces various rules about how Rust code must behave. Think of these as laws that Rust code must obey. But these are more like societal laws, not scientific laws (which are irrefutable), as Rust's laws can be broken, often leading to negative consequences, just like societal laws.
Rust's ownership rules are as follows:
Then there are rules about references (think intelligent pointers) to owned values:
Put together, these rules say:
The implications of these rules on the behavior of Rust code are significant:
I used to think that these rules limited the behavior of Rust code. That statement is true. However, as I've thought about it more, I've refined my take to be that ownership and reference rules reinforce properties that well-behaved software exhibits.
If a C/C++ program had illegal memory access, you would say it is buggy and the behavior is not correct. If a Java program attempted to mutate a value on thread A without a lock or other synchronization primitive and thread B raced to read it, leading to data inconsistency, you would also call that a bug and incorrect behavior. If a JavaScript/Python/Ruby function were changed such that it started mutating a value that should be constant, you would call that a bug and incorrect behavior.
While Rust's ownership and reference rules do limit what software can do, the functionality they are limiting is often unsafe or buggy, so losing this functionality is often desirable from a quality and correctness standpoint. Put another way, Rust's borrow checker eliminates entire classes of [common] bugs by preventing patterns that lead to incorrect, buggy behavior.
This. Is. Huge.
Rust's borrow checker catches bugs for which other languages have no automated mechanism or no low cost, low latency mechanism for detecting. There are ways to achieve aspects of what the borrow checker does in other languages. But they tend to require contorting your coding style to accomplish and/or employing high cost tools (often running asynchronously to the compiler) such as {address, memory, thread} sanitizers or fuzzing. With Rust, you get this bug detection built into the language and compiler: no additional tools needed. (I'm not saying you shouldn't run additional tools like sanitizers or fuzz testing against Rust: just that you get a significant benefit of these tools for a drastically lower cost since they are built in to the core language.)
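To make the rules concrete, here is a small sketch. As the Rust Book states them: each value has a single owner and is dropped when that owner goes out of scope, and at any given time you may have either one mutable reference or any number of immutable references. The function names below are hypothetical, and the commented-out lines are ones I'd expect the compiler to reject if they were restored.

```rust
/// Takes ownership of the vector; the caller can no longer use it.
fn consume(values: Vec<i32>) -> usize {
    values.len()
} // `values` is dropped (freed) here

/// Borrows immutably: any number of these borrows can coexist.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Borrows mutably: only one such borrow may exist at a time.
fn push_double(values: &mut Vec<i32>, n: i32) {
    values.push(n * 2);
}

fn main() {
    let mut values = vec![1, 2, 3];

    let a = &values;
    let b = &values; // multiple shared borrows are fine
    println!("{} {}", total(a), total(b));

    push_double(&mut values, 4); // ok: `a` and `b` are no longer in use
    // println!("{}", total(a)); // error[E0502] if restored: the shared
    //                           // borrow would overlap the mutable one

    let count = consume(values); // ownership moves into `consume`
    // println!("{:?}", values); // error[E0382] if restored: use after move
    println!("{}", count);
}
```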
Rust's ownership and reference rules help ensure your software is more well-behaved and bug-free. But, sometimes those rules are too strict. Fortunately, Rust isn't dogmatic about enforcing them. There are legitimate cases where you can't work in the confines of these rules.
Say you want to share a cache between multiple threads. Caches need to be both readable and writable by multiple threads. This violates the reference rules and maybe the single owner ownership rule, depending on how things are implemented. Fortunately, there are primitives in the std::sync module like RwLock and Arc (atomically reference counted) you can use here. Arc (and its non-threadsafe Rc counterpart) give you reference counting, just like a garbage collected language. Primitives like RwLock allow you to wrap an inner value and temporarily acquire an appropriate reference to it, mutable or non-mutable. There's a bit of sleight of hand here, but the tricks employed enable you to satisfy the ownership and reference rules and use common programming techniques and patterns while still having the safety and correctness protections the borrow checker enforces.
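A minimal sketch of such a shared cache, assuming a simple string-to-integer mapping. The Cache alias and the get_or_compute function are hypothetical names for illustration, and the "expensive computation" is a stand-in.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// A hypothetical cache type: shared ownership via Arc, guarded by RwLock.
type Cache = Arc<RwLock<HashMap<String, u64>>>;

/// Returns the cached value for `key`, computing and storing it on a miss.
fn get_or_compute(cache: &Cache, key: &str) -> u64 {
    // Many readers may hold the read lock at once. The guard is released
    // at the end of this statement.
    let cached = cache.read().unwrap().get(key).copied();
    if let Some(value) = cached {
        return value;
    }
    // Stand-in for an expensive computation.
    let value = key.len() as u64;
    // The write lock grants exclusive, mutable access to the map.
    cache.write().unwrap().insert(key.to_string(), value);
    value
}

fn main() {
    let cache: Cache = Arc::new(RwLock::new(HashMap::new()));
    let handles: Vec<_> = vec!["alpha", "beta", "alpha"]
        .into_iter()
        .map(|key| {
            let cache = Arc::clone(&cache); // bump the reference count
            thread::spawn(move || get_or_compute(&cache, key))
        })
        .collect();
    for handle in handles {
        println!("{}", handle.join().unwrap());
    }
}
```

Two threads can still both miss and both compute the same value here, which is wasteful but harmless. What they cannot do is read the map while another thread is in the middle of modifying it.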
Multi-threaded and concurrent programming is hard. Really hard. Like it is exceptionally easy to introduce hard-to-diagnose-and-debug bugs hard.
There are many reasons for this. We can all probably relate to the fact that reasoning about multi-threaded code is just hard: instead of 1 call stack to reason about there are N. Further complicating matters is that many of us don't have a firm grasp on how memory works at a very low level. Do you know all the ins and outs on how CPU caches work on the architecture you are targeting? Me neither! (But this is a very good place to start excavating a rabbit hole.)
If you are like me, you've spent many years of your professional career not having to care about multi-threading or concurrent programming because you spend so much time in languages with single threads, are only implementing code that runs in single threaded contexts, or you've recognized the reality that implementing this code safely and correctly is hard and you've intentionally avoided the space or chosen software architectures (like queue-based message passing) to minimize risks. Or maybe if you are say a Java programmer you sprinkle synchronized everywhere out of precaution or in response to race conditions / bugs once they are found. (Everyone's personal experience is different, of course.)
Long story short, the aforementioned ownership and reference rules enforced by the borrow checker eliminate data races. This was a major oh wow moment for me when I learned Rust: I had heard about memory safety but didn't realize the same forces behind it were also responsible for making concurrency safe!
This property is referred to as fearless concurrency. I encourage you to read Aaron Turon's Fearless Concurrency as well as the Fearless Concurrency chapter in the Rust Book as well.
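As a rough illustration, here is a sketch using scoped threads from the standard library (std::thread::scope, available since Rust 1.63). The parallel_sum function is a hypothetical example. The shared accumulator has to be wrapped in a Mutex: an unsynchronized mutable variable captured by both threads would not compile.

```rust
use std::sync::Mutex;
use std::thread;

/// Sums `data` using two scoped threads that share one accumulator.
fn parallel_sum(data: &[u64]) -> u64 {
    let total = Mutex::new(0u64);
    let (left, right) = data.split_at(data.len() / 2);

    thread::scope(|scope| {
        for chunk in [left, right] {
            // Borrow the accumulator so the `move` closure below captures
            // a reference to it rather than the Mutex itself.
            let total = &total;
            scope.spawn(move || {
                let partial: u64 = chunk.iter().sum();
                *total.lock().unwrap() += partial;
            });
        }
    }); // all spawned threads are joined before `scope` returns

    // A plain `let mut total = 0u64` mutated from both closures would be
    // rejected at compile time instead of racing at run time.
    total.into_inner().unwrap()
}

fn main() {
    println!("{}", parallel_sum(&[1, 2, 3, 4, 5]));
}
```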
Rust is the only programming language I've used that attempts to expose operating system primitives like environment variables, command arguments, and filesystem paths and doesn't completely mess it up. Truth be told, this is kind of a niche topic. But as I help maintain a version control tool which needs to care about preserving content identically across systems, this topic is near and dear to my heart.
In POSIX land, primitives like environment variables, command arguments, and filesystem paths are char*, or a bag of null-terminated bytes.
On Windows, these primitives are wchar_t*, or wide bytes.
On both POSIX and Windows, the encoding of the raw bytes can be... complicated.
Nearly every programming language / standard library in existence attempts to normalize these values to its native string type, which is typically Unicode or UTF-8. That's doable and correct a lot of the time. Until it isn't.
Rust, by contrast, has standard library APIs like std::env::vars() that will coerce operating system values to Rust's UTF-8 backed String type. But Rust also exposes the OsString type, which represents operating system native strings. And there are function variants like std::env::vars_os() to access the raw values instead of the UTF-8 normalized ones.
Rust paths internally are stored as OsString, as that is the value passed to the C API to perform filesystem I/O. However, you can coerce paths to String easily enough or define paths in terms of String without jumping through hoops.
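A short sketch of the two flavors side by side. The helper names strict and lossy are hypothetical; the standard library calls they wrap (OsStr::to_str and OsStr::to_string_lossy) are real. Note that std::env::vars() is documented to panic if a variable is not valid Unicode, which is one reason to reach for vars_os() when correctness matters.

```rust
use std::env;
use std::ffi::{OsStr, OsString};
use std::path::Path;

/// Returns the value only if it is valid Unicode, without altering it.
fn strict(value: &OsStr) -> Option<&str> {
    value.to_str()
}

/// Always returns text, replacing undecodable data with U+FFFD.
fn lossy(value: &OsStr) -> String {
    value.to_string_lossy().into_owned()
}

fn main() {
    // vars_os() yields (OsString, OsString) pairs and never requires
    // the values to be valid Unicode.
    let raw: Vec<(OsString, OsString)> = env::vars_os().collect();
    println!("{} environment variables", raw.len());

    // Paths can be built from ordinary strings...
    let path = Path::new("/tmp/example.txt");
    // ...and the conversion back to &str is explicit and fallible.
    match strict(path.as_os_str()) {
        Some(text) => println!("valid Unicode: {}", text),
        None => println!("not Unicode: {}", lossy(path.as_os_str())),
    }
}
```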
The point I'm trying to get across is that Rust's abstractions are grounded in the reality of how computers work. Given the choice, Rust will rarely sacrifice the ability to do something correctly. In cases like operating system interop, Rust gives you the choice of convenience or correctness, rather than forcing inconvenience or incorrectness on you, like nearly every other language.
Rust enums are algebraic data types. Rust enum variants can have values associated with them and Rust enums, like structs (Rust's main way to define a type), can have functions/methods hung off of them. Rust enums are effectively fully-featured, specialized types, where value instances must be a certain variant of that enum. This makes Rust enums much more powerful than in other languages where enums simply map to integer values and/or can't have associated functions. This power unlocks a lot of possibility and harnessed the right way can drastically improve correctness of code and lead to fewer defects.
Programming inevitably needs to deal with invariants, the various possibilities that can occur. Programmers will reach for control flow operators to handle these: if x do this, else if y do that, switch statements, and the like. Handling every possible invariant can be complex, especially as software evolves over time and the ground beneath you is constantly shifting.
As you become more familiar with Rust, you'll find yourself encoding and enforcing invariants in the type system more and more. And enums are likely the main way you accomplish this.
Let's start with a contrived example. In C/C++, if you had a function that accepted either an Apple or an Orange value, you might do something like: void eat(Apple* apple, Orange* orange). Then you'd have inline logic like if apple != null. In a dynamically typed language, you could pass a single argument, but you'd perform inline type comparison. e.g. with Python you'd write if isinstance(fruit, Apple).
With Rust, you'd declare and use an enum. e.g.
struct Apple {}
struct Orange {}

enum Fruit {
    Apple(Apple),
    Orange(Orange),
}

impl Fruit {
    fn eat(&self) {
        match self {
            Self::Apple(_apple) => { /* eat the apple */ }
            Self::Orange(_orange) => { /* eat the orange */ }
        }
    }
}

fn main() {
    let apple = Fruit::Apple(Apple {});
    apple.eat();
}
This (again contrived) example shows how Rust enum variants can hold inner values, how we can define methods on Rust enums (so they behave like regular types), and introduces the match control flow operator.
Quickly, match is a super powerful operator. It will compare its argument against provided patterns and evaluate the arm that matches. Patterns must be comprehensive or the compiler will error. In the case of enums, if you add a variant - say Banana for our Fruit example - and fail to add that variant to existing match expressions, you will get compiler errors!
As you become more proficient with Rust, you'll find yourself moving lots of (often redundant) control flow expressions and conditional dispatch (if X do this, if Y do that) into enum variants and encoding the dispatched actions into that enum/type directly. Conceptually, this is little different from having a base type or interface, or from having a single wrapper class that holds various possible values. But the guarantees are stronger because each distinct possibility is strongly defined as an enum variant. And when combined with the match control flow operator, you can have the Rust compiler verify that all variants are accounted for every time you take conditional action based on the variant.
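Here is a small sketch of that exhaustiveness check, using a simplified Fruit enum without inner values plus the Banana variant mentioned above. The color method is a hypothetical addition for illustration.

```rust
/// The Fruit example, simplified and extended with a Banana variant.
enum Fruit {
    Apple,
    Orange,
    Banana,
}

impl Fruit {
    /// There is no `_` catch-all arm here, so deleting any arm below makes
    /// the compiler report error[E0004] (non-exhaustive patterns).
    fn color(&self) -> &'static str {
        match self {
            Fruit::Apple => "red",
            Fruit::Orange => "orange",
            Fruit::Banana => "yellow",
        }
    }
}

fn main() {
    for fruit in [Fruit::Apple, Fruit::Orange, Fruit::Banana] {
        println!("{}", fruit.color());
    }
}
```

The flip side is that a catch-all arm silences this check, so it pays to list variants explicitly when you want the compiler to flag every place a new variant needs handling.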
The 2 most common enums in Rust are Option and Result. The following sections will explain how they work and further demonstrate how invariants can be encoded and enforced in Rust's type system.
Option: A Better Way to Handle Nullability
Many programming languages have the concept of nullable types: the ability for a value to be null or some null-like value. You will often find this expressed in languages as null, nil, None, or some variant thereof.
When programming in these languages, nullable values must be accounted for or it could lead to errors. Languages like C/C++ and Go will attempt to resolve the address behind null/nil, leading to at least a program crash and possibly a security vulnerability. Languages like Java and Python will raise exceptions (NullPointerException in Java - frequently abbreviated NPE because it is so common - and likely TypeError in Python).
The prevalence of failure to account for nullable values is a major reason why null references were dubbed the billion dollar mistake by their inventor. (I suspect the real world value is much greater than $1B.)
Having an easy-to-ignore nullable invariant lingering in type systems seems like a massive foot gun to me. And indeed every programmer with sufficient experience has likely introduced a bug due to failure to account for null. I sure have!
Rust doesn't have a null value. Therefore no null references and no billion dollar mistake. Instead, Rust's standard library has Option<T>. Option<T> is vastly superior to null values.
Option<T> is an enum with 2 variants, Some(T) or None: an instance of some type or nothing. What makes Option different from languages with null references is you have to explicitly ask for the inner value: there is no automatic dereference. Rust forces you to confront the reality that a value is nullable and by doing so can drastically reduce a very common bug class. I say drastically reduce instead of eliminate because it is still possible to shoot yourself in the foot. For example, you can call Option.unwrap() to obtain the inner value, triggering a panic if the None variant is present. Despite the potential for programming errors, this solution is strictly better than null references because Option forces you to confront the reality of nullability and use of the dangerous access mechanisms is relatively easy to audit for. (Clippy has some lints to encourage best practices here.)
The existence of Option<T> means that if you are operating on a non-Option value, that value is guaranteed to exist and not be null. If you are operating on Option, the fact it is optional is explicitly encoded in the type and you know you need to account for it. If the value passed into a function was once always defined and a later refactor changed it to be optional (or vice versa), that semantic change is reflected in the type system and Rust forces you to confront the implications when that change is made, not after it was deployed to production and you started seeing segfaults, NPEs, and the like.
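A minimal sketch of what this looks like in code. The User struct and both functions are hypothetical. The nickname field is optional and the type says so; the name field is a plain String and therefore always present.

```rust
/// A hypothetical record with one required and one optional field.
struct User {
    name: String,
    nickname: Option<String>, // explicitly optional
}

/// Returns the nickname when present, otherwise falls back to the name.
/// The compiler will not let us treat `nickname` as a plain String.
fn display_name(user: &User) -> &str {
    match &user.nickname {
        Some(nick) => nick.as_str(),
        None => user.name.as_str(),
    }
}

/// The same logic written with combinators instead of `match`.
fn display_name_short(user: &User) -> &str {
    user.nickname.as_deref().unwrap_or(user.name.as_str())
}

fn main() {
    let user = User {
        name: "Alexandra".to_string(),
        nickname: Some("Alex".to_string()),
    };
    println!("{}", display_name(&user));
    println!("{}", display_name_short(&user));
}
```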
After using Rust's Option<T> to express nullability, you will look at every other language with null references and bemoan how primitive and unsafe it feels by comparison. You will yearn for Rust's safer approach biasing towards correctness and higher quality software. Option<T> is a massive feature for the professional programmer who cares about these traits.
Result: A Better Way to Handle Errors
Different programming languages have different ways of handling errors. Returning integers or booleans to express success or failure is common. As is throwing and trapping/catching exceptions.
Like nullability, history has shown us that programmers often fail to handle error invariants, with bugs of varying severity ensuing. Even Linux filesystems fail to handle errors!
I argue that the traditional programming patterns we use to handle errors bias towards buggy outcomes, especially with the return an integer/error value approach. It is easy to forget to check the return value of a function. In C/C++, maybe a function once returned nothing (void) and was later refactored to return an integer error code. You have to know to audit for existing callers when making these changes or updating dependencies. Furthermore, handling errors requires effort. That if err != 0 or if err != nil pattern gets mighty annoying to type all of the time! Plus, you have to know what value to compare against: success can often be 0, -1, or 1 or any other arbitrary value. Getting error handling correct 100% of the time is hard. You will fail and this will lead to bugs.
Result
Like Option<T>, Result<T, E> is an enum with 2 variants: Ok(T) and Err(E). That is, a value is either success, wrapping an inner value of type T or error, wrapping an inner value of type E describing that error.
Like Option<T>, Result<T, E> forces you to confront the existence of invariants. Before operating on the value returned by a function, you need to explicitly access it and that forces you to confront that an error could have occurred. In addition, the Result type is annotated with #[must_use] and the compiler will emit a warning when you don't check it. Scenarios like changing an infallible function returning a type T to fallible returning a Result<T, E> will fail to compile (due to typing violations) or make compiler warning noise if there are call sites that fail to account for that change.
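A small sketch of a fallible function and a caller that has to deal with both outcomes. The functions parse_port and describe are hypothetical; ParseIntError is the real error type returned by the standard library's integer parsing.

```rust
use std::num::ParseIntError;

/// Parses a TCP port. The signature makes the possibility of failure explicit.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

/// Callers must unpack the Result before they can touch the port number.
fn describe(text: &str) -> String {
    match parse_port(text) {
        Ok(port) => format!("port {}", port),
        Err(err) => format!("invalid port {:?}: {}", text, err),
    }
}

fn main() {
    println!("{}", describe("8080"));
    println!("{}", describe("http"));
    // parse_port("8080"); // compiles if restored, but the compiler warns
    //                     // about an unused `Result` that must be used
}
```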
In addition to making it more likely that errors are acted upon correctly, Rust also contains a ? operator for simplifying handling of errors.
As I said above, typing patterns like if err != 0 or if err != nil can become extremely tedious. Your brain knows what it needs to type to handle errors but it takes precious seconds to do so, slowing you down. You may have code where the majority of the lines are the same error handling boilerplate over and over, increasing verbosity and arguably decreasing readability.
Rust's ? operator will return early from the current function with the Err(E) variant or evaluate to the inner value from the Ok(T) variant. So you can often add a ? operator after a function call returning a Result<T, E> to automatically propagate an error. Typing a single character is vastly easier and simpler than writing explicit control flow for error handling!
The benefits of ? are blatantly apparent when you have functions calling into multiple fallible functions. Long functions with multiple if err != 0 blocks followed by the next logical operation often reduce to a 1-liner. e.g. bar(foo()?)? or foo.do_x()?.do_y()?. When I said earlier that Rust feels like a higher level language, the ? operator is a significant contributor to that.
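To show the difference, here is a sketch of the same function written twice: once with explicit control flow and once with the ? operator. The names parse_dimension, area_verbose, and area are hypothetical.

```rust
use std::num::ParseIntError;

/// A fallible helper: parses one dimension from text.
fn parse_dimension(text: &str) -> Result<u32, ParseIntError> {
    text.trim().parse()
}

/// Explicit error propagation: every call needs its own match.
fn area_verbose(width: &str, height: &str) -> Result<u32, ParseIntError> {
    let w = match parse_dimension(width) {
        Ok(value) => value,
        Err(err) => return Err(err),
    };
    let h = match parse_dimension(height) {
        Ok(value) => value,
        Err(err) => return Err(err),
    };
    Ok(w * h)
}

/// The same behavior with `?`: on Err, return early; on Ok, unwrap.
fn area(width: &str, height: &str) -> Result<u32, ParseIntError> {
    Ok(parse_dimension(width)? * parse_dimension(height)?)
}

fn main() {
    println!("{:?}", area("3", "4"));
    println!("{:?}", area("3", "four"));
}
```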
There are some downsides to Result<T, E> in terms of programming overhead and consistency between Rust programs. I'll cover these later in the post.
Result<T, E> biases Rust code towards correctness by forcing programmers to confront the reality that an error could exist and should be handled. Once you program in Rust, you will look at error handling mechanisms like returning an error integer or nullable value, realize how brittle and/or tedious they are, and yearn for something better.
unsafe Escape Hatch
If some of Rust's limitations are too much for you, Rust has an in case of emergency break glass feature called unsafe. This is kind of like C mode where you can do things like access and manipulate raw memory through pointers. You can cast a value to a pointer and back to a new Rust reference/value, effectively short circuiting the borrow checker for that particular reference/value.
A common misconception is that unsafe disables the borrow checker and/or loosens type checking. This is incorrect: many of those features are still running in unsafe code. However, because Rust can't fully reason about what's happening (e.g. it doesn't know who owns a raw memory address and when it will be freed), it can't properly enforce all of its rules that guarantee safety, leading to, well, unsafety. (See Unsafe Rust for more on this topic.)
unsafe is a necessary evil. In many Rust programs, you won't ever have to use it. But if you do have to use it, its presence will draw review scrutiny like moths to light. So unlike say C/C++ where practically every memory access is a potential security bug and it is effectively impossible in many scenarios to comprehensively audit for memory safety (if it were possible, there would be no memory safety bugs), using unsafe safely is often viable because scrutiny can be concentrated on its relatively few occurrences. And more experienced Rust programmers know how to encapsulate unsafe into safe wrappers, limiting how much code needs to be audited when code around unsafe changes.
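Here is a sketch of what encapsulating unsafe in a safe wrapper can look like. Both functions are hypothetical examples (the standard library already offers safe equivalents). The point is that callers only ever see a safe signature, and each unsafe block carries a comment justifying why it is sound.

```rust
/// Returns the middle element of a slice, or None when the slice is empty.
fn middle(values: &[i32]) -> Option<i32> {
    if values.is_empty() {
        return None;
    }
    let index = values.len() / 2;
    // SAFETY: the slice is non-empty, so `len / 2 < len` and the index
    // is in bounds.
    Some(unsafe { *values.get_unchecked(index) })
}

/// Swaps the first and last elements in place using raw pointers.
fn swap_ends(values: &mut [i32]) {
    if values.len() < 2 {
        return;
    }
    let last = values.len() - 1;
    let ptr = values.as_mut_ptr();
    // SAFETY: indices 0 and `last` are both in bounds, and the exclusive
    // borrow of `values` guarantees nothing else is accessing the memory.
    unsafe {
        std::ptr::swap(ptr, ptr.add(last));
    }
}

fn main() {
    let mut values = [1, 2, 3];
    println!("{:?}", middle(&values));
    swap_ends(&mut values);
    println!("{:?}", values);
}
```

A reviewer only needs to check the two unsafe blocks and the bounds logic right above them, not every call site.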
What I've personally been enlightened by is the myriad of operations that Rust considers unsafe. As you learn more and more Rust, you'll encounter random functions sprinkled across the standard library that are unsafe and you'll wonder why. The docs usually tell you and that's how you learn something new (and maybe horrifying) about how computers actually work.
A significant portion of the software development lifecycle is evolving existing code. Fixing bugs. Extending existing code with new functionality. Refactoring code to fix bugs or prepare for new features. Using code in new, unplanned ways.
In many code bases, the amount of people time spent evolving the code dwarfs the time for creating actual greenfield code/features. (Unfortunately, quantifying when you are doing evolution versus greenfield coding is quite difficult, so both facets often get lumped together into simply software development time. But in my mind they are discrete - although highly interdependent - units of work and the evolution time tends to dwarf the greenfield time on established projects.) So it follows that long-term evolution/maintainability of code bases is more important than initial code creation time.
There is a sufficient body of industry research demonstrating that the cost to fix defects rises exponentially as you progress through the software development lifecycle (do a search for say software development lifecycle cost of fixing a bug).
Furthermore, human memory functions not unlike multi-tier caches and your ability to recall information will diminish over time. (You probably know what you were doing 5 minutes ago, might remember what you were doing at this time yesterday, and probably have no clue what you were doing on this date 20 years ago.)
In terms of coding, the best way to address a defect is to not introduce it in the first place. If you can't do that, your goal is to detect and correct it as early in the development process as possible, as close as possible to when the source code creating that defect came into existence. Practically, in order of descending desirability:
The earlier a defect is caught, the better the chances that the author (or other involved parties) have relevant code paged in and can fix it with less effort and with lower chances of introducing additional defects. For me, authoring new code is relatively easy compared to refactoring old code. That's because I have new code fully paged into my brain and I know it like the back of my hand. I know where the sharp edges are and how you'll get cut if you make certain changes. However, if several months pass without revisiting the code, most of that heightened awareness evaporates. If I need to change or review that code, my ability to do that with a high degree of confidence and efficiency is drastically eroded.
Generally speaking, the earlier a defect is caught, the less damage it can do. Ideally, a defect is caught and fixed at local development time, before you burden a reviewer with finding it and certainly before it causes harm or anti-value after being deployed!
In addition, compressing the software development lifecycle allows you to ship enhancements sooner, which enables you to deliver value sooner. This is what we're trying to do as professional programmers after all!
Because the cost to fix a defect rises exponentially as it moves through the software development lifecycle, it follows that you want defect detection to occur logarithmically to offset that cost. That means you want as many defects as possible to be caught as early as possible.
Compared to other programming languages I've used, Rust is exceptional at detecting defects earlier in the development lifecycle and as a result can drastically lower overall development costs. Here are the main factors contributing to this belief:
Option<T> significantly curtails the billion dollar mistake.
Result<T, E> forces you to reckon with handling errors.
The Rust compiler is just exceptional at detecting common defects.
Did your code refactor introduce a use-after-free or dangling reference? Don't worry: the borrow checker will detect that. CVE prevented.
Did you introduce a race condition by performing a mutation somewhere that was previously immutable? The borrow checker will detect that. You potentially just saved hours of time debugging a hard-to-reproduce bug.
Did you add an enum variant but forget to add that variant to a match expression? If you avoided using the catch-all _ pattern, the compiler will tell you match arms aren't exhaustive and give you an error.
Did a value that was previously always defined become nullable? Changing the type from T to Option<T> will yield compiler errors due to type mismatch.
Did an Option<T> that was previously always Some(T) suddenly become None? Hopefully following Rust best practices means your code will just work. In the worst case you get a panic (with a stack trace). But that's on par with, say, a Java NPE and is strictly better than a null dereference that you get with languages like C/C++.
Did you change or add a function returning Result<T, E> but forget to check if that Result is an Ok(T) or Err(E)? The compiler will tell you.
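A few of the checks in the list above can be sketched in a handful of lines. This is only an illustration: the `Shape` enum and the `area`, `first_even`, and `parse_port` functions are hypothetical names invented for this example.

```rust
/// Hypothetical enum for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Square { side: f64 },
    // Adding a new variant here (say `Triangle`) makes `area` below fail to
    // compile until a matching arm is added, because `area` has no
    // catch-all `_` arm.
}

fn area(shape: Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Square { side } => side * side,
    }
}

/// Returns the first even number, if any. The `Option` return type means
/// callers cannot use the value without deciding what to do about `None`.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

/// Parses a port number. The `Result` return type makes the failure case
/// part of the signature instead of a surprise at run time.
fn parse_port(text: &str) -> Result<u16, std::num::ParseIntError> {
    text.trim().parse::<u16>()
}
```

Calling `first_even(&[1, 3])` yields `None` rather than a null, and ignoring the value returned by `parse_port` draws a compiler warning about an unused `Result`.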
I could go on. Rust is full of little examples like these where the core language and standard library nudge you towards working code and help detect defects earlier during development, saving vast amounts of time and money later.
The Rust compiler is so good at rooting out problems that many Rust programmers have adopted the expression, if it compiles it works. This statement is obviously falsifiable. But compared to every other programming language I've used, I'm shocked by how often it is true.
For other programming languages, a working compile is the beginning of your verification or debugging journey. For Rust, it often feels like the hard part is over and you are almost done. With other languages, you often have an indefinite number of iterations to fix language defects (like null dereferences or dynamic typing errors) beyond the compile step. You need to address these in addition to any logical/intent defects in your code. And fixing logical/intent defects could introduce more post-compile defects. As a programmer, you just don't know when the process will be done. With Rust, the compiler errors tell you exactly what the language defects are. So by the time you appease the compiler, you are left with just your logical/intent defects. I greatly prefer the Rust workflow which separates these because I'm getting clearer feedback on my progress: I know that once I've addressed all the language defects the compiler complains about, it is just a matter of fixing logical/intent defects. I know I'm a giant step closer to victory.
The Progress Principle is a psychological observation that people tend to prefer a series of many smaller wins over fewer larger wins. And (unexpected) setbacks can more than offset the benefits of wins. (The book is an easy read and I've found its insights applicable to software development workflows.) Whether Rust's language designers realized it or not, Rust's development workflow plays into our psychological dispositions as described by The Progress Principle: defects (setbacks) tend to occur earlier (at compile time), not at unexpected later times (during code review, CI testing, deploy, etc.), and our progress towards a working solution is composed of small wins, such as fixing compiler errors and knowing when you transition from language defects into logical/intent defects. For me, this makes iterating on Rust more fulfilling and enjoyable than other languages.
Whether you realize it or not, every programmer has a personal, generalized model of how to program, how to reason about code, best practices, and what not. When we program, we specialize that model to the language and environment/project we're programming for. The mental model that each of us has is shaped by our experience: which languages we know, which concepts we've been exposed to, mistakes we've made, people we've worked with and the practices they've instilled.
If for no other reason, you should learn Rust to expand your generalized model of how to program so that you can apply Rust's principles outside of Rust.
Before I learned Rust, I had a mental model of the lifetimes of various values/variables/memory and how they would be used. If I were coding C, I would attempt to document these in function comments. e.g. if returning a pointer, the comment would say how long the memory behind that pointer lives or who is responsible for freeing it. So when I encountered Rust's ownership and reference rules when learning Rust, they substantially overlapped with my personal mental model of how you should reason about memory in order to avoid bugs. I distinctly remember reading the Rust Book and thinking wow, this seems to be a formalization of some of the concepts and best practices living in my head!
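As a rough sketch of that formalization, here is the kind of contract that would live in a C comment, expressed instead as a Rust lifetime. The `longest_word` function is a hypothetical example, not code from any real project.

```rust
/// Returns the longest whitespace-separated word in `text`.
///
/// In C, a comment would have to explain that the returned pointer is only
/// valid while `text` is alive. Here the lifetime `'a` states that in the
/// signature and the compiler checks every caller against it.
fn longest_word<'a>(text: &'a str) -> &'a str {
    text.split_whitespace()
        .max_by_key(|word| word.len())
        .unwrap_or("")
}

// A caller that breaks the contract is rejected at compile time. If the
// lines below were uncommented, the compiler would report that `owned`
// does not live long enough:
//
// let dangling;
// {
//     let owned = String::from("short lived");
//     dangling = longest_word(&owned);
// }
// println!("{dangling}");
```

The comment and the enforcement are now the same thing, so the documentation cannot drift out of sync with the code.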
After using Rust for several months, I realized that my prior mental model around reasoning about safe program behavior was woefully incomplete and that Rust's was far superior.
Rust's different ways of doing things will inevitably force you to think about type design, data access patterns, control flow, etc. more than most other programming languages. In most other languages, it is much easier to just write runnable code and defer the complexity around ensuring the code is safe/correct and free from certain classes of bugs, like memory access violations and race conditions. Rust's ways of doing things force you to confront many of these problems up-front, before anything runs.
Rust's stricter model and way about authoring software eventually percolates into your personal generalized model of how to program in any programming language. As you internalize patterns needed to program Rust proficiently, you will subconsciously cherry-pick aspects of Rust and apply them when programming in other languages, making you a better programmer in those languages.
For example, when you program C/C++, you will realize the minefield of memory safety issues that linger in those languages. Many of those mines never explode. But knowing Rust and the patterns needed to appease the borrow checker and write safe code, you have a better sense of where the mines are located, the patterns that lead to them exploding, and you can take preemptive steps or apply extra scrutiny to avoid tripping them. (If you are like me, you'll reach the conclusion that C/C++ is intrinsically unsafe and is beyond saving, vowing to avoid it as much as possible because it is just too dangerous to use safely/responsibly.)
Similarly, when programming in any language, you'll probably think more about variable mutability and non-mutability, even if those languages don't have the concept of mutability on variables. You'll be more attuned to certain patterns for mutating data: where mutation occurs, who has a mutable reference, when there are both mutable and non-mutable references in existence. Again, your knowledge from Rust will subconsciously raise your awareness for classes of bugs, making you a better programmer.
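To make the mutation patterns concrete, here is a minimal sketch of how Rust spells them out. The function names (`push_total`, `largest`, `demo`) are made up for this illustration.

```rust
/// Appends the sum of `values` to `values`. The `&mut` in the signature
/// tells every reader, and the compiler, that this function may mutate
/// its argument.
fn push_total(values: &mut Vec<u32>) {
    let total: u32 = values.iter().sum();
    values.push(total);
}

/// Only reads `values`; the shared `&` reference cannot be used to mutate.
fn largest(values: &[u32]) -> Option<u32> {
    values.iter().copied().max()
}

fn demo() -> Vec<u32> {
    let mut values = vec![1, 2, 3];
    let first = &values[0]; // a shared borrow starts here
    // push_total(&mut values); // rejected if uncommented: `values` is
    //                          // still borrowed through `first`
    let seen = *first; // last use of the shared borrow
    push_total(&mut values); // fine: no shared borrows remain
    assert_eq!(seen, 1);
    values
}
```

In a language without these rules, the commented-out call would silently compile and `first` could end up pointing at freed or reallocated memory. That is the sort of pattern I now notice even when no compiler is there to flag it.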
The same thing applies to multi-threaded programming and race conditions. After internalizing Rust's model of how to achieve multi-threading safely, you will probably not look at multi-threading in other languages the same way again. If you are like me, you will be horrified by how the lack of Rust's enforced ownership/reference rules predisposes code to so many horrible and hard-to-debug bugs. Again, you will probably find yourself changing your approach to multi-threading to minimize risk.
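A small sketch of what that model looks like in practice, using only the standard library. The `parallel_count` function is a hypothetical example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each add 1 to a shared counter
/// `per_worker` times, then returns the final count.
fn parallel_count(workers: usize, per_worker: u64) -> u64 {
    // Shared ownership (`Arc`) plus exclusive access (`Mutex`) is spelled
    // out in the type. Handing each thread a plain `&mut u64` or an
    // `Rc<RefCell<u64>>` instead would not compile.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

The interesting part is what you cannot write: every attempt to share the counter without synchronization fails to compile, so the data race never makes it to a debugger.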
Fun fact: while at Mozilla I heard multiple anecdotes of [very intelligent] Firefox developers thinking they had found a bug in Rust's borrow checker because they thought it was impossible for a flagged error to occur. However, after sufficient investigation the result was always (maybe with an exception or two because Mozilla adopted Rust very early) that the Rust compiler was correct and the developer's assertions about how code could behave were incorrect. In these cases, the Rust compiler likely prevented hard-to-debug bugs or even exploitable security vulnerabilities. I remember one developer exclaiming that if the bug had shipped, it would have taken weeks to debug and would likely have gone unfixed for years unless its severity warranted staffing.
I strongly feel that I am a better programmer overall after learning Rust because I find myself applying the [best] practices that Rust enforces on me when programming in other languages. For this reason, even if you don't plan to use Rust in any serious capacity, I encourage people to learn Rust because exposure to its ideas will likely transform the ways you think about programming for the better.
This post has been rather positive about Rust so far. Rust, like everything, is far from perfect and it has its downsides. Professionals know the limitations of their tools and you should know some of the issues you'll run into when using Rust.
In addition, Rust is still a relatively young and unpopular programming language. Since relatively few people know Rust, there are a handful of myths and inaccuracies circulating about the language. I'll also dispel some of those here.
A common criticism leveled against Rust is that it is harder to learn than other programming languages. I think this is a valid concern. My experience is Rust took longer to learn and level-up than other languages I've learned recently, notably Go, Kotlin, and Ruby.
I think the primary reason for this is the borrow checker and the rules it enforces. Many programmers have never encountered forced following of ownership and reference rules before and this concept is completely foreign at first. I liken it to a new way to program. If you only have experience with dynamically typed languages that will allow you to compile a ham sandwich, there's a good chance you'll be frustrated by Rust. Rust will likely challenge your conceptions of how programming should work and may frustrate you in the process.
In addition to the borrow checker itself, there are a myriad of types and patterns you'll encounter and eventually need to understand to appease the borrow checker.
Beyond the borrow checker, Rust's standard library is comprehensive and offers a lot of types and traits. It will take a while to be exposed to many of them and know when/how to use each.
You will likely be adding 3rd party crates as dependencies to your project for common functionality not (yet) in the standard library. These expand the scope of concepts you need to learn.
I hope I'm not scaring anybody away: you can go pretty far in Rust without encountering or understanding most of the standard library. That being said, every new type, trait, concept, and crate you learn unlocks new possibilities and avenues for delivering value through programming. So there is an incentive to take the time to learn them sooner rather than later.
I learned Rust mostly independently for a personal project. While learning resources such as Learn Rust, the Rust Language Cheat Sheet, and even Clippy are fantastic, in hindsight I probably would have become more proficient sooner had I contributed to an existing Rust project and/or had ongoing technical collaboration with more experienced Rust developers. This is probably no different than any other programming language. But because of Rust's steeper learning curve, I think the benefits of peer exposure are more significant. That being said, I've heard anecdotes of teams with no Rust experience learning Rust together with successful results. So there's no formal recipe for success here.
Finally, despite the steeper learning curve, I'd say the return on investment pays off pretty quickly. As I've argued elsewhere in this post, the Rust compiler and type system help prevent many classes of bugs. So while it may take longer to initially learn and compose idiomatic Rust code, it won't take long for Rust to offset the time that you would have spent chasing bugs, performance optimizations, and the like.
Rust releases a new version every 6 weeks. By contrast, many other programming languages release ~yearly. This faster release cadence has been a common complaint about Rust.
Quickly, I think people conflate release cadence with churn and hardship from that release cadence. Generally speaking, release cadence isn't the thing you care about: it's how disrupted you are from the releases. If your old release continues to work just as well as the new release, release cadence doesn't really matter (many major websites deploy/release dozens of times per day and you don't care because you can't tell: you only care when the UI or behavior changes). So the thing most of us care about is how frequently Rust releases cause disruption. And disruption is often caused by backwards incompatibility and the introduction of new features, which when adopted, force upgrades.
A few years ago, I think the concern that Rust moves too fast was valid: there were significant features in seemingly every release and crates were eager to jump on the new features, forcing you to upgrade if you wanted to keep your dependency tree up to date. I feel like I caught the tail end of this relative chaos in 2018-2019.
But in the last 18-24 months, things seem to have quieted down. Many of the major language features that people were eager to jump on have landed. The only ongoing churn I'm aware of in Rust is in the async ecosystem, and that seems to be stabilizing. New Rust releases are generally pretty quiet in terms of must-use features. The last milestone release in my mind was 1.45 in July 2020, which stabilized function-like procedural macros in expression, pattern, and statement positions. The community was pretty quick to jump on that feature/release. My Rust projects have targeted 1.45+ for a while now with minimal issues.
9 months with no major disruptions is on par with the release cadence of other programming languages.
In my opinion, the concern that Rust moves too fast, while once valid, no longer generally applies. Pockets of truth for segments of users caring about niche and lesser-used features, yes. But nothing that applies to the entire Rust ecosystem.
A lot of people have commented that Rust builds take too long. It is true: compiling Rust tends to take longer than C/C++, Go, Java, and other languages requiring an ahead-of-time compile step.
While a lot has been done to make the Rust compiler faster (it feels substantially faster than it was a few years ago), it still isn't as fast as other languages.
Not to dismiss the problem, but in a lot of cases, the speed of Rust compilation is fast enough. Incremental builds for small libraries or programs will take a few hundred milliseconds to a second or two. I suspect most of the people complaining about build times today are developing very large Rust programs (tens of thousands of lines of code and/or hundreds of dependencies).
A contributing problem to build times is dependency count. The simplicity of Cargo makes it very easy to accumulate dependencies in Rust and each additional crate will slow your build down. PyOxidizer has ~400 dependencies at this point in time, for example (I've been throwing the kitchen sink at it in terms of features).
There are a few things under your control to mitigate this problem.
First, install sccache, a transparent compiler cache. By default it caches to the local filesystem. But you can also point it at Redis, Memcached, or blob stores in AWS, Azure, or GCP. Firefox's CI uses an S3 backed cache and the hit rate (for both Rust and C/C++) is 90-99% on nearly every build. For PyOxidizer - a medium sized Rust project - sccache reduces full build times from ~53s wall and ~572s CPU to ~32s wall and 225s CPU on my 16 core Ryzen 5950X. The wall time savings on a lower CPU core count machine are even more significant.
Speaking of CPU core counts, the second thing you can do is give yourself access to more CPU cores. Laptops tend to have at most 4 CPU cores. Consider buying desktops or moving builds to remote machines, often with dozens of CPU cores. This requires spending money. But when you factor in people time saved and the cost of that time and the value of someone's happiness/satisfaction, it can often be justified.
I'm not trying to dismiss the problems that slow builds can impose, but if you want to justify their cost, you can argue that the Rust compiler does more at compilation time than other languages and that this overhead brings benefits, such as preventing bugs earlier in the software development lifecycle. There's no such thing as a free lunch and Rust's relatively slower builds are a tax you pay for the correctness the compiler guarantees. To me, that's a justifiable trade-off.
The "isn't production ready" concern is likely disproven by the existence of Rust in production in critical roles at a sufficient number of reputable companies. At this point, there are very few technical reasons to say Rust isn't production ready. Non-technical reasons such as lack of organizational knowledge or a limited talent pool for hiring from, yes. But little on the technical front.
The "too young" part is ultimately a judgement call for how comfortable you are with new technologies.
I'm generally pretty conservative/skeptical about adopting new technology. If you are in this industry long enough you eventually get humbled by your exuberance.
I was probably in the "Rust is too young" boat as late as 2017, maybe 2018. While I was cheering on Rust as a Mozillian, I was skeptical it was going to take off. Birthing successful languages is hard. The language still seemed to move too fast and have too many missing features. Things seemed to stabilize around the 2018 edition. That's also when you started commonly hearing of companies adopting Rust. Lots of startups at first. Then big companies started joining in.
Today, companies you have heard of like Amazon, Cloudflare, Discord, Dropbox, Facebook, Google, and Microsoft are adopting Rust to varying degrees. There are 58,750 published crates on crates.io.
I won't drop names, but I've heard of Rust spreading like wildfire at some companies you've heard of. The stories are pretty similar: random person or team wants to try Rust. Something small and isolated with a minimal blast radius in case of disaster is tried first. Rust is an overwhelming success. As more and more people are exposed to Rust, they see the light, cries for Rust become louder, and it becomes even more widely adopted.
When I program in Rust, I strongly feel that my base rate of defect introduction is substantially lower than in other programming languages. I have confidence that the Rust compiler coupled with practices like encoding and enforcing invariants in the type system leads to fewer defects. In some cases I feel like the surface area for bugs is limited to logical defects, which are mis-expressions of the human programmer's intent. And since no automated tool can reliably scan for human intent, there's no way to prevent logical bugs, and that surface area is the best we can ever expect from automated scanning.
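Here is a minimal sketch of what encoding an invariant in the type system can look like. The `Percent` type and `remaining` function are hypothetical names chosen for this example.

```rust
/// Hypothetical newtype: a percentage guaranteed to be within 0..=100.
/// The inner field is private, so code outside the defining module can
/// only obtain a `Percent` through `Percent::new`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Percent(u8);

impl Percent {
    /// Validates the range once, at construction time.
    pub fn new(value: u8) -> Option<Percent> {
        if value <= 100 {
            Some(Percent(value))
        } else {
            None
        }
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

/// Needs no range check of its own: the argument type already guarantees
/// the value is at most 100, so the subtraction cannot underflow.
pub fn remaining(done: Percent) -> Percent {
    Percent(100 - done.get())
}
```

With a bare `u8` parameter, `remaining` would need its own validation plus tests for out-of-range input. With `Percent`, that whole class of defect is ruled out before the function is ever called.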
Knowing what tests to write and how much effort to invest in test writing is a difficult skill to level up and is full of trade-offs. With Rust, I find myself writing fewer tests than in other languages because I have confidence that the compiler will detect issues that would otherwise require explicit testing.
I feel that my beliefs and practices are rooted in reality and justifiable. Yet I recognize the danger in placing too much faith in my tools, in Rust.
In theory, Rust alleviates the need for running additional verification tools, like {address, memory, thread} sanitizers because the safe subset of Rust prevents the issues these tools detect. Many defects caught by fuzzing are also similarly prevented by the design of Rust (but not all: fuzzing is generally a good idea).
What I'm trying to say is that it is really easy to fall into a trap where you are over-confident about the abilities of Rust to prevent defects and you find yourself letting your guard down and not maintaining testing and other verification best practices.
I'm still evolving my beliefs in this area. But my general opinion is that you should still run things like {address, memory, thread} sanitizers and fuzzing because unsafe likely exists somewhere in the compiled code, as likely does C or assembly code. And because a chain is only as strong as its weakest link, it only takes one bug to undermine the safety of the entire system. So while these additional verification tools likely won't find as many issues as they would in unsafe languages, I still think it is a good idea to continue to run them against Rust, especially for high value code bases.
Result<T, E> isn't a panacea. Because errors are full-on types rather than simple primitives like integers, you need to spend effort reasoning about and coding how different error types interact. And often you need to write a bit of boilerplate code to facilitate that interaction. This can cancel out a lot of the efficiency benefits of Rust's ? operator for handling errors.
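To give a feel for that boilerplate, here is a sketch using only the standard library. The `ConfigError` type and `read_setting` function are hypothetical and exist only for this illustration.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical application error wrapping two unrelated failure kinds.
#[derive(Debug)]
enum ConfigError {
    MissingKey(String),
    BadNumber(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(key) => write!(f, "missing key: {key}"),
            ConfigError::BadNumber(err) => write!(f, "bad number: {err}"),
        }
    }
}

impl std::error::Error for ConfigError {}

// The boilerplate: this conversion is what allows `?` to turn a
// `ParseIntError` into a `ConfigError` automatically.
impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> Self {
        ConfigError::BadNumber(err)
    }
}

/// Looks up `key` in "name=value" lines and parses the value as a `u32`.
fn read_setting(config: &str, key: &str) -> Result<u32, ConfigError> {
    let value = config
        .lines()
        .filter_map(|line| line.split_once('='))
        .find(|(name, _)| name.trim() == key)
        .map(|(_, value)| value.trim())
        .ok_or_else(|| ConfigError::MissingKey(key.to_string()))?;
    Ok(value.parse::<u32>()?)
}
```

The function body is short thanks to `?`, but roughly twenty lines of type definitions and trait impls were needed to get there. Multiply that by every distinct error source and the appeal of helper crates becomes clear.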
There are a handful of 3rd party Rust crates specializing in error handling that you're likely to encounter. These include anyhow, error-chain, failure, and thiserror.
Rust's error handling landscape can at times feel fragmented and make you yearn for something more defined/opinionated in the standard library. The Rust Community recognizes that this is an area that can be improved and has formed an error handling project group to improve this space. So hopefully we see some quality of life improvements to error handling in time.
I am irrationally effusive about Rust. When I see this level of excitement in others, I am extremely skeptical. I was skeptical myself when my former colleagues at Mozilla were talking up Rust years ago. But having used Rust for 2.5 years now and authored tens of thousands of lines of Rust code, the initial relationship euphoria has worn off and I am most definitely in love.
Cynically, Rust has ruined programming in other languages for me. Less cynically, Rust has spoiled me.
When I look at other languages without the rules enforced by Rust's borrow checker, all I see are sharp edges waiting to materialize into bugs.
When I look at other languages with weaker type systems, I think about all the time I spend having to defend invariants by hand and how much cognitive load and programming/review effort I need to incur to maintain the baseline of quality that I get with Rust.
When I look at programming languages like Python, Ruby, and TypeScript where you can bolt a type system onto a language that doesn't have it, I think why would I want to do that when I can use an even better type system while likely achieving much better performance with Rust? (It's tempting to reach for a metaphor involving lipstick and pigs.)
When I look at other languages, I generally see the same pile of decades old ideas packaged in different boxes. Some of these ideas are good and probably timeless (e.g. functions and variables). Some are demonstrably bad and should be largely excised from common use (e.g. null references - the billion dollar mistake).
When I interface with Rust's tooling, I feel like it is respectful of my time and has my best interests (producing working software) at heart. I feel the maintainers of the tooling care about me.
When I program in Rust, I feel that I'm producing fewer defects overall. The compiler is catching defects that would otherwise be caught later in the software development lifecycle, where they would cost more to fix.
When I interact with Rust's community of people, respect and empathy abound.
Does Rust have its problems and limitations? Of course it does: nothing is perfect! But in my opinion, its trade-offs are often strictly better than those found in other programming languages I've used.
At the end of the day, Rust is a programming language and therefore a tool. Adept professionals know not to get too attached to their tools: ultimately it is the value you deliver, not how you deliver it. (Of course the choice of tools can significantly impact the quality and timeline of value delivery!) Will my thoughts on Rust and preferred languages change over time as the landscape shifts? Of course they will! But for the time being, Rust brings so much to the table that its competition lacks that I'm overly excited about Rust and its ability to advance the state of software/programming and therefore the industry.
In closing, my current CTO uses the phrase commitment to craft as a desired mindset for their technical organization. That phrase translates to various themes: higher quality / lower defect rate, build with the long-term in mind, implement efficient solutions, etc. Like an artist reaches for a preferred paintbrush or a chef for a preferred knife because their preferred tool enables them to better express their craft, I feel that Rust often enables me to better express the potential of my professional craft more than other programming languages. I strongly feel that Rust predisposes software to higher quality outcomes - both in terms of defect rate and run-time efficiency - while also reducing total development and execution costs over the entire software development lifecycle. That makes Rust my first choice language - my go-to tool - for many new projects at this point in time. If you likewise value commitment to craft, I urge you to explore Rust so that you too can better harness the potential of our programming craft.
But don't take my word for it: read what 42 companies using Rust in production have to say.
]]>