Gregory Szorc's Digital Home

The Mercurial Revlog

February 05, 2014 at 04:26 PM | categories: Mercurial, Mozilla

Mercurial stores lots of its data in append-only, intended-to-be-immutable data structures called revlogs. Each revlog is backed by a file on the filesystem. The main component of a revlog is a revision. When a new revision arrives, its content is compared against the previous revision in the file and a zlib-compressed delta/diff is generated. If the delta is smaller than the entry itself, the delta is appended to the revlog. If the compressed delta itself or the stream of deltas that must be applied to recreate the original text is larger than the source text, the compressed full content is appended to the revlog. In other words, Mercurial tries to maintain a balance between file size/growth rate and read times in terms of both I/O and CPU.

The changelog is a revlog that holds information about every changeset/commit in the repository. The manifest is a revlog that holds details about which files are present in every changeset. There exist a separate revlog for each file/path ever present in the repository. These are called filelogs.

When you commit a change touching a single file, Mercurial will append a revision to the changelog describing the commit, a revision to the manifest describing the set of files active in that specific commit, and a new revision to a filelog.

You can poke around your repository's revlogs by running some debug commands. For example:

# Show high-level information about the manifest revlog
$ hg debugrevlog -m

# Show details about each entry in the manifest revlog (this
# actually reads data from an index to the data revlog - pretend
# there aren't index files for now)
$ hg debugindex -m

# Dump the content of a single entry in the changelog.
$ hg debugdata -c 2

I'm generally a big fan of append-only data structures because I love the beneficial caching properties and linear I/O performance. Mercurial and the revlog can take advantage of these properties for some performance wins under key scenarios, especially server operation (pushing commits doesn't necessarily invalidate cached revlog entries, allowing the page cache to service many reads).

Edge cases and deficiencies

As with any system trying to solve a complex problem, the revlog storage format doesn't always work as well as intended.

Let's start by taking by looking at how revlogs do delta storage. I said earlier that revlogs try to store a compressed delta against the previous entry. Well, that previous entry is the previous physical revision in the file/revlog, not the parent revision. You see, entries/revisions in revlogs have a numeric index (0, 1, 2, 3, ...), a SHA-1 hash of their content, and a link to the logical parent of the entry. e.g. say you have two commits in your repo, one made after the other:

0:8ba995b74e18
1:9b2a99adc05e

The revlog has entries at indexes 0 and 1. They are also accessible by their hash/node values of 8ba995b74e18 and 9b2a99adc05e. The parent of 1 is 0. The entry for 0 is the full, compressed content of 0. The entry for 1 is likely a compressed delta from 0 to 1. As we commit, we keep appending new entries. These exist as compressed deltas until the sizes of 0..n is greater than n by itself. At that time, a compressed version of n is stored.

This model works great for linear histories, as changes from n to n+1 are usually small. Mercurial can store a very small delta for each entry. The world is good.

This model works great when entries in revlogs are small. By having small entries (and small changes), the number of deltas required to eclipse the size of a single entry remains small, in effect keeping the length of a delta chain in check. Limiting the length of delta chains is good because it keeps the cost of looking up a single revision's content low. If you have a delta chain of 1000, for example, Mercurial will need to read 1000 revlog entries and apply 1000 deltas to obtain the original value. That can get expensive computationally. Since reads are linear, I/O should remain in check. But you do need to scan a lot of bytes to read in all the deltas and you need to perform a lot of the same computations (such as zlib decompression).

Let's talk a little about where the revlog starts to break down.

First, there's the issue of multiple, interleaved branches in the revlog. Say we have a repository with many branches/heads. We alternate committing between all the heads. This can play havoc with the default revlog delta compression. Since revlogs compress against the previous physical entry in the revlog, if there is lots of alternating between branches and the contents of those branches diverges significantly, the deltas can grow quite large and full revisions will be stored with higher frequency. This means more storage space and more CPU tasked with resolving deltas (since deltas are larger). Although, CPU is kept in check since delta chains tend to be smaller since full revisions are stored with higher frequency.

Second, we have an issue with delta chain explosion of large entries with small turnover. If your base content is large and it isn't changing that much, it will take hundreds or even thousands of revisions before the sum of the delta sizes outgrows that of an entry. This means delta chains can be very long and Mercurial will have to spend a lot of CPU to resolve a single entry.

Third, revlogs are using zlib for compression. As many benchmarks have shown, zlib isn't the fastest or most efficient compression algorithm in the world. It's likely justified as a reasonable default. But alternatives exist. The choice of zlib has implications, especially when other factors (such as excessively long delta chains) come into play.

Let's talk about mitigation strategies.

True parent deltas

While I wasn't around for the original design decisions of revlogs, I'm guessing they were strongly influenced by the fact that sequential I/O on magnetic disks is much, much faster than random I/O. With SSDs and flash storage growing in popularity - a medium that offers random I/O commonly over 100 MB/s - this buys us the luxury of asking do revlogs need to be optimized for sequential I/O and how can revlogs change to take advantage of fast random I/O.

One of the ways revlogs can adapt to fast random I/O is to store the delta against the logical parent, not the physical. The delta chain will thus skip revisions in the revlog. e.g. the chain could be 1, 2, 5, 6, 10, not n, n+1, n+2, .... This would keep deltas small and would reduce the overall size of the revlog. Although, it would likely increase the length of delta chains, especially when dealing with non-linear histories.

It turns out Mercurial has a setting that enables this - format.generaldelta. To create a repository with this enabled, run:

$ hg --config format.generaldelta=true init path/to/repo

The revlogs in that repository with now have deltas computed against the logical parent!

To verify you are using general delta, look for generaldelta in the .hg/requires file. If it isn't there, you probably don't have generaldelta enabled. Please note that cloning a generaldelta repo won't necessarily give your repo generaldelta. You need to have the custom config option set on the client or the client will likely use the defaults.

On repositories with lots of interleaved heads, this can make a huge difference. As a pathalogical example, Mozilla's Try repository (where people push heads to trigger test/automation runs) has over 22,000 heads. The on-disk size of the repository is 3117 MB with the standard settings and 2139 MB with generaldelta! The bulk of this difference comes from the manifest revlog, which is 954 MB smaller with generaldelta.

Alternate compression format

Mercurial uses zlib for revlog compression by default. This is a safe choice. It's relatively fast and yields relatively good compression.

Since Mercurial is highly extensible, it's possible to plug in a custom compression format for revlogs. Facebook's lz4revlog extension will use lz4 for compression. lz4 is faster than zlib, but compression isn't as good. Repositories with lz4 are commonly ~1.5x larger. But CPU bound tasks can be significantly faster.

In theory, any compression format can be used. However, it's not trivially selectable in Mercurial (yet), so someone will need to provide an extension that implements your desired compression format.

Alternate storage backends

In theory, Mercurial revlogs can be backed by anything. This is the extensibility of Mercurial at work. There's just a Mercurial extension sitting between using SQL, S3, Cassandra, or any other storage backend for revlogs.

It's also possible to write custom revlog implementations that change the file layout for interesting scaling possibilities. For example, modern filesystems like ZFS and btrfs support block-level deduplication and transparent compression. If you had block-aligned revlog entries with deduplication enabled, servers could in theory only store each revision at most once. Another idea is to let the filesystem handle the compression. This would cause compression to occur in kernel space (rather than inside Python userspace). This may have beneficial performance properties. It may not. It may depend on the repository. These would be interesting experiments to conduct!

Caveat with alternate revlog implementations

Before you go experimenting with alternative revlog implementations, be forewarned that wall time performance for push and pull operations may suffer!

Currently, Mercurial isn't as smart as it could be when it comes to transferring bundles of changeset data between repositories. Let me explain.

When Mercurial transfers changeset data between repositories, it often uses an encoding format called a bundle. A bundle is effectively a custom archive format for revlog data. If you've ever used hg bundle or hg unbundle to transfer repository data, you've explicitly interacted with bundles. But what's lesser known is that the hg push and hg pull operations also transfer bundles. The only major difference is that they are created on the fly and their existence is hidden from view.

In theory, bundles can be encoded a number of different ways. The most common tunable is the compression format. Over the wire, bundles are zlib (or even uncompressed). If you run hg bundle, you'll likely produce a bzip2 bundle. (This is why hg unbundle can be slower than hg clone - the CPU time spent for bzip2 is much greater than for zlib.) But a problem with bundles as they exist today is that the format is rather static when it comes to what's transferred over the wire. Unless you've mucked about with settings, the client and server will send a zlib-compressed bundle using the physical revlog order. In other words, the bundle format is the Mercurial defaults.

If Mercurial were completely dumb, transferring bundles would involve 1) determining the full text of a revlog entry 2) compressing that entry into a bundle 3) sending that data to a peer 4) decompressing that entry 5) appending that entry to the appropriate revlog (which involves recompression). Fortunately, Mercurial is a bit smarter than that: Mercurial can detect when compressed bits in a bundle will match what is on disk and will avoid the unncessary compression operations.

Anyway, the current bundle transfer mechanism falls apart when there is a mismatch between client and server configurations or even when the server or client is using non-defaults. You can think of this as a revlog impedance mismatch. Essentially, when the client and server are operating with different or uncommon types of revlogs, Mercurial tends to default to the lowest common denominator, or zlib deltas against the physical parent. If, for example, a client wants to use generaldelta with a server employing defaults, the client will have to convert the server's deltas into generaldelta deltas. This requires a non-trivial amount of CPU and pull operations become slower. I believe the opposite is also true: generaldelta clients will emit generaldelta bundles, causing the server to recompute the deltas.

When it comes to custom revlog formats today, essentially nobody wins.

The good news is this will be fixed. There is an effort to improve the bundle format and wire protocols to make matters better. A little protocol negotiation can go a long way to make the situation a lot better. That said, there is still the underlying problem that some clients may want settings that differ from the server's. e.g. clients with SSDs likely want generaldelta because they don't need sequential I/O and SSD space is more expensive, so the smaller repository sizes achieved with generaldelta are appreciated. But a server operator may not want to force generaldelta on all clients because it would make clients on mechanical hard drives slower! The point is revlog impedance mismatch will occur and someone needs to spend the CPU cycles to rectify the matter. I suspect this will be pushed to clients since distributed CPU load is easier to deal with than centralized on the server. But, I wouldn't be surprised if Mercurial allowed server operators to configure the behavior. It's a hard problem. Time will tell.

In the mean time, just know that if you experiment with custom revlog settings, push and pull operations will likely be slower. You may not notice this on day-to-day operations. But on things like initial clone, you could experience a massive slow-down.

Here's some timings with mozilla-central with a Mercurial 1.9 server and client on the same SSD-backed multi-core machine:

clone default settings - 4:25
clone --uncompressed - 0:49
clone from generaldelta - 14:11
clone from generaldelta to generaldelta - 16:49
clone from generaldelta --uncompressed - 0:50
pull 2250 changesets, default from default - 6.7s
pull 2250 changesets, default from generaldelta - 17.9s
pull 2250 changesets, generaldelta from default - 28.5s
pull 2250 changesets, generaldelta from generaldelta - 20.0s

As you can see, anything involving compression and generaldelta is much slower. Once bundle format 2 is fully implemented, I expect the situation to improve. Until then, know that you'll get a ~2.5-4x slowdown from using generaldelta. You'll have to measure other revlog formats for their impact.

In conclusion, Mercurial's revlog file format is an interesting and tunable data structure. It works pretty well for most repositories. But if you are the 1%, you might want to spend some time to investigate changing its default configuration.

Review Board at Mozilla

January 27, 2014 at 04:30 PM | categories: Mozilla, mach

Some Mozillians recently stood up an instance of Review Board - a web-based code review tool for evaluation purposes. Having used Review Board at a previous company, I can say with high confidence that when properly configured, it beats the pants off Splinter (the code review interface baked into Bugzilla). Here are some advantages:

The HTML interface is more pleasant on the eyes (subjective).
Interdiffs actually work.
Intra-line diffs are rendered.
You can open issues for individual review comments and these issues can be tracked during subsequent reviews (Bugzilla doesn't really have anything similar and review comments tend to get lost unless the reviewer is sharp).
It integrates with the VCS, so you can expand code context from the review interface.
There are buttons to toggle whitespace differences.
Syntax hightlighting! It even recognizes things like TODO in comments.

You can read more from the official user guide.

If you have any interest in evaluating Review Board, the easiest way to upload patches to Mozilla's instance is to run mach rbt.

mach rbt will launch the Review Board tools command-line client (called RBTools). From there, you can do a number of things. Run mach rbt help to see the full list.

Here are some examples:

# See a diff that would be uploaded to Review Board:
$ mach rbt diff

# Create a review request based on the current Mercurial changeset:
$ mach rbt post

# That should print out a URL to the not-yet-published review
# request. If you go to that URL, you'll notice that the fields
# in that request are all empty.

# Next time, you can have some fields auto-populate by running:
$ mach rbt post --guess-summary --guess-description

# This grabs info from the commit message.

# To update an existing review request (e.g. to submit a new patch):
$ mach rbt post -r <review id>

# (where <review ID> is the ID of the review).

# You can also have it generate a "squashed" patch from multiple
# commits:
$ mach rbt post 164123::164125

Run mach rbt help post for more options. Also see the RBTools documentation for more.

It's worth noting that mach rbt will download an unreleased version of RBTools. This is because the released version doesn't work well with Mercurial. I contributed a handful of patches to RBTools to make Mercurial work better.

Before you dive in and start using Review Board for actual code review, there are some things you need to know:

Mozilla's Review Board instance does not yet send emails on changes. Bug 958236 tracks this. When it works, you'll see nice emails, just like you do for Bugzilla reviews.
Review Board doesn't currently interact with Bugzilla that well. In theory, we could have Review Board update corresponding Bugzilla bugs when actions are performed. Someone just needs to write this code and get it deployed.
If you create a Bugzilla attachment that contains the URL of a Review Board review (e.g. https://reviewboard.allizom.org/r/23/), Bugzilla will automatically set the MIME type as a Review Board review and set up an HTML redirect when the attachment is downloaded via the browser. You can even set r? on this attachment to have Bugzilla nag about reviews. See bug 875562 for an example.
There is currently no way to upload a patch to Review Board and update Bugzilla is one go. I have proof-of-concept code for this. Although, there is pushback on getting that checked in.
Review Board 2 is in development. It has a number of new and exciting features. And it looks better.

Finally and most importantly, Review Board at Mozilla is still in evaluation mode. It's usage has not been officially blessed as far as I know. I don't believe the SLA is as high as other services (like Bugzilla). Nobody is really using it yet. It still needs a lot of polish and integration for it to realize its potential. And, there is some talk about the future of code review at Mozilla that may or may not involve Review Board. In short, the future of Review Board at Mozilla is uncertain. I wouldn't rely on it to archive review comments from super important reviews/decisions.

Despite the shortcomings, I encourage people to play around with Review Board. If nothing else, at least gaze upon it's patch rendering beauty and witness what the future could hold.

Aggregating Version Control Info at Mozilla

January 21, 2014 at 10:50 AM | categories: Git, Mercurial, Mozilla, Python

Over the winter break, I set out on an ambitious project to create a service to help developers and others manage the flury of patches going into Firefox. While the project is far from complete, I'm ready to unleash the first part of the project upon the world.

If you point your browsers to moztree.gregoryszorc.com, you'll hopefully see some documentation about what I've built. Source code is available and free, of course. Patches very welcome.

Essentially, I built a centralized indexing service for version control repositories with Mozilla's extra metadata thrown in. I tell it what repositories to mirror, and it clones everything, fetches data such as the pushlog and Git SHA-1 mappings, and stores everything in a central database. It then exposes this aggregated data through world-readable web services.

Currently, I have the service indexing the popular project branches for Firefox (central, aurora, beta, release, esr, b2g, inbound, fx-team, try, etc). You can view the full list via the web service. As a bonus, I'm also serving these repositories via hg.gregoryszorc.com. My server appears to be significantly faster than hg.mozilla.org. If you want to use it for your daily needs, go for it. I make no SLA guarantees, however.

I'm also using this service as an opportunity to experiment with alternate forms of Mercurial hosting. I have mirrors of mozilla-central and the try repository with generaldelta and lz4 compression enabled. I may blog about what those are eventually. The teaser is that they can make Mercurial perform a lot faster under some conditions. I'm also using ZFS under the hood to manage repositories. Each repository is a ZFS filesystem. This means I can create repository copies on the server (user repositories anyone?) at a nearly free cost. Contrast this to the traditional method of full clones, which take lots of time, memory, CPU, and storage.

Anyway, some things you can do with the existing web service:

Obtain metadata about Mercurial changesets. Example.
Look up metadata about Git commits. Example.
Obtain a SPORE descriptor describing the web service endpoints. This allows you to auto-generate clients from descriptors. Yay!

Obviously, that's not a lot. But adding new endpoints is relatively straightforward. See the source. It's literally as easy as defining a URL mapping and writing a database query.

The performance is also not the best. I just haven't put in the effort to tune things yet. All of the querying hits the database, not Mercurial. So, making things faster should merely be a matter of database and hosting optimization. Patches welcome!

Some ideas that I haven't had time to implement yet:

Return changests in a specific repository
Return recently pushed changesets
Return pushes for a given user
Return commits for a given author
Return commits referencing a given bug
Obtain TBPL URLs for pushes with changeset
Integrate bugzilla metadata

Once those are in place, I foresee this service powering a number of dashboards. Patches welcome.

Again, this service is only the tip of what's possible. There's a lot that could be built on this service. I have ideas. Others have ideas.

The project includes a Vagrant file and Puppet manifests for provisioning the server. It's a one-liner to get a development environment up and running. It should be really easy to contribute to this project. Patches welcome.

Things Mozilla Could Do with Mercurial

January 17, 2014 at 03:00 PM | categories: Mercurial, Mozilla

As I've written before, Mercurial is a highly extensible version control system. You can do things with Mercurial you can't do in other version control systems.

In this post, I'll outline some of the cool things Mozilla could do with Mercurial. But first, I want to outline some features of Mercurial that many don't know exist.

pushkey and listkeys command

The Mercurial wire protocol (how two Mercurial peer repositories talk to each other over a network) contains two very useful commands: pushkey and listkeys. These commands allow the storage and listing of arbitrary key-value pair metadata in the repository.

This generic storage mechanism is how Mercurial stores and synchronizes bookmarks and phases information, for example.

By implementing a Mercurial extension, you can have Mercurial store key-value data for any arbitrary data namespace. You can then write a simple extension that synchronizes this data as part of the push and pull operations.

Extending the wire protocol

For cases where you want to transmit arbitrary data to/from Mercurial servers and where the pushkey framework isn't robust enough, it's possible to implement custom commands in the Mercurial wire protocol.

A server installs an extension making the commands available. A client installs an extension knowing how to use the commands. Arbitrary data is transferred or custom actions are performed.

When it comes to custom commands, the sky is really the limit. You could do pretty much anything from transfer extra data types (this is how the largefiles extension works) to writing commands that interact with remote agents.

Custom revision set queries and templating

Mercurial offers a rich framework for querying repository data and for formatting data. The querying is called revision sets and the later templates. If you are unfamiliar with the feature, I encourage you to run hg help revset and hg help templates right now to discover the awesomeness.

As I've demonstrated, you can do some very nifty things with custom revision sets and templating!

The possibilities

Now that you know some ways Mercurial can be extended, let's talk about some cool use cases at Mozilla. I want to be clear that I'm not advocating we do these things, just that they are possible and maybe they are a little cool.

Storing pushlog data

Mozilla records information about who pushed what changesets where and when in what's called the pushlog. The pushlog data is currently stored in a SQLite database inside the repository on the server. The data is made available via a HTTP+JSON API.

We could go a step further and make the pushlog data available via listkeys so Mercurial clients could download pushlog data with the same channel used to pull core repository data. (Currently, we have to open a new TCP connection and talk to the HTTP+JSON API.) This would make fetching of pushlog data faster, especially for clients on slow connections.

I concede this is an iterative improvement and adds little value beyond what we currently have. But if I were designing pushlog storage from scratch, this is how I'd do it.

Storing a changeset's automation results

The pushkey framework could be used to mark specific changesets as passing automation. When release automation or a sheriff determines that a changeset/push is green, they could issue an authenticated pushkey command to the Mercurial server stating such. Clients could then easily obtain a list of all changesets that are green.

Why stop there? We could also record automation failures in Mercurial as well. Depending on how complex this gets, we may outgrow pushkey and require a separate command. But that's all doable.

Anyway, clients could download automation results for a specific changeset as part of the repository data. The same extension that pulls down that data could also monkeypatch the bisection algorithm used by hg bisect to automatically skip over changesets that didn't pass automation. You'll never bisect a backed out changeset again!

If this automation data were stored on the Try repository, the autoland tool would just need to query the Mercurial repo to see which changesets are candidates for merging into mainline - there would be no need for a separate database and web service!

Marking a changeset as reviewed

Currently, Mozilla's review procedure is very patch and Bugzilla centric. But it doesn't have to be that way. (I argue it shouldn't be that way.)

Imagine a world where code review is initiated by pushing changesets to a special server, kind of like how Try magically turns pushes into automation jobs.

In this world, reviews could be initiated by issuing a pushkey or custom command to the server. This could even initiate server-side static analysis that would hold off publishing the review unless static analysis checks passed!

Granted review could be recorded by having someone issue a pushkey command to mark a changeset as reviewed. The channel to the Mercurial server is authenticated via SSH, so the user behind the current SSH key is the reviewer. The Mercurial server could store this username as part of the repository data. The autoland tool could then pull down the reviewer data and only consider changesets that have an appropriate reviewer.

It might also be possible to integrate crypto magic into this workflow so reviewers could digitally sign a changeset as reviewed. This could help with the verification of the Firefox source code that Brendan Eich recently outlined.

Like the automation data above, no separate database would be required: all data would be part of the repository. All you need to build is a Mercurial extension.

Encouraging best practices

Mozillians have written a handful of useful Mercurial extensions to help people become more productive. We have also noticed that many developers are still (unknowingly?) running old, slow, and buggy Mercurial releases. We want people to have the best experience possible. How do we do that?

One idea is to install an extension on the server that strongly recommands or even requires users follow best practices (minimal HG version, installed extensions, etc).

I have developed a proof-of-concept that does just this.

Rich querying of metadata

When you start putting more metadata into Mercurial (or at least write Mercurial extensions to aggregate this metadata), all kinds of interesting query opportunities open up. Using revsets and templates, you can do an awful lot to use Mercurial as a database of sorts to extract useful reports.

I dare say reports like John O'duinn's Monthly Infrastructure Load posts could be completely derived from Mercurial. I've demonstrated this ability previously. That's only the tip of the iceburg.

Summary

We could enable a lot of new and useful scenarios by extending Mercurial. We could accomplish this without introducing new services and tools into our already complicated infrastructure and workflows.

The possibilities I've suggested are by no means exhaustive. I encourage others to dream up new and interesting ideas. Who knows, maybe some of them may actually happen.

mach now lives in mozilla-central

January 09, 2014 at 10:55 AM | categories: Mozilla, mach

mach -- the generic command line interface framework that is behind the mach tool used to build Firefox -- now has its canonical home in mozilla-central, the canonical repository for Firefox. The previous home has been updated to reflect the change.

mach will continue to be released on PyPI and installable via pip install mach.

I made the change because keeping multiple repositories in sync wasn't something I wanted to spend time doing. Furthermore, Mozillians have been contributing a steady stream of improvements to the mach core recently and it makes sense to leverage Mozilla's familiar infrastructure for patch contribution.

This decision may be revisited in the future. Time will tell.

« Previous Page -- Next Page »