Cloning From S3

July 08, 2015 at 11:40 AM | categories: Mercurial, Mozilla

A month ago, I blogged about faster cloning from hg.mozilla.org using bundle files. But deploying the feature on the servers was only the tip of the iceberg.

At the end of last week, Firefox release automation rolled out the bundleclone extension to their Linux and OS X machines. Essentially, clones are bootstrapped from Amazon S3 automatically once this extension is installed.

In just a few days of deployment, we've already seen a drastic shift in traffic. On UTC day 2015-07-07, S3 served 1,563,014,396,236 bytes of repository data! The hg.mozilla.org servers themselves served a total of 1,976,057,201,583 bytes.

But we're not done. The bundleclone extension isn't yet deployed on Windows. Nor is it always used on TaskCluster. In addition, there are still some high-use repositories that don't have bundles being generated. Yesterday, I crunched the data and enabled bundles on more repositories. It's too early to have conclusive data, but this should move an additional several hundred gigabytes of traffic to S3.

At Whistler, I said that reducing traffic on hg.mozilla.org by 90% is within the realm of possibility. Between a partial rollout of S3 clone offload that is already serving 44% of traffic and a feature I'm working on in core Mercurial to enable auto sharing of repository data, I'd say we're well on track.


Changeset Metadata on hg.mozilla.org

June 04, 2015 at 01:55 PM | categories: Mercurial, Mozilla

Just a few minutes ago, I deployed some updates to hg.mozilla.org to display more metadata on changeset pages. See 4b69a62d1905, dc4023d54436, and b617a57d6bf1 for examples of what's shown.

We currently display:

  • More detailed pushlog info. (Before, you had to load another page to see things like the date of the push.)
  • The list of reviewers, each being a link that searches for other changesets they've reviewed.
  • A concise list of bugs referenced in the commit message.
  • Links to changesets that were backed out by this changeset.
  • On changesets that were backed out, a message noting the backout.
  • For Firefox repos, we also display the application milestone. This is the Gecko/app version recorded in the config/milestone.txt file in that changeset. The value can be used to quickly answer the question what versions of Firefox have this changeset? (See the sketch just after this list.)
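
A minimal command-line equivalent, assuming a local clone of a Firefox repo (the changeset is one of the examples above):

    # Print the milestone file as it existed at that changeset.
    $ hg cat -r 4b69a62d1905 config/milestone.txt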

If you notice any issues or have requests for new features, please file a bug.

This work is built on top of a feature I added to Mercurial 3.4 to make it easier to inject extra data into Mercurial's web templates. We just deployed Mercurial 3.4.1 to hg.mozilla.org yesterday. It's worth noting that this deployment happened in the middle of the day with no user-perceived downtime. This is a far cry from where we were a year ago, when any server change required a maintenance window. We've invested a lot of work into a test suite for this service so we can continuously deploy without fear of breaking things. Moving fast feels so good.


Faster Cloning from hg.mozilla.org With Server Provided Bundles

May 29, 2015 at 11:30 AM | categories: Mercurial, Mozilla

When you type hg clone, the Mercurial server will create a bundle from repository content at the time of the request and stream it to the client. (Git works essentially the same way.)

This approach usually just works. But there are some downsides, particularly with large repositories.

Creating bundles for large repositories is not cheap. For mozilla-central, Firefox's main repository, it takes ~280s of CPU time on my 2014 MacBook Pro to generate a bundle. Every time a client runs hg clone https://hg.mozilla.org/mozilla-central, a server somewhere is spinning a CPU core generating ~1.1 GB of data. What's more, if another clone arrives at the same time, another process will perform the exact same work! When we talk about multiple minutes of CPU time per request, this extra work starts to add up.

Another problem with large repositories is interrupted downloads. If you suffer a connectivity blip during your clone command, you'll have to start from scratch. This potentially means re-transferring hundreds of megabytes from the server. It also means the server has to generate a new bundle, consuming even more CPU time. This is not good for the user or the server.

There have been multiple outages of hg.mozilla.org as a result of the service being flooded with clone requests to large repositories. Dozens of clients (most of them in Firefox or Firefox OS release automation) have cloned the same repository around the same time and overwhelmed network bandwidth in the data center or CPU cores on the Mercurial servers.

A common solution to this problem is to not use the clone command to receive initial repository data from the server. Instead, a static bundle file is generated and made available to clients. Clients call hg init to create an empty repository, then hg unbundle to apply the contents of the pre-generated bundle file, then hg pull to fetch new data created after the bundle was generated. (It's worth noting that Git's clone --reference option is similar.)
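
A minimal sketch of that workflow, assuming a hypothetical bundle URL:

    # Server side: pre-generate a full-repository bundle and host it somewhere.
    $ hg -R mozilla-central bundle --all --type gzip mozilla-central.hg

    # Client side: seed an empty repository from the bundle, then catch up.
    $ curl -L -o /tmp/mozilla-central.hg https://example.com/bundles/mozilla-central.hg
    $ hg init mozilla-central && cd mozilla-central
    $ hg unbundle /tmp/mozilla-central.hg
    $ hg pull https://hg.mozilla.org/mozilla-central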

This is a good technical solution, and Firefox and Firefox OS release automation have effectively implemented it. However, it is a lot of work: you have to build your own bundle generation and hosting infrastructure, and you have to remember that every hg clone should probably be using bundles instead. It is extra complexity, and that complexity must be borne by every single client. If a client forgets, the consequences can be disastrous (clone flooding leading to service outage). Client-side opt-in is prone to lapses and doesn't scale.

As of today, we've deployed a more scalable, server-based solution to hg.mozilla.org.

hg.mozilla.org is now itself generating bundles for a handful of repositories, including mozilla-central, inbound, fx-team, and mozharness. These bundles are being uploaded to Amazon S3. And those bundles are being advertised by the server over Mercurial's wire protocol.

When you install the bundleclone Mercurial extension, hg clone is taught to look for bundles advertised on the server. If a bundle is available, it is downloaded and applied, and then the client does the equivalent of an hg pull to fetch all new data created after the bundle was generated. Bundles are used transparently: no client-side cooperation is needed beyond installing the extension. If no bundle exists, the client simply falls back to Mercurial's default behavior. This effectively shifts responsibility for efficient clones from clients to server operators, which means server operators don't need cooperation from clients to enact important service changes. Before, if clients weren't using bundles, we'd have to wait for them to update their code. Now, we can see that a repository is being cloned heavily and start generating bundles for it without waiting for any client-side deployment.

Furthermore, we've built primitive content negotiation into the process. The server doesn't simply advertise one bundle file: it advertises several bundle files. We offer gzip, bzip2, and stream bundles. gzip is what Mercurial uses by default. It works OK. bzip2 bundles are smaller, but they take longer to process. stream bundles are essentially tar archives of the .hg/store directory and are larger than gzip bundles, but insanely fast because there is very little CPU required to apply them. In addition, we advertise URLs for multiple S3 regions, currently us-west-2 (Oregon) and us-east-1 (Virginia). This enables clients to prefer the bundle most appropriate for them.
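
Clients can steer this negotiation from their configuration. If I recall the extension's knobs correctly (treat the exact attribute names as assumptions; the instructions are authoritative), it looks something like:

    $ cat >> ~/.hgrc << 'EOF'
    [bundleclone]
    # Hypothetical preference ordering: favor bundles hosted in us-west-2
    # and gzip compression over the other advertised options.
    prefers = ec2region=us-west-2, compression=gzip
    EOF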

A benefit of serving bundles from S3 is that Firefox and Firefox OS release automation (the biggest consumers of hg.mozilla.org) live in Amazon EC2. They are able to fetch from S3 over a gigabit network. And, since we're transferring data within the same AWS region, there are no data transfer costs. Previously, we were transferring ~1.1 GB from a Mozilla data center to EC2 for each clone. This took up bandwidth in Mozilla's network and cost Mozilla money to send data thousands of miles away. And, we never came close to saturating a gigabit network (we do with stream bundles). Wins everywhere!

The full instructions detail how to use bundleclone. I recommend everyone at Mozilla install the extension because there should be no downside to doing it.
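
For reference, setup amounts to roughly the following (the clone location and extension path are from memory, so treat them as assumptions and defer to the linked instructions):

    # Get the extension and enable it globally.
    $ hg clone https://hg.mozilla.org/hgcustom/version-control-tools ~/version-control-tools
    $ cat >> ~/.hgrc << 'EOF'
    [extensions]
    bundleclone = ~/version-control-tools/hgext/bundleclone
    EOF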

Once bundleclone is deployed to Firefox and Firefox OS release automation, we should hopefully never again see those machines bring down hg.mozilla.org due to a flood of clone requests. We should also see a drastic reduction in load to hg.mozilla.org. I'm optimistic bandwidth will decrease by over 50%!

It's worth noting that the functionality from the bundleclone extension is coming to vanilla Mercurial. The functionality (initially added by Mozilla's Mike Hommey) is part of Mercurial's bundle2 protocol, which is available but isn't enabled by default yet. bundleclone is thus a temporary solution to bring us server stability and client improvements until modern Mercurial versions are deployed everywhere in a few months' time.

Finally, I would like to credit Augie Fackler for the original idea for server-assisted bundle-based clones.


Firefox Mercurial Repository with CVS History

May 18, 2015 at 08:40 AM | categories: Mercurial, Mozilla

When Firefox made the switch from CVS to Mercurial in March 2007, the CVS history wasn't imported into Mercurial. There were good reasons for this at the time. But it's a decision that continues to have side-effects. I am surprised how often I hear of engineers wanting to access blame and commit info from commits now more than 8 years old!

When individuals created a Git mirror of the Firefox repository a few years ago, they correctly decided that importing CVS history would be a good idea. They also correctly decided to combine the logically same but physically separate release and integration repositories into a unified Git repository. These are things we can't easily do to the canonical Mercurial repository because it would break SHA-1 hashes, breaking many systems, and it would require significant changes in process, among other reasons.

While Firefox developers do have access to a single Firefox repository with full CVS history (the Git mirror), they still aren't satisfied.

Running git blame (or hg blame for that matter) can be very expensive. For this reason, the blame interface is disabled on many web-based source viewers by default. On GitHub, some blame URLs for the Firefox repository time out and cause GitHub to display an error message. No matter how hard you try, you can't easily get blame results (running a local Git HTTP/HTML interface is still difficult compared to hg serve).
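
(For contrast, standing up Mercurial's web interface, blame included, is a single command:)

    # Serve the current repository at http://localhost:8000/.
    $ hg serve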

Another reason developers aren't satisfied with the Git mirror is that Git's querying tools pale in comparison to Mercurial's. I've said it before and I'll say it again: Mercurial's revision sets and templates are incredibly useful features that enable advanced repository querying and reporting. Git's offerings come nowhere close. (I really wish Git would steal these awesome features from Mercurial.)
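
For a taste of what that enables, here's a one-liner combining a revset with a template (the query itself is arbitrary):

    # Changesets from the last 30 days touching dom/, one line each.
    $ hg log -r 'date("-30") and file("dom/**")' \
        --template '{node|short} {author|person}: {desc|firstline}\n'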

Anyway, enough people were complaining about the lack of a Mercurial Firefox repository with full CVS history that I decided to create one. If you point your browsers or Mercurial clients to https://hg.mozilla.org/users/gszorc_mozilla.com/gecko-full, you'll be able to access it.

The process used for the conversion was the simplest possible: I used hg-git to convert the Git mirror back to Mercurial.
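
In other words, something along these lines (the source URL is illustrative, pointing at the public Git mirror):

    # With hg-git enabled in ~/.hgrc ([extensions] hggit =), Mercurial can
    # clone directly from a Git repository, converting commits as it goes.
    $ hg clone git+https://github.com/mozilla/gecko-dev.git gecko-full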

Unlike the Git mirror, I didn't include all heads in this new repository. Instead, there is only mozilla-central's head (the current development tip). If I were doing this properly, I'd include all heads, like gecko-aggregate.

I'm well aware there are oddities in the Git mirror and they now exist in this new repository as well. My goal for this conversion was to deliver something: it wasn't a goal to deliver the most correct result possible.

At this time, this repository should be considered an unstable science experiment. By no means should you rely on this repository. But if you find it useful, I'd appreciate hearing about it. If enough people ask, we could probably make this more official.


Notes from Git Merge 2015

May 12, 2015 at 03:40 PM | categories: Git, Mercurial, Mozilla

Git Merge 2015 was a Git user conference held in Paris on April 8 and 9, 2015.

I'm kind of a version control geek. I'm also responsible for a large part of Mozilla's version control hosting. So, when the videos were made public, you can bet I took interest.

This post contains my notes from a few of the Git Merge talks. I try to separate content/facts from my opinions by isolating my opinions (within parentheses).

Git at Google

Git at Google: Making Big Projects (and everyone else) Happy is from a Googler (Dave Borowitz) who works on JGit for the Git infrastructure team at Google.

"Everybody in this room is going to feel some kind of pain working with Git at scale at some time in their career."

First Git usage at Google in 2008, for Android. googlesource.com launched in 2011.

24,000 total Git repos at Google. 77.1M requests/day. 30-40 TB/day. 2-3 Gbps.

Largest repo is 210 GB (not public, it appears).

800 repos in AOSP. Google maintains internal fork of all Android repos (so they can throw stuff over the wall). Fresh AOSP tree is 17 GiB. Lots of contracts dictating access.

Chrome repos include Chromium, Blink, Chromium OS. Performed giant Subversion migration. Developers of Chrome very set in their ways. Had many workflows integrated with Subversion web interface. Subversion blame was fast, Git blame slow. Built caching backend for Git blame to make developers happy.

Chromium 2.9 GiB, 3.6M objects, 390k commits. Blink 5.3 GiB, 3.1M objects, 177k commits. They merged them into a monorepo. Mention of Facebook's monorepo talk and Mercurial scaling efforts for a repo larger than the Chromium/Blink monorepo. Benefits to developers for doing atomic refactorings, etc.

"Being big is hard."

AOSP: 1 Gbps -> 2 minutes for 17 GiB. 20 Mbps -> 3 hours. Flaky internet combined with non-resumable clone results in badness. Delta resolution can take a while. Checkout of hundreds of thousands of files can be slow, especially on Windows.

"As tool developers... it's hard to tell people don't check in large binaries, do things this way, ... when all they are trying to do is get their job done." (I couldn't agree more: tools should ideally not impose sub-optimal workflows.)

They prefer scaling pain to supporting multiple tools. (I think this meant don't use multiple VCSs if you can just make one scale.)

Shallow clone beneficial. But some commands don't work. log not very useful.
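
(For reference, a shallow clone truncates history at a given depth, which is why history-walking commands suffer:)

    # Fetch only the most recent commit of each ref.
    $ git clone --depth 1 https://android.googlesource.com/platform/manifest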

Narrow clones mentioned. Apparently talked about elsewhere at Git Merge not captured on video. Non-trivial problem for Git. "We have no idea when this is going to happen."

Split repos until narrow clone is available. Google wrote repo to manage multiple repos. They view repo and multiple repos as stop-gap until narrow clone is implemented.
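
(The basic repo workflow, using the public AOSP manifest:)

    # Initialize a client from a manifest describing many Git repos, then
    # fetch them all.
    $ repo init -u https://android.googlesource.com/platform/manifest
    $ repo sync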

git submodule needs some love. Git commands don't handle submodules or multiple repos very well. They'd like to see repo features incorporated into git submodule.

Transition to server operation.

Pre-2.0, counting objects was hard. For Linux kernel, 60s 100% CPU time per clone to count objects. "Linux isn't even that big."

Traditionally Git is single homed. Load from users. Load from automation.

Told anecdote about how Google's automation once recloned the repo after a failed Git command. Accidentally made a change one day that caused a command to persistently fail. DoS against server. (We've had this at Mozilla.)

Garbage collection on server is CPU intensive and necessary. Takes cores away from clients.

Reachability bitmaps implemented in JGit, ported to Git 2.0. Counting objects for Linux clones went from 60s CPU to ~100ms.
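
(Bitmaps are written at repack time; on a modern Git server, enabling them looks roughly like this:)

    # Write a reachability bitmap index alongside the pack on the next repack.
    $ git config repack.writeBitmaps true
    $ git repack -a -d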

Google puts static, pre-generated bundles on a CDN. Client downloads bundle then does incremental fetch. Massive reduction in server load. Bundle files better for users. Resumable.
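
(Stock Git can approximate this pattern with bundle files; the URLs below are hypothetical:)

    # Server side: snapshot the repository into a single bundle file.
    $ git bundle create repo.bundle --all

    # Client side: clone from the CDN-hosted bundle, then repoint at the
    # live server and fetch everything newer than the snapshot.
    $ curl -L -o repo.bundle https://cdn.example.com/repo.bundle
    $ git clone repo.bundle myrepo && cd myrepo
    $ git remote set-url origin https://git.example.com/repo.git
    $ git fetch origin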

They have ideas for integrating bundles into git fetch, but it's "a little ways off." (This feature is partially implemented in Mercurial 3.4 and we have plans for using it at Mozilla.) It's a feature in repo today.

Shared filesystem would be really nice to spread CPU load. NFS "works." Performance problems with high throughput repositories.

Master-mirror replication can help. Problems with replication lag. Consistency is hard.

Google uses a distributed Git store using Bigtable and GFS built on JGit. Git-aware load balancer. Completely separate pool of garbage collection workers. They do replication to multiple datacenters before pushes. 6 datacenters across world. Some of their stuff is open source. A lot isn't.

Humans have difficulty managing hundreds of repositories. "How do you as a human know what you need to modify?" Monorepos have this problem too. Inherent with lots of content. (Seemed to imply it is worse with multiple repos than with a monorepo.)

Porting changes between forks is hard. e.g. cherry picking between internal and external Android repos.

ACLs are a mess.

Google built Gerrit code review. It does ACLs, auto rebasing, release branch management. It's a swiss army knife. (This aligns with my vision for MozReview and code-centric development.)

Scaling Git at Twitter

Wilhelm Bierbaum from Twitter talks about Scaling Git at Twitter.

"We've decided it's really important to make Twitter a good place to work for developers. Source control is one of those points where we were lacking. We had some performance problems with Git in the past."

Twitter runs a monorepo. Used to be 3 repos. "Working with a single repository is the way they prefer to develop software when developing hundreds of services." They also have a single build system. They have a geo diverse workforce.

They use normal canonical implementation of Git + some optimizations.

Benefits of a monorepo:

Visibility. Easier for people to find code in one repo. Code search tools tailored towards single repos.

Single toolchain. single set of tools to build, test, and deploy. When improvements to tools made, everyone benefits because one toolchain.

Easy to share code (particularly generated code like IDL). When operating many small services, services developed together. Code duplication is minimized. Twitter relies on IDL heavily.

Simpler to predict the impact of your changes. Easier to look at a single code base than to understand how multiple code bases interact. Easy to make a change and see what breaks, rather than submit changes to N repos and do testing in N repos.

Makes refactoring less arduous.

Surfaces architecture issues earlier.

Breaking changes are easier to coordinate.

Drawbacks of monorepos:

Large disk footprint for full history.

Tuning filesystem only goes so far.

Forces some organizations to adopt sparse checkouts and shallow clones.

Submodules aren't good enough to use. add and commit don't recognize submodule boundaries very well and aren't very usable.

"To us, using a tool such as repo that is in effect a secondary version control tool on top of Git does not feel right and doesn't lead to a fluid experience."

Twitter has centralized use of Git. Don't really benefit from distributed version control system. Feature branches. Goal is to live as close to master as possible. Long-running branches discouraged. Fewer conflicts to resolve.

They have project-level ownership system. But any developer can change anything.

They have lots of read-only replicas. Highly available writable server.

They use reference repos heavily so object exchange overhead is manageable.

Scaling issues with many refs. Partially due to how refs are stored on disk. File locking limits in OS. Commands like status, add, and commit can be slow, even with repo garbage collected and packed. Locking issues with garbage collection.

Incorporated file alteration monitor to make status faster. Reference to Facebook's work on watchman and its Mercurial integration. Significant impact on OS X. "Pretty much all our developers use OS X." (I assume they are using Watchman with Git - I've seen patches for this on the Git mailing list but they haven't been merged into core yet.)

They implemented a custom index format. Adopted faster hashing algorithm that uses native instructions.

Discovery with many refs didn't scale. 13 MB of raw data for refs exchange at Twitter. (!!) Experimenting with clients sending a bloom filter of refs. Hacked it together via HTTP header.

Fetch protocol is expensive. Lots of random I/O. Can take minutes to do incremental fetches. Bitmap indices help, but aren't good enough for them. Since they have a central and well-defined environment, they changed the fetch protocol to work like a journal: send all changed data since the client's last fetch. Server work is essentially a sendfile system call. git push appends push packs to a log-structured journal. On fetch, clients "replay" the transactions from the journal. Similar to MySQL binary log replication. (This is very similar to some of the Mercurial replication work I'm doing at Mozilla. Yay technical validation.) (Append-only files are also how Mercurial's storage model works by default.)

Log-structured data exchange means server side is cheap. They can insert HTTP caches to handle Range header aware requests.

Without this hack, they can't scale their servers.

Initial clone is seeded by BitTorrent.

It sounds like everyone's unmerged feature branches are on the one central repo and get transferred everywhere by default. Their journal fetch can selectively fetch refs so this isn't a huge problem.

They'd like to experiment with sparse repos. But they haven't gotten to that yet. They'd like a better storage abstraction in Git to enable necessary future scalability. They want a network-aware storage backend. Local objects not necessary if the network has them.

They are running a custom Git distribution/fork on clients. But they don't want to maintain forever. Prefer to send upstream.

