Python Bindings Updates in Clang 3.1
May 14, 2012 at 12:05 AM | categories: Python, Clang, compilers
Clang 3.1 is scheduled to be released any hour now. And, I'm proud to say that I've contributed to it! Specifically, I've contributed improvements to the Python bindings, which are an interface to libclang, the C interface to Clang.
Since 3.1 is being released today, I wanted to share some of the new features in this release. An exhaustive list of newly supported APIs is available in the release notes.
Diagnostic Metadata
Diagnostics are how Clang represents warnings and errors during compilation. The Python bindings now allow you to get at more metadata. Of particular interest is Diagnostic.option. This property allows you to see the compiler flag that triggered the diagnostic. Or, you could query Diagnostic.disable_option for the compiler flag that would silence this diagnostic.
These might be useful if you are analyzing diagnostics produced by the compiler. For example, you could parse source code using the Python bindings and collect aggregate information on all the diagnostics encountered.
Here is an example:
from clang.cindex import Index

index = Index.create()
tu = index.parse('hello.c')

for diag in tu.diagnostics:
    print diag.severity
    print diag.location
    print diag.spelling
    print diag.option
Or, if you are using the Python bindings from trunk:
from clang.cindex import TranslationUnit
tu = TranslationUnit.from_source('hello.c')
...
Sadly, the patch that enabled this simpler usage did not make the 3.1 branch.
Finding Entities from Source Location
Two new APIs, SourceLocation.from_position and Cursor.from_location, allow you to easily extract a cursor in the AST from any arbitrary point in a file.
Say you want to find the element in the AST that occupies column 6 of line #10 in the file foo.c:
from clang.cindex import Cursor
from clang.cindex import File
from clang.cindex import Index
from clang.cindex import SourceLocation

index = Index.create()
tu = index.parse('foo.c')

f = File.from_name(tu, 'foo.c')
location = SourceLocation.from_position(tu, f, 10, 6)
cursor = Cursor.from_location(tu, location)
Of course, you could do this by iterating over cursors in the AST until one with the desired source range is found. But, that would involve more API calls.
I'll admit these APIs feel clunky to me. There is a lot of redundancy in there. In my opinion, there should just be a TranslationUnit.get_cursor(file='foo.c', line=10, column=6) that does the right thing. Maybe that will make it into a future release. Maybe it won't. After all, the Python bindings are really a thin wrapper around the C API, and an argument can be made that there should be minimal extra logic and complexity in the Python bindings. Time will tell.
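In the meantime, nothing stops you from writing such a helper yourself. Here is a minimal sketch built from the same three API calls used above; get_cursor_at is a hypothetical name of my own invention, not something in the bindings:

from clang.cindex import Cursor
from clang.cindex import File
from clang.cindex import SourceLocation

def get_cursor_at(tu, filename, line, column):
    # Hypothetical convenience wrapper: resolve a (file, line, column)
    # triple directly to the Cursor occupying that location.
    f = File.from_name(tu, filename)
    location = SourceLocation.from_position(tu, f, line, column)
    return Cursor.from_location(tu, location)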
Type Metadata
It is now possible to access more metadata on Type instances. For example, you can:
- See what the elements of an array are using Type.get_array_element_type
- See how many elements are in a static array using Type.get_array_element_count
- Determine if a function is variadic using Type.is_function_variadic
- Inspect the Types of function arguments using Type.argument_types
In this example, I will show how to iterate over all the functions declared in a file and inspect their arguments.
from clang.cindex import CursorKind
from clang.cindex import Index
from clang.cindex import TypeKind

index = Index.create()
tu = index.parse('hello.c')

for cursor in tu.cursor.get_children():
    # Ignore AST elements not from the main source file (e.g.
    # from included files).
    if not cursor.location.file or cursor.location.file.name != 'hello.c':
        continue

    # Ignore AST elements that are not function declarations.
    if cursor.kind != CursorKind.FUNCTION_DECL:
        continue

    # Obtain the return Type for this function.
    result_type = cursor.type.get_result()

    print 'Function: %s' % cursor.spelling
    print '  Return type: %s' % result_type.kind.spelling
    print '  Arguments:'

    # Function has no arguments.
    if cursor.type.kind == TypeKind.FUNCTIONNOPROTO:
        print '    None'
        continue

    # As noted in the list above, argument Types are exposed on the
    # Type instance, not the Cursor.
    for arg_type in cursor.type.argument_types():
        print '    %s' % arg_type.kind.spelling
This example is overly simplified. A more robust solution would also inspect the Type instances to see if they are constants, check for pointers, check for variadic functions, etc.
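As a starting point, here is a rough sketch of what deeper Type inspection could look like. I'm assuming the Type.is_const_qualified and Type.get_pointee bindings are available in your build; consult the release notes if unsure:

from clang.cindex import TypeKind

def describe_type(t):
    # Sketch: render a Type as a string, noting const qualifiers and
    # recursing through pointer types.
    name = t.kind.spelling
    if t.is_const_qualified():
        name = 'const ' + name
    if t.kind == TypeKind.POINTER:
        name += ' to ' + describe_type(t.get_pointee())
    return name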
An example application of these APIs is a tool that automatically generates ctypes or similar FFI bindings. Many of these tools today use custom parsers. Why invent a custom (and likely complex) parser when you can call out to Clang and have it do all the heavy lifting for you?
Future Features
As I write this, there are already a handful of Python binding features checked into Clang's SVN trunk that were made after the 3.1 branch was cut. And, I'm actively working at integrating many more.
Still to come to the Python bindings are:
- Better memory management support (currently, not all parent references are retained, so it is possible for the garbage collector to dispose of underlying objects that should still be alive).
- Support for token API (lexer output)
- More complete coverage of Cursor and Type APIs
- More friendly APIs
I have a personal goal for the Python bindings to cover 100% of the functionality in libclang. My work towards that goal is captured in my python features branch on GitHub. I periodically clean up a patch, submit it for review, apply feedback, and commit. That branch is highly volatile and I do rebase. You have been warned.
Furthermore, I would like to add additional functionality to libclang [and expose it to Python]. For example, I would love for libclang to support code generation (i.e. compiling), not just parsing. This would enable all kinds of nifty scenarios (like channeling your build system's compiler calls through a proxy which siphons off metadata such as diagnostics).
Credits and Getting Involved
I'm not alone in my effort to improve Clang's Python bindings. Anders Waldenborg has landed a number of patches to add functionality and tests. He has also been actively reviewing patches and implementing official LLVM Python bindings! On the reviewing front, Manuel Klimek has been invaluable. I've lost track of how many bugs he's caught and good suggestions he's made. Tobias Grosser and Chandler Carruth have also reviewed their fair share of patches and handled community contributions.
If you are interested in contributing to the Python bindings, we could use your help! You can find me in #llvm as IndyGreg. If I'm not around, the LLVM community is generally pretty helpful, so I'm sure you'll get an answer. If you prefer email, send it to the cfe-dev list.
If you have any questions, leave them in the comments or ask using one of the methods above.
Better Sharing of Test Code in Mozilla Projects
May 10, 2012 at 10:35 AM | categories: Mozilla, Firefox, testing
Just landed in mozilla-inbound (Firefox's integration tree) is support for test-only JavaScript modules. That is, JavaScript modules that are utilized by just test code. This is being tracked in bug 748490.
The use case for this feature is sharing common test code, mock types, etc between tests. For example, in the Services code, we have a number of mock types (like a JS implementation of the Sync HTTP server) that need to be utilized across sub-modules. With test-only modules, it is now possible to publish these modules to a common location and import them using the familiar Cu.import() syntax. Previously, you had to perform the equivalent of a #include (possibly by utilizing the [head] section of xpcshell.ini files). The previous method of importing is dirty because you pollute the global object. Furthermore, it is really inconvenient when you wish to utilize shared files from different directories. See this file for an example.
The new method of publishing and consuming test-only JavaScript modules is clean and simple. From your Makefile, define TESTING_JS_MODULES to a list of (JavaScript) files to publish. Optionally, define TESTING_JS_MODULE_DIR to the relative path they should be published to. If the directory variable is not defined, they will be published to the root directory. Here is an example Makefile.in:
DEPTH = ../..
topsrcdir = @top_srcdir@
srcdir = @srcdir@
include $(DEPTH)/config/autoconf.mk
TESTING_JS_MODULES = mockserver.js common.js
TESTING_JS_MODULE_DIR = foobar
All test modules are installed to a common directory somewhere in the object directory. Exactly where is not relevant. Just know it is outside the normal distribution directory, so the test modules aren't packaged. This common directory is registered with the resource manager under resource://testing-common/. So, once a build is performed, you can import these files via Components.utils.import():
Cu.import("resource://testing-common/foobar/mockserver.js");
I hope this feature facilitates better reuse of test code. So, next time you are writing test code, please consider writing and publishing it as a module so others can utilize it.
One more thing. Currently, integration with the resource manager is only implemented for xpcshell tests. I'd like to see this supported in all the test runners eventually. I implemented xpcshell support because a) that is the test harness I use almost exclusively and b) it is the only one I'm comfortable modifying. If you want to implement support in another test runner, please have a go at it!
Improving the Mozilla Build System Experience
May 07, 2012 at 04:45 PM | categories: Mozilla, Firefox
tl;dr User experience matters and developers are people too. I have proposed a tool to help developers interact with the Firefox build system and source tree.
I don't think I have to make my case when I state that Mozilla's build system end-user experience is lacking. There are lots of hurdles to overcome:
- Determine where to obtain the source code.
- Install a source control system (possibly).
- Wait a long time for the large source repository to download.
- Figure out how to launch the build process (unlike many other build systems, it isn't as simple as configure or make - although it is close).
- Determine which dependencies need to be installed and install them (this can also take a long time).
- Create a configuration file (mozconfig).
- Build the tree (another long process).
If you want to contribute patches, there are additional steps:
- Configure Mercurial with your personal info.
- Configure Mercurial to generate patches in proper format.
- Create a Bugzilla account (made simpler through Persona!).
- Figure out the proper Bugzilla product/component (even I still struggle at this) so you can file a bug.
- Figure out how to attach a patch to a bug and request review (it isn't intuitive if you've never used Bugzilla before).
- Figure out who should review the patch.
- Learn how tests work so you can:
- Write new tests.
- Run existing tests to verify your changes.
- Obtain commit access (so at least you can push to Try).
- Learn how to push to Try.
- Learn about TBPL.
- Discover and use some of the amazing tools to help you (MXR, trychooser, mqext, etc).
Granted, not all of these are required. But, they will be for returning contributors. My point is that there are lots of steps here. And, every one of them represents a point where someone could get frustrated and bail -- a point where Mozilla loses a potential contributor.
Ever since I started at Mozilla, I've been thinking of ways this could be done better. While the Developer Guide on MDN has improved drastically in the last year, there are still many ways the process could be improved and streamlined.
In bug 751795, I've put forward the groundwork of a tool to make the developer experience more user friendly. Yes, this is a vague goal, so let me go into further detail.
What I've submitted in the bug is essentially a framework for performing common actions related to the build system and source tree. These actions are defined as methods in Python code. Hooking it all together is a command-line interface which is launched via a short script in the root directory called mach (mach is German for do). Since actions speak louder than words, here's an example:
$ ./mach
usage: mach command [command arguments]
This program is your main control point for the Mozilla source tree.
To perform an action, specify it as the first argument. Here are some common
actions:
mach build Build the source tree.
mach help Show full help.
mach xpcshell-test Run xpcshell test(s).
To see more help for a specific action, run:
mach <command> --help
e.g. mach build --help
And, going into a sub-command:
$ ./mach xpcshell-test --help
usage: mach xpcshell-test [-h] [--debug] [TEST]
positional arguments:
TEST Test to run. Can be specified as a single JS file, an
xpcshell.ini manifest file, a directory, or omitted. If
omitted, the entire xpcshell suite is executed.
optional arguments:
-h, --help show this help message and exit
--debug, -d Run test in debugger.
Now, I've focused effort at this stage on performing actions after the initial build environment is configured. The reason is that this is low-hanging fruit that easily allows me to create a proof of concept. But, I have many more ideas that I'd eventually like to see implemented.
One of my grand ideas is to have some kind of setup wizard guide you through the first time you use mach. It can start by asking the basics: "Which application do you want to build?" "Release or Debug?" "Clang or GCC?" "Should I install Clang for you?" It could also be more intelligent about installing dependencies. "I see you are using Ubuntu and are missing required packages X and Y. Would you like me to install them?" And, why stop at a command-line interface? There's no reason a graphical frontend (perhaps Tcl/Tk) couldn't be implemented!
The setup wizard could even encompass configuring your source control system for proper patch generation by ensuring your tree-local .hg/hgrc or .git/config files have the proper settings. We could even ask you for Bugzilla credentials so you could interact with Bugzilla directly from the command-line.
Once we have all of the basic configs in place, it's just a matter of hooking up the plumbing. Want to submit a patch for review? We could provide a command for that:
./mach submit-patch
"refactor-foo" is currently on top of your patch queue.
Submit "refactor-foo"?
y/n: y
Enter bug number for patch or leave empty for no existing bug.
Bug number:
OK. A new bug for this patch will be created.
Please enter a one-line summary of the patch:
Summary: Refactor foo subsystem
Is the patch for (r)eview, (f)eedback, or (n)either?
r/f/n: r
I've identified Gregory Szorc (:gps) as a potential reviewer for
this code. If you'd like someone else, please enter their IRC
nickname or e-mail address. Otherwise, press ENTER.
Reviewer:
I'm ready to submit your patch. Press ENTER to continue or CTRL+C to
abort.
Bug 700000 submitted! You can track it at
https://bugzilla.mozilla.org/show_bug.cgi?id=700000
The framework is extremely flexible and extensible for a few reasons. First, it encourages all of the core actions to be implemented as Python modules/methods. Once you have things defined as API calls (not shell scripts), the environment feels like a cohesive library rather than a loose collection of shell scripts. Shell scripts have a place, don't get me wrong. But, they are hard to debug and test (not to mention performance penalties on Windows). Writing code as reusable libraries with shell scripts only being the frontend is a more robust approach to software design.
Second, the command-line driver is implemented as a collection of sub-commands. This is similar to how version control systems like Git, Mercurial, and Subversion work. This makes discovery of features extremely easy: just list the supported commands! Contrast this to our current build system, where the answer is to consult a wiki (with likely out-of-date and fragmented information) or gasp try to read the makefiles in the tree.
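To illustrate the sub-command pattern (this is not mach's actual registration API, which is still being worked out in the bug, just the general shape using argparse sub-parsers):

import argparse

def build(args):
    # Placeholder for the real build logic.
    print 'Building the source tree...'

parser = argparse.ArgumentParser(prog='mach')
subparsers = parser.add_subparsers(title='commands')

build_parser = subparsers.add_parser('build', help='Build the source tree.')
build_parser.set_defaults(func=build)

args = parser.parse_args()
args.func(args)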
My immediate goal for bug 751795 is to get a minimal framework checked in to the tree with a core design people are content with. Once that is done, I'm hoping other people will come along and implement additional features and commands. Specifically, I'd like to see some of the awesome tools like mqext integrated such that their power can be harnessed without requiring people to first discover they exist and then install and configure them. I think it is silly for these obvious productivity wins to go unused by people ignorant of their existence. If they are valuable, let's ship them as part of a batteries-included environment.
In the long run, I think there are many more uses for this framework. For starters, it gives us a rallying point around which to organize all of the Python support/tools code in the tree. Currently, we have things spread all over the place. Quite frankly, it is a mess. I'd like to have a unified site-packages tree with all our Python so things are easier to locate and thus improve.
If nothing else, the tool provides a framework for logging and formatting activities in a unified way. There are separate log streams: one for humans, one for machines. Under the hood, they both use the same logging infrastructure. When messages are logged, the human stream is formatted as simple sentences (complete with terminal encodings and colorization). The machine-destined log stream is newline-delimited JSON containing the fields that were logged. This allows analysis of output without having to parse strings. This is how all log analysis should be done. But, that's for another post. Anyway, what this all means is that the output for humans can be more readable. Colors, progress bars: we can do that now.
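Here is a minimal sketch of that dual-stream idea (the field names are illustrative, not the framework's actual schema):

import json
import time

machine_log = open('activity.log', 'a')

def log_action(human_format, **fields):
    # One logical record feeds both streams: a formatted sentence for
    # humans and a newline-delimited JSON object for machines.
    fields['time'] = time.time()
    print human_format.format(**fields)
    machine_log.write(json.dumps(fields) + '\n')

log_action('Compiled {filename} in {elapsed:.2f}s', filename='nsFoo.cpp', elapsed=2.31)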
Over time, I imagine some may want to move logic out of configure and makefiles and into this tool (because Python is easier to maintain and debug, IMO). I would love to see that too. But, I want to stress that this isn't a focus right now. I believe this framework should be supplemental in the beginning and the official build system should not rely on it. Maybe that changes in the future. Time will tell.
Anyway, this project is currently just my solo effort. This isn't captured on a roadmap or anyone's quarterly goals. There is no project page listing planned features. If you are interested in helping, drop me a line and/or join in on the bug. Hopefully the core framework will land soon. Once it does, I'm hoping for an explosion of new, user-friendly features/commands to make the overall Firefox development experience smoother.
Comparing the Security and Privacy of Browser Syncing
April 08, 2012 at 09:00 PM | categories: Mozilla, security, browsers, Firefox, internet
Many popular web browsers offer built-in synchronization of browser data (history, bookmarks, passwords, etc) across devices. In this post, I examine the data security and privacy aspects of some of them.
Chrome
Chrome and Chromium have comprehensive support for browser sync.
When you sign in to Chrome (using your Google Account credentials), Chrome prompts you to set up sync. By default, all data types are uploaded to Google's servers.
The default behavior is for Chrome to encrypt your passwords before uploading them to the server. All of your remaining data (history, bookmarks, etc) is uploaded to Google unencrypted. This means anyone with access to Google's servers has full access to your history, etc.
Access to the uploaded data is governed by the Google Chrome Privacy Notice. This policy (pulled on April 3, 2012) states that the sync data is governed by the unified Google Privacy Policy. This policy states (as pulled on April 4, 2012):
We use the information we collect from all of our services to
provide, maintain, protect and improve them, to develop new ones,
and to protect Google and our users. We also use this information
to offer you tailored content – like giving you more relevant
search results and ads.
In other words, you are granting Google the ability to use your synced data.
An advanced settings dialog as part of the sync setup allows users to opt in to local encryption of all data - not just passwords - simply by clicking a checkbox. This same dialog also allows users to choose an alternate passphrase (not your Google Account password) for encrypting data.
For encrypted data, Chrome uses an encryption scheme called Nigori. An overview and the protocol details are available from the author's website.
This encryption scheme takes the user-supplied passphrase and uses PBKDF2 to derive keys. It first derives a 64 bit salt key, Suser, using 1001 iterations of PBKDF2 with SHA-1, using the username as the salt. Then, it performs 3 more PBKDF2 derivations to produce three 128 bit keys from the original passphrase using the newly-derived salt key, producing Kuser, Kenc, and Khmac. For these, the PBKDF2 iteration counts are 1002, 1003, and 1004, respectively. Kuser and Kenc use AES as the PBKDF2 algorithm. Khmac uses SHA-1. Kuser is used to authenticate the client with the server. Kenc and Khmac are used to encrypt and sign data, respectively. Data is encrypted with AES-128 in CBC mode with a 16 byte IV. (It is worth noting that Chrome does not use a cryptographically-secure random number generator for the IV. I don't believe this amounts to anything more than a mild embarrassment in this case.)
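To make the derivation chain concrete, here is a sketch of the SHA-1 steps. Kuser and Kenc use an AES-based PBKDF2 variant that Python's standard library can't express, so only Suser and Khmac are shown (hashlib.pbkdf2_hmac requires Python 2.7.8+ or 3.4+):

import hashlib

def derive_suser(username, passphrase):
    # 64 bit salt key: 1001 iterations of PBKDF2-SHA1 with the username
    # as the salt.
    return hashlib.pbkdf2_hmac('sha1', passphrase, username, 1001, dklen=8)

def derive_khmac(username, passphrase):
    # 128 bit signing key: 1004 iterations of PBKDF2-SHA1, salted with
    # the derived Suser.
    suser = derive_suser(username, passphrase)
    return hashlib.pbkdf2_hmac('sha1', passphrase, suser, 1004, dklen=16)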
When someone wishes to sync to a new Chrome instance, she simply enters her Google Account username and password (or custom sync passphrase) and data is downloaded from Google's servers and applied. The pre-PBKDF2 passphrase is all that is needed. The new Chrome instance remembers the passphrase and syncing is automatic from that point on.
Opera
Opera supports syncing via Opera Link. Opera Link supports syncing bookmarks, history, passwords, search engine plugins, and other data types.
Opera is not open source and I have not been able to find technical details on how Opera Link is implemented. The two sources I found are a blog post and the Guide to Using Opera Link.
From those two documents, we know that Opera locally encrypts passwords. However, it is unclear whether other data is also encrypted locally. I can interpret the blog post to go either way. (If someone knows, please leave a comment with some kind of proof and I'll update this post.)
The blog post gives a high-level overview of how encryption works. A lone comment is the only source of technical details:
for encryption we use AES-128, and we use a random salt that is
part of each "blob" (one blob is a single field in each password
entry)
As commenters in that post have pointed out, that is still very short on technical details.
What I think is going on is that when you initially set up Opera Link, it generates a full-entropy 128 bit key from a random number generator. Uploaded data is encrypted with this key using AES-128 with a randomly-generated IV (or salt using terms from the blog post). The ciphertext and the IV are uploaded to Opera's servers. There may be HMAC or some other form of message verification involved, but I could find no evidence of that.
Since Opera Link is tied to your Opera Account password, I'm guessing that Opera uses PBKDF2 to derive a key from the password. It then uses this key to symmetrically encrypt the randomly-generated encryption key. It then uploads the encrypted encryption key to Opera's servers.
When someone wishes to sync with a new Opera instance, she simply enters her Opera Account credentials on the new Opera and Opera Link is set up automatically. This is a one-time set-up process.
Data uploaded with Opera Link is governed by an Opera Link Privacy Policy. This policy states (pulled on April 4, 2012):
Opera will never disclose, share, or distribute an individual’s
Linked data to any third party except where required by law or
regulation, or in cases where you have chosen to grant access to your
data to an Opera or third party application or service using Opera
Link API. Opera restricts internal access to this information
exclusively to those who need it for the operation of the Link
service.
Safari
Safari supports syncing via iCloud. Its offerings appear to currently be limited to bookmarks, possibly because iCloud is a relatively new offering from Apple.
Configuration of iCloud is something that typically happens outside of Safari at the OS level. And, iCloud is deeply tied to your Apple ID. Users typically sign up for an Apple ID then enable iCloud support for a Safari feature (currently just bookmarks). During Apple ID setup, iCloud asks you some security questions. To connect a new device, you simply sign in to Apple ID, enable iCloud, and things just work.
Technical details of iCloud's security model are hard to come by. What we do appear to know is that everything except email and notes is encrypted on Apple's servers. However, the current theory is that this encryption only occurs after the data hits Apple's servers or that Apple has the encryption key and can read your data without your knowledge.
Data uploaded to iCloud is governed by the iCloud Terms and Conditions. This policy states (pulled on April 7, 2012):
You further consent and agree that Apple may collect, use, transmit,
process and maintain information related to your Account, and any
devices or computers registered thereunder, for purposes of providing
the Service, and any features therein, to you. Information collected
by Apple when you use the Service may also include technical or
diagnostic information related to your use that may be used by Apple
to support, improve and enhance Apple’s products and services.
If data is readable by Apple, this policy grants Apple the right to use it.
I'm not going to speculate about the technical details of Apple's encryption model because I couldn't find any non-speculative sources to base it on. If you want to read the speculation of others, see Ars Technica posts 1, 2, and 3 and Matthew Green's response.
Internet Explorer
Internet Explorer supports syncing of favorites via Windows Live Mesh.
This was discovered after this post was originally written, which is why there are no additional details.
Firefox
Firefox has built-in support for syncing browser data via Firefox Sync. It doesn't sync as many data types as Chrome, but the basics (history, bookmarks, passwords, add-ons) are all there.
When you initially create a Firefox Sync account, you are asked to create a Mozilla Services account by entering an e-mail address and password. Once this process is done, Firefox uploads data to the sync server in the background.
By default, all data is encrypted locally before being uploaded to the server. There is no option to disable client-side encryption.
Data uploaded to the server is governed by the Firefox Sync Privacy Policy. The summary (pulled on April 4, 2012) is quite clear:
* Your data is only used to provide the Firefox Sync service.
* Firefox Sync on your computer encrypts your data before sending
it to us so the data isn’t sitting around on our servers in a
usable form.
* We don’t sell your data or use ad networks on the Firefox Sync
webpages or service.
While Mozilla provides a default server for Firefox Sync, the server is open source (see their documentation) and anybody can run a server and point their clients at it.
When a new account is created, Firefox creates a full-entropy 128 bit key via random number generation. It then derives two 256 bit keys through SHA-256 HMAC-based HKDF (RFC 5869). This key pair effectively constitutes a root encryption and signing key.
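For the curious, here is a sketch of HKDF-Expand (RFC 5869) producing two 256 bit keys from a 128 bit root key. The info string below is a placeholder; Sync's actual derivation inputs are spelled out in its crypto documentation:

import hashlib
import hmac

def hkdf_expand(key, info, length):
    # HKDF-Expand: chain HMAC-SHA256 blocks until enough output key
    # material has been produced, then truncate.
    okm, block, counter = '', '', 1
    while len(okm) < length:
        block = hmac.new(key, block + info + chr(counter), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

root_key = '\x00' * 16  # 128 bit root key (placeholder value)
material = hkdf_expand(root_key, 'example-info', 64)
encryption_key, hmac_key = material[:32], material[32:]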
Firefox then generates a completely new pair of full-entropy 256 bit keys via random number generation. This key pair is used to encrypt and sign all data uploaded to the server. This second key pair is called a collection key.
Firefox takes your synced data and encrypts it with AES-256 in CBC mode, using a 16 byte randomly-generated IV (unique for each record) and the collection key's symmetric encryption key. The ciphertext is then signed with the HMAC key. The ciphertext, HMAC, and IV are uploaded to the server.
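A sketch of that per-record operation, using the third-party PyCrypto package for AES (Sync itself implements this in JavaScript inside Firefox, so this is purely illustrative):

import hashlib
import hmac
import os

from Crypto.Cipher import AES

def encrypt_record(plaintext, encryption_key, hmac_key):
    # Pad to the AES block size (PKCS#7 style), encrypt with a fresh
    # random IV, then authenticate the ciphertext with the HMAC key.
    pad = 16 - len(plaintext) % 16
    padded = plaintext + chr(pad) * pad
    iv = os.urandom(16)
    ciphertext = AES.new(encryption_key, AES.MODE_CBC, iv).encrypt(padded)
    mac = hmac.new(hmac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext, mac, iv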
The collection key is encrypted and signed with the root key pair and uploaded to the server as well. The root keys remain on the client and are never transmitted to the server.
Technical details of the full crypto model are available.
The e-mail and password for the Mozilla Services account are used to authenticate the HTTPS channel with the server using HTTP Basic Auth.
When you wish to connect another Firefox instance to your Firefox Sync account, the root 128 bit key must be transferred to the new device. Firefox supports manually entering the 128 bit key as a 26 character value. More commonly, Password Authenticated Key Exchange by Juggling (J-PAKE) is used. One device displays 12 characters and establishes a channel with a central brokering server. The same 12 characters are entered on the pairing device. The two devices establish a cryptographically secure channel between them and proceed to exchange the Mozilla Account credentials, server information, and the 128 bit root key. While the J-PAKE server is hosted by Mozilla, the channel is secured between both endpoints, so the server operator can't read the root key as it passes through it.
The new client then derives the root key pair via HKDF, downloads, verifies, and decrypts the collection key from the server, then uses that key pair for all subsequent encryption and verification operations.
Once a client has been paired, it holds on to the root key indefinitely and a user doesn't need to take any subsequent action for syncing to occur.
LastPass
LastPass isn't a browser, but a password manager that can be integrated with all the popular browsers. I thought it would be interesting to throw it into the comparison, especially since LastPass is perceived to have an excellent security model.
Technical details of LastPass's security model are available in the LastPass User Manual. The remaining details are found on a help desk answer.
LastPass encrypts all of your data locally before uploading it to the LastPass servers. It does this by making use of a master password.
Data uploaded to LastPass's servers is governed by a Privacy Statement. The summary that best reflects it (as pulled on April 4, 2012) is:
We don't allow you to send LastPass critically important information
like your usernames, passwords, account notes, and LastPass master
password; instead your LastPass master password is used locally to
encrypt the important data that's sent to us so that no one,
including LastPass employees ever can access it.
LastPass performs N iterations (default 500) of PBKDF2 using SHA-256 over your master password to produce a 256 bit encryption key. It then performs one additional iteration to produce a login key. Data is encrypted locally using AES-256 with the encryption key derived from your master password. Encrypted data is uploaded to LastPass's servers. Your master password is never transmitted to LastPass. Instead, the login key is used to authenticate communications.
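Based on that description, the derivation looks roughly like this. The salt choices are my guess at unspecified details, so treat this as a sketch rather than LastPass's actual implementation:

import hashlib

def derive_lastpass_keys(master_password, username, iterations=500):
    # N iterations of PBKDF2-SHA256 over the master password yield the
    # 256 bit local encryption key...
    encryption_key = hashlib.pbkdf2_hmac('sha256', master_password,
                                         username, iterations, dklen=32)
    # ...and one further iteration yields the login key that is sent to
    # the server in place of the password itself.
    login_key = hashlib.pbkdf2_hmac('sha256', encryption_key,
                                    master_password, 1, dklen=32)
    return encryption_key, login_key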
The LastPass web interface downloads encrypted blobs and decrypts them locally using the PBKDF2-derived encryption key.
To set up a new LastPass client, you download LastPass and present your username and master password. Typically, the master password needs to be presented every time you initially access your LastPass data (e.g. the first time you need to find a password after starting your browser).
Assessment
The following chart summarizes the security aspects of different browsers' sync features. Desirable traits for better security are bolded.
| Product | Encryption Defaults | Can Encrypt Everything? | Encryption Entropy Source | Server Knows Decryption Key? | Server-Side Data Recovery Difficulty |
|---|---|---|---|---|---|
| Chrome | Passwords encrypted; everything else stored in cleartext | **Yes** | User-supplied passphrase | Yes by default (Google Account password). No if using custom passphrase | No effort for unencrypted data. 1001 PBKDF2-SHA1 + 1003 PBKDF2-AES iterations for encrypted data. |
| Opera | Passwords encrypted; everything else unknown | Unknown | User-supplied passphrase | Yes. Can't change. | Unknown |
| Safari | On remote disks only? | No | Unknown. User-supplied password? | Yes (probably) | No effort for Apple (apparently) |
| Firefox | **Everything** | **Yes (default)** | **128 bit randomly generated key** | **No** | **128 bit key + HKDF into AES-256** |
| LastPass | **Everything** (only syncs passwords and notes) | **Yes (default)** | User-supplied passphrase | **No** | Variable PBKDF2-SHA256 iterations (default 500) |
So much about Safari is unknown, so it will be ignored.
Firefox and LastPass (and possibly Opera) encrypt all data by default. Chrome (and possibly Opera) does not.
Firefox and LastPass are the only products that don't send the entropy source to the server by default. Chrome uses the Google Account password by default and this is sent to Google when logging in to various services. Opera sends the password to Opera when logging in to your Opera Account. Google allows you to change the entropy source to a custom passphrase so Google doesn't receive the entropy source. Opera does not.
Sending the entropy source to the server is an important security consideration because it means you are giving the key to your data to someone else. Even if your data is encrypted locally, someone with the key can decrypt it. Services that send the entropy source to the server are subject to man-in-the-middle attacks and could be subverted by malicious or legal actions occurring on the server side (e.g. the service operator could be compelled through a subpoena to capture your entropy source and use it to decrypt your stored data, possibly without your knowledge).
Firefox is the only product whose encryption source is full-entropy. All other products rely on taking a user-supplied passphrase and using "key-stretching" via PBKDF2 to increase the cost of a brute-force search.
PBKDF2-derived encryption keys are common in the examined products. It is worth noting that PBKDF2 can be susceptible to dictionary and brute-force attacks because assumptions can be made about the input passphrase, such as its entropy and length. Systems often enforce rules on the source passphrase (e.g. between 5 and 15 characters and contains only letters and numbers). When cracking keys, you normally iterate through every possible permutation until you find one that works. When you can make assumptions about the input, you can eliminate a large number of these permutations. The products that use PBKDF2 are theoretically susceptible to this weakened brute-force search.
Since Firefox does not rely on PBKDF2, it is the only examined product not theoretically susceptible to a weakened brute-force search. Instead, an attacker would have to churn through every permutation of a 128 bit root key, which would take billions of computer-years. (See Brute-force attack on Wikipedia for more.)
Firefox's additional security comes at the price of more complex device setup. Firefox users need to physically have a copy of the 128 bit root key or physical access to 2 devices when pairing. All other products rely on a passphrase which the user can carry around seamlessly in her head. In addition, if the Firefox root encryption key is lost, it is more likely that your data is not recoverable because the key is not in your head.
Conclusion
Considering just the security and privacy aspects, I can only recommend two of the examined products: Firefox Sync and LastPass. I am recommending them because they encrypt all data locally by default and they do not send the encryption key source to the server. Of these two, Firefox Sync is more secure for reasons outlined above.
I can't recommend Safari because details about iCloud's encryption strategy are unknown. Furthermore, it appears Apple can recover your (possibly) encrypted data without your knowledge.
I can't recommend Opera because your encryption key source (your Opera Account password) is sent to Opera's servers. Furthermore, not enough technical details of Opera Link are available to vet it.
I can't recommend Chrome (at least in its default configuration) because it doesn't encrypt all data locally (only passwords) and you periodically send the encryption key source (your Google Account password) to Google's servers when using other Google services. If you enable encryption of all data and use a custom passphrase, Chrome's security model is essentially identical to LastPass's and thus can be recommended.
Disclaimer: I am currently employed by Mozilla and work on Firefox Sync. That being said, I believe this post has been objective and not subject to my bias towards Firefox and/or Firefox Sync. If you feel differently, please leave a comment and I will adjust the post as necessary.
Edit 2012-04-16 Note that IE supports Bookmark sync via Windows Live Mesh (thanks to Nick Richards for pointing it out in the comments). Also removed an incorrect sentence from the Chrome section which incorrectly stated that the PBKDF2 iteration count was part of the hash in each iteration.
Gone with the Wind Thoughts
February 26, 2012 at 02:30 PM | categories: movies
I watched Gone with the Wind, the 1939 classic, a few days ago. It is an amazing film. But, I was bothered by a number of plot points.
On Scarlett
How exactly am I supposed to feel about Scarlett? Sympathy? I hope not! She is a manipulative, gold-digging woman. Yes, I do love her strength and juxtaposition with men at times. But, she has no qualms about crying to get her way. Screw that! I knew people like her back in high school! One thing I learned was to keep as far away from people like this as you can. Rhett apparently didn't get the memo.
On Rhett Butler
At times I was like, yes, he's the man of men: I want to be him. But, there were a few things I couldn't get over.
First, the age gap with Scarlett was creepy. When they first meet at Twelve Oaks, she's maybe 18 and he's what, 40? Maybe that was acceptable in the 1800s and early 1900s. But, that doesn't change that I'm freaked out by it. I'm 28 now and find 25 to be about the minimum age I can tolerate. The gap between Rhett and Scarlett is Hugh Hefner territory by 1800s standards.
Maybe I missed the line, but I don't think they attributed his wealth to anything specific. Yes, they mentioned he was a blockade runner or something (which immediately got me thinking that Han Solo might have been modeled after him, which is an interesting train of thought). But, I don't think that's enough to explain his grand fortune. Hmm.
He's cold. Too cold. After he drunkenly makes love to Scarlett near the end (I dare say rape, but she appears to have loved it the next morning, so maybe it was just aggressive - which I imagine Scarlett might enjoy given her masculine qualities. Am I allowed to go there?), she's feeling all good about things and then he comes in and is like "I'm going to London for a while. Oh, and I'm taking our daughter too!" What an asshole. I admire the consistency of his character, but he's a jerk for the timing, even though the actions were probably justified given the state of their relationship.
Lack of Character Arcs
In many great films, the protagonist goes through an arc to help the audience connect the lows and the highs and to root for them. In this film, our protagonist is Scarlett and her arc is questionable. Yes, she goes from living a great lifestyle to poverty via the Civil War then to lavishness after marrying Butler. But, the film revolves around relationships. All that stuff about the Civil War and the lifestyle of the south is just extra icing on the cake. Despite things around her changing, Scarlett's relationships are quite consistent throughout the film.
Even though her arc is tiny, it is still larger than anyone else's since all the main characters were fairly rigid and hardly developed during the film. Rhett was the rich playboy who enjoyed his independence ("I'm not a marrying man"). Ashley was consistently ambiguous. Melanie was steadfast. Mammy was Mammy. From a character development perspective, not much went on in this film. It was simply an epic love story told in the south during the Civil War.
Misc Observations
It bothers me that they seemed to overlook the stigma of Scarlett being a two-time widow. This is the deep south in the 1800s: her life would have been over after her first husband died. She would have been an outcast for marrying a second time, and a third marriage would have gotten her driven out of town, especially since her actions got husband #2 shot through the head. They alluded to this at times, but the film quickly moved past it. You'd think in a nearly 4 hour film they could find more time to cover this.
Can you imagine people in 1939 watching two amazing (color) films with 2 strong, female protagonists (Vivien Leigh, Judy Garland)? I'm not an expert on the history of cinema, but surely that year must have been a turning point.
There are some incredible lines in this film. Obvious ones aside, I had no clue that the "don't call me Shirley" line originated in this film and not Airplane!
"Great balls of fire!" I need to watch Top Gun now.