Google I/O 2009 – Mercurial on BigTable



Lee: Hi, I’m Jacob Lee. I’m with the Google Code Open
Source Project hosting team, and I’m going to be talking about the Mercurial distributed
version control system and how we got it to run
on Google infrastructure. A point of administrivia: this is sort of a–we are
officially launching this to the public today. Actually,
as of several hours ago, so you can go create projects, but after my talk. And if you’re tempted
to vanish, say, during Q&A, I would at least lurk
in the back until the end, because we have T-shirts. So if you want one,
I’d stick around. That’s a great sound. All right, so I’m going
to first talk a little bit about project hosting,
and then a lot about how Mercurial
is implemented, and then about
our implementation. So let’s first time warp
way back to the distant past where we still maintained source
code in giant card catalogs. This is July 2006
at OSCON, three years ago
when we first announced Google code project hosting
to the public. At that time, Subversion 1.0
was about two years old. One of the earliest distributed
version control systems, Monotone,
was a year older than that. Git and Mercurial were,
more or less, sort of brand-new. CVS was widely used, and I guess
now we’re dating ourselves. Well, maybe several months
before that, back in February, Sourceforge had just announced
Subversion support to add to its offering
of CVS. And we came
into that environment with the goal of using
our unique assets, which is Google infrastructure, to provide rock-solid
and scalable Subversion hosting, and thereby provide a platform
for collaboration on open source projects. And nowadays, well, there’s
lots of options out there for new and existing
open source projects. Google Code now has 200,000
or so, give or take, projects and several million
unique visitors per day. Sourceforge has expanded
their offerings. They have shell accounts and several distributed version
control systems and, you know,
hosted Trac instances and cool things like that. And there are new players
like GitHub and Bitbucket, which are specialized for a particular
version control system and provide a well-integrated
social experience on top of that,
and that’s all really great. And we’re happy
that there are so many choices for projects out there. But we still think
that Google is well-positioned to support and support well
these newly prominent features of distributed version control. So with that, about a month
ago, we started offering Mercurial support to about
50 or so trusted testers– brave volunteers–and now
it is open to the public. That’s just
a quick screenshot. So why did we choose Mercurial? This was the source
of many great discussions. It was sort of entertaining if you like angry people. So, you know, why would we
choose Mercurial and not Git or Bazaar
or many of the other systems shown on the previous slides? Any individual might have
personal reasons to like one or the other. If you were
to grossly over-generalize, you might like Mercurial’s
simple, orthogonal workflow. You might like Git’s ability
to do almost anything. Our decision was primarily
technical and sort of unique to our situation. Mercurial has a really fantastic
wire protocol that is over HTTP. And that
is what Google is built on. Almost everything we do is HTTP with a handful
of exceptions, like Google Talk,
which is Jabber/XMPP. And that’s something that was
sort of unique to Mercurial that makes it a very good fit
for our infrastructure. That is the only stupid keynote
transition I am going to use in this whole presentation,
but I had to, ’cause it’s a flamewar. And it’s the point
where people were starting to evade our +1 filters that we decided to make
that issue read-only. Right.
Okay. So if you’re not familiar
with–well, first of all, how many people here use version control on,
say, a day-to-day or at least weekly basis?
That’s good.
That’s really good. If your hand didn’t go up,
you should consider it. It is an investment that pays
for itself in spades over time. So how many people
use distributed version control? Nice. Nice. All right, how many people
use Mercurial? Pretty good.
Git? All right. Fight?
All right. Okay. I stole that joke
from Fitz and Ben, thanks. All right? So if you’re not familiar,
the basic workflow is, instead of your repository
being on some server somewhere, you just have your repository
sitting on disk. You initialize it,
you hack a bunch, and then, when you feel you have
a coherent change that– say, like a single patch
that you would mail out– then you just commit that, hack some more,
and commit that. And you keep doing that, and you build up your list
of changes. And when you feel that
you have something to share– which, hopefully, if you
attended Ben and Fitz’s talk, is relatively early– then you push it out
to some public URL. And then some other random
person can come and clone it, and they get a copy
of your repository with complete history
locally on their disk. And they hack a bunch, and
they can do their own commits. And they have something
to share back to you, they push to their server. And their changes
are in red there. And then, meanwhile,
you’ve done some more hacking, and then at some point,
you decide, “Oh, hey, so-and-so just
emailed me with a cool feature. I’ll merge it back in.” So you get it
from their repository, and now all of a sudden,
you have–at the front of your repository,
you have a change and they have a change. And then you merge them in, and hopefully
there are no conflicts, and then you have the tip
of your repository, the single front point
of history. And merging is something
that is really scary, historically at least. You just don’t know
what crazy state the code will be in
when you are done. That used to be the case,
and one advantage of distributed version control
is that this is something that happens all the time– pretty much anytime
you integrate a change that someone else did
simultaneously with you– and it’s relatively
pain-free, so… The other advantage
is everyone– this person who submitted
the change doesn’t have to be a core contributor
to be able to use the same tools that I do, and have their own… to keep track of history
and so forth. And you can do this– So these two repositories
can be totally equal, or your social structure
might be set up so that one repository’s
considered central. You might have several
central repositories, some more bleeding edge
than others. But you’re not constrained
by technical limitations for how you choose to structure
your repositories. And if you’ve tried
to do things like sync from one repository
to another in Subversion, it’s sort of a pain. So this solves
that problem well too. So moving on to:
How does Mercurial actually implement this? So the first thing to note
is that this repository here is not a linear list
of changes. It’s a graph. Specifically, it’s a directed
acyclic graph. You don’t–
I mean, internally, Mercurial– Well, Mercurial
might present to you a numbered list of changes,
but those numbers are totally specific
to your repository. Someone who integrated their
changes in a different order might have different numbers. So when we were actually talking
to other people, we identified changes
by a unique identifier that happens to be
the hash of some state of the repository. So we’ve got this graph here. Each node in the graph
is a changeset, a single patch. And the edges in the graph
are ancestry relationships. We identify a changeset
by the hash of its contents. And the contents are enough to uniquely identify
the changeset. It is, you know, the log
message, the user, the date, a pointer to the manifest,
which has the list of files, and the parents
of the change if it is not the first one
in your repository. So that’s kind of the key point,
because that means two changes that happen
to touch the same file
and have the same message, they’re still identified by– they’re still unique
because their parents, which are different,
are included as part of the hash. And so that means,
because of all this, a single changeset
is entirely immutable. And this is a difference
from other systems like Git. You can’t change the contents
of a changeset without– If you–say you
were to compress a bunch of changes so it still represented
the same final state but had
different ancestry. Then it has a different I.D., which means
it’s a different changeset, because you identify them
by I.D. So this means that Mercurial
really is not much of a fan of rewriting history. There are extensions to do it,
but it happens really before a repository
has been published, typically. All right,
so we actually have– The user visible graph
is the graph of changesets, but conceptually
there are many others. I mentioned earlier
that the manifest is the list of files in the changesets. That is also versioned in the
same way that changesets are. Every time you make a commit,
if the list of files changes, that is a new node
in the manifest graph, with its parent being
the previous list of files. And then each individual file
conceptually has its own graph, which, each time
you change the file– change the contents
of the file, it gets a new node identified
by the hash of its contents. And then so you’re building up
these graphs here. The changesets point down
to the manifests, and manifests point down
to the version of the file at that particular point. And then to be able to do
certain operations efficiently, you also have a link
back up. Files and manifests have links
back up to the changesets. So that’s actually– It’s a very simple model
of history. A handful of other concepts
that are good to know: The heads of the repository
are simply the most recent changesets
that do not have any children. Typically, at least
in the public repository, there’s exactly one head. It’s a good state
to be in. You might
have multiple branches, which are just named heads,
and, well, named heads coupled with their parents… And then you have tags,
which is just a name attached to a particular
revision, say, for a release.
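(A tiny illustration of that model in Python: if you represent history as a map from each changeset to its parents, the heads are simply the changesets nobody lists as a parent. The node names here are made up.)

```python
# History as a DAG: each changeset maps to its (zero, one, or two) parents.
parents = {
    "c1": [],            # the root changeset
    "c2": ["c1"],
    "c3": ["c2"],
    "c4": ["c2"],        # a divergent line of development
    "c5": ["c3", "c4"],  # a merge changeset has two parents
}

def heads(parents):
    """Heads are the changesets that do not appear as anyone's parent."""
    all_parents = {p for ps in parents.values() for p in ps}
    return [node for node in parents if node not in all_parents]

print(heads(parents))  # ['c5'] -- exactly one head, the usual happy state
```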
All right. So how is this actually stored on disk? Mercurial uses a structure
called a revlog that, if you’ve noticed, these graphs
are fundamentally the same whether they’re operating
on files, manifests, or changesets. So they’re stored in a format
called a revlog that is designed to meet
the requirements of what Mercurial has to do, which is you have to be able
to append to it, because you need to add
new changesets to your history. You need to be able to access it at random
to be able to, say, walk the history
of a particular file, or to look up a
particular file revision. And it has to be fast,
and it has to be compact. So there are two components
to this file. You have–Well, first of all,
there is one revlog for each conceptual graph. There’s one for the changelog,
one for the manifest, and one per file tracked
in the repository. It’s divided
into two components. You have an index
and a data file. And to add a new node
to one of these graphs, you simply add a new BLOB
of data to the end of the data file and add a new record
to the index. The index is just a list
of fixed-size records that tells you, for a particular
changeset number or file number, what is its identifier,
what are its parents, and importantly where is
it located in the data file. So the index tends
to be very compact, such that you can–
when you hit it a bunch, it tends to stick around
in memory, and it’s only 64 bytes
per node. It doesn’t take up
very much space in memory. And so the data file–
I sort of said that you just keep appending
to the end of it, which is true,
but if you’re doing that for, say, a megabyte–a file
that’s a megabyte in size, each time you changed it,
that would grow quickly. So you actually just store
deltas in this file. You keep appending
to a list of deltas, which would have
its own problem, and that’s–to reconstruct
the actual file, you would have to walk
several revisions, and so the way Mercurial
strikes a compromise here is that simply once the size
of the accumulated deltas exceeds the actual size
of the file, it records a new snapshot. So typically, it doesn’t have
to read very many entries in this revlog to reconstruct
a particular version of a particular node. All right.
And so… I just talked about all of that.
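(Here is a toy revlog in Python to make the delta-versus-snapshot decision concrete. It is a sketch under heavy simplification: a naive prefix/suffix delta instead of Mercurial's binary diff, no compression, and the index records omit the node IDs and parents the real format stores. The append and reconstruction logic, though, follows the description above.)

```python
class ToyRevlog:
    """A drastically simplified revlog: an index of records plus a data list.

    Each revision is stored either as a full snapshot or as a delta against
    the previous revision.  Once the deltas accumulated since the last
    snapshot outweigh the full text, a new snapshot is written, so reading
    any revision never walks very far.
    """

    def __init__(self):
        self.index = []  # one record per revision: (base_rev, is_snapshot, size)
        self.data = []   # snapshots (str) or deltas (start, end, replacement)

    def _delta(self, old, new):
        """Naive delta: keep the common prefix and suffix, replace the middle."""
        i = 0
        while i < min(len(old), len(new)) and old[i] == new[i]:
            i += 1
        j = 0
        while (j < min(len(old), len(new)) - i
               and old[len(old) - 1 - j] == new[len(new) - 1 - j]):
            j += 1
        return (i, len(old) - j, new[i:len(new) - j])

    def append(self, text):
        if not self.index:  # the very first revision is always a snapshot
            self.data.append(text)
            self.index.append((0, True, len(text)))
            return
        prev_rev = len(self.index) - 1
        base = self.index[prev_rev][0]
        delta = self._delta(self.read(prev_rev), text)
        accumulated = sum(size for _, snap, size in self.index[base + 1:] if not snap)
        if accumulated + len(delta[2]) > len(text):
            # The delta chain would exceed the file itself: store a snapshot.
            self.data.append(text)
            self.index.append((prev_rev + 1, True, len(text)))
        else:
            self.data.append(delta)
            self.index.append((base, False, len(delta[2])))

    def read(self, rev):
        """Reconstruct a revision from its base snapshot plus any deltas."""
        base = self.index[rev][0]
        text = self.data[base]
        for r in range(base + 1, rev + 1):
            start, end, replacement = self.data[r]
            text = text[:start] + replacement + text[end:]
        return text

log = ToyRevlog()
log.append("hello world\n")
log.append("hello brave new world\n")   # a small change: stored as a delta
print(log.read(1))
```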
Great. Writing to the repository
is as I described, and the only thing is you do
need to lock the repository when you’re doing this,
so that you don’t present an inconsistent state
to any other instance of the hg command
that happens to be running. All right, so when you– to, say, make a new commit
to the repository, we lock it and we compute
the list of modified files, which–it has to be locked
during that– and we show the–
you know, we invite the user
to type in a log message, and then we write
the new versions of the files, and then the new manifest
and then the new changeset. We go in that order
to make rollbacks easier in case the user cancels
at some point.
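(A sketch of that ordering in Python, with plain dicts standing in for the revlogs and a threading lock standing in for Mercurial's on-disk lock. The structure names are made up; the point is the order: file contents first, then the manifest, then the changeset, so an interrupted commit leaves nothing reachable.)

```python
import threading

write_lock = threading.Lock()  # stand-in for Mercurial's repository write lock

def commit(repo, changed_files, user, message):
    """repo is just {"files": {path: [versions]}, "manifests": [...], "changelog": [...]}."""
    with write_lock:  # one write operation at a time; reads stay possible
        # 1. File contents first: if we stop here, nothing references them yet.
        file_revs = {}
        for path, text in changed_files.items():
            versions = repo["files"].setdefault(path, [])
            versions.append(text)
            file_revs[path] = len(versions) - 1
        # 2. Then the manifest, pointing down at those new file revisions.
        repo["manifests"].append(file_revs)
        # 3. Last of all the changeset, which finally makes everything reachable.
        repo["changelog"].append({
            "manifest": len(repo["manifests"]) - 1,
            "user": user,
            "description": message,
        })

repo = {"files": {}, "manifests": [], "changelog": []}
commit(repo, {"hello.py": "print('hi')\n"}, "jacob", "add hello")
```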
And so when I say it’s locked, I mean–
and then I say we ask the user to type in a
description of their changeset. It’s very locked. Because that’s
an arbitrarily long operation. So this is only a write-lock. So you can still read
the repository while this is going on, but it’s very much,
you know, one operation at a time–
one write operation at a time
on this repository. So those are all things that are just happening on disk,
which is great. It’s really fast, and you can
do it on an airplane. To actually exchange
with other people, the network operations we have
are pushes and pulls. That’s how we exchange fragments
of our repository or, more typically, the whole
thing with other people. So to do a push is very easy,
assuming that you are up to date with the remote server
that you’re trying to push to. I’ll gloss over the case
where you’re not up to date. It’s like doing a pull,
then a push, sort of. So to do a push,
you ask the remote repository, “What are your heads? What are the most recent
revisions in your repository?” And then you find those
in your repository, because you have more stuff
than they do, typically. And then you say, “Okay,
everything from that point forward is typically
what I’m going to send you.” And then you bundle that up
into a changegroup and then upload it. It’s a very straightforward
and fast operation.
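(In Python terms, the outgoing computation looks roughly like this, using the same parents-map picture as before. This is a toy, not the wire protocol: in reality the remote heads are fetched over HTTP and the bundle is uploaded the same way.)

```python
def ancestors(parents, heads):
    """Everything reachable from the given heads, the heads included."""
    seen, stack = set(), list(heads)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, []))
    return seen

def outgoing(local_parents, remote_heads):
    """Changesets to push: everything the remote heads cannot already reach."""
    known = ancestors(local_parents, [h for h in remote_heads if h in local_parents])
    return [n for n in local_parents if n not in known]

# The remote is at c3; c4 and c5 are new local work, so they get bundled up.
local = {"c1": [], "c2": ["c1"], "c3": ["c2"], "c4": ["c3"], "c5": ["c4"]}
print(outgoing(local, remote_heads=["c3"]))  # ['c4', 'c5']
```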
Pull is slightly more complicated, because there’s this negotiation
process where you have to figure out what the remote
has that you don’t have, so that you transfer
the minimum amount– so the minimum amount necessary
is transferred to you. So in this case,
I have the nodes in blue and the remote has all
of these new ones in yellow. So I ask, “What are your heads?”
“3, 9, and 13.” “Oh, I don’t have 9 and 13. Tell me about, you know,
what is some ancestry of 9?” And it says,
“Well, 9 follows a bunch of– it goes back until 6,
and 6 has parents 2 and 5.” I say, “Oh, good, I have 5.
We’re getting somewhere.” And you keep going
with this sort of 20 questions of figuring out
what you need to get. And your goal here is
to identify nodes 5 and 10. Those are the farthest back
nodes that you don’t have. And then I tell the server,
“Please send me everything from 5 and 10 forward.” And it bundles them up
and sends. So there’s a lot of–
it’s a little chatty at the beginning, but that’s
only transferring information about changesets. The actual changeset,
the potentially large operation, is getting the actual bundle
of new data. But that is a single response
from the server. So that is something that’s–
this is what I am referring to when I say that it is well
designed for our infrastructure, as I’ll get to.
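(And here is a toy version of that negotiation. In a real pull, each lookup of the remote's ancestry is another question sent to the server; collapsing that into a dictionary keeps the sketch short, but the shape of the walk is the same: trace back from the remote heads until you reach something you already have.)

```python
def missing_roots(local_parents, remote_parents, remote_heads):
    """Find the farthest-back remote changesets the local side is missing.

    Those roots are what the client then asks the server to send
    'everything forward of'.
    """
    roots, stack, seen = set(), list(remote_heads), set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in local_parents:
            continue          # already have it; no need to look further back
        ps = remote_parents[node]
        if all(p in local_parents for p in ps):
            roots.add(node)   # missing, but all of its parents are known
        else:
            stack.extend(ps)
    return roots

local  = {"c1": [], "c2": ["c1"], "c3": ["c2"]}
remote = {"c1": [], "c2": ["c1"], "c3": ["c2"], "c4": ["c3"],
          "c5": ["c2"], "c6": ["c4", "c5"]}
print(missing_roots(local, remote, remote_heads=["c6"]))  # {'c4', 'c5'}
```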
All right, so how does our implementation– What did we have to do
to get this running? Well, so there’s good news
and bad news about the way
Mercurial is set up and how we’re set up. The good news is we don’t have
to do, like, commits and anything that the hg clients
would want to do to our repository,
any extensions or something, ’cause those all happen on
the users’ local repositories. We only care
about the network operations, that is push and pull,
and we care about whatever we have to do to support
our Web front-end, the source browser. Oh, and we have to do
very simple commits for the wiki, which–
project wiki is stored in a Mercurial repository
as well. But other than that,
it’s mostly push and pull and some querying. So it’s all… It’s mostly
just this DAG walking that I’ll get to. But Mercurial makes a lot
of assumptions that don’t work for us. For instance, it assumes
a single process is running on a single machine. It can access the file system
directly. The file system–
a single hard drive holding the Mercurial
repository–it can lock it. It can load all
of the index files that it needs into memory and do fast
random access on them. Most of these assumptions
don’t work for us at all at Google,
because we’ve got sort of– well, see, here’s a particular
Google cluster, right? That’s actually
a picture of one of the first Google server
environments back at Stanford, but we tend to have racks
that look like that. We have–our whole
infrastructure is built on the assumption
that we have lots and lots and lots
of relatively cheap computers. I mean, we would–used to say
“commodity computers.” They’re not strictly commodity. We’re not buying them
from Dell or anything. But they’re certainly not
the beefiest servers that money can buy. On top of that, we have our
layers like GFS and BigTable, which I’ll talk about. The whole point of this
is that, when you’ve got the volume that we do,
you’re going to have lots and lots of computers
anyway, and as soon as you have
lots and lots of servers, you’re going to have failures
all the time. And so there’s really
no point in over-engineering because
you’ll still have failures, and your software will
still have to handle them, so we prefer just giant clusters
and write the software as robustly as we can
to operate on them. So the first layer for that
is the Google File System, GFS. It was built way early on
to serve the needs of Search, so the primary need
is that it has to be robust in the face of random computers
or racks or data centers vanishing. It has to hold hundreds
of terabytes, and these numbers probably
are all low balls, pro tip. And, you know,
multi-gigabyte files… The unique workload
of Search is that you need to be able to stream
these files really fast. Like, say you’re running
a MapReduce, a computation across them. So you just need to be able
to stream it really fast, and you need to append
to it very fast, and you need to be able
to append to it concurrently. Say you have lots of workers
all appending to the same thing. They need to be able to do that
without stepping on each other’s toes. And maybe you need
to do random access, but it’s not
your primary workload, and it needs to be possible but it doesn’t need
to be fast. So there’s a system
called Google File System. There is–it’s published.
There’s a paper about it. It’s not open source. But it’s
a single master system, just to keep
the implementation sane. And then it farms out–
it divides a single file into multiple chunks
that are 64 megs each, farms those out
to a bunch of servers, multiple copies–
it might be three or four copies of it on separate machines– and then you always
query the master, “Hey, where can I find
this file?” And it tells you, “Oh, there are
chunks here, here, and here,” and then you hit
the actual file servers. And those might–if they die,
then you go back to the master
and ask it again, and eventually it will figure–
it keeps it in sync. So would we just–
well, that actually sounds sort of attractive. We need fast appends–
so would we just build straight on GFS? No, and actually most teams
at Google don’t do that because, first of all,
GFS doesn’t handle replication between data centers,
and we like keeping our data in more than one data center, so if one vanishes
from the face of the Earth, we don’t have to go
and tell people, “Hey, guys, you know
all those repositories we were holding on to for you?” So that would be embarrassing. So we don’t want to have
to write that ourselves. We need to be on top of GFS. And also the performance
characteristics are, you know, sort of quirky
and, for instance, if you need to fetch
a byte from a file, you have to swap it in
in 64 megabyte chunks, and that’s sort of suboptimal. It would not jive very well to simply have the revlogs
correspond to GFS files. It would have saved us
a lot of coding, but it wouldn’t have worked. So on top of that,
there’s BigTable, which is the primary database
technology used at Google. It’s used by lots of teams. It is not a relational database. It is sort of
a row/column database. It is built to meet the diverse
needs of many Google teams. So it has–
but the primary examples, you know,
in the papers about it tend to deal with Search,
imagine that. So your rows
are just arbitrary strings that are lexicographically
ordered. That’s all in a single table,
which is ginormous. It can be ginormous. So it’s chunked out
to different servers by the row space, so different
sections vertically. And then–
or horizontally, yeah. The columns
are also arbitrary strings. So it’s sort of a,
you know, you end up with this sparse matrix
where you don’t have to have values for every column. And they’re grouped
into families also for locality
of access. And then the values
you actually store, well, it might be a single value
per cell. It might be a list
of values by timestamp, which is a feature
that we are not too reliant on, so I’ll gloss over it. And the actual values
you store are opaque BLOBs. There’s no sort of, you know,
referential integrity or SQL-like thing going on here. You’re just storing BLOBs
in this, and now that it’s open source,
we can talk about it. So these BLOBs
often are protocol buffers. All right. The operations that you do,
you look up by a particular row and column name. You might scan a set of rows,
and you might, of course, write to it,
and you have the ability to lock a particular row to write values into it,
which is important. We care about that.
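(To give a feel for that data model, here is a tiny in-memory stand-in written in Python. This is not BigTable's actual API, which is not public; it just captures the properties the talk relies on: sparse rows kept in lexicographic order, cells addressed by column family and column, prefix scans, and per-row locking.)

```python
import bisect
import threading
from collections import defaultdict

class TinyTable:
    """Sparse rows sorted by key; each cell lives under a (family, column) pair."""

    def __init__(self):
        self.rows = {}                             # row key -> {(family, column): blob}
        self.keys = []                             # row keys in lexicographic order
        self.locks = defaultdict(threading.Lock)   # per-row write locks

    def write(self, row, family, column, value):
        if row not in self.rows:
            bisect.insort(self.keys, row)
            self.rows[row] = {}
        self.rows[row][(family, column)] = value   # values are opaque blobs

    def read(self, row, family, column):
        return self.rows.get(row, {}).get((family, column))

    def scan(self, prefix):
        """Yield rows whose keys start with `prefix`, in key order."""
        start = bisect.bisect_left(self.keys, prefix)
        for key in self.keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self.rows[key]

    def row_lock(self, row):
        """Lock one row so a read-modify-write can be done atomically."""
        return self.locks[row]
```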
And this also–it’s not open source, but there’s a paper
about it, and the Hadoop project
from Apache I believe has an open source
implementation of a BigTable-like system. So BigTable is what
we would like to build on. It’s widely used
inside Google. So it turns out
that we’ve got all these things
indexed by hash. That is a very good match
for BigTable, which we can treat essentially
as a giant hash table with some other features. So those changesets that we
identify or–sorry, these nodes in the graph, which might be
changesets, might be file– particular versions of files,
we store those in our row space, identified by this hash,
by the repository name, and by what kind of node it is. The “C” there is for changesets.
The “M” is for manifest. “F” is for files. And if it’s a file,
it’s also identified by its path. Because as I mentioned, these
are sorted lexicographically, so this has the advantage– And they’re split
into tablets where rows that are close
together are likely to end up on the same machine
and can be accessed very efficiently together. So all the changesets
for a single repository end up next to each other
lexicographically. Same with the manifests,
same with the different versions of each file. And then–
so the values that we store are split up
into these column families here. So for files we store
the contents of the file, obviously enough. We actually fragment them because there are row size
limitations in BigTable. But we can still retrieve those
linearly by row. And then for manifests
we just store the list of files, and for changesets, we store
the changed contents, the log message, and so forth. Just basic data
and metadata. And so that’s
all just the raw contents of the graph that we throw
into the table. And then to actually identify
a single repository, we have its own row, indexed
by the repository name. And it simply has
the list of heads that refers back to the
changesets stored in the table. And that is enough
to uniquely identify the current state
of the repository.
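(So the row keys might be built along these lines. The exact layout and delimiters below are guesses for illustration, but the shape matches the slide: repository name first, then the kind of node, then for files the path, then the hash, so related rows sort next to each other.)

```python
def changeset_key(repo, node_hex):
    return f"{repo}/C/{node_hex}"

def manifest_key(repo, node_hex):
    return f"{repo}/M/{node_hex}"

def file_key(repo, path, node_hex):
    # The path is part of the key, so all revisions of one file sort together.
    return f"{repo}/F/{path}/{node_hex}"

def repository_key(repo):
    # One small row per repository, holding the list of heads and the like.
    return f"{repo}/INFO"

# Everything for one repository shares a prefix and lands on nearby tablets:
print(changeset_key("hello-world", "9f3bc1"))           # hello-world/C/9f3bc1
print(file_key("hello-world", "src/main.py", "42ac77"))  # hello-world/F/src/main.py/42ac77
```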
So for the operations that we actually want to do, and why they are efficient– first of all,
to clarify, this is–when I say
push and pull here, this is all
from the client’s perspective. So if I’m pushing something
into Google’s servers, well, all we have to do
when we receive this incoming graph data
is just shovel it into this row space
we talked about earlier, and it’s all sort of–
it’s garbage. It’s unlinked.
It’s not referenced anywhere. So we are free to do that without acquiring
any sort of lock, because it’s not messing up
the current state of the repository,
because we haven’t changed the list of heads. So once we are done
with all that, we’ve noted what we think
the current heads are in the repository. And if that’s still the same, that means nothing else
has happened to this, so we can go ahead
and lock the repository row, check to make sure
it hasn’t changed, and then write the new list
of heads that we computed. And then–well,
then we’re done. The repository instantly
refers to this new data. We just add it
to the table. And that’s all something
that, you know, can be done– if you have multiple pushes
going at the same time, they pretty much all succeed
and they all happen in parallel. One of them will win for
this actual very small write at the end
for the repository row. And the other one will have
to do a little more computation before it can update
that row again. But it still gets to succeed. Now if the client
Now if the client is pulling from us, we have to answer
that game of 20 questions that I described
earlier of the repository– of the client asking us,
“Well, what do you have?” And so that is mostly just tracing back the history
of the repository in this table. We look up a changeset. We look up its parents,
keep going until we find the information
the client wants. And once the client has
identified, “Okay, I want you to send me everything
from these two changes forward,” then we can get
the actual changeset contents from the table
and then stream those out to the client. That is–
so at the beginning, there’s some graph-blocking,
but to actually read the changesets
and build up this bundle, that is something that we can
rely on BigTable’s parallelism to just–we basically can send
them out to the client as fast as we can read them. And then the last major
operation we have to do is for the source browser, the graphical front end
that we’ve written, and that is things like
getting the next and previous revisions of a file, and getting the history– or at a particular path, getting the next
and previous revisions, getting the contents of a file at a particular revision, that sort of thing. That is all just graph walking,
so it’s– Well, we’ll get into
the performance of that. There’s–also we got– Well, there’s a lot of
optimizations that we can do that are–or really
that BigTable does for us. The thing we need to avoid is doing a sequential
read or write, which is ask BigTable
for something, wait for an answer, ask it for something else. Because Mercurial is sitting
on a local file system. It can do that very quickly. And it’s typically–
it’ll all be in memory, and for us that is maybe,
I don’t know, 20 millisecond round-trip
each time. So our goal is to let BigTable
do the concurrency. We throw all of our,
you know– If we have writes, we throw them into a pool and wait for all of them
to finish, and BigTable gets back
to us eventually. Same with reads. And then we do a little bit
of computation to minimize
this graph walking for things that we know
the client is going to ask us. It’s going to ask,
“Hey, for this particular node that I don’t have, what is the farthest back node before a merge?” ‘Cause that’s
a single line of history that it’s going to want
to investigate further, so that’s the sort of thing
we can pre-compute, because it’s all in the past, and it’s guaranteed
never to change. So we can just store it
when we do the writes.
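(A sketch of one such pre-computation in Python, over the same parents-map model: walk back from a node until history stops being a single line, that is, until the node itself is a merge or root, or its parent is a merge. Because the history behind a node never changes, the answer can be written out once, at push time.)

```python
def linear_ancestor(parents, node):
    """Farthest-back ancestor of `node` before the line of history hits a merge."""
    current = node
    while True:
        ps = parents[current]
        if len(ps) != 1 or len(parents[ps[0]]) > 1:
            return current          # root or merge here, or the parent is a merge
        current = ps[0]

history = {"c1": [], "c2": ["c1"], "c3": ["c2"], "x1": ["c1"],
           "c4": ["c3", "x1"], "c5": ["c4"], "c6": ["c5"]}
print(linear_ancestor(history, "c6"))  # 'c5': c5..c6 is a single line of history
```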
So the actual results we get from our implementation: pushing– When the client
is pushing to us, that is ridiculously fast. We just accept data
and write it into the table, and because there are many,
many machines involved here, this can happen way faster than actual
file system writes. The pull, that negotiation
process at the beginning, is still a certain amount
of synchronous reads where we have to wait
for an answer, and that’s something
that’s even– We can’t do anything about
a lot of those round-trips, so even at the best case, once we’re finished optimizing
our implementation, we might still be, say,
twice as slow as Mercurial, because of the difference
in architecture. And then
for the source browser, we’re very fast
for certain operations. Retrieving a particular
revision of the file– that’s just a table lookup. Retrieving the history
of the file– so in Mercurial, that is just tracing
a single revlog that holds the whole contents
of a file. That’s slow for us, ’cause we have to do
a whole bunch of graph walking, and that’s something
we’re working on. You’ll notice that’s slow
on the web front end and we’re– but we have some stuff we can do
to make it faster still. But overall, though– So those numbers
for a single repository also hold true for however much
we want to serve. A, you know, single job
running in a data center, we can serve hundreds
of queries per second on the source browser across
all sorts of repositories. And for the pushing
and pulling, we can do tens
of megabits per second from the single job, which is rather fast
for a Python program on a single computer. So we don’t run into things
like lock contention as much as you would if you had
the stock Mercurial server trying to do
that sort of thing. So the lessons we can learn
from this is, you know, engineering, it’s really
about trade-offs, and there are things that are– The design goals very much
affect the end product. Our design goal is that if the entire project-hosting
ecosystem– Well, if all of Google code decided to switch their projects
today to Mercurial, we could handle it maybe by throwing a couple more
computers at it, bring up a couple more jobs, and we can
handle the traffic fine. But at the same time, that, I mean,
involves a lot of– When you’re sharding things
across computers like that, it’s always–
there are certain operations that you’re not
going to be able to do as fast as if you just had
a single process talking directly to disk. So the scalability there
will always have a price. So that’s something that– When you’re starting
to design a system, these are the things
you need to keep in mind. And one more cool thing
that we have… I think that’s
a Lego clone army. I don’t know. But so you notice that– We’ll go back to this… All of that data
in that top table, that’s all immutable. And one thing that’s very common
in a distributed workflow is having, you know,
multiple copies of this– more or less,
the same repository. And that is, for Mercurial, that is something that if you
want to clone a local repository that’s sitting on disk… Well, it doesn’t do any work. It actually hard-links
the new repository to the old, and then when you
actually do a write, it then goes, “Whoa, this file
has two references to it. It’s linked somewhere else.” And then it actually
does the copy then. So cloning is wicked fast. But you have–
Oh, I should stop doing that. But it will slow down
your operations randomly later. So it sort of…it works. We can do a little bit
better than that, because all that– If we were to clone a repository
on the server side, all that actual revision data
that’s stored in that table, that’s all immutable, and whatever happens
for the divergent history of these two repositories, that data can remain shared. So the only thing that uniquely
identifies a repository is that list of heads, and the list of branches
and things like that–and also, if we have
any sort of references pointing forward
that we’ve pre-computed, those are things we have
to watch out for, but the bulk of the data
is shared.
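(In other words, a server-side clone can be about as cheap as copying that one small repository row; every node row it points at is immutable and stays shared. A sketch on the same hypothetical layout, leaving aside the pre-computed forward-pointing data that would need extra care.)

```python
def clone(table, src_repo, dst_repo):
    """Server-side clone: copy only the heads row; share every node row.

    The changeset, manifest, and file rows are keyed by hash and never
    change, so the new repository simply keeps referring to them.
    """
    table[f"{dst_repo}/INFO"] = dict(table[f"{src_repo}/INFO"])

table = {"hello-world/INFO": {"heads": ["9f3bc1"], "branches": {}}}
clone(table, "hello-world", "hello-world-fork")
print(table["hello-world-fork/INFO"])
```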
So that’s something that we can certainly do pretty easily from a technical perspective. Getting the right UI and integration with our
existing project hosting is something that’ll
take more effort, but it’s something that you can
look forward to in the near future
in your projects. And we believe that,
you know, the distributed workflow, I mean,
it’s not really just– You don’t want to be able
to use Mercurial. Though you might enjoy
using Mercurial, you want to be able
to take advantage of the workflow of
distributed version control, and that’s what we hope
to support. Thank you.
Any questions? [applause] man: I have one. Lee: Yeah. man: Kind of unrelated, but we are always
looking for ways to let our, like,
graphic designers store big Photoshop files. Kind of doesn’t work
in source control, but I’ve never seen
another solution. You know, we want revisioning
in that process. But, you know.
Do you have any solutions? Lee: Write one
and tell us about it. man: No, you’re Google.
You should know things. Lee: Well, actually, for
projects that are doing this, we recommend using
our downloads server, which you can at least give,
you know, revision numbers. You have to do it yourself,
it’s a pain, but it’s doable, and that is relying
on a different, totally different
massive storage system. man: All right. Well, maybe we’ll start
an open source project. man: Hi. You talked about
the changesets sort of having
a similar hash value, and then that’s why
they hash together– well, close to each other. Is that– Lee: Well, say a change
and its parents will have totally different
hashes, right? Because it tries to be
a reasonably smooth function, or reasonably
arbitrary function. But they all have
the same prefix of the repository name and then “C.” As opposed to, say,
the files that are manifest, that are also stored
in the same table. To be clear,
that was all one BigTable. man: Yeah, so, I mean, so my question is more probably
related to BigTable. So this sort of the same
prefix hashing to similar locations, is that something that is
by default in BigTable, or did you have to do
something for that? Lee:
No, that’s how it works. Rows are stored
lexicographically and then sharded
by their prefix. man: So it’s a
location-preserving hash table? Lee: Sorry? man:
Like, I mean, BigTable is like a
location-preserving hash… I mean, it preserves,
like, I mean, based on– Lee: No, no, it’ll just–
it’ll start on one tablet, and then when it gets big enough
as it deems appropriate. man: Thanks. Lee: It’s not something
we really have to deal with. It’s BigTable magic. man:
I was just wondering if you imagine Google eventually
using something like Mercurial for its own
internal development, and compared to, you know, big systems like Perforce
have trade-offs, and can you talk a little about
what those trade-offs are, and what systems like Perforce do better or worse
than Mercurial and sort of how
you think that through? Lee: Well, we’re sort of
unique at Google. We have our–
basically one giant repository for, like, everything. So Perforce scales… Well, crack engineers whom I’m–
don’t know who they are do a good job of scaling it, and so I don’t know actually if there are any developments
on that front. man: I’ve been trying
to evaluate the different distributed
version control systems, and one of the big things
in my environment is being able to import existing
subversion repositories into whatever it is, and I ran into
quite a few problems using Mercurial and such. I’ve been kind of
leaning towards Bazaar, and I was just wondering what
was your experience with that, and how poorly did it go, and that kind of thing? Lee: Well, the actual import,
I found, is straightforward. Getting all
of the correct bindings is a pain in the neck
right now. You have to get
the subversion Python bindings, because Mercurial is written
all in Python. That’s kind of
the most annoying one. If you have, say,
a recent Ubuntu distribution, you might have it
all out of the box. Otherwise you have
some compiling from source in your future,
and that’s loads of fun. So that’s something–
It would probably be, I mean, GitHub I know actually has some sort of automated imports for existing
subversion repositories, which is really cool. So… Yeah? man: Do you offer
transition from subversion to Mercurial currently? Lee: Yes. As of this afternoon, you can select
from the little drop-down which system to use, but unfortunately you do
have to do the migration yourself, which we know is annoying. man: Just interested, can you extend this concept to content repositories? Like, say, does Google Docs also use
similar version systems? Lee: Sorry? man: Like, Google Docs, does it use BigTable
for its content? Lee: Oh.
I don’t know what Docs uses. A lot of teams at Google
use BigTable, but, I mean, Docs and sites
all do versioning, but they’re certainly not,
like, using Mercurial or Git behind the scenes. man: I was wondering
about your experience while changing the,
let’s say, writing or disk back-end of
Mercurial to work with BigTable. If you’re, for instance, going to release these
modifications to Mercurial to the public, as a first partial question, and then what’s your opinion
or your advice, ’cause I’m very interested in being able to change
that back-end instead of writing
directly to disk– writing, for example,
to our relational database. Lee: Ooh. That’d be fun. Well, it turns out, so we’re not that fundamental
a structure of the revlog. We had to totally
replace that, and that’s sort of the bulk
of the interesting operat– That coupled with us only dealing
with the network operations means we use the little bits of,
say, like the parsing code, but most of the actual
implementation is internal. So it’d be something
that’s sort of challenging to open source because of
the dependency on BigTable, but I hope at the very least you can expect maybe a white
paper or something from us. If not,
as upside of…yeah. man:
I was mainly thinking of just maybe a small abstraction layer where you can hook your,
let’s say, persistence strategy. Lee: Yeah, you can–
the Mercurial code base is relatively clean. It’d be sort of doable. But you have to keep in mind your access patterns
are going to– or your performance
characteristics are going to be
totally different. So if it hopes to be able
to do random access by integer index
for changesets, and do that quickly, you need to be able
to support that. Which, actually,
in, you know, say, a
stock-relational database, probably would be doable. We don’t use integer I.D.s, you know, anywhere
in our implementation. It’s only hashes, so ours
is pretty quirky in that, so… man: Okay, and just
a very small question more. Do you support sub-tree
cloning or exporting? Yeah, I mean, instead
of the full repository, just the subset of the path. Lee: No, I don’t think
Mercurial does that. man: Mercurial does not. Lee: Yeah, no,
so we don’t either. man: Okay, thank you. Lee: It’s not in the protocol. man: Hi. A lot of CM systems
are set up at the beginning to facilitate the rest
of the engineering process. Do you have any plans
to integrate or support continuous integration
on the back end, tying into bug systems where you can track changes both in the
issue tracking system and in the code system
linked together? Lee: Some of that
is possible right now. We support web hooks, so when there’s a new commit
in the repository or a new push for Mercurial, we just hit an external URL. That URL then could get
that revision and run, say, a continuous
build from that. It’s something that–
it would be sort of challenging for us to support internally, just ’cause every build system is so radically unique, so that’s the best bet
for now. man: And what about integration
with bug tracking? Lee:
What sort of integration? man: Well, a lot of
well developed CM systems have this notion
that you have a change, and the change incorporates
both the code change and the issues
that are associated with it. Lee: Versioned issues.
No, we– man:
Not versioning the issues, but tying the issues together with the source-code changes. Lee:
We have some integration where your commits
include commands to the issue tracker. Like, “Fixes issue number 23.” It will actually close
issue 23, so that’s sort of
the most we do. man: Okay. Lee: All right? Thank you.
We have T-shirts. [applause]