Spaces:
Running
Running
Concerning Git's Packing Heuristics | |
=================================== | |
Oh, here's a really stupid question: | |
Where do I go | |
to learn the details | |
of Git's packing heuristics? | |
Be careful what you ask! | |
Followers of the Git, please open the Git IRC Log and turn to | |
February 10, 2006. | |
It's a rare occasion, and we are joined by the King Git Himself, | |
Linus Torvalds (linus). Nathaniel Smith, (njs`), has the floor | |
and seeks enlightenment. Others are present, but silent. | |
Let's listen in! | |
<njs`> Oh, here's a really stupid question -- where do I go to | |
learn the details of Git's packing heuristics? google avails | |
me not, reading the source didn't help a lot, and wading | |
through the whole mailing list seems less efficient than any | |
of that. | |
It is a bold start! A plea for help combined with a simultaneous | |
tri-part attack on some of the tried and true mainstays in the quest | |
for enlightenment. Brash accusations of google being useless. Hubris! | |
Maligning the source. Heresy! Disdain for the mailing list archives. | |
Woe. | |
<pasky> yes, the packing-related delta stuff is somewhat | |
mysterious even for me ;) | |
Ah! Modesty after all. | |
<linus> njs, I don't think the docs exist. That's something where | |
I don't think anybody else than me even really got involved. | |
Most of the rest of Git others have been busy with (especially | |
Junio), but packing nobody touched after I did it. | |
It's cryptic, yet vague. Linus in style for sure. Wise men | |
interpret this as an apology. A few argue it is merely a | |
statement of fact. | |
<njs`> I guess the next step is "read the source again", but I | |
have to build up a certain level of gumption first :-) | |
Indeed! On both points. | |
<linus> The packing heuristic is actually really really simple. | |
Bait... | |
<linus> But strange. | |
And switch. That ought to do it! | |
<linus> Remember: Git really doesn't follow files. So what it does is | |
- generate a list of all objects | |
- sort the list according to magic heuristics | |
- walk the list, using a sliding window, seeing if an object | |
can be diffed against another object in the window | |
- write out the list in recency order | |
The traditional understatement: | |
<njs`> I suspect that what I'm missing is the precise definition of | |
the word "magic" | |
The traditional insight: | |
<pasky> yes | |
And Babel-like confusion flowed. | |
<njs`> oh, hmm, and I'm not sure what this sliding window means either | |
<pasky> iirc, it appeared to me to be just the sha1 of the object | |
when reading the code casually ... | |
... which simply doesn't sound as a very good heuristics, though ;) | |
<njs`> .....and recency order. okay, I think it's clear I didn't | |
even realize how much I wasn't realizing :-) | |
Ah, grasshopper! And thus the enlightenment begins anew. | |
<linus> The "magic" is actually in theory totally arbitrary. | |
ANY order will give you a working pack, but no, it's not | |
ordered by SHA-1. | |
Before talking about the ordering for the sliding delta | |
window, let's talk about the recency order. That's more | |
important in one way. | |
<njs`> Right, but if all you want is a working way to pack things | |
together, you could just use cat and save yourself some | |
trouble... | |
Waaait for it.... | |
<linus> The recency ordering (which is basically: put objects | |
_physically_ into the pack in the order that they are | |
"reachable" from the head) is important. | |
<njs`> okay | |
<linus> It's important because that's the thing that gives packs | |
good locality. It keeps the objects close to the head (whether | |
they are old or new, but they are _reachable_ from the head) | |
at the head of the pack. So packs actually have absolutely | |
_wonderful_ IO patterns. | |
Read that again, because it is important. | |
<linus> But recency ordering is totally useless for deciding how | |
to actually generate the deltas, so the delta ordering is | |
something else. | |
The delta ordering is (wait for it): | |
- first sort by the "basename" of the object, as defined by | |
the name the object was _first_ reached through when | |
generating the object list | |
- within the same basename, sort by size of the object | |
- but always sort different types separately (commits first). | |
That's not exactly it, but it's very close. | |
<njs`> The "_first_ reached" thing is not too important, just you | |
need some way to break ties since the same objects may be | |
reachable many ways, yes? | |
And as if to clarify: | |
<linus> The point is that it's all really just any random | |
heuristic, and the ordering is totally unimportant for | |
correctness, but it helps a lot if the heuristic gives | |
"clumping" for things that are likely to delta well against | |
each other. | |
It is an important point, so secretly, I did my own research and have | |
included my results below. To be fair, it has changed some over time. | |
And through the magic of Revisionistic History, I draw upon this entry | |
from The Git IRC Logs on my father's birthday, March 1: | |
<gitster> The quote from the above linus should be rewritten a | |
bit (wait for it): | |
- first sort by type. Different objects never delta with | |
each other. | |
- then sort by filename/dirname. hash of the basename | |
occupies the top BITS_PER_INT-DIR_BITS bits, and bottom | |
DIR_BITS are for the hash of leading path elements. | |
- then if we are doing "thin" pack, the objects we are _not_ | |
going to pack but we know about are sorted earlier than | |
other objects. | |
- and finally sort by size, larger to smaller. | |
In one swell-foop, clarification and obscurification! Nonetheless, | |
authoritative. Cryptic, yet concise. It even solicits notions of | |
quotes from The Source Code. Clearly, more study is needed. | |
<gitster> That's the sort order. What this means is: | |
- we do not delta different object types. | |
- we prefer to delta the objects with the same full path, but | |
allow files with the same name from different directories. | |
- we always prefer to delta against objects we are not going | |
to send, if there are some. | |
- we prefer to delta against larger objects, so that we have | |
lots of removals. | |
The penultimate rule is for "thin" packs. It is used when | |
the other side is known to have such objects. | |
There it is again. "Thin" packs. I'm thinking to myself, "What | |
is a 'thin' pack?" So I ask: | |
<jdl> What is a "thin" pack? | |
<gitster> Use of --objects-edge to rev-list as the upstream of | |
pack-objects. The pack transfer protocol negotiates that. | |
Woo hoo! Cleared that _right_ up! | |
<gitster> There are two directions - push and fetch. | |
There! Did you see it? It is not '"push" and "pull"'! How often the | |
confusion has started here. So casually mentioned, too! | |
<gitster> For push, git-send-pack invokes git-receive-pack on the | |
other end. The receive-pack says "I have up to these commits". | |
send-pack looks at them, and computes what are missing from | |
the other end. So "thin" could be the default there. | |
In the other direction, fetch, git-fetch-pack and | |
git-clone-pack invokes git-upload-pack on the other end | |
(via ssh or by talking to the daemon). | |
There are two cases: fetch-pack with -k and clone-pack is one, | |
fetch-pack without -k is the other. clone-pack and fetch-pack | |
with -k will keep the downloaded packfile without expanded, so | |
we do not use thin pack transfer. Otherwise, the generated | |
pack will have delta without base object in the same pack. | |
But fetch-pack without -k will explode the received pack into | |
individual objects, so we automatically ask upload-pack to | |
give us a thin pack if upload-pack supports it. | |
OK then. | |
Uh. | |
Let's return to the previous conversation still in progress. | |
<njs`> and "basename" means something like "the tail of end of | |
path of file objects and dir objects, as per basename(3), and | |
we just declare all commit and tag objects to have the same | |
basename" or something? | |
Luckily, that too is a point that gitster clarified for us! | |
If I might add, the trick is to make files that _might_ be similar be | |
located close to each other in the hash buckets based on their file | |
names. It used to be that "foo/Makefile", "bar/baz/quux/Makefile" and | |
"Makefile" all landed in the same bucket due to their common basename, | |
"Makefile". However, now they land in "close" buckets. | |
The algorithm allows not just for the _same_ bucket, but for _close_ | |
buckets to be considered delta candidates. The rationale is | |
essentially that files, like Makefiles, often have very similar | |
content no matter what directory they live in. | |
<linus> I played around with different delta algorithms, and with | |
making the "delta window" bigger, but having too big of a | |
sliding window makes it very expensive to generate the pack: | |
you need to compare every object with a _ton_ of other objects. | |
There are a number of other trivial heuristics too, which | |
basically boil down to "don't bother even trying to delta this | |
pair" if we can tell before-hand that the delta isn't worth it | |
(due to size differences, where we can take a previous delta | |
result into account to decide that "ok, no point in trying | |
that one, it will be worse"). | |
End result: packing is actually very size efficient. It's | |
somewhat CPU-wasteful, but on the other hand, since you're | |
really only supposed to do it maybe once a month (and you can | |
do it during the night), nobody really seems to care. | |
Nice Engineering Touch, there. Find when it doesn't matter, and | |
proclaim it a non-issue. Good style too! | |
<njs`> So, just to repeat to see if I'm following, we start by | |
getting a list of the objects we want to pack, we sort it by | |
this heuristic (basically lexicographically on the tuple | |
(type, basename, size)). | |
Then we walk through this list, and calculate a delta of | |
each object against the last n (tunable parameter) objects, | |
and pick the smallest of these deltas. | |
Vastly simplified, but the essence is there! | |
<linus> Correct. | |
<njs`> And then once we have picked a delta or fulltext to | |
represent each object, we re-sort by recency, and write them | |
out in that order. | |
<linus> Yup. Some other small details: | |
And of course there is the "Other Shoe" Factor too. | |
<linus> - We limit the delta depth to another magic value (right | |
now both the window and delta depth magic values are just "10") | |
<njs`> Hrm, my intuition is that you'd end up with really _bad_ IO | |
patterns, because the things you want are near by, but to | |
actually reconstruct them you may have to jump all over in | |
random ways. | |
<linus> - When we write out a delta, and we haven't yet written | |
out the object it is a delta against, we write out the base | |
object first. And no, when we reconstruct them, we actually | |
get nice IO patterns, because: | |
- larger objects tend to be "more recent" (Linus' law: files grow) | |
- we actively try to generate deltas from a larger object to a | |
smaller one | |
- this means that the top-of-tree very seldom has deltas | |
(i.e. deltas in _practice_ are "backwards deltas") | |
Again, we should reread that whole paragraph. Not just because | |
Linus has slipped Linus's Law in there on us, but because it is | |
important. Let's make sure we clarify some of the points here: | |
<njs`> So the point is just that in practice, delta order and | |
recency order match each other quite well. | |
<linus> Yes. There's another nice side to this (and yes, it was | |
designed that way ;): | |
- the reason we generate deltas against the larger object is | |
actually a big space saver too! | |
<njs`> Hmm, but your last comment (if "we haven't yet written out | |
the object it is a delta against, we write out the base object | |
first"), seems like it would make these facts mostly | |
irrelevant because even if in practice you would not have to | |
wander around much, in fact you just brute-force say that in | |
the cases where you might have to wander, don't do that :-) | |
<linus> Yes and no. Notice the rule: we only write out the base | |
object first if the delta against it was more recent. That | |
means that you can actually have deltas that refer to a base | |
object that is _not_ close to the delta object, but that only | |
happens when the delta is needed to generate an _old_ object. | |
<linus> See? | |
Yeah, no. I missed that on the first two or three readings myself. | |
<linus> This keeps the front of the pack dense. The front of the | |
pack never contains data that isn't relevant to a "recent" | |
object. The size optimization comes from our use of xdelta | |
(but is true for many other delta algorithms): removing data | |
is cheaper (in size) than adding data. | |
When you remove data, you only need to say "copy bytes n--m". | |
In contrast, in a delta that _adds_ data, you have to say "add | |
these bytes: 'actual data goes here'" | |
*** njs` has quit: Read error: 104 (Connection reset by peer) | |
<linus> Uhhuh. I hope I didn't blow njs` mind. | |
*** njs` has joined channel #git | |
<pasky> :) | |
The silent observers are amused. Of course. | |
And as if njs` was expected to be omniscient: | |
<linus> njs - did you miss anything? | |
OK, I'll spell it out. That's Geek Humor. If njs` was not actually | |
connected for a little bit there, how would he know if missed anything | |
while he was disconnected? He's a benevolent dictator with a sense of | |
humor! Well noted! | |
<njs`> Stupid router. Or gremlins, or whatever. | |
It's a cheap shot at Cisco. Take 'em when you can. | |
<njs`> Yes and no. Notice the rule: we only write out the base | |
object first if the delta against it was more recent. | |
I'm getting lost in all these orders, let me re-read :-) | |
So the write-out order is from most recent to least recent? | |
(Conceivably it could be the opposite way too, I'm not sure if | |
we've said) though my connection back at home is logging, so I | |
can just read what you said there :-) | |
And for those of you paying attention, the Omniscient Trick has just | |
been detailed! | |
<linus> Yes, we always write out most recent first | |
<njs`> And, yeah, I got the part about deeper-in-history stuff | |
having worse IO characteristics, one sort of doesn't care. | |
<linus> With the caveat that if the "most recent" needs an older | |
object to delta against (hey, shrinking sometimes does | |
happen), we write out the old object with the delta. | |
<njs`> (if only it happened more...) | |
<linus> Anyway, the pack-file could easily be denser still, but | |
because it's used both for streaming (the Git protocol) and | |
for on-disk, it has a few pessimizations. | |
Actually, it is a made-up word. But it is a made-up word being | |
used as setup for a later optimization, which is a real word: | |
<linus> In particular, while the pack-file is then compressed, | |
it's compressed just one object at a time, so the actual | |
compression factor is less than it could be in theory. But it | |
means that it's all nice random-access with a simple index to | |
do "object name->location in packfile" translation. | |
<njs`> I'm assuming the real win for delta-ing large->small is | |
more homogeneous statistics for gzip to run over? | |
(You have to put the bytes in one place or another, but | |
putting them in a larger blob wins on compression) | |
Actually, what is the compression strategy -- each delta | |
individually gzipped, the whole file gzipped, somewhere in | |
between, no compression at all, ....? | |
Right. | |
Reality IRC sets in. For example: | |
<pasky> I'll read the rest in the morning, I really have to go | |
sleep or there's no hope whatsoever for me at the today's | |
exam... g'nite all. | |
Heh. | |
<linus> pasky: g'nite | |
<njs`> pasky: 'luck | |
<linus> Right: large->small matters exactly because of compression | |
behaviour. If it was non-compressed, it probably wouldn't make | |
any difference. | |
<njs`> yeah | |
<linus> Anyway: I'm not even trying to claim that the pack-files | |
are perfect, but they do tend to have a nice balance of | |
density vs ease-of use. | |
Gasp! OK, saved. That's a fair Engineering trade off. Close call! | |
In fact, Linus reflects on some Basic Engineering Fundamentals, | |
design options, etc. | |
<linus> More importantly, they allow Git to still _conceptually_ | |
never deal with deltas at all, and be a "whole object" store. | |
Which has some problems (we discussed bad huge-file | |
behaviour on the Git lists the other day), but it does mean | |
that the basic Git concepts are really really simple and | |
straightforward. | |
It's all been quite stable. | |
Which I think is very much a result of having very simple | |
basic ideas, so that there's never any confusion about what's | |
going on. | |
Bugs happen, but they are "simple" bugs. And bugs that | |
actually get some object store detail wrong are almost always | |
so obvious that they never go anywhere. | |
<njs`> Yeah. | |
Nuff said. | |
<linus> Anyway. I'm off for bed. It's not 6AM here, but I've got | |
three kids, and have to get up early in the morning to send | |
them off. I need my beauty sleep. | |
<njs`> :-) | |
<njs`> appreciate the infodump, I really was failing to find the | |
details on Git packs :-) | |
And now you know the rest of the story. | |