Spaces:
Sleeping
Sleeping
'\" t | |
.\" Title: gitformat-commit-graph | |
.\" Author: [FIXME: author] [see http://www.docbook.org/tdg5/en/html/author] | |
.\" Generator: DocBook XSL Stylesheets vsnapshot <http://docbook.sf.net/> | |
.\" Date: 04/24/2023 | |
.\" Manual: Git Manual | |
.\" Source: Git 2.40.1 | |
.\" Language: English | |
.\" | |
.TH "GITFORMAT\-COMMIT\-G" "5" "04/24/2023" "Git 2\&.40\&.1" "Git Manual" | |
.\" ----------------------------------------------------------------- | |
.\" * Define some portability stuff | |
.\" ----------------------------------------------------------------- | |
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
.\" http://bugs.debian.org/507673 | |
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html | |
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
.ie \n(.g .ds Aq \(aq | |
.el .ds Aq ' | |
.\" ----------------------------------------------------------------- | |
.\" * set default formatting | |
.\" ----------------------------------------------------------------- | |
.\" disable hyphenation | |
.nh | |
.\" disable justification (adjust text to left margin only) | |
.ad l | |
.\" ----------------------------------------------------------------- | |
.\" * MAIN CONTENT STARTS HERE * | |
.\" ----------------------------------------------------------------- | |
.SH "NAME" | |
gitformat-commit-graph \- Git commit\-graph format | |
.SH "SYNOPSIS" | |
.sp | |
.nf | |
$GIT_DIR/objects/info/commit\-graph | |
$GIT_DIR/objects/info/commit\-graphs/* | |
.fi | |
.sp | |
.SH "DESCRIPTION" | |
.sp | |
The Git commit\-graph stores a list of commit OIDs and some associated metadata, including: | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The generation number of the commit\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The root tree OID\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The commit date\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The parents of the commit, stored using positional references within the graph file\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The Bloom filter of the commit carrying the paths that were changed between the commit and its first parent, if requested\&. | |
.RE | |
.sp | |
These positional references are stored as unsigned 32\-bit integers corresponding to the array position within the list of commit OIDs\&. Due to some special constants we use to track parents, we can store at most (1 << 30) + (1 << 29) + (1 << 28) \- 1 (around 1\&.8 billion) commits\&. | |
.SH "COMMIT\-GRAPH FILES HAVE THE FOLLOWING FORMAT:" | |
.sp | |
In order to allow extensions that add extra data to the graph, we organize the body into "chunks" and provide a binary lookup table at the beginning of the body\&. The header includes certain values, such as number of chunks and hash type\&. | |
.sp | |
All multi\-byte numbers are in network byte order\&. | |
.SS "HEADER:" | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
4\-byte signature: | |
The signature is: {\*(AqC\*(Aq, \*(AqG\*(Aq, \*(AqP\*(Aq, \*(AqH\*(Aq} | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
1\-byte version number: | |
Currently, the only valid version is 1\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
1\-byte Hash Version | |
We infer the hash length (H) from this value: | |
1 => SHA\-1 | |
2 => SHA\-256 | |
If the hash type does not match the repository\*(Aqs hash algorithm, the | |
commit\-graph file should be ignored with a warning presented to the | |
user\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
1\-byte number (C) of "chunks" | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
1\-byte number (B) of base commit\-graphs | |
We infer the length (H*B) of the Base Graphs chunk | |
from this value\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.SS "CHUNK LOOKUP:" | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
(C + 1) * 12 bytes listing the table of contents for the chunks: | |
First 4 bytes describe the chunk id\&. Value 0 is a terminating label\&. | |
Other 8 bytes provide the byte\-offset in current file for chunk to | |
start\&. (Chunks are ordered contiguously in the file, so you can infer | |
the length using the next chunk position if necessary\&.) Each chunk | |
ID appears at most once\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
The CHUNK LOOKUP matches the table of contents from | |
the chunk\-based file format, see linkgit:gitformat\-chunk[5] | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
The remaining data in the body is described one chunk at a time, and | |
these chunks may be given in any order\&. Chunks are required unless | |
otherwise specified\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.SS "CHUNK DATA:" | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBOID Fanout (ID: {O, I, D, F}) (256 * 4 bytes)\fR | |
.RS 4 | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
The ith entry, F[i], stores the number of OIDs with first | |
byte at most i\&. Thus F[255] stores the total | |
number of commits (N)\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.RE | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBOID Lookup (ID: {O, I, D, L}) (N * H bytes)\fR | |
.RS 4 | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
The OIDs for all commits in the graph, sorted in ascending order\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.RE | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBCommit Data (ID: {C, D, A, T }) (N * (H + 16) bytes)\fR | |
.RS 4 | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The first H bytes are for the OID of the root tree\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The next 8 bytes are for the positions of the first two parents of the ith commit\&. Stores value 0x70000000 if no parent in that position\&. If there are more than two parents, the second value has its most\-significant bit on and the other bits store an array position into the Extra Edge List chunk\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The next 8 bytes store the topological level (generation number v1) of the commit and the commit time in seconds since EPOCH\&. The generation number uses the higher 30 bits of the first 4 bytes, while the commit time uses the 32 bits of the second 4 bytes, along with the lowest 2 bits of the lowest byte, storing the 33rd and 34th bit of the commit time\&. | |
.RE | |
.RE | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBGeneration Data (ID: {G, D, A, 2 }) (N * 4 bytes) [Optional]\fR | |
.RS 4 | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
This list of 4\-byte values store corrected commit date offsets for the commits, arranged in the same order as commit data chunk\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
If the corrected commit date offset cannot be stored within 31 bits, the value has its most\-significant bit on and the other bits store the position of corrected commit date into the Generation Data Overflow chunk\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
Generation Data chunk is present only when commit\-graph file is written by compatible versions of Git and in case of split commit\-graph chains, the topmost layer also has Generation Data chunk\&. | |
.RE | |
.RE | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBGeneration Data Overflow (ID: {G, D, O, 2 }) [Optional]\fR | |
.RS 4 | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
This list of 8\-byte values stores the corrected commit date offsets for commits with corrected commit date offsets that cannot be stored within 31 bits\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
Generation Data Overflow chunk is present only when Generation Data chunk is present and atleast one corrected commit date offset cannot be stored within 31 bits\&. | |
.RE | |
.RE | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBExtra Edge List (ID: {E, D, G, E}) [Optional]\fR | |
.RS 4 | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
This list of 4\-byte values store the second through nth parents for | |
all octopus merges\&. The second parent value in the commit data stores | |
an array position within this list along with the most\-significant bit | |
on\&. Starting at that array position, iterate through this list of commit | |
positions for the parents until reaching a value with the most\-significant | |
bit on\&. The other bits correspond to the position of the last parent\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.RE | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBBloom Filter Index (ID: {B, I, D, X}) (N * 4 bytes) [Optional]\fR | |
.RS 4 | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The ith entry, BIDX[i], stores the number of bytes in all Bloom filters from commit 0 to commit i (inclusive) in lexicographic order\&. The Bloom filter for the i\-th commit spans from BIDX[i\-1] to BIDX[i] (plus header length), where BIDX[\-1] is 0\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The BIDX chunk is ignored if the BDAT chunk is not present\&. | |
.RE | |
.RE | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBBloom Filter Data (ID: {B, D, A, T}) [Optional]\fR | |
.RS 4 | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
It starts with header consisting of three unsigned 32\-bit integers: | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
Version of the hash algorithm being used\&. We currently only support value 1 which corresponds to the 32\-bit version of the murmur3 hash implemented exactly as described in | |
\m[blue]\fBhttps://en\&.wikipedia\&.org/wiki/MurmurHash#Algorithm\fR\m[] | |
and the double hashing technique using seed values 0x293ae76f and 0x7e646e2 as described in | |
\m[blue]\fBhttps://doi\&.org/10\&.1007/978\-3\-540\-30494\-4_26\fR\m[] | |
"Bloom Filters in Probabilistic Verification" | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The number of times a path is hashed and hence the number of bit positions that cumulatively determine whether a file is present in the commit\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The minimum number of bits | |
\fIb\fR | |
per entry in the Bloom filter\&. If the filter contains | |
\fIn\fR | |
entries, then the filter size is the minimum number of 64\-bit words that contain n*b bits\&. | |
.RE | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The rest of the chunk is the concatenation of all the computed Bloom filters for the commits in lexicographic order\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
Note: Commits with no changes or more than 512 changes have Bloom filters of length one, with either all bits set to zero or one respectively\&. | |
.RE | |
.sp | |
.RS 4 | |
.ie n \{\ | |
\h'-04'\(bu\h'+03'\c | |
.\} | |
.el \{\ | |
.sp -1 | |
.IP \(bu 2.3 | |
.\} | |
The BDAT chunk is present if and only if BIDX is present\&. | |
.RE | |
.RE | |
.sp | |
.it 1 an-trap | |
.nr an-no-space-flag 1 | |
.nr an-break-flag 1 | |
.br | |
.ps +1 | |
\fBBase Graphs List (ID: {B, A, S, E}) [Optional]\fR | |
.RS 4 | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
This list of H\-byte hashes describe a set of B commit\-graph files that | |
form a commit\-graph chain\&. The graph position for the ith commit in this | |
file\*(Aqs OID Lookup chunk is equal to i plus the number of commits in all | |
base graphs\&. If B is non\-zero, this chunk must exist\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.RE | |
.SS "TRAILER:" | |
.sp | |
.if n \{\ | |
.RS 4 | |
.\} | |
.nf | |
H\-byte HASH\-checksum of all of the above\&. | |
.fi | |
.if n \{\ | |
.RE | |
.\} | |
.SH "HISTORICAL NOTES:" | |
.sp | |
The Generation Data (GDA2) and Generation Data Overflow (GDO2) chunks have the number \fI2\fR in their chunk IDs because a previous version of Git wrote possibly erroneous data in these chunks with the IDs "GDAT" and "GDOV"\&. By changing the IDs, newer versions of Git will silently ignore those older chunks and write the new information without trusting the incorrect data\&. | |
.SH "GIT" | |
.sp | |
Part of the \fBgit\fR(1) suite | |