Git Internals
Notes from "Git from the Ground Up" by Safia Abdalla.
Summary
- Git represents key information as objects stored on the file system; git cat-file is useful for exploring these objects
- Git compresses loose objects into packfiles to increase space efficiency (see also: Packfiles: How Git Repositories Stay so Small)
- Rebases and merges differ in what they preserve: rebasing favors a linear history, while merging keeps branches explicit
Types of Objects stored in .git/
Blobs represent file data.
Trees reference multiple blobs and other trees, similar to a directory structure.
Commits reference specific trees plus metadata, such as when the commit was made, the committer, and the commit message.
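Tags are named references to specific commits. Annotated tags are stored as objects of their own; lightweight tags are plain refs.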
Git objects have a type, size, and content.
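The type, size, and content can be seen directly in how an object is serialized. The sketch below assumes a SHA-1 repository (the default) and the loose-object layout of `<type> <size>\0<content>`, zlib-compressed on disk. The helper names (`encode_object`, `object_id`, `decode_object`) are made up for these notes and are not part of Git.

```python
import hashlib
import zlib


def encode_object(obj_type: str, content: bytes) -> bytes:
    """Build the uncompressed form of an object: '<type> <size>' + NUL + content."""
    header = f"{obj_type} {len(content)}".encode("ascii")
    return header + b"\x00" + content


def object_id(obj_type: str, content: bytes) -> str:
    """Hash header plus content with SHA-1 (assumes a SHA-1 repository)."""
    return hashlib.sha1(encode_object(obj_type, content)).hexdigest()


def decode_object(compressed: bytes) -> tuple[str, int, bytes]:
    """Parse zlib-compressed loose-object bytes back into (type, size, content)."""
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    return obj_type, int(size), content


if __name__ == "__main__":
    data = b"test content\n"
    # What Git would write under .git/objects/ for this blob, before packing.
    loose = zlib.compress(encode_object("blob", data))
    print(object_id("blob", data))
    print(decode_object(loose))
```

The printed ID should match what `git hash-object --stdin` reports for the same bytes, and the decoded tuple corresponds roughly to what `git cat-file -t`, `-s`, and `-p` show for a blob.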
The .git/HEAD File
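The .git/HEAD file normally contains a symbolic reference to a branch (for example, ref: refs/heads/<branch>), which in turn resolves to a specific commit. When HEAD is detached, such as after checking out a tag, it holds the commit SHA directly. The commit points to a tree and to zero or more parent commits, plus metadata such as the author and the commit message. The tree in turn references blobs and other trees.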
graph TD
HEAD[".git/HEAD"] --> Ref["Ref (.git/refs/heads/<branch>)"]
Ref --> Commit["<Commit SHA>"]
Commit --> Tree
Commit --> Parent["<Parent Commit SHA(s)>"]
Commit --> Author
Commit --> Comment["Commit Comment"]
Tree --> Blob["Blob(s)"]
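The first two hops of the diagram can be followed by reading the files directly. This is a minimal sketch, assuming HEAD holds either `ref: refs/heads/<branch>` or a raw commit SHA. The function name `resolve_head` is hypothetical, and the sketch reads only loose ref files, so a ref that exists only in `.git/packed-refs` will not be found.

```python
from pathlib import Path
from typing import Optional


def resolve_head(git_dir: str) -> Optional[str]:
    """Return the commit SHA that HEAD points to, or None if it cannot be found.

    Simplification: only loose ref files are read; refs stored in
    .git/packed-refs are not consulted.
    """
    root = Path(git_dir)
    head = (root / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic ref: HEAD names another file that holds the SHA.
        ref_file = root / head[len("ref: "):]
        if not ref_file.is_file():
            return None  # unborn branch, or the ref lives in packed-refs
        return ref_file.read_text().strip()
    # Detached HEAD: the file holds the commit SHA itself.
    return head


if __name__ == "__main__":
    print(resolve_head(".git"))
```

In a real repository, `git rev-parse HEAD` does the same resolution and also handles packed refs.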
Resources