In A Git Repository, Where Do Your Files Live?
- Julia Evans tl;dr: Julia explores the inner workings of git, specifically how it stores files in the .git/objects directory. Using Python programs, Julia investigates where specific files and their older versions are located, discovering "content addressed storage," where the filename is the hash of the file's content. The article also demystifies the encoding process, showing that files are zlib compressed, and emphasizes that git stores complete files, not just the differences.
featured in #449
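The content-addressing idea in the summary above can be sketched in a few lines of Python. This is a minimal illustration, not code from Julia's article: the function names are hypothetical, and it assumes a repository using git's default SHA-1 object format and a loose (unpacked) object.

```python
import hashlib
import zlib


def blob_payload(content: bytes) -> bytes:
    # Git prefixes a blob with "blob <size>" and a NUL byte before hashing.
    return b"blob " + str(len(content)).encode("ascii") + b"\0" + content


def git_blob_id(content: bytes) -> str:
    # The object ID is the SHA-1 of header + content (assumes a SHA-1 repo).
    return hashlib.sha1(blob_payload(content)).hexdigest()


def loose_object_path(object_id: str) -> str:
    # Loose objects live under a directory named after the first two hex digits.
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


def encode_loose_object(content: bytes) -> bytes:
    # What ends up on disk is the zlib-compressed header + content.
    return zlib.compress(blob_payload(content))


def decode_loose_object(raw: bytes) -> bytes:
    # Reverse the encoding: decompress, then drop the header up to the NUL.
    _header, _, body = zlib.decompress(raw).partition(b"\0")
    return body


if __name__ == "__main__":
    data = b"hello world\n"
    oid = git_blob_id(data)
    print(oid)
    print(loose_object_path(oid))
    print(decode_loose_object(encode_loose_object(data)) == data)
```

Because the ID depends only on the content, two files with identical bytes share one object, and every version of a file is stored as a complete blob under its own hash.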
Writing Tips for Improving Your Pull Requests
- Jeff Mueller tl;dr: “I’m going to show you how to purposely write less by using the techniques below.” Tips are: (1) Make it scannable. (2) Speak plainly. (3) Avoid adverbs. (4) Simplify your sentences. (5) Avoid the passive voice. Jeff adds examples to each.
featured in #404
5 Tips To Creating A (Good) Pull Request
- Danijela Vrzan tl;dr: (1) Keep it short. (2) Add more information, i.e. the what, the why, and screenshots. (3) Leave in-line code comments. (4) Assign people or groups as reviewers. (5) Let your colleagues know your PR is ready for review.
featured in #390
Git Commands You Probably Do Not Need
- Martin Myrseth tl;dr: Martin discusses: (1) The empty commit. (2) Pushing locally. (3) Commit ranking. (4) Cat file. (5) Orphan commits. (6) Filter branch. (7) Octopus merge. (8) Rounding off.
featured in #383
Scaling Git’s Garbage Collection
- Taylor Blau tl;dr: The process for permanently removing unreachable objects from a repository’s history has a history of causing problems within GitHub, especially in busy repositories or ones with lots of objects. In this post, we’ll talk about what those problems were, why we had them, the tools we built to address them, and some interesting ways we’ve built on top of them.
featured in #354
Git’s Database Internals I: Packed Object Store
- Derrick Stolee tl;dr: Some basic concepts that Git shares with application DBs: (1) Data is persisted to disk. (2) Queries allow users to request information based on that data. (3) The data storage is optimized for these queries. (4) The query algorithms are optimized to take advantage of these structures. (5) Distributed nodes need to synchronize and agree on some common state.
featured in #348