Where can I learn more about "partial checkouts"?

erlend_sh · April 22, 2018, 5:34am

In “The road ahead” there is this brief feature description:

Partial repository checkouts. This is one of the coolest features of Pijul, which will hopefully allow it to scale to much bigger repositories than others.

This would be a pretty huge deal for monorepos. There’s lots of custom tooling to ease the management of Git monorepos, e.g. https://lernajs.io/, but certain parts of its architecture are hard to get around.

Have there been any writings anywhere about how Pijul’s partial checkouts would work?

whitespace · April 23, 2018, 1:37am

I guess it is something like

svn checkout http://example.com/svn/repo/trunk/subdirectory

You can check out any subdirectory of the project, keep the history, and still pull from the project (svn update).

pmeunier · April 24, 2018, 5:37pm

No, but I can explain more here. There are two levels of implementation:

Because Pijul patches commute™, and include a globally unique identifier of the files they apply to, it is fairly easy to just pull the patches that apply to a subset of the repository. This allows one to work on that subset and make patches against it. Now, remember that branches/repositories behave as sets (actual mathematical sets) of patches, ordered only by the dependencies explicitly mentioned in the patches themselves. This means that when the author pushes to a central monorepo, that monorepo will already have the dependencies, which is the only condition required to merge patches. Hence, the monorepo will just compute the union of its set of patches with the new patches.
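To make the set view concrete, here is a minimal Python sketch. It is not Pijul's actual data model: the `Patch` class and the `select_for_files` and `merge` helpers are hypothetical names made up for illustration, and the sketch assumes a partial pull also fetches the transitive dependencies of the selected patches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Hypothetical stand-in for a patch (not Pijul's real format)."""
    patch_id: str
    file_ids: frozenset          # globally unique ids of the files it touches
    deps: frozenset = frozenset()  # ids of the patches it depends on


def select_for_files(repo, wanted_files):
    """Patches in `repo` touching `wanted_files`, plus their transitive deps."""
    by_id = {p.patch_id: p for p in repo}
    selected = {}
    stack = [p for p in repo if p.file_ids & wanted_files]
    while stack:
        patch = stack.pop()
        if patch.patch_id in selected:
            continue
        selected[patch.patch_id] = patch
        stack.extend(by_id[d] for d in patch.deps)
    return set(selected.values())


def merge(repo, incoming):
    """Set union, allowed only if every dependency of `incoming` is available."""
    known = {p.patch_id for p in repo} | {p.patch_id for p in incoming}
    missing = {d for p in incoming for d in p.deps} - known
    if missing:
        raise ValueError(f"missing dependencies: {sorted(missing)}")
    return repo | incoming


# A monorepo with two unrelated files; we only care about file "f1".
a = Patch("a", frozenset({"f1"}))
b = Patch("b", frozenset({"f2"}))
c = Patch("c", frozenset({"f1"}), frozenset({"a"}))
monorepo = {a, b, c}

partial = select_for_files(monorepo, frozenset({"f1"}))

# A new patch recorded against the partial checkout, then pushed back.
d = Patch("d", frozenset({"f1"}), frozenset({"c"}))
monorepo_after_push = merge(monorepo, {d})
```

In this toy model the push succeeds because the monorepo already holds `c`, the only dependency of `d`; patch `b` never had to be downloaded to the partial checkout.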

This level is not super hard to implement, but has some tricky bits: what if, for instance, we want to get just one directory, but the patches required to build that directory also build other parts? One solution to that problem would be to have a list of paths to output, apply these “wider” patches anyway, and output just the part we’re interested in.
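A rough Python sketch of that first solution, under a deliberately crude model: a "patch" here is just a dict mapping paths to file contents, which is nothing like a real Pijul patch, and `apply_patches`, `is_under` and `output_paths` are hypothetical helper names. The point is only that the wider patches are applied in full and the filtering happens at output time.

```python
from pathlib import PurePosixPath


def apply_patches(patches):
    """Apply every patch in full, including the parts we do not want to output."""
    tree = {}
    for patch in patches:
        tree.update(patch)
    return tree


def is_under(path, root):
    """True if `path` is `root` itself or lies somewhere below it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def output_paths(tree, wanted):
    """Keep only the files that fall under one of the `wanted` paths."""
    return {
        path: content
        for path, content in tree.items()
        if any(is_under(path, w) for w in wanted)
    }


# The first patch is "wider": it touches both lib/ and docs/.
patches = [
    {"lib/a.rs": "fn a() {}", "docs/readme.md": "hello"},
    {"lib/b.rs": "fn b() {}", "library/x.rs": "fn x() {}"},
]
working_copy = output_paths(apply_patches(patches), ["lib"])
```

Only the files under `lib/` end up in `working_copy`, while the repository state still reflects the whole patch, so nothing has to be split or tracked as partially applied.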

Another solution could be to split the patches before applying them, but then we’d have to maintain a list of partially applied patches, which could be quite messy, and require extra data structures which could end up costing more in disk space.

Another implementation level is to help users write patches that apply to just one subset of the repository. I don’t know if others agree, but this is what I’d like to do with nested repositories: when you create a nested repository, it creates an empty .pijul, which just means that pijul record from inside the nested repository will record a patch only in the nested repository path. Obviously, this default behaviour could be overridden with some command-line option.
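As a sketch of how that scoping could behave (this describes the idea above, not what Pijul currently does), the following Python looks for the nearest enclosing `.pijul` directory and keeps only the changes below it. `find_repo_root` and `changes_to_record` are hypothetical names, and `Path.is_relative_to` needs Python 3.9 or later.

```python
from pathlib import Path


def find_repo_root(start):
    """Nearest directory at or above `start` that contains a .pijul directory."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".pijul").is_dir():
            return candidate
    return None


def changes_to_record(changed, cwd):
    """Keep only the changed paths that live inside the nearest repository."""
    root = find_repo_root(cwd)
    if root is None:
        raise ValueError("not inside a repository")
    return [p for p in changed if Path(p).resolve().is_relative_to(root)]
```

Run from inside a nested repository, `changes_to_record` would drop changes made elsewhere in the monorepo, which is the default described above; an overriding command-line option would amount to passing a different `cwd`.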
