Phenomenological Pijul (or Pijul from the outside)

There has been a lot of confusion lately about what all the elements of pijul do, why they have such uncommon names, and how to wield pijul anyhow. So, this is a topic that tries to explain pijul, looking at it from its appearance and from how it handles.

So, you start out with an empty working tree in a fresh repository like this:

$ pijul init repo; cd repo

First, we need to clarify the language, so here are some definitions. (Words in bold are defined somewhere in this text.)

Working tree

This is ultimately the thing that you are interested in, because it lets you access the data that you want to see right now. It is a manifestation of the current state of your repository: all repository state manifests into data that is represented by files in a hierarchy of directories, known as your working tree. (This is not a conceptual limitation. One could imagine a mode of work in which pijul does not manifest its current state into a working tree, but for our purposes, this is always the case.)

Pijul treats your working tree differently than Git does. Git’s inventor Linus Torvalds described Git as “the stupid content tracker”, and its first commit message calls it “the information manager from hell”. He made it very clear that Git is first and foremost a content tracker. Manifesting content (data in our language) into a working tree is an add-on to its storage format. This is why Git makes a clear distinction between blobs, which hold data but have no file name inside your working tree, and tree objects, which assign a file name (and access bits) to a blob and must ultimately refer to blobs to represent actual data.
Pijul, however, works directly on your working tree. The distinction becomes clear in two observations:

  1. Git has trouble tracking file renames, additions and deletions (during merges) and must rely on many heuristics to work around problems that appear in common use cases. The reason is that, ultimately, Git’s storage model has no notion of successive changes. It does not know how your working tree transformed from one commit to the next. It just knows that last time it looked like that and now it looks like this.
    In Pijul, these modifications of the working tree are recorded explicitly, just as changes to a file’s contents are.
  2. In Git, adding a previously untracked file and telling Git to include a file in the next commit share the same interface, namely git add.
    Pijul uses two interfaces for this purpose. pijul add marks a new file to be tracked from now on (and has to be done only once per file). pijul record stores the modifications of the currently tracked files as a new change.

Repository

This is the conglomerate of all the wisdom that pijul has acquired about your data. It contains all possible manifestations of your data (i.e. states) as a big graph of changes but also holds knowledge about remote repositories and their state, and other useful pieces that are important for your interaction with your data through pijul (the program).

State

The only means to interact with your data is through your working tree, and a working tree is always a manifestation of a state. So, what is a state?

A state is a sub-graph of the graph of changes that is stored in your repository. Thus, of all your wisdom about your data, it represents a (usually very carefully crafted) selection of it.
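
To make the idea of a “selection” more tangible, here is a small toy model in Python. It is only a sketch and not how pijul stores anything: the change labels, DEPENDENCIES, is_state and closure are made-up names, and the rule that a state must contain every dependency of the changes it includes is an assumption of this model.

```python
# Toy model: the repository's graph of changes, as "change -> changes it depends on".
# The labels A..D are placeholders, not real pijul hashes.
DEPENDENCIES = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def is_state(selection):
    """True if the selection holds only known changes and all of their
    dependencies (the rule assumed by this toy model)."""
    return all(
        change in DEPENDENCIES and DEPENDENCIES[change] <= selection
        for change in selection
    )


def closure(wanted):
    """Smallest dependency-closed selection that contains `wanted`."""
    result, todo = set(), list(wanted)
    while todo:
        change = todo.pop()
        if change not in result:
            result.add(change)
            todo.extend(DEPENDENCIES[change])
    return result


print(is_state({"A", "C"}))    # True: C only needs A
print(is_state({"D"}))         # False: D needs B and C
print(sorted(closure({"D"})))  # ['A', 'B', 'C', 'D']
```

In this model, {"A", "B"} and {"A", "C"} are two different states carved out of the same graph, which is the sense in which a state is a selection of what the repository knows.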

The central service pijul provides for you is helping you to efficiently manage all the states that you could possibly be interested in. For example, in common version control systems, which use strictly linearly ordered tracking of data modifications, one would like to go back in the history (of one’s modifications), merge another timeline into the current one, split off into an alternative timeline, or reorder/modify the history altogether.

State is central to turning your data into something you can manipulate (by manifestation into a working tree). That is why your main handle to work with state is called a channel. This means that in your working tree, you are always in some channel. Note that a channel is not equal to a state, but it always represents some state.

$ pijul channel
* main

And you are!

Channel

There are multiple angles to look at channels. The most obvious is that a channel always represents a state. It also keeps a linearised representation of its state, which is accessed through pijul log. Locally, this allows you to have a notion of time and explore a “history” of changes that led to the current state that it represents. But most importantly, it allows you to identify individual changes that you can remove from it (undoing the change), that you can hand to your collaborators or that you can amalgamate into a new channel (both via pijul push).
Lastly, it also holds a list of named states that are particularly interesting to you, called tags.

It is important to note right here that although the channel is your main handle to work with state, this doesn’t mean you need loads of channels to keep up with all the different states that you are interested in. The channel is merely the interface between you, pijul and the working tree that you are currently interested in.
One of the ways to interact with a channel is by adding new (read: unknown) changes to it, thus modifying the state it represents.

The simplest way to add a new change is by modifying your working tree, and recording the change:

$ echo a >a
$ pijul add a
$ pijul record
Error: No identity configured, yet. Please use `pijul key` to create one.

Bummer! Pijul tries to be fit for 21st century problems. That is, the identity of an author is treated very seriously. (As you treat your identity seriously IRL, too.) The way pijul tackles the realm of identity is by requiring an author to create a public/private key pair that will henceforth identify them to pijul. What seems superfluous for local usage becomes important once people start collaborating. While this solves some problems regarding identity (search for “malleable identities” on the Nest), it introduces new problems (losing access to the private key, key compromise, multiple hosts etc.). Yet, that’s how pijul currently works.

$ pijul key generate <user>
$ pijul record
<interactive>...
Hash: ...

So, after creating an identity, we can record our modifications. This does two things:

  1. It creates a change that contains a faithful representation of our modifications and is identified as the “Hash: …” in the output.
    In this case, that means it records the fact that we created a new file in the working tree and want pijul to track it for us. It also means that it records that a new line with an “a” is added to the file.
  2. It records (hence the name) the change to the current channel, modifying the state this channel now represents.
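
The two effects can be mimicked with a toy model in Python. This is only a sketch under loud assumptions: record, store and main are made-up names, a change is reduced to a list of descriptive strings, and SHA-256 over JSON merely stands in for whatever hashing pijul really uses.

```python
import hashlib
import json


def record(store, channel_log, modifications):
    """Mimic the two effects of recording described above (toy model)."""
    # 1. Create a change that describes the modifications and derive its
    #    identifier from its content. SHA-256 over JSON is a stand-in,
    #    not pijul's real hashing scheme.
    body = json.dumps(modifications).encode()
    change_hash = hashlib.sha256(body).hexdigest()
    store[change_hash] = modifications
    # 2. Add the change to the channel, which alters the state
    #    that the channel represents.
    channel_log.append(change_hash)
    return change_hash


store = {}  # every change the toy repository knows about
main = []   # linearised log of the toy channel "main"

h = record(store, main, ["track new file 'a'", "add line 'a' to file 'a'"])
print(main == [h])  # True: the channel now contains exactly this change
print(store[h])     # the modifications that the change describes
```

Keeping store and main apart mirrors the text: the change exists as a piece of knowledge in the repository on its own, and the channel merely refers to it.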

Change

Changes are unique pieces of wisdom about your data, and they can be uniquely identified via a hash.
While changes represent a particular piece of knowledge about how you modified your working tree, they do not float completely freely in space. To make them actually useful, a change also knows about an author, a message describing the semantics of the change, and a set of dependencies on other changes that this one builds on. To make it clear once more: A change does not describe the state of your working tree (as a Git commit does). A change describes how you modified your working tree, starting from earlier modifications (which are called dependencies).
This distinction can be observed as follows:

  1. Git manifests data into a working tree by expanding the data structures of the last commit: tree objects store the mapping between content and file names, so blobs only need to be written out as files. This is (almost) zero cost.
    Pijul manifests data into a working tree by traversing all changes recorded in a channel back to an (empty) root change, building the current state from them. This is real hard work.
  2. Git computes a diff by comparing two recorded states of the working tree. This is real hard work.
    Pijul displays the diff by returning the content of all recorded changes that are in one state but not the other. This is (almost) zero cost.
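
The pijul half of both observations can be pictured with a toy model in Python. It follows the description above rather than pijul’s actual implementation: CHANGES, manifest and diff are made-up names, and a change is simplified to “lines appended to files”.

```python
# Each toy change maps a file name to the lines it appends
# (additions only, to keep the sketch short).
CHANGES = {
    "c1": {"a": ["a"]},
    "c2": {"b": ["hello"]},
    "c3": {"a": ["more"]},
}


def manifest(log):
    """Build a working tree by replaying every change in a channel's log.
    The work grows with the length of the history."""
    tree = {}
    for change_id in log:
        for path, lines in CHANGES[change_id].items():
            tree.setdefault(path, []).extend(lines)
    return tree


def diff(log_from, log_to):
    """Changes that are in log_to but not in log_from.
    No file contents are compared at all."""
    known = set(log_from)
    return [change_id for change_id in log_to if change_id not in known]


main = ["c1", "c2", "c3"]
old = ["c1"]

print(manifest(main))   # {'a': ['a', 'more'], 'b': ['hello']}
print(diff(old, main))  # ['c2', 'c3']
```

In this model the diff is a set difference over change identifiers, while manifesting has to touch every change. A snapshot-based model would show the opposite trade-off.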

to be continued… (Please keep discussions focussed on the improvement of this text. Create new topics for questions regarding its understanding or for in-depth discussions.)

The part about state is missing the part about its naming, because at least one of the command help texts (log) says there is an option --state and also --channel, but not --tag.
This section also alludes to going back in history in common VCS, but doesn’t actually say that can be done in pijul. Why mention it, unless it’s to say it can or can’t be done? It would be a place to say that pijul is “change management” if it’s not “version control”.

I know a lot of people come from Git, but I don’t, so comparing to Git is confusing. Under the change section, you could compare to Darcs or SVN or both (or none at all since you are just explaining this tool).