Pijul

1.0.0-alpha Usage Proto Documentation

In this post I’m going to look at most of the pijul UI and describe what I think everything does. As the documentation gets written, we’ll know more, but my hope is that this topic can serve as a place to get everything laid out before then and make it easier for people to start trying things in pijul to figure out workflows and find bugs. I will write it so that, read from top to bottom, it serves as a tutorial. At the bottom I will also put some commentary on pijul and give some suggestions, if that’s okay.

Let’s start with getting a repository on your machine.

$ mkdir repo
$ cd repo
$ pijul init

To create a repository you create a directory and then run pijul init inside it. You can also give pijul the name of the directory you want to initialize as an argument: pijul init repo. If you do this you don’t have to first create the directory.

This creates a hidden .pijul directory where changes and other information related to this repository get stored.

$ ls -a
.  ..  .pijul
$ ls -R .pijul
.pijul/:
changes  config  pristine

.pijul/changes:

.pijul/pristine:
db0  db.lock

Here we find a file named config. If we look inside it, we’ll find this:

current_channel = "main"

[remotes]

This hints at the existence of something called “channels”, of which the current one is “main”. We’ll talk about these later.

Okay, now that we know how to make a repository, let’s record some changes. For this section we’ll start from an empty repository.

Let’s create a file with some text:

$ cat - >file <<DONE
I am a file
DONE

Now that we’ve made a change, the current state of our repository differs from what pijul knows about. We can ask pijul to tell us what the differences are:

$ pijul diff
Adding "file"
message = ''
timestamp = '2020-11-13T15:27:41.748969133Z'
authors = []

# Changes

1. File addition: "file" in "/" 384
  up 1.0, new 0:6
+ I am a file

Pijul can tell that we added a file, and what we added to it. Now let’s make pijul remember this change.

$ pijul record file

Note that we had to tell pijul what file we wanted to record. At the moment, pijul doesn’t keep any kind of record of what files one plans to record, nor a way to specify particular changes within a file from the command line.

After we run this command, pijul will open a text editor (on Linux it’s chosen via the EDITOR environment variable) with the following information:

message = ''
timestamp = '2020-11-13T15:27:41.748969133Z'
authors = []

# Changes

1. File addition: "file" in "/" 384
  up 1.0, new 0:6
+ I am a file

You may have noticed this looks almost exactly like the output of diff. The reason that pijul puts it in an editor is to let us change some of the fields. In particular, to record a change, it must have a non-empty message. Everything preceding # Changes is in TOML. Here is an example change:

message = 'add file'
timestamp = '2020-11-13T15:27:41.748969133Z'

[[authors]]
name = 'Bobby Tables'
full_name = "Robert D Tables"
email = 'bobbyt@example.com'

authors is an array of tables. Per the TOML spec, we can represent that by starting a section with [[authors]], and then naming each field of a single record. At the moment, the only way to know the structure of an author object is to look in the pijul source code.

The part from # Changes onward is not part of the TOML configuration, but it is also not optional or only for information. It details to pijul the exact changes that will be recorded. If you change this part of the document, the change that gets recorded will be different from the one you have on file. This is useful because you can select what changes you want to record by editing this part of the document. There’s no documentation at the moment for this format.

Once we save and exit from the editor, we will get something like:

$ pijul record file
Adding "file"
Hash: IA43FB6C33WTKKOMU4A3L3PACG5OGVW3QBF76KCH6RWD527IPIFAC

The hash (or some prefix of it) is how we can refer to this particular change in other commands.
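I have not looked at how pijul resolves a prefix internally, but the idea can be sketched like this: a prefix is usable as long as it matches exactly one known change. The resolve_prefix function is hypothetical, and the hashes are the two that appear in this post.

```python
def resolve_prefix(prefix, known_hashes):
    """Return the single hash starting with prefix, or raise if none or several match."""
    matches = [h for h in known_hashes if h.startswith(prefix)]
    if len(matches) != 1:
        raise LookupError(f"{prefix!r} matches {len(matches)} changes")
    return matches[0]


KNOWN = [
    "IA43FB6C33WTKKOMU4A3L3PACG5OGVW3QBF76KCH6RWD527IPIFAC",
    "XHCVXVMMZVMQKEOVJG7PAFSHCLHF5OJQ4N7BCWVITGWRLS6GNIKQC",
]

print(resolve_prefix("IA43", KNOWN))
```

An empty prefix matches both hashes here, so it is rejected as ambiguous rather than silently picking one.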

Let’s take a look at all the changes we’ve recorded so far by using the log command:

$ pijul log
Change IA43FB6C33WTKKOMU4A3L3PACG5OGVW3QBF76KCH6RWD527IPIFAC
Author: [Author { name: "Bobby Tables", full_name: Some("Robert D Tables"), email: Some("bobbyt@example.com") }]
Date: 2020-11-13 16:19:58.348131185 UTC

    add file

log will print every change in reverse chronological order. The code after the word Change is the hash for that change. Next we see a representation of the authors we specified, and a date. Finally, we can see our change message.

If we run pijul diff again, we’ll see no output since the state of our repository now matches pijul’s records once again.

We can get back the change we submitted by using the change command.

$ pijul change IA43FB6C33WTKKOMU4A3L3PACG5OGVW3QBF76KCH6RWD527IPIFAC
message = 'add file'
timestamp = '2020-11-13T16:19:58.348131185Z'

[[authors]]
name = 'Bobby Tables'
full_name = '''Robert D Tables'''
email = 'bobbyt@example.com'

# Changes

1. File addition: "file" in "/" 384
  up 1.0, new 0:6
+ I am a file

Let’s explore some miscellaneous commands.

$ pijul ls
file
$ pijul ls --repository /path/to/nest.pijul.org/pijul/pijul
shell.nix
pijul-macros
pijul-macros/src
pijul-macros/src/lib.rs
pijul-macros/Cargo.toml
pijul
pijul/src
pijul/src/repository.rs
pijul/src/remote
pijul/src/remote/ssh.rs
pijul/src/remote/mod.rs
...

At first glance it looks like we are just printing all the files in a repository, but consider the following:

$ touch anotherfile
$ pijul ls
file

Our new file didn’t show up. From this we can infer that ls only prints files that pijul knows about. How do we let pijul know about a file? One way is to record the file. But we can also use another command called add.

$ pijul add anotherfile
Adding "anotherfile"
$ pijul ls
anotherfile
file

Nice, but we don’t actually want anotherfile in our repository, so let’s remove it with, you guessed it, pijul remove.

$ pijul remove anotherfile
$ pijul ls
file

Let’s say instead we want to change the name of our first file.

$ mv file coolfile

If we run pijul diff, pijul is going to tell us that we deleted file and added a new file coolfile with the same contents that file used to have. This may seem silly, but on its own pijul has no way to know that this isn’t what actually happened. Let’s try that again.

$ mv coolfile file
$ pijul mv file coolfile
$ pijul ls
coolfile
$ pijul diff
...
1. Moved: "coolfile" "coolfile" 384 1.0
...

Now pijul recognizes the change that we actually wanted to make and we can record it.

$ pijul record coolfile
Hash: XHCVXVMMZVMQKEOVJG7PAFSHCLHF5OJQ4N7BCWVITGWRLS6GNIKQC

Now let’s say we regretted doing that and we want it off the record. Unsurprisingly, pijul has a command for undoing changes called unrecord.

$ pijul unrecord XHCVXVMMZVMQKEOVJG7PAFSHCLHF5OJQ4N7BCWVITGWRLS6GNIKQC

If we look at the log again, we won’t see that change anymore.

Let’s now take a look at using other people’s work. By using pijul clone, we can copy someone else’s repository, even over the internet.

$ pijul clone https://nest.pijul.com/pijul/pijul pijul

The first argument to clone is a remote URL or local file path to the repository you want to copy. The second is the name that the copy of the repository will have in the local filesystem. If you don’t specify the second argument, pijul will assume the local directory, but it will fail if the current directory is not empty.

Remotes are kind of broken and unconfigurable at the moment so I will skip them for now.

The last thing I would like to look at for now is channels. Channels, as far as I can tell, are simply named sets of changes within pijul. As we saw in .pijul/config, the default channel is ‘main’. All the changes we have recorded so far are in the ‘main’ channel.

Let’s create a new channel.

$ pijul fork --channel main feature
$ pijul channel
  feature
* main

fork creates a new channel with the same patches as the channel we specify with the --channel flag. channel lists all the channels and marks the active channel with an asterisk. To change the active channel we use reset.

$ pijul reset --channel feature
$ pijul channel
* feature
  main

reset is a kind of mysterious command. All I know about it so far is that it lets you change the active channel. Now that we are in the feature channel, we can record changes to it.

$ vi anotherfile
$ pijul record anotherfile
$ pijul log
Change EXKC67TCSWSPRG4JP5AK2MYSRG6RCC3N5W3N62KESUUFN3FN6R4AC
Author: []
Date: 2020-11-14 02:06:21.014281963 UTC

    fork change

Change IA43FB6C33WTKKOMU4A3L3PACG5OGVW3QBF76KCH6RWD527IPIFAC
Author: [Author { name: "Bobby Tables", full_name: Some("Robert\'); DROP TABLE Students;--"), email: Some("bobbyt@example.com") }]
Date: 2020-11-13 16:19:58.348131185 UTC

    first change

If we move back to the main channel those changes are not visible.

$ pijul reset --channel main
$ pijul log
Change IA43FB6C33WTKKOMU4A3L3PACG5OGVW3QBF76KCH6RWD527IPIFAC
Author: [Author { name: "Bobby Tables", full_name: Some("Robert\'); DROP TABLE Students;--"), email: Some("bobbyt@example.com") }]
Date: 2020-11-13 16:19:58.348131185 UTC

    first change

Let’s say we want to apply a change from our feature channel back into main. We can use the apply command.

$ pijul log --channel feature
Change EXKC67TCSWSPRG4JP5AK2MYSRG6RCC3N5W3N62KESUUFN3FN6R4AC
Author: []
Date: 2020-11-14 02:06:21.014281963 UTC

    fork change

...
$ pijul apply EXKC67TCSWSPRG4JP5AK2MYSRG6RCC3N5W3N62KESUUFN3FN6R4AC
$ pijul log
Change EXKC67TCSWSPRG4JP5AK2MYSRG6RCC3N5W3N62KESUUFN3FN6R4AC
Author: []
Date: 2020-11-14 02:06:21.014281963 UTC

    fork change

...
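The channel session above can be mimicked with a toy model in which every channel is just a named set of change hashes. This is my mental model rather than pijul’s implementation: the Repo class and its methods are hypothetical, and real changes have dependencies and an order that plain sets ignore.

```python
FIRST = "IA43FB6C33WTKKOMU4A3L3PACG5OGVW3QBF76KCH6RWD527IPIFAC"
FORK_CHANGE = "EXKC67TCSWSPRG4JP5AK2MYSRG6RCC3N5W3N62KESUUFN3FN6R4AC"


class Repo:
    """Toy model: each channel is a named set of change hashes."""

    def __init__(self):
        self.channels = {"main": set()}
        self.current = "main"

    def record(self, change):
        # A new change lands only in the active channel.
        self.channels[self.current].add(change)

    def fork(self, source, new):
        # Copy the set so later records on one channel don't leak into the other.
        self.channels[new] = set(self.channels[source])

    def reset(self, channel):
        if channel not in self.channels:
            raise KeyError(channel)
        self.current = channel

    def apply(self, change):
        # Only changes already stored somewhere in the repository can be applied.
        known = set().union(*self.channels.values())
        if change not in known:
            raise LookupError(change)
        self.channels[self.current].add(change)


repo = Repo()
repo.record(FIRST)              # recorded on main
repo.fork("main", "feature")    # feature starts with the same changes
repo.reset("feature")
repo.record(FORK_CHANGE)        # visible on feature only
repo.reset("main")
main_before_apply = set(repo.channels["main"])
repo.apply(FORK_CHANGE)         # now main has it too
print(sorted(repo.channels["main"]))
```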

That’s all for now. Use pijul help for more details.

END OF TUTORIAL

TODO:

  • remote
  • pull
  • push
  • credit
  • archive

Things that are broken that I know about:

  • nest login (hence I haven’t made these issues in nest)
  • remotes (don’t get set up properly)
  • pull from remote (crashes)
  • pull from . (hangs)

Notes on pijul terminology:

Perhaps I’m simply not accustomed to pijul’s terminology yet. However, I think that “change” may be too general a term to use. “change” isn’t “wrong” in the sense that a “change” in pijul really does represent a change in some file. A case can be made that this makes it more accessible. It’s the kind of word that makes you want to describe it in terms of itself. For example, “a ‘change’ in pijul is a change made to a file”.

However, this property makes it somewhat difficult to talk about the reification of that concept. Specifically, a “change” is a certain data structure that encodes a diff of some kind. And in particular there are two versions of this structure that are important. Using git terminology, one is the patch, and one is the commit. A patch is the representation of a difference that we can have in a normal file and send in an email and is comparable to the output of pijul diff. A commit is an internal representation of a difference that is stored in some database within a repository that is managed by the source control program. When I’m using a command like pijul apply, “change” can either be a hash prefix referring to a patch that’s already in the store somewhere, or it can read one from a file.

We can already see this conflation in the documentation. The flag --change just asks for a <change>, but for this flag that always means a change hash as far as I know. pijul change is helpfully specific about accepting a <hash>. pijul unrecord asks for a <change-id>. The documentation should pick either hash or change-id and stick with it. pijul apply optionally (!) accepts a [change]. Given the nature of the command, one might assume that it wants a patch, but I think it actually takes a hash, because if you give it a file name it fails with “memory allocation of 4404644050657112877 bytes failed” followed by “Aborted”.

For these reasons I think that we should give the “change” terminology a bit more thought. I think at the very least, we should write “hash” or “change-id” or “change-ref” (ideally the same term everywhere) when we mean the commit, and write “change” when we mean the actual patch. Another option would be to just use the words commit and patch like they are already used everywhere, but I don’t mind using different words as long as their meaning is clear and their use is consistent.

The word “channel” is even more problematic. “channel” doesn’t give me any hint at all about what its underlying structure is supposed to be or how to treat it. A channel to me, as generally as possible, is some kind of medium I tune into for information. However, if I’m correct, in pijul it’s really just a collection of changes. I don’t think this fits. I would like to hear the original thought behind calling them channels in the first place.

Let’s consider “branch” for a moment since that’s a widespread term. Branch could be a good fit for pijul since in pijul a fork, which is a bunch of patches, forms a tree, but in git a fork is only a list of commits unless otherwise branched, so at first glance you might conclude that branch fits pijul even better than git. Here’s the problem though: in pijul (and correct me if I’m wrong) a channel can simply be an arbitrary subset of changes from another channel. For example, a channel can refer to only the patches in a particular subtree of the repository. This channel would not really be a “descendant” of the channel that had all the changes. In fact this channel doesn’t seem much different from simply having another repository in another directory. This is not dissimilar to the way that repositories in git are not very different from branches in git. However, in pijul it does make the analogy of a branch fall apart.

Suggestion: call channels changesets. I think this already makes it immediately obvious what one is talking about without sounding contrived. Then merges can be called joins, and forks can be called subsets. Or maybe the general form can be called subset, but the “common” case can be called fork. (We only really say fork is common because subset is not really a thing in git. It might be the case that arbitrary subsets are a really common use case in pijul once people get used to them.) We can use the word “set” as a shorthand where appropriate as well.

Thanks, I could not figure out how to use channels when I migrated my old repository.

Very helpful, thank you. How does one get a high level diff across channels?

I don’t know what you mean by high-level, but you can get a diff between two channels by doing:

pijul diff --channel the_other_channel

Branches, patches, and diffs are what they were called in the old version of pijul, iirc. People didn’t like those. I’m personally fine with patch and diff. However, I don’t think we should use “branch”, because pijul channels are not really like branches in other VCSs (and pijul channels aren’t really branches or subsets; they’re just sets). I actually kinda like the name “changeset” for channels.

Btw, I believe one of the things pijul reset does is, iirc, equivalent to git’s git checkout -- ., where it reverts back to the latest commit, discarding uncommitted changes.

I tested it and it does do that, thank you.

I mean where it only lists the difference in terms of changes (hex id, date) between channels, not the actual source lines

I don’t think we can do that kind of diff at the moment.

I like your tutorial approach.
But the first commit message is “add file”, and then it changes to “first change”.
And all that business with the author probably would make more sense if the config is set up first, separately. Having the attempt to hack a SQL command in the author name is quite distracting to a novice.

Did you actually try using just the first part of the hash? Does it work?

I’m a bit discouraged to see that Pijul has the worst part of Git: what is being called channels, being a way to store changes in two places from one file tree. This is why I don’t use Git. It makes the user do all the work of remembering what channel etc. instead of simply using a different folder (typically called a branch). I am used to Bazaar, which makes the branch (folder) the main thing that is tracked. It really is better, regardless of the name used.

Since these are all distributed VCSs, it should not matter whether the copy is on the same system in another folder or on your coworker’s machine. There shouldn’t be so many verbs: fork, clone, copy, branch.
As for “change” versus “patch”, I have a hard time on both since they could be a verb or a noun. If a change is a diff, just call it a diff. It implies that there are two things being compared and this represents how they differ. But maybe that’s the whole problem with naming, because diff represents a certain state which anchors it into place of the previous state and the result state, whereas Pijul asserts that the previous state doesn’t matter. Therefore, it’s not a diff, but a set of editing commands. Still, it seems like you can’t create a change of adding a line 500 if there are only ever 2 lines.

But the first commit message is “add file”, and then it changes to “first change”.

I was trying out a lot of stuff while I was writing this up. Sometimes I did things in a different repo or had to create it again, and I wasn’t super careful about keeping it consistent. So yes, this happened.

And all that business with the author probably would make more sense if the config is set up first, separately.

I don’t know how to set up the author in the config file. I’ve only figured out so far how to specify it in record. Which is why I only did it for the first change and on the other ones I left it empty out of laziness.

Having the attempt to hack a SQL command in the author name is quite distracting to a novice.

It was just a joke. I can remove it if you think it’s so distracting.

Did you actually try using just the first part of the hash? Does it work?

Yes and yes.

This is why I don’t use Git. It makes the user do all the work of remembering what channel etc. instead of simply using a different folder (typically called a branch). I am used to Bazaar, which makes the branch (folder) the main thing that is tracked. It really is better, regardless of the name used.

In both git and pijul you can clone into another directory and do things like that. You don’t have to use branches or channels if you don’t want to. Do you think that pijul should just cut the feature and have people do things manually? I personally find branches in git helpful, in particular for holding public stable versions of a project in their own branch, or for having branches for production, testing, and development.

Since these are all distributed VCSs, it should not matter whether the copy is on the same system in another folder or on your coworker’s machine.

As far as I know, this doesn’t matter in either git or pijul (when it reaches 1.0, so far remotes are broken).

If a change is a diff, just call it a diff.

The reification of a change can be a diff, but it could be something else. As you yourself noted, it could be a set of editing commands. Both change and patch are properly abstracted from that I think. We just have to be clear about the use. In cases where we really mean a diff, we can use that word as well.

There’s no equivalent of git log fromchannel..tochannel / git log ^fromchannel tochannel / git log --not=fromchannel tochannel right? I do see a pijul log --hash-only in the code at least, maybe you could plain diff those. (And then pijul show the result / each? I don’t think pijul show exists)
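The “plain diff those” idea could look like the sketch below. It assumes you have somehow captured one change hash per line for each channel (I have not checked whether pijul log --hash-only actually works for this). The only_in_target function is hypothetical and mirrors git log fromchannel..tochannel: changes in the target channel that the source channel lacks.

```python
FIRST = "IA43FB6C33WTKKOMU4A3L3PACG5OGVW3QBF76KCH6RWD527IPIFAC"
FORK_CHANGE = "EXKC67TCSWSPRG4JP5AK2MYSRG6RCC3N5W3N62KESUUFN3FN6R4AC"


def only_in_target(from_hashes, to_hashes):
    """Hashes present in to_hashes but not in from_hashes, in to_hashes order."""
    seen = set(from_hashes)
    return [h for h in to_hashes if h not in seen]


# Hash lists as they would appear for the channels in the tutorial, newest first.
main_log = [FIRST]
feature_log = [FORK_CHANGE, FIRST]

print(only_in_target(main_log, feature_log))
```

Each resulting hash can then be passed to pijul change to see the actual contents.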

This is actually implemented in the network protocol for performance reasons, and is faster than list + diff (it’s in O((log n) + d), where n is the max of the list sizes, and d is the size of the difference).
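To give a feel for where a bound like O((log n) + d) can come from, here is one possible scheme. It is an assumption for illustration, not a description of pijul’s actual protocol. Suppose each side keeps a running “state” identifier per position in its log, so two logs agree up to position i exactly when their states at i match. The divergence point can then be found by binary search, and only the d changes after it need to be listed. All names below are hypothetical.

```python
import hashlib


def running_states(changes):
    """State identifier after each change; states[i] depends on changes[:i + 1]."""
    states, h = [], hashlib.sha256()
    for change in changes:
        h.update(change.encode())
        h.update(b"\0")  # separator so ("AB", "C") and ("A", "BC") differ
        states.append(h.hexdigest())
    return states


def common_prefix_len(states_a, states_b):
    """Binary search for the longest prefix on which both state lists agree."""
    lo, hi = 0, min(len(states_a), len(states_b))
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if states_a[mid - 1] == states_b[mid - 1]:
            lo = mid       # prefixes of length mid agree, look further right
        else:
            hi = mid - 1   # they already differ at mid, look left
    return lo


def missing_changes(local, remote):
    """Changes the remote log has after the point where the two logs diverge."""
    # In a real system the states would be stored, not recomputed as they are here.
    k = common_prefix_len(running_states(local), running_states(remote))
    return remote[k:]


# Placeholder change names, oldest first.
print(missing_changes(["A", "B", "C"], ["A", "B", "D", "E"]))
```

The search takes O(log n) state comparisons and the final slice has size d, which matches the shape of the bound quoted above under these assumptions.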

Yeah obviously in theory “get diff between branch and branch + one change” should be straightforward, and I would love to see as much of the full git rev-parse grammar implemented as applicable; but speaking about placeholders that work on 1.0.0 today :‍p

I would love to see as much of the full git rev-parse grammar implemented as applicable

I’d like to suggest the mercurial revset language might be a better model, it seemed a bit less arbitrary back when I worked with it compared to git’s.