Channels of channels

The problem

I have been using pijul for some private projects for a while now, and several usage scenarios come up repeatedly. I want to start a discussion about one of them in particular: channels composed of other channels.

What pijul does very well from the get-go is keeping the histories of separate data separate. As long as I always record changes to files in src/ separately from changes in doc/, the two histories will never depend on each other and I can operate on either one independently. So far so good. What does not work well, though, is communicating separate threads of history to me, the user.

This becomes especially visible when I incorporate threads of changes from other channels, that I want to try out but may want to get rid of later. I have a hard time pinpointing the exact thread of changes that I want to unrecord again, especially because the threads of changes sooner or later share a common prefix of changes with my channel (they revolve around the same set of files with a common history). When I unrecord those threads, I obviously want to keep the common prefix, but nothing in the log tells me where the deviation starts.

Now, to my suggestion!

The solution – Channels of channels

Pijul channels should work slightly differently. Instead of only holding changes in their log, they should also hold other local channels and blend in all changes from their logs.

What is to be gained from this?

Pijul works perfectly well without channels as a means to capture branches. In my experience, however, it falls considerably short in communicating which set of changes belongs to which feature. I think channels are necessary for communicating this, and I see evidence in the fact that the Nest implements discussions via channels for exactly this purpose.

Right now, when I want to try out a new feature, I can pull its most recent change and pijul pulls all the necessary dependencies for me. The problem starts when there is no single most recent change but several that are independent or only share common ancestors. Then I need to know all of them to get every part of the new feature. The only way to communicate the complete set of changes to another developer is to put the feature into a separate channel and point them to it.
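
To make the "several most recent changes" problem concrete, here is a minimal sketch in Python. It is a toy model, not pijul's implementation: the change names, the DEPS table and the closure function are all made up for illustration.

```python
# Toy model: each change lists the changes it directly depends on.
# All names are hypothetical and only illustrate the argument.
DEPS = {
    "base": [],
    "parser": ["base"],
    "parser-tests": ["parser"],
    "docs": ["base"],  # part of the same feature, independent of the parser work
}


def closure(heads, deps=DEPS):
    """Return every change needed to apply the given heads."""
    needed = set()
    stack = list(heads)
    while stack:
        change = stack.pop()
        if change not in needed:
            needed.add(change)
            stack.extend(deps[change])
    return needed


# Pulling a single head brings in its dependencies, but not sibling heads.
one_head = closure(["parser-tests"])

# The feature has two heads; both must be named to get all of it.
whole_feature = closure(["parser-tests", "docs"])
```

In this model, one_head does not contain "docs", so whoever pulls the feature has to be told about every head separately. A channel is what bundles those heads under one name.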

Now, git (and every other VCS I know of) forces you to dissolve a feature into its elementary set of changes when pulling it into your current channel. So in git, too, I have trouble removing all the bits of a feature that I no longer want in my code base. With channels of channels, the incorporated feature channel would remain a single identifiable element that I can address in order to remove or inspect it.

What’s more, a channel of channels could be a live structure. Additions to the feature channel could be immediately reflected in my main channel that incorporates it. If I didn’t want that, I could put a tag of the feature channel into my main channel, such that updates would need to occur manually.

This would give the user powerful tools to handle named sets of state and incorporate them into a single history graph. Right now, everything immediately falls apart into single changes when pulled into a channel.

This directly addresses discussions on the Nest around (un-)recording sets of changes. Its immediate benefit would be that it allows an arbitrary semantic structure of features and development stages within a project.

What semantics would such a channel have?

  1. A channel that has its own changes and now also hosts a feature channel just applies the set of all changes to the working tree. No special semantics are necessary. Changes that appear in both channels are applied once. It works like pulling the feature on the fly. All the caching infrastructure that makes re-pulls fast should work in principle. The log would maintain the complete set of changes from all incorporated channels, with an additional field to mark which channels a change is part of. In this regard, the log of a channel would still be flat.

  2. The references to other channels could be kept in the log, identified by their current state hash.

  3. pijul record already offers recording to another channel. So even recording a new change to the feature channel is possible from the merged working tree. This means that channels of channels are not a read-only concept but naturally extend to writing as well.

  4. The natural protection would apply. After recording changes to main that depend on changes from feature, it is no longer possible to record dependent changes to feature without first pushing the changes from main, just as it is not possible to push a change to a channel without its necessary dependencies. This would immediately highlight that the user is starting to mix the histories of two channels, which is likely unintended.

  5. When a change is added to main that depends on a change from feature, the latter change now belongs to both channels, main and feature. If feature is later unrecorded, the dependency remains part of the main channel. A warning could be shown when recording a change on main that depends on a change that is purely in feature. On the other hand, this is a likely case, because an accepted feature is usually meant to become part of the history of main.

  6. A feature channel could be added to a main channel with record --ref-channel <feature> or pull --as-ref <feature>.

  7. It could be removed by identifying it in the log and running unrecord --reset on it.

  8. The dependency on a feature channel can be resolved by using unrecord without --reset, which would keep the changes of the feature channel in the log but would remove the reference to it.

  9. A feature channel could be identified in the log by its current state hash. This hash would change on every update of the feature channel, but it would make it possible to clearly communicate a particular feature channel state to other developers, if needed.

  10. When all changes are kept flat in the log, pijul log could mark which changes appear in which channel (like how git shows which branch head is on which commit).

  11. No special precautions should be necessary to prevent channels from forming a circular dependency on each other. It should just work as a result of the commutativity of changes.

  12. Everything above should directly apply to tags from other channels. They should likely never automatically update. Thus, recording the last state of a channel, as opposed to the channel itself, could be used to imply that we are not interested in automatic updates to incorporated feature channels.
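
The semantics in points 1, 10 and 11 can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the Channel class, its fields and the change names are hypothetical, and changes are plain strings rather than real pijul changes.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str
    own: list = field(default_factory=list)   # changes recorded directly here
    refs: list = field(default_factory=list)  # live references to other channels

    def log(self, _seen=None):
        """Flat log: maps each change to the set of channels it is part of."""
        seen = set() if _seen is None else _seen
        if self.name in seen:
            # Already visited. This only stops the traversal in this sketch;
            # the union of changes itself does not care about cycles.
            return {}
        seen.add(self.name)
        flat = {}
        for change in self.own:
            flat.setdefault(change, set()).add(self.name)
        for ref in self.refs:
            for change, owners in ref.log(seen).items():
                flat.setdefault(change, set()).update(owners)
        return flat


feature = Channel("feature", own=["base", "f1", "f2"])
main = Channel("main", own=["base", "m1"], refs=[feature])

log = main.log()
# "base" appears once in the flat log but is marked as part of both channels.
```

Because main holds a reference rather than a copy, appending a change to feature.own shows up in the next call to main.log(), which is the "live structure" behaviour described above. A tag would correspond to storing a frozen copy of the feature's change list instead.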

What could go wrong?

The stronger semantics that come with channels of channels can, of course, be used to make a code repository go haywire:

  1. Creating huge amounts of almost trivial micro-channels and working with “aggregator channels” will degenerate into a worse version of normal channels that only know changes.

  2. Mindlessly recording all kinds of channels in circles will lead to a situation where all channels depend on each other. This is a degenerate version of a single channel, and the naming of the channels becomes arbitrary, because no single channel has a meaning without the other channels it depends on. Only all of them together carry the meaning of the repository, which makes them superfluous.

Technical challenges

  1. In the log, changes need an additional field that marks which channel they belong to. This is used in unrecord --reset <channel-ref> to identify the changes that are not part of the common prefix and need to be unrecorded.

  2. The log must be able to hold two types of channel references. I see no great challenge in making them fit into the current format for changes.

    1. “Live references” refer to actual local channels inside a repository. Whenever they get touched, the current channel must update the working tree like after a pijul pull. pijul log retrieves the current state hash while showing the log.

    2. “Fixed references” refer to a particular state hash that does not change. It can only be unrecorded.

  3. The two types of channel references are not needed to recreate the set of changes they represent. The set of changes is immediately added to the main log at the time of addition (and updated at the time of change). They purely serve as the marker for updates and to identify which changes are free to be unrecorded. Push and pull should work and perform almost exactly as before.
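
Here is a rough Python sketch of the two reference types and of how the extra log field would drive unrecord. Everything in it is an assumption made for illustration: state_hash is a stand-in and not pijul's actual state hash, and LiveRef, FixedRef and unrecord_ref are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


def state_hash(changes):
    """Stand-in for a channel state hash; NOT pijul's real algorithm."""
    joined = "\n".join(sorted(changes)).encode()
    return hashlib.sha256(joined).hexdigest()[:12]


@dataclass(frozen=True)
class FixedRef:
    state: str  # a particular state hash that never changes


@dataclass
class LiveRef:
    channel: str  # name of a local channel, resolved whenever the log is shown

    def state(self, channels):
        return state_hash(channels[self.channel])


def unrecord_ref(log, ref, current, reset):
    """Remove the reference `ref` from a flat log (change -> set of channels).

    With reset=True, changes that only `ref` contributed are dropped.
    With reset=False, they stay and are adopted by the `current` channel.
    Changes in the common prefix are kept in both cases.
    """
    new_log = {}
    for change, marks in log.items():
        remaining = marks - {ref}
        if reset and not remaining:
            continue
        new_log[change] = remaining or {current}
    return new_log


log = {"base": {"main", "feature"}, "m1": {"main"}, "f1": {"feature"}}
without_feature = unrecord_ref(log, "feature", "main", reset=True)
dissolved = unrecord_ref(log, "feature", "main", reset=False)
```

In this model, without_feature keeps "base" and "m1" and drops "f1", while dissolved keeps all three changes and simply forgets that "f1" ever came from a feature channel.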

What else could go wrong?

  1. pijul push of a channel that references a local feature channel should fail if the feature channel does not exist on the remote or is not up-to-date. This puts the burden of ensuring consistency on the pushing side, because this is the only side that has all necessary information. Pushing incorporated feature channels should not be automatic by default, because it may change the semantics of other people’s channels (that pull from that remote) and should, thus, be a conscious act.

  2. pijul pull from a remote should fail by default if references to local channels are not up-to-date or non-existent. At least something like pijul pull --pull-refs should be necessary to make pulling into multiple local channels a conscious act, as it may have far-reaching consequences, especially because the pulling side does not know how many new feature channels are part of the channel being pulled. So pijul pull should present the user with a list of referenced channels that are affected by the pull. Pulling a “fixed reference” is unproblematic, because all affected changes are already part of the log and it can never be outdated.
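
The two checks could look roughly like this. Again, this is a sketch of the proposal in Python, not of anything pijul does today: check_push and plan_pull are hypothetical helpers, channel states are plain strings, and the pull_refs parameter mirrors the proposed --pull-refs flag.

```python
def check_push(channel_refs, local_states, remote_states):
    """Return the referenced feature channels that should block a push.

    channel_refs:  names of local channels referenced by the pushed channel
    local_states:  channel name -> state hash on the pushing side
    remote_states: channel name -> state hash on the remote
    """
    problems = {}
    for name in channel_refs:
        if name not in remote_states:
            problems[name] = "missing on remote"
        elif remote_states[name] != local_states[name]:
            problems[name] = "out of date on remote"
    return problems


def plan_pull(incoming_refs, local_states, pull_refs=False):
    """List the local channels a pull would touch; refuse unless pull_refs is set.

    incoming_refs: channel name -> state hash referenced by the pulled channel
    """
    affected = sorted(
        name
        for name, state in incoming_refs.items()
        if local_states.get(name) != state
    )
    if affected and not pull_refs:
        raise RuntimeError(
            "pull would touch referenced channels: " + ", ".join(affected)
        )
    return affected
```

A push goes through only when check_push returns an empty dict. The list returned by plan_pull is what would be shown to the user before any referenced channel is updated.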

Wow! How complicated! I already had a mental block on the value of channels, but now I’m really confused.

Something I read in the Darcs docs made sense to me, but when I asked about it for Pijul, I think it’s not done the same way. That is, Darcs tags are simply a named dependency.

A tag is basically just an empty patch which depends on some other patches.

Of course, Darcs uses a model of “a sequence of patches”, implying the history.
When I suggested automatic internal tagging by date, it went nowhere.

What does not work well, though, is communicating separate threads of history to me, the user.

I think Pijul needs more show commands, like Darcs has.

This becomes especially visible when I incorporate threads of changes from other channels, that I want to try out but may want to get rid of later.

Also, if the push and pull commands made an internal tag before they start, then it would be easy to undo back to that.
Another option could be to pull with squash, so that multiple changes are combined into one.

I knew there was a reference explaining Darcs tag. Here is the key info:

When using darcs show dependencies things might become a little surprising because this command only generates a graph by walking backwards through our repository’s history until it encounters the first tag.

But Pijul tags don’t work the same way, although that might be a good way for them to work.

That is not what I need when working with other channels. When I made a tag and went back to it, I would not only remove the feature that I pulled in but also all my other work that came after that. I only want to remove the feature, retaining my own work.

I think it’s been made clear that channels are not branches, to be used for features.

I don’t know where you took that from. It has been said that they are not branches in the git or svn sense, and that you don’t need multiple(!) of them to communicate features between developers.

This is not helpful for the discussion I want to have in this thread. Please open a different one, if you want to discuss what channels are there for. You could also look at Phenomenological Pijul (or Pijul from the outside) again, where I wrote about what channels are.

Your other post is not helpful for this discussion. Over there, it says

pijul add marks a new file to be tracked from now on (and has to be done only once per file).

This is not quite true, since the add command is per channel.
And that other post does not get into why you might want another channel, such as how the Nest uses them for discussions. How is the Nest storing just the change and not the dependencies, when you upload a change for inclusion in a project? How can you have an independent change except by creating a file?