1.0.0-alpha.3 Merging

Merging is the bread and butter of distributed source control, so I want to get this straightened out in Pijul.

Someone from Hacker News posed the following scenario:

I’m comparing git merge to adding a patch to a set in Pijul.

Let’s say I have a patch A that adds a line: return 1 + 1 + 1 + 1;

and a patch B based upon A changes it to: return 1 + 1 + 2;

and a patch C based upon B changes that to: return 1 + 3;

and a patch D based upon the original A changes it to: return 4;

So we now have:

A
| \
B D
|
C

Now let’s say we add them all to the same set of patches. Will I have to resolve the conflict between B and D, and then also the conflict between (the resolution of B and D) and C?

I replicated this by recording A, forking and recording D, resetting back to main, and recording B and C. At this point I have no idea how I’m actually supposed to start the merge. The only way I have found is to apply the changes from the other channel into main by hand, one at a time, using the hashes from that channel’s log. This is obviously unacceptable, so that’s definitely on the to-do list. (Let me know if there is already another way.)

Anyway, I apply D onto main. I get no feedback, so I guess pijul thought it was a perfect merge. The log shows D → C → B → A, as expected. diff shows nothing. If we look at the file we find:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
return 1 + 3;
================================
return 4;
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Okay, so not a perfect merge. Alright, I pick return 4;. diff now shows:

message = ''
timestamp = '2020-11-16T19:17:28.751950638Z'
authors = []

# Dependencies
[2] VWXEHN53VYFFZLVSFIY63BPI5NQ7BXFSD57Q3Z3EXVQQ4KHXBJWQC
[3]+PZOBADFPPTJZC5ETS5D4HS5FJYPNW26HJALW25W7WDH7MMS7CKDAC
[*] AJUWPHUPEOQC34FU7SMVJXZLRX7POFJPP32LKLANS46SMJT573NAC
[*] OJLDUJT6AYQ7PLGVUTDFCT2TAS5LI5USBCPUH7GNE2A62XDMJE7AC
[*] PZOBADFPPTJZC5ETS5D4HS5FJYPNW26HJALW25W7WDH7MMS7CKDAC

# Changes

1. Edit in file:0 3.7
B:BD 3.7 -> 2.0:14/2
- return 1 + 3;

diff only shows the removal of return 1 + 3;. That’s good! Conclusion: the conflicted file content is merely the way pijul represents a graph with conflicting changes, kind of like a 2D shadow of a 3D cube.

What I find problematic at the moment is that the repository can be in a completely normal state while it contains conflicting changes. If I merge two channels that compile and run, I want to end up with a repository that also compiles and runs. If there is a conflict, I should get some sort of warning at the very least. Just to contrast, this is what it looks like in git:

qrpnxz@box:~/tmp/gitteest$ git merge feature 
Auto-merging file
CONFLICT (content): Merge conflict in file
Automatic merge failed; fix conflicts and then commit the result.
qrpnxz@box:~/tmp/gitteest$ git status
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   file

no changes added to commit (use "git add" and/or "git commit -a")
qrpnxz@box:~/tmp/gitteest$ vi file
qrpnxz@box:~/tmp/gitteest$ git commit 
[master 2e2a770] Merge branch 'feature'
qrpnxz@box:~/tmp/gitteest$ git lg
*   2e2a770 (HEAD -> master) Merge branch 'feature'
|\  
| * db00a3b (feature) D
* | b1a60f4 C
* | b565af0 B
|/  
* a0c8a5c A
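
To make the pijul side of this comparison concrete, here is a minimal Python sketch of the A/B/C/D scenario. It is a toy model, not pijul's actual data structure: the `Change` class, the `replaces` field, and `alive_lines` are all hypothetical names, and a change is assumed to do nothing more than delete one earlier line version and add a new one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    name: str
    new_text: Optional[str]  # line this change introduces (None = deletion only)
    replaces: Optional[str]  # name of the change whose line it deletes


def alive_lines(changes):
    """Return the line versions that no applied change has deleted."""
    applied = {c.name for c in changes}
    deleted = {c.replaces for c in changes if c.replaces in applied}
    return sorted(
        c.new_text
        for c in changes
        if c.new_text is not None and c.name not in deleted
    )


A = Change("A", "return 1 + 1 + 1 + 1;", None)
B = Change("B", "return 1 + 1 + 2;", "A")
C = Change("C", "return 1 + 3;", "B")
D = Change("D", "return 4;", "A")

main = {A, B, C}
merged = main | {D}  # applying D is just a set union, order does not matter

# Two line versions are alive at once: this is the conflict shown in the file.
conflict = alive_lines(merged)

# The resolution is itself a change: it only deletes C's line, as in the diff.
fix = Change("fix", None, "C")
resolved = alive_lines(merged | {fix})
```

In this model B's line is already deleted by C, so only C and D compete, which matches the single conflict seen in the file. Unrecording `fix` simply brings back the two-version state.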

So let’s compare and contrast. What I think this means for git is that if I revert this merge on master, I get what is expected: master as it was before the merge, and feature as it was before the merge. If I unrecord the merge fix in pijul, I get back a bad merge.

Another thing to note is that D doesn’t remember arriving at main through a merge. That means that in the history, this patch could’ve been made anywhere or come from anywhere, and all we know after the merge is that it’s now in main. This makes the log shown at the end of the git session impossible for pijul at the moment.

The main bad consequence, however, is losing a guarantee that git gives me: that I can point to any commit and it will be a valid project state. After a merge in pijul, not only is the final state possibly broken, but there’s no clean way of looking at main before the merge (ABC), or main minus its last change before the merge (AB), etc.

So I feel that either pijul is lacking in major ways, or I’m using it really wrong.

@pmeunier Would really appreciate your feedback.

pijul definitely needs something like a reflog, such that “undo last command” is almost always well-defined in terms of repository state (excepting something like a hypothetical pijul gc).

Re: an explicit merge command, what does pijul pull . feature do? Agree that a merge would be better, but if that works I suspect any conflict notification/resolution workflow implemented lives there.

(my personal pipe dream is for each channel to formally be a list of change ids, revisioned in pijul, with the possibility of editing this list for e.g. pushes but any non-append to the revision-revision history itself creating a meta-change, turtles all the way down. I suspect this is somewhat impractical for various reasons)

reflog is local. You should be able to look at any snapshot of main on any clone.

pull has --channel and --from-channel. Both hang so I guess not.

I mean, you can’t do that on git either, you can only do “snapshot<->time association alleged by the most recent force-push”.

But see my subsequent comment on how master@{yesterday} type functionality could be implemented; the raw patch algebra is pretty aggressively unordered, so I suspect the best you’d be able to do is “excepting changes after this time”, since on non-conflicts you don’t really have merge commits to work with.

I’m thinking that pijul should also be snapshot based like git, except that the snapshots are based on the patch tree rather than the filesystem. Then it could get the best of both worlds. Every commit could be an optional change hash plus pointers to other commits. A normal commit is a change hash and a pointer to the previous commit. A merge is an optional change hash (only needed for merge conflicts) and pointers to the heads of all relevant branches. A cherry-pick looks like a normal commit, but the change hash just happens to already be in the tree.
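
A rough Python sketch of what I mean, just to pin the idea down. Nothing here exists in pijul: `Commit`, `changes_at`, and the single-letter change hashes are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Commit:
    change: Optional[str]  # change hash; None for a merge without conflicts
    parents: Tuple["Commit", ...] = ()


def changes_at(commit):
    """Collect every change hash reachable from a commit."""
    found, visited, stack = set(), set(), [commit]
    while stack:
        current = stack.pop()
        if id(current) in visited:
            continue
        visited.add(id(current))
        if current.change is not None:
            found.add(current.change)
        stack.extend(current.parents)
    return found


a = Commit("A")
b = Commit("B", (a,))
c = Commit("C", (b,))  # head of main before the merge
d = Commit("D", (a,))  # head of feature
merge = Commit("fix", (c, d))  # conflict resolution plus both heads

after_merge = changes_at(merge)
main_before_merge = changes_at(merge.parents[0])
main_minus_last = changes_at(c.parents[0])
```

The point is that the patch set at any earlier commit stays addressable: following the first parent of the merge gives main as it was (ABC), and one step further back gives AB.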

This could really work.

And in the meantime we get by with a .pijul revisioned in bup revisioned in git?

That sounds like this bug: pijul/pijul - Discussion #50 - push / pull to/from self blocks silently

If that’s the case, you might be able to test the functionality by using separate repositories. I.e., pijul pull ../clone feature.

Using two repos, I can get the desired behaviour of getting all changes. All other problems stand, but yes, if you could do this on a local channel, that would be the way to do it I suppose.

Well I’m with you that “on pull update target channel with the union of changes even if this causes conflicts, without confirmation” is not precisely desired behaviour.

(Similarly, push should require an --allow-conflicts flag or something; “publish conflicted version to nest” seems like a nonstandard use case.)

No, you’re using it right. Keep in mind that the commands might not yet display all the information you want, but feel free to make suggestions. The library usually implements all that.

That is true, and is expected: you do want a conflict resolution to be a patch, else you get into rerere territory.

I would argue (as I have already argued in a reply to that comment on Hacker News, which I don’t think anybody read) that the consequence of that “feature” of Git is that conflict resolutions are not modeled as commits, because conflicts are not valid states. This leads to the rerere command, which makes merges and rebases much more painful than they need to be, and wastes people’s time at a massive scale, when considered globally.

Moreover, this is not a fundamental limitation of Pijul, merely of the current implementation: since we have states, we could imagine remembering the conflicting states and simply not showing them to the user. Also, unlike in Git, we can know exactly where the conflicting lines come from, so we can tell the user which change to unrecord when there is a conflict (and they can actually unrecord it without affecting the rest of the changes, unlike in Git).

Also, this behaviour is totally in line with the main goal of Pijul, which is to allow people to work faster and more easily (as well as to respect the basic axioms of changes, such as associativity, which Git doesn’t do).

You can, that’s what pijul fork --change is for. Note that this command will be extended to take a --state option (pijul archive already supports --state).

Two interface suggestions here:

  1. after an action that potentially introduces conflicts (e.g. a pijul pull), list all files with conflicts. “Conflicts in:” and then a list of files with conflicts one per line.
  2. a command to explicitly ask for this list (e.g. pijul conflicts), maybe with extra details such as the changes that are involved, and/or showing some form of diff
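
As a stopgap, the first suggestion can be approximated outside pijul by scanning the working copy for conflict markers. The Python sketch below assumes the marker format seen in the file earlier in this thread (a line of `>`, a line of `=`, a line of `<`); `find_conflicts` and `conflicted_files` are hypothetical helpers, and a real implementation would presumably ask pijul's graph rather than parse text.

```python
def is_marker(line, char, min_length=7):
    """True if the line consists only of `char`, repeated at least min_length times."""
    stripped = line.strip()
    return len(stripped) >= min_length and set(stripped) == {char}


def find_conflicts(text):
    """Return 1-indexed (start_line, end_line) pairs, one per conflict block."""
    conflicts = []
    start = None
    for number, line in enumerate(text.splitlines(), start=1):
        if is_marker(line, ">"):
            start = number
        elif is_marker(line, "<") and start is not None:
            conflicts.append((start, number))
            start = None
    return conflicts


def conflicted_files(files):
    """files maps path -> content; return the sorted paths containing conflicts."""
    return sorted(path for path, content in files.items() if find_conflicts(content))


example = {
    "file": "\n".join(
        [">" * 32, "return 1 + 3;", "=" * 32, "return 4;", "<" * 32]
    ),
    "other": "return 4;\n",
}

report = ["Conflicts in:"] + conflicted_files(example)
```

Text scanning cannot tell which changes are involved, which is exactly the extra detail the second suggestion asks for.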

I think I might have mentioned this in one draft of my post that I deleted. The reason I removed it was that I can’t really grasp the difference from git. In that case, the resolution is also a patch (a commit). Perhaps this has to do with pijul’s cherry-picking power? In fact I’ve never used rerere, so I tried to imagine a scenario where you have a master and dev, except when you merge them you don’t continue dev from the merge patch, for some reason, so then when you merge again you need to resolve the same conflicts as in the merge patch, but git is bad at cherry-picking so that’s a problem? Am I getting that right? If so, well, props to pijul :)

Speaking of rebases, I tried the example as a rebase in git and noted that a conflict I fixed on a lower commit, I’d have to fix again on a later commit. Are you saying that with pijul that would not happen (as often)? That’d be really cool.

Okay, so if I’m merging two branches doing the same thing in different ways, in pijul you can get rid of one implementation, and still merge the rest? Does sound extremely convenient. I got no idea how you’d do that in git.

As far as I know, pijul fork --change is necessarily date based. So after a merge, it would not help me look at states of main prior to a merge because it has changes from other branches interspersed all over.

When you said states in your first reply I wasn’t really sure what you were talking about, now I get the sense they are a specific concept in pijul like “changes”. I also get the vibe that they are maybe like the snapshot idea I was talking about. If that’s the case I feel like they could solve a lot of the problems talked about here. I’d like to hear more about them.

I would push for conflicts, in a normal workflow, being a valid state for a checkout to be in, but not a valid state for a channel to be in: put the user in a “merging” state, where they can record the resolution patch into a merged channel (along with the patches it depends on). Or alternatively, reset back to the last known good channel state (on either side).
Again, there should be a --force to override this, and if pijul grows tags as distinct from branches, stashing your “intent to merge” state in one seems perfectly legitimate; but implicitly conflicting a channel seems like a bad default for push and pull.
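
Roughly what I have in mind, as a Python sketch. Everything here is hypothetical: `guarded_pull`, `ConflictError`, and the `force` parameter are not pijul APIs, and `toy_has_conflicts` hard-codes the C/D conflict from this thread instead of inspecting a real repository.

```python
class ConflictError(Exception):
    """Raised when a pull would leave the channel in a conflicted state."""


def guarded_pull(channel, incoming, has_conflicts, force=False):
    """Return the union of both change sets, refusing conflicts unless forced."""
    merged = frozenset(channel) | frozenset(incoming)
    if has_conflicts(merged) and not force:
        raise ConflictError("pull would introduce conflicts; resolve or use force")
    return merged


def toy_has_conflicts(changes):
    # Stand-in for a real check: C and D conflict until "fix" is present.
    return {"C", "D"} <= changes and "fix" not in changes


main = frozenset({"A", "B", "C"})
feature = frozenset({"A", "D"})

try:
    guarded_pull(main, feature, toy_has_conflicts)
    refused = False
except ConflictError:
    refused = True  # default: the channel is left untouched

# Explicit override, the equivalent of --force.
forced = guarded_pull(main, feature, toy_has_conflicts, force=True)

# Pulling the resolution together with the conflicting change is always fine.
with_fix = guarded_pull(main, feature | {"fix"}, toy_has_conflicts)
```

The channel only ever moves to a conflicted state when the user asks for it explicitly; otherwise the conflict lives in the checkout until a resolution is recorded.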

I think it does walk the dependency graph, but that still doesn’t help in the cherry-pick case, where the most recent commit as of a day ago does not necessarily contain any reference to the changes introduced a day prior.

If a --state Merkle is anything like a git snapshot however, I can see that providing the functionality though.