Here’s a question related to my current effort to iron out the remaining bugs/unimplemented things in libpijul: how should we represent conflicts arising from missing contexts?
A missing context is the situation that arises when Bob adds a line to a block of text, while Alice deletes that block, and then pulls Bob’s patch.
The current solution
The current solution is to resurrect all the context until finding an alive one, i.e. if Bob changes “AC” to “ABC”, while Alice deletes A and/or C, we mark A and C as “zombies”, i.e. lines on which there is no consensus to tell whether they are fully dead or fully alive.
The main problem of this solution is that we might resurrect lines that are not involved in the conflict. Imagine the situation where the repository initially contains lines A_0, A_1, … A_n, then Alice produces a sequence of patches P_0, P_1, P_{n-1}, where for all i, P_i deletes A_i. Bob pulls all those patches, then adds a line B after A_n, while Alice makes patch P_n deleting A_n.
When Alice pulls Bob’s patch, the current implementation will resurrect all lines of the file, even though both agreed on their deletion.
Solution B: unrecording the other side of the conflict
Another solution is to unrecord the patch that deleted the edge (in this situation, that would be P_n), output the file. We would do this in an aborted child transaction, so that the unrecorded patch would still be applied. It is unclear how to articulate that with the pending patch. Maybe:
- apply the pending patch,
- start a child txn
- unrecord the hunk that deleted the line
- abort
- unrecord the pending patch (in the parent txn)
This sounds quite intuitive from a user’s perspective. In some cases though, unrecording P_n might reach a point where another context has to be restored, and we might have to unrecord many patches recursively.
Solution C: change the zombies
Yet another simpler solution is to change the zombies to being the lines introduced by Bob’s patch, and to look for an alive context without resurrecting anything.
This is computationally equivalent to the first solution (linear in the size of the conflict, where “the conflict” has the counter-intuitive definition of “all the deleted lines between the lowest alive line and the current hunk”).
This solution, however, doesn’t have the same drawbacks from the user’s point of view.