AST-level diffs and merges


AST diff and merge algorithms allow to resolve conflicts which cannot be resolved by usual git. For example changes in the same row or moving blocks of code into other places or variables renames. It may be useful to store not text diffs, but AST ones. Of course with fallback to raw text or binary when the lang is unknown.

https://github.com/koczkatamas/onelang - is an abstract ast for all langs with specifying languages in yaml and transpiling the code on them to each other
https://github.com/GumTreeDiff/gumtree - ast diff tool


Thanks for the constructive comment! Could you provide an explanation of why you think the current Pijul could not handle this, and how that could work with our current algorithm?


In fact I have not tried pijul yet, all the things are from my git experience, and since, I guess (am I right?) that pijul is also line-oriented (as git is), I thought that it may have the same troubles git has. But since pijul is not mature and stable like enough, it can introduce breaking changes and innovations, like AST-based diffs.


Pijul is not line-oriented (at least in the sense that Git is) - that’s the key difference. Although it is not at level of understanding different programming for conflict resolution, the internal file representation has an explicit modeling of conflicts (that you can then resolve).

I’d recommend reading:

it’s a good explanation of how conflicts are handled in Pijul.


I have already read this, thisI guess won’t help much for automatic conflict resolution.
version a

def a(b, d):
  return b+d

Version b

def a(b, c):
  return b+c

version c

def a(a, d):
  return a+d


How should d look like? I guess

def a(a, c):
  return a+c

I guess neither pijul nor git can deal with it.

Another example is code style change without changing meaning of the code and then merging 2 branches one before style change another one is after.

Another use case is subtree moving. We move a subtree into another tree. An then someone edits that subtree unmoved and tries to merge, obviously this will lead to conflict. But if we trac subtrees we can apply the changes he made to the moved subtree.


It’s an interesting idea, but I think it’s out of scope for Pijul (in its current form at least).

I could see it fitting in as a hook feature of sorts. If Pijul would provide a “merge” hook, then the conflict resolution scheme could employ knowledge of the AST to come with smarter suggestions.


Storing the code as an AST (and history as a spatiotemporal graphc) internally has some benefits such as lesser size and immediate answer on queries “which patches have affected this symbol” or “which patches affect contol flow touching that symbol”.


That’s really great, everybody knows that. We could also write diff algorithms for other structures than trees: why not arbitrary graphs? Sets? Continuous functions?

The main problem here is, unless you have a concrete proposal for how to adapt our theory to trees, we cannot implement this now.