Heads-up: recent bugs with conflicts and with the Nest

Hi all,

I just wanted to give you a quick heads-up about things I’m currently doing in Pijul:

  • The Nest was previously using my own hack to do HTTP; it’s now using Hyper. I originally wrote that hack because Hyper had no Tokio version, and I felt I didn’t know enough about Tokio to rewrite Hyper. Another goal was to avoid the complexity (and memory allocations) of Hyper, which was forced to allocate a lot because it answered each request on a different thread.
    That experiment was interesting in itself, and makes me want to contribute back to Hyper some of the things I learnt along the way.
    Meanwhile, my current testing version of the Nest uses only Hyper for everything. This might cause some problems in the next few days with external services (such as OAuth and emails).

  • @laumann and @lthms have found a number of issues with conflicts. My initial intuition was that they were caused by a problem in the representation of zombie lines (lines that some patch authors consider alive, while others have deleted them).
    I’ve just checked the theory, and that intuition was wrong.
    However, looking at the code of output, apply and unrecord, I’ve found (and hopefully fixed) a number of other problems:

    • output (libpijul/src/graph.rs) was not outputting zombie lines correctly, due to a stupid mistake. This is fixed now.
    • unrecord was not working correctly on zombie lines. Some lines were still marked as zombies.
    • ERROR: dependency not found was probably preventing some patch applications. The wrong test for that was a leftover from a refactoring. Fixed too!

    Even after these fixes, I would not expect existing repositories to work completely. Repositories on the Nest will be refreshed today (i.e. entirely cloned). A way to clean local repositories is to clone them (and obviously work on the fresh clone). Sorry for the inconvenience.


Thank you for this write-up.

It sounds like there is still quite a bit of work to be done on Hyper :)

Let us know when pijul_org/pijul should be cloneable again, then I’ll be happy to try it out.

I’d love to understand how you got to this conclusion, but I suspect it requires a good in-depth understanding of the theory. I was beginning to wonder if dependency inference was buggy, but it seems like it isn’t?

Thanks again!

I haven’t found any problem in dependency inference, but patch application sometimes failed because a test in apply was not performed correctly (hence the ERROR: lines).

Now, about zombie lines: when Alice wants to delete a paragraph, and Bob concurrently wants to add a line in the middle of that paragraph, the lines Alice deleted are called “zombies”.

This is because they have two kinds of edges pointing to them: alive edges (or pseudo-alive), and deleted edges. Producing these edges correctly took me a while (I did it quite some time ago), because detecting them is not symmetric: Alice can easily know she has deleted the lines of Bob’s context, but Bob needs a way to detect that as well, for patch application to commute.
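
The definition above can be sketched in a few lines of Rust. This is only an illustration and not the code in libpijul: the `EdgeKind` enum and the `is_zombie` function are hypothetical names, and a line is modelled simply as the list of kinds of the edges pointing to it.

```rust
/// Hypothetical edge kinds, named after the description above.
/// The real flags in libpijul are an assumption-free zone: they may differ.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum EdgeKind {
    Alive,
    PseudoAlive,
    Deleted,
}

/// A line is a zombie when it has at least one alive (or pseudo-alive)
/// incoming edge AND at least one deleted incoming edge.
fn is_zombie(incoming: &[EdgeKind]) -> bool {
    let alive = incoming
        .iter()
        .any(|k| matches!(k, EdgeKind::Alive | EdgeKind::PseudoAlive));
    let deleted = incoming.iter().any(|k| *k == EdgeKind::Deleted);
    alive && deleted
}
```

A line with only deleted edges is plainly dead and a line with only alive edges is plainly alive; the conflict shows up only when both kinds are present.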

Now, each edge of the graph is stored twice: once in each direction. This is for complexity reasons. So, when you delete an edge, you also need to delete its companion in the other direction. I was not doing that properly.
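
To make the "stored twice" point concrete, here is a toy sketch, assuming a graph kept as two ordered sets (one per direction). The `Graph`, `Flag`, `add_edge`, `remove_edge` and `is_consistent` names are invented for this example and do not reflect how libpijul actually stores its graph.

```rust
use std::collections::BTreeSet;

/// Hypothetical edge flag, for illustration only.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Flag {
    Alive,
    Deleted,
}

/// Toy graph storing every edge twice: (from, to) in `forward`
/// and its companion (to, from) in `backward`.
#[derive(Default)]
struct Graph {
    forward: BTreeSet<(u32, u32, Flag)>,
    backward: BTreeSet<(u32, u32, Flag)>,
}

impl Graph {
    fn add_edge(&mut self, from: u32, to: u32, flag: Flag) {
        self.forward.insert((from, to, flag));
        self.backward.insert((to, from, flag));
    }

    /// Remove BOTH copies of an edge. Returns true if the edge existed.
    fn remove_edge(&mut self, from: u32, to: u32, flag: Flag) -> bool {
        let f = self.forward.remove(&(from, to, flag));
        let b = self.backward.remove(&(to, from, flag));
        debug_assert_eq!(f, b, "forward and backward copies out of sync");
        f
    }

    /// Invariant: every forward edge has its reverse companion, and vice versa.
    fn is_consistent(&self) -> bool {
        self.forward.len() == self.backward.len()
            && self
                .forward
                .iter()
                .all(|&(a, b, fl)| self.backward.contains(&(b, a, fl)))
    }
}
```

Routing every deletion through a single `remove_edge` keeps the two copies in sync; removing from only one set leaves a stale companion behind, which is the kind of leftover that can keep a line marked as a zombie after unrecord.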

Now, here is how I got to the conclusion that this is not a theoretical bug: each new patch can add edges, and the conflict detection happens afterwards. We just need to be careful not to lose information when unrecording stuff.


That makes sense, thank you!

Alright, so after the productive meetup @lthms and I had today, I have a new hypothesis for the conflicts in Cargo.lock.

I now suspect that it might have contained a cycle at some point. I have no clear evidence, but I’ve seen cycles while debugging this morning.

Pijul is not (yet) supposed to be able to create them, so conflict resolution is not tested at all in that particular case.

I’ll make a Rust test with artificial cycles soon.

Edit: there were cycles in Cargo.lock on the Nest. Proof: the Nest now hits stack overflows when trying to unrecord the patches supposed to “fix” the conflict.
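
For readers wondering how a test with artificial cycles could detect them without itself overflowing the stack, here is a minimal sketch. It is a generic depth-first search with an explicit stack, not Pijul's traversal code; the `has_cycle` function and the adjacency-map representation are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Visit state of a vertex during the search.
#[derive(Clone, Copy)]
enum Mark {
    InProgress, // on the current DFS path
    Done,       // fully explored, known not to lead to a cycle
}

/// Returns true if the directed graph contains a cycle.
/// Uses an explicit stack instead of recursion, so a long path
/// (or a cycle) cannot overflow the call stack.
fn has_cycle(adj: &HashMap<u32, Vec<u32>>) -> bool {
    let mut marks: HashMap<u32, Mark> = HashMap::new();
    for &start in adj.keys() {
        if marks.contains_key(&start) {
            continue;
        }
        // Each stack entry is (vertex, index of the next child to visit).
        let mut stack: Vec<(u32, usize)> = vec![(start, 0)];
        marks.insert(start, Mark::InProgress);
        while let Some(top) = stack.last_mut() {
            let (v, i) = *top;
            let children = adj.get(&v).map(|c| c.as_slice()).unwrap_or(&[]);
            if i < children.len() {
                top.1 += 1;
                let child = children[i];
                match marks.get(&child).copied() {
                    // Reaching a vertex on the current path closes a cycle.
                    Some(Mark::InProgress) => return true,
                    Some(Mark::Done) => {}
                    None => {
                        marks.insert(child, Mark::InProgress);
                        stack.push((child, 0));
                    }
                }
            } else {
                marks.insert(v, Mark::Done);
                stack.pop();
            }
        }
    }
    false
}
```

A recursive traversal that assumes an acyclic graph never terminates on a cycle and eventually exhausts the stack, which matches the symptom above; checking for cycles first (or tracking in-progress vertices as done here) turns that crash into a detectable condition.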

You might be interested in my latest finding… (and I got the stack overflow in my terminal too)