Using Pijul as a backend for a Pull Request-like system?

I’m writing a web backend in Rust and was wondering if I could use Pijul as a Rust library for detecting changes and conflicts. The idea for the website is that people contribute articles and can suggest changes to other people’s articles, like GitHub’s Pull Requests. My questions:

  1. Can I use Pijul as a Rust library?
    a. If so, can I use it without actually letting it talk to the file system?
    All my data would be in a SQLite DB, so having to unpack it to a folder to let Pijul handle the changes would be bad performance-wise. I guess I could write to /tmp or /dev/shm? I found working_copy::memory; could it be useful?
    Ideally I would like to create whatever Rust struct is necessary for Pijul to generate a patch I could then visualize in the web editor.
    b. If so, would I have to make my whole backend open-source and GPL2-licensed, or just the parts that interact with Pijul, i.e. just the Pull Request backend functionality?

1a. You can, but:

  • You would have to write your own implementation of WorkingCopy. I don’t think working_copy::memory is the most suitable one, since it essentially emulates a filesystem in memory. The working_copy::fs one is much simpler to start with. Just copy it (mind the license) and replace std::fs with whatever you want; see the sketch after this list.
  • I wouldn’t store the patches in an SQLite database, but there’s nothing preventing you from doing that. It is likely to be slower, with no benefit at all.
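To make the “replace std::fs with whatever you want” idea concrete, here is a minimal sketch of a working copy backed by SQLite. The trait below is a simplified stand-in, not libpijul’s actual WorkingCopy trait, and the SqliteWorkingCopy type, the files table schema, and the method names are all assumptions for illustration; the real method list should be copied from libpijul’s working_copy module.

```rust
use rusqlite::{params, Connection, OptionalExtension};

/// Simplified stand-in for the real trait: take the actual method list
/// from libpijul's working_copy module and adapt each method the same way.
trait SimpleWorkingCopy {
    type Error;
    fn read_file(&self, path: &str, buf: &mut Vec<u8>) -> Result<(), Self::Error>;
    fn write_file(&mut self, path: &str, contents: &[u8]) -> Result<(), Self::Error>;
    fn remove_path(&mut self, path: &str) -> Result<(), Self::Error>;
}

/// A working copy whose "files" live in a SQLite table instead of on disk.
struct SqliteWorkingCopy {
    conn: Connection,
}

impl SqliteWorkingCopy {
    fn new(conn: Connection) -> rusqlite::Result<Self> {
        // Hypothetical schema: one row per file in the working copy.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, contents BLOB NOT NULL)",
            [],
        )?;
        Ok(SqliteWorkingCopy { conn })
    }
}

impl SimpleWorkingCopy for SqliteWorkingCopy {
    type Error = rusqlite::Error;

    fn read_file(&self, path: &str, buf: &mut Vec<u8>) -> Result<(), Self::Error> {
        // Where working_copy::fs would call std::fs::read, we query the table.
        let contents: Option<Vec<u8>> = self
            .conn
            .query_row(
                "SELECT contents FROM files WHERE path = ?1",
                params![path],
                |row| row.get(0),
            )
            .optional()?;
        if let Some(c) = contents {
            buf.extend_from_slice(&c);
        }
        Ok(())
    }

    fn write_file(&mut self, path: &str, contents: &[u8]) -> Result<(), Self::Error> {
        self.conn.execute(
            "INSERT INTO files (path, contents) VALUES (?1, ?2)
             ON CONFLICT(path) DO UPDATE SET contents = excluded.contents",
            params![path, contents],
        )?;
        Ok(())
    }

    fn remove_path(&mut self, path: &str) -> Result<(), Self::Error> {
        self.conn
            .execute("DELETE FROM files WHERE path = ?1", params![path])?;
        Ok(())
    }
}
```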

1b. IANAL, but the GPL2 license doesn’t seem to force you to make your backend’s code (source or binaries) public. However, if you do distribute it (for example by selling or distributing the compiled form, or by making your source publicly available), the result has to comply with the GPL2 license. If this is a problem for your application, please PM me to discuss a different license.


Thanks for answering!

I think that’s exactly what I want, and similar to what I would achieve with something like /dev/shm. Are you suggesting the problem may lie in storing a large history in memory?

Slower compared to other databases or to another approach in general? Block storage on something like S3?

Well, if you think it is more suitable, then so be it! You might want to check that you understand what the WorkingCopy trait does.

The Nest works without a WorkingCopy, for example, and it does something similar to what you want to do.

I don’t know what you know about Pijul. For example, you can’t store the pristine on S3 or in a database (because pristines are already databases themselves). Storing patches in a database could work, but it will be slower than just files.

The link doesn’t seem to be working, if it was meant to be a link.

Very little, indeed.

But is the working-copy solution you suggested still viable? I thought I would create a very simplified file system in SQLite, so that Pijul would still think it’s talking to a regular file system and the pristine database would still work. Or did you mean in a previous reply that using working_copy will work, but without some features?

Storing patches in a database could work, but will be slower than just files.
The nest works without a WorkingCopy

So the Nest uses a custom solution? Any chance you could share how it works?

Fixed, sorry about that.

Yes, you can definitely do that. You would get slower output than on a filesystem, but the edits would be atomic (unlike in a regular filesystem).
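As a concrete illustration of that atomicity, here is a minimal sketch, reusing the hypothetical files table from the earlier sketch: writing several output files inside one SQLite transaction means readers see either the old working copy or the new one, never a half-written mix. The function name and schema are assumptions for illustration.

```rust
use rusqlite::{params, Connection};

/// Write a batch of output files as an all-or-nothing update.
/// `files` is (path, contents); the `files` table is the hypothetical
/// schema from the previous sketch.
fn output_files_atomically(
    conn: &mut Connection,
    files: &[(&str, &[u8])],
) -> rusqlite::Result<()> {
    let tx = conn.transaction()?;
    for (path, contents) in files {
        tx.execute(
            "INSERT INTO files (path, contents) VALUES (?1, ?2)
             ON CONFLICT(path) DO UPDATE SET contents = excluded.contents",
            params![path, contents],
        )?;
    }
    // Either every file lands or none does; a crash mid-way rolls back.
    tx.commit()
}
```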

It is quite complicated, to be honest. The main feature of the Nest is to be able to handle user data in an experimental version control system. Experimental here means it is very possible that files get out of sync with the pristine for a wide variety of reasons.
Now that Pijul is more stable, I’m also using this to synchronise multiple cloud servers.


I don’t know why you keep saying that: 35% Faster Than The Filesystem :smiley:

I definitely need to benchmark it. I would have thought SQLite would be faster, especially since the database would be constantly open, so fewer calls to the OS for open and close.

Because I know how a database works, and I know what Pijul needs. In Pijul, all the content produced by users is stored compressed in patches. You can use a cache to avoid opening and closing files constantly (which is what changestore::fs does), but storing the patches in another database will cost a lot more, since for each access you’ll have to copy an entire patch out of the database, load it into memory, and then start decompressing what you need.

You can benchmark if you want, but unless your patches are very small, I don’t think you’ll gain anything from that.
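To illustrate the access-pattern difference being described, here is a minimal sketch, assuming a hypothetical patches table keyed by hash: with plain files you can seek straight to the region of a patch you need, while with an ordinary BLOB column you pull the whole compressed patch into memory before decompressing anything. The table name, column names, and on-disk layout here are assumptions, not Pijul’s actual changestore format.

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom};

use rusqlite::{params, Connection};

/// Files: open the patch and seek straight to the region you need.
fn read_patch_region_from_file(
    dir: &std::path::Path,
    hash: &str,
    offset: u64,
    len: usize,
) -> std::io::Result<Vec<u8>> {
    let mut f = File::open(dir.join(hash))?;
    f.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0; len];
    f.read_exact(&mut buf)?;
    Ok(buf)
}

/// Database: a plain BLOB column hands back the entire patch, which
/// then has to sit in memory before you decompress the part you need.
fn read_whole_patch_from_db(conn: &Connection, hash: &str) -> rusqlite::Result<Vec<u8>> {
    conn.query_row(
        "SELECT data FROM patches WHERE hash = ?1",
        params![hash],
        |row| row.get(0),
    )
}
```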


If you’re into databases and performance, you might enjoy reading:
