Pijul

Incorporating discussions into the repo itself

So, I’ve been thinking about this since seeing it in Fossil. Why don’t we integrate discussions into the repo itself? After all, they are just as important to the history of a project as the commit log.

This brings:

  1. decoupling from the host (the nest);
  2. a lone developer working on a local-only project can still manage their own issues (and if they ever decide to host it on the Nest, the issues will be integrated automatically).

For the implementation, we could choose either:

  1. an unversioned database in the .pijul directory
  2. a versioned database in a hidden __discussions branch

The database would be something like this:

{'1':
    {'title': 'Issue title',
     'content': [
         {'date': date, 'author': author, 'message': 'the content', 'reactions': 'reactions from users, like :ok_hand:, :heart:'},
         {'date': date, 'author': author, 'message': 'the content', 'reactions': 'reactions from users, like :ok_hand:, :heart:'}
     ]
    }
}
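
To make the shape above concrete, here is a minimal sketch of the same structure as typed Python records with a JSON round trip. The class and function names are made up for illustration, and storing reactions as a list of strings (rather than one string) is an assumption, not something the proposal fixes.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Comment:
    date: str      # assumed to be an ISO 8601 timestamp string
    author: str
    message: str
    reactions: List[str] = field(default_factory=list)  # e.g. [":heart:"]


@dataclass
class Issue:
    title: str
    content: List[Comment] = field(default_factory=list)


def dump_database(issues: Dict[str, Issue]) -> str:
    """Serialise the whole database to JSON, keyed by issue id."""
    plain = {key: asdict(issue) for key, issue in issues.items()}
    return json.dumps(plain, indent=2, sort_keys=True)


def load_database(text: str) -> Dict[str, Issue]:
    """Rebuild the typed records from the JSON produced by dump_database."""
    raw = json.loads(text)
    return {
        key: Issue(
            title=value["title"],
            content=[Comment(**comment) for comment in value["content"]],
        )
        for key, value in raw.items()
    }
```

Whether this lives as an unversioned file under .pijul or in a hidden branch does not change the record layout; only where the serialised text is written.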
3 Likes

I wanted to do that initially when I started the Nest: having Nest-compatible decentralised issues would be even better.
At the moment, this suffers from two problems:

  • it is too opinionated to be hard-coded in Pijul at the moment, since no big project uses Pijul yet.

  • Having something decentralised requires a fixed format that would last almost forever. At the moment, Nest discussions can still improve a lot. When they’re ready, we can freeze a format and do this.

Meanwhile, why not use a distributed bug tracker such as “bugs everywhere”, and use Pijul as a backend?

2 Likes

External bug trackers are definitely interesting, but as you pointed out they would need Nest compatibility to allow an eventual migration into the Nest.

I’m not sure why we need a “fixed format that would last almost forever” for bugs. The internals would be completely behind the scenes and managed by Pijul itself. A change in the format would mean a bump in the Pijul version and an automated upgrade of the database, but it wouldn’t be anything as disruptive as the change in the patch format was.
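
As a rough sketch of what such an automated upgrade could look like: the database carries a version number, and each format bump ships one function that rewrites the previous layout. Everything here is hypothetical, including the "version" and "issues" keys and the example change (reactions going from one string to a list).

```python
from typing import Callable, Dict

CURRENT_VERSION = 2


def _v1_to_v2(db: dict) -> dict:
    """Hypothetical change: v1 stored reactions as one string, v2 as a list."""
    issues = {}
    for key, issue in db["issues"].items():
        comments = []
        for comment in issue["content"]:
            comment = dict(comment)  # copy, so the input is not mutated
            reactions = comment.get("reactions", "")
            if isinstance(reactions, str):
                comment["reactions"] = reactions.split()
            comments.append(comment)
        issues[key] = {**issue, "content": comments}
    return {"version": 2, "issues": issues}


# One entry per format bump: source version -> function producing version + 1.
MIGRATIONS: Dict[int, Callable[[dict], dict]] = {1: _v1_to_v2}


def upgrade(db: dict) -> dict:
    """Apply migrations in order until the database is at CURRENT_VERSION."""
    version = db.get("version", 1)  # assume files without a version are v1
    if version > CURRENT_VERSION:
        raise ValueError(f"database version {version} is newer than this tool")
    while version < CURRENT_VERSION:
        db = MIGRATIONS[version](db)
        version = db["version"]
    return db
```

Running upgrade on an already current database returns it unchanged, so it could be called unconditionally whenever the database is opened.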

2 Likes

“Having something decentralised requires a fixed format that would last almost forever.”

It doesn’t. The formats need to be designed with extensibility in mind.

1 Like

Shameless plug: I’ve recently developed a decentralized issue tracker that works equally well with different SCMs, be it Git, Mercurial or Pijul (or even without one), because it’s based on simpler primitives. It’s part of the SIT project (https://sit.fyi)

2 Likes

Wikis and issue trackers absolutely should be version controlled and ideally live near the code, if for no other reason than software archaeology.

That said, separation of concerns means that how the wiki and issue tracker work should ideally be as independent of the source code management system as possible. A gh-pages branch seems to work well as a standard for wikis. We just need something human-readable for issues (will org-mode take over the world?).

@yrashk: I’m just seeing this now, sorry about this. I actually took a few months’ break from software development, moved to another country, and started a new job (even less related to software than before), but I seem to be back at it now.

SIT looks great, and I look forward to integrating it into Pijul (it would be an excellent fit for the Nest, and it would finally allow us to decentralise it).

1 Like

The problem I see with issues stored inside the repo is which branch they would be connected to.
Would each issue be connected to the branch the feature is being implemented on?

Thinking about it, this isn’t even a bad idea, especially when using Pijul.
First you can open a discussion about a new feature on the master branch (or on a branch which already has all the features required for the new feature).
When the discussion gets too long, or some code is being written, it can easily be separated from the current branch into a new branch (fork, and then unrecord on the current branch).
So discussions leading nowhere could live on branches that will die anyway.
If the records containing discussion and code are kept separate, the discussion could even be removed from the current branch again after the feature is implemented, if wanted.
It’s pretty flexible how you can integrate discussions into the code.

However, I wouldn’t like it if some specific format for discussions were built into Pijul. One might want to use a different format for one’s own repo hosting website.
And I also don’t want to be forced to include discussions in the history, so at least using a specific branch/repo, unrelated to the code, should be possible.

Is this also what you were thinking about?

1 Like

It’s not just discussions: emoji for saying “me too, I’d really like that feature” or “thank you very much” (a heart) are pretty important on the human side of things.

I’m really arguing here for decentralised hugs. :hugs:

For formats, I’d be tempted to suggest something like markdown + a small JSON blob. It would be great if the Nest could be decentralised from the beginning (or at least be very offline friendly).
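
A minimal sketch of what “markdown + small JSON blob” might mean in practice, assuming one file per post with the JSON on the first line, then a separator line, then the markdown body. The layout, the separator and the field names are all made up for illustration.

```python
import json
from typing import Tuple

# Assumed layout: one line of JSON, a line containing only "---", then markdown.
SEPARATOR = "\n---\n"


def render_post(meta: dict, body: str) -> str:
    """Write the metadata as single-line JSON, followed by the markdown body."""
    return json.dumps(meta, sort_keys=True) + SEPARATOR + body


def parse_post(text: str) -> Tuple[dict, str]:
    """Split a post at the first separator into (metadata, markdown body)."""
    header, sep, body = text.partition(SEPARATOR)
    if not sep:
        raise ValueError("post has no metadata separator")
    return json.loads(header), body
```

The body stays readable and diffable as ordinary text, while reactions and similar small bits of state live in the blob.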

Let me preface this by saying that I consider distributed discussions, wikis, etc., to go along with distributed code tracking, long overdue. I say this because there is more wrong with distributed project management than is right, and I am not seeing anything in the discussion so far that goes beyond existing systems. My goal is not to attack individual ideas but rather to foster a conversation about the underlying problems; the state of the art of distributed project management offers little material for constructive critique.

Every attempt to crack the problem I have come across shares the understandable problem of approaching it from the DVCS side. After all, the architecture used to overcome the structural challenges of decentralizing code versioning should be easily applicable to related problems. Yes? No? Maybe. This is certainly the case in the larger sense of reasoning about distributed systems, but it breaks down when treating it as virtually the exact same problem or, even worse, as a subset of version control. The mental model is some variation of: it’s all documentation. This is not the case, because otherwise it would be part of the documentation. What we are talking about here is metadata about the project. While some of it may refer to a particular line, in a particular state, in a particular branch, residing in a particular repository, in a general sense it does not. Let’s look at the most basic approach first.

Dump it all in the same repository as the code, with some automation to handle ticketing and whatnot. The latter should clearly indicate an issue. Automated commits in general indicate an abuse of a VCS, and automated commits of automatically generated data are particularly egregious. You would not automatically commit code every time you saved it, so you should not treat other data as a second-class citizen. In one sense or another it is going to be out of sync with the actual code, clutter the log, pull all sorts of metadata into what would otherwise be clean merges, and generally get in the way of dealing with version control. At the very least, descriptions for anything but targeted wiki edits and new issues are all but assured to be repetitive junk. It is possible to mentally or pragmatically filter out all this noise, but that’s a clear indicator that it may not belong in the first place. Enter separate but equal.

Dump it all next to the repository, with tight integration. In the best case scenario this takes advantage of a solid distributed storage mechanism but it usually ends there. This can work decently well for parts of the project metadata that are the closest to actually being documentation, like a wiki, but it is not optimal and that becomes apparent the further it gets from the idealized model with a line of text as the fundamental unit of change under careful manual review. Since the wiki is the best case let’s start there.

The main problem of putting a wiki under version control is not so much the format as how it is edited. Project meta-documentation (that is, documentation that applies to multiple repositories, is not expected to be in sync with the code beside it, and so on) could conceivably be housed in a separate repository, as it is produced very much like code. Simply saving is separate from adding it to the repository, so the latter can be done as a complete change with a coherent description. That is not quite the case with a wiki. Saving and auto-saving by necessity produce separate changes. They have to be stored by the server one way or another, and the usual solution is to just treat each as a separate commit. This would just be a UI issue (albeit one that most wikis simply forgo dealing with) as long as individual authors were able to do the equivalent of commits after ordering their thoughts, but the kind of collaborative editing that wikis are intended for breaks the code versioning model. It can still be kind of sort of treated as code documentation, but I would suggest that for the purposes of project documentation it needs to integrate with discussions more than with code and should be reasoned about accordingly. This leads to discussions/tickets/bug reports, whatever you see them as, where any similarity to program code simply breaks down.

Regardless of what exactly the data is, the distinguishing feature is that it is structured. Beyond author, time and message, there is some sort of status, tags, references to commits, other threads, etc. It can be stored as JSON in a code repository, but what’s the point? The fundamental unit is a named field, not a line of text, and the units are organized into self-contained posts, quite unlike files consisting of interwoven edits. Add issues like spam, and the ability to easily purge data so that it is not replicated gets mixed into the pot.
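
To illustrate what treating the named field as the unit of change could mean, here is a rough sketch of a three-way merge that works per field instead of per line. It is only one possible approach, not a description of how any existing tracker merges, and the field names in the usage note are hypothetical.

```python
from typing import Any, Dict, List, Tuple

_MISSING = object()  # marks a field that is absent from a record


def merge_fields(
    base: Dict[str, Any], ours: Dict[str, Any], theirs: Dict[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    """Three-way merge of two edited copies of a record against their ancestor.

    Returns the merged record and the names of fields that both sides
    changed in different ways (our value is kept provisionally for those).
    """
    merged: Dict[str, Any] = {}
    conflicts: List[str] = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:        # both sides agree (including both deleting the field)
            value = o
        elif o == b:      # only their side changed this field
            value = t
        elif t == b:      # only our side changed this field
            value = o
        else:             # both changed it, differently
            conflicts.append(key)
            value = o
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts
```

With this, one person closing a ticket while another adds a tag merges cleanly, which a line-based merge of the same JSON would not guarantee; only two different edits to the same field are reported as a conflict.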

My point is that project metadata should not be restricted to centralized portals. I would like to see the ability to clone it, edit it locally, and have it merged together in a sensible fashion, while it stays accessible to drive-by users who just need to file a bug report on the project website.

I have no idea what exactly that looks like, and absolutely no idea what the theoretical underpinnings to achieve smooth distributed functioning would be, much less where that diverges from commutative patches, just that there is more to be gained from approaching it as a separate problem space than from bolting on the user experience of a centralized project management system. Fossil did the latter, and it works well enough for what it is, but the ticketing portion of it barely benefits from distribution while suffering from its drawbacks, and the wiki portion may as well be dynamically rendered documentation from a repository.

What I envision is something along the lines of Foswiki in the ability to intermingle structured data and free-form wiki text, with posts that can be freely attached to any and all relevant topics, duplicate tickets that can be merged rather than simply marked as duplicates, feature requests that can be directly incorporated into the wiki pages discussing the design to implement said features, arbitrary repositories that can be referenced seamlessly without having to point at the centralized servers they can be fetched from, etc. Maybe that’s what is being discussed here and I’m just not reading it right, but to me it seems like the conversation is inadvertently retracing the missteps I see in Fossil.

1 Like

I think the commit messages are the history of the project, and trying to put all the discussion about paths not chosen into the repository itself is needless bloat.

I know in Bazaar, you can use a --fixes option on a commit, and use an identifier:ID format to indicate a reference to an external bug tracker. This helps discoverability in the logs and retains separation of concerns for code and bugs, as the project can choose which bug tracker works for them.
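
For illustration, here is a small sketch of how a tool could turn such identifier:ID references into links when displaying the log. It does not use or describe Bazaar’s own code; the tracker name and the URL template are placeholders that a project would configure itself.

```python
from typing import Dict, Tuple

# Hypothetical per-project configuration: tracker identifier -> URL template.
TRACKER_URLS: Dict[str, str] = {
    "tracker": "https://bugs.example.org/{id}",
}


def parse_fix_ref(ref: str) -> Tuple[str, str]:
    """Split an 'identifier:ID' reference into (tracker identifier, bug id)."""
    tracker, sep, bug_id = ref.partition(":")
    if not sep or not tracker or not bug_id:
        raise ValueError(f"expected 'identifier:ID', got {ref!r}")
    return tracker, bug_id


def resolve_fix_ref(ref: str, trackers: Dict[str, str] = TRACKER_URLS) -> str:
    """Turn a reference into the URL of the bug on its external tracker."""
    tracker, bug_id = parse_fix_ref(ref)
    if tracker not in trackers:
        raise ValueError(f"unknown tracker {tracker!r}")
    return trackers[tracker].format(id=bug_id)
```

The repository then only stores the short reference, and the choice of bug tracker stays a matter of configuration.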