Let me preface this by saying that I consider distributed discussions, wikis, etc., to go along with distributed code tracking long overdue. I say this because there is more wrong with distributed project management than is right, and I am not seeing anything in the discussion so far that goes beyond existing systems. My goal is not to attack individual ideas but rather to foster a conversation about the underlying problems, because the state of the art of distributed project management offers little to build a constructive critique on.
Every attempt to crack the problem that I have come across shares the understandable flaw of approaching it from the DVCS side. After all, the architecture used to overcome the structural challenges of decentralizing code versioning should be easily applicable to related problems. Yes? No? Maybe. This is certainly the case in the larger sense of reasoning about distributed systems, but it breaks down when the problem is treated as virtually the exact same one. Or even worse, as a subset of version control. The mental model is some variation of: it’s all documentation. This is not the case, because otherwise it would be part of the documentation. What we are talking about here is metadata about the project. While some of it may refer to a particular line, in a particular state, in a particular branch, residing in a particular repository, in a general sense it does not. Let’s look at the most basic approach first.
Dump it all in the same repository as the code, with some automation to handle ticketing and whatnot. The latter should clearly indicate an issue. Automated commits in general indicate an abuse of a VCS, and automated commits of automatically generated data are particularly egregious. You would not automatically commit code every time you saved it, so you should not treat other data as a second-class citizen. In one sense or another it is going to be out of sync with the actual code, clutter the log, pull all sorts of metadata into what would otherwise be clean merges, and generally get in the way of dealing with version control. At the very least, commit descriptions for anything but targeted wiki edits and new issues are all but assured to be repetitive junk. It is possible to filter out all this noise mentally or programmatically, but that is a clear indicator that it may not belong there in the first place. Enter separate but equal.
Dump it all next to the repository, with tight integration. In the best-case scenario this takes advantage of a solid distributed storage mechanism, but it usually ends there. This can work decently well for the parts of the project metadata that are closest to actually being documentation, like a wiki, but it is not optimal, and that becomes more apparent the further the data gets from the idealized model of a line of text as the fundamental unit of change under careful manual review. Since the wiki is the best case, let’s start there.
The main problem with putting a wiki under version control is not so much the format as it is how it is edited. Project meta-documentation, that is, documentation that applies to multiple repositories, is not expected to be in sync with the code beside it, and so on, could conceivably be housed in a separate repository because it is produced very much like code. Simply saving is separate from adding it to the repository, so the latter can be done as a complete change with a coherent description. That is not quite the case with a wiki. Saving and auto-saving by necessity produce separate changes. They have to be stored by the server one way or another, and the usual solution is to just treat each as a separate commit. This would just be a UI issue (albeit one that most wikis simply forgo dealing with) as long as individual authors were able to do the equivalent of commits after ordering their thoughts, but the kind of collaborative editing that wikis are intended for breaks the code versioning model. It can still be kind of sort of treated as code documentation, but I would suggest that for the purposes of project documentation it needs to integrate with discussions more than with code and should be reasoned about accordingly. This leads to discussions, tickets, bug reports, whatever you see them as, where any similarity to program code simply breaks down.
Regardless of what exactly the data is, the distinguishing feature is that it is structured. Past author, time and message, there is some sort of status, tags, references to commits, other threads, etc. It can be stored as JSON in a code repository, but what’s the point? The fundamental unit is a named field, not a line of text, and the units are organized into self-contained posts, quite unlike files consisting of interwoven edits. Add issues like spam, and the ability to easily purge data so that it stops being replicated gets thrown into the pot as well.
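To make the "named field, not a line of text" point concrete, here is a rough sketch of a three-way merge that works on fields instead of lines. Everything in it is an assumption made for illustration: the function name `merge_ticket`, the field names, and the rule that set-valued fields (like tags) merge by combining additions and removals. It is not how any existing tool does it, just one way the idea could look.

```python
def merge_ticket(base: dict, ours: dict, theirs: dict) -> tuple[dict, dict]:
    """Three-way merge of one ticket, keyed on field names (hypothetical sketch).

    Returns (merged, conflicts). A field changed on only one side takes that
    side's value. A field changed differently on both sides is reported in
    conflicts and left at its base value.
    """
    merged, conflicts = {}, {}
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if isinstance(o, set) and isinstance(t, set):
            # Set-valued fields: keep what nobody removed, add what anyone added.
            b = b if isinstance(b, set) else set()
            value = (b & o & t) | (o - b) | (t - b)
        elif o == t or t == b:
            value = o          # both agree, or only our side changed
        elif o == b:
            value = t          # only their side changed
        else:
            conflicts[key] = (o, t)
            value = b          # leave the base value until a human decides
        if value is not None:  # None means the field was removed
            merged[key] = value
    return merged, conflicts


base = {"status": "open", "title": "Crash on start", "tags": {"bug"}}
ours = {"status": "confirmed", "title": "Crash on start", "tags": {"bug", "ui"}}
theirs = {"status": "open", "title": "Crash on startup",
          "tags": {"bug", "regression"}, "assignee": "alice"}

merged, conflicts = merge_ticket(base, ours, theirs)
print(merged, conflicts)
```

In this example the two sides touched different fields, so the merge goes through without a conflict even though a line-based merge of the serialized JSON could easily have tripped over it.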
My point is that project metadata should not be restricted to centralized portals. I would like to see the ability to clone it, edit it locally, and merge it back together in a sensible fashion, all while it stays accessible to drive-by users who just need to file a bug report on the project website.
I have no idea what exactly that looks like and absolutely no idea what the theoretical underpinnings needed to achieve smooth distributed functioning would be, much less where that diverges from commutative patches, just that there is more to be gained from approaching it as a separate problem space than from bolting on the user experience of a centralized project management system. Fossil did the latter, and it works well enough for what it is, but the ticketing portion of it barely benefits from distribution while suffering from its drawbacks, and the wiki portion may as well be dynamically rendered documentation from a repository.
What I envision is something along the lines of Foswiki in its ability to intermingle structured data and free-form wiki text: posts that can be freely attached to any and all relevant topics, duplicate tickets that can be merged rather than simply marked as duplicates, feature requests that can be directly incorporated into the wiki pages discussing the design to implement said features, arbitrary repositories that can be referenced seamlessly without having to point at the centralized servers they can be fetched from, etc. Maybe that’s what is being discussed here and I’m just not reading it right, but to me it seems like the conversation is inadvertently retracing the missteps I see in Fossil.
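As a rough illustration of what "merged rather than simply marked as duplicate" could mean, here is a minimal sketch. The `Post` and `Ticket` types, their fields, and the `merge_duplicates` function are all hypothetical names invented for this example, and the policy (keep the primary's identity, pool tags and posts, remember the old ID as an alias) is just one plausible choice, not something Foswiki or Fossil provides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Post:
    author: str
    timestamp: int  # seconds since the epoch
    body: str


@dataclass
class Ticket:
    ticket_id: str
    title: str
    tags: set = field(default_factory=set)
    posts: list = field(default_factory=list)
    aliases: set = field(default_factory=set)  # IDs of tickets folded into this one


def merge_duplicates(primary: Ticket, duplicate: Ticket) -> Ticket:
    """Fold `duplicate` into `primary` without losing any of its discussion."""
    # Posts are hashable (frozen dataclass), so identical posts collapse to one.
    posts = sorted(set(primary.posts) | set(duplicate.posts),
                   key=lambda p: (p.timestamp, p.author, p.body))
    return Ticket(
        ticket_id=primary.ticket_id,
        title=primary.title,
        tags=primary.tags | duplicate.tags,
        posts=posts,
        # Old references to the duplicate's ID can still be resolved.
        aliases=primary.aliases | duplicate.aliases | {duplicate.ticket_id},
    )


a = Ticket("T-1", "Crash on start", {"bug"}, [Post("ann", 10, "It crashes")])
b = Ticket("T-7", "Crashes at launch", {"regression"}, [Post("bob", 5, "Same here")])
print(merge_duplicates(a, b))
```

The point of the sketch is that nothing here resembles a textual merge: the operation is defined on posts, tags and identifiers, which is exactly why I think it deserves its own model.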