How to move through history?

clamydo · January 21, 2022, 1:47pm

How do I move to a specific point in history without unrecording every single change until then?

I am trying out pijul 1.0.0-beta and trying to unlearn git commands ;). If I understand correctly, I could fork a channel (in order not to lose the current state) and pijul unrecord change by change until I’ve reached the target point in history. But is there a quicker way?

Maybe a git-pijul-Rosetta would be helpful?

joyously · January 21, 2022, 4:03pm

I’ve been asking similar questions, with no good answers.
repository history
timing comparison
branching
feature complete?
revisiting old versions
and probably some discussions here as well.

@pmeunier mentions history here

In Pijul, the cost is measured in terms of the size of history, i.e. the number of patches, rather than the size of the patches themselves. Large binary files aren’t really a problem (modulo a few minor remaining bugs, such as the text format for patches). One feature that isn’t implemented yet is navigating large histories that aren’t tagged, but this should be done shortly, since it is the last major thing I want to do before moving to beta.

m4b · January 22, 2022, 2:26am

This is the first question I had actually, and after perusing docs, and attempting a similar operation:

pijul channel foo
pijul unrec 122134

and seeing the error message that it can’t do this because it relies on the grandchild commit, I do not see an easy or obvious way to checkout arbitrary points in history, which seems like a serious deficiency to me, unless I’m missing something?

Basically I’m asking what is the pijul equivalent of git checkout 12345 ? If this doesn’t exist, I’d suggest a reasonable form of this needs to be in place for a 1.0 release, as this seems like table stakes for a VC system at this point.

spacefrogg · January 24, 2022, 1:05pm

Pijul and “the other” VCSs work very differently from each other.

GIT and friends are overly aggressive with establishing a dependency by enforcing a single total order on all their commits. (Technically, their history is a single directed acyclic graph.)

You (the user) have come to accept that as a fact of life and structure your source code work accordingly. I.e. we usually implicitly accept that commits that purely add new data files or revolve around documentation are still in a dependency relation with changes to code or with each other. GIT established “commit often and early” as a motto, to make those fake dependencies manageable for the tools, because that becomes easier when your changes are small and involve only a few files. There is no inherent value to small commits other than this.

In pijul, you’d have at least three dependency graphs, one for docs, one for code and one for data. Going back in time in one graph does not affect the other two. Or should it?

Well, that’s where GIT etc. make it mentally easy, because there you can follow a “don’t care” approach and just revert everything past a certain point. (And with “past” I mean the dependency chain, the commit/author date can be younger/older, still.)

You could list the patch history with pijul log and revert everything past your point of interest, but this might still miss independent patches with an older date. While this might work most of the time, you would need to keep the corner cases in mind.

So, while in GIT you have unintended dependencies, in pijul you may have unintended in-dependencies.

You and your collaborators must learn to use strategies to mitigate these issues, because there is no easy way for you to find those independent patches.

You could create a tag with pijul tag before you pull new changes, so you have a state X you could jump back to. The problem is that you must have created that state X beforehand. It is hard to go back as an afterthought.

Pijul’s pull is interactive and lists all changes you would download. This is the list of “unseen” patches which you would sift through. You can pull those patches into a separate channel with --to-channel and locally draw from them piece by piece until you’ve found the patch of interest. This channel is not meaningful in terms of “building the code” but is just a limited storage for certain patches, which are of interest to you.

Channels (as opposed to branches in GIT) can help you separate your content into units that are meaningful to you. You can have channels which are just dumps of ideas or variants of solutions without all the rest of your source code. And you would not interact with them by checking them out into a working tree but just by pushing and pulling individual (graphs of) patches.

You see, I don’t have an easy answer to your question, because you’d work differently with pijul from how you do with GIT. You won’t likely have the same problems that require you to “go back in time”. But you will have different problems that require new strategies.

It might make sense for you to give a concrete example of the form “I am in state X. I get a bunch of changes, which leaves me in state Z. Now I must find state Y, which is the last ‘good’ state before everything went haywire.” Ideally one that involves two or three files and their content.

joyously · January 24, 2022, 7:17pm

That seems to be the problem. Other VCS are actually managing versions, whereas Pijul manages changes.

I doubt this. The value of small commits is that you can revert or change them, or send them to others, more easily. They are also easier to review and test. Committing often is also insurance against hardware issues or your own mistakes, as it gives you a backup. Bisect is also easier.

I already suggested that Pijul should be making “auto” tags, such as before a pull or by date (so you can go back to a date).

I suggested this (comparing to Git), but did it get added?

Yes, the problem is the same because you collaborate with people and you release versions. You still need to be able to “checkout and test” a change. You still need to be able to find the change that caused a bug, trying some form of bisect between the last good state and the latest. New tools might be needed, but the base Pijul needs to be able to supply the info to feed the new tools to be able to do the same things other VCS do.

spacefrogg · January 25, 2022, 10:25am

If that in itself is a problem for you, you shouldn’t use Pijul, because managing changes (individually) instead of versions of working trees is an intended function. This comes with consequences of “how it feels” to work with it and how to structure your programming work around it. There is no intention (as far as I can see) to make Pijul another of “the other” VCSs.

I hope you haven’t misread me on purpose. There is no value to a small change, but there is value to a change that concerns a single intent. This means the change is the smallest one that transitions the working tree from a working state A into a working state B. There is no value to commits that transition your working tree from A, through several states X, Y, Z, which all leave the working tree in some broken state, and finally pull everything together into B, just for the virtue that these commits are small. Future you or your collaborators cannot know that X, Y, Z are not to be used. And it will make bisect harder, because you end up having a “bubble” of broken states in between, whereas bisect assumes that good and bad states are definitely on opposite sides of the history.
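To make the “bubble” point concrete, here is a toy sketch in Python. It is not tied to any real VCS: the history, the state names and the set of broken states are all made up for illustration. A textbook bisect is run over a linear history in which A works, X, Y, Z are broken intermediate commits, B works again, and the real regression R only arrives later.

```python
def bisect_first_bad(states, is_good):
    """Textbook bisect.

    Assumes states[0] is good, states[-1] is bad, and that there is
    exactly one good-to-bad transition in between.
    """
    lo, hi = 0, len(states) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(states[mid]):
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical linear history (names are illustrative only).
history = ["A", "X", "Y", "Z", "B", "C", "R", "S"]
broken = {"X", "Y", "Z", "R", "S"}  # X, Y, Z form the "bubble"

culprit = history[bisect_first_bad(history, lambda s: s not in broken)]
print(culprit)  # X, although the regression of interest is R
```

Because the single-transition assumption is violated, the search ends on the first commit of the bubble and never looks at R.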

That is gambling. VCSs are not backup systems, and none of them is intended as one (although they are widely misused as such).

This was a statement of mine. It IS interactive and lets you choose what to pull. So, yes it did get added.

This is very generic and not helpful. You are stating goals and infer from them that the problems must be the same because the goals are the same. In this generality, that is not true. I asked for specific examples. So, we could exercise through them together, thinking of good strategies and identifying missing UI/tools. Just because you want to achieve the same goal doesn’t mean the ways towards the goal cannot be completely different. This said, no, I don’t think you will encounter the same set of problems with Pijul.

I don’t know if you’ve worked with code before GIT, but if you did, you know that rebasing was not a thing back then; the same goes for massive branching and merging. They are common strategies now, because they are features that GIT supports and that Subversion doesn’t. Rebasing and massive branching brought their own set of new problems. So, stating that the problems won’t change with a tool that works very differently is neither correct nor helpful.

For instance, when your project does not build, because somebody messed up your documentation, you can unapply patches that just relate to documentation files. You can achieve that in GIT, too. But it involves different steps. You must rewrite history for this.

Should the changes in docs be mingled with code changes in a single patch, you can unrecord the change and split off the code change (to keep it) while reverting the docs change. You can do that in GIT, too. But it, again, involves a history rewrite.

Again, please come up with a concrete example that interests you.

joyously · January 25, 2022, 6:54pm

The first line of the Pijul manual says:

Welcome to the Pijul book, an introduction to Pijul, a distributed version control system

so I think it is intended to be another VCS. It is supposedly like Darcs, but doesn’t have all the commands that Darcs has.

No, I didn’t. I was not referring to commits just for the sake of committing. I was referring to units of functional change of state. If I need to reformat 100s of files to meet the project’s new coding standards, do I commit one file at a time or 100? Which one is easier to review or test or send to others?

Not a literal backup system as such but it’s a good descriptive word for it. And yes, that’s what they are for. It’s in almost all of the intro to version control sites as the main reason to use a VCS. The collaboration aspect is secondary. Both need a good history traversal mechanism.

I just tried it and got an error. (I recently installed the beta.) There was nothing interactive for the pull command.

pijul pull
Downloading changes [>                                                 ] 0/6                     
Error: No such file or directory (os error 2)

I did, as did the OP and m4b.
Let’s say that I tried Pijul back when it was 0.12 or whatever. I have a repo that the newest version doesn’t read. So I want to extract the 0.12 version of the code so I can fix one thing before using it on my repo. What pijul command will do that?
Or I am in charge of writing a monthly newsletter on my project’s progress. How do I get the details of what changed in the code in the last month (or any month)? What if my boss wants a SLOC count for each month last year for the annual report?

m4b · January 26, 2022, 4:39am

note: I haven’t had time to read in depth some of the responses, but one of them suggesting “pijul isn’t for you” strikes me as oddly positioned (if not hostile).

It doubly strikes me as odd that the use case of, e.g., checking out the repo’s textual state at some point, whether that is a semantic or temporal or some other such point, would be doubted as a normal/expected procedure. Something like this occurs on a weekly or bi-weekly basis at my $DAYJOB and also in $PERSONAL_PROJECTS, so either my dayjob and personal projects are weird (I don’t think so, and my coworkers definitely wouldn’t either), or it’s weird I have to explain to someone why this kind of operation might be common.

For example, as @joyously noted, it seems pretty reasonable to, e.g., want to look at a change and see if a regression occurred. Or, for example (I see this just about every day), someone says: “I’ve checked out change abcd123456, built the code, and the issue is not present”. Is the suggestion here that someone wanting to do this should not use pijul? This is standard-issue debugging in both large and small projects when trying to narrow down the scope of some regression; dismissing it outright, without, for example, a suggestion on how one might accomplish it, strikes me as odd.

Now, perhaps this viewpoint is highly git-centric, I won’t doubt that. But you’ll notice I gave an example of what I was trying to do, and asked what the pijul colloquialism for this was.

So: is there not a pijul colloquialism for this? What is someone supposed to do if e.g., some change introduced at some point in time caused a regression, and we want to find out, just by going back in time, where we might have introduced it?

And let me be clear: the answer here will be unappealing, I would posit, for most prospective newcomers and curious onlookers or otherwise generally brave people wanting to try a new VCS, that such and such is not possible because of theory, or an answer something like, you shouldn’t think about it that way; while this might be interesting, for these people, the VCS is likely dead on arrival for them, since they want to use it to do, well practical things, like browse through a history, and if encountering some bug, goto some point in history that doesn’t have that bug, and narrow it down from there (i.e., bisection).

I’ve been told that tags are the more appropriate solution for this, and maybe so. (Indeed, I filed pijul/pijul Discussion #632, “pijul git appears to drop git tags”, precisely because of this: oops, it appears to drop git tags, so now we can’t explore any history if tags are the pijul way to do this.)

Even still, I find tags an unsatisfying suggestion here; sometimes I just want to isolate one particular patchset/change. It feels strange for someone to tell me that this is the incorrect viewpoint, since pijul change Z4ZNMHSKLDAQ2MQEWW3XZP3AJ356DYWCRAJSAXFLTPSKRGD5BYCQC shows me an isolated change; is it somehow unsound for me to ask how to revert the state of the repo to that point? And please don’t tell me there isn’t a point; there is a point: pijul log shows me that point as the first entry, and when I invoke it again, it’s consistently there. If it’s some non-deterministic set of patches, I’d suggest the log subcommand non-deterministically display the list of results.

Again, back to practical examples, I could write:

pijul unrecord Y37IP2CFY3HQZR4FMGQ2IGDMQ2BEU2EKVQTYAZAGNIKMAQEC5PJQC  6CV43X76UVGBOUOBBPVF7T3FL6XNRO27KW5SRNSVHFT3G7HXEW7AC MTC5LQENSV3IWRV63M7AKUP4YLCCWNDFEXGPDPJPNCZHKPYA2CHAC

which according to my pijul log are 3 changes in my repo preceding the patch Z4ZNMHSKLDAQ2MQEWW3XZP3AJ356DYWCRAJSAXFLTPSKRGD5BYCQC.

So, I’m asking, why isn’t there a subcommand, for example (non normative) pijul checkout Z4ZNMHSKLDAQ2MQEWW3XZP3AJ356DYWCRAJSAXFLTPSKRGD5BYCQC which perhaps resolves to:

pijul channel tmp-Z4ZNMHSKLDAQ2MQEWW3XZP3AJ356DYWCRAJSAXFLTPSKRGD5BYCQC
pijul unrecord Y37IP2CFY3HQZR4FMGQ2IGDMQ2BEU2EKVQTYAZAGNIKMAQEC5PJQC  6CV43X76UVGBOUOBBPVF7T3FL6XNRO27KW5SRNSVHFT3G7HXEW7AC MTC5LQENSV3IWRV63M7AKUP4YLCCWNDFEXGPDPJPNCZHKPYA2CHAC

With that, I would be pretty satisfied. Specifically, in just the context I was interested in, just so this is clear, I’m talking about automating computer-automatable operations to make one’s life easier. If this kind of operation is somehow theoretically impossible in some general sense, then yes, perhaps pijul isn’t for me.

m4b · January 26, 2022, 4:43am

~~I typed up a fairly long post, and it has now been removed and flagged by the spam filter Akismet, for reasons that are unclear to me, which is… super annoying?~~ displaying now…

spacefrogg · January 26, 2022, 2:10pm

I’m sorry I caused this much upset. I am not here to convince anybody of anything or pass judgement on anyone.

Short answer:
Pijul has state markers now. Having checked out a particular set of patches, pijul can put a state marker on them. That state marker is tied to the current dependency graph and resembles the linearity of history that you’ve come to expect from GIT and friends.

You find them with pijul log --state, which may return something like:

Change X77HHMGQCVXJSW7GTQ4ETCVT7OQO7F54B5GCLAUBTUVH6VRYL6TQC
Author: spacefrogg
Date: 2022-01-26 13:25:57.113769677 UTC
State: 4PUVTHWGVI3KPYJ7SSWTHUXASLJS5QOGH3D2RYOXUBWCDI62SB3QC

    a

Change EARDKIEJMJQUOBIYXUA2HUOO5ZPQ6DEKD3Y47KYZPSAOCYGGWRMAC
Author:
Date: 2022-01-26 13:26:00.603468821 UTC
State: SZI2BZ2Z3WAXEYFTQUJ2PXJRJFSCAWS5S4FOQRRUS24FZAAISFYAC

Afterwards, the current UI of pijul allows you to pijul clone --state <state> a particular state into a new working tree. There seems to be no UI to transition to a different state inside the same repository, right now. If you must have the old state as a separate channel, you would then pull from the freshly cloned repo into a second one. E.g.:

# clone state 4PUV from `repo-a' into `old-state'
repo-a $ pijul clone --state 4PUV... . ../old-state
repo-a $ pijul channel new old-state
# pull old state back into channel `old-state'
repo-a $ pijul pull --to-channel old-state ../old-state

spacefrogg · January 26, 2022, 3:14pm

This is exactly where you are misunderstanding what changes (the Pijul term) are and what commits (the GIT term) are.

A change is not a point in history, at least not only. It is definitely not a point in history regarding the whole state of the working tree. You said, you haven’t read the earlier thread in detail. So, I’m hesitant to elaborate on this point, because you might not want to read about it…

A short example, though:

# create repo 'a' and record two changes
$ pijul init a; cd a; echo a>a; pijul record a -m a; echo b>b; pijul record b -m b; pijul log --state

Change 3DXPVOZHTLM7BIYCTGXX3K4F2BDDOGJ2HIOZWMXGU26PLTTX44QQC
Author: spacefrogg
Date: 2022-01-26 14:51:15.756437822 UTC
State: JJB46NHV4MMXUHA3IWKYSQOBFEUM4VSH3SP3YKVO67OO3USRQS7AC

    b

Change BZFCUGORTCHPONN47D5AVS53DJ3PQA3UH3BAKN6XC6XZVG3ACMHAC
Author: spacefrogg
Date: 2022-01-26 14:51:01.428570226 UTC
State: GLSE5Y6RYOUO7CKZG7FC53T3GQL32NGVBMTBFTT3HC7X4MB2B7JQC

    a

Change NELXJKXLXLESBHFZGRFQHKKEULVWEUIAHFUDXVUTO2LJD3TV66JQC
Author:
Date: 2022-01-26 14:51:04.403293549 UTC
State: AZ2YWX57RU7YAAQHNLPW42BDTGWTTSSED73EXFFMIDWOZPYMPBVQC

Now we create a second repo and pull the patches in reverse order. Because they are independent of each other, this will result in the same repository, BUT:

# create repo 'b' and pull changes in reverse order (need to also specify the root change!)
$ cd ..; pijul init b; cd b; pijul pull ../a -- 3DXPV NELXJK
$ pijul pull ../a -- BZFCU; pijul log --state 

Change BZFCUGORTCHPONN47D5AVS53DJ3PQA3UH3BAKN6XC6XZVG3ACMHAC
Author: spacefrogg
Date: 2022-01-26 14:51:01.428570226 UTC
State: JJB46NHV4MMXUHA3IWKYSQOBFEUM4VSH3SP3YKVO67OO3USRQS7AC

    a

Change 3DXPVOZHTLM7BIYCTGXX3K4F2BDDOGJ2HIOZWMXGU26PLTTX44QQC
Author: spacefrogg
Date: 2022-01-26 14:51:15.756437822 UTC
State: USANBAQ4NXDHG3NBGPQUGWWRICQC2MGVEABX5JV5AGMEWMAPXUBAC

    b

Change NELXJKXLXLESBHFZGRFQHKKEULVWEUIAHFUDXVUTO2LJD3TV66JQC
Author:
Date: 2022-01-26 14:51:04.403293549 UTC
State: AZ2YWX57RU7YAAQHNLPW42BDTGWTTSSED73EXFFMIDWOZPYMPBVQC

As you can see, the root change NELX represents the same state AZ2Y in both repositories. Also, the newest change represents the same state in both repositories, namely JJB4, although the changes themselves are different (3DXP in a, BZFC in b). Also notice that the developer in repository a can only go back to state GLSE, while the developer in repository b can only go back to state USAN. Developer a will revert the change to file b and developer b will revert the change to file a.

The cause for this is that the history is linearised arbitrarily (well, not totally arbitrary) by Pijul to give you an impression of something that just doesn’t exist in its system, which is a single linear history.
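The same observation as a toy model in Python. The big assumption: a state identifier here is simply a hash over the set of applied changes. That is a made-up stand-in chosen only because it ignores application order; it is not how Pijul actually computes its state hashes, and the names state_id and states_along are hypothetical.

```python
import hashlib


def state_id(changes):
    """Toy state identifier that depends only on the *set* of applied changes."""
    joined = "\n".join(sorted(changes))
    return hashlib.sha256(joined.encode()).hexdigest()[:8]


def states_along(order):
    """State identifier reached after each change in the given order."""
    applied, out = set(), []
    for change in order:
        applied.add(change)
        out.append(state_id(applied))
    return out


repo_a = states_along(["root", "a", "b"])  # order recorded in repo 'a'
repo_b = states_along(["root", "b", "a"])  # order pulled into repo 'b'

print(repo_a[0] == repo_b[0])    # True: same root state
print(repo_a[-1] == repo_b[-1])  # True: same final state
print(repo_a[1] == repo_b[1])    # False: the intermediate states differ
```

Both repositories agree on where they started and where they ended up, but the one step “back” from the final state leads somewhere different in each.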

m4b · January 27, 2022, 3:32am

Thanks for your responses, your second example is very good.

So in the case of my example above, if the order of the patches being reverted is irrelevant (i.e., they commute), then it still isn’t clear to me why there is no operation like: revert all patches which are reverse deps of change ABCD.

It seems to me this information must be known, since reverting some arbitrary change ABCD without supplying the reverse deps of ABCD yields an error message indicating this.

I believe this would be sufficient for most of my uses of revision control, specifically, identify some interesting patch (pijul log), and “reset to that patch”, which in pijul land could mean just revert the reverse deps of that patch (although this is less satisfying if trying to bisect, since one of the unrelated patches of ABCD may be the cause)
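A rough sketch of what “revert the reverse deps of ABCD” amounts to, in Python. Assumptions: the dependency graph is available as a plain mapping from each change to the changes it depends on, and the change names are invented. This only models the graph walk; it does not call pijul, and reverse_dependents and unrecord_order are hypothetical helper names.

```python
from collections import defaultdict


def reverse_dependents(deps, target):
    """All changes that depend on `target`, directly or transitively.

    `deps` maps each change to the set of changes it depends on.
    """
    dependents = defaultdict(set)
    for change, needs in deps.items():
        for needed in needs:
            dependents[needed].add(change)

    found, stack = set(), [target]
    while stack:
        for child in dependents[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def unrecord_order(deps, target):
    """Dependents of `target`, ordered so that a change is only listed
    after every change that depends on it."""
    doomed = sorted(reverse_dependents(deps, target))
    order, seen = [], set()

    def visit(change):
        if change in seen:
            return
        seen.add(change)
        for other in doomed:  # handle anything depending on `change` first
            if change in deps[other]:
                visit(other)
        order.append(change)

    for change in doomed:
        visit(change)
    return order


# Hypothetical graph: E and H depend on ABCD, F depends on E, G is unrelated.
deps = {"ABCD": set(), "E": {"ABCD"}, "F": {"E"}, "G": set(), "H": {"ABCD", "G"}}
order = unrecord_order(deps, "ABCD")
print(order)
```

Note that the unrelated change G is left alone, which is exactly the case that makes this less satisfying for bisecting.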

It’s definitely an interesting property/challenge that the histories are not linear and so the operation “revert to point” is less defined, but from a pragmatic standpoint, and again this returns to the original issue:

how are developers using pijul expected to manage regressions in functionality, or exploring arbitrary states of the repo? This part isn’t quite clear to me; the reasons for why this might be a differently stated problem than git or why it might be harder/less defined are, but that doesn’t mean I still don’t need to manage regressions or explore arbitrary history

spacefrogg · January 27, 2022, 11:07am

In GIT and friends, we have (roughly) come to the following approach when investigating a regression:

1. After the error happens, start bisecting immediately, without (or with only some) investigation of the cause of the error.
2. Once the commit that broke the code is identified, look at it and infer its relation to the error.

I believe with Pijul, you would want to go the other way around:

1. After the error happens, investigate which file is causing the error.
2. pijul log -- error-file does the obvious, showing the log of changes relating to error-file. They can still be independent, so don’t count on log order!
3. Better, use pijul credit error-file to find the change that introduced the error.
4. Use pijul unrecord --reset <change> to remove the effect of the change from the current channel. It is the opposite command of pijul apply <change>.

Matthijs · January 27, 2022, 10:47pm

If one really wanted to bisect, could one not do so simply on dates of recording? That wouldn’t be infallible of course, but is it not a fine heuristic?
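A minimal sketch of that heuristic in Python. Assumptions: each change carries a recording date, the caller supplies a regression_present function that builds and tests a given set of changes, the empty set is good, and the full set is bad. Nothing here checks that a date-ordered prefix is closed under dependencies, so skewed clocks could produce prefixes that cannot be applied. All names are hypothetical.

```python
from datetime import date


def bisect_by_date(changes, regression_present):
    """Return the first change, in date order, whose inclusion triggers
    the regression.

    `changes` is a list of (recorded_on, change_id) pairs.
    `regression_present(applied)` receives a set of change ids.
    """
    ordered = [cid for _, cid in sorted(changes)]
    lo, hi = 0, len(ordered)  # prefix of length lo is good, length hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if regression_present(set(ordered[:mid])):
            hi = mid
        else:
            lo = mid
    return ordered[hi - 1]


# Hypothetical log, deliberately not in date order.
changes = [
    (date(2022, 1, 3), "C"),
    (date(2022, 1, 1), "A"),
    (date(2022, 1, 4), "D"),
    (date(2022, 1, 2), "B"),
]
found = bisect_by_date(changes, lambda applied: "C" in applied)
print(found)  # C
```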

spacefrogg · January 28, 2022, 10:40am

Bisecting will be much less useful in Pijul, because the “distance” between two related changes is shorter.

Suppose you have a repository in which you change 100 files one after another and go back to the first and start afresh.

In GIT and friends, two changes in any single file would always lie 100 commits apart. In Pijul, two changes in any single file are always adjacent to each other. So, there is no need for bisect, just unrecord the last change in the file of interest.

There are, obviously, more intricate cases with more complex dependencies, but the general message stays the same: Unrelated changes automatically fall out of your scope.
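The 100-file example, counted out in a small Python sketch. Assumption: a change depends only on the previous change to the same file, which is a simplification of real dependency tracking, and the function names are made up.

```python
FILES = 100

# Linear log: round 0 touches files 0..99 in turn, then round 1 does it again.
log = [(rnd, f) for rnd in range(2) for f in range(FILES)]


def linear_distance(f):
    """Number of log entries between the two changes to file f."""
    return log.index((1, f)) - log.index((0, f))


def dependency_distance(f):
    """Steps between the two changes to file f when only changes
    touching f are considered (the per-file dependency chain)."""
    chain = [entry for entry in log if entry[1] == f]
    return chain.index((1, f)) - chain.index((0, f))


print(linear_distance(42), dependency_distance(42))  # 100 1
```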

pmeunier · January 28, 2022, 10:56am

They’re still far away in the log, but I agree that you can unrecord them without unrecording all the others.
That said, I don’t quite see why that makes bisect “less useful”, I think it just makes it (1) harder if you want the same semantics as in Git, since going back in history is harder, and (2) different, since it opens the way to new workflows with bisect.

I don’t know how to make it automatic in regression tests, for example, but figuring it out is an interesting project.

spacefrogg · January 28, 2022, 1:00pm

I guess, what I meant was that with a single linear history for the whole tree, you can trust that all changes that could possibly be effecting your regression are reverted when you bisect (or jump back in the history).

1st example:
Suppose you have a script that runs on all files in data/ and at some point, one of them broke. Now, there is no factual dependency between the script (where the error becomes visible) and the new broken data file. Concentrating on the history of the script is of no use in that case.

So, in Pijul, it does matter to identify the semantic interdependencies to find and fix regressions. In a linear history, you can (mindlessly) go back to make sure you undid whatever could potentially be causing the regression. Then you can reliably go forward to home in on the actual problem.

Without having a stable linearisation of history, I guess, bisecting it will be less useful. (Because you still have the burden of identifying the semantic interdependencies.)

2nd example:
Suppose the presence of two patches together actually causes the regression. One removes a bounds check in the provider of a function, the second removes the bounds check in the consumer of the same function (used from a different file). Maybe it was recognised (independently) that one of the checks was superfluous, but each developer had their own opinion about which one…

As I showed in an earlier post, without a stable linearisation, each of the two developers will see the other’s change as the second one, the one that broke the code. So, non-linearisation may impose an additional communication burden.
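The second example as a toy sketch in Python. Assumptions: the tree is broken only when both patches are present, and each developer walks their own local log order. The patch names and first_breaking_change are invented for illustration.

```python
def first_breaking_change(order, is_broken):
    """Walk a local log in order and return the change after which the
    tree first breaks, or None if it never does."""
    applied = set()
    for change in order:
        applied.add(change)
        if is_broken(applied):
            return change
    return None


def is_broken(applied):
    # Hypothetical rule: the regression needs both bounds checks gone.
    return {"drop-check-in-provider", "drop-check-in-consumer"} <= applied


dev_a_log = ["base", "drop-check-in-provider", "drop-check-in-consumer"]
dev_b_log = ["base", "drop-check-in-consumer", "drop-check-in-provider"]

blamed_by_a = first_breaking_change(dev_a_log, is_broken)
blamed_by_b = first_breaking_change(dev_b_log, is_broken)
print(blamed_by_a)  # drop-check-in-consumer
print(blamed_by_b)  # drop-check-in-provider
```

Each developer ends up blaming the other’s patch, even though both hold exactly the same set of changes.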

EDIT: These are all just random thoughts. I don’t mean to imply that I favour any particular approach on this topic.

Matthijs · January 28, 2022, 2:33pm

Not quite. Suppose

1. At first, everything is fine.
2. One day, a mistake is made, but nothing breaks.
3. Later, somewhere else a refactoring is carried out. The refactoring is correct, but things break anyway.
4. Only after a while the breakage is noticed.

Now, git-bisecting will only lead one to the (correct!) refactoring, not to the original mistake.

Which is not to say git-bisect is useless in this case: it still points to relevant – if not responsible – code, which probably makes it easier to find the real cause.

I expect the same to hold for Pijul, except that (as illustrated by your bounds checking example) the probabilities will be different: Pijul will more often point you to ‘merely’ relevant (as opposed to offending) patches.

I agree. However, I expect such dependencies to go unrecorded very regularly. There should be tools for dealing with this.

@pmeunier Is it possible to mark two past patches as in conflict with each other?

pmeunier · January 28, 2022, 3:41pm

Good question. I assume you’re not talking about conflicts in the usual sense, but rather to mark patches as incompatible with one another.

This would lead to inextricable situations where two “cathedrals of patches” leading to major features suddenly become unapplicable to the same codebase, and prevent any attempt to merge. I suppose this wouldn’t stop the authors of some package manager (Cabal for example), but it isn’t really what we want here.

More modern package managers like NPM and Cargo solve the issue by having multiple versions, and the possibility to have all of them in the same package/crate, but we don’t have that here.

joyously · January 28, 2022, 6:59pm

Since we are talking about history, it seems to me that you would use the date. When I pull changes from someone else, they should retain their original date and author. The history should be shown according to date. I go back to a previous state by using the date. If that means that there should be more options for date or that Pijul needs to auto-tag by date, then that needs to happen.
