The patch hashes are currently base64-encodings of the SHA2-512 of the patch as written in binary.
This is not optimal, especially when we need to copy-paste hashes (which does not happen very often for me).
I was thinking of transitioning to base58. Since Pijul 0.8 only supports SHA2-512 hashes, this can be made backwards compatible (looking at the length of the encoded hash is enough to know the encoding).
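For readers unfamiliar with base58, here is a minimal sketch of an encoder. It assumes the Bitcoin alphabet (no `0`, `O`, `I` or `l`, and no `+` or `/`, which is what makes the output easier to copy-paste); the thread does not say which alphabet or library Pijul would use, and `base58_encode` is a name made up for this example.

```rust
// Bitcoin-style base58 alphabet (an assumption; other alphabets exist).
const ALPHABET: &[u8; 58] = b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

/// Encode `input` as base58, treating it as one big-endian integer.
pub fn base58_encode(input: &[u8]) -> String {
    // Leading zero bytes carry no numeric value, so they are
    // represented explicitly as leading '1' characters.
    let zeros = input.iter().take_while(|&&b| b == 0).count();

    // Base-58 digits of the number, least significant digit first.
    let mut digits: Vec<u8> = Vec::new();
    for &byte in &input[zeros..] {
        // digits = digits * 256 + byte, done digit by digit.
        let mut carry = byte as u32;
        for d in digits.iter_mut() {
            carry += (*d as u32) << 8;
            *d = (carry % 58) as u8;
            carry /= 58;
        }
        while carry > 0 {
            digits.push((carry % 58) as u8);
            carry /= 58;
        }
    }

    let mut out = String::with_capacity(zeros + digits.len());
    for _ in 0..zeros {
        out.push('1');
    }
    for &d in digits.iter().rev() {
        out.push(ALPHABET[d as usize] as char);
    }
    out
}
```

The quadratic loop is fine for hash-sized inputs; a real implementation would more likely rely on an existing crate.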
However, this is not forward-compatible: current versions of Pijul will be in trouble when the hash function is changed in the future.
Alternatively, we can drop support for base64, and release a small conversion tool. What do you think?
I am in favour of changing the patch hash format. Some time ago I implemented hex encoding of patch hashes, but for SHA-512 they become extremely long.
I’m not sure the forwards compatibility is a big issue: if an old version of Pijul needs to work with a different patch hash format, then a small conversion tool might come in handy, but it’d be likely that the old version would have other changes to handle as well.
In an attempt to implement all breaking changes we are aware of before 0.9, I implemented this.
The Nest and crates.io have been updated. Unfortunately, older versions of Pijul cannot pull or push patches from remote repositories with the new format. Upgrading can be done by:
cargo install pijul to install Pijul ≥ 0.8.2.
pijul clone all your repositories locally. Pijul 0.8.2 knows about the format change, and can clone the old patches.
Is there any reason for using SHA-512? That seems like an excessive choice for a security level; SHA-256, with a security level of 128 bits, would be sufficient.
However, even better would be SHA-512/256, which is SHA-512 with a different IV truncated to 256 bits. It also has a security level of 128 bits, but it isn’t vulnerable to length-extension attacks like SHA-512 or SHA-256 are. Length extension attacks aren’t likely a problem, since Pijul is just using the hashes as an identifier and checksum but not as a MAC, but it is a good property to have in case anyone ever does make an assumption that would be invalidated by a length-extension attack.
That would cut the length of the identifiers in half, which would help with the slight expansion by switching to base58.
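To put rough numbers on the lengths being discussed, here is a small sketch that computes how many characters a raw digest of `n` bytes takes in each encoding. The function names are made up for this example, and it only counts the digest itself, ignoring any prefix bytes Pijul adds to its identifiers.

```rust
/// Characters needed to hex-encode `n` bytes.
pub fn hex_len(n: usize) -> usize {
    2 * n
}

/// Characters needed for unpadded base64 (6 bits per character, rounded up).
pub fn base64_len(n: usize) -> usize {
    (8 * n + 5) / 6
}

/// Upper bound on the base58 length of `n` bytes:
/// each character carries log2(58), about 5.86, bits.
pub fn base58_max_len(n: usize) -> usize {
    ((8 * n) as f64 / 58.0_f64.log2()).ceil() as usize
}
```

With these formulas, a 64-byte SHA-512 digest comes out at 128 hex characters, 86 base64 characters, and at most 88 base58 characters; a 32-byte digest such as SHA-512/256 needs at most 44 base58 characters.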
Yes. We chose the strongest standardised hash at the time, because it’s always easier to decrease the security level later than to run into security problems.
That said, the design of libpijul on hashes is entirely forward-compatible: actually, the first byte of our patch identifiers indicates which hash function to use (in base64, all hashes started with A).
We might add Blake2s instead of SHA2 in the next release, which is capable of producing shorter hashes. Older patches will keep their identifier, though.
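The "first byte selects the hash function" idea could look roughly like the sketch below. This is not libpijul's actual code: the type and function names are hypothetical, and so are the tag values. Tag 0 for SHA-512 is only a guess that happens to be consistent with base64 identifiers starting with A; tag 1 for Blake2s is purely illustrative.

```rust
/// Hash functions an identifier may refer to (hypothetical set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HashAlgorithm {
    Sha512,
    Blake2s,
}

impl HashAlgorithm {
    /// Map the leading tag byte to an algorithm (tag values are assumptions).
    fn from_tag(tag: u8) -> Option<Self> {
        match tag {
            0 => Some(HashAlgorithm::Sha512),
            1 => Some(HashAlgorithm::Blake2s),
            _ => None,
        }
    }

    /// Digest size in bytes for each algorithm.
    fn digest_len(self) -> usize {
        match self {
            HashAlgorithm::Sha512 => 64,
            HashAlgorithm::Blake2s => 32,
        }
    }
}

/// Split a decoded identifier into its algorithm and raw digest.
/// Returns `None` for an empty input, an unknown tag, or a wrong digest length.
pub fn parse_identifier(bytes: &[u8]) -> Option<(HashAlgorithm, &[u8])> {
    let (&tag, digest) = bytes.split_first()?;
    let algo = HashAlgorithm::from_tag(tag)?;
    if digest.len() == algo.digest_len() {
        Some((algo, digest))
    } else {
        None
    }
}
```

Because the tag travels with the identifier, old patches keep their SHA-512 identifiers while new ones can use a shorter hash, and a reader never has to guess which function produced a given digest.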
Something to consider is the multihash format used by IPFS.
The idea is that the hash and encoding are documented in the output and it would be easier to change things in the future in a backward compatible way.
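For comparison, here is a rough sketch of the multihash layout: an unsigned varint naming the hash function, an unsigned varint giving the digest length, then the digest bytes. The codes 0x12 (sha2-256) and 0x13 (sha2-512) come from the multicodec table; the function names are made up for this example, and this is not a replacement for a real multihash library.

```rust
/// Multicodec codes for two SHA-2 variants.
pub const SHA2_256: u64 = 0x12;
pub const SHA2_512: u64 = 0x13;

/// Append `value` as an unsigned varint: 7 bits per byte,
/// least significant group first, high bit set on all but the last byte.
fn push_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Wrap a raw digest as <varint code><varint length><digest bytes>.
pub fn multihash(code: u64, digest: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(digest.len() + 4);
    push_varint(code, &mut out);
    push_varint(digest.len() as u64, &mut out);
    out.extend_from_slice(digest);
    out
}
```

For a SHA-512 digest this adds two bytes of overhead (0x13, then 0x40 for the 64-byte length). Multihash on its own only describes the hash function and length; the text encoding (base58, base64, ...) is a separate choice.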