(Debian) pristine-tar and how to *not* do packaging
Pristine-tar is a nice tool when packaging for Debian in a git repository. The basic idea is that when a new upstream tarball is checked into the upstream branch, some hash information and a binary delta are saved into the pristine-tar branch. Later, if the tarball is needed, a tarball is generated from the upstream branch and the stored xdelta is applied to it, so that the result matches the original hash sum.
Nice idea, and I have been using it for a long time. Recently, however, I ran into persistent problems, and into an author and maintainer who has lost interest (or time).
So what happened? I agreed to sponsor a new package, 3dldf, which is related to TeX as it is a kind of 3d-MetaPost. The maintainer, Jerome Benoit, has done a great job packaging it and preparing a git repository for development.
After some back and forth, we were ready to upload to Debian, so I started my usual process:
- build the source package from the git repository
- run lintian on the source package
- build the binary package in a clean cowbuilder of up-to-date sid
- run lintian on the whole package
- run installation and functionality tests
Unfortunately, this time I kept stumbling over the very first step, creating the source package. And it was always the generation of the pristine tar file that broke. I was able to recreate the tarball on a Debian/stable system, which Jerome uses for generating the git repository, but not on my Debian/sid system. My suspicion is that something changed in the output of tar, but I do not know for sure.
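To illustrate why a change in tar output would matter here, the following is a minimal Python sketch, not a reproduction of what pristine-tar or the 3dldf repository actually does. It packs the same file contents three times with Python's standard `tarfile` module and shows that the archive hash depends on tar metadata and header format, not only on the contents. The helper names (`make_tarball`, `sha256`) and the file names are made up for the example.

```python
import hashlib
import io
import tarfile


def make_tarball(files, mtime, fmt):
    """Pack {name: bytes} into an in-memory, uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime  # fixed timestamp, so runs are repeatable
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def sha256(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical upstream tree, used only for illustration.
files = {
    "hello-1.0/README": b"hello\n",
    "hello-1.0/main.c": b"int main(void){return 0;}\n",
}

original = make_tarball(files, mtime=1_000_000_000, fmt=tarfile.GNU_FORMAT)
same_again = make_tarball(files, mtime=1_000_000_000, fmt=tarfile.GNU_FORMAT)
other_mtime = make_tarball(files, mtime=1_000_000_001, fmt=tarfile.GNU_FORMAT)
other_format = make_tarball(files, mtime=1_000_000_000, fmt=tarfile.PAX_FORMAT)

print("identical inputs match:  ", sha256(original) == sha256(same_again))
print("different mtime matches: ", sha256(original) == sha256(other_mtime))
print("different format matches:", sha256(original) == sha256(other_format))
```

Only the first comparison holds. Any tool that regenerates a tarball and then patches it with a stored delta has to start from predictable bytes, which is presumably where a different tar version on sid could get in the way.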
Searching a bit on the web brought up quite a lot of cases where pristine-tar is not able to recreate tarballs. There is even a branch in the upstream repo collecting those. Unfortunately, it seems that the upstream maintainer, who is also the Debian maintainer of the package, has given up working on it about a year ago.
I sent a few bug reports with debug details, but the only effect was that the maintainer orphaned the package.
So where are we now? Debian probably has hundreds, if not thousands, of git repositories using pristine-tar. We all now rely on an unmaintained, orphaned, and buggy piece of software. That is not good.
Bottom line for me: I will stop using pristine-tar for new projects, and will look into how to replace or fix my old repositories to get rid of pristine-tar. Sad, but true. The maintainer mentioned dgit in the orphaning email; I guess I have to check that out.
Stumbled across this via a discussion around an Arch package, no less. A bit late to the party. I for one won’t mourn the passing of pristine-tar. I have a lot of sympathy for what Joey says in point 2 – the subversive aspect of the program. IMHO release tarballs are a relic of how things used to work, and are not very relevant anymore. I go as far as to package git snapshots rather than releases so as to roll some of my patches, since accepted upstream, into the orig part of the Debian packaging and save having to have debian/patches/* and all that stuff.
Looking back at the problem that pristine-tar is trying to solve, it’s a way to confirm the integrity of the upstream source, built on the premise that we checksum only the orig tarball and not its contents. But there’s the problem. For where this is still important, if we stored checksums of the contents, we wouldn’t need all this archeology to reproduce the tarball after the fact. And in fact, if we’re using a VCS for our packaging and importing the upstream sources into it, this is pretty much done for us already.
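As a rough illustration of what "checksums of the contents" could look like, here is a small Python sketch. It is not something Debian tooling or pristine-tar does; `pack` and `content_digest` are hypothetical helpers, and the digest scheme (sorted member names plus per-file SHA-256) is just one assumed way to do it.

```python
import hashlib
import io
import tarfile


def pack(files, mtime):
    """Pack {name: bytes} into an in-memory, uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def content_digest(tar_bytes):
    """Hash file names and file contents only, ignoring tar metadata and order."""
    digest = hashlib.sha256()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        members = sorted(
            (m for m in tar.getmembers() if m.isfile()), key=lambda m: m.name
        )
        for member in members:
            data = tar.extractfile(member).read()
            digest.update(member.name.encode("utf-8") + b"\0")
            digest.update(hashlib.sha256(data).digest())
    return digest.hexdigest()


# Hypothetical upstream tree, used only for illustration.
files = {
    "hello-1.0/README": b"hello\n",
    "hello-1.0/main.c": b"int main(void){return 0;}\n",
}

first = pack(files, mtime=1_000_000_000)
second = pack(files, mtime=1_500_000_000)  # same contents, other timestamps
patched = pack({**files, "hello-1.0/README": b"changed\n"}, mtime=1_000_000_000)

print("tarball hashes equal: ",
      hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
print("content digests equal:", content_digest(first) == content_digest(second))
print("patched digest equal: ", content_digest(first) == content_digest(patched))
```

The two archives with different timestamps have different tarball hashes but the same content digest, while changing a single file changes the digest. A git tree hash gives a comparable guarantee for sources imported into a repository.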
Hi Jonathan,
thanks for your comment. Hmm, even if release tarballs are getting less and less used, we still have to prepare them for a -1 upload to Debian. And after that we have to use the very same one for building new versions. True, since uploads >> -1 don’t contain the orig tarball, we can use whatever we want, but we have to be sure that we build against the same files as we did with the -1 version.
True, having a git branch with upstream and clear tags is a good thing, but I am not aware of tools that allow us to circumvent the necessity to have an .orig tarball available when building from git.
So what is your modus operandi? I have most packages in git nowadays, with an upstream branch and proper tags.