Inconsistent version numbers in subversion
Update 2017-10-31: The developers of Subversion were a bit surprised, but it turned out to be a genuine bug. The bug was fixed the same day.
The TeX Live project uses Subversion as its main repository. There are many reasons for that; see my blog post on svn and git. But today I realized that subversion is broken, badly broken, because one can easily end up with inconsistent version numbers.
To understand why this is a problem, let me start from the blog post linked above: “…Our packages have revision numbers based on the latest change revision of all the contained files. Since subversion has a linear history and central repository, this guarantees that – in contrast to version numbers provided by authors – the revision numbers are strictly increasing…”.
Well, it turned out that this assumption was wrong, because it is easy to trick subversion into reporting wrong revision numbers for files. Consider the following flow:
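As a rough illustration of the rule quoted above, here is a minimal Python sketch. It is not the actual TeX Live tooling: the function name `package_revision`, the file names, and the revision numbers are all made up for the example. It only shows that the package revision is taken to be the highest last-changed revision among the package's files.

```python
def package_revision(file_revisions):
    """Return the highest last-changed revision among a package's files.

    file_revisions maps a file path to that file's last-changed revision.
    """
    if not file_revisions:
        raise ValueError("a package needs at least one file")
    return max(file_revisions.values())


# Hypothetical package: paths and revision numbers are illustrative only.
example_package = {
    "doc/README": 100,
    "tex/example.sty": 120,
}

print(package_revision(example_package))
```

The scheme only works if every up-to-date checkout reports the same last-changed revision for every file.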
- revision A: file xxx is changed
- revision A+n: file xxx is changed again
- revision A+n+m: file xxx is changed back to the state at revision A
In our case we have files listing other packages, and they are getting moved around at times. In this case one package was moved from one collection to another and back.
Now, if user A does svn up between revision A+n and A+n+m, then the final revision in his checkout of file xxx will be A+n+m. But a user B that does NOT do svn up between A+n and A+n+m will end up with a final revision of A for file xxx in his checkout. And boooom, revisions are different, although the files are in sync with the server.
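To make explicit what I expected, here is a toy model in Python. It is only a sketch under my own assumptions and has nothing to do with how Subversion stores data internally; the class `ToyRepo` and its methods are hypothetical. The point is that the last-changed revision of a file is a property of the repository history alone, so it should not depend on when a given checkout was updated.

```python
class ToyRepo:
    """Toy linear history: each commit changes one path (illustration only)."""

    def __init__(self):
        self.commits = []  # list of (revision, path, content)

    def commit(self, path, content):
        """Record a change and return the new revision number."""
        revision = len(self.commits) + 1
        self.commits.append((revision, path, content))
        return revision

    def last_changed_rev(self, path):
        """Highest revision in which `path` was changed."""
        revisions = [rev for rev, p, _ in self.commits if p == path]
        if not revisions:
            raise KeyError(path)
        return max(revisions)


repo = ToyRepo()
rev_a = repo.commit("xxx", "state 1")      # revision A
repo.commit("other", "unrelated change")
rev_b = repo.commit("xxx", "state 2")      # revision A+n
rev_c = repo.commit("xxx", "state 1")      # revision A+n+m, back to the old content

# Changing the file back is still a change, so the answer is A+n+m,
# no matter how often or how rarely a client has updated in between.
print(repo.last_changed_rev("xxx") == rev_c)
```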
I cannot grasp what the developers of subversion thought about here, but I consider it deeply broken by design. One more reason to rewrite all our distribution scripts to work with git instead of subversion.
The recommended way to trace releases is to create tags in the repository.
Please read the linked blog about what TeX Live is and how we use svn. We have about 5000 different packages with different release cycles in one svn repository, so tagging does not make sense here, in particular since we update some packages practically every day. Imagine all of Debian in one svn repository.
Have you ever tried it?
Tried what? Subversion? Well, yes, read the comments.
Is that clipart from Office 97?
Haha, good one. Could be indeed!
“I cannot grasp what the developers of subversion thought about here,”
I was not around when the system was designed, but as one of the remaining active SVN developers, I can assure you that you’re using SVN in a way it wasn’t meant to be used. Revision numbers were never intended to be used as public labels for released software. Sorry.
If you are interested in discussing this further, please post to users@subversion.apache.org. We are always happy to help solve process questions such as this.
Another note: If you changed your process such that it avoids mixed-revision working copies, it seems your concerns would be addressed. You will either have to enforce the ‘svn update’ step, or address the repository directly.
See related documentation here:
http://subversion.apache.org/faq.html#hidden-log
http://svnbook.red-bean.com/nightly/en/svn.basic.in-action.html#svn.basic.in-action.mixedrevs
Hope this helps.
We neither have nor use mixed-revision working copies. I compared two completely separate checkouts of the same central repository, one of which is updated daily (our master) and one that is updated only very rarely (mine, because I have another checkout via git-svn which I prefer).
Hi Stefan,
thanks for your comment. We have been using svn since 2005 (when we moved away from Perforce), with about 45000 commits, a repo size (.svn included) of 25G, and about 160000 files. For about 10 years we have used the revision numbers as a guarantee for increasing versions. The reason is that we aggregate about 6800 packages from different developers into a big “product” (TeX Live) and need to guarantee increasing versions, which the version numbers of the original authors do *not*.
Subversion works very nicely here, and in practice it is not a problem because we *always* build from the same checkout, which is updated every day at the same time. But sometimes, for testing new functionality, I do test runs at home, and I realized that the revision numbers of files are different from what is on the main checkout.
What I want to say is: even if revision numbers are not meant to be used as “public version numbers” or whatever, I *always* thought that revision numbers in subversion are consistent across checkouts; alas, they aren’t. And I stand by my words that this is not good.
I must admit I am still unsure what behaviour you are expecting exactly.
Do you expect that a file with content xxx will map to the same revision number as an old version of the same file (at an older revision) which also happened to contain content xxx?
Do you expect that a working copy will only ever contain data which maps to one particular revision?
Neither of these assumptions holds in Subversion’s design.
The first assumption does not hold because old revisions are immutable. If a file changes back to its prior content, this constitutes a change from the predecessor revision which means a new (immutable) revision must be created. Older revisions simply don’t matter in this context.
The second assumption breaks because mixed-revision working copies are a necessary part of Subversion’s design. If mixed-revision working copies did not exist, every commit operation would imply an immediate update of the *entire* working copy to keep every file and directory in the working copy at the same revision. Furthermore, Subversion was designed to replace CVS, which could also create working copies containing an arbitrary mix of revisions of files (cvs up -r1.x file). So this feature was retained to keep such use cases working after a transition from CVS to SVN.
Hi Stefan,
my assumption is quite simple: I assume two checkouts of the same central repository, both up to date, neither using mixed revisions, both `svn up`-ed at the same time, and neither containing any ignored or out-of-repo content. In this case, I assume that the “last committed revision” in the output of `svn status -v` for the same file agrees in both checkouts. But this is not the case. Here on my computer I get:
45627 39121 karl Master/tlpkg/tlpsrc/collection-langgreek.tlpsrc
while on the main checkout I get
45643 44192 karl Master/tlpkg/tlpsrc/collection-langgreek.tlpsrc
My assumption is purely that the “last committed version” is correct, which it either is not, or my interpretation of the English explanation is wrong.
Does that help you?
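For anyone who wants to run the same comparison, here is a minimal Python sketch. It assumes each line looks like the two lines quoted above (working revision, last committed revision, author, path) and that paths contain no spaces; the function names are my own and not part of any svn tooling. Lines that do not fit this shape, such as unversioned items, raise an error in this sketch.

```python
def parse_status_line(line):
    """Split one `svn status -v` style line into (path, working_rev, committed_rev).

    Assumes the last four whitespace-separated fields are:
    working revision, last committed revision, author, path.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("unexpected status line: %r" % line)
    working, committed, _author, path = fields[-4:]
    return path, int(working), int(committed)


def differing_last_committed(lines_a, lines_b):
    """Return {path: (rev_in_a, rev_in_b)} for paths whose last committed revisions differ."""
    a = {path: committed for path, _, committed in map(parse_status_line, lines_a)}
    b = {path: committed for path, _, committed in map(parse_status_line, lines_b)}
    return {p: (a[p], b[p]) for p in sorted(a.keys() & b.keys()) if a[p] != b[p]}


# The two lines quoted above: my checkout and the main checkout.
mine = ["45627 39121 karl Master/tlpkg/tlpsrc/collection-langgreek.tlpsrc"]
main = ["45643 44192 karl Master/tlpkg/tlpsrc/collection-langgreek.tlpsrc"]

for path, revs in differing_last_committed(mine, main).items():
    print(path, revs)
```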
Please let’s move this discussion to users@subversion.apache.org. There are so many misconceptions in the above comment that I very strongly recommend not cluttering up this comment thread with basic explanations. Note that “plain English” is definitely not the best description of “what I did in my command line.” So it’s possible that you’re talking past each other, but it’s also possible that something in the tool chain (git-svn perhaps?) is doing things it shouldn’t be.
Hi Branko, Stefan,
I have posted to the subversion user mailing list.
@Branko: I am not talking about git-svn, I am talking about basic subversion usage and the output of svn status -h and svn status -v. Let us continue on the ML.