Subversion vs. Git: Myths and Facts

There are a number of Subversion vs. Git comparisons around the web, and most of them are based on myths rather than facts. The list below is intended to bust some of these myths. Although it doesn't say which version control system is better, it should help you understand the actual state of affairs.

1. Git repositories are significantly smaller than equivalent Subversion ones

False. A myth.

The delta compression algorithms used by the two version control systems differ in many details, but in general Subversion and Git store data in the same way. As a result, Subversion and Git repositories with equivalent data have approximately the same size. The exception is a repository that stores a lot of binary files, where Subversion repositories can be significantly smaller than Git ones (because Subversion’s xdelta delta compression algorithm works for both binary and text files).

Example: repository size benchmarks

2. Branches are expensive in Subversion

False. A myth.

Branches in Subversion are implemented with a copy-on-write strategy (referred to as ‘Cheap Copies’ in the svnbook). No matter how large a repository or project is, it takes a constant amount of time and space to make a branch. In fact, Subversion branches have been extremely cheap since version 1.0, and you can branch even for small bug fixes in a very busy and large project.
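To get a feel for how little is involved, here is a minimal sketch of creating a branch. It assumes the svn and svnadmin command line tools are installed; the scratch repository under a temporary directory and the branch name bugfix-1234 are made up for illustration.

```shell
# Create a scratch repository (hypothetical location under a temp dir).
work=$(mktemp -d)
svnadmin create "$work/repo"
repo_url="file://$work/repo"

# Revision 1: the conventional trunk/branches layout.
svn mkdir -q -m "Create layout" "$repo_url/trunk" "$repo_url/branches"

# Revision 2: the branch itself, a server-side copy of trunk.
svn copy -q -m "Branch for a small bug fix" \
    "$repo_url/trunk" "$repo_url/branches/bugfix-1234"

# List the branches that now exist in the repository.
branches=$(svn list "$repo_url/branches")
echo "$branches"
```

The copy is made directly between two repository URLs, so no working copy is needed to create the branch.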

Example: branch creation benchmarks

3. It is required to manually specify the range of revisions when you merge two branches in Subversion

False. An outdated myth.

Starting with version 1.5 (released in June 2008), Subversion implements merge tracking, so manually specifying a revision range is no longer required. Moreover, Subversion 1.8 (released in June 2013) provides automatic reintegration merges that further simplify merging changes between branches.
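As a sketch of what this looks like in practice, the merge below is given only the source branch and no revision range. It assumes Subversion 1.8 or later is installed; the scratch repository, the branch name feature and the file name feature.txt are invented for illustration.

```shell
# Scratch repository with trunk and one feature branch (hypothetical names).
work=$(mktemp -d)
svnadmin create "$work/repo"
repo_url="file://$work/repo"
svn mkdir -q -m "Create layout" "$repo_url/trunk" "$repo_url/branches"
svn copy -q -m "Create feature branch" \
    "$repo_url/trunk" "$repo_url/branches/feature"

# Commit a change on the branch.
svn checkout -q "$repo_url/branches/feature" "$work/feature"
echo "new feature" > "$work/feature/feature.txt"
svn add -q "$work/feature/feature.txt"
svn commit -q -m "Add feature" "$work/feature"

# Merge the branch into trunk. No -r range is passed:
# merge tracking works out which revisions are still missing.
svn checkout -q "$repo_url/trunk" "$work/trunk"
cd "$work/trunk"
svn merge -q '^/branches/feature'
svn commit -q -m "Merge feature branch into trunk"

# The bookkeeping that merge tracking maintains on the merge target.
mergeinfo=$(svn propget svn:mergeinfo .)
echo "$mergeinfo"
```

The svn:mergeinfo property printed at the end records which branch revisions have already been merged into trunk.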

Additional information: Branching and merging described in SVNBook

4. There is an auxiliary .svn directory in each folder of a Subversion working copy

False. An outdated myth.

Starting with Subversion 1.7 (released in October 2011), working copies have centralized metadata storage, and there is a single .svn directory in the root of the working copy.
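A quick way to check this on your own machine is to count the .svn directories in a fresh checkout. The sketch below assumes Subversion 1.7 or later is installed; the scratch repository and the nested folder names are hypothetical.

```shell
# Scratch repository with a few nested folders (hypothetical names).
work=$(mktemp -d)
svnadmin create "$work/repo"
svn mkdir -q --parents -m "Nested folders" \
    "file://$work/repo/trunk/src/lib/util"

# Check out trunk and count the .svn directories in the working copy.
svn checkout -q "file://$work/repo/trunk" "$work/wc"
svn_dirs=$(find "$work/wc" -type d -name .svn | wc -l)
echo "Number of .svn directories: $svn_dirs"
```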

5. Nobody uses Subversion anymore

False. A myth.

At the time of writing, and despite all the marketing buzz related to Git, such notable open source projects as FreeBSD and LLVM continue to use Subversion as their main version control system. About 47% of other open source projects use Subversion too (while only 38% are on Git). The numbers are even better among companies, because Subversion is the de facto standard enterprise version control system. Moreover, every month a number of companies migrate to Subversion from version control systems such as ClearCase and TFS.

6. Distributed version control systems are inherently superior to centralized ones such as Subversion

False. There is parity.

Distributed version control systems (DVCS) are just another approach to implementing revision control. As always, different approaches have their pros and cons. DVCS may be great for certain projects, but they have a number of limitations that become roadblocks for others: no access control, a full copy of the repository on every computer, no exclusive file locks and so on.

7. Git scales well for larger projects

False. Larger projects end up split into a number of smaller repositories.

While Git is used for such renowned open source projects as the Linux kernel, it does not scale well for truly large projects. The Linux kernel repository takes about 2 GB of disk space, and it is acceptable to have a full copy of such a repository on each developer’s laptop. However, a problem arises when the repository size reaches hundreds of gigabytes. This leads to the typical strategy of splitting large projects into a number of smaller Git repositories and letting developers clone only the subset they need (see the list of GNOME repositories). The obvious drawbacks of this strategy are the additional maintenance burden, the loss of atomic whole-project commits, the inability to make consistent branches, etc.

In contrast, Subversion places no practical limit on repository size, and multiple projects can be stored in a monolithic repository without any restrictions. For example, all the projects of the Apache Software Foundation are stored in a single Subversion repository.

But the question is, do you really need to store multiple projects in a single monolithic repository (monorepo)? The Git community insists that large monolithic repositories have become redundant and should be split into multiple smaller repositories. However, companies such as Facebook and Google maintain their codebases in huge, monolithic repositories. There are a number of reasons for that: better code visibility, atomic large-scale refactorings, better dependency management and easier collaboration across teams. Read the article Why Google Stores Billions of Lines of Code in a Single Repository for further details.

8. Git scales well for larger teams

Certain workflow limitations exist.

While Git is successfully used for crowded open source projects with thousands of developers, such as the Linux kernel, it may not scale well for other large teams with different workflows. In Git, each developer must be up to date with the entire upstream repository before promoting changes from a private repository. If your team doesn’t follow the Integration-Manager or the Dictator and Lieutenants workflow, it will face a slowdown because promotion to a ‘blessed’ public repository is effectively serialized.

Thanks to mixed-revision working copies, Subversion allows better concurrent work because only the individual files in question must be up-to-date before promotion.

9. Merge operation is always painful in Subversion

Isolated problems exist.

In most cases, merges become painful in Subversion only if files or folders were renamed in the merged branches. For historical reasons, Subversion doesn’t properly track file and folder renames (mostly because file renames were rare before refactoring became widespread). The best practices for preventing tree conflicts during a merge are simple: limit file and folder renames in branches, and prefer to refactor code in the trunk. It is important to note that improved merging and better tree conflict handling are key features planned for the next Subversion release.

10. Git has leaky abstraction and crazy command line syntax

True. There are competing DVCS with better abstraction and clean syntax.

Git was initially designed as a low-level version control system, so it allows advanced users to do a lot of hacky things but does not provide enough safety and abstraction for beginners and average users. Git is also widely criticized for its poorly designed and chaotic command line syntax. This leads to a longer learning curve and can significantly increase the total cost of ownership for large teams with mixed levels of expertise.

However, a complicated abstraction model is not mandatory for a DVCS: the competing DVCS called Mercurial has a much more consistent abstraction model and provides a cleaner command line syntax. It's worth noting that Subversion is as easy and safe as any version control system should be.

Additional information: Mercurial vs Git: Why Mercurial?

11. Git history is not safe

True.

Git is officially described as a stupid content tracker, and it doesn’t care much about keeping a precise history of changes in your repositories. Features such as implicit file rename tracking and the ‘git rebase’ command make it hard to find out the true history of changes in your codebase.

In contrast, with Subversion you can always get exactly the same data from your repository as it was at any moment in the past. You can also easily trace all changes made to a particular file or folder, because Subversion history is permanent and always definite.

Example: losing history after rename in Git

12. Git does not provide granular read access control

True.

Because of the distributed nature of Git, each Git user has a full copy of the repository and effectively has complete read access to its entire content. While this approach is sufficient for open source repositories that rarely contain any confidential information, it may be unacceptable for most enterprise projects.

At the same time, Subversion provides a path-based authorization system that lets you control at a granular level who is authorized to read and modify files in the repositories, and it is sufficient even for large enterprise installations.
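For illustration, the sketch below writes a small path-based authorization file. The user names, group names and paths are invented, and how the file is wired into the server (for example through the authz-db option in svnserve.conf) depends on your setup.

```shell
# Write a sample authz file to a scratch directory (hypothetical content).
conf_dir=$(mktemp -d)
cat > "$conf_dir/authz" <<'EOF'
[groups]
developers = alice, bob
contractors = carol

# Developers can read and write the whole repository.
[/]
@developers = rw

# Contractors can work on one project only.
[/trunk/website]
@contractors = rw

# The confidential folder is hidden from everyone except alice,
# who may read it but not modify it.
[/trunk/finance]
* =
alice = r
EOF

cat "$conf_dir/authz"
```

A rule on a more specific path takes precedence over a rule on its parent, which is how the [/trunk/finance] section hides that folder from the other developers.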

13. Git is not friendly to binary files

True.

Modern version control systems are based on the assumption that most of the versioned files are mergeable. In other words, it should almost always be possible to merge two concurrent changes made to a single file. This model is called Copy-Modify-Merge and it is used in both Subversion and Git.

This assumption usually does not hold for binary files, which is why Subversion supports the alternative Lock-Modify-Unlock model (implemented by means of the svn lock command and the predefined svn:needs-lock property). Since Git is inherently distributed, it does not support exclusive file locks at all. This makes it hard to adopt Git for enterprise projects, where a lot of non-mergeable binary assets usually exist.
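The Lock-Modify-Unlock cycle can be sketched roughly as follows. The example assumes the svn and svnadmin tools are installed; the scratch repository, the file name logo.psd and the user name alice are made up for illustration.

```shell
# Scratch repository and working copy (hypothetical locations).
work=$(mktemp -d)
svnadmin create "$work/repo"
svn checkout -q "file://$work/repo" "$work/wc"
cd "$work/wc"

# Add a stand-in for a binary asset and mark it as requiring a lock.
printf 'binary placeholder' > logo.psd
svn add -q logo.psd
svn propset -q svn:needs-lock '*' logo.psd
svn commit -q -m "Add logo that requires locking"

# Lock: take the exclusive lock before editing.
svn lock --username alice -m "Editing the logo" logo.psd
locked_status=$(svn status logo.psd)    # lock column shows K
echo "$locked_status"

# ... modify the file here ...

# Unlock: release the lock so that others can edit the file.
svn unlock --username alice logo.psd
unlocked_status=$(svn status logo.psd)  # no lock reported any more
```

Committing a change to a locked file also releases the lock by default, so the explicit svn unlock is only needed when nothing is committed.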

Example: repository size benchmarks

The following example is intended to show that the size difference between Subversion and Git repositories is insignificant. It compares the size of the official WordPress codebase repository, which is powered by Subversion, with the size of its mirror hosted on GitHub.

The sizes of the Subversion and Git repositories are nearly the same: 186 MB in Subversion (35599 revisions) vs. 169 MB in Git (32647 revisions). The Git repository is only 17 MB smaller than the corresponding Subversion repository; however, it also has fewer revisions.

	Subversion	Git
Data Source	https://core.svn.wordpress.org/	https://github.com/wordpress/wordpress
Number of Revisions	35599	32647
Repository Size	195,153,948 bytes	177,922,471 bytes
Software Version	Subversion 1.9.2, 64 bit	Git 2.6.3, 64 bit
Comments	The repository was generated from the complete dump stream of the official WordPress repository. The repository has the Subversion 1.9 format with all the default settings.	The repository was simply cloned from GitHub. There are fewer revisions because part of the history is omitted in the Git mirror.

As you can see, the difference in repository size is truly insignificant: the Git repository is only about 9% smaller than the corresponding Subversion one. That’s not a surprise, because both version control systems use broadly the same data structures and algorithms to store data in repositories.
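The figures quoted in this section can be re-derived from the byte counts in the table. The snippet below is only a quick arithmetic check; the variable names are arbitrary.

```shell
# Repository sizes in bytes, taken from the table above.
svn_bytes=195153948
git_bytes=177922471

# Absolute difference, and the difference in tenths of a percent
# of the Subversion size (integer arithmetic only).
diff_bytes=$((svn_bytes - git_bytes))
diff_tenths=$((diff_bytes * 1000 / svn_bytes))

# Sizes in whole megabytes (1 MB = 1048576 bytes here).
svn_mb=$((svn_bytes / 1048576))
git_mb=$((git_bytes / 1048576))

printf 'Subversion: %d MB, Git: %d MB\n' "$svn_mb" "$git_mb"
printf 'Difference: %d bytes (%d.%d%%)\n' \
    "$diff_bytes" $((diff_tenths / 10)) $((diff_tenths % 10))
```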

Example: branch creation benchmarks

The following example is intended to show that making a branch in Subversion is very cheap in time and space, and that there is no significant difference compared to Git. It compares the time and disk space required to branch the trunk of the official WordPress codebase repository, which is powered by Subversion, with the time and disk space required to make a corresponding branch in its mirror hosted on GitHub.

	Subversion	Git
Data Source	https://core.svn.wordpress.org/	https://github.com/wordpress/wordpress
Size before branch	195,153,948 bytes	177,922,471 bytes
Size after branch	195,155,256 bytes (grows by 1,308 bytes)	177,922,831 bytes (grows by 360 bytes)
Time to create branch	0.093 s	0.031 s
Size after commit to the branch	195,157,201 bytes (grows by 1,945 bytes)	177,924,250 bytes (grows by 1,419 bytes)
Software Version	Subversion 1.9.2, 64 bit	Git 2.6.3, 64 bit
Details	The Subversion repository is located on the same computer as the working copy. Copy-on-write works incrementally.

You may have noticed that branch creation in Subversion takes 62 milliseconds longer than in Git; however, this is still less than the average duration of a human eye blink. There is also a difference in disk space: the Subversion branch takes about 1 kilobyte more than the Git one, which is negligible in the age of terabyte disks. Both differences are therefore of no practical significance.
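For readers who want to double-check the numbers, the differences follow directly from the table. This is just a quick arithmetic sketch with arbitrary variable names.

```shell
# Growth of each repository caused by creating the branch, in bytes.
svn_branch_bytes=$((195155256 - 195153948))
git_branch_bytes=$((177922831 - 177922471))

# How much more space the Subversion branch takes.
extra_bytes=$((svn_branch_bytes - git_branch_bytes))

# Difference in branch creation time, in milliseconds (93 ms vs. 31 ms).
extra_ms=$((93 - 31))

printf 'Subversion branch: %d bytes, Git branch: %d bytes\n' \
    "$svn_branch_bytes" "$git_branch_bytes"
printf 'Extra space: %d bytes, extra time: %d ms\n' "$extra_bytes" "$extra_ms"
```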

Example: losing history after rename in Git

Git tracks file renames implicitly, and it can be surprisingly easy to lose history for files that are committed with both a rename and a significant content change. The steps to reproduce are simple:

Init a new Git repository:
```
$ git init
```
Create a new text file with some initial content:
```
$ echo Initial content > file1.txt
```

Add the created file to Git and commit it with a simple message:
```
$ git add file1.txt
$ git commit -m "first change"
[master (root-commit) b7fe376] first change
 1 file changed, 1 insertion(+)
 create mode 100644 file1.txt
```

Rename the text file:
```
$ git mv file1.txt file2.txt
```
Replace the content of the renamed file:
```
$ echo Very new data > file2.txt
```

Commit the renamed file to Git:
```
$ git add file2.txt
$ git commit -m "second change"
[master 967f823] second change
 2 files changed, 1 insertion(+), 1 deletion(-)
 delete mode 100644 file1.txt
 create mode 100644 file2.txt
```

Examine the history of the renamed file and find out that ‘git log’ does not show the “first change” commit for the renamed text file:
```
$ git log --follow file2.txt
commit 967f8231ee59f4f0b97cff2ce72a152c74298820
Author: John Doe <you@example.com>
Date:   Tue Nov 17 10:13:07 2015 -0500
    second change
```

As you can see, the history is partially lost after the file rename, and there is no simple way to find out the previous name of the file or the changes made to its content.
