Maybe it’s due to working mainly in small shops, but I’ve had to merge and split git repositories a number of times in the past few years (usually related to folks pushing for microservice architecture, and then changing their minds).
Fortunately, git allows us to do both operations without losing the history.
Needless to say, those operations should not be part of your daily flow, but they’re good to have in your toolbox if you ever find yourself in a situation that requires them.
Some of the operations described here, especially when it comes to splitting, are extremely destructive as they can rewrite your git history. As a result, other contributors may not be able to push or send pull requests anymore.
Ideally, these operations should be done on fresh clones, and pushed to new remotes.
Let’s say you have 2 local repositories that you want to merge, called A and B, and you want to copy the history of A into B.
Step 1: Organization
In order to save yourself some headaches, it’s better to plan ahead how you want the final repository layout to be. You’ll also want to move files around to avoid any conflicts.
For simplicity, I want the end result to be a single repository containing src/A and src/B.
In the A repository, using whatever tools you want, create the src/A directory and move everything in there.
After that, execute git add -A to update the index and make sure you commit the changes.
Repeat the process for repository B, this time moving everything into src/B.
And now everything should be where you want it to be.
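Here is a rough sketch of what the reorganization can look like in a shell. It builds a throwaway repository standing in for A, so the temp directory, the README.md file, the demo identity and the commit messages are all made up for illustration. I use git mv here, but as mentioned above, any tool will do.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Identity for the throwaway commits below (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Throwaway repository standing in for A (hypothetical content).
work="$(mktemp -d)"
git init -q "$work/A"
cd "$work/A"
echo "hello from A" > README.md
git add -A
git commit -q -m "Initial commit in A"

# Create the target directory, then move every top-level tracked entry into it.
# Assumption: nothing tracked is already called "src" in this repository.
mkdir -p src/A
git ls-tree --name-only HEAD | while IFS= read -r entry; do
  git mv "$entry" "src/A/$entry"
done

# Update the index and commit, so the move is part of the history.
git add -A
git commit -q -m "Move everything under src/A"

git ls-files
```

For repository B, the same commands apply with src/B as the target.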
Step 2: Pull
From the B repository, you will need to git pull with the --allow-unrelated-histories flag to, surprise, allow the unrelated histories of both repositories to be merged together.
If both repositories sit side by side in the same parent directory, you can use this:
git pull --allow-unrelated-histories ../A
Once that’s done, check the content of your repository and your git log to make sure that everything is good, and then you are done!
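To see the whole thing end to end, here is a small sketch that builds two unrelated throwaway repositories and pulls one into the other. The temp directory, file names and demo identity are hypothetical. The --no-edit and --no-rebase flags are my additions so the demo runs without opening an editor and without depending on your pull configuration; recent git versions may refuse to pull divergent histories unless you tell them whether to merge or rebase.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Identity for the throwaway commits below (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"

# Build two unrelated repositories, each already organized under src/<name>.
for name in A B; do
  git init -q "$work/$name"
  mkdir -p "$work/$name/src/$name"
  echo "hello from $name" > "$work/$name/src/$name/README.md"
  git -C "$work/$name" add -A
  git -C "$work/$name" commit -q -m "Initial commit in $name"
done

# From inside B, pull in the history of A with a merge commit.
cd "$work/B"
git pull -q --no-rebase --no-edit --allow-unrelated-histories ../A

# Check the result: both directories and both histories should be there.
ls src
git log --oneline --graph
```

The log should show the commits from both repositories joined by a single merge commit.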
In this scenario, we’ll be reverting the changes we’ve done above, going from 1 repository containing src/A and src/B back to 2 repositories, with the content at the root of each one.
Danger Will Robinson
Splitting a repository is a highly destructive operation. Losing content from your repository is not just possible; it’s basically how the tool works. It rewrites the history and wipes out any trace of the items you have taken out.
Even when things go well, and you can stitch the repositories back into one using the steps above, commits that originally touched both parts of the repository will stay split into separate commits. That data is gone forever.
For this operation, we’ll be using a 3rd party tool called git filter-repo. Note that this tool is a recommendation from the git filter-branch man pages.
Install it for your environment and we’ll be good to go.
To make things easier, make 2 fresh clones of your repository. Each clone will be rewritten with only the content we want.
Using git filter-repo
Open your command prompt/shell/terminal at the root of your first clone and run:
git filter-repo --subdirectory-filter src/A
This will remove everything that is NOT under src/A and move the content to the root of the repository.
Repeat the same operation for your second clone, changing src/A to src/B, and you are done!
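Put together, the split can be sketched like this. Everything here runs against a throwaway repository in a temp directory; the directory names, files and demo identity are hypothetical, and the script assumes git filter-repo is already installed (it only prints a message if it is not). I pass --no-local to git clone because, as far as I can tell, a plain clone of a local path may not pass filter-repo's fresh clone check.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Identity for the throwaway commits below (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"

# Build a throwaway combined repository containing src/A and src/B.
git init -q "$work/combined"
for name in A B; do
  mkdir -p "$work/combined/src/$name"
  echo "hello from $name" > "$work/combined/src/$name/README.md"
  git -C "$work/combined" add -A
  git -C "$work/combined" commit -q -m "Add src/$name"
done

if git filter-repo --version >/dev/null 2>&1; then
  for name in A B; do
    # One fresh clone per target repository; --no-local forces a real copy.
    git clone -q --no-local "$work/combined" "$work/$name"
    # Keep only src/<name> and move its content to the root of the clone.
    (cd "$work/$name" && git filter-repo --subdirectory-filter "src/$name")
  done
  ls "$work/A" "$work/B"
else
  echo "git filter-repo is not installed; nothing was rewritten" >&2
fi
```

From there, each clone can be pushed to its own new remote, leaving the original repository untouched.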
These are some of the ways I’ve used in the past to split and merge git repositories. What are yours? Hit me up on Twitter!