This is an approach that can be used to safely remove bulky git history when it becomes troublesome for developers. However, it requires coordination across the whole team that uses the repository.
1. Cleaning up current state
To make sure there are no un-synced changes before rewriting history:
- Delete any unused branches in the shared version of the repository (if there is one)
- Have everyone commit their uncommitted work and push the changes
At this point, the central repository holds the authoritative copy, which is a good place to make changes and have everyone sync on them later on.
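Each collaborator can confirm that their clone has nothing left to push before the rewrite starts. The following is a minimal sketch, not a definitive check; `preflight` is a hypothetical helper name, and it assumes the shared repository is configured as a remote in the clone.

```shell
# Sketch of a pre-rewrite check, run inside a collaborator's clone.
# `preflight` is a hypothetical helper name, not a git command.
preflight() {
    # Uncommitted or untracked files appear in the porcelain output.
    if [ -n "$(git status --porcelain)" ]; then
        echo "uncommitted changes"
        return 1
    fi
    # Commits on local branches that no remote-tracking branch contains.
    if [ -n "$(git log --branches --not --remotes --oneline)" ]; then
        echo "unpushed commits"
        return 1
    fi
    echo "in sync"
}
```

Running `preflight` right after a `git fetch` gives the most reliable answer, since the remote-tracking branches are then up to date.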
2. Make the historical deletions on the repository
Deleting history makes the repository lighter, but every commit made after the offending files were introduced has to be rewritten, so all of those commits get new hashes.
2.1. Analyze what to delete
git-filter-repo is an amazing tool that can report how much storage is still used in the history by files and folders that have already been deleted.
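As a rough sketch of the analysis step, assuming git-filter-repo is installed and the command is run from the root of a fresh clone: the `--analyze` flag writes plain-text reports under `.git/filter-repo/analysis/` without modifying history. The wrapper name `analyze_history` is illustrative only.

```shell
# Sketch, assuming git-filter-repo is installed and on PATH.
# `analyze_history` is a hypothetical wrapper; run it from the repository root.
analyze_history() {
    # Writes reports only; history is left untouched.
    git filter-repo --analyze || return 1
    # Paths that no longer exist in the latest commit, with the space
    # they still occupy in history.
    head -n 20 .git/filter-repo/analysis/path-deleted-sizes.txt
}
```

The same directory also holds reports grouped by directory and by file extension, which can help when choosing what to filter.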
2.2. Execute filter
At this point, you can issue a command to actually modify the repository. As a safety measure, git-filter-repo will unlink the remote repository. What we’ll do is re-add it and push all the branches. Remember that this is risky. However, if it all goes sideways, you always have collaborators’ copies as a backup (they can always force-push all branches), or you can keep a separate clone of the repository made beforehand.
--path filters the repository down to the given path, keeping only that path. Adding --invert-paths inverts the filter, keeping everything except the listed paths.
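Put together, the filter and re-push steps might look like the sketch below. It assumes git-filter-repo is installed and that it is run in a fresh clone (the tool refuses to run otherwise unless forced). The function name `purge_path` and both of its arguments are placeholders, not values from this article.

```shell
# Sketch, assuming git-filter-repo is installed.
# `purge_path` is a hypothetical helper; both arguments are placeholders.
purge_path() {
    path_to_remove=$1
    remote_url=$2
    # Drop the path from every commit; --invert-paths keeps everything else.
    git filter-repo --path "$path_to_remove" --invert-paths || return 1
    # filter-repo unlinked the remote, so add it back before pushing.
    git remote add origin "$remote_url" || return 1
    # Overwrite all branches, then all tags, on the shared repository.
    git push --force --all origin || return 1
    git push --force --tags origin
}
```

A call could look like `purge_path assets/videos git@example.com:team/repo.git`, where both values are invented for illustration. Branches and tags are pushed in two separate commands because `--all` and `--tags` are not meant to be combined in a single push.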
3. Verify everything is fine
Do not skip this step.
Verify everything is alright, including:
- The repository has decreased in size (re-clone it)
- All the files are present (directory diff between backup and current repo)
- All branches are present
- All tags are present
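The file and ref checks above can be sketched as two small helpers. Both names are hypothetical, and the directory arguments stand for a backup clone taken before the rewrite and a fresh clone taken after it.

```shell
# Sketch: compare a pre-rewrite backup against a fresh post-rewrite clone.
# `compare_trees` and `list_refs` are hypothetical helper names.
compare_trees() {
    backup=$1
    fresh=$2
    # -r recurses, -q reports only which files differ, -x .git skips git metadata.
    diff -rq -x .git "$backup" "$fresh"
}

list_refs() {
    # Branch and tag names only; commit hashes are expected to change.
    git -C "$1" for-each-ref --format='%(refname:short)' \
        refs/heads refs/remotes refs/tags | sort
}
```

`compare_trees` should report only the paths that were purged on purpose, and the output of `list_refs` should be identical for both clones. For the size check, `du -sh .git` in each clone gives a quick comparison.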
4. Re-sync with team
For each person that needs a copy of the repository:
- Option 1: re-clone repository. This is the best option since it will give them a fresh version of the repository.
- Option 2: for each branch:
git fetch -fp
git checkout branch
git reset --hard origin/branch
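Option 2 can be looped over every local branch, as in the sketch below. `resync_branches` is a hypothetical helper name, and it assumes the remote is called `origin`. Note that `git reset --hard` discards local commits and uncommitted changes on each branch, so it should only be run once everything has been pushed as described in step 1.

```shell
# Sketch: reset every local branch to its rewritten remote counterpart.
# `resync_branches` is a hypothetical helper; the remote is assumed to be "origin".
resync_branches() {
    # Force-update remote-tracking branches and prune ones deleted upstream.
    git fetch -fp origin || return 1
    for branch in $(git for-each-ref --format='%(refname:short)' refs/heads); do
        # Skip local branches that do not exist on the remote.
        if git show-ref --verify --quiet "refs/remotes/origin/$branch"; then
            git checkout "$branch" && git reset --hard "origin/$branch"
        fi
    done
}
```

Local branches without a remote counterpart are left untouched; they still point at the old history and would need to be recreated or deleted by hand.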