Remove folder and its contents from git/GitHub’s history

This post is copied from a thread on Stack Overflow.

The --tree-filter option used in other answers can be very slow, especially on larger repositories with many commits, because it checks out every commit into a working tree before running the filter.

Here is a method that completely removes a directory from the git history using the --index-filter option, which runs much faster:

  • Make a fresh clone of YOUR_REPO
    git clone YOUR_REPO
    cd YOUR_REPO
  • Create tracking branches of all branches
    for remote in $(git branch -r | grep -v /HEAD); do git checkout --track "$remote"; done
  • Remove DIRECTORY_NAME from all commits, then remove the refs to the old commits (repeat these two commands for each directory you want to remove)
    git filter-branch --index-filter 'git rm -rf --quiet --cached --ignore-unmatch DIRECTORY_NAME/' --prune-empty --tag-name-filter cat -- --all
    git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
  • Ensure all old refs are fully removed
    rm -Rf .git/logs .git/refs/original
  • Perform a garbage collection to remove commits with no refs
    git gc --prune=all --aggressive
  • Force push all branches to overwrite their history (use with caution!)
    git push origin --all --force
    git push origin --tags --force
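
The local steps above (everything except the clone and the force push) can be collected into a small shell function. This is only a sketch under a few assumptions: the name purge_dir is hypothetical, the FILTER_BRANCH_SQUELCH_WARNING variable is only honoured by recent Git versions (2.24 and later) and is ignored by older ones, git reflog expire is used here in place of deleting .git/logs by hand, and a while-read loop replaces xargs so that nothing runs when no backup refs exist. It must be run from inside the fresh clone, with a clean working tree.

```shell
#!/bin/sh
# Sketch: purge one directory from every commit of the repository in the
# current working directory. "purge_dir" is a hypothetical helper name.
purge_dir() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "usage: purge_dir DIRECTORY_NAME" >&2
        return 1
    fi

    # Rewrite all refs, dropping "$dir" from the index of each commit.
    # Assumption: "$dir" contains no single quote (it is embedded in the filter).
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
        --index-filter "git rm -rf --quiet --cached --ignore-unmatch '$dir/'" \
        --prune-empty --tag-name-filter cat -- --all || return 1

    # Delete the backup refs that filter-branch leaves under refs/original/.
    git for-each-ref --format="%(refname)" refs/original/ |
        while read -r ref; do
            git update-ref -d "$ref"
        done

    # Swapped in for "rm -Rf .git/logs": expire every reflog entry so the
    # old commits are no longer referenced anywhere.
    git reflog expire --expire=now --all

    # Prune unreachable objects regardless of their age.
    git gc --prune=now --aggressive
}
```

Calling purge_dir DIRECTORY_NAME once per directory covers the rewrite and cleanup; the force pushes are deliberately left out so that they stay a separate, manual decision.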

You can check the size of the repository before and after the gc with:

    git count-objects -vH
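
To compare the two measurements without reading the full output, the sizes can be extracted as a single number. The sketch below assumes a hypothetical helper named repo_size_kib; it relies on the size and size-pack fields that git count-objects -v reports in KiB (loose objects and packfiles respectively) and adds them up.

```shell
#!/bin/sh
# Sketch: print the on-disk size of the object store of the current
# repository, in KiB. "repo_size_kib" is a hypothetical helper name.
repo_size_kib() {
    # "size" covers loose objects, "size-pack" covers packfiles.
    git count-objects -v |
        awk '/^size:/ || /^size-pack:/ { total += $2 } END { print total + 0 }'
}
```

Storing the result before the rewrite (before=$(repo_size_kib)) and again after the gc (after=$(repo_size_kib)) lets you print the difference with echo "$((before - after)) KiB freed".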

About Thibaud Kloczko

A graduate in CFD, Thibaud Kloczko is a software engineer at Inria. He is involved in the development of the meta-platform dtk, which aims to speed up the life cycle of business code within research teams and to share software components between teams from different scientific fields (such as medical and biological imaging, numerical simulation, geometry, linear algebra, computational neurology).
