Git binary clean up
When I first started this blog, I simply started with experiments. The first iteration was WordPress, which was followed, very quickly, by Joomla. Neither of them lasted long. They are simply not for me.
I am lucky to be a part of a small group that started in #dgplug on Freenode. In this group, I have access to a lot of cool and awesome people who can put me to shame in development. On the flip side, I live by a motto that says:
Always surround yourself with people smarter than yourself.
It’s the best way to learn. Anyway, back to the topic at hand, they introduced me to static blog generators. That is where my journey started, but it started as a trial. I didn’t give too much thought to the repository. It moved from GitHub to GitLab and finally here.
But, of course, you know how projects go, right ?
Once you start with one, others closely follow, cropping up along the way. I put them on my TODO, literally. One of those items was that I had committed all the images to the repository. It wasn’t until a few days ago that I added a .gitattributes file. Shameful, I know.
No more ! Today it all changed.
First step first
Let’s talk about what we need to do a little bit before we start. Plan it out in our head before doing the actual work.
I will itemize them here to make it easy to follow:
- Clone a fresh repository to do the work in
- Remove all the images from the git repository
- Add the images again to git lfs
Sounds simple enough, doesn’t it ?
If you follow along with this blog post, here’s what you can expect.
- You WILL lose all the files you delete from disk, as well, so make a copy
- You WILL re-write history. This means that the SHA of every commit since the first image was committed WILL most likely change.
- You WILL end up essentially with a new repository that shares very few similarities with the original, so BACKUP!
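On top of copying the directory, a mirror clone is one way to keep the old history around. What follows is only a sketch: it builds a throwaway repository in a temporary directory so it can run anywhere, and every path and name in it is made up for the demo rather than taken from the real blog repository. Note that a mirror clone only preserves committed history, so untracked or uncommitted files still need a plain copy.

```shell
set -eu

# Throwaway sandbox; in practice "$repo" would be your real checkout.
work=$(mktemp -d)
repo="$work/blog"                 # hypothetical repository path
backup="$work/blog-backup.git"    # hypothetical backup path

git init -q "$repo"
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"
mkdir "$repo/images"
printf 'not really a png\n' > "$repo/images/demo.png"
git -C "$repo" add images
git -C "$repo" commit -q -m "Add an image"

# A mirror clone copies every ref and object, so the old history
# survives even after the working repository is rewritten.
git clone -q --mirror "$repo" "$backup"

original_head=$(git -C "$repo" rev-parse HEAD)
backup_head=$(git -C "$backup" rev-parse HEAD)
echo "original: $original_head"
echo "backup:   $backup_head"
```

If both lines print the same SHA, the backup holds the same history as the original at that point.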
Now that we got the warning out of the way, let’s begin the serious work.
Clone the repository
I bet you can do this with your eyes closed by now.
$ # Backup your directory !
$ mv blog.lazkani.io blog-archive
$ git clone [email protected]:Elia/blog.lazkani.io.git blog.lazkani.io
$ cd blog.lazkani.io
Easy peasy, lemon squeezy.
Remove images from history
Now, this is a tough one. Alright, let’s browse.
Oh what is that thing git-filter-repo ! Alright looks good.
We can install it in different ways; check the project documentation. What I did, in a Python virtual environment, was:
$ pip install git-filter-repo
BEWARE THE DRAGONS
git-filter-repo makes this job pretty easy to do.
$ git filter-repo --invert-paths --path images/
Parsed 43 commits
New history written in 0.08 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 17d3f5c Modifying a Nikola theme
Enumerating objects: 317, done.
Counting objects: 100% (317/317), done.
Delta compression using up to 2 threads
Compressing objects: 100% (200/200), done.
Writing objects: 100% (317/317), done.
Total 317 (delta 127), reused 231 (delta 88), pack-reused 0
Completely finished after 0.21 seconds.
That took almost no time. Nice !
Let’s check the directory and, sure enough, it no longer has images.
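Looking at the working directory only tells you about the latest commit. To check the history as well, one option is to count the commits that touch a given path. The sketch below assumes a small helper called commits_touching (a name invented here) and runs it against a throwaway repository with made-up content; on a repository that has been through git-filter-repo, the count for images/ should be 0.

```shell
set -eu

# Print how many commits, across all refs, touch the given path.
commits_touching() {
    git -C "$1" log --all --format=%H -- "$2" | wc -l | tr -d ' '
}

# Throwaway demo repository (hypothetical content).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"
mkdir "$repo/images" "$repo/posts"
printf 'fake image\n' > "$repo/images/demo.png"
printf 'hello\n' > "$repo/posts/first.rst"
git -C "$repo" add .
git -C "$repo" commit -q -m "First post"

# This demo repository was NOT filtered, so images/ still shows up once.
with_images=$(commits_touching "$repo" images/)
# A path that was never committed behaves like a fully removed one.
without_match=$(commits_touching "$repo" no-such-dir/)
echo "commits touching images/: $with_images"
echo "commits touching no-such-dir/: $without_match"
```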
Add the images back !
Okay, for this you will need git-lfs. It should be easy to find in your package manager. This is a Debian 10 machine so I did:
$ sudo apt-get install git-lfs
Before you commit to using git-lfs, make sure that your git server supports it.
If you have a pipeline, make sure it doesn’t break it.
I already stashed our original project like a big boy, so now I get to use it.
$ cp -r ../blog-archive/images .
Then we can initialize git-lfs.
$ git lfs install
Updated git hooks.
Git LFS initialized.
Okay ! We are good to go.
Next step, we need to tell git-lfs which files we care about. In my case, my needs are very simple.
$ git lfs track "*.png"
Tracking "*.png"
I’ve only used PNG images so far, so now that they are tracked you should see a .gitattributes file created if you didn’t have one already.
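To see what that tracking rule does, you can ask git which filter it would apply to a given path. The sketch below writes the line that git lfs track "*.png" adds to .gitattributes into a throwaway repository by hand, so it runs even without git-lfs installed; the file names are made up for the demo.

```shell
set -eu

repo=$(mktemp -d)
git init -q "$repo"

# The rule that `git lfs track "*.png"` writes to .gitattributes.
printf '*.png filter=lfs diff=lfs merge=lfs -text\n' > "$repo/.gitattributes"

# Ask git which filter applies to a path; the file itself need not exist.
png_filter=$(git -C "$repo" check-attr filter -- images/demo.png)
rst_filter=$(git -C "$repo" check-attr filter -- posts/demo.rst)
echo "$png_filter"
echo "$rst_filter"
```

The PNG path should report the lfs filter, while the other path should report the filter as unspecified, which means it stays a regular git file.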
From this step onward, git-lfs doesn’t differ too much from regular git. In this case, it went like this:
$ git add .gitattributes
$ git add images/
$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   .gitattributes
        new file:   images/local-kubernetes-cluster-on-kvm/01-add-cluster.png
        new file:   images/local-kubernetes-cluster-on-kvm/02-custom-cluster.png
        new file:   images/local-kubernetes-cluster-on-kvm/03-calico-networkProvider.png
        new file:   images/local-kubernetes-cluster-on-kvm/04-nginx-ingressDisabled.png
        new file:   images/local-kubernetes-cluster-on-kvm/05-customize-nodes.png
        new file:   images/local-kubernetes-cluster-on-kvm/06-registered-nodes.png
        new file:   images/local-kubernetes-cluster-on-kvm/07-kubernetes-cluster.png
        new file:   images/my-path-down-the-road-of-cloudflare-s-redirect-loop/flexible-encryption.png
        new file:   images/my-path-down-the-road-of-cloudflare-s-redirect-loop/full-encryption.png
        new file:   images/my-path-down-the-road-of-cloudflare-s-redirect-loop/too-many-redirects.png
        new file:   images/simple-cron-monitoring-with-healthchecks/borgbackup-healthchecks-logs.png
        new file:   images/simple-cron-monitoring-with-healthchecks/borgbackup-healthchecks.png
        new file:   images/weechat-ssh-and-notification/01-weechat-weenotify.png
Now that the files are staged, we shall commit.
$ git commit -v
[master 6566fd3] Re-adding the removed images to git-lfs this time
 14 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 images/local-kubernetes-cluster-on-kvm/01-add-cluster.png
 create mode 100644 images/local-kubernetes-cluster-on-kvm/02-custom-cluster.png
 create mode 100644 images/local-kubernetes-cluster-on-kvm/03-calico-networkProvider.png
 create mode 100644 images/local-kubernetes-cluster-on-kvm/04-nginx-ingressDisabled.png
 create mode 100644 images/local-kubernetes-cluster-on-kvm/05-customize-nodes.png
 create mode 100644 images/local-kubernetes-cluster-on-kvm/06-registered-nodes.png
 create mode 100644 images/local-kubernetes-cluster-on-kvm/07-kubernetes-cluster.png
 create mode 100644 images/my-path-down-the-road-of-cloudflare-s-redirect-loop/flexible-encryption.png
 create mode 100644 images/my-path-down-the-road-of-cloudflare-s-redirect-loop/full-encryption.png
 create mode 100644 images/my-path-down-the-road-of-cloudflare-s-redirect-loop/too-many-redirects.png
 create mode 100644 images/simple-cron-monitoring-with-healthchecks/borgbackup-healthchecks-logs.png
 create mode 100644 images/simple-cron-monitoring-with-healthchecks/borgbackup-healthchecks.png
 create mode 100644 images/weechat-ssh-and-notification/01-weechat-weenotify.png
Yes, I use -v when I commit from the shell, try it.
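If you want to double check that a committed image is an LFS pointer and not the image itself, the quickest route is git lfs ls-files, which lists the files git-lfs manages. Another option, sketched below, is to look at the first line of the committed blob: an LFS pointer is a small text file that starts with a version line. The helper name is_lfs_pointer is invented here, and the demo uses a hand-written stand-in pointer with a dummy oid in a throwaway repository, so it runs without git-lfs installed.

```shell
set -eu

# Succeeds when the blob committed at HEAD for the given path
# starts with the Git LFS pointer header.
is_lfs_pointer() {
    first_line=$(git -C "$1" cat-file -p "HEAD:$2" | head -n 1)
    [ "$first_line" = "version https://git-lfs.github.com/spec/v1" ]
}

repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"
mkdir "$repo/images"

# Hand-written stand-in for what git-lfs would commit; the oid is a dummy.
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%064d\nsize 42\n' 0 \
    > "$repo/images/pointer.png"
# A file committed the regular way, content stored directly in git.
printf 'plain bytes, stored directly in git\n' > "$repo/images/plain.png"

git -C "$repo" add images
git -C "$repo" commit -q -m "Add demo files"

if is_lfs_pointer "$repo" images/pointer.png; then pointer_kind=pointer; else pointer_kind=plain; fi
if is_lfs_pointer "$repo" images/plain.png; then plain_kind=pointer; else plain_kind=plain; fi
echo "images/pointer.png: $pointer_kind"
echo "images/plain.png: $plain_kind"
```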
The interesting part from the previous step is that git-filter-repo left us without a remote. As I said, this repository bears very little resemblance to the original one, so the decision made by git-filter-repo is correct.
Let’s add a new, empty repository as the remote of our new repository and push.
$ git remote add origin [email protected]:Elia/blog.lazkani.io.git
$ git push -u origin master
Locking support detected on remote "origin". Consider enabling it with:
  $ git config lfs.https://git.project42.io/Elia/blog.lazkani.io.git/info/lfs.locksverify true
Enumerating objects: 338, done.
Counting objects: 100% (338/338), done.
Delta compression using up to 2 threads
Compressing objects: 100% (182/182), done.
Writing objects: 100% (338/338), 220.74 KiB | 24.53 MiB/s, done.
Total 338 (delta 128), reused 316 (delta 127), pack-reused 0
remote: Resolving deltas: 100% (128/128), done.
remote: . Processing 1 references
remote: Processed 1 references in total
To git.project42.io:Elia/blog.lazkani.io.git
 * [new branch]      master -> master
Branch 'master' set up to track remote branch 'master' from 'origin'.
And the deed is done.
If you were extremely observant so far, you might’ve noticed that I used the same link again even though I said it was a new repository.
Indeed, I did. The old repository was renamed and archived here. A new one with the name of the previous one was created instead.
After I pushed the repository, you can notice the change in size. It’s not insignificant. I think it’s cleaner now. The 1.2MB size on the repository is no longer bothering me.
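If you would rather measure the size locally than read it off the server, git count-objects is one way to do it. The sketch below runs against a throwaway repository with made-up content; pointed at an old and a new clone, the size-pack line is the one to compare.

```shell
set -eu

repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"
printf 'hello\n' > "$repo/post.rst"
git -C "$repo" add post.rst
git -C "$repo" commit -q -m "First post"

# Pack loose objects first so size-pack covers the whole history.
git -C "$repo" gc --quiet

# -v prints the detailed report, -H makes the sizes human readable.
size_report=$(git -C "$repo" count-objects -vH)
echo "$size_report"
```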