How to Use S3 as a Private Git Repository

The files that generate the content for this website are stored in a git repository (for those that aren’t software people, “git” is a version control system that lets you keep a history of files in a logical way, including all changes over time). When I first started, I just kept it locally on my personal laptop (backed up to an external hard drive occasionally).

But what if I didn’t have my personal laptop and wanted to edit or create a post? Obviously I needed to keep the master copy of the source files on a server somewhere (“in the cloud”). This post covers the options I considered and the absurdly simple solution I settled on: keeping a private git repository in S3. This has been written up elsewhere, but given how ridiculously useful this is, it deserves more than just a tweet.

Finding a Solution

My first thought was actually GitHub. This is a hosted, social collaboration site for all kinds of software. While it would have worked, I would have had to pay if I wanted to keep content private (for drafts and such). Public (and free) didn’t make much sense either: this is a personal blog site and not really collaborative (yet).

Next up, I considered just putting a git repository on a server I can reach through Sonic.net, my old ISP, where I still have an account. The issue there is that the shell account isn’t always available. I also discovered I would have to install git myself, which suggests they weren’t really expecting me to do something like this. In any case, that account isn’t really meant for reliable, always-available backup.

Then I realized: I already have a massive, cheap, and very reliable place to store data, known as “S3”. The only sticking point was that I needed to “push” my local git repository to one hosted in an S3 bucket. I googled and found Jgit and a couple of blog posts on the subject.

Note: I do work for Amazon and since I was already using S3 to host the website itself, it was an obvious choice. However, I suspect this might work for other forms of “cloud storage”, assuming Jgit supports them or you can modify it to support your hosted store.

Using S3 as a Private Git Repository

Create an S3 bucket

First, you do need an AWS account and an S3 bucket. Since my website was originally named “r343l.com”, I named my git repository bucket “r343l.gitrepos”. S3 buckets have to have globally unique names so most of my buckets are prefixed this way. Steps:

  1. Log in to your AWS account.
  2. Go to the AWS management console and choose S3.
  3. Click “Create bucket” and give it a name. By default the bucket should only grant access to you, but I would double check.

Set up local credentials to access S3

This step is to get the AWS access ID and secret key you need to access the repository. For this example, we’re just using the “main” one for the account. If you were sharing the bucket with someone else, you might create some other principals and grant permissions on the bucket to those. But let’s assume we’re doing this the simple way. Note that once you have this file on your local machine, you should treat it like an SSH private key.

  1. Pull the drop-down for Security Credentials from the My Account / Console menu at the top-right.
  2. Create a file in your home directory (I’m on Mac and not sure where exactly this goes on Windows) called “.jgit_s3_public”. Make sure it’s only readable by you (permissions 600). The commands on a unix-ish system are touch ~/.jgit_s3_public; chmod 600 ~/.jgit_s3_public.
  3. In the AWS credentials page, scroll down until you see “Your access keys”. Click the “Show” part to get the secret key. Copy both into the created file as follows:
accesskey: [access ID for AWS]
secretkey: [secret key for AWS]
acl: private

Replace everything between and including the square brackets with the values you copied from the AWS credentials page. The “acl” line ensures that newly uploaded files stay private.
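
For anyone who prefers to do steps 2 and 3 in one go, here is a minimal sketch of creating the credentials file from the shell. The CRED_FILE variable and the REPLACE_WITH_... placeholders are my own additions for illustration, not anything Jgit requires. The only extra over the touch/chmod commands above is that the file is created under a restrictive umask, so it is never readable by others even for a moment.

```shell
# Path used in this post; CRED_FILE is a hypothetical override for experimenting.
CRED_FILE="${CRED_FILE:-$HOME/.jgit_s3_public}"

if [ -e "$CRED_FILE" ]; then
  # Never clobber keys that are already in place.
  echo "$CRED_FILE already exists; leaving its contents alone" >&2
else
  # Create the file in a subshell with umask 077 so it starts out owner-only.
  (
    umask 077
    cat > "$CRED_FILE" <<'EOF'
accesskey: REPLACE_WITH_ACCESS_KEY_ID
secretkey: REPLACE_WITH_SECRET_KEY
acl: private
EOF
  )
fi

# Same effect as the chmod in step 2; harmless if the mode is already 600.
chmod 600 "$CRED_FILE"
```

After running it, open the file in an editor and paste in the real values from the AWS credentials page.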

Install Jgit

Download the software. I chose the self-contained shell script version that contains both a script to run Jgit (it’s a Java application) and the Java code itself. It sits in my home “bin” directory and is aliased to the command “jgit”.
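
As a rough sketch of what "sits in my home bin directory" means in practice, the function below copies a downloaded script into ~/bin under the name jgit and makes it executable. The function name install_jgit, the JGIT_BIN_DIR variable, and the example file name jgit.sh are all hypothetical; use whatever name your download actually has. It assumes ~/bin is already on your PATH.

```shell
# install_jgit SRC: copy a downloaded jgit script to ~/bin/jgit and make it executable.
install_jgit() {
  src="$1"
  # JGIT_BIN_DIR is a hypothetical override; the default matches the post.
  bin_dir="${JGIT_BIN_DIR:-$HOME/bin}"

  if [ ! -f "$src" ]; then
    echo "no such file: $src" >&2
    return 1
  fi

  mkdir -p "$bin_dir"
  cp "$src" "$bin_dir/jgit"
  chmod 755 "$bin_dir/jgit"
  # Print where the command ended up.
  echo "$bin_dir/jgit"
}

# Example (hypothetical download location and file name):
# install_jgit ~/Downloads/jgit.sh
```

Copying the script under the name jgit has the same effect as an alias, and it also works in non-interactive shells.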

Make a git repository and upload to S3!

Let’s pretend your bucket is foo.gitrepos where all your personal git repositories will be stored. In my case, I have my website master in here, but another use might be for blog posts that get hosted at another site. You know how everyone says to edit outside of the blog software because your browser might crash? Well, even better than editing outside the blog software is saving those files to a separate place in case the hosted blog loses stuff! For this example, let’s pretend you have a tumblr.

cd ~/
mkdir myrants-tumblr
cd myrants-tumblr
# ... create some files
git init
git add *
git commit -m "my new files yay!"
git remote add s3 amazon-s3://.jgit_s3_public@foo.gitrepos/projects/myrants-tumblr
jgit push s3 refs/heads/master  ### NOTE: jgit not git!

You now have your files in S3!
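
The remote URL in the commands above packs three things together: the name of the credentials file, the bucket, and the path inside the bucket. As I understand it, Jgit looks the credentials file up by that name in your home directory. The helper below is only an illustration of that layout; s3_remote_url is a name I made up, not part of git or Jgit.

```shell
# s3_remote_url CRED_FILE BUCKET REPO_PATH
# Prints a URL of the form amazon-s3://CRED_FILE@BUCKET/REPO_PATH.
s3_remote_url() {
  cred_file="$1"
  bucket="$2"
  repo_path="$3"
  # Strip one leading slash from the path so the URL has exactly one separator.
  printf 'amazon-s3://%s@%s/%s\n' "$cred_file" "$bucket" "${repo_path#/}"
}

s3_remote_url .jgit_s3_public foo.gitrepos projects/myrants-tumblr
# prints amazon-s3://.jgit_s3_public@foo.gitrepos/projects/myrants-tumblr
```

Because the path is just a key prefix in the bucket, one bucket can hold as many repositories as you like under different paths.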

Getting your files out of S3

What if you’re on another computer and desperately want at these files? Well, all you have to do is install jgit and set up the client credentials as before. Then clone down the git repository:

jgit clone amazon-s3://.jgit_s3_public@foo.gitrepos/projects/myrants-tumblr

Then you can edit or add files as needed. Pushing it back up to S3 is the same as the initial push. Note that once you have a local repository set up in multiple places, you’ll probably at some point need to update them. With jgit the commands are:

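jgit fetch s3            ## gets updates from the S3 master (the remote is "origin" in a repository made with jgit clone)
git merge s3/master      ## or origin/master in a cloned repository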

Basically, use git for local commands that manipulate the local repository (adding, committing, merging) and jgit for any interactions that involve sending or receiving data from the S3 bucket.
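
To make that split concrete, here is a small sketch of a wrapper that does the fetch with jgit and the merge with plain git. The function name s3_pull is hypothetical. It assumes the remote is called s3 and the branch master, as in the examples above; pass other names if your setup differs (for instance origin in a cloned repository, or main if that is your branch name).

```shell
# s3_pull [REMOTE] [BRANCH]: fetch from S3 with jgit, then merge locally with git.
s3_pull() {
  remote="${1:-s3}"
  branch="${2:-master}"

  # jgit is the only tool that talks to the S3 bucket.
  # Stop if the fetch fails so we never merge stale refs.
  jgit fetch "$remote" || return 1

  # The merge only touches the local repository, so plain git handles it.
  git merge "$remote/$branch"
}

# Example:
# s3_pull            # same as: jgit fetch s3 && git merge s3/master
# s3_pull origin     # in a repository created with jgit clone
```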

That’s it! I hope this was helpful.
