Nikita Vasilyev

Jekyll on Amazon S3

This website runs on Jekyll and is deployed to Amazon S3. This is a guide to how I run development and staging environments, and how I gzip and upload only the files that changed.

Production and Development Configs

On production, I combine all my JS files into one and then minify it with Google Closure Compiler.
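
A minimal sketch of that production build step, assuming the Closure Compiler jar is on hand (the file names here are invented; the real script and paths may differ):

# Sketch: concatenate and minify the JS for production.
java -jar closure-compiler.jar \
  --js src/js/app.js --js src/js/nav.js \
  --js_output_file public/main.min.js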

On development, I include non-minified JS files one by one using plain old script elements. I also include extra debug scripts.

I use different Jekyll configs for production and development:
_config.yml for production deployment
dev_config.yml for local development

dev_config.yml is generated by merging _dev_config.yml (note the leading underscore) into _config.yml by running:

./tolya_deployer dev_config
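
The merge boils down to a shallow merge of two YAML hashes, with the dev values winning. Here's a sketch using Ruby's standard YAML library, since Jekyll already requires Ruby; this isn't the actual tolya_deployer code:

# Sketch: shallow-merge _dev_config.yml over _config.yml.
ruby -ryaml -e '
  base = YAML.load_file("_config.yml")
  dev  = YAML.load_file("_dev_config.yml")
  File.write("dev_config.yml", base.merge(dev).to_yaml)
'

Local development then presumably runs jekyll serve --config dev_config.yml.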

Jekyll and Modified Time

After building the site with Jekyll, I run a script to preserve the modified time of unchanged files.

Let’s say I publish a new post. This results in the following files being modified:

/jekyll-on-amazon-s3/index.html
/index.html

However, Jekyll regenerates the entire website, copying unmodified files along with everything else. In our example, source/main.css hasn’t changed, but Jekyll copies it anyway and sets its modified time to the last build time.

I wrote a script that reverts the modified time of Jekyll-copied files back to that of the source file. The script looks at public/main.css and checks whether it matches source/main.css. When it does, it transfers the modified time from the source file to the generated file.

./tolya_deployer jekyll
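
The core of that step fits in a few lines of shell. A sketch, assuming the source tree is source/ and the build output is public/ as in the example above (the real script may differ):

# Sketch: for each generated file that byte-matches its source,
# copy the source file's mtime onto the generated copy.
cd public
find . -type f | while read -r f; do
  src="../source/$f"
  if [ -f "$src" ] && cmp -s "$src" "$f"; then
    touch -r "$src" "$f"   # touch -r takes the timestamp from $src
  fi
done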

Git

The build directory, public/, is a separate Git repository. After each deploy I make a commit.

What’s the point of it?

I parse git status --porcelain=v1 output to see which files were added, deleted, or modified. I gzip only the added and modified files and copy them to staging/. I don’t gzip unchanged or removed files; those are copied to staging/ as-is.

I gzip files locally, using Zopfli with a high iteration count for minimal file size. This takes time. An early version of my deployment script gzipped every file before each deploy, and it was painfully slow.
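
Roughly, the gzip step could look like the sketch below. The status letters come from the porcelain format (A added, M modified, ?? untracked); the --i1000 iteration count is just an example of "high", and the staging/ layout is a guess:

# Sketch: compress only files git reports as added, modified, or untracked.
cd public
git status --porcelain=v1 | while read -r status path; do
  case "$status" in
    A|M|'??')
      mkdir -p "../staging/$(dirname "$path")"
      zopfli -c --i1000 "$path" > "../staging/$path"   # -c writes to stdout
      ;;
  esac
done

The real script also has to handle deletions, renames, and copying unchanged files over, which this sketch skips.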

The Git workflow has other benefits: git status and git diff show what’s about to be deployed, and git log shows the deployment history.

Deploying to Amazon S3

I use s3cmd:

s3cmd sync --verbose --no-mime-magic --cf-invalidate --acl-public staging/ s3://n12v.com/

The sync command works much like rsync.

--cf-invalidate creates an Amazon CloudFront invalidation request containing only the changed paths.

--no-mime-magic stops s3cmd from looking at file contents to guess the MIME type. Deciding the MIME type from the file extension is more predictable.

--acl-public makes the files accessible via HTTP.

s3cmd sync doesn’t know which files are gzipped and which aren’t, so while gzipping I store each path in an array. For each of those files, I set the Content-Encoding: gzip header:

s3cmd modify --add-header='Content-Encoding: gzip' s3://n12v.com/index.html
s3cmd modify --add-header='Content-Encoding: gzip' s3://n12v.com/jekyll-on-amazon-s3/index.html
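
In shell, that bookkeeping amounts to collecting paths into an array during the gzip step and replaying them after the sync; the two entries here are just the example files above:

# Sketch: gzipped_paths is filled in while compressing.
gzipped_paths=("index.html" "jekyll-on-amazon-s3/index.html")
for path in "${gzipped_paths[@]}"; do
  s3cmd modify --add-header='Content-Encoding: gzip' "s3://n12v.com/$path"
done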

I tried the official Amazon S3 & CloudFront CLI, but it didn’t seem to offer a convenient way to invalidate only the CloudFront paths I’d changed on S3.


Here’s the source code of my deployment script. Feel free to ask questions and share your Jekyll or Amazon S3 tips!
