Nikita Vasilyev

Jekyll on Amazon S3

This website runs on Jekyll and is deployed to Amazon S3. This is a guide to how I run development and staging environments, and how I gzip and upload only the files that changed.

Production and Development Configs

On production, I combine all my JS files into one and then minify it with Google Closure Compiler.
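
A minimal sketch of that production build step, assuming the Closure Compiler jar is on hand (the file names here are invented; the real script and paths may differ):

# Sketch: concatenate and minify the JS for production.
java -jar closure-compiler.jar \
  --js src/js/app.js --js src/js/nav.js \
  --js_output_file public/main.min.js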

On development, I include non-minified JS files one by one using plain old script elements. I also include extra debug scripts.

I use different Jekyll configs for production and development:
_config.yml for production deployment
dev_config.yml for local development

dev_config.yml is generated by merging _dev_config.yml (note the leading underscore) into _config.yml by running:

./tolya_deployer dev_config
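
The merge boils down to a shallow merge of two YAML hashes, with the dev values winning. Here's a sketch using Ruby's standard YAML library, since Jekyll already requires Ruby; this isn't the actual tolya_deployer code:

# Sketch: shallow-merge _dev_config.yml over _config.yml.
ruby -ryaml -e '
  base = YAML.load_file("_config.yml")
  dev  = YAML.load_file("_dev_config.yml")
  File.write("dev_config.yml", base.merge(dev).to_yaml)
'

Local development then presumably runs jekyll serve --config dev_config.yml.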

Jekyll and Modified Time

After building the site with Jekyll, I run a script to preserve the modified time of unchanged files.

Let’s say I publish a new post. This results in the following files being modified:

/jekyll-on-amazon-s3/index.html
/index.html

However, Jekyll regenerates the entire website, copying unmodified files along with everything else. In our example, source/main.css hasn’t changed, but Jekyll copies it anyway and sets its modified time to the last build time.

I wrote a script that reverts the modified time of Jekyll-copied files back to that of the source file. The script looks at public/main.css and checks whether it matches source/main.css. When it does, it transfers the modified time from the source file to the generated file.

./tolya_deployer jekyll
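
The core of that step fits in a few lines of shell. A sketch, assuming the source tree is source/ and the build output is public/ as in the example above (the real script may differ):

# Sketch: for each generated file that byte-matches its source,
# copy the source file's mtime onto the generated copy.
cd public
find . -type f | while read -r f; do
  src="../source/$f"
  if [ -f "$src" ] && cmp -s "$src" "$f"; then
    touch -r "$src" "$f"   # touch -r takes the timestamp from $src
  fi
done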

Git

The build directory, public/, is a separate Git repository. After each deploy I make a commit.

What’s the point of it?

I parse git status --porcelain=v1 output to see which files were added, deleted, or modified. I gzip only the added and modified files and copy them to staging/. I don’t gzip unchanged or removed files; those are copied to staging/ as-is.

I gzip files locally, using Zopfli with a high iteration count for minimal file size. This takes time. An early version of my deployment script gzipped every file before each deploy, and it was painfully slow.
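
Roughly, the gzip step could look like the sketch below. The status letters come from the porcelain format (A added, M modified, ?? untracked); the --i1000 iteration count is just an example of "high", and the staging/ layout is a guess:

# Sketch: compress only files git reports as added, modified, or untracked.
cd public
git status --porcelain=v1 | while read -r status path; do
  case "$status" in
    A|M|'??')
      mkdir -p "../staging/$(dirname "$path")"
      zopfli -c --i1000 "$path" > "../staging/$path"   # -c writes to stdout
      ;;
  esac
done

The real script also has to handle deletions, renames, and copying unchanged files over, which this sketch skips.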

The Git workflow has other benefits: git status and git diff show what’s about to be deployed, and git log shows the deployment history.

Deploying to Amazon S3

I use s3cmd:

s3cmd sync --verbose --no-mime-magic --cf-invalidate --acl-public staging/ s3://n12v.com/

The sync command works much like rsync.

--cf-invalidate creates an Amazon CloudFront invalidation request containing only the changed paths.

--no-mime-magic stops s3cmd from looking at file contents to guess the MIME type. Deciding the MIME type from the file extension is more predictable.

--acl-public makes the files accessible via HTTP.

s3cmd sync doesn’t know which files are gzipped and which aren’t, so while gzipping I store each path in an array. For each of those files, I set the Content-Encoding: gzip header:

s3cmd modify --add-header='Content-Encoding: gzip' s3://n12v.com/index.html
s3cmd modify --add-header='Content-Encoding: gzip' s3://n12v.com/jekyll-on-amazon-s3/index.html
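
In shell, that bookkeeping amounts to collecting paths into an array during the gzip step and replaying them after the sync; the two entries here are just the example files above:

# Sketch: gzipped_paths is filled in while compressing.
gzipped_paths=("index.html" "jekyll-on-amazon-s3/index.html")
for path in "${gzipped_paths[@]}"; do
  s3cmd modify --add-header='Content-Encoding: gzip' "s3://n12v.com/$path"
done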

I tried the official Amazon S3 & CloudFront CLI, but it didn’t seem to offer a convenient way to invalidate only the CloudFront paths I’d changed on S3.


Here’s the source code of my deployment script. Feel free to ask questions and share your Jekyll or Amazon S3 tips!
