I recently migrated my blog from Wordpress to GitHub Pages. Since this is the first post on the new platform, it seems fitting to write about the migration experience. If you are not familiar with GitHub Pages, its home page gives a nice overview of what it is and what you can do with it. The short version? It’s free hosting for your site, linked to your GitHub account. This means that when you push your code to your repo, your site is magically updated. This is really cool and convenient.

Creating a GitHub page

So what’s the first step? Go to http://www.github.com and create a new repository called “#{your_user_name}.github.io”. In my case that was “jlordiales.github.io”. This repository works just as any other repository you might have so you can pull, push and all the things you usually do. After doing that my new page was available at http://jlordiales.github.io

Next step, clone your new repository and show a hello world page to make sure your new page is working properly:

$ cd ~
$ git clone git@github.com:jlordiales/jlordiales.github.io.git
$ cd jlordiales.github.io
$ echo 'Hello World' > index.html
$ git add .
$ git commit -m "Initial commit"
$ git push -u origin master

Went to my browser, typed http://jlordiales.github.io and boom, a lovely hello world page. Let’s take a moment to reflect on what just happened. You created a regular git repository, you checked in an index.html file and your page was instantly updated and available to the whole world to see. All with just a git push.

A basic blog page

Now that you have a basic hello world page hosted and working, what should you do? Do you start hand-crafting HTML content? Probably not. You are probably going to use some type of framework to provide all the boilerplate that you don’t want to deal with. Depending on your needs and what you are trying to achieve with your new page you have several options: Bootstrap, Jekyll, Foundation and a bunch of others.

Since I wanted to have my blog hosted and Jekyll’s description says that “it is a simple, blog-aware, static site generator”, it seemed like that would be a good fit. I had never worked with Jekyll before so I went to their home page and saw the “Get up and running in seconds” snippet. Seemed easy enough. However, instead of doing a gem install jekyll I wanted to have a Gemfile with all my dependencies, so I created one in my repo folder with the following content:

source 'https://rubygems.org'

gem 'jekyll'

Followed by a quick

$ bundle install
$ jekyll new blog

That generated a skeleton structure for a Jekyll blog. I won’t go into much detail about each folder and each file because the Jekyll documentation already covers this pretty thoroughly. The thing that is worth noting is that after creating this base structure you can do a

$ jekyll serve --watch

And go to http://localhost:4000 to see your blog running locally. Furthermore you can start playing around modifying the Markdown files inside _posts, save them, refresh your browser and automatically see the changes reflected there. Even more interesting is the fact that you can now push all these new files to GitHub and see the live blog working at http://#{your_user_name}.github.io

What is happening under the hood is that when you push your code GitHub runs a jekyll build command on your repo. This command reads all your markdown files in your folder, together with some HTML and configuration files and generates a _site folder with static HTML that is served directly from GitHub.
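
To make that build step a bit more concrete, here is a minimal Ruby sketch of the idea. It is not Jekyll's actual implementation: the render_post helper, the sample post and the simplified placeholder substitution are all assumptions made for illustration. Real Jekyll also converts the Markdown body to HTML and evaluates full Liquid templates.

```ruby
require 'yaml'

# Hypothetical helper: a toy version of what a static site generator does.
# Real Jekyll also converts Markdown to HTML and runs a full Liquid engine.
def render_post(post_text, layout)
  # Front matter is the YAML between the first pair of "---" lines
  match = post_text.match(/\A---\s*\n(.*?)\n---\s*\n(.*)\z/m)
  raise ArgumentError, "post has no front matter" unless match

  page = YAML.safe_load(match[1]) || {}
  body = match[2]

  # Replace {{ page.key }} placeholders first, then {{ content }}
  html = layout.gsub(/\{\{\s*page\.(\w+)\s*\}\}/) { page[$1].to_s }
  html.gsub(/\{\{\s*content\s*\}\}/) { body.strip }
end

# Made-up sample post and layout
post = <<~POST
  ---
  title: Hello World
  comments: true
  ---
  This is my first post.
POST

layout = "<h1>{{ page.title }}</h1>\n<article>{{ content }}</article>"

puts render_post(post, layout)
```

The point is only that each post is data (front matter) plus content, and the layout is a template that both get poured into.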

Migrating my posts from Wordpress

At this point you have a functional blog where you can just write markdown files and Jekyll will transform that into HTML that is nicely rendered in your github page. If you are starting a new blog then you can just delete the example posts that are generated by Jekyll and start writing your own. However, if you are migrating from another blog like I was, then you probably want all your old posts, comments and metadata from your old blog in your new one.

In my case, all my old stuff was in http://jlordiales.wordpress.com. The first thing I was sure I wanted were my posts. I saw that the Jekyll documentation had a section on blog migration. I decided to give that a shot, so I followed the installation instructions and then went straight to the Wordpress.com section. I saw that I first needed to export all my Wordpress data using their export tool so I did that and I got my wordpress.xml file with all my posts and metadata (comments, tags, sections, etc.). With that file I ran the Jekyll importer and… it didn’t work. Well, it kind of worked:

  • it successfully imported my posts and images but instead of converting the Wordpress syntax into markdown it converted it directly into html

  • it didn’t convert all my source code snippets into the format expected by Jekyll’s default syntax highlighter Rouge

Since that didn’t work I looked again at the Jekyll documentation and saw that they recommend a couple of other approaches in case the Jekyll importer doesn’t work. I decided to give Exitwp a shot. The setup was pretty straightforward following the project’s README. I needed to have Python installed and the same wordpress.xml file that I had exported before from Wordpress. I ran the app and… it almost worked! Now all my posts were in Markdown format (including their tags and categories) except for my code listings. Definitely progress! Now, my source code snippets in Wordpress followed the format:

[sourcecode language="java"]
public class...
[/sourcecode]

If only I could tell Exitwp to look for these blocks and change them to:

{% highlight java %}
public class...
{% endhighlight %}

It turns out you can! There’s a config.yaml file where you can specify custom replacement regexes that Exitwp will search for and apply while converting your posts. All I had to do was to define them as:

body_replace: {
  '\[sourcecode language="java"\]':  '{% highlight java %}',
  '\[sourcecode language="bash"\]':  '{% highlight bash %}',
  '\[\/sourcecode\]': '{% endhighlight %}',
}
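
If you are curious what such a replacement table does to a post, here is a rough Ruby sketch of the same idea. Exitwp itself is a Python tool, so this is not its code: the BODY_REPLACE constant and the apply_replacements helper are hypothetical names, and the replacement strings are Jekyll's Liquid highlight tags.

```ruby
# Hypothetical Ruby equivalent of the body_replace idea: each key is a regex
# source, each value is the literal text that replaces every match.
BODY_REPLACE = {
  '\[sourcecode language="java"\]' => '{% highlight java %}',
  '\[sourcecode language="bash"\]' => '{% highlight bash %}',
  '\[\/sourcecode\]'               => '{% endhighlight %}'
}.freeze

# Applies every rule in turn to the post body and returns the converted text
def apply_replacements(body, rules = BODY_REPLACE)
  rules.reduce(body) do |text, (pattern, replacement)|
    text.gsub(Regexp.new(pattern), replacement)
  end
end

# Made-up sample post body
post_body = <<~BODY
  Some text.

  [sourcecode language="java"]
  public class Foo {}
  [/sourcecode]
BODY

puts apply_replacements(post_body)
```

Note that each language needs its own rule here, just like in the config above.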

And there I had it. All my posts were now in Markdown format, including my code snippets. I was missing one thing though, my comments.

Migrating my old comments

Jekyll being a static HTML generator, one of the things you cannot do out of the box is have comments on your posts. Luckily there are plenty of external providers that you can plug into your HTML with a simple JavaScript block. One of these providers (and one of the more popular ones) is Disqus.

Since I always used Wordpress’ own comments system I didn’t have a Disqus account. So the first thing for me was creating an account there. After you do that you get a Universal Code that you can add to any page. It looks something like this:

<div id="disqus_thread"></div>
<script type="text/javascript">
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
var disqus_shortname = ''; // required: replace example with your forum shortname

/* * * DON'T EDIT BELOW THIS LINE * * */
(function() {
 var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
 dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
 (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
 })();
</script>
<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>

Where do you put that code? Well, Jekyll has the concept of layouts, which are basically wrappers around your posts. The default post layout that you get with Jekyll when you do a jekyll new blog looks like this:

---
layout: default
---
<div class="post">

  <header class="post-header">
    <h1 class="post-title">{{ page.title }}</h1>
    <p class="post-meta">{{ page.date | date: "%b %-d, %Y" }}{% if page.author %} • {{ page.author }}{% endif %}{% if page.meta %} • {{ page.meta }}{% endif %}</p>
  </header>

  <article class="post-content">
    {{ content }}
  </article>

</div>

A simple header with the post title, author and date followed by the content of your post. Since all posts will use this same layout (assuming you want that) you could just copy paste the Disqus code in this layout and then all your posts would have comments enabled.

Even though that works, if you start doing the same for other services (Analytics, Social media, etc.) your post layout starts to get cluttered with a lot of unrelated stuff. What I did instead was to create a disqus_comments.html file inside the _includes directory created by Jekyll with the content I showed before. The HTML files in this directory are partials that can be included by your posts and layouts in order to facilitate reuse and, like in my case, to keep things cleaner. So now instead of adding all the Disqus universal code in our layout we can just say:

 {% if page.comments %}{% include disqus_comments.html %}{% endif %} 

The if page.comments part allows you to decide on a post by post basis if you want to enable comments or not. You only need to say comments: true or comments: false in your YAML front matter.

We now have Disqus comments integrated into our blogs but we still haven’t migrated our old comments from Wordpress to Disqus. Remember that wordpress.xml file that we exported from Wordpress when we were migrating our posts? That same file already has all our comments as well, we just need to import them into Disqus. Luckily that’s pretty easy to do. Disqus has an import tool where you can directly upload your xml file. Disqus will take care of getting all your comments and adding them to your account for you to display them on your new blog.

Two small caveats here that I didn’t know at first and I lost some time figuring out:

  1. Since I didn’t use a domain name with my Wordpress blog and comments are associated with the permalink of each post, comments on my new blog were not showing up because the domain name was different. The way I solved this was to open the wordpress.xml file and replace every occurrence of “jlordiales.wordpress.com” with “jlordiales.github.io”

  2. Related to the previous issue is the way that Jekyll generates the permalinks to be used by your posts. The default format is /:categories/:year/:month/:day/:title.html. In contrast, the Wordpress default permalink format is /:year/:month/:day/:title. Luckily, Jekyll lets you easily change this in your _config.yml file by adding a line with

        permalink: /:year/:month/:day/:title
       

After taking care of these two things and importing my wordpress.xml file again into Disqus with the updated URLs, everything worked! I had my old comments on my posts!
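
Both fixes boil down to making the old and new URLs line up. Here is a rough Ruby sketch of the two steps, under my own assumptions: rewrite_domain and jekyll_permalink are hypothetical helper names, and the sample XML snippet and post file name are made up.

```ruby
# Hypothetical helper: swap the old host for the new one everywhere in the export
def rewrite_domain(xml, old_host, new_host)
  xml.gsub(old_host, new_host)
end

# Hypothetical helper: the URL path you would expect for a post file when
# _config.yml has `permalink: /:year/:month/:day/:title`
def jekyll_permalink(post_filename)
  name = File.basename(post_filename, ".*")
  match = name.match(/\A(\d{4})-(\d{2})-(\d{2})-(.+)\z/)
  raise ArgumentError, "unexpected post name: #{post_filename}" unless match

  year, month, day, title = match.captures
  "/#{year}/#{month}/#{day}/#{title}"
end

# Made-up sample data
xml = '<link>http://jlordiales.wordpress.com</link>'
puts rewrite_domain(xml, "jlordiales.wordpress.com", "jlordiales.github.io")
puts jekyll_permalink("_posts/2014-02-09-hello-world.md")
```

For the real export you would read the file with File.read("wordpress.xml"), pass it through rewrite_domain and write the result back with File.write before uploading it to Disqus.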

Testing my blog

So what do you have so far? You have a GitHub repository with HTML, Markdown and YAML files. All these files are under version control and every time you push your changes you can see the result almost immediately in your blog, hosted for free on GitHub Pages. That sounds a lot like a typical application you might work on every day. In other words, a sort of “blog as code”. Except for one thing: tests!

What happens if I change the permalink format and push my changes? I lose all my comments. What if my markdown has a syntax error? Jekyll won’t be able to compile it into HTML and my post will never make it live. If I don’t set the author or date on my YAML front matter then the posts will render with the wrong metadata.

So how can we test this? We basically want to parse the different Markdown and YAML files and assert that a given group of attributes are present and set to the correct values. We can do this in a lot of different ways but given that Jekyll lives in the Ruby ecosystem a good option is to use Ruby with RSpec. As an added benefit, Ruby has a pretty simple and easy to use YAML parsing library.

So where do we start? Adding rspec to our Gemfile, installing it and adding a spec folder on the root of our project (at the same level as _posts) with:

$ echo "gem 'rspec'" >> Gemfile
$ bundle install
$ mkdir spec

A simple test to parse the YAML Front matter and make sure that comments are enabled could look like this:

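require 'yaml'
require 'date'

describe "Posts" do
  # Every Markdown file under _posts, including nested folders
  let(:posts) { Dir["_posts/**/*.md"] }

  it "should have comments enabled" do
    posts.each { |post| has_comments_enabled?(post) }
  end

  # Fails with a message that names the offending post
  def has_comments_enabled?(post_file)
    expect(front_matter(post_file)["comments"]).to eq(true),
      "Post #{post_file} does not have comments enabled"
  end

  # Parses the YAML between the first pair of "---" delimiters
  def front_matter(post_file)
    content = File.read(post_file)
    yaml_delimiter = "---"

    front_matter = content[/#{yaml_delimiter}(.*?)#{yaml_delimiter}/m, 1]
    # Date and Time are permitted so that date values in the front matter load
    YAML.safe_load(front_matter, permitted_classes: [Date, Time])
  end
end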

You can save this file as “spec/posts_spec.rb” for instance and then do a:

$ rspec

in the root folder to run it. Now you have an automatic way of making sure all your posts always have comments enabled. Very similar tests can be added for other attributes or to parse the _config.yml file to assert on the values of this file.
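
For example, the permalink problem mentioned earlier could be guarded with a check on _config.yml. This is only a sketch in plain Ruby: permalink_problems is a hypothetical helper, the two config strings are made-up samples, and the expected value is the permalink format used in this post.

```ruby
require 'yaml'

EXPECTED_PERMALINK = "/:year/:month/:day/:title"

# Hypothetical helper: returns a list of human-readable problems (empty if none)
def permalink_problems(config_yaml, expected = EXPECTED_PERMALINK)
  config = YAML.safe_load(config_yaml) || {}
  actual = config["permalink"]

  return ["permalink is not set in _config.yml"] if actual.nil?
  return [] if actual == expected

  ["permalink is #{actual.inspect}, expected #{expected.inspect}"]
end

# Made-up sample configs
good = "title: My blog\npermalink: /:year/:month/:day/:title\n"
bad  = "title: My blog\npermalink: pretty\n"

p permalink_problems(good)
p permalink_problems(bad)
```

Inside a spec you could call it with File.read("_config.yml") and expect the result to be empty, so that changing the permalink format by accident fails the build instead of silently detaching your comments.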

Conclusions

To conclude this post, I really recommend taking a look at GitHub Pages, Jekyll and its related tools. After a small initial setup you can be up and running in no time, with free hosting on GitHub. There are a lot of other things you can do, like using your own custom domain, Continuous Integration using Travis, different Jekyll themes, adding Google Analytics and a bunch of other stuff. All under proper version control and backed up by tests.