My personal Hakyll cheatsheet

August 20, 2020

1 Hakyll ❤ Pandoc

Since I no longer write Haskell professionally, it takes me longer to get into the right rhythm. This post is intended mostly for my future self and should serve as a cheatsheet for Hakyll development. I’ve already written a high-level overview of how to edit content and build the website in this README. Here I want to go deeper into how to construct new compilers and how to apply them, together with a context, to templates.

Under the hood, Hakyll integrates natively with Pandoc, the Swiss Army knife of file converters. Pandoc is also written in Haskell, converts between a wide variety of file formats and can be extended with custom Lua filters. All of this comes with some configuration complexity, so below I also discuss the configuration and extensions used to enable auto-generated tables of contents and $\LaTeX$ math support on this website.

2 Website rules

2.1 Matching clause, routes and compilers

Writing rules in the DSL¹ is simple enough. A rule needs three things to be valid: a matching clause for the source file, a route for the compiled output, and the compiler itself, which is responsible for the transformation from input to output.

-- static css
match "css/**.css" $ do
   route   $ idRoute
   compile $ compressCssCompiler

We match all files with the .css extension in the css directory and its subdirectories (had we used * instead of **, only files directly in the css directory would match). The idRoute preserves the filename as is, but other routes may change the extension or do something more complex. For example, we want to match files with the .scss extension, compile them and write the compressed result to a file with the .css extension for browser compatibility.

match ("css/**.scss" .&&. complement "css/**_*.scss") $ do
   route   $ setExtension "css"
   compile $ sassCompiler
      >>= return . fmap compressCss
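
The effect of the extension route on an output path can be previewed without running Hakyll. The following is a minimal sketch, assuming that setExtension behaves like replaceExtension from System.FilePath applied to the source path. cssOutputPath and the sample paths are hypothetical names used only for illustration.

```haskell
import System.FilePath (replaceExtension)

-- Hypothetical helper: the output path an extension route to "css"
-- is assumed to produce for a given source path.
cssOutputPath :: FilePath -> FilePath
cssOutputPath source = replaceExtension source "css"

main :: IO ()
main = mapM_ (putStrLn . describe) ["css/main.scss", "css/pages/blog.scss"]
   where
      describe source = source ++ " -> " ++ cssOutputPath source
```

Only the extension changes; the directory part of the path is left untouched, which is why the compiled stylesheet ends up next to where the .scss source lived.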

In some cases, instead of matching a source file, we create an output directly. This is the case with index.html, which is dynamically generated from the latest blog posts.

-- index
create ["index.html"] $ do
   route   $ idRoute
   compile $ makeItem ""
      >>= loadAndApplyTemplate "templates/blog-list.html" (blogCtx 1 pages categories tags)
      >>= loadAndApplyTemplate "templates/default.html" defaultCtx 
      >>= indexCompiler
      >>= relativizeUrls

2.2 Templates and their contexts

Optionally, the compiler may apply a template to its input and interpolate variables from a context. We do so by chaining multiple compilers together into a more feature-rich compiler. The output of the previous compiler is interpolated into the $body$ variable of the template. Other variables are made available to the template by defining them in the context. Before templates can be loaded, they need to be compiled using the templateCompiler.

-- static pages
match "*.md" $ do
   route   $ pageRoute
   compile $ pandocCompiler
      >>= loadAndApplyTemplate "templates/page-detail.html" defaultCtx
      >>= loadAndApplyTemplate "templates/default.html" defaultCtx
      >>= relativizeUrls

-- templates
match "templates/*.html" $  -- note that the order does not matter
   compile $ templateCompiler

Contexts are monoids and can therefore easily be combined into bigger contexts. When several contexts define the same field, the leftmost one that yields a value (i.e. does not fail the way mempty does) takes precedence.

pageTitleField :: String -> Context String
pageTitleField key = 
   aliasContext alias metadataField <> -- use page title from metadata
   pathTitleField key               <> -- or read from the path
   constField key "Crypto and Code"    -- alternatively use this
   where
      alias x | x == key = "title"
      alias x            = x
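
The precedence rule can be tried out without Hakyll. The sketch below is a simplified stand-in and not Hakyll's actual Context type: Ctx, constCtx and lookupCtx are hypothetical names, and a lookup returns Maybe String instead of running in the Compiler monad.

```haskell
import Control.Applicative ((<|>))

-- Simplified stand-in for a context: a lookup from key to value that
-- may fail (Nothing plays the role of a failing compiler).
newtype Ctx = Ctx { lookupCtx :: String -> Maybe String }

instance Semigroup Ctx where
   Ctx f <> Ctx g = Ctx $ \k -> f k <|> g k  -- left operand wins if it succeeds

instance Monoid Ctx where
   mempty = Ctx (const Nothing)               -- defines no fields at all

-- A context that defines exactly one key.
constCtx :: String -> String -> Ctx
constCtx key value = Ctx $ \k -> if k == key then Just value else Nothing

main :: IO ()
main = do
   let ctx = constCtx "title" "From metadata"
          <> constCtx "title" "Fallback"
          <> constCtx "author" "me"
   print (lookupCtx ctx "title")    -- Just "From metadata"
   print (lookupCtx ctx "author")   -- Just "me"
   print (lookupCtx ctx "missing")  -- Nothing
```

This mirrors pageTitleField above: the fallback title is only consulted when every context to its left fails to produce a value.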

A special case of the string context is the function context, which defines a variable whose value is a function that can be called in the template.

<header>
  $partial("templates/header.html")$
</header>

<p>
  $countwords(body)$
  $sum(1,2,3)$
</p>

2.3 Metadata, flow control and list iteration

Metadata can be placed in the front matter of the markdown file, formatted as YAML. The metadata is available to the compiler, and also to the template provided that metadataField :: Context a is part of the context applied to the template. Tags can be given either as a comma-separated string or as a valid YAML list.

---
title: This is the blog title
tags: tag1, tag2
# alternatively, the same tags as a YAML list:
# tags:
#    - tag1
#    - tag2
summary: |
  Introduction to the blog content
---

This is the blog body

Special function variables are used to define conditional branches and to iterate over items. The variable names available within the $for(blogs)$ loop are defined by the context of the blogs list field. Everything between $sep$ and $endfor$ is used as a separator between items, i.e. it is not emitted after the last element.

<div class="blog-list">
$for(blogs)$
  <div class="blog-item">
    $title$
    $if(blog.author)$
      $blog.author$
    $else$
      Unknown author
    $endif$
  </div>
  $sep$
  <hr />
$endfor$
</div>

3 Advanced features

With pandocCompilerWith, reader and writer options can be passed in to change the behavior of Pandoc. In addition, extensions can be enabled to unlock additional features.

pandocCompilerWith :: ReaderOptions -> WriterOptions -> Compiler (Item String)

blogCompiler :: Compiler (Item String)
blogCompiler = do
   ident <- getUnderlying
   toc   <- getMetadataField ident "withtoc"
   pandocCompilerWith blogReaderOptions (maybe defaultOptions blogOptions toc)
   where
      defaultOptions = defaultHakyllWriterOptions
      blogOptions = const blogWriterOptions

3.1 Auto-generated table of contents

-- blogWriterOptions configures pandoc to include a table of contents
-- and uses MathJax to render math.
blogWriterOptions :: WriterOptions
blogWriterOptions = 
   defaultHakyllWriterOptions
      {
        writerHTMLMathMethod = MathJax ""
      , writerTableOfContents = True
      , writerNumberSections  = True
      , writerTOCDepth        = 2
      , writerTemplate        = 
         let
            toc = "$toc$" :: String
            body = "$body$" :: String
         in
            Just . renderHtml $ do
               H.div ! class_ "toc" $ do
                  toHtml toc
               toHtml body
      }

3.2 $\KaTeX\ $ to render $\LaTeX$ math

Pandoc emits MathJax-compatible markup when its HTML math method is set to MathJax. KaTeX can render that output once its JavaScript and CSS are embedded in the page.

<link rel="stylesheet" href="/katex/katex.min.css">
<script defer src="/katex/katex.min.js"></script>
<script type="text/javascript" defer src="/katex/auto-render.min.js" 
  onload="renderMathInElement(document.body);"></script>

Equations can be placed on their own lines.

$$ \ln x = \int_{-\infty}^x \frac 1 y \, dy $$

\[ \ln x = \int_{-\infty}^x \frac 1 y \, dy \]

becomes

\[ \ln x = \int_{-\infty}^x \frac 1 y \, dy \]

Writing $x \equiv a \pmod{b}$ or $ x \equiv a \pmod{b} $ produces $x \equiv a \pmod{b}$.

blogReaderOptions :: ReaderOptions
blogReaderOptions = 
   defaultHakyllReaderOptions
      {
         readerExtensions = 
            (readerExtensions defaultHakyllReaderOptions) <> extensionsFromList
               [ 
                 Ext_tex_math_single_backslash  -- TeX math between \(..\) or \[..\]
               , Ext_tex_math_double_backslash  -- TeX math between \\(..\\) or \\[..\\]
               , Ext_tex_math_dollars           -- TeX math between $..$ or $$..$$
               , Ext_latex_macros               -- Parse LaTeX macro definitions (for math only)
               , Ext_inline_code_attributes     -- Allow attributes on inline code spans
               , Ext_abbreviations              -- PHP markdown extra abbreviation definitions
               ]
      }

3.3 Decks and presentations

Google Slides decks can easily be embedded using an HTML snippet and require no special handling on Hakyll’s side. I wrap the standard <iframe> tag in a div tag with a special class that makes the presentation responsive.

<div class="responsive">
<iframe src="https://docs.google.com/presentation/d/e/<ID>/embed
  ?start=false&loop=false&delayms=3000" frameborder="0" width="960" height="569" 
  allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>
</div>

.responsive {
  overflow: hidden;
  padding-bottom:56.25%;
  position: relative;
  height: 0;
  iframe {
    left: 0;
    top: 0;
    height: 100%;
    width: 100%;
    position: absolute;
  }
}

reveal.js, the HTML presentation framework, uses standard HTML tags to define the presentation and supports themes. Enabling reveal.js is achieved by modifying the deck’s base template and adding a few options to the deck’s metadata.

---
reveal: true
theme: league
---

<head>
$if(reveal)$
  <link rel="stylesheet" href="/reveal.js/reset.css">
  <link rel="stylesheet" href="/reveal.js/reveal.css">
  <link rel="stylesheet" href="/reveal.js/theme/$theme$.css" id="theme">

  <!-- Theme used for syntax highlighted code -->
  <link rel="stylesheet" href="/reveal.js/plugin/highlight/monokai.css" id="highlight-theme">
$endif$
</head>

<body>
  <div class="reveal">
    <div class="slides">
      $body$
    </div>
  </div>

$if(reveal)$
  <script src="/reveal.js/reveal.js"></script>
  <script src="/reveal.js/plugin/notes/notes.js"></script>
  <script src="/reveal.js/plugin/markdown/markdown.js"></script>
  <script src="/reveal.js/plugin/highlight/highlight.js"></script>
  <script>
    // More info about initialization & config:
    // - https://revealjs.com/initialization/
    // - https://revealjs.com/config/
    Reveal.initialize({
      hash: true,

      // Learn about plugins: https://revealjs.com/plugins/
      plugins: [ RevealMarkdown, RevealHighlight, RevealNotes ]
    });
  </script>
$endif$
</body>

3.4 Paginating posts

Blog posts are paginated by constructing a Paginate object with buildPaginateWith and then creating rules for every page.

type PageNumber = Int

data Paginate = Paginate
   { paginateMap        :: M.Map PageNumber [Identifier] -- used in blogCtx
   , paginateMakeId     :: PageNumber -> Identifier 
   , paginateDependency :: Dependency
   }

buildPaginateWith
    :: MonadMetadata m
    => ([Identifier] -> m [[Identifier]])  -- group items into pages
    -> Pattern                             -- items to paginate
    -> (PageNumber -> Identifier)          -- identifiers for the pages
    -> m Paginate

hakyll $ do

   pages <- buildPages visiblePattern (\i -> fromCapture "*/index.html" (show i))

   paginateRules pages $ \i _ -> do -- i is the page number
      route   $ idRoute
      compile $ makeItem (show i)
         >>= loadAndApplyTemplate "templates/blog-list.html" (blogCtx i pages categories tags)
         >>= loadAndApplyTemplate "templates/default.html" defaultCtx
         >>= indexCompiler
         >>= relativizeUrls

buildPages :: (MonadMetadata m, MonadFail m) 
   => Pattern -> (PageNumber -> Identifier) -> m Paginate
buildPages pattern makeId = 
   buildPaginateWith
      (return . paginateEvery blogPerPage <=< sortRecentFirst) 
      pattern 
      makeId
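
To make the grouping step concrete, here is a base-only sketch of what a function like paginateEvery is assumed to do: split an already sorted list into pages of at most n items. pagesOf and the sample post names are hypothetical and not part of Hakyll.

```haskell
-- Split a list into consecutive pages of at most n items.
-- A non-positive page size yields no pages instead of looping forever.
pagesOf :: Int -> [a] -> [[a]]
pagesOf n xs
   | n <= 0 || null xs = []
   | otherwise         = page : pagesOf n rest
   where
      (page, rest) = splitAt n xs

main :: IO ()
main = do
   -- posts are assumed to be sorted most recent first already
   let posts = ["post-e", "post-d", "post-c", "post-b", "post-a"]
   mapM_ print (zip [1 :: Int ..] (pagesOf 2 posts))
```

Pairing the pages with numbers starting at 1 gives the same shape as paginateMap: page number to the identifiers shown on that page, with the last page possibly shorter than the others.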

To navigate between pages, pagination links are added to the relevant context. paginateContext returns a default pagination context which provides the standard pagination fields, but I like to alias those fields to give them friendlier names.

-- paginateContext returns a default pagination context
paginateContext :: Paginate -> PageNumber -> Context a

blogCtx :: PageNumber -> Paginate -> Tags -> Tags -> Context String
blogCtx i pages categories tags = 
      listField "blogs" (blogDetailCtx categories tags) (loadBlogs pat) <>
      categoryListField "categories" categories                         <>
      tagsListField "tags" tags                                         <>
      pagesField i                                                      <> --page links
      defaultCtx
  where
      pat = fromList . fromMaybe [] . M.lookup i . paginateMap $ pages
      pagesField = aliasContext alias . paginateContext pages -- alias standard fields
      alias "pages.first.number"    = "firstPageNum"
      alias "pages.first.url"       = "firstPageUrl"
      alias "pages.next.number"     = "nextPageNum"
      alias "pages.next.url"        = "nextPageUrl"
      alias "pages.previous.number" = "previousPageNum"
      alias "pages.previous.url"    = "previousPageUrl"
      alias "pages.last.number"     = "lastPageNum"
      alias "pages.last.url"        = "lastPageUrl"
      alias "pages.current.number"  = "currentPageNum"
      alias "pages.count"           = "numPages"
      alias x                       = x

3.5 Support for tags and categories

Tags are supported in a similar manner to pagination. The Tags object is constructed with either buildTags or buildTagsWith. Hakyll provides functions to sort tags, render them and add them to a context. Rules for each tag must be created using tagsRules in order for tags to work.

data Tags = Tags
   { tagsMap        :: [(String, [Identifier])]
   , tagsMakeId     :: String -> Identifier
   , tagsDependency :: Dependency
   }

-- buildTags takes a pattern for loading resources and 
-- a mapping from the tag name to its identifier
buildTags :: MonadMetadata m => Pattern -> (String -> Identifier) -> m Tags

-- getTags returns the tag field from the metadata as a list of strings
-- the field value can either be tags separated by a comma or a valid YAML list
getTags :: MonadMetadata m => Identifier -> m [String]
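
For the comma-separated form, the parsing boils down to splitting and trimming. The following is a rough base-only sketch of that idea, not Hakyll's implementation: splitTags and trim are hypothetical helpers, and dropping empty entries is my own choice here.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Remove leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Split a metadata value such as "tag1, tag2" into individual tags,
-- trimming whitespace and discarding empty entries.
splitTags :: String -> [String]
splitTags = filter (not . null) . map trim . splitOn ','
   where
      splitOn c s = case break (== c) s of
         (chunk, [])       -> [chunk]
         (chunk, _ : rest) -> chunk : splitOn c rest

main :: IO ()
main = do
   print (splitTags "tag1, tag2")           -- ["tag1","tag2"]
   print (splitTags " haskell ,, pandoc ")  -- ["haskell","pandoc"]
```

The YAML list form needs no such step, since the metadata parser already delivers it as a list of strings.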

hakyll $ do

   tags <- buildTags visiblePattern (fromCapture "tags/*/index.html")

   -- index
   create ["index.html"] $ do
      route   $ idRoute
      compile $ makeItem ""
         -- tags passed to blogCtx
         >>= loadAndApplyTemplate "templates/blog-list.html" (blogCtx 1 pages categories tags)
         >>= loadAndApplyTemplate "templates/default.html" defaultCtx 
         >>= indexCompiler
         >>= relativizeUrls

   -- tag pages (a sibling of the index rule, not nested inside it)
   tagsRules tags $ \tag pattern -> do
      tagPages <- buildPages pattern (\i -> fromCaptures "tags/*/*/index.html" [tag, show i])
      route   $ idRoute
      compile $ makeItem tag
         >>= loadAndApplyTemplate "templates/blog-list.html" (blogCtx 1 tagPages categories tags)
         >>= loadAndApplyTemplate "templates/default.html" defaultCtx
         >>= indexCompiler
         >>= relativizeUrls
      paginateRules tagPages $ \i _ -> do -- blog tags pages (i is page within tag)
         route idRoute
         compile $ do
            makeItem tag
               >>= loadAndApplyTemplate "templates/blog-list.html" (blogCtx i tagPages categories tags)
               >>= loadAndApplyTemplate "templates/default.html" defaultCtx
               >>= indexCompiler
               >>= relativizeUrls

blogDetailCtx :: Tags -> Tags -> Context String
blogDetailCtx categories tags = 
   dateField "date" "%B %e, %Y"                 <>
   mapContext dropFileName (urlField "url")     <>
   categoryField' "category" categories         <>
   tagsField' "tags" tags                       <> -- tags to context
   field "pages.next.url" nextBlog              <>
   field "pages.previous.url" previousBlog      <>
   summaryField "summary"                       <>
   teaserField  "summary" blogSnapshot          <> 
   previewField "summary" blogSnapshot          <> 
   readingTimeField "reading.time" blogSnapshot <>
   defaultCtx

Tags can be rendered in a standard way using renderTags, as a tag cloud using renderTagCloud or in a custom way using tagsFieldWith.

tagsField' :: String -> Tags -> Context a 
tagsField' = 
   tagsFieldWith getTags (renderLink "#") (mconcat . intersperse " ")

renderLink :: String -> String -> (Maybe FilePath) -> Maybe H.Html
renderLink _ _   Nothing       = Nothing
renderLink pre text (Just url) =
   Just $ do
      toHtml pre
      H.a ! href (toValue $ toUrl url) $ toHtml text

4 Rolling your own

4.1 Collecting rules in the Rules monad

The compiler rules live inside the Rules monad, which is an instance of the MonadMetadata type class. This class enables monads to retrieve the Metadata of the source file identified by an Identifier. Metadata is a wrapper around a YAML object with functions to look up its values as either strings or lists of strings.

data Rules a

instance Monad Rules
instance Functor Rules
instance Applicative Rules
instance MonadMetadata Rules

class Monad m => MonadMetadata m where
   getMetadata    :: Identifier -> m Metadata
   getMatches     :: Pattern -> m [Identifier]

-- getAllMetadata returns all metadata associated with a pattern
getAllMetadata :: MonadMetadata m => Pattern -> m [(Identifier, Metadata)]

-- getItemUTC tries to extract and parse the time from the published field 
-- or from the filename. 
getItemUTC :: MonadMetadata m => TimeLocale -> Identifier -> m UTCTime

-- match matches a pattern and adds a rule for it.
match :: Pattern -> Rules () -> Rules ()

-- hakyll runs the rules in the IO monad
hakyll :: Rules a -> IO ()

4.2 Identifiers and patterns

data Item a = Item
   { itemIdentifier :: Identifier
   , itemBody       :: a
   } deriving (Show, Typeable)

instance Functor Item
instance Foldable Item
instance Traversable Item
instance Show a => Show (Item a)
instance Binary a => Binary (Item a)

4.3 Writing Compilers and loading Items

The other monad that implements the MonadMetadata type class is the Compiler monad which, as the name implies, compiles matched source files. It also takes care of dependencies between rules, so that if one rule relies on the results of another, they are executed in the correct order. The Compiler monad is an instance of Alternative, which makes it easy to combine potentially failing compilers (failing via empty, fail or throwError): they are tried in sequence until one of them succeeds, or else the combined compiler fails.
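
The try-in-sequence behaviour is easy to see in a toy model. The sketch below is not Hakyll's Compiler: Result is a hypothetical stand-in that either carries a value or a list of error messages, and failWith and title are made-up names for illustration.

```haskell
import Control.Applicative (Alternative (..))

-- Toy stand-in for a fallible compiler: error messages or a value.
newtype Result a = Result { runResult :: Either [String] a }

instance Functor Result where
   fmap f (Result r) = Result (fmap f r)

instance Applicative Result where
   pure = Result . Right
   Result f <*> Result x = Result (f <*> x)

instance Alternative Result where
   empty = Result (Left [])
   Result (Left e1) <|> Result (Left e2) = Result (Left (e1 ++ e2))  -- both failed: keep all messages
   Result (Left _)  <|> r                = r                          -- fall through to the next one
   r                <|> _                = r                          -- first success wins

-- Fail with a single error message.
failWith :: String -> Result a
failWith msg = Result (Left [msg])

-- Try the metadata, then the path, then fall back to a constant.
title :: Result String
title = failWith "no title in metadata" <|> failWith "no title in path" <|> pure "Crypto and Code"

main :: IO ()
main = do
   print (runResult title)                                          -- Right "Crypto and Code"
   print (runResult (failWith "a" <|> failWith "b" :: Result Int))  -- Left ["a","b"]
```

Collecting the messages of every failed alternative is a design choice of this sketch; it makes the final error more useful when all candidates fail.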

Compiler a values have access to a range of functions which can retrieve the identifier, the source and the path of both the matched file and any file identified by an Identifier. Keep in mind that for such a function to succeed there needs to be a rule for that Identifier in the Rules monad. Otherwise Hakyll has no knowledge of its existence.

data Compiler a -- a is the type of the output - usually String

instance Monad Compiler
instance Functor Compiler
instance Applicative Compiler
instance Alternative Compiler
instance MonadMetadata Compiler
instance MonadError [String] Compiler

-- compile adds a compiler to the Rules monad. Note that the Compiler value is Item a
compile :: (Binary a, Typeable a, Writable a) => Compiler (Item a) -> Rules ()

-- makeItem lifts a value into the Compiler monad.
makeItem :: a -> Compiler (Item a)

-- getRoute returns the route for a specified item
getRoute :: Identifier -> Compiler (Maybe FilePath)

-- getResourceBody returns the full contents of the matched source file
-- as a string without metadata preamble.
getResourceBody :: Compiler (Item String)

-- load an item compiled elsewhere.
load :: (Binary a, Typeable a) => Identifier -> Compiler (Item a)

-- loadAll loads a dynamic list of items.
loadAll :: (Binary a, Typeable a) => Pattern -> Compiler [Item a]

4.4 Snapshots to define a pipeline stage

When a compiled resource is loaded, it usually already has the base layout templates applied to it, making it unusable in dynamic lists. Snapshots solve that problem. During compiler construction, a snapshot of the resource can be saved in the middle of the pipeline, allowing you to later load that stage of the compiled resource.

-- blogs
match allPattern $ do
   route   $ blogRoute
   compile $ blogCompiler
      >>= saveSnapshot blogSnapshot -- saving a Snapshot before templates are applied
      >>= loadAndApplyTemplate "templates/blog-detail.html" (blogDetailCtx categories tags)
      >>= loadAndApplyTemplate "templates/default.html" defaultCtx
      >>= indexCompiler
      >>= relativizeUrls

loadBlogs :: Pattern -> Compiler [Item String]
loadBlogs = 
   recentFirst <=< flip loadAllSnapshots blogSnapshot -- load blog items using a Snapshot

4.5 Working with contexts and templates

Contexts define variables which can be accessed and interpolated in a template. There are three ways to define a Context.

1. Use the Context data constructor.
2. Use field functions, e.g. field, constField or listField.
3. Derive a new context from another context using mapContext.

data ContextField
   = StringField String
   | forall a. ListField (Context a) [Item a]

-- Context data constructor wraps a function which takes a key (variable name), 
-- a list of arguments (for function fields), an item and returns a ContextField
-- in the Compiler monad
newtype Context a = Context
   { unContext :: String -> [String] -> Item a -> Compiler ContextField
   }

instance Semigroup (Context a)
instance Monoid (Context a) where
   -- mempty is basically empty from Alternative
   mempty                          = missingField 
   -- mappend is <|> from Alternative
   mappend (Context f) (Context g) = Context $ \k a i -> f k a i <|> g k a i 

-- field takes a key and a function that constructs a value based on the item 
-- (e.g. accessing metadata) and returns a new context.
field :: String -> (Item a -> Compiler String) -> Context a

-- constField takes a key and a constant value and returns a context
constField :: String -> String -> Context a

-- listField takes a string, a context to be applied inside a 
-- $for(..)$ expression, and the items to iterate over.
listField :: String -> Context a -> Compiler [Item a] -> Context b

-- mapContext takes a function, a context and returns a new context
-- with all field values transformed.
mapContext :: (String -> String) -> Context a -> Context a

The Context data constructor is rarely used directly, but it is necessary if the context needs to be able to return mempty, which is implemented as a failing Compiler. In the Compiler monad, failure can be signalled with empty, noResult or throwError.

-- aliasContext maps a new key to another key. If the other key
-- is not defined or returns empty the alias returns empty.
aliasContext :: (String -> String) -> Context a -> Context a
aliasContext f (Context c) = 
   Context $ \k a i -> c (f k) a i <|> c' k -- Compiler implements Alternative
   where 
      c' k = noResult $ unwords ["Tried to alias", k, "as", f k, "which doesn't exist"]

readingTimeField :: String -> Snapshot -> Context String
readingTimeField key snapshot = 
   field key calculate -- calculate :: Item String -> Compiler String
   where
      calculate item = do
         body <- loadSnapshotBody (itemIdentifier item) snapshot
         return $ withTagList acc body
      acc ts = [TagText (show (time ts))]
      time ts = foldl' count 0  ts `div` 265
      count n (TagText s) = n + length (words s)
      count n _           = n
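
The arithmetic behind the reading time is simple enough to check in isolation. This is a minimal sketch that takes the text chunks as plain strings instead of HTML tags. readingTime and readingTimeCeil are hypothetical names, and the rounded-up variant is an alternative I am adding for comparison, not what readingTimeField does.

```haskell
-- Assumed reading speed in words per minute (265, as in readingTimeField).
wordsPerMinute :: Int
wordsPerMinute = 265

-- Total number of words across all text chunks.
wordCount :: [String] -> Int
wordCount = sum . map (length . words)

-- Minutes rounded down, matching the integer division above.
readingTime :: [String] -> Int
readingTime chunks = wordCount chunks `div` wordsPerMinute

-- Alternative: minutes rounded up, so short posts never show 0.
readingTimeCeil :: [String] -> Int
readingTimeCeil chunks = (wordCount chunks + wordsPerMinute - 1) `div` wordsPerMinute

main :: IO ()
main = do
   let long  = [unwords (replicate 530 "word")]
       short = ["just a", "handful of words"]
   print (readingTime long, readingTimeCeil long)    -- (2,2)
   print (readingTime short, readingTimeCeil short)  -- (0,1)
```

Because div rounds down, any post shorter than 265 words reports a reading time of 0 minutes, which is worth knowing when writing the template around this field.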

decksDetailCtx :: Context String
decksDetailCtx = 
   dateField "date" "%B %e, %Y"             <>
   mapContext dropFileName (urlField "url") <> -- drops the file name from the urlField
   defaultCtx                               <>
   constField "theme" "black"

-- fail with an error message
noResult :: String -> Compiler a

-- empty is the identity of <|>
empty :: Alternative f => f a

-- throwError used within a monadic computation begins exception processing
throwError :: MonadError e m => e -> m a

¹ DSL stands for Domain Specific Language. Hakyll is an eDSL, an embedded Domain Specific Language, which means that the DSL is defined in the code of the host programming language rather than as a completely new language.
