How this page is generated - Part 02 Useful Hakyll Contexts



Custom fields are a nice way to create dynamic content. In this article I want to show the collection of fields used on this blog.

Built with Hakyll – Part 02: Custom Fields

In the previous article I introduced the basic mechanisms of Hakyll. Part of this introduction was that all content lives in Hakyll as an Item. An Item is created by loading some content from a file (or a wildcarded list of files). In most cases these are Markdown files. For example, this is what loading an Item looks like:

match "posts/**.md" $ do
    route $ setExtension "html"
    compile
        $   pandocCompiler
        >>= loadAndApplyTemplate "templates/post.html" postCtx

For each *.md file inside posts/, an .html file is generated by compiling the document with pandoc (which turns the markup into HTML) and embedding the result in the templates/post.html template, which is loaded and applied with a given Context.

What is a Context? Contexts deliver meta information to the templating engine. The information is constructed for each document individually and by default contains (in that order)

  1. A body field: current content of the document (usually after conversion using pandoc)
  2. Metadata fields defined in the document’s frontmatter
  3. A url urlField: url of the final document
  4. A path pathField: path of the original file
  5. A title titleField: title derived from the file name (a title set in the frontmatter takes precedence, as the metadata fields come first)

These variables can in turn be used in the template and will be substituted by Hakyll. You can read more about templates on the official Hakyll website.

Custom fields

Generally, Contexts are the only way to include dynamic content. Anything that is not plain text from the template or the document comes through a context Field. Fields hold data that is derived individually from each Item, i.e. document. They are composed to form a full Context, which is fed into the templating engine. As Contexts are Monoids, composing them is very easy:

context :: Context String
context = mconcat
    [ defaultContext
    , pathField "sourcefile"
    ]
-- or
context = defaultContext
       <> pathField "sourcefile"
       <> constField "field" "value"
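
Composition order matters: when two contexts provide the same key, the one further to the left wins. The following is a toy model of that behaviour in plain Haskell, not Hakyll's actual Context type; Ctx, constCtx, lookupKey and demoCtx are hypothetical names used only for illustration.

```haskell
import Control.Applicative ((<|>))

-- Toy stand-in for a context: a lookup from key to an optional value.
-- (Hypothetical type; Hakyll's real Context also gets to see the Item.)
newtype Ctx = Ctx { lookupKey :: String -> Maybe String }

-- Combining two contexts tries the left one first, then the right one.
instance Semigroup Ctx where
    Ctx f <> Ctx g = Ctx $ \key -> f key <|> g key

instance Monoid Ctx where
    mempty = Ctx (const Nothing)

-- Context with a single constant field.
constCtx :: String -> String -> Ctx
constCtx key value = Ctx $ \k -> if k == key then Just value else Nothing

demoCtx :: Ctx
demoCtx = mconcat
    [ constCtx "title" "Title from frontmatter"
    , constCtx "title" "file-name"
    , constCtx "path" "posts/example.md"
    ]

main :: IO ()
main = do
    print (lookupKey demoCtx "title")   -- Just "Title from frontmatter"
    print (lookupKey demoCtx "path")    -- Just "posts/example.md"
    print (lookupKey demoCtx "missing") -- Nothing
```

This is why a field placed early in the chain can shadow a later field with the same name.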

Hakyll comes with a range of default fields, such as the pathField and constField used above.

While setting up the structure of this blog I found the “need” for some more fields, though. So I started to adapt some implementations I found on the internet and developed my own. You can find them in generator/Fields.hs.

Most fields are created by giving an implementation of (Item a -> Compiler String) to the following field function.

field :: String -> (Item a -> Compiler String) -> Context a

NOTE

I am by no means an expert in Haskell at this point. I have learned a lot writing the engine behind all this (which, in the end, is actually my main motivation). But I am certain some implementations could be done more idiomatically and/or efficiently. So please take my ideas with a critical eye.

I do welcome any comments in the form of issues on GitHub. :)

peekField

I already showed this field in the previous article; it is one of the first fields I made to understand the logic behind them. It simply takes the first length words and makes them available to the template engine under key. As templates are applied after pandoc has converted the document to HTML, I needed to take the original content from a snapshot of the document created earlier.

peekField
    :: String           -- ^ Key to use
    -> Int              -- ^ Number of words to peek
    -> Snapshot         -- ^ Snapshot to load
    -> Context String   -- ^ Resulting context
peekField key length snapshot = field key $ \item -> do
    body <- itemBody <$> loadSnapshot (itemIdentifier item) snapshot
    return (peak body)

    where peak = T.unpack . T.unwords . take length . T.words . T.pack

The undeniable problem with this is that it does not take markup into account at all. As a result, code blocks are included without any styling and look very bad.
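
The truncation at the core of peekField can be tried in isolation. This is a minimal sketch that uses the Prelude's words and unwords instead of Data.Text; peekWords is a hypothetical name, not part of the blog's code. The second call shows how markup characters simply count as words.

```haskell
-- Keep only the first n whitespace-separated words of a text.
peekWords :: Int -> String -> String
peekWords n = unwords . take n . words

main :: IO ()
main = do
    putStrLn (peekWords 3 "Custom fields are a nice way")
    -- Custom fields are

    -- Markup is treated like any other word and ends up in the preview.
    putStrLn (peekWords 3 "## Title\n\n*emphasis* and more")
    -- ## Title *emphasis*
```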

Git Fields

I wanted to allow readers of my blog to follow the history of an article. As the source of this blog is hosted on GitHub, using GitHub’s history view would be an easy way to achieve this, I thought.

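import Data.Char      (isSpace)
import Data.List      (dropWhileEnd)
import System.Exit    (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- Git related fields
--------------------------------------------------------------------------------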
data GitVersionContent = Hash | Commit | Full
     deriving (Eq, Read)

instance Show GitVersionContent where
    show content = case content of
        Hash -> "%h"
        Commit -> "%h: %s"
        Full -> "%h: %s (%ai)"

-- Query information about a given file tracked with git
getGitVersion :: GitVersionContent -- Kind of information
              -> FilePath          -- File to query information about
              -> IO String         -- Formatted output, "" if git fails
getGitVersion content path = do
    (status, stdout, _) <- readProcessWithExitCode "git" [
        "log",
        "-1",
        "--format=" ++ (show content),
        "--",
        "src/"++path] ""

    return $ case status  of
        ExitSuccess -> trim stdout
        _           -> ""

    where trim = dropWhileEnd isSpace

-- Field that contains information about the latest commit that touched the current item.
versionField :: String -> GitVersionContent -> Context String
versionField name content = field name $ \item -> unsafeCompiler $ do
    let path = toFilePath $ itemIdentifier item
    getGitVersion content  path

-- Field that contains the latest commit that touched anything below src/ (usually HEAD).
headVersionField :: String -> GitVersionContent -> Context String
headVersionField name content  = field name $ \_ -> unsafeCompiler $ getGitVersion content  "."

With the current implementation of getGitVersion I am able to get the latest commit that changed any given document. It spawns a git process and reads its output afterwards.

I can even choose from predefined formats:

  • Hash: the commit’s abbreviated hash
  • Commit: hash and message
  • Full: hash, message, and time

Although I suspect that for sites with many pages the number of git invocations might lead to significantly longer build times, for the time being it works rather well.
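
To see which command is actually run for a document, the argument list can be written as a pure function and printed. This is only a sketch: gitLogArgs is a hypothetical helper that mirrors the arguments used in getGitVersion, and the file name is an assumed example.

```haskell
-- Build the argument list for `git log`, mirroring getGitVersion.
-- (Hypothetical helper for illustration.)
gitLogArgs :: String -> FilePath -> [String]
gitLogArgs format path =
    [ "log"
    , "-1"                    -- only the most recent commit
    , "--format=" ++ format   -- e.g. %h, or %h: %s (%ai)
    , "--"                    -- everything after this is a path
    , "src/" ++ path
    ]

main :: IO ()
main =
    -- Printed without shell quoting, just to inspect the arguments.
    putStrLn $ unwords ("git" : gitLogArgs "%h: %s" "posts/example.md")
    -- git log -1 --format=%h: %s -- src/posts/example.md
```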

readTimeField

Another really simple but useful function that naïvely computes the reading time of a document. Essentially it counts the words of the document snapshot and divides that number by an average reading speed of about 200 words/min.

readTimeField :: String -> Snapshot -> Context String
readTimeField name snapshot = field name $ \item -> do
    body <- itemBody <$> loadSnapshot (itemIdentifier item) snapshot
    let words = length (T.words . T.pack $ body)
    return $ show $ div words 200
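
Note that div rounds down, so a post with fewer than 200 words reports 0 minutes. A possible variation, sketched here in plain Haskell and not taken from the blog's code, rounds up instead. The name readingMinutes and its words-per-minute parameter are assumptions made for this example.

```haskell
-- Estimated reading time in whole minutes, rounded up.
-- wpm is the assumed reading speed in words per minute.
readingMinutes :: Int -> String -> Int
readingMinutes wpm body = (wordCount + wpm - 1) `div` wpm
  where
    wordCount = length (words body)

main :: IO ()
main = do
    let post = unwords (replicate 450 "word")
    print (readingMinutes 200 post)    -- 3
    print (readingMinutes 200 "short") -- 1
    print (readingMinutes 200 "")      -- 0
```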

publishedGroupField

Adapted from biosphere.cc

This field is actually a listField. It is used on the archive page to group posts by year. It is also quite an intimidating one at first sight, but it works perfectly, so 🤷🏼‍♂️.

It works by first extracting the year out of the date of every post using extractYear. It then groups the resulting tuples by the year item and merges the groups.

Have I already mentioned that working with elements contained in a Compiler Monad is incredibly weird? - It is!

Anyway, in the end the template can use the list referenced by name. Each of its entries exposes a field year containing the actual year, and a list posts containing that year’s posts with the given postContext applied.

publishedGroupField :: String           -- name
                    -> [Item String]    -- posts
                    -> Context String   -- Post context
                    -> Context String   -- output context
publishedGroupField name posts postContext = listField name groupCtx $ do
    tuples <- traverse extractYear posts
    let grouped = groupByYear tuples
    let merged = fmap merge $ grouped
    let itemized = fmap makeItem $ merged

    sequence itemized

    where groupCtx = field "year" (return . show . fst . itemBody)
                  <> listFieldWith "posts" postContext (return . snd . itemBody)

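          -- Merge the single-post tuples of one year into one tuple,
          -- keeping the posts in their original order.
          -- groupBy never produces empty groups, so head is safe here.
          merge :: [(Integer, [Item String])]  -> (Integer, [Item String])
          merge gs = (fst (head gs), concatMap snd gs)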


          groupByYear = groupBy (\(y, _) (y', _) -> y == y')

          extractYear :: Item a -> Compiler (Integer,  [Item a])
          extractYear item = do
             time <- getItemUTC defaultTimeLocale (itemIdentifier item)
             let    (year, _, _) = (toGregorian . utctDay) time
             return (year, [item])
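
The grouping step can be tried without the Compiler monad. The sketch below works on plain (year, post) pairs; groupPostsByYear is a hypothetical name and the file names are made up. It also makes one precondition visible: groupBy only merges neighbouring elements, so the posts have to be sorted by date beforehand.

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- Group (year, post) pairs into one entry per year.
-- Only neighbouring pairs with the same year end up in the same group.
groupPostsByYear :: [(Integer, a)] -> [(Integer, [a])]
groupPostsByYear pairs =
    [ (year, map snd grp)
    | grp@((year, _) : _) <- groupBy ((==) `on` fst) pairs
    ]

main :: IO ()
main = print $ groupPostsByYear
    [ (2021, "c.md"), (2021, "b.md"), (2020, "a.md") ]
-- [(2021,["c.md","b.md"]),(2020,["a.md"])]
```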

concatField

An actual use of functionFields – yay.

It is used to dynamically apply a different header in the base template. There I construct the path to a partial template using another constField (item-type), which for posts will display a customized header.

$partial(concat("templates/includes/", item-type, "-header.html"))$

The implementation is simply:

concatField :: String -> Context String
concatField name = functionField name (\args item -> return $ concat args)
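
What the template call evaluates to can be checked in plain Haskell, since the function behind concatField ignores the Item and simply concatenates its arguments. In this sketch concatArgs is a hypothetical name, and "post" is an assumed value for the item-type field.

```haskell
-- The pure core of concatField: join all arguments into one string.
concatArgs :: [String] -> String
concatArgs = concat

main :: IO ()
main = do
    -- "post" stands in for the value of the item-type field (assumed).
    let itemType = "post"
    putStrLn (concatArgs ["templates/includes/", itemType, "-header.html"])
    -- templates/includes/post-header.html
```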

FunctionFields

For the uninitiated, function fields are defined as:

functionField :: String                                  -- name
              -> ([String] -> Item a -> Compiler String) -- actual function
              -> Context a

When used as in the example above, e.g. concat("hello", " ", "world"), the function

(\args item -> return $ concat args)

is evaluated, where args contains exactly the arguments given (args == ["hello", " ", "world"]) and item is the Item it is used on, i.e. the document. You can now do whatever you want with the document’s body and the given arguments. Apparently, though, calling the functionField with the same field as an argument is not possible.

For another explanation, see also Beerend Lauwens’ post.

plainTocField

Although I have since written another implementation of this one that allows additional classes to be applied to certain elements, I would like to show this version anyway. To generate a simple table of contents, pandoc’s built-in TOC generator is leveraged. I load the document’s body, give it to pandoc to parse it into a Pandoc _ [Block], and use that to write HTML with the template $table-of-contents$. The output is only the table of contents and nothing else.

plainTocField :: String -> Int -> String -> Context String
plainTocField name depth snapshot = field name $ \item -> do

    body <- loadSnapshot (itemIdentifier item) snapshot

    let writerOptions = def
            {
              writerTableOfContents = True
            , writerTOCDepth = depth
            , writerTemplate = Just "$table-of-contents$"
            }
        toc = case runPure (readHtml defaultHakyllReaderOptions
                                     (T.pack $ itemBody body))
               >>= \pandoc -> runPure ( writeHtml5String writerOptions pandoc) of
                   Left _      -> ""  -- fall back to an empty TOC on errors
                   Right item' -> T.unpack item'

    return toc

Final Words

As I have already mentioned above, I am not the expert in Haskell that I’d like to be. But writing this blog’s engine has taught me a lot and was a great excuse to dive into Haskell and understand its ideas.

I hope you enjoyed this post anyway. For ideas and criticism about it, use the GitHub issue tracker.