Jinja2 Filters

This site uses the Pelican framework to generate articles and static pages, which in turn uses the Jinja2 template engine to render templates into finished pages. In order to style certain pages, I’ve implemented a few custom Jinja2 filters.


Leveraging the Jinja2 render engine has been quite the learning experience. Starting fresh from the Gwern source, I realized that I didn’t want to learn and implement Haskell. However, the Pelican static site generator seemed to be an adequate replacement for the Hakyll implementation favored by Gwern[^1]. Plus, it has the added benefit of being written in Python, a language I have a wealth of experience writing and reading.

Additionally, Python is visually similar to Haskell, which should aid in porting over any of the Hakyll filters Gwern uses that could be useful here. A cursory glance at the source has already turned up a few that would be of great use.

Custom Jinja2 Filters

Variables can be modified by filters. Filters are separated from the variable by a pipe symbol (|) and may have optional arguments in parentheses. Multiple filters can be chained. The output of one filter is applied to the next.

For example, {{ name|striptags|title }} will remove all HTML tags from the variable name and title-case the output (title(striptags(name))).

Filters that accept arguments have parentheses around the arguments, just like a function call. For example: {{ listx|join(', ') }} will join a list with commas (str.join(', ', listx)).

Jinja2 comes pre-equipped with some built-in filters, which worked well enough, but eventually I decided that I needed more flexibility. While I’m sure I could’ve chained the relevant built-in filters together in ways that would produce what I wanted, some things were just far easier to implement as a custom filter.

Adding Custom Filters to Pelican

To add custom filters to Pelican, you need to do two things in the pelicanconf.py file:

  1. Define the filter as a Python function
  2. Add the filter to the JINJA_FILTERS dictionary
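As a minimal sketch of those two steps, the excerpt below defines one filter and registers it. The filter name `reading_time` and its `words_per_minute` parameter are hypothetical and are not part of this site's configuration; only the `JINJA_FILTERS` setting itself comes from Pelican.

```python
# pelicanconf.py (excerpt): a sketch, not this site's actual configuration.

# Step 1: define the filter as an ordinary Python function.
# `reading_time` is a hypothetical example filter.
def reading_time(text, words_per_minute=200):
    """Estimate reading time in whole minutes, never less than 1."""
    word_count = len(text.split())
    return max(1, round(word_count / words_per_minute))


# Step 2: register it under the name the templates will use.
JINJA_FILTERS = {
    "reading_time": reading_time,
}
```

In a template, a filter registered this way would be applied like any built-in one, for example {{ article.content|striptags|reading_time }}.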

Drop Caps CSS Support

Due to pure laziness in implementing the Gwern style “drop caps” replacement/rewrite filters, I opted to just follow their styling and page layout. This meant I needed to be able to extract a “drop cap” metadata attribute from the markdown file and insert it into the template as a class on the <body> element. This is because the rewrite module expects to see the class name there as it scans for the first <p> child to apply the style.

Because of how Pelican parses files, I needed a way to pass this metadata up from the content markdown to the base.html template, since that’s where I’m writing the <body> tag[^2].

def drop_cap_css(page):
    # Returns None when the page has no "dropcap" metadata field.
    return page.metadata.get("dropcap")

This simply reads the “dropcap” metadata field from the markdown file and passes it into the Jinja2 engine; if the field is missing, the result is None. The value is then used in the base.html template as:

{% if page_name and dropcap %}
    <body class="{{page_name}} {{dropcap}}">
{% elif page_name and not dropcap %}
    <body class="{{page_name}}">
{% elif dropcap and not page_name %}
    <body class="{{dropcap}}">
{% else %}
    <body>
{% endif %}
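To see why the template needs the branches without a drop cap, here is a small sketch that runs the filter against stand-in page objects. The `SimpleNamespace` objects and the class name `drop-caps-example` are assumptions made for illustration; real Pelican pages carry far more than a metadata dictionary.

```python
from types import SimpleNamespace


def drop_cap_css(page):
    # Same filter as above: look up the optional "dropcap" metadata field.
    return page.metadata.get("dropcap")


# Hypothetical stand-ins for Pelican page objects.
with_cap = SimpleNamespace(metadata={"title": "About", "dropcap": "drop-caps-example"})
without_cap = SimpleNamespace(metadata={"title": "Contact"})

print(drop_cap_css(with_cap))     # the class name from the metadata
print(drop_cap_css(without_cap))  # None, which Jinja2 treats as false
```

Because a missing field comes back as None rather than raising an error, pages without the field fall through to the branches that omit the drop cap class.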

Exclude/Filter Category/Tag

This one was a necessary choice as the next to implement. This site includes some old blog posts from an old WordPress site of mine, and some of those posts are decidedly depressing and slightly too personal to want to host publicly. While they are public by the nature of this site, I didn’t want them listed on the index. I did, however, want lists of categories on the home page, and I wanted those lists to populate automatically so I wouldn’t have to maintain them. This all meant that I needed a way to filter certain tags and categories out of a list of articles at generation time.

def exclude_tags(articles, tags):
    # Keep only the articles that carry none of the given tags.
    return [article for article in articles if not any(tag in tags for tag in article.tags)]

This is a nifty one-liner. It takes a list of articles and a list of tags to exclude, and checks the tags on each article. It then returns a new, filtered list containing only the articles that have none of the supplied tags.

Filtering is essentially the same:

def filter_tags(articles, tags):
    # Keep only the articles that carry at least one of the given tags.
    return [article for article in articles if any(tag in tags for tag in article.tags)]

The only difference is that this returns a list of only the articles that have tags in common with the supplied list. In addition, the Exclude/Filter filters for Category are coded the same, except they don’t need to loop through article tags, since article:category is kept 1:1.
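The category variants are not shown in this post, so the following is only a guess at their shape, assuming each takes a single category name (as the template call filter_category("blog posts") suggests) and compares it directly against article.category. The stand-in articles use plain strings for categories, which is a simplification of Pelican's own objects.

```python
from types import SimpleNamespace


def filter_category(articles, category):
    # Assumed implementation: keep articles in the named category.
    return [article for article in articles if article.category == category]


def exclude_category(articles, category):
    # Assumed implementation: drop articles in the named category.
    return [article for article in articles if article.category != category]


# Hypothetical stand-ins for Pelican articles.
articles = [
    SimpleNamespace(title="Hello", category="blog posts"),
    SimpleNamespace(title="Notes", category="essays"),
]

print([a.title for a in filter_category(articles, "blog posts")])
print([a.title for a in exclude_category(articles, "blog posts")])
```

Since each article has exactly one category, a single comparison replaces the any() loop that the tag filters need.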

All the above filters get called from the template the same way:

<section id="blog" class="level1">
    <h1>Blog Posts</h1>
    {% if articles|filter_category("blog posts")|exclude_tags(['depression', 'personal'])|length == 0 %}
    <em>No articles...</em>
    {% else %}
    <ul class="post-list">
        {% for article in articles|filter_category("blog posts")|exclude_tags(['depression', 'personal']) %}
        {% if loop.index <= 8 %}
        <li>
            <p class="entry-title"><a href="{{ SITEURL }}/{{ article.url }}" rel="bookmark" title="Permalink to {{ article.title|striptags }}">{{ article.title }}</a></p>
        </li>
        {% endif %}
        {% endfor %}
    </ul>
    {% endif %}
</section>

Here, we can see the list of articles first being filtered down to “blog posts” only, and this is further chained into the “exclude tags” filter to remove any posts with the tags “depression” or “personal” from the articles list. This is then used to generate the “Blog Posts” category listing on the index[^3].

[^3]: This was originally done by grouping and displaying categories, but I wanted to ensure that the “Blog” section was always beside the “Newest” section, and this would’ve broken if I added any categories in the future that sort alphabetically before “Blog”.
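To make the chain concrete, here is roughly what the piped expression computes when written as plain Python. The `filter_category` body and the sample articles are assumptions for illustration; the point is only that each pipe wraps the previous result in another function call.

```python
from types import SimpleNamespace


def filter_category(articles, category):
    # Assumed shape; this filter is not shown in the post.
    return [a for a in articles if a.category == category]


def exclude_tags(articles, tags):
    return [a for a in articles if not any(tag in tags for tag in a.tags)]


# Hypothetical stand-ins for Pelican articles.
articles = [
    SimpleNamespace(title="Hello", category="blog posts", tags=["meta"]),
    SimpleNamespace(title="Bad day", category="blog posts", tags=["personal"]),
    SimpleNamespace(title="Notes", category="essays", tags=["meta"]),
]

# articles|filter_category("blog posts")|exclude_tags([...]) nests like this:
visible = exclude_tags(filter_category(articles, "blog posts"), ["depression", "personal"])

# Slicing plays the role of the `loop.index <= 8` check in the template.
listed = [a.title for a in visible[:8]]
print(listed)
```

One thing the Python form makes visible is that the template evaluates the same chain twice, once for the length check and once for the loop; storing the result with {% set %} first would avoid that.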

Alphabetical Grouping

I wanted to filter and group tags/categories for display on the respective index pages, as a sort of throwback and nod to the physical indices in books. This means that I need a way of taking tags and articles (presented by Pelican as a list of tuples) and organizing them into a dictionary whose keys are the first letter of the tag name and whose values are lists of the same tag/article-list tuples that Pelican provides.

The final object will be in the shape:

{
    'key': [
        (tag1, [article1, article2, ...]),
        (tag2, [article1, article2, ...]),
        ...
    ],
    ...
}

The code iterates through the provided tuples, deconstructs them, sorts and filters them, and finally recombines and groups them, transforming a flat list into a structured dictionary as described above, with the keys ordered alphabetically as it gets returned.

def group_alpha(tags):
    # Bucket each (tag, articles) tuple under the lowercased first letter of the tag name.
    tdict = {}
    for tag, articles in tags:
        key = tag.name[0].lower()
        tdict.setdefault(key, []).append((tag, articles))
    # Rebuild the dictionary so its keys come back in alphabetical order.
    return dict(sorted(tdict.items()))

It is called and then iterated through as follows:

{% set gTags = tags|group_alpha %}
{% for key in gTags %}
    <section id="tag-{{key}}" class="level1">
        <h1>{{ key }}</h1>
        <ul class="post-list">
            {% for tag, articles in gTags[key]|sort %}
                <li><p class="entry-title"><a href="{{ SITEURL }}/{{ tag.url }}">{{ tag }}</a> ({{ articles|count }})</p></li>
            {% endfor %}
        </ul>
    </section>
{% endfor %}

The first line stores the restructured dictionary in a variable, which the outer for loop then walks one letter at a time. Since each value keeps the same list-of-tuples format that Pelican provides, it can conveniently be passed through the existing “sort” filter to order the tags alphabetically, and the article count for each tag is produced the same way as in the sample template provided.
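For a quick look at the grouping outside of Pelican, the sketch below feeds the function a few stand-in tags. The `Tag` namedtuple and the article names are hypothetical; Pelican's real tag objects have more attributes than `name`.

```python
from collections import namedtuple

# Hypothetical stand-in for Pelican's tag objects; only `.name` is needed here.
Tag = namedtuple("Tag", "name")


def group_alpha(tags):
    grouped = {}
    for tag, articles in tags:
        # Bucket by the lowercased first letter of the tag name.
        grouped.setdefault(tag.name[0].lower(), []).append((tag, articles))
    # Return a new dict whose keys are in alphabetical order.
    return dict(sorted(grouped.items()))


tags = [
    (Tag("python"), ["post-a", "post-b"]),
    (Tag("Jinja2"), ["post-a"]),
    (Tag("pelican"), ["post-c"]),
]

grouped = group_alpha(tags)
for letter, entries in grouped.items():
    print(letter, [(tag.name, len(articles)) for tag, articles in entries])
# j [('Jinja2', 1)]
# p [('python', 2), ('pelican', 1)]
```

Only the letters come back sorted. Within a letter the tags keep their input order, which is why the template still runs each group through the “sort” filter.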

JINJA_FILTERS Object

In order to expose these custom filters to Jinja2 through Pelican, the following dictionary needs to be added to the pelicanconf.py file:

JINJA_FILTERS = {
    "drop_cap_css": drop_cap_css,
    "exclude_category": exclude_category,
    "filter_category": filter_category,
    "exclude_tags": exclude_tags,
    "filter_tags": filter_tags,
    "group_alpha": group_alpha
}

[^1]: Gwern even goes so far as to recommend not implementing in Haskell unless you already have experience in it. I have taken their words to heart.

[^2]: I know I should do this elsewhere, but laziness and convenience put it here, and now I don’t want to refactor my entire template structure to change it.