A tool to export Notion pages → Hugo-powered static sites/blogs

This post is about how and why I wrote notion-blog-exporter. If you just want to use it, go here.

I want to write more than I do, and had been noticing that one of the reasons I don’t write that many blog posts is that it’s hard to get started. I host this blog statically using Hugo, and while my current workflow works fine, I write just infrequently enough that I have to relearn it every time. Currently, the workflow goes something like this: pull up a markdown editor, find my local blog folder, look up the command to start hugo in preview mode, point a browser at the local hugo server, and then finally start writing. After looking at vim and writing for a bit, I then have to scroll around in my browser window to re-read my latest changes.

These are obviously the minorest of gripes, but I want to post more frequently, and all of the above adds up to a fair amount of activation energy. Also, this workflow requires me to be at a real computer with vim and hugo installed to update the blog. I’d like to be able to easily update it from my iPad or from someone else’s machine.

I’d also recently found myself using Notion pretty frequently and really enjoying the writing experience there. It hits the sweet spot: WYSIWYG-y enough that you don’t have to mentally render markdown or keep track of matching punctuation marks and brackets (I do enough of that while coding, thank you) but still keyboard-driven and unobtrusive and markdown-like in spirit. And, perhaps most importantly, it has a good markdown export function.

I’d seen that a few libraries implement unofficial APIs for Notion (an official API is theoretically coming soon), so whipping up something to automatically post from Notion to my blog seemed like a fun challenge that might lead to me blogging more. Of the libraries I looked at, notion-py seems to be the furthest along, and I like Python, so I dug in.

The library is well designed and was easy to get started with, but it was missing some key features:

  1. it only has partial markdown export support

  2. it doesn’t support downloading images or embedded files from notion

Luckily, these were both fixable with a bit of scripting. While notion-py automatically translates Notion’s inline formatting (bold, italics, code, and so on), it doesn’t translate block-level formatting, so things like numbered and bulleted lists and quote blocks just come through as normal text. To fix this, I added a few handler methods, one per block type, that each generate the markdown wrapper for that type.
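To make the handler idea concrete, here is a minimal sketch of the approach, not the actual notion-blog-exporter code. It assumes each block has already been reduced to a plain dict with a `type` and an inline-formatted `text`; the type names and all function names here are illustrative, not notion-py's API.

```python
# Sketch: one small handler per block type, each wrapping the block's
# (already inline-formatted) text in the matching markdown syntax.
# The dict shape and type names are assumptions for illustration.

def render_bulleted(block, number):
    return f"- {block['text']}"

def render_numbered(block, number):
    # `number` is the item's position within its run of numbered blocks
    return f"{number}. {block['text']}"

def render_quote(block, number):
    # Prefix every line so multi-line quotes stay inside the blockquote
    return "\n".join(f"> {line}" for line in block["text"].splitlines())

def render_text(block, number):
    return block["text"]

HANDLERS = {
    "bulleted_list": render_bulleted,
    "numbered_list": render_numbered,
    "quote": render_quote,
    "text": render_text,
}

def blocks_to_markdown(blocks):
    rendered = []
    number = 0
    for block in blocks:
        kind = block["type"]
        # Count consecutive numbered items; any other block resets the count
        number = number + 1 if kind == "numbered_list" else 0
        handler = HANDLERS.get(kind, render_text)  # unknown types fall back to plain text
        rendered.append(handler(block, number))
    return "\n\n".join(rendered)
```

Dispatching through a dict keeps each block type's formatting in one place, so supporting a new type is just one more function and one more entry.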

The second one was a bit trickier, as I didn’t really understand how Notion’s API works or how to get at data that notion-py doesn’t expose. First stop was switching over to Chrome’s network panel to watch requests go by. After scrolling through a few images, it became apparent that the page was downloading the images via signed links from Amazon S3. It took me a bit to find it, but all blocks in notion-py expose a .get() method that will return the actual data structure Notion stores on the server side. After some tinkering, I was able to extract the requisite IDs from there and open this Pull Request (not yet merged as I write this) to add a download_file() method to notion-py.

After that, it was just a matter of setting up a cron job to run every morning and check if there is anything new in Notion. If so, it saves it to the hugo folder, and then commits and pushes. From there, Netlify takes over and runs hugo remotely to deploy this article you’re reading now to their CDN.
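The save, commit, and push step can be sketched roughly as follows. This is an illustration under assumptions rather than the script I actually run: the function names, the `slug`-based file naming, and the commit message are all made up for the example, and it assumes the hugo folder is a git repository with a remote already configured.

```python
import subprocess
from pathlib import Path

def save_post(content_dir, slug, markdown):
    """Write the post into the Hugo content folder.

    Returns True if the file is new or its content changed, False otherwise.
    """
    path = Path(content_dir) / f"{slug}.md"
    if path.exists() and path.read_text(encoding="utf-8") == markdown:
        return False  # nothing new in Notion for this post
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
    return True

def publish(repo_dir, message="Update posts from Notion"):
    """Commit and push all changes in repo_dir, if there are any."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    if not status.stdout.strip():
        return False  # clean working tree, skip the commit
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    subprocess.run(["git", "push"], cwd=repo_dir, check=True)
    return True
```

A cron entry would then call a script that runs `save_post` for each exported page, followed by a single `publish`. Checking for changes first means mornings with nothing new produce no empty commits and trigger no deploys.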

This was a fun little project, and if you’re reading this, then it was also a successful one. It’s not entirely done yet: I’ll be adding some more documentation and anything else I think of as I use this more. Also, I’d like to add one more feature. Hugo allows you to add extra fields to the frontmatter of a post, and I’d like to use these to record the block/page ID of the Notion post the article is coming from. This way I can change the title of an article and notion-blog-exporter will still be smart enough to update the correct file.
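Since that feature doesn’t exist yet, the following is only a sketch of how the lookup might work. It assumes YAML frontmatter delimited by `---` lines and a custom field I’m calling `notion_id`; both the field name and the function names are hypothetical. The parser deliberately handles only flat `key: value` lines, which is enough to find an ID.

```python
from pathlib import Path

def read_front_matter(path):
    """Return the flat `key: value` pairs from a ----delimited frontmatter block."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    fields = {}
    if not lines or lines[0].strip() != "---":
        return fields  # file has no frontmatter
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep:
            # Drop surrounding whitespace and optional quotes around the value
            fields[key.strip()] = value.strip().strip("\"'")
    return fields

def find_post_by_notion_id(content_dir, notion_id):
    """Return the path of the post whose frontmatter records notion_id, or None."""
    for path in sorted(Path(content_dir).rglob("*.md")):
        if read_front_matter(path).get("notion_id") == notion_id:
            return path
    return None
```

With this in place, the exporter could look up an existing file by ID before writing, and only fall back to a title-based filename when no match is found.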