Splitting Out Neopolitan's Parser with a New Feature
I designed a markdown[1] replacement. It's called Neopolitan[2].
Then, I built a static site generator[3] to use it. It's called Neopoligen[4].
Neopoligen requires a parser[5] to turn Neopolitan files into web pages. The parser is currently embedded in the generator. That's worked well for a few years. Mostly because I haven't made changes to Neopoligen. Until recently, it's done everything I need.
Speed Builder
Making new sites is my jam. I roll out new subdomains[6] all the time. While Neopoligen handles multiple sites, it doesn't do everything I want. I've grown used to the limitations for this site. Working within them for new sites is constricting. The various frictions led me to make a new static site generator. It's great. The features free me up to make more interesting sites with less effort.
The only problem is Neopolitan.
The new generator can't use it since the parser is embedded in Neopoligen. That means it's time to split them apart.
Tweaking Stability
I've been using Neopolitan for years now. There were several iterations at the start. Differences in the syntax that separate sections, spans, and attributes. The natural process of making a thing, working with it, then improving it.
Those changes stopped a long time ago. I'm not sure when. It's been so long I don't remember. That doesn't matter. The important thing is the format is stable. A good thing since I've got 11,532[7] documents written in it.
I do have a feature to add. It won't break anything. It'll just make content easier to work with.
Section Work
Neopolitan documents are made of sections. A typical document might look like this:
That example has two sections. Each starts with two dashes followed by the section type. -- title and -- todo specifically.
Each type of section belongs to a category that defines how the content inside it is treated. -- title is a generic section that's used for basic text. The -- todo section belongs to the checklist category. Each line that starts with a [] becomes an item on the list. Items can be marked off by adding an "x" like: [x].
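Here's a minimal sketch of that structure in Python. The sample document, the blank lines between its parts, and the function names are my assumptions for illustration. This isn't the real parser.

```python
# Hypothetical sample document. The layout is an assumption based on the
# description above, not a copy of a real Neopolitan file.
SAMPLE = """\
-- title

Welcome to Neopolitan

-- todo

[] Split out the parser
[x] Design the format
"""


def split_sections(text):
    """Group lines into sections, each opened by a line starting with '-- '."""
    sections = []
    for line in text.splitlines():
        if line.startswith("-- "):
            # A new section starts here. The rest of the line is its type.
            sections.append({"type": line[3:].strip(), "lines": []})
        elif sections and line.strip():
            # Non-empty lines belong to the most recent section.
            sections[-1]["lines"].append(line)
    return sections


def parse_checklist(lines):
    """Turn '[] item' and '[x] item' lines into (checked, text) pairs."""
    items = []
    for line in lines:
        if line.startswith("[x]"):
            items.append((True, line[3:].strip()))
        elif line.startswith("[]"):
            items.append((False, line[2:].strip()))
    return items
```

Running `split_sections(SAMPLE)` gives one entry for `title` and one for `todo`. Passing the `todo` lines to `parse_checklist` gives one unchecked item and one checked item.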
The full list of categories is:
- checklist - A collection of items that can be checked off.
- description-list - A way to create HTML Description Lists[8].
- generic - Everything that's not part of another category
- json - For embedding data
- list - Basic lists without numbers
- numbered-list - Ordered lists with numbers
- plugin - A section that calls out to other apps/tools to process the content.
- raw - Content that's passed through the parser without changes (e.g. html, css, javascript, etc...)
Adding new generic sections to the current parser doesn't require any work. Anything that isn't explicitly defined to be another category simply becomes generic.
Defining new section types for other categories is more complex. It requires configuring the parser explicitly. That creates three issues:
- You have to know how to configure the parser.
- You have to have access to the parser to make the changes.
- Sections won't work the same way if different parsers have different configurations.
That last issue is the biggest problem. No one wants to use a tool that behaves differently depending on where you use it.
A Little Help From A Friend
That brings us to the new feature.
Instead of defining new section types in the parser, you define them in the documents themselves. Specifically, you put the name of the category after the name of the section type you're making.
For example, this creates a checklist for a book reading list.
With that feature in place, Neopolitan documents can be shared freely with confidence they'll work the same everywhere.
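As a rough sketch, reading a header with an inline category could work like this. I'm assuming the type and category are separated by whitespace (e.g. `-- reading-list checklist`). The `parse_header` name and the `reading-list` type are made up for the example.

```python
def parse_header(line):
    """Split '-- type [category]' into (type, category or None).

    Assumes whitespace separates the section type from the optional
    category. That separator is an illustration, not a confirmed rule.
    """
    if not line.startswith("-- "):
        raise ValueError("not a section header: " + repr(line))
    parts = line[3:].split()
    if not parts:
        raise ValueError("missing section type")
    section_type = parts[0]
    # No second word means the document didn't name a category.
    category = parts[1] if len(parts) > 1 else None
    return section_type, category
```

With this, `parse_header("-- reading-list checklist")` returns the type and the category together, while `parse_header("-- title")` returns `None` for the category so the parser can fall back to its defaults. The document carries the definition with it, so no per-parser configuration is involved.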
One More Time
The change is relatively minor. The main effort will be splitting the Neopolitan parts out of Neopoligen.
I could update the existing version. But, there's a ton of cruft. It was the first real project I wrote in Rust. It works, but I know way more about what I'm doing now. So, I'm starting with a clean slate. I expect it won't take long, given that this is version 44 and that I really want to start using it.
Stay tuned,
-a
Endnotes
The reason for having different section types in Neopolitan is so they can be treated differently in the output. For example, I use -- aside, -- book-review, and -- youtube sections. Each type gets its own output template. This becomes very powerful when sections include attributes.
For example, my -- book-review sections look like this:
The title and author become independent values. I can put them wherever I want in the output template.
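To show the idea, here's a small Python sketch that pulls attributes out of a section and drops them into a template. The `-- key: value` attribute syntax, the sample values, and the function names are all assumptions for the example.

```python
def parse_section(lines):
    """Split a section into its type, attributes, and body lines.

    Assumes attributes are written as '-- key: value' lines after the
    header. That syntax is an assumption made for this sketch.
    """
    header, *rest = lines
    section = {"type": header[3:].strip(), "attrs": {}, "body": []}
    for line in rest:
        if line.startswith("-- ") and ":" in line:
            key, value = line[3:].split(":", 1)
            section["attrs"][key.strip()] = value.strip()
        elif line.strip():
            section["body"].append(line)
    return section


def render_byline(section):
    """A stand-in output template that places each value independently."""
    attrs = section["attrs"]
    return "<h2>{title}</h2><p>by {author}</p>".format(**attrs)


# Hypothetical section with placeholder values.
REVIEW = [
    "-- book-review",
    "-- title: Example Title",
    "-- author: Example Author",
    "",
    "Notes about the book.",
]
```

Calling `render_byline(parse_section(REVIEW))` puts the title in a heading and the author in its own paragraph. A different template could reorder them or leave one out.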
Each section category includes a limited number of default types (i.e. sections where you don't have to explicitly add the category after the type name). There are also shorthands for the category names. For example, cl can stand in for checklist. So, you can do:
At press time, the complete list of shorthands and default types looks like this:
- checklist/cl: checklist, cl, runbook, todo, todos
- description-list/dl: description-list, dl
- generic: everything that's not part of another category
- json: json, metadata
- list: list, notes
- numbered-list/nl/ol: nl, numbered-list, ol
- plugin: plugin
- raw: cli, code, comment, css, data, html, javascript, markdown, output, raw, result, results, path, pre, text
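That list maps pretty directly onto a lookup table. Here's a Python sketch of one way to resolve a section's category from it. The data comes from the list above. The names and the choice to raise an error on an unknown category are my assumptions.

```python
# Shorthand category names from the list above.
CATEGORY_SHORTHANDS = {
    "cl": "checklist",
    "dl": "description-list",
    "nl": "numbered-list",
    "ol": "numbered-list",
}

# Default section types for each category, from the list above.
DEFAULT_TYPES = {
    "checklist": ["checklist", "cl", "runbook", "todo", "todos"],
    "description-list": ["description-list", "dl"],
    "json": ["json", "metadata"],
    "list": ["list", "notes"],
    "numbered-list": ["nl", "numbered-list", "ol"],
    "plugin": ["plugin"],
    "raw": ["cli", "code", "comment", "css", "data", "html", "javascript",
            "markdown", "output", "raw", "result", "results", "path", "pre",
            "text"],
}

# Flip the table so a section type can be looked up directly.
TYPE_TO_CATEGORY = {
    section_type: category
    for category, types in DEFAULT_TYPES.items()
    for section_type in types
}

CATEGORIES = set(DEFAULT_TYPES) | {"generic"}


def resolve_category(section_type, explicit=None):
    """Pick the category for a section.

    An explicit category from the document wins. Otherwise the default
    table is used, and anything unknown falls back to generic.
    """
    if explicit is not None:
        name = CATEGORY_SHORTHANDS.get(explicit, explicit)
        if name not in CATEGORIES:
            # Assumption: unknown categories are an error, not generic.
            raise ValueError("unknown category: " + repr(explicit))
        return name
    return TYPE_TO_CATEGORY.get(section_type, "generic")
```

So `resolve_category("todo")` lands in `checklist`, `resolve_category("aside")` falls back to `generic`, and `resolve_category("books", "cl")` uses the shorthand from the document itself.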
I could build the Neopolitan parser into the next iteration of Neopoligen. That would limit its usefulness. Splitting it out means I can use it in other projects like syntax highlighters and formatters. It also makes it easier for folks to use it in other projects.
It's probably worth pointing out that the Neopolitan parser doesn't produce HTML output. It builds an Abstract Syntax Tree (aka AST) which other apps/tools then use to generate outputs. That's how Neopoligen will produce the HTML for web pages. Other apps could do different things or use the AST for syntax highlighting, formatting, etc...
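A tiny sketch of that split between parsing and output, in Python. The `Section` node shape and both consumer functions are hypothetical. The real AST will have more to it.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Section:
    """A hypothetical AST node: one parsed section."""
    type: str
    category: str
    lines: list = field(default_factory=list)


def render_html(section):
    """One consumer: turn a node into HTML."""
    if section.category == "list":
        items = "".join(f"<li>{escape(line)}</li>" for line in section.lines)
        return f"<ul>{items}</ul>"
    paragraphs = "".join(f"<p>{escape(line)}</p>" for line in section.lines)
    return f'<section class="{escape(section.type)}">{paragraphs}</section>'


def outline(sections):
    """Another consumer: summarize the same nodes without producing HTML."""
    return [f"{s.type} ({s.category})" for s in sections]
```

Both functions read the same nodes. Neither one needs to know how the document was parsed, which is the point of keeping the parser separate.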
Avoiding different parsers interpreting the same document in different ways is an explicit design goal. I want to do what I can to prevent the Markdown situation where there are multiple parsers that all do things a little differently.
Footnotes
[1] The go-to format for text content in the content management systems preferred by techies.
[2] The text format I designed to replace markdown. It's a little more complex but way more capable.
[3] Static site generator is another name for a website builder. A key feature is that there are places that will host them for free.
[4] The website builder for Neopolitan files. I haven't really released it yet. That'll change when this work is complete.
[5] An easy way to think about parsers is that they read in files in one format in a way that lets you make other formats.
[6] Subdomain websites are like children of a parent website. They have addresses that change the first part of the main site's name. For example, my site's main address is www.alanwsmith.com. One of my subdomains is neopolitan.alanwsmith.com. A bunch of others are at links.alanwsmith.com (which is itself a subdomain).
This is a post I wrote about how awesome they are: How To Grow Your Digital Garden With Subdomains
[7] I love taking notes. I love posting them. My grimoire is how I do both.
[8] A super under-used HTML element. Partially because it's not included in Markdown by default. Each item in the list contains two parts: a term and a description. For example:
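Here's a short Python sketch that builds one from term and description pairs. The function name and the sample pair are mine. The `dl`, `dt`, and `dd` elements are standard HTML.

```python
from html import escape


def render_description_list(pairs):
    """Build an HTML description list from (term, description) pairs."""
    rows = []
    for term, description in pairs:
        rows.append(f"  <dt>{escape(term)}</dt>")
        rows.append(f"  <dd>{escape(description)}</dd>")
    return "<dl>\n" + "\n".join(rows) + "\n</dl>"


print(render_description_list([("AST", "Abstract Syntax Tree")]))
```

Each `dt` holds the term and the `dd` that follows holds its description.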