Writing A Neovim Syntax Highlighter For Neopolitan In Tree-Sitter


I spent the past few days learning how to make a Tree-sitter parser to add neopolitan syntax highlighting to Neovim. It was an adventure, but I'm happy with where I ended up.

Languages And Parsing

By default, a Tree-sitter grammar is written in JavaScript, with regexes describing the tokens. Unfortunately, some regex features (like lookahead matches) aren't available. It has something to do with the way the parser works. I don't know the details. Regardless, I needed lookahead, so I needed another option.

It turns out you can extend Tree-sitter by making custom parsers written in C or C++, two languages I've never used. Some parts of Tree-sitter are written in Rust (which I do know a little), so I was hoping I could find a way to use that. I don't know if it's possible, but if it is, I couldn't figure it out. So I dug into C on the Learn X In Y Minutes site. I didn't really learn the language, but I figured out enough to get a parser working.
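For anyone unfamiliar with the term, here's a rough sketch of what lookahead means in a hand-written scanner. This is a standalone C illustration, not the actual parser and not Tree-sitter's scanner API. The `Cursor` type, the function names, and the rule itself (treating `-- ` followed by a letter as a section marker) are all made up for the example.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A read position in a text buffer (hypothetical type for this sketch). */
typedef struct {
    const char *text;
    size_t len;
    size_t pos;
} Cursor;

Cursor cursor_new(const char *text) {
    Cursor c = { text, strlen(text), 0 };
    return c;
}

/* Look at the character `offset` places ahead without moving the cursor.
 * Returns -1 when that position is past the end of the text. */
int peek(const Cursor *c, size_t offset) {
    if (offset >= c->len - c->pos) {
        return -1;
    }
    return (unsigned char)c->text[c->pos + offset];
}

/* Assumed rule, for illustration only: "-- " followed by a letter
 * starts a section. The cursor is inspected but never advanced. */
bool at_section_marker(const Cursor *c) {
    int next = peek(c, 3);
    return peek(c, 0) == '-'
        && peek(c, 1) == '-'
        && peek(c, 2) == ' '
        && next != -1
        && isalpha(next) != 0;
}
```

Calling `at_section_marker` on a cursor over `"-- title"` returns true and leaves `pos` at 0. That's the whole point: the check inspects upcoming characters without consuming them, so the scanner can still decide what to do with them afterward.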

How's It Look

I'm writing up how I built the parser in another post. For now, here's an example of how it looks in Neovim.

A screenshot of a text document with the title Neopolitan Syntax Highlighting at the top. The background is a dark gray. The majority of the text is white, with different sections showing computer code that's highlighted in multiple colors.

I'm really happy with how this turned out. I've been working on the format and corresponding site generator for a year. I've been using them for a few months, but this syntax highlighting makes it feel real.

It's also very cool because the concrete syntax tree the parser generates can be accessed from other Neovim plugins. That means I can use it to add functionality that executes code blocks and puts the output directly inside the files (similar to org-mode in Emacs).

That's gonna take things to the next level.



The tree-sitter-neopolitan repo

The source code for the project lives in this repo.


tree-sitter-neopolitan README

The README for the parser/highlighter/runner



The plain-text format I made to manage website content. It's a cross between markdown, mdx, and org-mode.



"a parser generator tool and an incremental parsing library"

I'm not sure what all that really means, but I know it can power syntax highlighting. Even better, the syntax tree it makes can be accessed via Neovim Lua plugins. That's how I plan to make code blocks executable directly in the docs.


Creating Tree-sitter Parsers

I spent a bunch of time on this page. For the most part, it's pretty high-quality documentation. A few more code examples would have been nice, but I'd probably say that no matter how many were on the page.


Syntax Highlighting in Tree-sitter

This is where I learned about the "Language Injection" that lets you have HTML (or whatever) highlighted code blocks inside the neopolitan docs.



This is the Neovim plugin I'm using to wire up Tree-sitter. It took a bit to get things configured. I'm writing up a post with those details as well.


Tree-sitter CLI

Creating new parsers is done with the CLI tool. It allows you to generate them and preview the output directly in the terminal. It even has a built-in test runner, which is great. The parser output gets complicated pretty fast. I'm not sure how you could make a parser without a test suite.


Learn X In Y Minutes

This is where I got my feet wet with C and figured out enough to get a parser working. If you already have experience in one language, it seems like a decent approach for learning the basics of a similar one. I certainly wouldn't say I've learned C, but I know the syntax for "if" statements now, and along with the samples on the Tree-sitter site, that was enough to put things together.