Using Rust's 'Result' Approach For Neopolitan's Parser

May 2025

It Came To Me In A Dream

Like, literally.

I was drifting in and out of a nap after a few hours working on Neopolitan's parser^neo. The AST^ast was floating around my head^dreams. The way I'm using an ok/error container to wrap results kept bubbling to the surface.

The structure looks like this when the parser completes a journey through a valid file:

{
  "ok": {
    "blocks": [
      {
        "attrs": {},
        "category": "basic",
        "children": [
          {
            "category": "text-block",
            "kind": "text-block",
            "spans": [
              {
                "category": "text",
                "content": "Visions Of Errors In My Sleep",
                "kind": "text"
              }
            ]
          }
        ],
        "end_block": null,
        "flags": [],
        "kind": "title"
      }
    ]
  }
}

If something goes wrong, the top-level ok turns into error.

{
  "error": {
    "message": "[Relevant Error Message Goes Here]",
    "completed": "[Input the parser made it through]",
    "failed_at": "[Input the parser failed on]"
  }
}

This is straight out of Rust's Result^result playbook. It forces you to check for errors before you can use the data. I love how explicit it is.
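For anyone who hasn't used Rust, here's a rough sketch of what that check looks like in practice. The `Ast` and `ParseError` structs and the toy `parse` function are hypothetical stand-ins that mirror the JSON above. They aren't Neopolitan's real types.

```rust
// Hypothetical stand-ins for the JSON shapes above (not Neopolitan's real types).
#[derive(Debug, PartialEq)]
struct Ast {
    blocks: Vec<String>,
}

#[derive(Debug, PartialEq)]
struct ParseError {
    message: String,
    completed: String,
    failed_at: String,
}

// Toy parser: treats any non-empty input as a single block.
fn parse(source: &str) -> Result<Ast, ParseError> {
    if source.trim().is_empty() {
        return Err(ParseError {
            message: "nothing to parse".to_string(),
            completed: String::new(),
            failed_at: source.to_string(),
        });
    }
    Ok(Ast {
        blocks: vec![source.trim().to_string()],
    })
}

// The caller can't reach `blocks` without handling both arms first.
fn describe(source: &str) -> String {
    match parse(source) {
        Ok(ast) => format!("ok: {} block(s)", ast.blocks.len()),
        Err(error) => format!("error: {}", error.message),
    }
}
```

Calling `describe` on a file with content takes the `Ok` arm. Calling it on an empty string takes the `Err` arm. There's no way to get at the blocks without going through the `match` (or something equivalent to it).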

Catching Round One

My parser^nom sends errors when it hits files it can't parse. For example, two dashes on their own line are invalid.

-- title

This text is fine. 

It's part of a valid ``-- title`` block. 
But, these next two dashes sitting on their
own line aren't allowed.

--

The parser chokes on those. It sends
an error message starting with them
as the ``failed_at`` content.

Those dashes are invalid because they don't have any text behind them that identifies what type of block they're starting. The parser ejects as soon as it hits the issue and throws an error back to the AST.
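Here's a minimal sketch of that behavior, assuming a simple line-by-line scan instead of the real nom-based parser. The `check_block_markers` function and the `ParseError` struct are made up for illustration. The field names come from the error JSON above.

```rust
// Hypothetical error shape borrowed from the JSON earlier in the post.
#[derive(Debug, PartialEq)]
struct ParseError {
    message: String,
    completed: String,
    failed_at: String,
}

// Walk the input line by line and bail at the first `--` line that
// doesn't name a block kind. Plain std only; this is not the nom parser.
fn check_block_markers(source: &str) -> Result<(), ParseError> {
    // Byte offset of the start of the current line.
    let mut offset = 0;
    for line in source.split_inclusive('\n') {
        if line.trim() == "--" {
            return Err(ParseError {
                message: "block marker is missing a kind".to_string(),
                completed: source[..offset].to_string(),
                failed_at: source[offset..].to_string(),
            });
        }
        offset += line.len();
    }
    Ok(())
}
```

Everything before the bare dashes ends up in `completed`. Everything from the dashes onward ends up in `failed_at`.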

That's great. But, it's not everything.

On The Run

Basic Neopolitan blocks of content look like this:

-- title

What's A Parser To Do

You ever dream about error messages?
Yeah, me too...

Each block starts with two dashes followed by the kind of block the content contains (-- title in this case)^kinds.

Blocks can also nest inside each other. The process uses opening and closing lines with / characters.

-- title/


-- /title

Other blocks can slot between them.

-- title/
  
  -- div

  Splitting title content into
  multiple divs feels weird.

  -- div

  But, it's totally valid. And,
  it keeps the examples consistent.


-- /title

This opens the possibility of a new type of error: Opening a block but never closing it.

-- title/

  -- div

  It doesn't matter what goes 
  here because there's no 
  closing ``-- /title``.
  The parsing will fail.

Failure To Track

The parser eats the rest of the file looking for the closing -- /title line. It pukes when it hits the end of the file without finding one.

Unfortunately, the error information that's available when this happens doesn't include where things went off the rails.

That means I can't provide a message that points you to what needs to be fixed. You're left to poke around the file on your own trying to figure out what happened.

It sucks.

Enhanced Vision

That brings us back to the dream.

The ok/error from the top of the AST floated down the tree, bounced around a bit, then melted into the blocks.

Cue the realization that I can use the technique from the top of the file for the blocks too. It would let me identify where errors happen beyond what the parser does on its own. The AST would transform into:

{
  "ok": {
    "blocks": [
      {
        "ok": {
          "attrs": {},
          "category": "basic",
          "children": [
            {
              "category": "text-block",
              "kind": "text-block",
              "spans": [
                {
                  "category": "text",
                  "content": "Everything's Alright",
                  "kind": "text"
                }
              ]
            }
          ],
          "end_block": null,
          "flags": [],
          "kind": "title"
        }
      }
    ]
  }
}
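In Rust terms, that's roughly a list of `Result` values instead of a list of blocks. The sketch below assumes the file has already been split into raw chunks, one per block. The `Block`, `BlockError`, `parse_block`, and `parse_blocks` names are all hypothetical.

```rust
// Hypothetical, trimmed-down block and error shapes.
#[derive(Debug, PartialEq)]
struct Block {
    kind: String,
    content: String,
}

#[derive(Debug, PartialEq)]
struct BlockError {
    message: String,
    failed_at: String,
}

type BlockResult = Result<Block, BlockError>;

// Parse one raw chunk: the first line must be `-- <kind>`.
fn parse_block(raw: &str) -> BlockResult {
    let (first_line, body) = raw.split_once('\n').unwrap_or((raw, ""));
    match first_line.trim().strip_prefix("-- ") {
        Some(kind) if !kind.is_empty() => Ok(Block {
            kind: kind.to_string(),
            content: body.trim().to_string(),
        }),
        _ => Err(BlockError {
            message: "block marker is missing a kind".to_string(),
            failed_at: raw.to_string(),
        }),
    }
}

// Each block carries its own ok/error, so one bad block
// doesn't erase the location of the failure or the good blocks around it.
fn parse_blocks(raw_blocks: &[&str]) -> Vec<BlockResult> {
    raw_blocks.iter().map(|raw| parse_block(raw)).collect()
}
```

With this shape, a file with one good block and one bad block produces one `Ok` and one `Err`, and the `Err` holds the text of the block that failed.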

On To The Pondering

Given that I started writing this as soon as I got up, the approach is barely into the exploration stage. I need to think through how it affects the parser, the output templates, etc.

My mind's bubbling on all that stuff. But, this feels like one of those light bulb moments. Not one where I had an original idea. One where I realized something I learned about elsewhere offers a solution that wouldn't have otherwise occurred to me.

-a

end of line

Endnotes

Neopolitan has two content types: blocks and spans. The post talks about blocks. I expect to apply the same technique to spans. Effectively wrapping everything in ok/error blocks.

{
  "ok": {
    "blocks": [
      {
        "ok": {
          "attrs": {
            "class": [
              {
                "ok": {
                  "category": "text",
                  "content": "ping",
                  "kind": "text"
                }
              }
            ]
          },
          "category": "basic",
          "children": [
            {
              "ok": {
                "category": "text-block",
                "kind": "text-block",
                "spans": [
                  {
                    "ok": {
                      "category": "text",
                      "content": "Ok, Ok, Ok",
                      "kind": "text"
                    }
                  }
                ]
              }
            }
          ],
          "end_block": null,
          "flags": [],
          "kind": "title"
        }
      }
    ]
  }
}

And good eye on you if you noticed that the value of the class attribute is an array of spans that also uses the ok/error wrapper.

Of course, this adds complexity. I'm not worried about it from the parser side. I think I can add end-of-file checks to anything that has open/close parts to deal with things.
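One way an end-of-file check like that could work is to remember the line number of every opening line and report whatever is still open when the input runs out. This is a sketch under that assumption, using a plain line scan and not the nom parser. `UnclosedBlock` and `find_unclosed_block` are hypothetical names.

```rust
// Hypothetical error that records where the unclosed block started.
#[derive(Debug, PartialEq)]
struct UnclosedBlock {
    kind: String,
    opened_on_line: usize, // 1-based line number
}

fn find_unclosed_block(source: &str) -> Result<(), UnclosedBlock> {
    // Stack of (kind, line number) for blocks that are currently open.
    let mut open: Vec<(String, usize)> = Vec::new();
    for (index, line) in source.lines().enumerate() {
        if let Some(rest) = line.trim().strip_prefix("-- ") {
            if let Some(kind) = rest.strip_prefix('/') {
                // Closing line like `-- /title`: pop the matching opener.
                if open.last().map(|(k, _)| k.as_str()) == Some(kind) {
                    open.pop();
                }
            } else if let Some(kind) = rest.strip_suffix('/') {
                // Opening line like `-- title/`: remember where it started.
                open.push((kind.to_string(), index + 1));
            }
        }
    }
    // Anything left on the stack at end of file was never closed.
    // This reports the most recently opened one.
    match open.pop() {
        Some((kind, opened_on_line)) => Err(UnclosedBlock { kind, opened_on_line }),
        None => Ok(()),
    }
}
```

For the unclosed `-- title/` example from the post, this returns an error that points at line 1. That's the detail the current error is missing.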

The main thing to solve for is working with the data in output templates. It would be a bummer to have to add checks into every little template. One possibility is to pre-process the AST to shift everything up. That's not great, but it would work.

I don't think that's going to be necessary. If my mental model is right, you can drop in two files (one for blocks and one for spans) that act as gateways that everything passes through. They'll do the ok/error check and hand off the results to the proper next step.
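The real gateways would live in the template files, but the shape of the idea can be sketched in Rust. Everything here is an assumption for illustration: the `Block` and `BlockError` structs, the `render_block` function, and the HTML it returns.

```rust
// Hypothetical, trimmed-down block and error shapes.
struct Block {
    kind: String,
    content: String,
}

struct BlockError {
    message: String,
}

// Single gateway every block passes through: do the ok/error check once,
// then hand off to the output for that kind of block.
// (No HTML escaping here. It's a sketch, not production output.)
fn render_block(result: &Result<Block, BlockError>) -> String {
    match result {
        Ok(block) => match block.kind.as_str() {
            "title" => format!("<h1>{}</h1>", block.content),
            _ => format!("<div>{}</div>", block.content),
        },
        Err(error) => format!("<div class=\"error\">{}</div>", error.message),
    }
}
```

The individual outputs for each kind never see the wrapper. Only the gateway does.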

The other inspiration I'm taking from Rust is how great the error messages are. I wrote an entire post singing their praises.

The TL;DR is that the folks who work on the compiler treat it as a bug if an error message doesn't give you enough information to fix the problem it's reporting.

It's one of my favorite things about Rust and a goal I've got for Neopolitan.

It's still early thinking, but I'm already kicking around the idea of using numbered error messages like Rust does to provide information about how to solve issues.
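A first pass at numbered errors could be as small as an enum that maps each error to a code and a hint. The `N0001` style codes, the enum variants, and the help text below are all made up for this sketch. Nothing like them exists in Neopolitan yet.

```rust
// Hypothetical error catalog. The codes and wording are placeholders.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ErrorCode {
    MissingBlockKind,
    UnclosedBlock,
}

impl ErrorCode {
    // Stable identifier folks can look up for a longer explanation.
    fn code(self) -> &'static str {
        match self {
            ErrorCode::MissingBlockKind => "N0001",
            ErrorCode::UnclosedBlock => "N0002",
        }
    }

    // Short hint on how to fix the problem.
    fn help(self) -> &'static str {
        match self {
            ErrorCode::MissingBlockKind => "add a block kind after the dashes, such as `-- title`",
            ErrorCode::UnclosedBlock => "add a closing line, such as `-- /title`",
        }
    }
}

fn format_error(code: ErrorCode, line: usize) -> String {
    format!("error[{}] on line {}: {}", code.code(), line, code.help())
}
```

Because `match` has to cover every variant, adding a new error without giving it a code and a hint won't compile.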

The more friction I can remove from the app and processes, the more folks will enjoy using them. That's a goal worthy of spending time on.

Footnotes

neo ⤴

Neopolitan - plain-text for websites

It's like Markdown on steroids.

ast ⤴

Abstract Syntax Trees (AST)

The Neopolitan parser reads the files we humans make and turns them into data that apps can use to build websites. The formal name for that type of data is Abstract Syntax Tree.

dreams ⤴

Yep. I sometimes dream in code. It's weird, but you get used to it.

result ⤴

Rust Error handling with the Result type

Rust's learning curve was pretty steep for me. One thing that took a while to get my head around is how Result works. Basically, when you get data back from something, you have to check if it's ok before you try to use it.

It felt like unnecessary overhead when I first started learning the language. Now, I don't like working without it. (Which, of course, is how this entire post came to be.)

nom ⤴

nom - The Rust parser that eats data byte by byte

Neopolitan's parser is built on top of nom, a Rust-based parser combinator library. (Basically, a collection of little pre-built parsers that I assemble like Lego to build the one for Neopolitan.)

kinds ⤴

Neopolitan has seven primary kinds of content blocks: Basic, Checklist, CSV, JSON, List, Numbered List, and Raw. The kind of block determines how the content inside it is parsed.

Basic sections (like -- title) are just text. JSON blocks get turned into data objects in the AST that are available to use in templates.

Each kind of block gets its own output template. -- title and -- details are both Basic blocks. Using individual templates is what lets one become an <h1> and the other a <details> element in the output.