Using JSON Data for a Test Runner
Preface
I happen to be doing this in Rust. The same approach can be used in most any language.
More Than One
I'm writing test cases for Neopolitan's reference parser[1][2]. There are hundreds of them[3]. Making individual tests for each case is a pain. I ended up building a test runner to handle them in bulk.
It parses a TEST_DATA string that contains the raw material for multiple tests. The tests are split up. Each one is run individually.
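Here's a stripped-down sketch of that runner. The TEST_DATA constant, the Text type, and the parse function it calls all show up in the full example near the end of the post; the details here are simplified.

```rust
#[test]
fn run_tests() {
    // Cases are separated by ###### tokens.
    for case in TEST_DATA.split("######") {
        // Inside a case, the input and the expected JSON output are
        // separated by a ~~~~~~ token.
        let (input, expected) = case.split_once("~~~~~~").unwrap();
        // serde_json turns the expected JSON into the same type the
        // parser produces, so the two can be compared directly.
        let want: Text = serde_json::from_str(expected.trim()).unwrap();
        let got = parse(input.trim());
        assert_eq!(got, want);
    }
}
```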
Data Under Test
TEST_DATA is a big string of test cases. The contents for each case are separated by ###### tokens. Inside each test case, the data to test and the expected output are separated by ~~~~~~ tokens.
Here's an example with two tests in it:
```
``alfa``
~~~~~~
{ "content": "alfa" }
######
``alfa bravo``
~~~~~~
{ "content": "alfa bravo" }
```
The key feature of the run_tests function is using serde_json to convert the JSON strings into the appropriate types for the test.
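On the type side, that only requires deriving Deserialize (plus Debug and PartialEq so assert_eq! can compare and print the values). A minimal Text type, as used in the sketches in this post, looks like this:

```rust
use serde::Deserialize;

// Deserialize lets serde_json build this type straight from the
// expected-output JSON in TEST_DATA.
#[derive(Debug, Deserialize, PartialEq)]
struct Text {
    content: String,
}

fn json_example() {
    let expected: Text =
        serde_json::from_str(r#"{ "content": "alfa" }"#).expect("valid JSON");
    assert_eq!(expected, Text { content: "alfa".to_string() });
}
```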
Original Approach
This is a lot nicer than what I was doing before. It required making a full #[test] for each case.
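Each one looked something like this. The sketch uses the same simplified Text type as above, with parse standing in for the real parser's entry point:

```rust
// One hand-written case: build the expected value by hand and
// compare it to what the parser returns for one specific input.
#[test]
fn code_alfa() {
    let source = "``alfa``";
    let expected = Text {
        content: "alfa".to_string(),
    };
    assert_eq!(parse(source), expected);
}
```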
That individual case is a bit shorter than the code for the runner. But, you have to make one for each test case. The duplication adds up fast.
The Bigger They Are
That really starts to suck with more complicated types. Take one of the more involved parser tests as an example.
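Roughly, with stand-in types (the Span enum, its variants, and the parse_spans function are illustrative, not the real parser's), a hand-written case for it looks like this:

```rust
use std::collections::BTreeMap;
use serde::Deserialize;

// Stand-in types. The "category" field in the JSON picks the enum
// variant, which is what the serde tag attribute expresses.
#[derive(Debug, Deserialize, PartialEq)]
#[serde(tag = "category", rename_all = "lowercase")]
enum Span {
    Text {
        content: String,
    },
    Code {
        attrs: BTreeMap<String, Vec<Span>>,
        flags: Vec<String>,
        spans: Vec<Span>,
    },
}

// Every nested field has to be spelled out by hand. parse_spans is a
// hypothetical entry point for the span parser.
#[test]
fn code_with_attrs_and_flags() {
    let source = "``alfa|bravo: charlie|delta``";
    let expected = Span::Code {
        attrs: BTreeMap::from([(
            "bravo".to_string(),
            vec![Span::Text {
                content: "charlie".to_string(),
            }],
        )]),
        flags: vec!["delta".to_string()],
        spans: vec![Span::Text {
            content: "alfa".to_string(),
        }],
    };
    assert_eq!(parse_spans(source), expected);
}
```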
Compare that to the TEST_DATA contents for the same test:
```
``alfa|bravo: charlie|delta``
~~~~~~
{
  "category": "code",
  "attrs": {
    "bravo": [
      { "category": "text", "content": "charlie" }
    ]
  },
  "flags": ["delta"],
  "spans": [
    { "category": "text", "content": "alfa" }
  ]
}
```
Less complicated. Easier to type. Easier to read.
Specs For Everyone
There's an added bonus. The test cases can be used to validate other parsers.
The Rust test code above is "implementation specific". Meaning, it's how my parser generates its output. Other parsers will work in their own way. That doesn't matter. All they need to do is produce JSON that matches the tests.
A Full Example
Here's what putting everything together in practice looks like:
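The file below is a simplified, self-contained main.rs for illustration. The parse function is a stand-in that just strips the backticks; the real parser does the actual work.

```rust
use serde::Deserialize;

// The single output type in this example. The real parser has more
// categories, but one struct is enough to show the runner.
#[derive(Debug, Deserialize, PartialEq)]
struct Text {
    content: String,
}

// Stand-in parser: strips the surrounding backticks off the input.
fn parse(source: &str) -> Text {
    Text {
        content: source.trim_matches('`').to_string(),
    }
}

fn main() {
    println!("Hello, world!");
}

#[cfg(test)]
mod test {
    use super::*;

    // Three cases separated by ###### tokens. The first two pass.
    // The third fails on purpose.
    const TEST_DATA: &str = r#"``alfa``
~~~~~~
{ "content": "alfa" }
######
``alfa bravo``
~~~~~~
{ "content": "alfa bravo" }
######
``throw here``
~~~~~~
{ "content": "intentional error" }"#;

    #[test]
    fn run_tests() {
        for case in TEST_DATA.split("######") {
            let (input, expected) = case.split_once("~~~~~~").unwrap();
            let want: Text = serde_json::from_str(expected.trim()).unwrap();
            let got = parse(input.trim());
            assert_eq!(got, want);
        }
    }
}
```

Running cargo test against a file like that produces output along these lines: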
```
running 1 test
test test::run_tests ... FAILED

failures:

---- test::run_tests stdout ----
thread 'test::run_tests' panicked at src/main.rs:74:13:
assertion `left == right` failed
  left: Text { content: "throw here" }
 right: Text { content: "intentional error" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    test::run_tests

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```
There are three test cases in TEST_DATA. The first two pass. The last one is an intentional failure to demonstrate the error message shown in the output above.
Notes
- This example uses serde and serde_json to map the data between Rust and JSON. These are added in the Cargo.toml file with:

  ```toml
  [dependencies]
  serde = { version = "1.0.219", features = ["derive"] }
  serde_json = "1.0.140"
  ```

  I expect there are other things you can use. I've never had the need to look beyond serde.

- Running the tests is done with cargo test.

- The example has a main function that prints out Hello, world!. It's not required for the test runner. The only reason it's there is because I did this sample in main.rs. If you don't put a main function in that file the compiler yells at you.

- The output above includes running 1 test and test result: FAILED. 0 passed; 1 failed;. That's because everything happens inside the single run_tests function (i.e. it doesn't matter how many cases are in TEST_DATA, it's always reported as a single test).
Building The Suite
I don't know how many tests will end up in Neopolitan. Wouldn't surprise me if it breaks a thousand. Using this approach makes them easier to build. Add the fact that it provides tests for other parsers and it's hard to beat.
Now, back to writing them.
Footnotes
1. The file format I designed to manage website content. It's like Markdown on steroids.
2. The first one that'll act as the standard to evaluate other parsers with.
3. Giving the same input to different Markdown parsers results in different outputs. My goal is to prevent that with Neopolitan. I'm defining as many specific behaviors as I can think of to support that.