Using JSON Data for a Test Runner
Preface
I happen to be doing this in Rust. The same approach can be used in most any language.
More Than One
I'm writing test cases for Neopolitan's reference parser[1][2]. There are hundreds of them[3]. Making individual tests for each case is a pain. I ended up building a test runner to handle them in bulk.
It parses a TEST_DATA string that contains the raw material for multiple tests. The tests are split up. Each one is run individually.
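Here's a stripped-down sketch of that runner. The TEST_DATA constant, the Text type, and the parse function it calls all show up in the full example near the end of the post; the details here are simplified.

```rust
#[test]
fn run_tests() {
    // Cases are separated by ###### tokens.
    for case in TEST_DATA.split("######") {
        // Inside a case, the input and the expected JSON output are
        // separated by a ~~~~~~ token.
        let (input, expected) = case.split_once("~~~~~~").unwrap();
        // serde_json turns the expected JSON into the same type the
        // parser produces, so the two can be compared directly.
        let want: Text = serde_json::from_str(expected.trim()).unwrap();
        let got = parse(input.trim());
        assert_eq!(got, want);
    }
}
```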
Data Under Test
TEST_DATA is a big string of test cases. The contents for each case are separated by ###### tokens. Inside each test case, the data to test and the expected output are separated by ~~~~~~ tokens.
Here's an example with two tests in it:
```
``alfa``
~~~~~~
{ "content": "alfa" }
######
``alfa bravo``
~~~~~~
{ "content": "alfa bravo" }
```
The key feature of the run_tests function is using serde_json to convert the JSON strings into the appropriate types for the test.
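On the type side, that only requires deriving Deserialize (plus Debug and PartialEq so assert_eq! can compare and print the values). A minimal Text type, as used in the sketches in this post, looks like this:

```rust
use serde::Deserialize;

// Deserialize lets serde_json build this type straight from the
// expected-output JSON in TEST_DATA.
#[derive(Debug, Deserialize, PartialEq)]
struct Text {
    content: String,
}

fn json_example() {
    let expected: Text =
        serde_json::from_str(r#"{ "content": "alfa" }"#).expect("valid JSON");
    assert_eq!(expected, Text { content: "alfa".to_string() });
}
```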
Original Approach
This is a lot nicer than what I was doing before. It required making a full #[test] for each case.
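Each one looked something like this. The sketch uses the same simplified Text type as above, with parse standing in for the real parser's entry point:

```rust
// One hand-written case: build the expected value by hand and
// compare it to what the parser returns for one specific input.
#[test]
fn code_alfa() {
    let source = "``alfa``";
    let expected = Text {
        content: "alfa".to_string(),
    };
    assert_eq!(parse(source), expected);
}
```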
That individual case is a bit shorter than the code for the runner. But, you have to make one for each test case. The duplication adds up fast.
The Bigger They Are
That really starts to suck with more complicated types. Take one of the more involved parser tests as an example.
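Roughly, with stand-in types (the Span enum, its variants, and the parse_spans function are illustrative, not the real parser's), a hand-written case for it looks like this:

```rust
use std::collections::BTreeMap;
use serde::Deserialize;

// Stand-in types. The "category" field in the JSON picks the enum
// variant, which is what the serde tag attribute expresses.
#[derive(Debug, Deserialize, PartialEq)]
#[serde(tag = "category", rename_all = "lowercase")]
enum Span {
    Text {
        content: String,
    },
    Code {
        attrs: BTreeMap<String, Vec<Span>>,
        flags: Vec<String>,
        spans: Vec<Span>,
    },
}

// Every nested field has to be spelled out by hand. parse_spans is a
// hypothetical entry point for the span parser.
#[test]
fn code_with_attrs_and_flags() {
    let source = "``alfa|bravo: charlie|delta``";
    let expected = Span::Code {
        attrs: BTreeMap::from([(
            "bravo".to_string(),
            vec![Span::Text {
                content: "charlie".to_string(),
            }],
        )]),
        flags: vec!["delta".to_string()],
        spans: vec![Span::Text {
            content: "alfa".to_string(),
        }],
    };
    assert_eq!(parse_spans(source), expected);
}
```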
Compare that to the TEST_DATA contents for the same test:
```
``alfa|bravo: charlie|delta``
~~~~~~
{
  "category": "code",
  "attrs": {
    "bravo": [
      { "category": "text", "content": "charlie" }
    ]
  },
  "flags": ["delta"],
  "spans": [
    { "category": "text", "content": "alfa" }
  ]
}
```
Less complicated. Easier to type. Easier to read.
Specs For Everyone
There's an added bonus. The test cases can be used to validate other parsers.
The Rust test code above is "implementation specific". Meaning, it's how my parser generates its output. Other parsers will work in their own way. That doesn't matter. All they need to do is produce JSON that matches the tests.
A Full Example
Here's what putting everything together in practice looks like:
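The file below is a simplified, self-contained main.rs for illustration. The parse function is a stand-in that just strips the backticks; the real parser does the actual work.

```rust
use serde::Deserialize;

// The single output type in this example. The real parser has more
// categories, but one struct is enough to show the runner.
#[derive(Debug, Deserialize, PartialEq)]
struct Text {
    content: String,
}

// Stand-in parser: strips the surrounding backticks off the input.
fn parse(source: &str) -> Text {
    Text {
        content: source.trim_matches('`').to_string(),
    }
}

fn main() {
    println!("Hello, world!");
}

#[cfg(test)]
mod test {
    use super::*;

    // Three cases separated by ###### tokens. The first two pass.
    // The third fails on purpose.
    const TEST_DATA: &str = r#"``alfa``
~~~~~~
{ "content": "alfa" }
######
``alfa bravo``
~~~~~~
{ "content": "alfa bravo" }
######
``throw here``
~~~~~~
{ "content": "intentional error" }"#;

    #[test]
    fn run_tests() {
        for case in TEST_DATA.split("######") {
            let (input, expected) = case.split_once("~~~~~~").unwrap();
            let want: Text = serde_json::from_str(expected.trim()).unwrap();
            let got = parse(input.trim());
            assert_eq!(got, want);
        }
    }
}
```

Running cargo test against a file like that produces output along these lines: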
```
running 1 test
test test::run_tests ... FAILED

failures:

---- test::run_tests stdout ----
thread 'test::run_tests' panicked at src/main.rs:74:13:
assertion `left == right` failed
  left: Text { content: "throw here" }
 right: Text { content: "intentional error" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    test::run_tests

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```
There are three test cases in TEST_DATA. The first two pass. The last one is an intentional failure to demonstrate the error message shown in the output above.
Notes
- This example uses serde and serde_json to map the data between Rust and JSON. These are added in the Cargo.toml file with:

  ```toml
  [dependencies]
  serde = { version = "1.0.219", features = ["derive"] }
  serde_json = "1.0.140"
  ```

  I expect there are other things you can use. I've never had the need to look beyond serde.

- Running the tests is done with cargo test.

- The example has a main function that prints out Hello, world!. It's not required for the test runner. The only reason it's there is because I did this sample in main.rs. If you don't put a main function in that file the compiler yells at you.

- The output above includes running 1 test and test result: FAILED. 0 passed; 1 failed;. That's because everything happens inside the single run_tests function (i.e. it doesn't matter how many cases are in TEST_DATA, it's always reported as a single test).
Building The Suite
I don't know how many tests will end up in Neopolitan. Wouldn't surprise me if it breaks a thousand. Using this approach makes them easier to build. Add the fact that it provides tests for other parsers and it's hard to beat.
Now, back to writing them.
Footnotes
1. The file format I designed to manage website content. It's like Markdown on steroids.
2. The first one that'll act as the standard to evaluate other parsers with.
3. Giving the same input to different Markdown parsers results in different outputs. My goal is to prevent that with Neopolitan. I'm defining as many specific behaviors as I can think of to support that.