
Split A String On A Separator With Escapes In Rust With nom

-- warning

These are getting-started notes.

There's an issue with other escape characters being
picked up that needs to be addressed.

Look at the link section of the snippet.rs file
for Neopolitan for an example of what I ended up with.


-- todo

[] Look at this code to see if it can do what you need directly

-- code
-- rust

escaped_transform(none_of("\\|"), '\\', value("|", tag("|"))),




-- h2


Original Notes To Review


-- note

This works, but the above might be a simpler approach


-- p

This is what I'm using to parse out strings that use ``|``
characters as separators while allowing them to be escaped
with ``\\``


-- code
-- rust

use nom::branch::alt;
use nom::bytes::complete::escaped_transform;
use nom::bytes::complete::tag;
use nom::bytes::complete::take_until;
use nom::character::complete::none_of;
use nom::combinator::eof;
use nom::combinator::rest;
use nom::combinator::value;
use nom::multi::many_till;
use nom::sequence::tuple;
use nom::IResult;
use nom::Parser;

fn main() {
    test1();
    test2();
    println!("done");
}

fn test1() {
    let source = "Lift|the|stone|up|high";
    let expected = vec!["Lift", "the", "stone", "up", "high"];
    let result = split_on_separator_with_escapes(source, "|");
    assert_eq!(expected, result.unwrap().1);
}

fn test2() {
    let source = "Dip\\|the|pail|in\\|the|water";
    let expected = vec!["Dip|the", "pail", "in|the", "water"];
    let result = split_on_separator_with_escapes(source, "|");
    assert_eq!(expected, result.unwrap().1);
}

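fn split_on_separator_with_escapes<'a>(
    source: &'a str,
    separator: &'a str,
) -> IResult<&'a str, Vec<String>> {
    // Characters that end a run of plain text: the backslash plus
    // every character in the separator.
    let mut separator_with_escape = String::from("\\");
    separator_with_escape.push_str(separator);
    // Collect items until the end of the input.
    let (_, items) = many_till(
        alt((
            // An item with escaped separators unescaped, followed by a
            // separator. End of input is accepted here too so an escaped
            // separator in the last item is still unescaped.
            tuple((
                escaped_transform(
                    none_of(separator_with_escape.as_str()),
                    '\\',
                    value(separator, tag(separator)),
                ),
                alt((tag(separator), eof)),
            ))
            .map(|x| x.0.to_string()),
            // Fallback: raw text up to the next separator.
            tuple((take_until(separator), tag(separator))).map(|x: (&str, &str)| x.0.to_string()),
            // Fallback for the last item: whatever is left.
            rest.map(|x: &str| x.to_string()),
        )),
        eof,
    )(source)?;
    // many_till returns (items, eof output); keep only the items.
    Ok(("", items.0))
}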


-- p

Seems like there's probably a simpler or more efficient way to do this.
I got this working though, so I'm rolling with it.
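
-- p

One possibly simpler route, sketched here with only the standard library rather than nom: walk the characters once, treat a backslash directly before the separator as an escape, and leave every other backslash alone. The ``split_with_escapes`` name and the choice to keep other backslashes untouched are assumptions for illustration, not what the Neopolitan code does.

```rust
// Hypothetical helper (not from nom, name is an assumption).
// Splits `source` on `separator`. A backslash directly before the
// separator escapes it; any other backslash is kept as-is.
fn split_with_escapes(source: &str, separator: char) -> Vec<String> {
    let mut items = Vec::new();
    let mut current = String::new();
    let mut chars = source.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' && chars.peek() == Some(&separator) {
            // Escaped separator: keep the separator, drop the backslash.
            current.push(separator);
            chars.next();
        } else if c == separator {
            // Unescaped separator: close the current item.
            items.push(std::mem::take(&mut current));
        } else {
            current.push(c);
        }
    }
    // Whatever is left is the last item (an empty input gives one empty item).
    items.push(current);
    items
}
```

With the two test strings from the code above, ``split_with_escapes(source, '|')`` should give the same items as the nom version, which makes it usable as a baseline to compare against.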


-- ref
-- id: nom
-- title: The nom Parser Combinator Library 
-- url: https://github.com/rust-bakery/nom

"Eating data byte by byte". This is the Rust library I'm using 
to process my Neopolitan documents for my site.