I'm using the nom parser to parse a language. On top of that I'm using nom_supreme for some quality of life improvements (e.g. error handling).

It is going well, but I'm stuck on one puzzle which I'm hoping that someone can help me with.

First, some background: the nom tag function returns a parser that matches a fixed string at the start of the input and consumes it. For example:

    fn parser(s: &str) -> IResult<&str, &str> {
        tag("Hello")(s)
    }

    assert_eq!(parser("Hello, World!"), Ok((", World!", "Hello")));

nom_supreme has a drop-in equivalent with the same name that adds some error handling improvements (mainly that it embeds the tag in the error).

The function signatures are similar (I've reordered some of the types to make them easier to compare):

    // nom_supreme https://github.com/Lucretiel/nom-supreme/blob/f5cc5568c60a853e869c784f8a313fb5c6151391/src/tag.rs#L93
    pub fn tag<T, I, E>(tag: T) -> impl Clone + Fn(I) -> IResult<I, I, E>
    where
        T: InputLength + Clone,
        I: InputTake + Compare<T>,
        E: TagError<I, T>

vs

    // nom https://github.com/rust-bakery/nom/blob/90d78d65a10821272ce8856570605b07a917a6c1/src/bytes/complete.rs#L32
    pub fn tag<T, I, E>(tag: T) -> impl Fn(I) -> IResult<I, I, E>
    where
        T: InputLength + Clone,
        I: Input + Compare<T>,
        E: ParseError<I>

At a superficial level, they work the same. The difference occurs when I use the nom_supreme parser in a closure.

This example with nom compiles:

    pub fn create_test_parser(captured_tag: &str) -> impl FnMut(&str) -> AsmResult<String> + '_ {
        move |i: &str| {
            let captured_tag_parser = nom::bytes::complete::tag(captured_tag);
            let (i, parsed_tag) = captured_tag_parser(i)?;
            Ok((i, String::from(parsed_tag)))
        }
    }

but this example with nom_supreme fails with an error:

    lifetime may not live long enough
    returning this value requires that '1 must outlive 'static

    pub fn create_test_parser(captured_tag: &str) -> impl FnMut(&str) -> AsmResult<String> + '_ {
        move |i: &str| {
            let captured_tag_parser = nom_supreme::tag::complete::tag(captured_tag);
            let (i, parsed_tag) = captured_tag_parser(i)?;
            Ok((i, String::from(parsed_tag)))
        }
    }

I've tried:

  1. Cloning "captured_tag" -> got a "using 'clone' on a double reference" error
  2. captured_tag.to_owned() -> got a "returns a value referencing data owned by the current function" error
  3. Cloning "captured_tag" in outer scope -> same lifetime error
  4. captured_tag.to_owned() in outer scope -> got a "captured variable cannot escape FnMut" error
  5. Using "Arc" -> this works! But why do I need to resort to higher-level memory management when the standard nom tag function works without it?

I feel like I'm missing some way to transfer the ownership of that string into the closure. The string getting passed into create_test_parser is just a string literal, so it shouldn't really have a lifetime tied to the caller.

If you want to play around with it, a stripped down example project is at: https://github.com/NoxHarmonium/nom-parser-stack-overflow-example/tree/main


Best Answer


While I was deconstructing the problem to ask the question I managed to solve my own issue. The solution is in the error message.

returning this value requires that '1 must outlive 'static

I just need to make sure that the captured string has a static lifetime, which is fine for my use case because all the inputs to the function are string literals.

    pub fn create_test_parser(captured_tag: &'static str) -> impl FnMut(&str) -> AsmResult<String> {
        move |i: &str| {
            let captured_tag_parser = nom_supreme::tag::complete::tag(captured_tag);
            let (i, parsed_tag) = captured_tag_parser(i)?;
            Ok((i, String::from(parsed_tag)))
        }
    }

However, this might not solve the problem for everyone, and I'm still unsure what the difference is between the nom version of tag and the nom_supreme version which causes this requirement. I'd love it if someone had a more insightful answer!

Apparently this is obvious, but here is my analysis of this:

    pub fn tag<T, I, E>(tag: T) -> impl Clone + Fn(I) -> IResult<I, I, E>
    where
        T: InputLength + Clone,
        I: InputTake + Compare<T>,
        E: TagError<I, T>

The thing to note here is the bound E: TagError<I, T>. You pass ErrorTree<&'_ str> in as E. ErrorTree<&'_ str> is a type alias for GenericErrorTree<&'_ str, &'static str, &'static str, Box<dyn Error + Send + Sync + 'static>>.

The trait TagError<I, T> is implemented for GenericErrorTree<I, T: AsRef<[u8]>, C, E>. Notice: because of the way that ErrorTree is defined, the T here is inferred to be &'static str. T is also the type that the tag function takes as input, i.e. the type of captured_tag, so it's forced to be &'static str.
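To make that inference chain concrete, here is a minimal std-only sketch. The trait and struct only mimic the shape of nom_supreme's TagError and GenericErrorTree; their bodies, the StaticErrorTree alias, and the parse_static / parse_borrowed helpers are assumptions made up for illustration, not the library's code. It shows how a bound of the form E: TagError<I, T> ties the tag type to whatever the error type stores:

```rust
// Std-only model. These types imitate the shape of the nom_supreme ones;
// they are hypothetical stand-ins, not the library's real definitions.

/// Stand-in for `TagError<I, T>`: an error that can be built from a tag.
trait TagError<I, T> {
    fn from_tag(input: I, tag: T) -> Self;
}

/// Stand-in for `GenericErrorTree`: it stores the tag that failed to match.
#[derive(Debug, PartialEq)]
struct GenericErrorTree<I, T> {
    input: I,
    tag: T,
}

impl<I, T> TagError<I, T> for GenericErrorTree<I, T> {
    fn from_tag(input: I, tag: T) -> Self {
        GenericErrorTree { input, tag }
    }
}

/// Mirrors the default alias: the tag slot is pinned to `&'static str`.
type StaticErrorTree<I> = GenericErrorTree<I, &'static str>;

/// Simplified `tag`: the error type `E` must be able to hold a `T`.
fn tag<'i, T, E>(expected: T) -> impl Fn(&'i str) -> Result<(&'i str, &'i str), E>
where
    T: AsRef<str> + Clone,
    E: TagError<&'i str, T>,
{
    move |input: &'i str| {
        let t: &str = expected.as_ref();
        if input.starts_with(t) {
            Ok((&input[t.len()..], &input[..t.len()]))
        } else {
            Err(E::from_tag(input, expected.clone()))
        }
    }
}

/// With the pinned alias, `T` is inferred as `&'static str`,
/// so `captured` has to be `&'static str` as well.
fn parse_static<'i>(
    captured: &'static str,
    input: &'i str,
) -> Result<(&'i str, &'i str), StaticErrorTree<&'i str>> {
    tag(captured)(input)
}

/// With the tag slot left generic, a shorter-lived tag is accepted.
fn parse_borrowed<'i, 't>(
    captured: &'t str,
    input: &'i str,
) -> Result<(&'i str, &'i str), GenericErrorTree<&'i str, &'t str>> {
    tag(captured)(input)
}
```

With these definitions, parse_borrowed accepts a tag borrowed from a local String. Changing its return type to StaticErrorTree<&'i str> should make the compiler demand a &'static str tag, which mirrors the error in the question.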

Now, I don't know anything about nom_supreme so there might be very good reasons for this to be the case. The docs even say "T is typically something like &'static str or &'static [u8]."

But: you can define your own ErrorTree alias that's generic over T:

    type ErrorTree<I, T> = nom_supreme::error::GenericErrorTree<
        I,
        T,
        &'static str,
        Box<dyn std::error::Error + Send + Sync + 'static>,
    >;

and consequently

    pub type AsmResult<'a, 'b, O> = IResult<&'a str, O, ErrorTree<&'a str, &'b str>>;

and your function signature becomes

    pub fn create_test_parser<'a, 'b>(
        captured_tag: &'a str,
    ) -> impl FnMut(&'b str) -> AsmResult<'b, 'a, String> + 'a

and it works.
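As a rough illustration of why a signature of that shape is accepted, here is a std-only model. ErrorTree below is a simplified two-field stand-in, and the body uses str::strip_prefix instead of nom_supreme's tag, so treat it as a sketch of the lifetimes only and not of the real parser:

```rust
// Std-only model of the signature. `ErrorTree` is a hypothetical stand-in
// that keeps only the input and tag slots of the real error type.

#[derive(Debug, PartialEq)]
struct ErrorTree<I, T> {
    input: I,
    tag: T,
}

/// 'a is the lifetime of the parsed input, 'b the lifetime of the tag.
type AsmResult<'a, 'b, O> = Result<(&'a str, O), ErrorTree<&'a str, &'b str>>;

/// The returned closure borrows `captured_tag` for 'a and parses inputs
/// that live for an unrelated lifetime 'b.
fn create_test_parser<'a, 'b>(
    captured_tag: &'a str,
) -> impl FnMut(&'b str) -> AsmResult<'b, 'a, String> + 'a {
    move |i: &'b str| match i.strip_prefix(captured_tag) {
        // On a match, return the remaining input and an owned copy of the
        // matched prefix.
        Some(rest) => Ok((rest, String::from(&i[..captured_tag.len()]))),
        // On a mismatch, the error carries the tag with its own lifetime.
        None => Err(ErrorTree { input: i, tag: captured_tag }),
    }
}
```

Because the error type names the tag's lifetime explicitly, nothing in the signature forces the tag to be 'static, so a tag borrowed from a local String can be passed in.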

Again, this might break some other nom_supreme API down the pipeline, I don't know, but these lifetimes look and feel more correct to me.