I'm using the nom parser to parse a language. On top of that I'm using nom_supreme for some quality of life improvements (e.g. error handling).
It is going well, but I'm stuck on one puzzle which I'm hoping that someone can help me with.
First, some background: the nom tag function returns a parser that consumes a string. For example:
    fn parser(s: &str) -> IResult<&str, &str> {
        tag("Hello")(s)
    }

    assert_eq!(parser("Hello, World!"), Ok((", World!", "Hello")));
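To make that behaviour concrete without pulling in nom, here is a minimal std-only sketch of what a tag-style parser does. The name tag_sketch and the String error type are assumptions made for illustration; nom's real tag is generic over input and error types and is implemented differently.

```rust
// Simplified stand-in for a `tag`-style parser (hypothetical, not nom's code).
// On success it returns (remaining input, matched prefix), the same ordering
// that nom's IResult uses.
fn tag_sketch<'t>(expected: &'t str) -> impl Fn(&str) -> Result<(&str, &str), String> + 't {
    move |input: &str| match input.strip_prefix(expected) {
        // The matched slice is borrowed from the input, not from the tag.
        Some(rest) => Ok((rest, &input[..expected.len()])),
        // Embed the expected tag in the error message.
        None => Err(format!("expected tag {:?}", expected)),
    }
}
```

Note that the returned closure borrows the tag, which is why the return type carries the `+ 't` bound.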
nom_supreme has a drop-in equivalent with the same name that improves error handling (mainly by embedding the tag in the error).
The function signatures are similar (I've reordered some of the type parameters to make them easier to compare):
    // nom_supreme https://github.com/Lucretiel/nom-supreme/blob/f5cc5568c60a853e869c784f8a313fb5c6151391/src/tag.rs#L93
    pub fn tag<T, I, E>(tag: T) -> impl Clone + Fn(I) -> IResult<I, I, E>
    where
        T: InputLength + Clone,
        I: InputTake + Compare<T>,
        E: TagError<I, T>
vs
    // nom https://github.com/rust-bakery/nom/blob/90d78d65a10821272ce8856570605b07a917a6c1/src/bytes/complete.rs#L32
    pub fn tag<T, I, E>(tag: T) -> impl Fn(I) -> IResult<I, I, E>
    where
        T: InputLength + Clone,
        I: Input + Compare<T>,
        E: ParseError<I>
At a superficial level, they work the same. The difference occurs when I use the nom_supreme parser in a closure.
This example with nom compiles:
    pub fn create_test_parser(captured_tag: &str) -> impl FnMut(&str) -> AsmResult<String> + '_ {
        move |i: &str| {
            let captured_tag_parser = nom::bytes::complete::tag(captured_tag);
            let (i, parsed_tag) = captured_tag_parser(i)?;
            Ok((i, String::from(parsed_tag)))
        }
    }
but this example with nom_supreme fails with an error:

    lifetime may not live long enough
    returning this value requires that '1 must outlive 'static
    pub fn create_test_parser(captured_tag: &str) -> impl FnMut(&str) -> AsmResult<String> + '_ {
        move |i: &str| {
            let captured_tag_parser = nom_supreme::tag::complete::tag(captured_tag);
            let (i, parsed_tag) = captured_tag_parser(i)?;
            Ok((i, String::from(parsed_tag)))
        }
    }
I've tried:
- Cloning "captured_tag" -> got a "using 'clone' on a double reference" error
- captured_tag.to_owned() -> got a "returns a value referencing data owned by the current function" error
- Cloning "captured_tag" in outer scope -> same lifetime error
- captured_tag.to_owned() in outer scope -> got a "captured variable cannot escape FnMut" error
- Using "Arc" -> this works! But why do I need to resort to higher-level memory management when the standard nom tag function works without it?
I feel like I'm missing some way to transfer the ownership of that string into the closure. The string getting passed into create_test_parser is just a string literal, so it shouldn't really have a lifetime tied to the caller.
If you want to play around with it, a stripped down example project is at: https://github.com/NoxHarmonium/nom-parser-stack-overflow-example/tree/main
Best Answer
While I was deconstructing the problem to ask the question I managed to solve my own issue. The solution is in the error message.
returning this value requires that '1 must outlive 'static
I just need to make sure that the string being captured has a static lifetime, which is fine for my use case because all the inputs to the function are string literals.
    pub fn create_test_parser(captured_tag: &'static str) -> impl FnMut(&str) -> AsmResult<String> + '_ {
        move |i: &str| {
            let captured_tag_parser = nom_supreme::tag::complete::tag(captured_tag);
            let (i, parsed_tag) = captured_tag_parser(i)?;
            Ok((i, String::from(parsed_tag)))
        }
    }
However, this might not solve the problem for everyone, and I'm still unsure what the difference is between the nom version of tag and the nom_supreme version which causes this requirement. I'd love it if someone had a more insightful answer!
Apparently this is obvious, but here is my analysis of this:
    pub fn tag<T, I, E>(tag: T) -> impl Clone + Fn(I) -> IResult<I, I, E>
    where
        T: InputLength + Clone,
        I: InputTake + Compare<T>,
        E: TagError<I, T>
The thing to note here is the bound E: TagError<I, T>. You pass ErrorTree<&'_ str> in as E. ErrorTree<&'_ str> is a type alias for GenericErrorTree<&'_ str, &'static str, &'static str, Box<dyn Error + Send + Sync + 'static>>.
The trait TagError<I, T> is implemented for GenericErrorTree<I, T: AsRef<[u8]>, C, E>. Notice that, because of the way ErrorTree is defined, the T here is inferred to be &'static str. T is also the type that the tag function takes as input, i.e. the type of captured_tag, so captured_tag is forced to be &'static str.
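The inference described above can be reproduced with a small std-only model. Everything here (TagErrorModel, ErrorTreeModel, StaticErrorTree, tag_model) is a hypothetical stand-in written for illustration, not nom_supreme's actual code; it only mimics the shape of the E: TagError<I, T> bound and of an alias that pins the tag slot to &'static str.

```rust
// Hypothetical model of a `TagError<I, T>`-style trait: an error type that can
// be built from the failing input and the expected tag.
pub trait TagErrorModel<I, T> {
    fn from_tag(input: I, tag: T) -> Self;
}

// Hypothetical model of a generic error tree, reduced to two type parameters.
#[derive(Debug, PartialEq)]
pub struct ErrorTreeModel<I, T> {
    pub input: I,
    pub expected: T,
}

// The trait's `T` and the struct's `T` are the same parameter, as in the
// implementation described above.
impl<I, T> TagErrorModel<I, T> for ErrorTreeModel<I, T> {
    fn from_tag(input: I, tag: T) -> Self {
        ErrorTreeModel { input, expected: tag }
    }
}

// Like the default alias discussed above: the tag slot is pinned to `&'static str`.
pub type StaticErrorTree<I> = ErrorTreeModel<I, &'static str>;

// Model of a `tag` function whose error bound mentions the tag type `T`.
pub fn tag_model<'i, T, E>(tag: T) -> impl Fn(&'i str) -> Result<(&'i str, &'i str), E>
where
    T: AsRef<str> + Clone,
    E: TagErrorModel<&'i str, T>,
{
    move |input: &'i str| {
        let expected = tag.as_ref();
        match input.strip_prefix(expected) {
            Some(rest) => Ok((rest, &input[..expected.len()])),
            None => Err(E::from_tag(input, tag.clone())),
        }
    }
}

// Compiles: the literal is `&'static str`, which is what the alias demands.
pub fn parse_hello(input: &str) -> Result<(&str, &str), StaticErrorTree<&str>> {
    tag_model("Hello")(input)
}

// Expected to be rejected with a lifetime error, because the error type forces
// `T = &'static str` while `t` only lives for `'t`:
//
// pub fn make_parser<'t>(
//     t: &'t str,
// ) -> impl Fn(&str) -> Result<(&str, &str), StaticErrorTree<&str>> + 't {
//     move |input| tag_model(t)(input)
// }
```

The only place `'static` appears is the alias, yet it constrains the argument of tag_model through the trait bound, which matches the behaviour seen in the question.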
Now, I don't know anything about nom_supreme, so there might be very good reasons for this to be the case. The docs even say "T is typically something like &'static str or &'static [u8]."
But: you can define your own ErrorTree alias that's generic over T:
    type ErrorTree<I, T> = nom_supreme::error::GenericErrorTree<
        I,
        T,
        &'static str,
        Box<dyn std::error::Error + Send + Sync + 'static>,
    >;
and consequently
    pub type AsmResult<'a, 'b, O> = IResult<&'a str, O, ErrorTree<&'a str, &'b str>>;
and your function signature becomes
    pub fn create_test_parser<'a, 'b>(
        captured_tag: &'a str,
    ) -> impl FnMut(&'b str) -> AsmResult<'b, 'a, String> + 'a
and it works.
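As a rough illustration of why these lifetimes fit, here is a std-only sketch with the same signature shape. TreeModel, AsmResultModel and create_test_parser_model are hypothetical stand-ins for GenericErrorTree, AsmResult and create_test_parser, and the closure body uses str::strip_prefix rather than nom_supreme's tag, so this only shows that 'a (the tag) and 'b (the input) are independent and that a non-'static tag is accepted.

```rust
// Hypothetical two-parameter error type: `I` is the input, `T` is the tag.
#[derive(Debug, PartialEq)]
pub struct TreeModel<I, T> {
    pub input: I,
    pub expected: T,
}

// Same shape as the AsmResult alias above: 'a is the input lifetime here,
// 'b is the tag lifetime.
pub type AsmResultModel<'a, 'b, O> = Result<(&'a str, O), TreeModel<&'a str, &'b str>>;

// The tag lives for 'a, the parsed input for 'b; neither has to be 'static.
pub fn create_test_parser_model<'a, 'b>(
    captured_tag: &'a str,
) -> impl FnMut(&'b str) -> AsmResultModel<'b, 'a, String> + 'a {
    move |i: &'b str| match i.strip_prefix(captured_tag) {
        Some(rest) => Ok((rest, String::from(captured_tag))),
        None => Err(TreeModel { input: i, expected: captured_tag }),
    }
}
```

With this shape, the tag can be borrowed from an owned String, for example `let owned = String::from("Hello"); let mut parser = create_test_parser_model(&owned);`, and the error simply carries a reference that lives as long as that String.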
Again, this might break some other nom_supreme API down the pipeline, I don't know, but these lifetimes look and feel more correct to me.