The invisible macro transcriber constraint

2023-09-25 - (3 min read)

Disclaimer: this post is rather small. I don't have time to expand further.

Introduction

I have always described macro rules as a bunch of "tokens to tokens" functions. For instance, rustdoc generates the following documentation for the anyhow! macro:

// https://docs.rs/anyhow/1.0.75/anyhow/macro.anyhow.html
macro_rules! anyhow {
    ($msg:literal $(,)?) => { ... };
    ($err:expr $(,)?) => { ... };
    ($fmt:expr, $($arg:tt)*) => { ... };
}

I read the first rule as a function that generates a piece of AST from a literal. Likewise, I read the second rule as a function that generates a piece of AST from an expression, and so on.

The problem

Let's take a smaller example that better suits this blogpost:

macro_rules! foo {
    ($($id:ident)*) => { ... };
}

I read this rule as a function that generates a piece of AST from zero or more identifiers. In other words, the only repetition constraints are stated in the macro matcher itself.

Well, well, well. WELL.

I was wrong.

It turns out that it is possible to add some repetition constraints in the macro transcriber itself. For instance, we can provide a transcriber that requires at least one identifier to be passed to our foo macro:

macro_rules! foo {
    ($($id:ident)*) => {{
        $( foo!(@discard $id); )+ // <-- ⚠️
    }};
    
    // Don't look at this - it's just a clean way to discard tokens.
    (@discard $tt:tt) => {}
}

By using a + as a repetition operator in the transcription, we added a new constraint (there must be at least one identifier) that is not represented in the macro matcher (and not shown in rustdoc).

Let's test it with different amounts of identifiers:

fn main() {
    foo!();
    foo!(a);
    foo!(a b);
    foo!(a b c);
}

Compiling this fails on the empty invocation, foo!(), with the following error (playground link):

error: this must repeat at least once
 --> src/main.rs:3:10
  |
3 |         $( foo!(@discard $id); )+ // <-- ⚠️
  |          ^^^^^^^^^^^^^^^^^^^^^^^

This shows that my assumption was wrong: not every repetition constraint is stated in the macro matcher. As a consequence, the documentation generated by rustdoc, which only shows the matchers, is not enough to tell whether a given macro invocation satisfies all of a macro's repetition constraints.
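One way to make the constraint visible again is to state it in the matcher instead of the transcriber. Here is a minimal sketch, assuming a hypothetical macro named foo_at_least_one; its body (collecting the identifier names into an array) and the names helper are only there to give the expansion an observable result:

```rust
// Hypothetical variant of `foo!`: the matcher itself uses `+`, so the
// "at least one identifier" rule is part of the visible signature.
macro_rules! foo_at_least_one {
    ($($id:ident)+) => {
        // Same `+` operator in the transcriber as in the matcher.
        [$( stringify!($id) ),+]
    };
}

// Hypothetical helper that exercises the macro with three identifiers.
fn names() -> [&'static str; 3] {
    foo_at_least_one!(a b c)
}
```

With this definition, foo_at_least_one!() should be rejected at the call site because no rule matches, and the + shows up in the matcher that rustdoc displays.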

Closing thoughts (wait no)

This kind of pattern is quite easy to spot. It would be great to have a tool that checks that the repetition operator defined in the macro matcher matches the repetition operator defined in the macro transcriber. A tool with a silly pun in its name, with a huge picture of an American actor in its README.

Community feedback

Somehow this article was reposted to the Rust Zulip, where more experienced people made interesting comments. Here's a summary:

There are other sources of constraints, which are not shown in the macro matcher either. For instance, a macro may expand to a macro call, which may have its own constraints.
The meta_variable_misuse lint does pretty much what I wanted to implement at first. It is not enabled by default because it can lead to false positives and false negatives. More information in rust-lang/rust#61053 (comment).
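Both points can be sketched in a few lines. The macro names inner, outer and foo below are hypothetical, and so is the sum helper. I am also assuming that the lint can be turned on for a single macro definition with an outer attribute; the more common setup is a crate-level #![warn(meta_variable_misuse)].

```rust
// Point 1: a constraint inherited from a nested macro call.
// `inner!` (hypothetical) requires at least one expression.
macro_rules! inner {
    ($($x:expr),+) => { 0 $( + $x )+ };
}

// `outer!` (hypothetical) advertises `*` in its matcher but forwards
// everything to `inner!`, so `outer!()` fails while expanding `inner!()`.
macro_rules! outer {
    ($($x:expr),*) => { inner!($($x),*) };
}

// Point 2: opt in to the lint for one macro definition (assumption:
// the item-level attribute is honored for this lint).
#[warn(meta_variable_misuse)]
macro_rules! foo {
    ($($id:ident)*) => {
        // `+` here versus `*` in the matcher: the mismatch from this post.
        [$( stringify!($id) ),+]
    };
}

// Hypothetical helper showing a call that satisfies the hidden constraint.
fn sum() -> i32 {
    outer!(1, 2, 3)
}
```

If the lint behaves as described in the issue, the definition of foo should get a warning about the mismatched repetition operators, even though non-empty invocations such as foo!(a b) still expand fine. Nothing in the signature of outer hints that outer!() is rejected, and as far as I can tell the lint only looks at a macro's own matcher and transcriber, so it would not flag that case.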