Cracking This Nut

57 Days Until I Can Walk

I’ve made some good progress on the procedural macros workshop today. I think for the first time since I started this little exercise, I am actually feeling reasonably confident about my understanding of macros.

I’ve been able to abstract two simple tasks into declarative macros: generating the Ident for the builder struct, and extracting the named fields from the input struct. It would be nice if there were a simple way to check the types of the inputs to the macro (I am aware of the ty fragment specifier, but that matches a type written out in the source rather than checking the type of an expression, so it’s not what I’m looking for). But I suppose because the macro’s code is substituted directly into the AST at the call site, an argument of the wrong type will still produce an ordinary compile error once the expanded code is type-checked.

macro_rules! extract_struct_fields {
  ($expr:expr) => {
    match &$expr.data {
      Data::Struct(s) => &s.fields,
      Data::Enum(_)   => panic!("Builder not supported on enum type"),
      Data::Union(_)  => panic!("Builder not supported on union type"),
    }
  }
}

macro_rules! generate_builder_ident {
  ($expr:expr) => {
    Ident::new(&format!("{}Builder", $expr.ident), Span::call_site())
  }
}

In both cases I am assuming that the type of the input is DeriveInput. As I say, I’m not sure if there is any better way to enforce this.
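
One option that would enforce the input type is to use plain functions instead of macro_rules!, since function arguments are type-checked at the call site. The following is a minimal sketch of that idea. To keep it runnable without any dependencies, Fields, Data and DeriveInput here are simplified stand-ins (assumptions for illustration, not syn's real definitions), and the builder name is returned as a String rather than a proc_macro2::Ident.

```rust
// Simplified stand-ins for syn::Fields, syn::Data and syn::DeriveInput.
// These are assumptions for illustration, not syn's actual definitions.
struct Fields(Vec<String>);

enum Data {
    Struct(Fields),
    Enum,
    Union,
}

struct DeriveInput {
    ident: String,
    data: Data,
}

// Passing anything other than &DeriveInput is now a type error at the call site.
fn extract_struct_fields(input: &DeriveInput) -> &Fields {
    match &input.data {
        Data::Struct(fields) => fields,
        Data::Enum => panic!("Builder not supported on enum type"),
        Data::Union => panic!("Builder not supported on union type"),
    }
}

// Returns a String here; a real version would build a proc_macro2::Ident.
fn generate_builder_ident(input: &DeriveInput) -> String {
    format!("{}Builder", input.ident)
}
```

With the real syn types, the same two signatures would presumably take &DeriveInput and return &Fields and Ident respectively, with the bodies unchanged from the macros above.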

I’ve gone back and rewritten a lot of my initial builder solution, this time breaking everything up into smaller functions. Earlier in the week, my lack of understanding meant that I was trying to cram everything into a single call to quote!. After taking a step back, I have been able to split the work into smaller, distinct chunks, which I think leads to more readable, easier-to-maintain code.

A few nice, simple patterns have emerged as well. For example, there is some conditional logic involved in assigning the builder fields to the concrete instance’s fields, depending on whether or not a field is optional. There doesn’t seem to be any kind of conditional logic in quote!, so this needs to be handled in Rust. I found that the following little pattern worked quite well: use an iterator to generate an individual TokenStream for each field, depending on how it should be handled, then collect those TokenStreams into a single parent stream which is returned.

fn define_struct_field_assignments(input: &DeriveInput) -> TokenStream {
  let fields = extract_struct_fields!(input);

  let field_assignments = fields.iter().map(|field| {
    let ident = &field.ident;

    if is_optional(&field) {
      quote! { #ident: self.#ident.clone(), }
    } else {
      quote! { #ident: self.#ident.clone().ok_or(stringify!(#ident must be set))?, }
    }
  });

  TokenStream::from_iter(field_assignments)
}

This function can then be called from elsewhere in the macro, and the resulting tokens can be included in a new TokenStream:

fn define_build_method(input: &DeriveInput) -> TokenStream {
  let ident       = &input.ident;
  let field_asmts = define_struct_field_assignments(input);

  quote! {
    pub fn build(&mut self) -> Result<#ident, Box<dyn std::error::Error>> {
      Ok(#ident {
        #field_asmts
      })
    }
  }
}
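
One way to get a feel for this map-then-collect pattern without setting up a proc-macro crate is to mimic it with plain strings. The sketch below is only an analogy: FieldSpec and both functions are hypothetical stand-ins, with String playing the role of TokenStream and format! playing the role of quote!.

```rust
// Hypothetical stand-in for syn::Field: just a name and an "is Option<_>" flag.
struct FieldSpec {
    name: &'static str,
    optional: bool,
}

// Plays the role of the closure passed to `map`: one fragment per field,
// chosen with ordinary Rust control flow.
fn field_assignment(field: &FieldSpec) -> String {
    if field.optional {
        format!("{0}: self.{0}.clone(),", field.name)
    } else {
        format!("{0}: self.{0}.clone().ok_or(\"{0} must be set\")?,", field.name)
    }
}

// Plays the role of TokenStream::from_iter: String also implements
// FromIterator, so the fragments collect into one parent value.
fn field_assignments(fields: &[FieldSpec]) -> String {
    fields.iter().map(field_assignment).collect()
}
```

Calling field_assignments with a required field and an optional field returns both fragments concatenated in field order, which is the shape of what ends up inside the struct literal in build.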

A combination of little patterns like this and better modularization of the code has (of course) made it much easier to extend my code to meet David’s expanding requirements in the workshop. The full solution currently looks as follows. With some improvements to naming conventions, this feels like it is approaching a much better solution than the one I posted previously:

use std::iter::FromIterator; // brings TokenStream::from_iter into scope on editions before 2021

use proc_macro2::{ Span, Ident, TokenStream };
use quote::quote;
use syn::{ parse_macro_input, Data, DeriveInput, Field, Fields, GenericArgument, PathArguments, Type };

macro_rules! extract_struct_fields {
  ($expr:expr) => {
    match &$expr.data {
      Data::Struct(s) => &s.fields,
      Data::Enum(_)   => panic!("Builder not supported on enum type"),
      Data::Union(_)  => panic!("Builder not supported on union type"),
    }
  }
}

macro_rules! generate_builder_ident {
  ($expr:expr) => {
    Ident::new(&format!("{}Builder", $expr.ident), Span::call_site())
  }
}

fn is_optional(field: &Field) -> bool { 
  if let Type::Path(path) = &field.ty {
    let segments = &path.path.segments;
    return segments.iter().any(|segment| {
      let ident = segment.ident.to_string();
      ident == "Option"
    });
  }

  false
}

fn get_setter_arg_type(field: &Field) -> &Type {
  if !is_optional(field) { return &field.ty }

  if let Type::Path(path) = &field.ty {
    for segment in &path.path.segments {
      if let PathArguments::AngleBracketed(bracketed) = &segment.arguments {
        // Match on the first generic argument without unwrapping, so an empty list cannot panic
        if let Some(GenericArgument::Type(ty)) = bracketed.args.first() {
          return ty;
        }
      }
    }
  }

  &field.ty
}

fn define_struct_field_assignments(input: &DeriveInput) -> TokenStream {
  let fields = extract_struct_fields!(input);

  let field_assignments = fields.iter().map(|field| {
    let ident = &field.ident;

    if is_optional(&field) {
      quote! { #ident: self.#ident.clone(), }
    } else {
      quote! { #ident: self.#ident.clone().ok_or(stringify!(#ident must be set))?, }
    }
  });

  TokenStream::from_iter(field_assignments)
}

fn define_build_method(input: &DeriveInput) -> TokenStream {
  let ident       = &input.ident;
  let field_asmts = define_struct_field_assignments(input);

  quote! {
    pub fn build(&mut self) -> Result<#ident, Box<dyn std::error::Error>> {
      Ok(#ident {
        #field_asmts
      })
    }
  }
}

fn define_field_setters(input: &DeriveInput) -> TokenStream {
  let fields = extract_struct_fields!(input);

  let setters = fields.iter().map(|field| {
    let ident = &field.ident;
    let ty    = get_setter_arg_type(&field);

    quote! {
      fn #ident(&mut self, #ident: #ty) -> &mut Self {
        self.#ident = Some(#ident);
        self
      } 
    }
  });

  TokenStream::from_iter(setters)
}

fn define_builder_methods(input: &DeriveInput) -> TokenStream {
  let builder_ident = generate_builder_ident!(input);
  let build         = define_build_method(input);
  let setters       = define_field_setters(input);

  quote! {
    impl #builder_ident {
      #setters
      #build
    }
  }
}

fn define_builder_fields(fields: &Fields) -> TokenStream {
  let field_definitions = fields.iter().map(|field| {
    let ident = &field.ident;
    let ty    = &field.ty;

    if is_optional(&field) {
      quote! { #ident: #ty, }
    } else {
      quote! { #ident: Option<#ty>, }
    }
  });

  TokenStream::from_iter(field_definitions)
}

fn define_builder_struct(input: &DeriveInput) -> TokenStream {
  let builder_ident   = generate_builder_ident!(input);
  let fields          = extract_struct_fields!(input);
  let builder_fields  = define_builder_fields(fields);

  quote! {
    pub struct #builder_ident {
      #builder_fields
    }
  }
}

fn define_builder_field_init(fields: &Fields) -> TokenStream {
  let field_inits = fields.iter().map(|field| {
    let ident = &field.ident;
    quote! { #ident: None, } 
  });

  TokenStream::from_iter(field_inits)
}

fn define_builder_constructor(input: &DeriveInput) -> TokenStream {
  let ident              = &input.ident;
  let builder_ident      = generate_builder_ident!(input);
  let fields             = extract_struct_fields!(input);
  let builder_field_init = define_builder_field_init(&fields);

  quote! {
    impl #ident {
      pub fn builder() -> #builder_ident {
        #builder_ident {
          #builder_field_init
        }
      }
    }
  }
}

#[proc_macro_derive(Builder, attributes(builder))]
pub fn derive(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
  let input = parse_macro_input!(input as DeriveInput);

  let builder_struct      = define_builder_struct(&input);
  let builder_methods     = define_builder_methods(&input);
  let builder_constructor = define_builder_constructor(&input);

  let expanded = quote! {
    #builder_struct
    #builder_methods
    #builder_constructor
  };

  proc_macro::TokenStream::from(expanded)
}
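
To make the output concrete, here is roughly what the derive above would generate for a small struct. Command and its two fields are a hypothetical example, and the expansion is written out by hand from the quote! templates rather than captured from the compiler, so treat it as a sketch of the expected shape.

```rust
// Hypothetical input; in real use it would carry #[derive(Builder)].
pub struct Command {
    executable: String,
    current_dir: Option<String>,
}

// From define_builder_struct: required fields are wrapped in Option,
// fields that are already Option keep their type.
pub struct CommandBuilder {
    executable: Option<String>,
    current_dir: Option<String>,
}

// From define_builder_methods: one setter per field, then build.
impl CommandBuilder {
    // The setter for an Option field takes the inner type (get_setter_arg_type).
    fn executable(&mut self, executable: String) -> &mut Self {
        self.executable = Some(executable);
        self
    }

    fn current_dir(&mut self, current_dir: String) -> &mut Self {
        self.current_dir = Some(current_dir);
        self
    }

    pub fn build(&mut self) -> Result<Command, Box<dyn std::error::Error>> {
        Ok(Command {
            // Required field: a missing value becomes an error.
            executable: self.executable.clone().ok_or(stringify!(executable must be set))?,
            // Optional field: passed through as-is.
            current_dir: self.current_dir.clone(),
        })
    }
}

// From define_builder_constructor: every builder field starts as None.
impl Command {
    pub fn builder() -> CommandBuilder {
        CommandBuilder {
            executable: None,
            current_dir: None,
        }
    }
}
```

With this expansion, calling Command::builder().build() without setting executable returns an error, while leaving current_dir unset is fine and simply yields None.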

That’s all for today folks, but it feels like we’re gradually approaching a (moderate) understanding of how Rust macros work.