Implementing mapply in D

Author: Dr Chibisi Chima-Okereke Created: October 27, 2020 20:20:32 GMT Published: October 28, 2020 02:42:04 GMT

Introduction

The mapply function in R takes a varying number of arguments, each of which is an R (atomic) vector or a list, together with a function. The idea is that a single element is taken from each argument, and together these elements form the input to the function. For instance, the first element is taken from each of the vectors, and this collection of elements becomes the input to the function, which returns an item. This process is repeated over the vector(s) and results in a vector or list of output elements. So for a function \(f(\ldots)\) having \(p\) parameters, and \(p\) argument vectors \(arg_{j}\), where \(j = 1 \ldots p\), element \(i\) in the \(output\) vector is given by:

$$ f(arg_{1}[i], arg_{2}[i], \ldots, arg_{j}[i], \ldots, arg_{p}[i]) \Longrightarrow output[i] $$

Each argument vector \(arg_{j}\) can be of a different type to other arguments, for example argument \(arg_{j}\) could have a different element type to \(arg_{j + 1}\).

In D the closest analogue to R’s vectors is the array, so the terminology will change as we discuss the implementation of mapply in D. Compile time programming concepts are used extensively in this article, so if you are unfamiliar with these, check out an earlier article which covers the requisite knowledge.
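Before building the generic version, it may help to read the formula as a hand-written loop for the special case \(p = 2\). The sketch below is only an illustration: the names label and applyPairwise are hypothetical and are not part of the implementation that follows.

```d
import std.conv : to;

// Hypothetical function f with p = 2 parameters of different types.
string label(string name, int count)
{
    return name ~ ":" ~ to!string(count);
}

// Hand-written equivalent of mapply for exactly two arrays.
// Assumes both arrays have the same length.
string[] applyPairwise(string[] names, int[] counts)
{
    auto output = new string[names.length];
    foreach(i; 0 .. names.length)
    {
        // output[i] = f(arg_1[i], arg_2[i])
        output[i] = label(names[i], counts[i]);
    }
    return output;
}

void main()
{
    auto output = applyPairwise(["a", "b", "c"], [1, 2, 3]);
    assert(output == ["a:1", "b:2", "c:3"]);
}
```

The rest of the article generalises this loop to any number of arrays and any element types.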

Preliminaries

Here we implement a few templates some of which can be found in D’s std.meta library, but they are all pretty easy to implement so they are included here.

First, the AliasSeq!(T...) template, which creates a tuple of arguments that “spread out”, as discussed in the aforementioned article:

alias AliasSeq(T...) = T;
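As a small sketch of what “spreading out” means here, a sequence of values can be passed where separate arguments are expected, and a sequence of types can declare several variables at once. The function add3 is a hypothetical helper used only for this illustration.

```d
import std.stdio : writeln;

alias AliasSeq(T...) = T;

// Hypothetical three-parameter function used to show expansion.
int add3(int a, int b, int c)
{
    return a + b + c;
}

void main()
{
    // A sequence of values expands in place into three separate arguments.
    alias nums = AliasSeq!(1, 2, 3);
    assert(add3(nums) == 6);

    // A sequence of types declares one variable per type.
    alias Types = AliasSeq!(string, double);
    Types vars;
    vars[0] = "Paris";
    vars[1] = 48.864716;
    static assert(is(typeof(vars[0]) == string));
    static assert(is(typeof(vars[1]) == double));
    writeln(vars[0], " ", vars[1]);
}
```

Both uses appear later in mapply: a tuple of variables is filled one slot at a time and then passed to the function as its argument list.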

Then a template to resolve whether a type is an array or not:

enum isArray(T) = false;
enum isArray(U: T[], T) = true;

A template to obtain the element type of an array type:

alias elementType(U: T[], T) = T;
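A few compile-time checks can confirm how these two templates behave. This is a minimal sketch; the types chosen are arbitrary examples.

```d
enum isArray(T) = false;
enum isArray(U: T[], T) = true;

alias elementType(U: T[], T) = T;

// The specialised overload is chosen for dynamic array types.
static assert(isArray!(double[]));
static assert(isArray!(string));      // string is immutable(char)[]
static assert(!isArray!(double));

// elementType strips one level of array from the type.
static assert(is(elementType!(double[]) == double));
static assert(is(elementType!(int[][]) == int[]));
static assert(is(elementType!(string) == immutable(char)));

void main() {}
```

Since every check is a static assert, a successful compilation is the whole test.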

A template to resolve whether all the elements of a type sequence are arrays:

template allArrays(T...)
{
  static if(T.length != 0)
  {
    enum allArrays = isArray!(T[0]) && allArrays!(T[1..$]);
  }else{
    enum allArrays = true;
  }
}
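The recursion can be checked at compile time as follows. This is a sketch with arbitrary example types; note that the empty sequence yields true, which is the base case of the recursion.

```d
enum isArray(T) = false;
enum isArray(U: T[], T) = true;

template allArrays(T...)
{
  static if(T.length != 0)
  {
    enum allArrays = isArray!(T[0]) && allArrays!(T[1..$]);
  }else{
    enum allArrays = true;
  }
}

// Every type in the sequence is an array.
static assert(allArrays!(string[], double[], int[]));
// A single non-array type makes the whole result false.
static assert(!allArrays!(string[], double, int[]));
// Base case: the empty sequence.
static assert(allArrays!());

void main() {}
```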

A template named getElementTuple!(T...) that resolves what the tuple type of arguments for the function would be from a type tuple of array types:

template getElementTuple(T...)
{
  static if(T.length != 0)
  {
    alias getElementTuple = AliasSeq!(elementType!(T[0]), getElementTuple!(T[1..$]));
  }else{
    alias getElementTuple = AliasSeq!();
  }
}
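For instance, the array types used in the demonstration later on map to the element types shown below. This is a compile-time sketch that repeats the templates it needs so that it compiles on its own.

```d
alias AliasSeq(T...) = T;
alias elementType(U: T[], T) = T;

template getElementTuple(T...)
{
  static if(T.length != 0)
  {
    alias getElementTuple = AliasSeq!(elementType!(T[0]), getElementTuple!(T[1..$]));
  }else{
    alias getElementTuple = AliasSeq!();
  }
}

// Three array types in, three element types out, in the same order.
alias E = getElementTuple!(string[], double[], double[]);
static assert(E.length == 3);
static assert(is(E == AliasSeq!(string, double, double)));

// The empty sequence maps to the empty sequence.
static assert(getElementTuple!().length == 0);

void main() {}
```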

The mapply function

The implementation of mapply presented here is not “production quality”, meaning that no effort was made to validate the inputs: it assumes that every array is non-empty and has the same length as the first. It works well for correct cases but is mostly for demonstration purposes.

Firstly the declaration:

auto mapply(alias fun, Args...)(Args args)
if(allArrays!(Args))
{/* code here */}

fun is the function that mapply applies to each set of elements drawn from args. Stepping into the internals of the function we create the first element of the output:

//...
{
  alias E = getElementTuple!(Args);
  E firstArgs;
  static foreach(size_t i; 0..E.length)
  {
    firstArgs[i] = args[i][0];
  }
  auto first = fun(firstArgs);
  alias R = typeof(first);
  size_t n = args[0].length;
  R[] result = new R[n];
  result[0] = first;
  //....
}

In the above code, we use a static foreach to unroll the indexing of the tuple, because each item in a tuple must be accessed with an index known at compile time. We also obtain the return type R of fun and use it to create the return array. We then run fun over all the remaining collected tuples, and return the resulting array:
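The interplay between the compile-time tuple index and the run-time element index can be seen in isolation in the sketch below. The function rowToString is a hypothetical helper written for this illustration only.

```d
import std.conv : to;

// Hypothetical helper: formats element i of every array as one text row.
string rowToString(Args...)(size_t i, Args args)
{
    string text;
    // j is a compile-time constant, so args[j] selects one array statically.
    // Indexing args with a run-time variable would not compile.
    static foreach(size_t j; 0 .. Args.length)
    {
        // i is an ordinary run-time index into the selected array.
        text ~= (j == 0 ? "" : ", ") ~ to!string(args[j][i]);
    }
    return text;
}

void main()
{
    string[] cities = ["Paris", "London"];
    double[] latitudes = [48.5, 51.5];
    assert(rowToString(0, cities, latitudes) == "Paris, 48.5");
    assert(rowToString(1, cities, latitudes) == "London, 51.5");
}
```

The static foreach is expanded once per array at compile time, so the generated code contains one statement per array.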

//...
{
  //...
  if(n > 1)
  {
    for(size_t i = 1; i < n; ++i)
    {
      E arg;
      static foreach(size_t j; 0..E.length)
      {
        arg[j] = args[j][i];
      }
      result[i] = fun(arg);
    }
  }
  return result;
}

That’s basically it. The whole function is given below:

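auto mapply(alias fun, Args...)(Args args)
if(allArrays!(Args))
{
  // E is the tuple of element types, one per input array
  alias E = getElementTuple!(Args);
  // Collect the first element of each array (arrays are assumed non-empty)
  E firstArgs;
  static foreach(size_t i; 0..E.length)
  {
    firstArgs[i] = args[i][0];
  }
  // Call fun once to discover its return type R
  auto first = fun(firstArgs);
  alias R = typeof(first);
  // All arrays are assumed to be as long as the first
  size_t n = args[0].length;
  R[] result = new R[n];
  result[0] = first;
  if(n > 1)
  {
    for(size_t i = 1; i < n; ++i)
    {
      // Gather element i of each array and apply fun to the tuple
      E arg;
      static foreach(size_t j; 0..E.length)
      {
        arg[j] = args[j][i];
      }
      result[i] = fun(arg);
    }
  }
  return result;
}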

The function is simple in essence, and shows how a little compile time programming can go a long way.
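As a quick check, mapply can be called with either a named function or a lambda. The sketch below repeats the templates and mapply so that it compiles on its own, and stores lengths in size_t. The function weighted and the input values are hypothetical, chosen so that the floating point results are exact.

```d
alias AliasSeq(T...) = T;
enum isArray(T) = false;
enum isArray(U: T[], T) = true;
alias elementType(U: T[], T) = T;

template allArrays(T...)
{
  static if(T.length != 0)
    enum allArrays = isArray!(T[0]) && allArrays!(T[1..$]);
  else
    enum allArrays = true;
}

template getElementTuple(T...)
{
  static if(T.length != 0)
    alias getElementTuple = AliasSeq!(elementType!(T[0]), getElementTuple!(T[1..$]));
  else
    alias getElementTuple = AliasSeq!();
}

// Assumes non-empty arrays that are all as long as the first.
auto mapply(alias fun, Args...)(Args args)
if(allArrays!(Args))
{
  alias E = getElementTuple!(Args);
  E firstArgs;
  static foreach(size_t i; 0..E.length)
  {
    firstArgs[i] = args[i][0];
  }
  auto first = fun(firstArgs);
  alias R = typeof(first);
  size_t n = args[0].length;
  R[] result = new R[n];
  result[0] = first;
  for(size_t i = 1; i < n; ++i)
  {
    E arg;
    static foreach(size_t j; 0..E.length)
    {
      arg[j] = args[j][i];
    }
    result[i] = fun(arg);
  }
  return result;
}

// Hypothetical function taking two different element types.
double weighted(double x, int w)
{
  return x * w;
}

void main()
{
  double[] xs = [1.5, 2.0, 4.0];
  int[] ws = [2, 3, 1];
  // Named function over a double[] and an int[].
  assert(mapply!(weighted)(xs, ws) == [3.0, 6.0, 4.0]);
  // Lambda over two double[] arrays.
  double[] ys = [0.5, 0.25, 1.0];
  assert(mapply!((double a, double b) => a + b)(xs, ys) == [2.0, 2.25, 5.0]);
}
```

The return type is inferred from the first call to the function, so the caller never has to name it.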

Demonstration

To demonstrate mapply, we first define a Location struct whose constructor will play the role of fun:

import std.stdio: writeln;
struct Location
{
  string city;
  double latitude;
  double longitude;
  this(string city, double latitude, double longitude)
  {
    this.city = city;
    this.latitude = latitude;
    this.longitude = longitude;
  }
  Location opCall(string city, double latitude, double longitude)
  {
    return Location(city, latitude, longitude);
  }
}

We then run mapply, with Location as the function, on the following:

void main()
{
  auto capitals = AliasSeq!(["New York", "Paris", "London"], 
            [40.730610, 48.864716, 51.509865], 
            [-73.935242, 2.349014, -0.118092]);
  writeln("mapply on capitals: ", mapply!(Location)(capitals));
}

This gives the output:

mapply on capitals: [Location("New York", 40.7306, -73.9352),
                      Location("Paris", 48.8647, 2.34901), 
                      Location("London", 51.5099, -0.118092)]

It’s that simple.

Conclusion

In practice there are very few instances in which a programmer actually requires dynamic behavior from their language. On the face of it, writing a static version of the mapply function with the same functionality as that found in R can seem daunting to those unfamiliar with compile time programming. This article breaks down the steps required to implement such a function and shows that it is actually quite simple to accomplish the task. Programmers unversed in metaprogramming within static languages often assume that many more things can only be done with dynamic languages than is actually the case. As static languages grow in sophistication and semantic power, the number of programming patterns that are impractical to express statically is small and diminishing. New and emerging static programming languages make these powerful semantic features easier to express.

Thank you for reading!