Creating and iterating over tuples in D

Author: Dr Chibisi Chima-Okereke Created: September 11, 2020 08:02:44 GMT Published: September 11, 2020 12:32:24 GMT

What is a tuple?

Tuples in D are a useful, flexible alternative to structs and classes. Some might call them a “poor man’s object”, but they are more than that. They are a convenient and easy to use container for items of differing types. However, they have two important restrictions that the programmer needs to be aware of, or else tuples can seem difficult to work with. The first is that tuples can not be returned directly by functions. The second is that they can not be iterated over like a regular array, because they are not arrays. The items in a tuple can have different types and therefore unequal sizes, so an index into a tuple must be known at compile time, unlike an index into an array. This article shows how to construct and iterate over tuples in a trouble free manner.

For readers new to compile time programming in D, a previous article gives a detailed introduction to the topic if such a reference is required.

Creating a tuple

D has an internal notion of a tuple, but it is not immediately available to the programmer; something extra has to be done before a tuple can be created. To create a tuple, we can either use the std.typecons module in the standard library, which provides an object a bit more complex than a base tuple, or we can brew our own using template functions, structs, classes, or general template expressions. In this article, we will only look at examples that create tuples using template functions and structs.
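For comparison, here is a minimal sketch of the standard library route, which is not pursued further in this article. The function name cityLocation is hypothetical and used only for illustration; Tuple and tuple come from std.typecons. Unlike the internal tuple, the library Tuple is a struct, so it can be returned from a function.

```d
import std.stdio : writeln;
import std.typecons : Tuple, tuple;

// Hypothetical helper: returns a library tuple from a function.
Tuple!(string, double, double) cityLocation()
{
  return tuple("New York", 40.712772, -74.006058);
}

void main()
{
  auto loc = cityLocation();
  // Items are accessed with compile time constant indices.
  writeln(loc[0], ": ", loc[1], ", ", loc[2]);

  // The library tuple also supports named fields.
  auto named = tuple!("city", "pop")("New York", 8_399_000);
  writeln(named.city, " has population ", named.pop);
}
```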

Creating a tuple using a function

The basic notion of type sequences such as AliasSeq!(T...) was discussed in a previous article. This construct actually defines the type of a tuple in D. Just like any other static type, the types in the sequence defining the tuple must be known at compile time. The tuple itself can be runtime or compile time depending on how we instantiate it. Consider the function below:

import std.conv: to;
import std.stdio: writeln;

auto print(T...)(T x)
{
  pragma(msg, "Tuple of type: ", T);
  writeln("Tuple: ", x);
}

void main()
{
  print("New York", "New York", 40.712772, -74.006058, 8_399_000);
}

If we compile and run this, we get the output:

Tuple of type: (string, string, double, double, int)
Tuple: New YorkNew York40.7128-74.00618399000

The first line is a compile time message from pragma(msg, "Tuple of type: ", T), which shows the type sequence of the tuple. The second line is printed at runtime and shows the contents of our tuple. The internal tuple construct is so “primitive” that its fields are not even delimited with commas: the tuple expands into separate arguments to writeln, which prints them one after another. This may be a good thing because it alerts the user that they are dealing with a very primitive construct, and that they should use the standard library for dealing with tuples, or write their own code to access tuple functionality.
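The type sequence reported by pragma(msg) can also be written out explicitly with AliasSeq from std.meta. The sketch below is illustrative only, and the names CityTypes, city and coords are assumptions. It shows a sequence of types being checked at compile time, a runtime tuple declared from that sequence, and a sequence of values that is usable at compile time.

```d
import std.meta : AliasSeq;
import std.stdio : writeln;

// A type sequence: the "type" of a tuple, fixed at compile time.
alias CityTypes = AliasSeq!(string, double, double);

// These properties are checked during compilation.
static assert(CityTypes.length == 3);
static assert(is(CityTypes[1] == double));

void main()
{
  // Declaring a variable of a sequence type gives a runtime tuple.
  CityTypes city;
  city[0] = "New York";
  city[1] = 40.712772;
  city[2] = -74.006058;
  writeln(city[0], " ", city[1], " ", city[2]);

  // A sequence of values can be used at compile time.
  alias coords = AliasSeq!(40.712772, -74.006058);
  static assert(coords[0] > 0 && coords[1] < 0);
}
```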

As mentioned before, tuples can not be returned by functions. If we try amending our print function to:

auto print(T...)(T x)
{
  pragma(msg, "Tuple of type: ", T);
  writeln("Tuple: ", x);
  return x;
}

we get the following error:

Error: functions cannot return a tuple

If we can’t return a tuple from a function, how do we work with tuples? Even though tuples can not be returned directly from functions, we can put a tuple in a struct or class and return that instead.

Creating a tuple using a struct

Below, we implement a row in a data table using a tuple to denote the different types in each column.

import std.conv: to;
import std.stdio: writeln;

template Row(T...)
{
  struct Row
  {
    size_t length()
    {
      return T.length;
    }
    T row; //this is the tuple
    string[T.length] fields;
    this(string[] fields, T row)
    {
      this.row = row;
      this.fields = fields[];
    }
  }
}

Row!(T) makeRow(T...)(string[] fields, T row)
{
  return Row!(T)(fields, row);
}

void main()
{
  auto ny = makeRow(["City", "State", "Lat", "Long", "Pop"], 
          "New York", "New York", 40.712772, -74.006058, 8_399_000);
  writeln(ny);
}

Compiling and running the above code gives the output:

Row!(string, string, double, double, int)("New York", "New York", 40.7128, -74.0061, 8399000, ["City", "State", "Lat", "Long", "Pop"])

which is a much nicer and more informative output than the internal tuple print. Note that Row!(T) makeRow(T...)(string[] fields, T row){/* ... code ... */} is an easy alternative constructor interface: the template arguments of a struct or class are not inferred from its constructor arguments, but those of a function template are.

Creating tuples using classes is done with essentially the same pattern as with structs, and this will not be shown in this article.
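For readers who would still like to see it, the following is a rough sketch of what the class version might look like. The names RowObject and makeRowObject are hypothetical, and the layout simply mirrors the struct above; the main difference is that the helper function allocates the object with new.

```d
import std.stdio : writeln;

// Hypothetical class counterpart of the Row struct.
class RowObject(T...)
{
  enum size_t length = T.length;
  T row;                     // this is the tuple
  string[T.length] fields;
  this(string[] fields, T row)
  {
    this.row = row;
    this.fields = fields[];
  }
}

// Helper function so that T is inferred from the arguments.
RowObject!(T) makeRowObject(T...)(string[] fields, T row)
{
  return new RowObject!(T)(fields, row);
}

void main()
{
  auto ny = makeRowObject(["City", "Pop"], "New York", 8_399_000);
  writeln(ny.fields[0], ": ", ny.row[0]);
  writeln(ny.fields[1], ": ", ny.row[1]);
}
```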

Iterating over a tuple

At compile time, iteration can be done using a static foreach loop, recursive templates, or CTFE, and all the data created are constants. Since this style is always used at compile time, it is consistent. Iterating over a tuple at runtime is less straightforward because, though the instance of the tuple is mutable, its indexing is fixed and its items must be accessed using constant indices. This is the case we will be focusing on here, and we shall show that you actually need to use a static foreach loop rather than a regular for/foreach loop over a runtime index.
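As a brief illustration of the compile time styles mentioned above, the sketch below iterates over a type sequence in two ways: with a recursive template, and with a static foreach inside a function that is evaluated through CTFE. The names countFloats and typeNames are made up for this example.

```d
import std.traits : isFloatingPoint;

// Recursive template: counts the floating point types in a sequence.
template countFloats(T...)
{
  static if (T.length == 0)
    enum size_t countFloats = 0;
  else
    enum size_t countFloats = isFloatingPoint!(T[0]) + countFloats!(T[1 .. $]);
}

// static foreach inside a function: builds a comma separated list of type names.
string typeNames(T...)()
{
  string names;
  static foreach (i; 0 .. T.length)
  {
    names ~= (i == 0 ? "" : ", ") ~ T[i].stringof;
  }
  return names;
}

// Both results are computed entirely during compilation.
static assert(countFloats!(string, string, double, double, int) == 2);
static assert(typeNames!(int, double)() == "int, double");

void main() {}
```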

Let’s say I wanted to write my own print function for the above Row!(T...) type. I could define a toString(...) method for it, but it is more instructive to look at it as a separate function altogether, which we can name print. Below is a naive implementation of a print template function:

auto print(I: Row!(T), T...)(I x)
{
  string repr = "(";
  for(size_t i = 0; i < x.length - 1; ++i)
  {
    repr ~= `"` ~ x.fields[i] ~ `" => ` ~ to!(string)(x.row[i]) ~ `, `;
  }
  enum last = T.length - 1;
  repr ~= `"` ~ x.fields[last] ~ `" => ` ~ to!(string)(x.row[last]) ~ `)`;
  writeln("Row: ", repr);
}

We can then add print(ny); to the main function, getting the error:

Error: variable i cannot be read at compile time

At first glance, this message can be a little confusing. Our tuple x is clearly a runtime variable, so why is D indicating that the index i needs to be known at compile time? This is because the tuple in x is not an array and can not be indexed using a runtime i, at least not in the version of D used here (DMD64 D Compiler v2.090.1). We can remedy this using the static foreach compile time range construct:

auto print(I: Row!(T), T...)(I x)
{
  string repr = "(";
  static foreach(i; 0..(T.length - 1))
  {
    repr ~= `"` ~ x.fields[i] ~ `" => ` ~ to!(string)(x.row[i]) ~ `, `;
  }
  enum last = T.length - 1;
  repr ~= `"` ~ x.fields[last] ~ `" => ` ~ to!(string)(x.row[last]) ~ `)`;
  writeln("Row: ", repr);
}

which gives the required output:

Row: ("City" => New York, "State" => New York, "Lat" => 40.7128, "Long" => -74.0061, "Pop" => 8399000)

This resolves the situation because i is now known at compile time and our row object x is still runtime, which in this case is what we want. The static foreach loop here essentially does a loop unroll of the instructions so that for instance at i = 0, the code reads as:

repr ~= `"` ~ x.fields[0] ~ `" => ` ~ to!(string)(x.row[0]) ~ `, `;

and so on.
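To make the unrolling concrete, the sketch below places a static foreach loop next to a hand written version of what it expands to for three arguments. Both function names, joinLoop and joinUnrolled, are hypothetical helpers introduced only for this comparison.

```d
import std.conv : to;

// Hypothetical helper: joins the items of a tuple using static foreach.
string joinLoop(T...)(T x)
{
  string s;
  static foreach (i; 0 .. T.length)
  {
    s ~= to!(string)(x[i]) ~ ";";
  }
  return s;
}

// What the loop above amounts to for three arguments, written by hand.
string joinUnrolled(A, B, C)(A a, B b, C c)
{
  string s;
  s ~= to!(string)(a) ~ ";";
  s ~= to!(string)(b) ~ ";";
  s ~= to!(string)(c) ~ ";";
  return s;
}

void main()
{
  // The loop and the hand unrolled version build the same string.
  assert(joinLoop("NY", 1, 2.5) == joinUnrolled("NY", 1, 2.5));
}
```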

For objects based on tuples, the current length method:

size_t length()
{
  return T.length;
}

can be redefined as:

enum size_t length = T.length;

this means that we can change the static foreach loop in the print function to:

auto print(I: Row!(T), T...)(I x)
{
  string repr = "(";
  static foreach(i; 0..(x.length - 1))
  {
    repr ~= `"` ~ x.fields[i] ~ `" => ` ~ to!(string)(x.row[i]) ~ `, `;
  }
  enum last = T.length - 1;
  repr ~= `"` ~ x.fields[last] ~ `" => ` ~ to!(string)(x.row[last]) ~ `)`;
  writeln("Row: ", repr);
}

which works because x.length is an enum known at compile time.
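The toString route mentioned earlier can be sketched along the same lines. The version below is only one possible arrangement, assuming the same output format as print and the enum form of length; writeln then picks up toString automatically.

```d
import std.conv : to;
import std.stdio : writeln;

struct Row(T...)
{
  enum size_t length = T.length;
  T row;                     // this is the tuple
  string[T.length] fields;
  this(string[] fields, T row)
  {
    this.row = row;
    this.fields = fields[];
  }
  // Assumption: same formatting as the print function above.
  string toString() const
  {
    string repr = "(";
    static foreach (i; 0 .. length)
    {
      repr ~= `"` ~ fields[i] ~ `" => ` ~ to!(string)(row[i]);
      repr ~= (i + 1 < length) ? ", " : ")";
    }
    // An empty row never enters the loop, so close the bracket here.
    static if (length == 0) repr ~= ")";
    return repr;
  }
}

Row!(T) makeRow(T...)(string[] fields, T row)
{
  return Row!(T)(fields, row);
}

void main()
{
  auto ny = makeRow(["City", "Pop"], "New York", 8_399_000);
  writeln("Row: ", ny);
}
```

Here the separator is chosen inside the loop by comparing the compile time index against length, so the last item no longer needs to be handled separately after the loop.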

Summary

Tuples in D are a useful and flexible tool, but their restrictions can lead to confusion concerning how to make use of them in routine programming tasks. In this article, we presented useful techniques for getting around these issues by showing how tuples can be created and returned, and how to iterate over them to make programming with tuples in D a straightforward task.