The D programming language semantics lends itself to building a wide range of application types, one of these is technical and scientific computing. However, the number of packages available for numerical computing is low. This article demonstrates how to create a basic numerical differentiation scheme in D using finite differences to calculate first and second partial derivatives when given an expression. We use template functions and some basic template declarations in this article, so if you are unfamiliar with these concepts please visit a previous article that covers them.
In this session, we need to define a number of items before we start writing finite difference code. Our examples use the following mathematical functions:
import core.stdc.math: sin = sinf, sin, sin = sinl,
cos = cosf, cos, cos = cosl,
pow = powf, pow, pow = powl;
We use the C-library math functions in the D compiler since they often perform better than their Phobos D counterparts. Define \(\pi\) as:
enum pi = 3.14159265358979323846;
In this scheme we will treat single and double precision (and above) separately. In fact, only the difference implementation for double
precision and above will be presented. Below we have some basic traits to differentiate the various relevant floating points.
enum isFloat(T) = is(T == float);
enum isHighFloat(T) = is(T == double) || is(T == real);
enum isFloatingPoint(T) = is(T == float) || is(T == double) || is(T == real);
Finite difference uses steps \(h\), which is different for single float compared with more precise float types. Some references use a value of the square root of machine epsilon for double float type and others use the cube root of the machine epsilon. Below is our step
implementation of the step function for double and higher precision:
template step(T)
if(isHighFloat!(T))
{
enum T step = (T.epsilon)^^0.5;
}
For single precision, the step
implementation below needs to be multiplied by the function parameter \(x\) before it becomes the actual step:
template step(T)
if(is(T == float))
{
enum T step = T.epsilon^^(0.5f);
}
For more details see the wikipedia article on numerical differentiation.
Many different books on Numerical Analysis discuss finite different schemes. The difference equations given here are adapted from chapter 8 of Numerical methods for engineers and scientists, 3rd Edition, by Amos Gilat, and Vish, Subramaniam.
The simple univariate equation to numerically evaluate the two-point central difference derivative for function \(f(x)\)at \(x = x^{(k)}\) for the first derivative is given by:
$$ \frac{\partial f(x)}{\partial x} \bigg|_{x = x^{(k)}} \approx \frac{f(x^{(k + 1)}) - f(x^{(k - 1)})}{2h} $$
where \(x^{(k + 1)} = x^{(k)} + h\) and \(x^{(k - 1)} = x^{(k)} - h\).
For the partial derivative of a single component of a multivariable \(\textbf{x}^{(k)} = \left(x_{1}^{(k)}, x_{2}^{(k)}, \ldots, x_{i}^{(k)}, \ldots, x_{n}^{(k)}\right)\), where the subscript \(i\) denotes the position in the array and the superscript \(k\) denotes the value \(x = x^{(k)}\), we then have:
$$ \frac{\partial f(\textbf{x})}{\partial x_{i}} \bigg |_{\textbf{x} = \textbf{x}^{(k)}} \approx \frac{f(x_{1}^{(k)}, x_{2}^{(k)}, \ldots, x_{i}^{(k + 1)}, \ldots, x_{n}^{(k)}) - f(x_{1}^{(k)}, x_{2}^{(k)}, \ldots, x_{i}^{(k - 1)}, \ldots, x_{n}^{(k)})}{2h} $$
The D code for carrying out this numerical partial derivative is given below:
auto ND(alias fun, size_t order, T, U...)(T[] x, U u)
if(isHighFloat!(T) && (order == 1))
{
enum T h = step!(T); //evaluated at compile time
auto df = new T[x.length];
for(long i = 0; i < df.length; ++i)
{
auto xp = x.dup;
auto xm = x.dup;
xp[i] += h;
xm[i] -= h;
df[i] = (fun(xp, u) - fun(xm, u))/(2*h);
}
return df;
}
In the above code, we pass the function fun
as a compile time item, and we constraint the order
to be 1
for the first derivative, and T
to be one of double
(64 bit) or real
(80 bit) floats.
Let’s look at another example, a more accurate scheme than the above two point system. For brevity lets change notation so that \(f(x_{i}^{(k)}) = f(x_{1}^{(k)}, x_{2}^{(k)}, \ldots, x_{i}^{(k)}, \ldots, x_{n}^{(k)})\), the four-point central finite difference for partial differentiation, is given by:
$$ \frac{\partial f(\textbf{x})}{\partial x_{i}} \bigg |_{\textbf{x} = \textbf{x}^{(k)}} \approx \frac{f(x_{i}^{(k - 2)}) - 8f(x_{i}^{(k - 1)}) + 8f(x_{i}^{(k + 1)}) - f(x_{i}^{(k + 2)})}{12h} $$
so an alternative implementation of the first order partial difference formula is:
auto ND(alias fun, size_t order, T, U...)(T[] x, U u)
if(isHighFloat!(T) && (order == 1))
{
enum T h = step!(T);
auto df = new T[x.length];
for(long i = 0; i < df.length; ++i)
{
// Written this way for clarity
auto x2m = x.dup;
auto x1m = x.dup;
auto x1p = x.dup;
auto x2p = x.dup;
x2p[i] += 2*h;
x2m[i] -= 2*h;
x1p[i] += h;
x1m[i] -= h;
df[i] = (fun(x2m, u) - 8*fun(x1m, u) + 8*fun(x1p, u) - fun(x2p, u))/(12*h);
}
return df;
}
We shall present one scheme for evaluating numerical second order partial derivatives. This scheme uses three points for the diagonal elements and four points for the off-diagonal elements. Also we shall use D’s nested arrays as the matrix output in this exercise (for brevity of implementation).
The difference equation for the diagonal (\(i\)) elements of the curvature (Hessian) matrix is given by:
$$ \frac{\partial f^{2}(\textbf{x})}{\partial x_{i}^2} \bigg |_{\textbf{x} = \textbf{x}^{(k)}} \approx \frac{f(x_{i}^{(k - 1)}) - 2f(x_{i}^{(k)}) - f(x_{i}^{(k + 1)})}{h_i^2} $$
the off-diagonal (\(i, j\)) terms are given by:
$$ \frac{\partial f^{2}(\textbf{x})}{\partial x_{i}\partial x_{j}} \bigg |_{\textbf{x} = \textbf{x}^{(k)}} \approx \frac{[f(x_{i}^{(k + 1)}, x_{j}^{(k + 1)}) - f(x_{i}^{(k - 1)}, x_{j}^{(k + 1)})] - [f(x_{i}^{(k + 1)}, x_{j}^{(k - 1)}) - f(x_{i}^{(k - 1)}, x_{j}^{(k - 1)})]}{2h_{i} 2h_{j}} $$
which is implemented below:
auto ND(alias fun, size_t order, T, U...)(T[] x, U u)
if(isHighFloat!(T) && (order == 2))
{
enum T h = step!(T)^^0.5;
enum T hsq = h*h;
auto curv = new T[][](x.length, x.length);
for(long i = 0; i < x.length; ++i)
{
for(long j = i; j < x.length; ++j)
{
if(i == j)
{
auto xp = x.dup; auto xm = x.dup;
xp[i] += h; xm[i] -= h;
auto f1m = fun(xm, u);
auto f1p = fun(xp, u);
auto f = fun(x, u);
auto tmp = (f1m - 2*f + f1p)/hsq;
curv[i][j] = tmp;
}else{
auto x1 = x.dup;
x1[i] += h;
x1[j] += h;
auto x2 = x.dup;
x2[i] -= h;
x2[j] += h;
auto x3 = x.dup;
x3[i] += h;
x3[j] -= h;
auto x4 = x.dup;
x4[i] -= h;
x4[j] -= h;
auto f1 = fun(x1, u);
auto f2 = fun(x2, u);
auto f3 = fun(x3, u);
auto f4 = fun(x4, u);
auto tmp = ((f1 - f2) - (f3 - f4))/(4*hsq);
curv[i][j] = tmp;
curv[j][i] = tmp;
}
}
}
return curv;
}
In this first example, we look at the first and second partial differentials for the equation:
$$ f(\textbf{x}) = 50 - x_{0}^{2} - 2x_{1}^{2} $$
the first order partial derivative is given by the equation:
$$ \frac{\partial f(x)}{\partial x_0} = -2x_{0}; \qquad \frac{\partial f(x)}{\partial x_1} = -4x_{1} $$
the second order partial derivative is given by the equation:
$$ \frac{\partial^{2} f(x)}{\partial x_0^2} = -2; \qquad \frac{\partial^{2} f(x)}{\partial x_{0} \partial x_{1}} = 0 $$
$$ \frac{\partial^{2} f(x)}{\partial x_{1}\partial x_{0}} = 0; \qquad \frac{\partial^{2} f(x)}{\partial x^{2}_{1}} = -4 $$
The code for the above example is given below:
double[] point = [10.0, 10];
auto f = (double[] x) => 50 - x[0]*x[0] - 2*x[1]*x[1]; //this is lambda in D
auto df = ND!(f, 1)(point);
writeln("First (partial) derivative: ", df);
auto cf = ND!(f, 2)(point);
writeln("Second (partial) derivative: ", cf);
output:
First (partial) derivative: [-20, -40]
Second (partial) derivative: [[-2, 0], [0, -4]]
From the above equations, clearly, the first order derivatives at [10.0, 10.0]
is given by:
$$ \frac{\partial f(\textbf{x})}{\partial x_0} = -20; \qquad \frac{\partial f(\textbf{x})}{\partial x_1} = -40 $$
clearly the output of second derivative compares well with the analytical values.
The equation to be differentiated in this second example is given below:
$$ f(\textbf{x}) = x_{0}\sin{x_{1}} + 1 $$
meaning that the first order partial differential is given by:
$$ \frac{\partial f(\textbf{x})}{\partial x_{0}} = \sin{x_{1}}; \qquad \frac{\partial f(\textbf{x})}{\partial x_{1}} = x_{0}\cos{x_{1}} $$
and the curvature matrix is given by:
$$ \frac{\partial^{2} f(\textbf{x})}{\partial x_0^2} = 0; \qquad \frac{\partial^{2} f(\textbf{x})}{\partial x_{0} \partial x_{1}} = \cos{x_{1}} $$
$$ \frac{\partial^{2} f(\textbf{x})}{\partial x_{1}\partial x_{0}} = \cos{x_{1}}; \qquad \frac{\partial^{2} f(\textbf{x})}{\partial x^{2}_{1}} = -x_{0}\sin{x_{1}} $$
The code for example 2 is given below:
point = [pi/2, pi/2];
f = (double[] x) => x[0]*sin(x[1]) + 1;
df = ND!(f, 1)(point);
cf = ND!(f, 2)(point);
writeln("First (partial) derivative: ", df);
writeln("Second (partial) derivative: ", cf);
which gives the output:
First (partial) derivative: [1, -4.96705e-09]
Second (partial) derivative: [[0, 0], [0, -1.5708]]
From the above equations, clearly, the first order derivatives at \([\frac{\pi}{2}, \frac{\pi}{2}]\) is given by:
$$ \frac{\partial f(\textbf{x})}{\partial x_0} = 1; \qquad \frac{\partial f(\textbf{x})}{\partial x_1} = 0 $$
and the curvature matrix is given by:
$$ \frac{\partial^{2} f(\textbf{x})}{\partial x_0^2} = 0; \qquad \frac{\partial^{2} f(\textbf{x})}{\partial x_{0} \partial x_{1}} = 0 $$
$$ \frac{\partial^{2} f(\textbf{x})}{\partial x_{1}\partial x_{0}} = 0; \qquad \frac{\partial^{2} f(\textbf{x})}{\partial x^{2}_{1}} = -\frac{\pi}{2} $$
which is (approximately) what we have in the numerical case.
The equation to be differentiated in this second example is given below:
$$ f(x) = cos(2\pi x_{0})\cos(2\pi x_{1})\cos(2 \pi x_{2})\cos(10(x_{0} + x_{1} + x_{2})) $$
The analytical equations of the gradient and curvature are too long to recite here (this is not a thesis, just a tutorial), but their code and outputs will be presented along with those of finite difference.
The code to calculate the finite difference for example 3 is given below:
point = [0.2, 0.5, 0.1];
f = (double[] x) => cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2]));
df = ND!(f, 1)(point);
writeln("First (partial) derivative: ", df);
cf = ND!(f, 2)(point);
writeln("Second (partial) derivative: ", cf);
which gives the output:
First (partial) derivative: [1.76999, 2.4734, 2.30734]
Second (partial) derivative: [[-100.733, -51.4672, -59.5472],
[-51.4672, -5.07351, -14.9286],
[-59.5472, -14.9286, -27.6556]]
The code to calculate the analytical first order partial derivative:
auto df1 = (double[] x) => -10*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) - 2*pi*sin(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2]));
auto df2 = (double[] x) => -10*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) - 2*pi*sin(2*pi*x[1])*cos(2*pi*x[0])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2]));
auto df3 = (double[] x) => -10*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) - 2*pi*sin(2*pi*x[2])*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(10*(x[0] + x[1] + x[2]));
writeln("First partial derivative: ", [df1(point), df2(point), df3(point)]);
which gives the output:
First partial derivative: [1.76999, 2.4734, 2.30734]
The code for the second order partial derivative is given below:
auto cf11 = (double[] x) => -4*(pi^^2)*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) - 100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 40*pi*sin(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
auto cf21 = (double[] x) => -100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 4*(pi^^2)*sin(2*pi*x[0])*sin(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 20*pi*sin(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) + 20*pi*cos(2*pi*x[0])*sin(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
auto cf31 = (double[] x) => -100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 4*(pi^^2)*sin(2*pi*x[0])*cos(2*pi*x[1])*sin(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 20*pi*sin(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) + 20*pi*cos(2*pi*x[0])*cos(2*pi*x[1])*sin(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
auto cf12 = (double[] x) => -100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 4*(pi^^2)*sin(2*pi*x[0])*sin(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 20*pi*sin(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) + 20*pi*cos(2*pi*x[0])*sin(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
auto cf22 = (double[] x) => -4*(pi^^2)*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) - 100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 40*pi*cos(2*pi*x[0])*sin(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
auto cf32 = (double[] x) => -100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 4*(pi^^2)*cos(2*pi*x[0])*sin(2*pi*x[1])*sin(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 20*pi*cos(2*pi*x[0])*sin(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) + 20*pi*cos(2*pi*x[0])*cos(2*pi*x[1])*sin(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
auto cf13 = (double[] x) => -100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 4*(pi^^2)*sin(2*pi*x[0])*cos(2*pi*x[1])*sin(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 20*pi*sin(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) + 20*pi*cos(2*pi*x[0])*cos(2*pi*x[1])*sin(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
auto cf23 = (double[] x) => -100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 4*(pi^^2)*cos(2*pi*x[0])*sin(2*pi*x[1])*sin(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 20*pi*cos(2*pi*x[0])*sin(2*pi*x[1])*cos(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2])) + 20*pi*cos(2*pi*x[0])*cos(2*pi*x[1])*sin(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
auto cf33 = (double[] x) => -4*(pi^^2)*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) - 100*cos(2*pi*x[0])*cos(2*pi*x[1])*cos(2*pi*x[2])*cos(10*(x[0] + x[1] + x[2])) + 40*pi*cos(2*pi*x[0])*cos(2*pi*x[1])*sin(2*pi*x[2])*sin(10*(x[0] + x[1] + x[2]));
writeln("Hessian: ", [[cf11(point), cf21(point), cf31(point)],
[cf12(point), cf22(point), cf32(point)],
[cf13(point), cf23(point), cf33(point)]]);
which gives the output:
Hessian: [[-100.733, -51.4672, -59.5472],
[-51.4672, -5.07353, -14.9286],
[-59.5472, -14.9286, -27.6556]]
so the numerical and analytical gradient vectors and curvature matrices agree with each other.
I hope that the contents of this article has persuaded the reader that writing numerical code in D is a reasonably easy and straightforward process. D’s template and meta-programming system lends itself easily to building such systems. Performance optimization considerations have been left out, and they will perhaps be visited some other time. I have had fun writing the code for this article and I hope to revisit this and other topics in numerical analysis in the future.
Thank you for reading.