Implementing bracketing functions in D

Author: Dr Chibisi Chima-Okereke Created: October 12, 2020 00:27:05 GMT Published: October 12, 2020 11:46:00 GMT

Introduction

Bracketing (also known as sectioning) is a technique used in numerical optimization to obtain an interval containing a local minimum (or maximum) of a unimodal function of a single variable. In the wider optimization context, it is used to bound the line search parameter in multivariable optimization problems. In this article we focus on techniques for bracketing functions of a single variable, and we show how these techniques are implemented in the D language.

Two texts were used to derive the code in this article:

  1. Algorithms for Optimization by M. J. Kochenderfer and T. A. Wheeler, Chapter 3
  2. Optimization Theory and Methods: Nonlinear Programming by W. Sun and Y. Yuan, Chapter 2.

Some algorithms are direct translations from the Julia implementations given in Kochenderfer and Wheeler’s book, while others are from algorithmic descriptions in Sun and Yuan’s book.
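
The listings that follow assume a few imports from the D standard library, together with some small helpers (fibonacciRatio, GR, getSigned, isInteger, and sign) from the code accompanying this article. The setup below is a sketch of that assumption rather than the exact original; plausible definitions for the remaining helpers are given where they first appear.

import std.stdio: writeln;
import std.traits: isFloatingPoint;
import std.algorithm: swap, min, max;
import std.math: exp, pow, sqrt, sgn;

// Assumed helper: Phobos provides sgn rather than sign
alias sign = sgn;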

Finding initial brackets

The two techniques presented here are the bracket minimum technique [1] and the forward-backward technique [2]. Each implementation takes a function and a single starting point, which may lie on either side of the minimum, and returns an interval for that function containing the minimum point. In the bracket minimum implementation below [1], x is the initial point, s is the step size, k is the expansion factor, and u is an optional list of variadic parameters passed on to fun, the function we would like to minimize.

T[2] bracketMinimum(alias fun, T, U...)(T x, T s, T k, U u)
if(isFloatingPoint!(T))
{
  T a = x; T b = a + s; T c, yc;
  T ya = fun(a, u); T yb = fun(b, u);
  // If the function increases from a to b, search in the opposite direction
  if(yb > ya)
  {
    swap(a, b);
    swap(ya, yb);
    s = -s;
  }
  debug long i = 0;
  while(true)
  {
    // Take an expanding step beyond b
    c = b + s;
    yc = fun(c, u);
    // The function has started increasing, so [a, c] brackets the minimum
    if(yc > yb)
    {
      debug writeln("NIterations from Bracket-Minimum: ", i);
      return a < c ? [a, c] : [c, a];
    }
    a = b; ya = yb; b = c; yb = yc;
    s *= k;
    debug ++i;
  }
}

and the forward-backward method [2] is given by:

T[2] forwardBackward(alias fun, alias step0 = 1.0E-1, alias k = 2.0, T, U...)(T x, U u)
if(isFloatingPoint!(T) && is(T == typeof(k)))
{
  T a = x; T fa = fun(a, u);
  T b, fb; T step = step0;
  // i takes part in the stopping logic as well as the debug reporting,
  // so it must be declared unconditionally
  long i = 0;
  while(true)
  {
    b = a + step;
    fb = fun(b, u);
    if(fb < fa)
    {
      // Still descending: expand the step and move forward
      step *= k;
      x = a;
      a = b;
      fa = fb;
      ++i;
    }else{
      if(i == 0)
      {
        // The very first step increased the function value: reverse direction
        step = -step;
        a = b;
      }else
      {
        a = min(x, b); b = max(x, b);
        debug writeln("NIterations for Forward-Backward Method: ", i);
        return [a, b];
      }
    }
    ++i;
  }
}

Some lines are marked with debug because they exist only to report on the progress of the algorithm and do not otherwise affect the calculation. This is an example of conditional compilation in D: with the DMD compiler, these statements are compiled in when the -debug flag is passed. (Note that in forwardBackward the counter i also takes part in the stopping logic, so it is declared unconditionally there.)
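
As a minimal illustration of conditional compilation (the file name example.d and the values used here are hypothetical):

import std.stdio: writeln;

void main()
{
  double total = 0;
  foreach(x; [1.0, 2.0, 3.0])
  {
    total += x;
    debug writeln("running total: ", total); // compiled only with -debug
  }
  writeln("total: ", total);
}

Compiling with dmd example.d prints only the final total, while dmd -debug example.d also prints the running totals.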

Next we produce some outputs for the following functions:

$$ f_{1}(x) = (x + 2)^{2}; \qquad \frac{d f_{1}}{d x} = 2x + 4 $$

$$ f_{2}(x) = e^{(x - 2)} - x; \qquad \frac{d f_{2}}{d x} = e^{(x - 2)} - 1 $$

The minimizers of the functions are \(x^{*} = -2\) for \(f_{1}\) and \(x^{*} = 2\) for \(f_{2}\); for example, setting \(f_{2}'(x) = 0\) gives \(e^{(x - 2)} = 1\), that is \(x^{*} = 2\).

auto f1 = (double x) => pow(x + 2.0, 2.0);
auto df1 = (double x) => 2*x + 4.0;
auto f2 = (double x) => exp(x - 2) - x;
auto df2 = (double x) => exp(x - 2) - 1;

/**
  Bracketing Minimum
**/
auto x1 = bracketMinimum!(f1)(-3.0, 1E-2, 2.0);
writeln("Bracket Minimum for f1: ", x1);
auto x2 = bracketMinimum!(f2)(-6.0, 1E-2, 2.0);
writeln("Bracket Minimum for f2: ", x2);
/**
  Forward-backward Method
**/
writeln("Forward-Backward for f1: ", forwardBackward!(f1)(-3.0));
writeln("Forward-Backward for f2: ", forwardBackward!(f2)(-6.0));

output:

NIterations from Bracket-Minimum: 7
Bracket Minimum for f1: [-2.36, -0.44]
NIterations from Bracket-Minimum: 9
Bracket Minimum for f2: [-3.44, 4.24]

NIterations for Forward-Backward Method: 6
Forward-Backward for f1: [-2.7, -1.5]
NIterations for Forward-Backward Method: 12
Forward-Backward for f2: [-2.9, 6.7]

Algorithms for refining brackets

Following are algorithms that refine the initial brackets we obtained using the methods already presented.

The Fibonacci search, as given in [1], is implemented below; it relies on a fibonacciRatio helper, a sketch of which follows the code:

T[2] fibonacciSearch(alias fun, alias eps = 0.01, T, U...)(T a, T b, long n, U u)
if(isFloatingPoint!(T) && is(typeof(eps) == T))
{
  T c, yc;
  T rho = fibonacciRatio!(T)(n);
  T d = rho*b + (1 - rho)*a;
  T yd = fun(d, u);
  long m = n - 1;
  for(long i = 1; i < n; ++i)
  {
    // On the final iteration, place the candidate a small distance eps from d
    if(i == m)
    {
      c = eps*a + (1 - eps)*d;
    }else{
      c = rho*a + (1 - rho)*b;
    }
    yc = fun(c, u);
    // Keep the sub-interval that contains the smaller function value
    if(yc < yd)
    {
      swap(b, d); swap(d, c); swap(yd, yc);
    }else{
      swap(a, b); swap(b, c);
    }
    rho = fibonacciRatio!(T)(n - i);
  }
  return a < b ? [a, b] : [b, a];
}
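
The fibonacciRatio helper used above is not shown in this article. A minimal sketch consistent with the formula in [1], which expresses the ratio of successive Fibonacci numbers in closed form, might look as follows (an assumption about the accompanying code, not the author's exact helper):

// rho(n) = (1 - s^n)/(phi*(1 - s^(n + 1))), where phi is the golden
// ratio and s = (1 - sqrt(5))/(1 + sqrt(5)); rho(n) tends to phi - 1
T fibonacciRatio(T)(long n)
if(isFloatingPoint!(T))
{
  immutable T phi = (1 + sqrt(cast(T)5))/2;
  immutable T s = (1 - sqrt(cast(T)5))/(1 + sqrt(cast(T)5));
  return (1 - pow(s, cast(T)n))/(phi*(1 - pow(s, cast(T)(n + 1))));
}

For example, fibonacciRatio!(double)(5) evaluates to 0.625, which is the Fibonacci ratio 5/8.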

The ratios used in the Fibonacci search converge to the reciprocal of the Golden Ratio [1], so the golden ratio search replaces them with the constant \(\varphi - 1 \approx 0.618\), where \(\varphi\) is the Golden Ratio:

T[2] goldenRatioSearch(alias fun, T, I)(T a, T b, I n)
if(isFloatingPoint!(T) && isInteger!(I))
{
  // GR is the golden ratio, so rho is approximately 0.618
  enum rho = GR!(T) - 1;
  T d = rho*b + (1 - rho)*a;
  T yd = fun(d); T c, yc;
  for(getSigned!(I) i = 1; i < n; ++i)
  {
    c = rho*a + (1 - rho)*b;
    yc = fun(c);
    // Keep the sub-interval that contains the smaller function value
    if(yc < yd)
    {
      swap(b, d); swap(d, c); swap(yd, yc);
    }else{
      swap(a, b); swap(b, c);
    }
  }
  return a < b ? [a, b] : [b, a];
}
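
GR, isInteger, and getSigned are also helpers from the accompanying code. Plausible one-line definitions in terms of Phobos (again assumptions, not the original code) are:

import std.traits: isIntegral, Signed;

enum GR(T) = cast(T)1.6180339887498948482L; // the golden ratio, (1 + sqrt(5))/2
alias isInteger = isIntegral;               // accepts any built-in integer type
alias getSigned = Signed;                   // signed counterpart of an integer type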

Below we show some outputs from the Fibonacci and golden ratio procedures on the function \(f_{2}\):

writeln("\nFive iterations of Fibonnaci search: ", fibonnaciSearch!(f2)(x2[0], x2[1], 5));
writeln("Ten iterations of Fibonnaci search: ", fibonnaciSearch!(f2)(x2[0], x2[1], 10));

writeln("Five iterations of Golden Ratio search: ", goldenRatioSearch!(f2)(x2[0], x2[1], 5));
writeln("Ten iterations of Golden Ratio search: ", goldenRatioSearch!(f2)(x2[0], x2[1], 10));

output:

Five iterations of Fibonacci search: [1.36, 2.32]
Ten iterations of Fibonacci search: [1.99554, 2.0827]

Five iterations of Golden Ratio search: [1.3065, 2.427]
Ten iterations of Golden Ratio search: [1.96041, 2.06145]

Bisection method

In the bisection method [1], each new candidate point is the midpoint of the two bracket points:

$$ x = \frac{a + b}{2} $$

The midpoint replaces either the leftmost or the rightmost point, depending on which side of the stationary point (the root of the derivative) it sits. Since the bracket is halved at every step, roughly \(\log_{2}((b - a)/\epsilon)\) iterations are needed to shrink it to a width of \(\epsilon\), which is consistent with the 23 iterations reported below. Note also that, in contrast to the previous methods, bisection uses the derivative dfun rather than the function itself.

T[2] bisection(alias dfun, alias eps = 1E-6, T, U...)(T a, T b, U u)
if(isFloatingPoint!(T) && is(T == typeof(eps)))
{
  // Make sure that the points are in the right order
  if(a > b)
  {
    swap(a, b);
  }
  // If either a or b is already a stationary point, stop
  T da = dfun(a, u); T db = dfun(b, u);
  if(da == 0)
  {
    b = a;
  }
  if(db == 0)
  {
    a = b;
  }
  debug long i = 0;
  while((b - a) > eps)
  {
    T x = (a + b)/2;
    T d = dfun(x, u);
    if(d == 0)
    {
      // Landed exactly on the stationary point
      a = x; b = x;
    }else if(sign(d) == sign(da))
    {
      // The sign change of the derivative lies to the right of x
      a = x;
    }else{
      b = x;
    }
    debug ++i;
  }
  debug writeln("NIterations from bisection: ", i);
  return [a, b];
}

We can run the method using the following:

writeln("Bisection: ", bisection!(df2)(x2[0], x2[1]));

output:

NIterations from bisection: 23
Bisection: [2, 2]

Sectioning based on interpolation

The following methods are sectioning methods based on interpolation.

Quadratic interpolation based on three points

This quadratic interpolation method is based on three points and their function values. The implementation below was translated from reference [1]:

$$ x^{*} = \frac{1}{2} \frac{f(a)(b^{2} - c^{2}) + f(b)(c^{2} - a^{2}) + f(c)(a^{2} - b^{2})}{f(a)(b - c) + f(b)(c - a) + f(c)(a - b)} $$

We also have the opportunity to specify the number of iterations to perform (n):

T[2] quadraticOne(alias fun, T)(T a, T c, long n)
if(isFloatingPoint!(T))
{
  T b = (a + c)/2;
  T ya = fun(a); T yb = fun(b); T yc = fun(c);
  T x, yx;
  for(long i = 0; i < n; ++i)
  {
    // Minimum of the quadratic fitted through (a, ya), (b, yb), and (c, yc)
    x = 0.5*(ya*(b*b - c*c) + yb*(c*c - a*a) + yc*(a*a - b*b))/
            (ya*(b - c) + yb*(c - a) + yc*(a - b));
    yx = fun(x);
    // Replace one of the three points, keeping the minimum bracketed
    if(x > b)
    {
      if(yx > yb)
      {
        swap(c, x); swap(yc, yx);
      }else{
        swap(a, b); swap(ya, yb);
        swap(b, x); swap(yb, yx);
      }
    }else if(x < b)
    {
      if(yx > yb)
      {
        swap(a, x); swap(ya, yx);
      }else{
        swap(c, b); swap(yc, yb);
        swap(b, x); swap(yb, yx);
      }
    }
  }
  return [a, c];
}

usage:

writeln("Five iterations of Quadrative interpolation version 1: ", quadraticOne!(f2)(x2[0], x2[1], 5));
writeln("Ten iterations of Quadrative interpolation version 1: ", quadraticOne!(f2)(x2[0], x2[1], 10), "\n");

outputs:

Five iterations of Quadratic interpolation version 1: [1.49451, 4.24]
Ten iterations of Quadratic interpolation version 1: [1.90772, 4.24]

Quadratic interpolation based on two points

This is one of the quadratic interpolation methods based on two points, implemented from an algorithm described in reference [2].

$$ x^{*} = x_{k} - \frac{1}{2}\frac{(x_{k} - x_{k-1})f'(x_{k})}{f'(x_{k}) - \frac{f(x_{k}) - f(x_{k - 1})}{x_{k} - x_{k - 1}}} $$

Two points are required, from which the function values are calculated, along with the gradient at one of them. So not only is the function (fun) required but also its derivative (dfun), or some other way of calculating the gradient, as in our previous article on numerical differentiation (a numerical-gradient sketch follows the output below):

T[2] quadraticTwo(alias fun, alias dfun, alias eps = 1E-6, T, U...)(T a, T b, U u)
if(isFloatingPoint!(T) && is(T == typeof(eps)))
{
  if(a > b)
  {
    swap(a, b);
  }
  T db = dfun(b, u);
  T fa = fun(a, u); T fb = fun(b, u);
  // If b is already a stationary point, stop
  if(db == 0)
  {
    a = b;
  }
  debug long i = 0;
  while((b - a) > eps)
  {
    // Minimum of the quadratic through (a, fa) and (b, fb) with slope db at b
    T x = b - ((b - a)*(db*0.5))/(db - ((fb - fa)/(b - a)));
    T d = dfun(x, u);
    if(d == 0)
    {
      a = x; b = x;
    }else if(sign(d) == sign(db))
    {
      b = x; fb = fun(b, u); db = d;
    }else{
      a = x; fa = fun(a, u);
    }
    debug ++i;
  }
  debug writeln("NIterations from quadraticTwo interpolation: ", i);
  return [a, b];
}

usage:

writeln("Quadratic interpolation version 2: ", quadraticTwo!(f2, df2)(x2[0], x2[1]));

output:

NIterations from quadraticTwo interpolation: 8
Quadratic interpolation version 2: [2, 2]

From the output, this implementation converges on the minimizer in eight iterations, and so is preferable to the three-point version from [1], which had still not converged after ten.
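
If an analytical derivative is not to hand, a finite-difference approximation can stand in for dfun, as mentioned above. The sketch below uses a central difference with an assumed step size h; the resulting bracket should be close to, though not necessarily identical to, the output above:

// Central-difference approximation of df2; h is an assumed step size
enum double h = 1E-6;
auto ndf2 = (double x) => (f2(x + h) - f2(x - h))/(2*h);
writeln("Quadratic interpolation version 2 (numerical gradient): ",
        quadraticTwo!(f2, ndf2)(x2[0], x2[1]));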

Cubic interpolation

Here we use cubic interpolation based on two points; the function values and the derivatives at both points must be calculated. The equations used are given below [2]:

$$ s = 3\frac{f(b) - f(a)}{b - a} $$

$$ z = s - f'(a) - f'(b) $$

$$ w^2 = z^2 - f'(a)f'(b) $$

$$ x^{*} = a + (b - a)\frac{w - f'(a) - z}{f'(b) - f'(a) + 2w} $$

As with the above quadratic interpolation, the derivative (dfun) is required along with the function itself (fun).

T[2] cubic(alias fun, alias dfun, alias eps = 1E-6, T, U...)(T a, T b, U u)
if(isFloatingPoint!(T) && is(T == typeof(eps)))
{
  if(a > b)
  {
    swap(a, b);
  }
  T da = dfun(a, u); T db = dfun(b, u);
  T fa = fun(a, u); T fb = fun(b, u);
  // If b is already a stationary point, stop
  if(db == 0)
  {
    a = b;
  }
  debug long i = 0;
  while((b - a) > eps)
  {
    // Minimizer of the cubic fitted to the values and derivatives at a and b
    T s = (3*(fb - fa))/(b - a);
    T z = s - da - db;
    T w = sqrt(z*z - da*db);
    T x = a + ((b - a)*(w - da - z))/(db - da + 2*w);
    T dx = dfun(x, u);
    if(dx == 0)
    {
      a = x; b = x;
    }else if(sign(dx) == sign(db))
    {
      b = x; fb = fun(b, u); db = dx;
    }else{
      a = x; fa = fun(a, u); da = dx;
    }
    debug ++i;
  }
  debug writeln("NIterations from Cubic interpolation: ", i);
  return [a, b];
}

usage:

writeln("Cubic interpolation: ", cubic!(f2, df2)(x2[0], x2[1]));

output:

NIterations from Cubic interpolation: 6
Cubic interpolation: [2, 2]

On this example, the cubic interpolation performs best of the sectioning methods shown, taking only six iterations to fully converge.

Conclusion

Sectioning is an important step in many optimization algorithms. Several simple methods were presented here, and others can be found in the references. Numerical optimization is an interesting topic, and we hope to cover more implementations in the future, such as line searches and fully fledged optimization algorithms.

Thank you for your time.