# Speed Chain Ladder analysis with Rcpp

Author: Dr Chibisi Chima-Okereke Created: June 23, 2014 00:00:00 GMT Published: June 23, 2014 00:00:00 GMT

Some time ago we compared the use of R and Python for statistical applications. In that blog entry we also discussed the ease of integrating C++ code with each programming application. In this blog entry we present a simple example of calling C++ Chain Ladder code with R using the Rcpp and RcppArmadillo packages. If you are looking to integrate C++ libraries with R, Rcpp is by far the easiest way to do this, RcppArmadillo is a C++ wrapper for the linear algebra library Armadillo that provides an interface between Rcpp and Armadillo objects.

Some time ago, we looked at Chain Ladder calculations in R for large data sets. This time round we will compare what happens to the calculation time when we carry out the chain ladder matrix calculations in C++ and call these functions from R using Rcpp.

## Preparations

The C++ code we would like to call sits in a file called cL.cpp, on the R script we load the Rcpp package.

require(Rcpp)


In the cL.cpp file, we load the libraries

// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>  // most of the algorithms
#include <numeric> // some numeric algorithm
using namespace std;
using namespace arma;
using namespace Rcpp;


## Simulating the claims traingle

To simulate the claims traingle we will use the same R code as in our previous blog repeated here:

# Age-To-Age Factors
ageFact <- seq(1.9, 1, by = -.1)
# Inflation Rate
infRate <- 1.02
# Function to reverse matrix columns
revCols <- function(x){
x[,ncol(x):1]
}
# Similar to jitter
shake <- function(vec, sigmaScale = 100)
{
rnorm(n = length(vec), mean = vec, sd = vec/sigmaScale)
}
# Row generation funtion
GenerateRow <- function(iDev, dFactors = cumprod(ageFact), dInflationRate = 1.02, initClaim = 154)
{
shake(initClaim)*shake(c(1, dFactors))*(dInflationRate^iDev)
}
# Function to generate a claims matrix
GenerateTriangle <- function(iSize, ...)
{
indices = 1:iSize
mClaimTri = t(sapply(indices, GenerateRow, ...))
# Reverse columns to get the claims triangle
mClaimTri = revCols(mClaimTri)
# Assign nan to lower triangle
mClaimTri[lower.tri(mClaimTri)] = NA
mClaimTri = revCols(mClaimTri)
return(mClaimTri)
}


## R Chain Ladder R code

The code for Chain Ladder analysis on the R side is is essentially the same as in the previous blog but is repeated here for your convenience. One of the functions GetFactorR() is used to calculate the age-to-age factors and the other GetChainSquareR() is used to complete the traingle to a “square”.

# Get claims factor at a particular column index
GetFactorR <- function(index, mTri)
{
fact = matrix(mTri[-c((nrow(mTri) - index + 1):nrow(mTri)), index:(index + 1)], ncol = 2)
fact = c(sum(fact[,1]), sum(fact[,2]))
return(fact[2]/fact[1])
}
# Function to carry out Chain Ladder on a claims triangle
GetChainSquareR <- function(mClaimTri)
{
nCols <- ncol(mClaimTri)
dFactors = sapply(1:(nCols - 1), GetFactorR, mTri = mClaimTri)
dAntiDiag = diag(revCols(mClaimTri))[2:nCols]
for(index in 1:length(dAntiDiag))
{
mClaimTri[index + 1, (nCols - index + 1):nCols] = dAntiDiag[index]*cumprod(dFactors[(nCols - index):(nCols - 1)])
}
mClaimTri
}


## R Chain Ladder C++ code

There are four C++ functions, the first calculates the age-to-age factor:

double GetFactor(int index, mat mTri)
{
int nRow = mTri.n_rows, nCol = mTri.n_cols;
mat subMat = mTri.submat(0, index, nRow - (index + 2), index + 1);
rowvec colSums = arma::sum(subMat, 0);
double inFact = colSums(1)/colSums(0);
return inFact;
}


The second calculated the factors over the whole matrix

vec GetFactors(mat mTri)
{
int nRow = mTri.n_rows, nCol = mTri.n_cols;
vec dFactors(nCol - 1);
for(int i=0; i < nCol - 1; ++i)
{
dFactors(i) = GetFactor(i, mTri);
}
return dFactors;
}


The third calculates the cumulative product over a vector

// cummulative product
vec cumprod(vec mvec)
{
int nElem = mvec.n_elem;
double cprod = mvec(0);
for(int i = 1; i < nElem; ++i)
{
cprod *= mvec(i);
mvec(i) = cprod;
}
return mvec;
}


The fourth function completes our claims triangle

// [[Rcpp::export]]
SEXP GetChainSquareCpp(SEXP mClaimTri)
{
NumericMatrix nMat(mClaimTri);
int nRow = nMat.nrow(), nCol = nMat.ncol();
mat armMat(nMat.begin(), nRow, nCol, FALSE);

vec dFactors = GetFactors(armMat);
mat revMat = fliplr(armMat);
vec dAntiDiag = diagvec(revMat);
dAntiDiag = dAntiDiag.subvec(1, nCol - 1);
double dMult;
vec prodVec;
for(int index = 0; index < dAntiDiag.n_elem; ++index)
{
dMult = dAntiDiag(index);
prodVec = dFactors.subvec(nCol - index - 2, nCol - 2);
prodVec = cumprod(prodVec);
armMat(index + 1, span(nCol - index - 1, nCol - 1)) = dMult*prodVec.st();
}
return wrap(armMat);
}


## R vs. C++ speed test

We now compile the C++ code and incorporate the functions in our R environment and load the microbenchmarks package:

sourceCpp("cL.cpp")
require(microbenchmark)


We generate a triangle and run the benchmark:

x <- GenerateTriangle(11)
microbenchmark(GetChainSquareR(x), GetChainSquareCpp(x), times = 10000L)

Unit: microseconds
expr     min      lq   median      uq      max neval
GetChainSquareR(x) 177.885 187.923 193.4030 205.044 3087.584 10000
GetChainSquareCpp(x)   5.121   6.045   8.3935   9.128  179.754 10000


Evidently a significant speed-up is achived by calling C++ from R.

## Conclusion

There are times when not much needs to be said about the benefits of using certain packages/techniques. It is common to use C/C++ to speed up calculations or to call other C++ libraries for various applications from some domain specific language. Rcpp is fantastic for this purpose. It provides an easy to use interface that allows seamless integration between R and C++. It gives you the benefits of C++ speed but makes it easy to integrate this with R.