Calling R from other applications: ZeroMQ, R, PHP, Python

Active Analytics Ltd: posted 24 May 2014 by Chibisi Chima-Okereke

Introduction

In last week's blog entry we built a small app that calculated summary statistics of Single Home Prices using Fannie Mae data. There were two approaches to presenting the data: the first queried a MySQL database and the second called R to update a chart. Both did this by calling a PHP script using AJAX. That application used several programming languages for different tasks, and in this and subsequent blog entries we will pick apart its various facets and work with them in more detail.

This week's action

In this blog entry, we will be looking more closely at calling R from PHP, and we will also be calling R from Python. Both tasks will be done using the ZeroMQ library.

Last week we simply used a system call to an R batch process that produced a PNG plot for the web page. Calling R in this way is okay for a "quick hack" but is ultimately quite limited, and it would be tricky to build an interactive session using this technique. Here we take a different approach and use sockets.

For some time now, the ZeroMQ library has been making waves in the world of computing and is set to be important in messaging systems in all kinds of applications. ZeroMQ has bindings for all major programming languages, which plays right into our hands when we try to modularize components and make them exchangeable. I will be calling R from Python and PHP using the same R server code each time. This kind of modularization is important when building applications: we can replace a component with an equivalent one without having to rewrite anything else.

As always, these examples were written on Linux (Ubuntu), so you'll have to make suitable adjustments to run them on other operating systems. These examples are not meant for production code; they are given as a learning tool.

Calling an R server from the clients

The plan

There are basically two stages in the communication process between the R server and the PHP and Python clients.

  1. Initialization: The client makes a system call to start up the R server.
  2. Request-Reply Loop: A standard request and reply loop used for communication between the client and R.

Of course, instead of using an R script to define the server code (as in this example), you could write an R package to do this.

Initialization

To carry out the system call to R we use a small shell script, r.command. We have chosen to run R in a terminal window so that we can observe the interactive session.

#!/bin/bash
clear
gnome-terminal -e R

We also use a local .Rprofile script to source our server script and start the process.

.First <- function(){
	eval(parse(text = "source('r_server.r')"), envir = .GlobalEnv)
	eval(parse(text = "startServer(port = 5557)"), envir = .GlobalEnv)
}

Next we will look at the R server script.

The R Server Script

The first part of the R server script loads the required libraries along with the summarized data created in the previous blog. It is worth mentioning that the jsonlite R package will be used for serializing datasets to JSON for transmission to the clients, and the rzmq package is the ZeroMQ binding for R. Both are available on CRAN.

require(base)
require(methods)
require(graphics)
require(grDevices)
require(stats)
require(rzmq)
require(jsonlite)
load("sumData.RData")

It makes sense to present the function that starts the R server next.

# Function to start the server
startServer <- function(port){

	on.exit(q("no"))
	context = init.context()
	socket = init.socket(context, "ZMQ_REP")
	bind.socket(socket, paste0("tcp://*:", port))	
	
	persist <- TRUE
	while(persist){
		message = receive.socket(socket, FALSE);
		message = rawToChar(message)
		output <- rEval(message)
		send.raw.string(socket, output[["data"]])
		persist <- output[["keep.alive"]]
	}
	
	q("no")
}

The function has one argument, the port to listen on. We create and bind the REP socket, then enter a while loop that repeatedly receives a command, executes it and sends a reply. The output of the rEval() function is a list with two components: the first (data) is a string message to be sent back to the client, and the second (keep.alive) is a logical value indicating whether the while loop should keep going. Notice that if the while loop drops out, the R server is shut down. The next function we present, exitR(), uses this feature to drop out of the while loop and shut down the server.

# Function to exit R
exitR <- function()
{
	out <- vector("list", 2)
	names(out) <- c("keep.alive", "data")
	out[["data"]] <- "Now exiting R\n"
	out[["keep.alive"]] <- FALSE
	return(out)
}
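
For readers more at home in Python, the receive, evaluate, reply loop of startServer() and the keep.alive shutdown signal can be sketched with pyzmq. This is only an illustration of the REQ/REP pattern, not the R server itself: the serve() function, the "exit" command and the upper-casing "evaluation" are made up for the example, and both ends run in one process so that it is self-contained.

```python
import threading
import zmq


def serve(socket):
    """Reply to each request until the client asks to exit."""
    keep_alive = True
    while keep_alive:
        message = socket.recv_string()
        if message.strip() == "exit":
            # Same idea as exitR(): send a last reply, then leave the loop.
            reply, keep_alive = "Now exiting", False
        else:
            # Stand-in for evaluating the command.
            reply = message.upper()
        socket.send_string(reply)
    socket.close()


context = zmq.Context()

# Bind the REP socket first, then hand it over to the server thread.
rep = context.socket(zmq.REP)
port = rep.bind_to_random_port("tcp://127.0.0.1")
server = threading.Thread(target=serve, args=(rep,))
server.start()

# A REQ socket must strictly alternate send and receive.
req = context.socket(zmq.REQ)
req.connect("tcp://127.0.0.1:%d" % port)
replies = []
for request in ["hello", "world", "exit"]:
    req.send_string(request)
    replies.append(req.recv_string())

req.close()
server.join()
context.term()
print(replies)
```

The point to notice is that the server decides when to stop, based on what it was asked to evaluate, and always sends one reply per request before doing so.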

The R server works by using the rEval() function to evaluate the string expressions sent to R. If the expression is exitR(), it returns the exitR() signal; otherwise the code is evaluated inside tryCatch(), so that if there is an error we can still send the error message back to the client without crashing the server. Of course this is not foolproof, but it gives some indication that exception handling is important in these cases.

# Function to eval/parse external calls to R
rEval <- function(message){
	if(gsub("\\s", "", message) != "exitR()"){
		output <- tryCatch(
			eval(parse(text = message), envir = .GlobalEnv),
			error = function(e)paste0(e$message, "\n")
		)
		output <- validateOuput(output)
		out <- vector("list", 2)
		names(out) <- c("keep.alive", "data")
		out$data <- output
		out$keep.alive <- TRUE
		
	}else{
		out <- exitR()
	}
	return(out)
}

To ensure that we never try to send back anything other than a single string, we do a little validation step on the result of the expression.

# Output validation
validateOuput <- function(output){
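	# is.character() always returns a single TRUE/FALSE, whereas class() can
	# return several values (e.g. for a matrix) and break the if() condition
	if(!is.character(output) || length(output) != 1){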
		output <- "Command executed but no suitable string output\n"
	}
	return(output)
}
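
The evaluate, catch, validate sequence is not specific to R. As a rough sketch only, here is the same control flow in Python; the names evaluate() and validate_output() are invented for this example, and Python's eval() stands in for R's eval(parse(text = ...)).

```python
def validate_output(output):
    """Pass single strings through; replace anything else with a notice."""
    if not isinstance(output, str):
        return "Command executed but no suitable string output\n"
    return output


def evaluate(message, namespace):
    """Mimic rEval(): return a dict with 'keep_alive' and 'data' entries."""
    # Strip all whitespace before comparing, as gsub("\\s", "", message) does.
    if "".join(message.split()) == "exitR()":
        return {"keep_alive": False, "data": "Now exiting\n"}
    try:
        result = eval(message, namespace)
    except Exception as error:
        # Turn the error into a string reply instead of crashing the loop.
        result = "{}\n".format(error)
    return {"keep_alive": True, "data": validate_output(result)}


print(evaluate("'a' + 'b'", {}))
print(evaluate("1 + 1", {}))
print(evaluate("undefined_name", {}))
print(evaluate(" exitR( ) ", {}))
```

Whatever the client sends, the caller gets back exactly one string to transmit and one flag saying whether to keep looping.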

Last but not least, the whole point of doing this exercise in the first place was to return annual subsets from our summary table of Fannie Mae data. This is accomplished using the getData() function.

# Function to get data for a particular year
getData <- function(year = 2003){
	summ <- get("summ", envir = .GlobalEnv)
	summ <- summ[summ$year == year, ]
	out <- serializeJSON(summ, 1)
	return(out)
}
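
On the client side the reply from getData() arrives as one JSON string. The sketch below shows one way a Python client might turn it back into columns. It assumes the nested type/attributes/value layout that, as far as we are aware, serializeJSON() produces for a data frame; the sample reply is hand-written and its column names are hypothetical, not the real columns of the summary table.

```python
import json


def columns_from_reply(reply):
    """Map column names to lists of values for a serialized data frame.

    Assumes each object carries "type", "attributes" and "value" keys.
    """
    parsed = json.loads(reply)
    names = parsed["attributes"]["names"]["value"]
    columns = [column["value"] for column in parsed["value"]]
    return dict(zip(names, columns))


# Hand-written stand-in for a getData() reply (hypothetical columns).
sample_reply = json.dumps({
    "type": "list",
    "attributes": {
        "names": {"type": "character", "attributes": {},
                  "value": ["year", "mean_price"]},
        "class": {"type": "character", "attributes": {},
                  "value": ["data.frame"]},
    },
    "value": [
        {"type": "integer", "attributes": {}, "value": [2003, 2003]},
        {"type": "double", "attributes": {}, "value": [150000.0, 175000.0]},
    ],
})

table = columns_from_reply(sample_reply)
print(table["year"])
```

If the clients only need plain rows and columns, jsonlite's toJSON() on the R side gives a simpler layout; serializeJSON() is aimed at exact round trips back into R.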

So that's the R server code. It was certainly not a lot of code, but it is the largest script we will present today; everything is even easier from here on in. Next we will look at the PHP client-side code.

PHP client code

We have written a single PHP class called rClient to communicate with R. There are three methods in this class:

  1. The constructor launches R, creates the socket, and connects.
  2. The rEval() method sends a string command to R to be evaluated, then stores and prints the reply.
  3. The disconnect() method sends the exitR() command to close down the R server and then disconnects.

Essentially that's it for the client side programming. All we have to do is call the class.

<?php
# File: php_client.php
class rClient
{
	public $port;
	public $reply;
	public $socket;
	public $dsn;
	
	public function __construct($port) {
		system("./r.command");
		$this->socket = new ZMQSocket(new ZMQContext(), ZMQ::SOCKET_REQ);
		$this->port = $port;
		$this->dsn = "tcp://localhost:" . $this->port;
		$this->socket->connect($this->dsn);
	}
	
	public function rEval($command){
		$this->socket->send($command);
		$this->reply = $this->socket->recv();
		echo $this->reply;
	}
	
	public function disconnect() {
		$command = "exitR()";
		$this->socket->send($command);
		$this->reply = $this->socket->recv();
		echo $this->reply;
		$this->socket->disconnect($this->dsn);
	}
}
?>

We use our rClient class in the PHP code below. You can also run it interactively in the PHP shell by typing $ php -a in a terminal.

<?php
require_once("php_client.php");
$client = new rClient(5557);
$client->rEval("getData(2006)");
$client->rEval("getData(2007)");
$client->rEval("plot(rnorm(200), pch = 21, bg = 1:200)");
$client->rEval("dev.off()");
$client->disconnect();
?>

That's it on the PHP side. Next we look at the equivalent code in Python.

Python client code

The Python code is just as straightforward (probably more so). We create a Python class rClient for connecting with the R server with exactly the same methods as the PHP script:


#!/usr/bin/env python
# -*- coding: utf-8 -*-

# Filename: py_client.py

import os
import zmq

class rClient:
	"""Class for connecting with R server"""
	def __init__(self, port):
		os.system("./r.command")
		self.context = zmq.Context()
		self.socket = self.context.socket(zmq.REQ)
		self.dsn = "tcp://localhost:" + str(port)
		self.socket.connect(self.dsn)
	
	def rEval(self, command):
		# send_string/recv_string convert between str and UTF-8 bytes
		self.socket.send_string(command)
		reply = self.socket.recv_string()
		print(reply)
	
	def disconnect(self):
		command = "exitR()"
		self.socket.send_string(command)
		reply = self.socket.recv_string()
		print(reply)

Then the Python script that runs a session similar to the one in the PHP script:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# These are general scripts that call the server/client functions

# Import the client class from py_client.py (must be in the same directory)
from py_client import rClient
client = rClient(5557)
client.rEval("getData(2003)")
client.rEval("getData(2004)")
client.rEval("plot(rnorm(200), pch = 21, bg = 1:200)")
client.rEval("dev.off()")
client.disconnect()

It's really as straightforward as that.
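
One caveat applies to both clients: recv() on a REQ socket blocks until a reply arrives, so if the R server never came up the client simply hangs. A minimal sketch of one way around this, using pyzmq's poll() to wait for a limited time, is given below. The function name request_with_timeout() and the default timeout are our own choices for illustration.

```python
import zmq


def request_with_timeout(endpoint, command, timeout_ms=2000):
    """Send one command; return the reply, or None if none arrives in time."""
    context = zmq.Context.instance()
    socket = context.socket(zmq.REQ)
    # Discard unsent messages on close so a dead server cannot hang the client.
    socket.setsockopt(zmq.LINGER, 0)
    socket.connect(endpoint)
    try:
        socket.send_string(command)
        # poll() returns a non-zero event mask once a reply is waiting.
        if socket.poll(timeout_ms, zmq.POLLIN):
            return socket.recv_string()
        return None
    finally:
        socket.close()
```

A call such as request_with_timeout("tcp://localhost:5557", "getData(2003)") would then return either the reply string or None. A fresh socket is used for every request because a REQ socket that has sent a request and never received the reply cannot send again.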

Conclusion

ZeroMQ is a very interesting library and has many applications (as socket technologies tend to). We hope to revisit it later for big data and performance computing applications. In this example it certainly plays into our desire to have modular and easily exchangeable components in our applications. If you are interested in socket technologies in R, you should certainly look at the svSocket package and read ?connections in R.

That's it, you can go back to work now.

Dr. Chibisi Chima-Okereke, R Training, Statistics and Data Analysis.