Tuesday, August 19, 2008

TCP/IP Socket Communications in MATLAB

I often see people on the MATLAB Newsgroup asking about network communications, usually for passing data between instances of MATLAB.

Using the ability to call Java directly from within MATLAB, I'm going to provide a short example of a client/server written solely in MATLAB and usable from Release 14 onwards (possibly even earlier).

The example is available on the MathWorks File Exchange: Simple TCP/IP Socket Comms Example

I'm working on a little TCP/IP comms library at the moment using these techniques. It will provide a nice layer of abstraction and allow you to use Sockets as you would in other programming languages (as well as one can in a single thread). Keep an eye out for it on the File Exchange.

Interpreted Java?

Amazingly, we can execute Java code directly from the Command Window without compiling anything. The traditional example:
>> import java.lang.*
>> System.out.println('Hello World')
Hello World
To perform socket communications, we utilise the Java Socket and Input/OutputStream classes to pass data around via TCP/IP sockets.

On the server side we use (unsurprisingly) a ServerSocket. Once a client has been accepted, it provides a Socket whose output stream we wrap in a DataOutputStream and write data to.

On the client side we use a Socket to connect to the specified host and port. The Socket provides an InputStream, which we wrap in a DataInputStream to read data from.

The code for the example server and client is outlined below.

client.m
% CLIENT connect to a server and read a message
%
% Usage - message = client(host, port, number_of_retries)
function message = client(host, port, number_of_retries)

    import java.net.Socket
    import java.io.*

    if (nargin < 3)
        number_of_retries = 20; % set to -1 for infinite
    end

    retry        = 0;
    input_socket = [];
    message      = [];

    while true

        retry = retry + 1;
        if ((number_of_retries > 0) && (retry > number_of_retries))
            fprintf(1, 'Too many retries\n');
            break;
        end

        try
            fprintf(1, 'Retry %d connecting to %s:%d\n', ...
                    retry, host, port);

            % throws if unable to connect
            input_socket = Socket(host, port);

            % wrap the socket's input stream in a DataInputStream
            input_stream   = input_socket.getInputStream;
            d_input_stream = DataInputStream(input_stream);

            fprintf(1, 'Connected to server\n');

            % read data from the socket - wait a short time first
            pause(0.5);
            bytes_available = input_stream.available;
            fprintf(1, 'Reading %d bytes\n', bytes_available);

            message = zeros(1, bytes_available, 'uint8');
            for i = 1:bytes_available
                % readUnsignedByte returns 0..255; readByte is signed, so
                % values above 127 would saturate to 0 when stored as uint8
                message(i) = d_input_stream.readUnsignedByte;
            end

            message = char(message);

            % cleanup
            input_socket.close;
            break;

        catch
            if ~isempty(input_socket)
                input_socket.close;
            end

            % pause before retrying
            pause(1);
        end
    end
end

server.m
% SERVER Write a message over the specified port
%
% Usage - server(message, output_port, number_of_retries)
function server(message, output_port, number_of_retries)

    import java.net.ServerSocket
    import java.io.*

    if (nargin < 3)
        number_of_retries = 20; % set to -1 for infinite
    end
    retry = 0;

    server_socket = [];
    output_socket = [];

    while true

        retry = retry + 1;

        try
            if ((number_of_retries > 0) && (retry > number_of_retries))
                fprintf(1, 'Too many retries\n');
                break;
            end

            fprintf(1, ['Try %d waiting for client to connect to this ' ...
                        'host on port : %d\n'], retry, output_port);

            % open the server socket; accept gives up after 1 second
            server_socket = ServerSocket(output_port);
            server_socket.setSoTimeout(1000);

            output_socket = server_socket.accept;

            fprintf(1, 'Client connected\n');

            output_stream   = output_socket.getOutputStream;
            d_output_stream = DataOutputStream(output_stream);

            % output the data over the DataOutputStream
            % Convert to stream of bytes
            fprintf(1, 'Writing %d bytes\n', length(message));
            d_output_stream.writeBytes(char(message));
            d_output_stream.flush;

            % clean up
            server_socket.close;
            output_socket.close;
            break;

        catch
            if ~isempty(server_socket)
                server_socket.close;
            end

            if ~isempty(output_socket)
                output_socket.close;
            end

            % pause before retrying
            pause(1);
        end
    end
end
Opening up two instances of MATLAB:
% Instance 1
>> message = char(mod(1:1000, 255)+1);
>> server(message, 3000, 10)
Try 1 waiting for client to connect to this host on port : 3000
Try 2 waiting for client to connect to this host on port : 3000
Try 3 waiting for client to connect to this host on port : 3000
Try 4 waiting for client to connect to this host on port : 3000
Client connected
Writing 1000 bytes

% Instance 2 (simultaneously)
% NOTE: If the 'server' is running on a non-local machine, substitute its IP address
% or host name here:
% data = client('10.61.1.200', 2666); % To connect to server at IP 10.61.1.200:2666
>> data = client('localhost', 3000)
Retry 1 connecting to localhost:3000
Retry 2 connecting to localhost:3000
Connected to server
Reading 1000 bytes

data =



 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~     

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~     

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~     

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~    
This code can be expanded to read/write arbitrary data types, and SHOULD be expanded to deal with errors properly (e.g. the receiving end not getting the whole buffer), but it serves as a simple example of how to get communication going between MATLAB and other applications or instances of MATLAB.
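
As a rough illustration of the "not getting all of the buffer" problem mentioned above, here is a minimal sketch in C (the other language used on this blog) of a receive loop that keeps reading until the requested number of bytes has arrived. The helper name read_exact is hypothetical, and the sketch assumes a POSIX system where a socket is an ordinary file descriptor. The same loop structure applies to any stream API that may return fewer bytes than requested.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `len` bytes from descriptor `fd`.
 * Returns 0 on success, -1 on error or if the peer closes early. */
int read_exact(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n > 0) {
            got += (size_t)n;   /* partial read: keep going */
        } else if (n == 0) {
            return -1;          /* end of stream before len bytes arrived */
        } else if (errno != EINTR) {
            return -1;          /* genuine read error */
        }
        /* n < 0 with EINTR: interrupted by a signal, simply retry */
    }
    return 0;
}
```

For this to be useful the receiver has to know how many bytes to expect, for example by having the sender transmit the message length first. That is a design choice the example above leaves open.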

Tuesday, August 12, 2008

Optimisation through MEX files

Whilst MATLAB is an excellent expressive tool, it can occasionally run a little bit slow for our liking. However, the folks at Mathworks have provided an interface that can be used to speed up code execution in particular circumstances.

MEX Files

MATLAB allows for compilation of C or Fortran sub-routines into a DLL (or equivalent) such that it can be called from within MATLAB as per any other function.

I'll be using a simple example I came across a while ago when attempting to read in large GPS logs containing on the order of a million GPS position records. As expected, the process of parsing these files took some time. What was unexpected was where the code was using up CPU time.

A quick run of the MATLAB Profiler revealed that approximately 50% of my processing time was spent in the calculation of the NMEA checksum (defined here). The MATLAB calculateChecksum function used is outlined below.
%===============================================================================
% Description : Calculate the NMEA Checksum for the supplied string. Calculated
% as the successive bitwise exclusive OR of all characters
%===============================================================================
function checksum = calculateChecksum(sentence)

    % Initialise checksum
    checksum = uint8(0);

    for i_char = 1:length(sentence)
        checksum = bitxor(checksum, uint8(sentence(i_char)));
    end

    checksum = dec2hex(checksum, 2);

end

To demonstrate the CPU usage of the above code snippet, a short test function was created:
function test_Checksum

    nmea_sentence = 'GPGGA,195237,4308.639,S,07744.402,E,1,03,3.2,365.3,M,-34.5,M,1001,';
    cs = '';

    tic
    for i = 1:50000
        cs = calculateChecksum(nmea_sentence);
    end
    toc

    % Verify Checksum
    if (~strcmp(cs, '7F'))
        error('Incorrect Checksum calculated');
    end
end
The result of this function when executed several times:
Elapsed time is 4.990806 seconds.
Elapsed time is 4.978824 seconds.
Elapsed time is 5.029520 seconds.
The profiler output (run independently) shows that the majority of the time is spent performing the iterative XOR and the conversion from decimal to hexadecimal.

Using this example from MathWorks as a guide, I created a simple MEX-compatible C function that calculates the 2-character hexadecimal checksum of a supplied string.
#include "mex.h"
#include <stdio.h>
#include <string.h>

/* XOR all characters of in_string and write the result to out_string
 * as 2 uppercase hexadecimal characters (plus terminator) */
void calculateChecksumFunction(const char* in_string, char *out_string)
{
    int checksum_as_int = 0;
    int i, str_length = (int)strlen(in_string);

    for (i = 0; i < str_length; i++)
    {
        checksum_as_int ^= (int)*(in_string++);
    }

    checksum_as_int &= 0xFF;
    sprintf(out_string, "%02X", checksum_as_int);
}

//****************************************************************
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[])
{
    char *input_buf, *output_buf;
    int buflen, status;

    /* Check for proper number of arguments. */
    if (nrhs != 1)
        mexErrMsgTxt("One input required.");
    else if (nlhs > 1)
        mexErrMsgTxt("Too many output arguments.");

    /* Input must be a string. */
    if (mxIsChar(prhs[0]) != 1)
        mexErrMsgTxt("Input must be a string.");

    /* Input must be a row vector. */
    if (mxGetM(prhs[0]) != 1)
        mexErrMsgTxt("Input must be a row vector.");

    /* Get the length of the input string (plus terminator). */
    buflen = (int)(mxGetM(prhs[0]) * mxGetN(prhs[0])) + 1;

    /* Allocate memory for input and output strings.
     * output string should be 2 ASCII characters (plus terminator) */
    input_buf = mxCalloc(buflen, sizeof(char));
    output_buf = mxCalloc(3, sizeof(char));

    /* Copy the string data from prhs[0] into a C string
     * input_buf. */
    status = mxGetString(prhs[0], input_buf, buflen);
    if (status != 0)
        mexWarnMsgTxt("Not enough space. String is truncated.");

    /* Calculate checksum and store result in output_buf */
    calculateChecksumFunction(input_buf, output_buf);

    /* Format return as a mex-string */
    plhs[0] = mxCreateString(output_buf);

    return;
}


This MEX compatible C file was then compiled using the 'mex' command from the MATLAB command window:
mex calculateChecksumMEX.c
This created a DLL in the same directory named calculateChecksumMEX.dll.

Substituting a call to calculateChecksumMEX in the test function redirects the processing to the created DLL.

The speed improvement is immediately noticeable:
Elapsed time is 0.423503 seconds.
Elapsed time is 0.425224 seconds.
Elapsed time is 0.430266 seconds.
An order of magnitude speed improvement was gained through the simple technique of identifying and isolating the portions of code that were using the most CPU time and performing these operations in an efficient C sub-routine.
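
Because the checksum kernel is plain C, it can also be exercised outside MATLAB before it is wrapped in a MEX gateway. The following is a minimal sketch under a few assumptions: the function names nmea_checksum and nmea_sentence_ok are hypothetical (they are not part of the MEX file above), the sentence is expected in the usual "$...*HH" form with the checksum covering the characters between '$' and '*', and the two checksum digits are assumed to be uppercase hexadecimal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* XOR of the first `len` characters of `body` (the NMEA checksum kernel). */
unsigned char nmea_checksum(const char *body, size_t len)
{
    unsigned char sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum ^= (unsigned char)body[i];
    return sum;
}

/* Hypothetical helper: returns 1 if a full "$...*HH" sentence carries the
 * checksum of the characters between '$' and '*', otherwise 0.
 * Assumes the two checksum digits are uppercase hexadecimal. */
int nmea_sentence_ok(const char *sentence)
{
    const char *star;
    char expected[3];

    assert(sentence != NULL);
    if (sentence[0] != '$')
        return 0;

    star = strchr(sentence, '*');
    if (star == NULL || strlen(star) < 3)
        return 0;   /* no '*' or fewer than two digits after it */

    snprintf(expected, sizeof expected, "%02X",
             (unsigned)nmea_checksum(sentence + 1,
                                     (size_t)(star - sentence - 1)));
    return strncmp(star + 1, expected, 2) == 0;
}
```

With the sentence used in test_Checksum, nmea_checksum returns 0x7F, matching the '7F' that the MATLAB test expects.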

MEX is not a silver bullet for every slow MATLAB function, but it can prove useful. I would always recommend running the MATLAB Profiler over your code at least once to identify regions of poor performance. Poorly written MATLAB can run orders of magnitude slower than well written MATLAB.