[prog] Passing data between modules

Meredydd meredydd at everybuddy.com
Mon Jul 28 23:01:59 EST 2003


On Monday 28 July 2003 10:34, Conor Daly wrote:
> Hi all.
>
> I'm planning a suite of data quality control programs in C to run
> under *NIX.  I want to keep everything as modular as possible in
> order to make the suite flexible and futureproof.  I'm concerned with
> passing data between the modules and how to do this in a portable,
> accessible manner.

OK, this is what I'd recommend. Take it or leave it, it's not as if I've 
implemented a system like this before :^)

These modules don't necessarily need to be separate processes. It's 
possible to make dynamically loadable modules very easily - that way, 
you don't have shared-memory problems, but you still get to pass around 
your actual data structures. If you do go for separate processes, I'd 
use either UNIX or TCP sockets (the latter would help scalability over 
multiple machines if it ever got that big - depends on the app, of 
course) to communicate between the processes. You may want to keep 
certain modules running in the background so they can "warm start" 
rather than paying the overhead of loading up each time (again, depends 
on the startup overhead of the app). Sockets in listen mode help again 
here.
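
For the separate-processes route, here's a rough sketch of what I mean 
(untested in any real system, so treat it as an illustration only). It 
uses a UNIX socket pair and fork() to hand one record to a child 
"module" and read the checked record back. The struct reading layout, 
the check_in_child() name and the -50..60 plausibility range are all 
made up for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fixed-size record. It contains no pointers, so its raw
 * bytes can be sent as-is between processes built by the same compiler
 * on the same machine. */
struct reading {
    int    station_id;
    double value;
    int    suspect;     /* filled in by the checking module */
};

/* Stream sockets may transfer fewer bytes than asked; loop until done. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;              /* error, or peer closed early */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send *r to a child process over a UNIX socket pair and read back the
 * checked copy. Returns 0 on success, -1 on any failure. */
int check_in_child(struct reading *r)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: plays the part of a separate QC module. */
        struct reading in;
        close(sv[0]);
        if (read_full(sv[1], &in, sizeof in) < 0)
            _exit(1);
        in.suspect = (in.value < -50.0 || in.value > 60.0);
        if (write_full(sv[1], &in, sizeof in) < 0)
            _exit(1);
        _exit(0);
    }

    /* Parent: send the record, wait for the answer, reap the child. */
    int rc = 0;
    int status;
    close(sv[1]);
    if (write_full(sv[0], r, sizeof *r) < 0 ||
        read_full(sv[0], r, sizeof *r) < 0)
        rc = -1;
    close(sv[0]);
    if (waitpid(pid, &status, 0) < 0)
        rc = -1;
    return rc;
}
```

A long-running "warm start" module would instead bind() and listen() on 
a named socket and accept() one connection per request, but the 
read/write loop would look much the same.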

One big problem is going to be exchanging data between your modules. 
It's easiest to keep it in memory (which is why I'd recommend 
in-process modules), and just pass around pointers to well-thought-out 
data structures. Otherwise, you're going to have to encode each piece 
of data you want to transfer into a stream of bytes, and decode at the 
other end. This is boring and error-prone. read() and write() do a good 
job for simple structures, but the moment a structure contains a 
pointer, that stops working: the receiver gets an address that means 
nothing in its own memory, so you have to follow each pointer explicitly 
and send what it points to.
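
To make the pointer problem concrete, here's a minimal sketch of the 
encode/decode step for a structure with one pointer in it. The struct 
station type, the function names and the wire layout (id, then name 
length, then name bytes) are all invented for illustration. Integers 
are copied in host byte order, so this only assumes both ends are the 
same kind of machine.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with a pointer member. Writing the struct itself
 * would send the address stored in name, not the characters. */
struct station {
    int32_t id;
    char   *name;       /* NUL-terminated string */
};

/* Encode as: id (4 bytes), name length (4 bytes), name bytes (no NUL).
 * Returns the number of bytes used, or 0 if buf is too small. */
size_t encode_station(const struct station *s, unsigned char *buf, size_t cap)
{
    uint32_t len = (uint32_t)strlen(s->name);
    size_t need = sizeof s->id + sizeof len + len;

    if (need > cap)
        return 0;
    memcpy(buf, &s->id, sizeof s->id);
    memcpy(buf + sizeof s->id, &len, sizeof len);
    memcpy(buf + sizeof s->id + sizeof len, s->name, len);
    return need;
}

/* Decode n bytes from buf into *out. Allocates out->name, which the
 * caller must free(). Returns 0 on success, -1 on truncated input or
 * allocation failure (out->name is left untouched in that case). */
int decode_station(const unsigned char *buf, size_t n, struct station *out)
{
    uint32_t len;
    size_t header = sizeof out->id + sizeof len;
    char *name;

    if (n < header)
        return -1;
    memcpy(&len, buf + sizeof out->id, sizeof len);
    if (n - header < len)
        return -1;              /* claimed length runs past the buffer */

    name = malloc((size_t)len + 1);
    if (name == NULL)
        return -1;
    memcpy(name, buf + header, len);
    name[len] = '\0';

    memcpy(&out->id, buf, sizeof out->id);
    out->name = name;
    return 0;
}
```

Every pointer in every structure needs this treatment, which is where 
the "boring and error-prone" part comes from.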

As for language, I think C will do the job - won't be pretty, but it's 
not a horrible monster, and it has the speed. My personal choice would 
be Java because, if you do want to go the separate-processes route, 
there's a very neat set of "serialisation" mechanisms, allowing you to 
just send an object along a bytestream and have the VM at the other end 
reconstruct it for you. Of course, the overhead (possibly learning it, 
VM performance penalties, non-Free stuff if that's important to you) 
could well be prohibitive, but it's the solution that *I* would go for, 
with my own skillset. Oh yeah, and it has bounds checking, and 
NullPointerExceptions are considerably friendlier than SIGSEGVs...

Meredydd


