[techtalk] switch function in C (or how to read commandline args?)

Jeff Dike jdike at karaya.com
Mon Jul 2 20:17:58 EST 2001


conor.daly at oceanfree.net said:
> Now that's interesting.  So how do I tokenise the strings? 

OK, let's say you have an interpreted language, and a program being 
interpreted has this in an inner loop:
	foo;
	bar;
	baz;

If you represent pieces of this program with strings, you'll end up with an 
array like [ "foo", "bar", "baz" ] for that loop body.

The interpreter will need to do something like this for each iteration of that 
loop:
	if(!strcmp(instruction, "foo")) do_foo();
	else if(!strcmp(instruction, "bar")) do_bar();
	else if(!strcmp(instruction, "baz")) do_baz();
	else if(!strcmp(instruction, "hoo")) do_hoo();
	else if(!strcmp(instruction, "ha")) do_ha();

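which will be slow: in the worst case, every dispatch runs one strcmp per 
known instruction, and each strcmp scans its strings character by character.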

So, what's normally done instead is that when the program is read in, a 
conversion like this happens:
	"foo;" -> FOO_INSTR
	"bar;" -> BAR_INSTR
	"baz;" -> BAZ_INSTR

where *_INSTR are #defined constants or enum constants.
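
As a rough sketch of that conversion step, something like the following would 
do.  The names tokenize, instr_table and INVALID_INSTR are made up for this 
example, and it assumes the trailing ';' has already been stripped from each 
instruction before lookup:

```c
#include <stddef.h>
#include <string.h>

/* Opcodes for the example instructions.  INVALID_INSTR is a hypothetical
 * marker for text that matches no known instruction. */
enum instr {
	FOO_INSTR, BAR_INSTR, BAZ_INSTR, HOO_INSTR, HA_INSTR, INVALID_INSTR
};

/* Lookup table pairing each source spelling with its opcode. */
static const struct {
	const char *name;
	enum instr op;
} instr_table[] = {
	{ "foo", FOO_INSTR },
	{ "bar", BAR_INSTR },
	{ "baz", BAZ_INSTR },
	{ "hoo", HOO_INSTR },
	{ "ha",  HA_INSTR  },
};

/* Convert one instruction name to its opcode.  The strcmp cost is paid
 * here, once per instruction, when the program is read in. */
enum instr tokenize(const char *name)
{
	size_t i;

	for(i = 0; i < sizeof(instr_table) / sizeof(instr_table[0]); i++)
		if(!strcmp(name, instr_table[i].name))
			return instr_table[i].op;
	return INVALID_INSTR;
}
```

The string comparisons haven't gone away, they've just moved out of the inner 
loop and into the loader.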

So, now the inner loop is represented as this array:
	[ FOO_INSTR, BAR_INSTR, BAZ_INSTR ]

And the interpreter can be implemented using a switch:

	switch(instruction){
	case FOO_INSTR:
		do_foo();
		break;
	case BAR_INSTR:
		do_bar();
		break;
	case BAZ_INSTR:
		do_baz();
		break;
	case HOO_INSTR:
		do_hoo();
		break;
	case HA_INSTR:
		do_ha();
		break;
	...
	}

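which will be a lot faster: each dispatch tests a single integer, and 
compilers commonly turn a switch over small consecutive values into a jump 
table.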
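
To make the whole thing concrete, here is a minimal sketch of an interpreter 
running the tokenized loop body many times.  run_loop and the call counters 
are invented for the example, and the do_* handlers are stand-ins that just 
count how often they get called:

```c
#include <stddef.h>

enum instr { FOO_INSTR, BAR_INSTR, BAZ_INSTR };

/* Hypothetical stand-ins for the real handlers: each counts its calls. */
static int foo_calls, bar_calls, baz_calls;

static void do_foo(void) { foo_calls++; }
static void do_bar(void) { bar_calls++; }
static void do_baz(void) { baz_calls++; }

/* Run the loop body `passes` times, dispatching each opcode with a switch.
 * No string comparison happens anywhere in here. */
void run_loop(const enum instr *body, size_t len, int passes)
{
	int pass;
	size_t i;

	for(pass = 0; pass < passes; pass++){
		for(i = 0; i < len; i++){
			switch(body[i]){
			case FOO_INSTR:
				do_foo();
				break;
			case BAR_INSTR:
				do_bar();
				break;
			case BAZ_INSTR:
				do_baz();
				break;
			}
		}
	}
}
```

Calling run_loop on the array { FOO_INSTR, BAR_INSTR, BAZ_INSTR } with 1000 
passes calls each handler 1000 times.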

This pays off because the cost of tokenizing the strings once at startup time 
can be amortized across a large number of passes over the data as the program 
is interpreted.  If there's just one pass over it, as in the case of parsing a 
command line or a config file, then this probably isn't worth it.
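
For the one-pass case, a plain strcmp chain is usually fine.  Here is a small 
sketch for a command line, where the struct opts type, the parse_args name and 
the -v and -q options are all invented for illustration:

```c
#include <string.h>

/* Hypothetical option set, for illustration only. */
struct opts {
	int verbose;
	int quiet;
};

/* Parse argv in a single pass.  Returns 0 on success, or the index of the
 * first unrecognized argument (always >= 1, since argv[0] is skipped). */
int parse_args(int argc, char **argv, struct opts *opts)
{
	int i;

	opts->verbose = 0;
	opts->quiet = 0;
	for(i = 1; i < argc; i++){
		if(!strcmp(argv[i], "-v")) opts->verbose = 1;
		else if(!strcmp(argv[i], "-q")) opts->quiet = 1;
		else return i;
	}
	return 0;
}
```

Each argument is compared once and never looked at again, so there is nothing 
to amortize a tokenizing step over.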

				Jeff





