[Courses] [C] Beginner's Lesson: How do I...?

James Davis james.davis at st-peters.oxford.ac.uk
Fri Jun 13 09:20:29 EST 2003


On Fri, 13 Jun 2003, Anand R wrote:

> And then I took a look at this one. There are many programs like wuftpd,
> apache etc which use config files. I was wondering if somebody from the team
> could share a document or experience on reading from a config file and using
> the values as program arguments. How do I read from a config file and use
> the read values as program arguments. Even some sample code would be great.

You need to write a parser for the language of the configuration file.
Given that your configuration file is probably very simple, the parser
should be quite easy to write by hand. However, I assume the part you are
having problems with is matching the strings in the file.
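
Since you asked for sample code, here is a rough sketch of what "by hand"
could look like in C. It is only an illustration under my own assumptions:
one "key = value;" setting per line, lines starting with '#' treated as
comments, and "logfile" as the only known setting. The names trim,
parse_line, load_settings and struct settings are made up for this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Strip leading and trailing whitespace in place; returns the new start. */
char *trim(char *s)
{
    char *end;

    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/*
 * Split one line of the assumed form "key = value;" into key and value.
 * The line is modified in place.
 * Returns 1 for a setting, 0 for a blank or '#' comment line,
 * and -1 for a line that does not fit the assumed format.
 */
int parse_line(char *line, char **key, char **value)
{
    char *eq, *semi;

    line = trim(line);
    if (*line == '\0' || *line == '#')
        return 0;
    eq = strchr(line, '=');
    if (eq == NULL)
        return -1;
    semi = strrchr(eq, ';');
    if (semi == NULL)
        return -1;
    *eq = '\0';
    *semi = '\0';
    *key = trim(line);
    *value = trim(eq + 1);
    return (**key == '\0') ? -1 : 1;
}

/* The program's settings; only "logfile" is known in this sketch. */
struct settings {
    char logfile[256];
};

/*
 * Read settings from an open file. Lines longer than the buffer are
 * not handled specially here. Returns 0 on success, -1 on error.
 */
int load_settings(FILE *fp, struct settings *cfg)
{
    char buf[512];
    char *key, *value;
    int lineno = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        int rc = parse_line(buf, &key, &value);

        lineno++;
        if (rc == 0)
            continue;
        if (rc < 0) {
            fprintf(stderr, "line %d: syntax error\n", lineno);
            return -1;
        }
        if (strcmp(key, "logfile") == 0) {
            snprintf(cfg->logfile, sizeof cfg->logfile, "%s", value);
        } else {
            fprintf(stderr, "line %d: unknown setting '%s'\n", lineno, key);
            return -1;
        }
    }
    return 0;
}
```

You would call load_settings() with a FILE * from fopen() and then use
cfg.logfile wherever you would otherwise have used a program argument.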

Although probably not as efficient as hand-writing the code you need,
there are some tools you can use to help you with this stage (and others!)
of the process. The task of matching strings to lexemes or tokens is
called lexical analysis. Very roughly, a lexeme is the smallest unit of a
language. For example, your configuration file may have a line

logfile = /var/log/foo;

which might translate into the stream of tokens
'LOGFILE,EQUALS,/var/log/foo,SEMICOLON'. Note that the lexical analyser
may produce the same token stream from

lOgFiLe = /var/log/foo;

depending on how you have defined your tokens to be represented as
strings. The good thing is that after lexical analysis, instead of one
large string, you now have a stream of tokens, which should be a lot easier
to handle. As an example, code such as

if (tok==SEMICOLON) { ... }

might now be perfectly valid. Now the good part: you don't have to write
your own lexical analyser; you can use a tool called lex. Given enough
details about the strings representing your tokens, lex will produce a
lexical analyser for you to use. I can't go into a description of how
to use lex here, but I think the following web pages will give you a good
start.

http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?lexical+analysis
http://www.combo.org/lex_yacc_page/lex.html
http://directory.google.com/Top/Computers/Programming/Compilers/Lexer_and_Parser_Generators/
http://dinosaur.compilertools.net/
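
To make the idea of a token stream concrete, here is a rough hand-written
sketch in C of an analyser for the example line above, i.e. the sort of
code lex would save you from writing. LOGFILE, EQUALS and SEMICOLON are the
tokens from my example; VALUE, END_OF_INPUT, same_word and next_token are
names I have made up for this illustration, and matching keywords without
regard to case is just one possible choice.

```c
#include <ctype.h>
#include <stddef.h>

enum token { LOGFILE, EQUALS, VALUE, SEMICOLON, END_OF_INPUT };

/* Compare two strings ignoring case, so "lOgFiLe" matches "logfile". */
static int same_word(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Return the next token from *input and advance *input past it.
 * The matched characters are copied into text, which must have room
 * for at least one byte; longer lexemes are truncated to size - 1.
 */
enum token next_token(const char **input, char *text, size_t size)
{
    const char *p = *input;
    size_t n = 0;

    /* Whitespace separates tokens but is not a token itself. */
    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {
        text[0] = '\0';
        *input = p;
        return END_OF_INPUT;
    }

    /* Single-character tokens. */
    if (*p == '=' || *p == ';') {
        enum token t = (*p == '=') ? EQUALS : SEMICOLON;

        if (n + 1 < size)
            text[n++] = *p;
        text[n] = '\0';
        *input = p + 1;
        return t;
    }

    /* Anything else runs up to the next space, '=' or ';'. */
    while (*p != '\0' && !isspace((unsigned char)*p) &&
           *p != '=' && *p != ';') {
        if (n + 1 < size)
            text[n++] = *p;
        p++;
    }
    text[n] = '\0';
    *input = p;

    /* A word is either a known keyword or a plain value. */
    return same_word(text, "logfile") ? LOGFILE : VALUE;
}
```

Calling next_token() repeatedly on "lOgFiLe = /var/log/foo;" should yield
LOGFILE, EQUALS, VALUE (with text holding "/var/log/foo"), SEMICOLON and
then END_OF_INPUT, so the rest of your program only ever compares tokens.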

O'Reilly publish a book called 'Lex and Yacc' that should give the
best overview of how this all works, along with some clear examples. If
your configuration file starts to get more complex, you may also wish to
investigate yacc, which generates the parser in the way lex generates the
lexical analyser.

The definitive textbook on compilers is "Compilers: Principles,
Techniques, and Tools" by Aho, Sethi and Ullman, of which chapter three
covers lexical analysis. It's expensive and personally I only use it as a
reference. I don't recommend buying it if you're only curious about this.

Returning to my first paragraph: using lex and other compiler tools may be
overkill for your configuration files. However, you may not feel like
reinventing the wheel today, or like testing and debugging your own lexical
analyser. It's also worth remembering that the hardest part of the whole
process is not writing the code to lex and parse the input; it's
formalizing the lexemes, grammar and semantics of your input. I hope I've
not confused you and I've managed to explain myself well. Let me know if I
haven't and I'll try to help.

James

-- 
James Davis           \                  james.davis at unix.net
St. Peter's College     \                     jamesd at jml.net
PGP Key ID : 0x7E1F718A   \  http://users.ox.ac.uk/~spet1067/



