[prog] lex and yacc

Daniel. cristofd at hevanet.com
Thu Jan 22 07:42:12 EST 2004


>It is a language from a "Theory of Computing" text book by Brookshear.  the
>language consist on incr, decr, while name <> do end; alphanumeric variable
>names and integers.  Integers are the only type of variable so no type
>checking is required.
>
>statements include:
>	incr name;
>	decr name;
>	while name <> 0 do;
>		.
>		.
>	end;
>
>Brookshear then allowe the language to be extended by
>	clear name;
>	name1 <- name2
>
>This language is capable of doing everything that a turing machine can do.
>Though it isn't intended to be a programming language for use, just to make a
>point about theory.
>If you like assembler you should like this one.  If I ever get this compiler
>working you can have a copy if you like!

Just a few comments.
Lex and yacc only help you make half a compiler, the "front end", 
whose job is to analyze the programs in the source language. For the 
"back end" which synthesizes an equivalent program in the target 
language, you're on your own. The boundary between lex's 
responsibilities and yacc's responsibilities is somewhat 
flexible--maybe "abusable" would be a better word. For this language, 
you probably want to have lex recognize the keywords "incr", "decr", 
"end", etc. as well as the semicolon, "<-", and variable names. Units 
at around this level are called "tokens". Lex's job (or a lex 
program's job, rather) is to break the text stream into tokens, pass 
the tokens to yacc one at a time (numeric codes for them [better yet, 
descriptive #defined names for numeric codes], not text strings, 
except in the case of variable names, where a text string is passed 
as well as the code that means "name"), and discard the whitespace 
(though it may use it to spot token boundaries). Yacc's job is to 
take the tokens and figure out how they relate, what each means in 
context, and what the structure of the program is. For instance, if 
you were processing C, it would be yacc's job to figure out that the 
multiplication is done first in "6+a*n", to match "(" with ")" and 
"{" with "}", to relate the parts of a "while" statement or a 
function definition to each other, to tell the difference between 
what a variable name means in a declaration and what it means in a 
statement, and so on. In this tiny language you're working with, yacc 
mostly has to match "while" with "end" and keep a list of variable 
names; and it doesn't even have to match "while" with "end" if your 
target language is C or the like--it could just pass that problem on 
to the C compiler. You may have to do a harder language to get a good 
idea of yacc.

Incidentally, I've written at least a dozen programs in a tiny 
language that's very closely related to the language you're dealing 
with. It's an interesting challenge. (I wonder if Yaroslav has seen 
this one yet. See the sample program and URL in my sig. Sorry about 
the vulgar name.)

Anyway. Good luck.
-Daniel Cristofani.
-- 
  >>>++[<++++++++[<[<++>-]>>[>>]+>>+[-[->>+<<<[<[<<]<+>]>[>[>>]]]<[>>[-]]>[>[-
  <<]+<[<+<]]+<<]<[>+<-]>>-]<.[-]>>]http://www.hevanet.com/cristofd/brainfuck/


More information about the Programming mailing list