What is Lex?
Lex is officially known as a "Lexical Analyser".
Its main job is to break up an input stream into more usable elements.
Or, in other words, to identify the "interesting bits" in a text file.
For example, if you are writing a compiler for the C programming language, the symbols { } ( ) ; all have significance on their own. The letter a usually appears as part of a keyword or variable name, and is not interesting on its own. Instead, we are interested in the whole word. Spaces and newlines are completely uninteresting, and we want to ignore them, unless they appear within quotes "like this".
All of these things are handled by the Lexical Analyser.
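To make the idea concrete, here is a minimal hand-written sketch in C of the kind of work a lexical analyser does. It is not what lex itself generates; the names TokenKind, Token and next_token are invented for this illustration, and the rules are far simpler than those of real C.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds, chosen only for this illustration. */
typedef enum { TOK_END, TOK_PUNCT, TOK_WORD, TOK_NUMBER, TOK_STRING } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* first character of the token inside the input */
    size_t length;      /* number of characters in the token */
} Token;

/* Scan one token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor)
{
    const char *p = *cursor;
    Token tok;

    /* Whitespace between tokens is uninteresting: skip it. */
    while (isspace((unsigned char)*p))
        p++;

    tok.start = p;
    if (*p == '\0') {
        tok.kind = TOK_END;
    } else if (*p == '"') {
        /* Quoted text is kept as one token, spaces included. */
        tok.kind = TOK_STRING;
        p++;
        while (*p != '\0' && *p != '"') {
            if (*p == '\\' && p[1] != '\0')
                p++;            /* step over the backslash of an escape */
            p++;
        }
        if (*p == '"')
            p++;                /* consume the closing quote */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        /* A letter is only interesting as part of a whole word. */
        tok.kind = TOK_WORD;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        /* Anything else, such as { } ( ) ; is a token on its own. */
        tok.kind = TOK_PUNCT;
        p++;
    }

    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

Calling next_token repeatedly on the same cursor until it returns TOK_END walks through the whole input, one token at a time. The caller never sees the spaces between tokens, but a quoted string arrives intact as a single TOK_STRING.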
What is Yacc?
Yacc is officially known as a "parser".
Its job is to analyse the structure of the input stream, and operate on the "big picture".
In the course of its normal work, the parser also verifies that the input is syntactically sound.
Consider again the example of a C compiler. In the C language, a word can be a function name or a variable, depending on whether it is followed by a ( or a =. There should be exactly one } for each { in the program.
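The two checks just described can be sketched by hand in C. This is only an illustration of the kind of structural question a parser answers, not the table-driven code that yacc produces. The names WordRole, classify_word and braces_balance are made up for this example, and the tokens are assumed to arrive as plain strings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical categories for a word, used only in this sketch. */
typedef enum { WORD_FUNCTION, WORD_VARIABLE, WORD_UNKNOWN } WordRole;

/* Guess the role of a word from the token that follows it. */
WordRole classify_word(const char *next_token)
{
    if (strcmp(next_token, "(") == 0)
        return WORD_FUNCTION;
    if (strcmp(next_token, "=") == 0)
        return WORD_VARIABLE;
    return WORD_UNKNOWN;
}

/* Return true if every { is closed by a later } and no } comes too early. */
bool braces_balance(const char *const tokens[], size_t count)
{
    size_t depth = 0;

    for (size_t i = 0; i < count; i++) {
        if (strcmp(tokens[i], "{") == 0) {
            depth++;
        } else if (strcmp(tokens[i], "}") == 0) {
            if (depth == 0)
                return false;   /* a } with no { before it */
            depth--;
        }
    }
    return depth == 0;          /* leftover depth means a missing } */
}
```

Note that braces_balance looks only at the order of tokens, never at individual characters. That is the "big picture" view: the parser works on what the lexical analyser hands it.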
YACC stands for "Yet Another Compiler Compiler". This is because this kind of analysis of text files is normally associated with writing compilers.
However, as we will see, it can be applied to almost any situation where text-based input is being used.
For example, a C program may contain something like:
{ int int; int = 33; printf("int: %d\n",int); }
In this case, the lexical analyser would have broken the input stream into a series of "tokens", like this:
{ int int ; int = 33 ; printf ( "int: %d\n" , int ) ; }
Note that the lexical analyser has already determined that where the keyword int appears within quotes, it is really just part of a literal string. It is up to the parser to decide if the token int is being used as a keyword or variable. Or it may choose to reject the use of the name int as a variable name. The parser also ensures that each statement ends with a ; and that the brackets balance.
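As a rough sketch of that last decision, the C fragment below accepts only the three-token pattern "int NAME ;" and rejects it when NAME is itself a reserved word. It is a hand-written toy, not generated by yacc. The helper names and the tiny keyword list are assumptions made for this illustration, and real C declarations are far richer.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A deliberately tiny keyword list; real C reserves many more words. */
static const char *const KEYWORDS[] = { "int", "char", "return" };

/* Return true if the word is on the keyword list above. */
bool is_keyword(const char *word)
{
    for (size_t i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; i++) {
        if (strcmp(word, KEYWORDS[i]) == 0)
            return true;
    }
    return false;
}

/* Return true if the token is a word: a letter or _ then letters, digits, _. */
bool is_word(const char *token)
{
    if (!(isalpha((unsigned char)token[0]) || token[0] == '_'))
        return false;
    for (size_t i = 1; token[i] != '\0'; i++) {
        if (!(isalnum((unsigned char)token[i]) || token[i] == '_'))
            return false;
    }
    return true;
}

/* Accept exactly "int NAME ;" where NAME is a word that is not reserved. */
bool is_valid_int_declaration(const char *const tokens[], size_t count)
{
    return count == 3
        && strcmp(tokens[0], "int") == 0
        && is_word(tokens[1])
        && !is_keyword(tokens[1])
        && strcmp(tokens[2], ";") == 0;
}
```

With this rule the tokens int int ; are rejected, while int count ; is accepted. A sequence with no closing ; fails as well, which mirrors the check that each statement ends with a semicolon.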
Lex and yacc are tools for building programs. Their output is itself code, which needs to be fed into a compiler; typically, additional user code is added to use the code generated by lex and/or yacc. Some simple programs can get by on almost no additional code; others use a parser as a tiny portion of a much larger and more complicated program.