[SOLVED] lex and yacc

rit · 08-20-2012, 11:08 AM

Hello everyone,
I am new to lex and yacc and am finding them difficult to learn.
Please suggest a tutorial so that I can learn them from scratch.

devnull10 · 08-20-2012, 03:57 PM

It's difficult to say without knowing how much you already know about the fundamentals that lex and yacc build on.
Do you have any experience with regular expressions (lex) and grammars (yacc)?

My suggestion to start off would be to write yourself a very small formal grammar. Something as simple as an "Adding Machine" which takes a set of numbers and adds them together with the + symbol. Although that sounds extremely simple, it will be more than enough of a challenge for a beginner. You can use lex to find the numbers and the addition symbol, then pass those tokens to yacc, where you define what a valid sum looks like and what to do with it.
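To make that division of labour concrete, here is a minimal hand-written C sketch of the adding machine. It is not lex or yacc output, and the names `next_token` and `add_line` are made up for illustration. `next_token` does the job lex would do (finding numbers and `+`), and `add_line` does the job a yacc rule along the lines of `sum: NUMBER | sum '+' NUMBER` would do. Overflow of very large sums is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token kinds the scanner can report (hypothetical names). */
enum token { TOK_NUMBER, TOK_PLUS, TOK_END, TOK_ERROR };

/* Scanner: the job lex would do. Skips blanks, returns the next token
 * starting at *cursor and advances *cursor past it. For TOK_NUMBER the
 * numeric value is stored in *value. */
static enum token next_token(const char **cursor, long *value)
{
    const char *p = *cursor;
    enum token kind;

    while (*p == ' ' || *p == '\t')
        p++;

    if (*p == '\0') {
        kind = TOK_END;
    } else if (*p == '+') {
        p++;
        kind = TOK_PLUS;
    } else if (isdigit((unsigned char)*p)) {
        long n = 0;
        while (isdigit((unsigned char)*p))
            n = n * 10 + (*p++ - '0');
        *value = n;
        kind = TOK_NUMBER;
    } else {
        kind = TOK_ERROR;
    }
    *cursor = p;
    return kind;
}

/* Parser: the job a yacc grammar would do. Accepts
 *     NUMBER ('+' NUMBER)*
 * Returns 0 and stores the sum in *result on success,
 * -1 on a syntax error. */
int add_line(const char *text, long *result)
{
    long value = 0, sum = 0;

    assert(text != NULL && result != NULL);

    if (next_token(&text, &value) != TOK_NUMBER)
        return -1;
    sum = value;

    for (;;) {
        enum token t = next_token(&text, &value);
        if (t == TOK_END)
            break;
        if (t != TOK_PLUS)
            return -1;
        if (next_token(&text, &value) != TOK_NUMBER)
            return -1;
        sum += value;
    }
    *result = sum;
    return 0;
}
```

Calling `add_line("1 + 2 + 39", &r)` returns 0 and leaves 42 in `r`, while input such as `"1 +"` or `"1 2"` is rejected with -1. Once this makes sense, rewriting the scanner half as a lex file and the parser half as a yacc grammar is a good first exercise.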

The whole field around this area can be pretty mind-boggling though - it depends what you want to use it for.

tronayne · 08-20-2012, 04:01 PM

There is a useful book, Lex & Yacc, 2nd ed., ISBN 978-1565920002 available at Amazon.com http://www.amazon.com/lex-yacc-Doug-...s=lex+%26+yacc in both paperback and Kindle editions.

I have the first edition and found it useful for learning the ins and outs of these tools.

Hope this helps some.

Tinkster · 08-20-2012, 06:00 PM

Moved: This thread is more suitable in <PROGRAMMING> and has been moved accordingly to help your thread/question get the exposure it deserves.

sundialsvcs · 08-20-2012, 07:21 PM

I will second the notion that this book is a very good place to start. If you, say, basically know nothing at this point about regular expressions and/or the vagaries of LALR(1) grammars and parsing theory, then ... "you're going to take a sip from the firehose," and any firehose will do, but this book is a good one if you prefer ... as I do ... to sit down with a good book (and a good glass of fine wine or equally-fine Scotch, to sip either one).

Don't get buried in the theory, which comes too-quickly. Grab the "big picture." (The source-code is incomprehensible. ... but it works.)

If you have further comments or questions after doing your initial studies, feel free to ask here. We'll know.

rit · 08-21-2012, 08:36 AM

Well, yes, I have knowledge of regular expressions (as used in sed and grep).
Actually, I tried learning from the O'Reilly book, but I found it difficult because the examples are very big (and not everything is explained).

tronayne · 08-21-2012, 09:20 AM

Well, regular expressions are... regular expressions; they're much the same across pretty much every editor, the shells (Bash, Bourne, Korn) and lex, apart from minor dialect differences. So, you're already half-way there. This stuff we so love to use all came out of Bell Labs' smart guys who invented it to make the job of building and maintaining Unix easier and more predictable and reliable.

Keep in mind that many compilers, notably C compilers, have been written using lex and yacc. Essentially, you use lex to create the logic for evaluating statements, then hand the output of lex to yacc to generate C code that is compiled (the make utility, for example, "knows" how to generate an executable program from lex and yacc sources). There is heavy use of lex and yacc in, for example, AWK. If you obtain the source code for AWK from Brian Kernighan (the K in AWK) at http://www.cs.princeton.edu/~bwk/, you can study how lex and yacc are used to generate the grammar and syntax; learn from the best, eh?

Getting over the hump of grasping just what lex and yacc are all about can be a challenge -- they're almost alien concepts and structures when compared to top-down programming perhaps in a way that going from a "normal" calculator to an RPN calculator can be a challenge (you don't really "think" in RPN, you think more in 2 x 2 = 4). lex is kind of like that. If you make the effort to study the book, the reward may be more than worth the work you put in -- these two tools can open up the world for you.

Anyway best of luck!

theNbomr · 08-21-2012, 09:58 AM

This Lex and Yacc Tutorial by Tom Niemann has always been a favorite of mine. The use of syntax trees in the examples there can contribute greatly to understanding, once you've started to develop a feel for the important concepts.

--- rod.

sundialsvcs · 08-21-2012, 10:28 AM

Well, the above description is not strictly true ...

"Lex" is a scanner, sometimes called a "lexer." It grabs characters from a stream and forms them into "tokens." For example, the string:

if (a == b) { c = 4; }

... contains a total of (if I have counted properly...) 12 tokens:

'if' '(' <ident> '==' <ident> ')' '{' <ident> '=' <int_const> ';' '}'

"Lex" uses regular-expression technology.
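As a rough illustration of what a scanner does with that line, here is a small hand-written C sketch. It is not what lex generates (lex builds a table-driven matcher from your regular expressions), and the names `scan_c` and `enum c_token` are invented for this example. Each branch corresponds to a pattern you would otherwise write in a lex file, noted in the comments.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token kinds for the tiny C fragment (names are illustrative). */
enum c_token {
    CT_IF, CT_IDENT, CT_INT_CONST, CT_EQ, CT_ASSIGN,
    CT_LPAREN, CT_RPAREN, CT_LBRACE, CT_RBRACE, CT_SEMI
};

/* Scan src and store up to max token kinds in out.
 * Returns the number of tokens found, or -1 on an unknown
 * character or if out is too small. */
int scan_c(const char *src, enum c_token *out, size_t max)
{
    size_t n = 0;
    const char *p = src;

    assert(src != NULL && out != NULL);

    while (*p != '\0') {
        enum c_token kind;

        if (isspace((unsigned char)*p)) {  /* whitespace is not a token */
            p++;
            continue;
        }
        if (isalpha((unsigned char)*p) || *p == '_') {
            /* [A-Za-z_][A-Za-z0-9_]* : keyword or identifier */
            const char *start = p;
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
            kind = (p - start == 2 && strncmp(start, "if", 2) == 0)
                       ? CT_IF : CT_IDENT;
        } else if (isdigit((unsigned char)*p)) {
            /* [0-9]+ : integer constant */
            while (isdigit((unsigned char)*p))
                p++;
            kind = CT_INT_CONST;
        } else if (p[0] == '=' && p[1] == '=') {
            /* longest match first: try "==" before "=" */
            p += 2;
            kind = CT_EQ;
        } else {
            switch (*p++) {
            case '=': kind = CT_ASSIGN; break;
            case '(': kind = CT_LPAREN; break;
            case ')': kind = CT_RPAREN; break;
            case '{': kind = CT_LBRACE; break;
            case '}': kind = CT_RBRACE; break;
            case ';': kind = CT_SEMI;   break;
            default:  return -1;        /* character we do not know */
            }
        }
        if (n == max)
            return -1;
        out[n++] = kind;
    }
    return (int)n;
}
```

Fed the string `if (a == b) { c = 4; }`, `scan_c` reports 12 tokens, matching the count above. Note that the scanner has no idea whether those tokens form a valid statement; that is the parser's job.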

"Yacc" is a parser generator. Using the stream of tokens as its input, the Yacc-generated parser ... which is a "C" subroutine ... would recognize the structure of this stream of 12 tokens, recognizing such things as <if_statement>, <logical_expression>, <assignment_statement>. And also recognizing that the entire stream of tokens is syntactically valid for the grammar of "C."

"Yacc" allows that parser to be generated through the definition of rules, that is to say a "grammar," which formally defines what the incoming language is allowed to be, and what is to be done when various parts of it are recognized. The generated parser is known to be fast, and the parser-generating capability provided by Yacc is generic.

Yacc isn't the only parser generator out there. There are many others, including for example Bison.

Yacc generates a shift/reduce parser, as most such tools do, and perhaps the first step for you ought to be to study the approach that's taken by this important algorithm. By that I mean, "when Yacc tells you that you have a 'shift/reduce conflict', what is it actually saying to you?" Likewise, the nature of the algorithm imposes certain strictures upon the language itself ... "but why?" (And don't just say "LALR(1).")
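For a feel of what "shift" and "reduce" mean, here is a toy hand-rolled C sketch for the grammar `E : E '+' n | n`, where `n` is a single digit. It is only an illustration: a real Yacc parser drives these same two moves from generated LALR(1) state tables rather than by pattern-matching the top of the stack, and the name `shift_reduce_sum` is made up. A "shift" pushes the next token onto the stack; a "reduce" replaces the right-hand side of a rule on top of the stack with its left-hand side.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define SR_STACK_MAX 64

/* Toy shift/reduce recognizer for the grammar
 *     E : E '+' n      (rule 1)
 *       | n            (rule 2)
 * where n is a single decimal digit and no whitespace is allowed.
 * Returns 0 and stores the sum in *result if the input is a valid
 * sentence, -1 otherwise. */
int shift_reduce_sum(const char *input, long *result)
{
    char sym[SR_STACK_MAX];  /* grammar symbols: 'E', '+', 'n' */
    long val[SR_STACK_MAX];  /* semantic values, parallel to sym */
    size_t top = 0;          /* number of symbols on the stack */

    assert(input != NULL && result != NULL);

    for (;;) {
        if (top >= 3 && sym[top - 3] == 'E' &&
            sym[top - 2] == '+' && sym[top - 1] == 'n') {
            /* reduce by rule 1: pop "E + n", push E */
            val[top - 3] += val[top - 1];
            top -= 2;
        } else if (top == 1 && sym[0] == 'n') {
            /* reduce by rule 2: replace n with E */
            sym[0] = 'E';
        } else if (*input != '\0') {
            /* nothing to reduce: shift the next token */
            if (top == SR_STACK_MAX)
                return -1;
            if (isdigit((unsigned char)*input)) {
                sym[top] = 'n';
                val[top] = *input - '0';
            } else if (*input == '+') {
                sym[top] = '+';
                val[top] = 0;
            } else {
                return -1;
            }
            top++;
            input++;
        } else {
            break;  /* no reduction possible and no input left */
        }
    }

    /* accept only if the whole input collapsed to a single E */
    if (top == 1 && sym[0] == 'E') {
        *result = val[0];
        return 0;
    }
    return -1;
}
```

For `"1+2+3"` the stack goes `n`, `E`, `E +`, `E + n`, `E`, and so on until a single `E` with value 6 remains. In this sketch the choice between shifting and reducing is hard-coded (reduce whenever a rule matches). A shift/reduce conflict is Yacc telling you that, for your grammar, it reached a state where both moves look legal with one token of lookahead and it cannot tell which one you meant.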

rit · 08-22-2012, 09:11 AM

thank you guys.....