Programming: This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
It's difficult to say, really, without knowing how much knowledge you have of the fundamentals that lex and yacc are built on.
Do you have any experience and knowledge of regular expressions (lex) and of grammars (yacc)?
My suggestion to start off would be to write yourself a very small formal grammar. Something as simple as an "adding machine" which takes a set of numbers and adds them together with the + symbol. Although that sounds extremely simple, it will be more than enough of a challenge for a beginner. You can use lex to find the numbers and the addition symbol, then push those into yacc to define an expression, etc. etc...
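To make that concrete, here's one way the adding machine might look. This is a sketch only: the token name NUMBER and the file names adder.l / adder.y are my own choices, and the details (yywrap, library linkage) vary between lex/yacc implementations. First the lex side, which finds the numbers and the + symbol:

```lex
%{
#include <stdlib.h>     /* atoi */
#include "y.tab.h"      /* token definitions generated by yacc -d */
%}
%%
[0-9]+      { yylval = atoi(yytext); return NUMBER; }
"+"         { return '+'; }
[ \t]       ;               /* skip whitespace */
\n          { return 0; }   /* end of input */
%%
```

Then the yacc side, which defines what a valid sum looks like and adds as it reduces:

```yacc
%token NUMBER
%%
line: expr              { printf("%d\n", $1); }
    ;
expr: NUMBER            { $$ = $1; }
    | expr '+' NUMBER   { $$ = $1 + $3; }
    ;
%%
#include <stdio.h>
int yyerror(const char *s) { fprintf(stderr, "%s\n", s); return 0; }
int main(void) { return yyparse(); }
```

With the classic tools installed, something like `lex adder.l && yacc -d adder.y && cc lex.yy.c y.tab.c -o adder` (you may need `-ll` and/or `-ly` on your system) should build it, and typing `1 + 2 + 3` at it should print 6.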
The whole field around this area can be pretty mind-boggling though - it depends what you want to use it for.
I will second the notion that this book is a very good place to start. If you, say, basically know nothing at this point about regular expressions and/or the vagaries of LALR(1) grammars and parsing theory, then you're going to take a sip from the firehose, and any firehose will do, but this book is a good one if you prefer, as I do, to sit down with a good book (and a glass of fine wine or equally fine Scotch, to sip either one).
Don't get buried in the theory, which comes too quickly. Grab the "big picture." (The generated source code is incomprehensible ... but it works.)
If you have further comments or questions after doing your initial studies, feel free to ask here. We'll know.
Well, regular expressions are... regular expressions; they're much the same across pretty much every editor, shell (Bash, Bourne, Korn) and lex. So you're already half-way there. This stuff we so love to use all came out of the smart folks at Bell Labs, who invented it to make the job of building and maintaining Unix easier, more predictable, and more reliable.
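To see that portability for yourself: the same character-class notation a lex rule would use, say `[0-9]+` for integers, behaves identically in most regex engines. A quick illustration using Python's `re` module (Python purely for convenience here; the pattern is the point):

```python
import re

# The same pattern a lex rule might use to match integer tokens.
pattern = re.compile(r"[0-9]+")

# findall returns every non-overlapping match, i.e. the integer lexemes.
print(pattern.findall("x = 42 + 7;"))
```

Try the identical pattern in grep, sed, or a lex specification and it picks out the same lexemes.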
Keep in mind that many compilers, notably C compilers, have been written using lex and yacc. Essentially, you use lex to break the input into tokens, then hand the output of lex to yacc, which generates the C code for a parser that is then compiled (the make utility, for example, "knows" how to generate an executable program from lex and yacc sources). There is heavy use of lex and yacc in, for example, AWK. If you obtain the source code for AWK from Brian Kernighan (the K in AWK) at http://www.cs.princeton.edu/~bwk/, you can study how lex and yacc are used to implement the grammar and syntax; learn from the best, eh?
Getting over the hump of grasping just what lex and yacc are all about can be a challenge -- they're almost alien concepts and structures when compared to top-down programming, perhaps in the way that going from a "normal" calculator to an RPN calculator can be a challenge (you don't really "think" in RPN; you think more in 2 x 2 = 4). lex is kind of like that. If you make the effort to study the book, the reward may be more than worth the work you put in -- these two tools can open up the world for you.
This Lex and Yacc Tutorial by Tom Niemann has always been a favorite of mine. The use of syntax trees in the examples there can contribute greatly to understanding, once you've started to develop a feel for the important concepts.
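The syntax-tree idea shows up directly in the yacc actions: instead of computing a value on the spot, each action builds a tree node for a later pass to walk. The constructors `node()` and `leaf()` below are hypothetical stand-ins (Niemann's tutorial defines its own equivalents); this is a sketch of the shape, not drop-in code:

```yacc
expr: expr '+' expr     { $$ = node('+', $1, $3); }  /* build a node, don't evaluate */
    | expr '*' expr     { $$ = node('*', $1, $3); }
    | NUMBER            { $$ = leaf($1); }
    ;
```

Later passes (interpretation, code generation, pretty-printing) then walk the tree instead of being tangled into the grammar.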
"Yacc" is a parser generator. Using the stream of tokens as its input, the Yacc-generated parser ... which is a "C" subroutine ... would recognize the structure of this stream of 12 tokens, recognizing such things as <if_statemet>, <logical_expression>, <assignment_statement>. And also recognizing that the entire stream of tokens is syntactically valid for the grammar of "C."
"Yacc" allows that parser to be generated through the definition of rules, that is to say a "grammar," which formally defines what the incoming language is allowed to be, and what is to be done when various parts of it are recognized. The generated parser is known to be fast, and the parser-generating capability provided by Yacc is generic.
Yacc isn't the only parser generator out there. There are many others, including, for example, Bison (the GNU reimplementation of Yacc).
Yacc generates a shift/reduce parser, as most such tools do, and perhaps the first step for you ought to be to study the approach taken by this important algorithm. By that I mean: "when Yacc tells you that you have a 'shift/reduce conflict', what is it actually saying to you?" Likewise, the nature of the algorithm imposes certain strictures upon the language itself ... "but why?" (And don't just say "LALR(1).")
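A toy trace may help with that question. The hand-written sketch below (Python for illustration only; this is not what yacc emits) parses the grammar E -> E '+' NUM | NUM by repeatedly doing one of two moves: *shift* the next token onto a stack, or *reduce* the top of the stack by a grammar rule. A "shift/reduce conflict" is yacc telling you that in some state both moves are legal and one token of lookahead cannot decide between them:

```python
# Grammar: E -> E '+' NUM | NUM
# Toy shift/reduce loop with no error handling; input is a list of token names.
def parse(tokens):
    stack, trace = [], []
    tokens = list(tokens) + ["$"]          # "$" marks end of input
    i = 0
    while True:
        if stack[-3:] == ["E", "+", "NUM"]:
            stack[-3:] = ["E"]             # reduce by E -> E '+' NUM
            trace.append("reduce E -> E + NUM")
        elif stack[-1:] == ["NUM"]:
            stack[-1:] = ["E"]             # reduce by E -> NUM
            trace.append("reduce E -> NUM")
        elif tokens[i] == "$":
            break                          # nothing left to shift or reduce
        else:
            stack.append(tokens[i])        # shift the next token
            i += 1
            trace.append("shift " + stack[-1])
    return stack, trace

stack, trace = parse(["NUM", "+", "NUM", "+", "NUM"])
print(stack)   # the whole input reduced to a single E
print(trace)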
Last edited by sundialsvcs; 08-21-2012 at 10:35 AM.