LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   Yacc conflicts - How to understand grammar conflict (https://www.linuxquestions.org/questions/programming-9/yacc-conflicts-how-to-understand-grammar-conflict-812094/)

dayalan_cse 06-04-2010 03:06 AM

Yacc conflicts - How to understand grammar conflict
 
Hello,

I am getting shift/reduce and reduce/reduce conflict in my yacc grammar code.

yacc generates (i.e yacc -d -v parser.y) its grammar and state information in the parser.output file.

How can i interpret the below information and understand which grammar state is conflicting with other grammar state? Please help me to understand.


State 159 conflicts: 1 shift/reduce
State 1863 conflicts: 1 shift/reduce
State 1865 conflicts: 1 shift/reduce
State 1960 conflicts: 1 shift/reduce

ForzaItalia2006 06-04-2010 12:52 PM

Hey,

Quote:

Originally Posted by dayalan_cse (Post 3992156)
yacc generates (i.e yacc -d -v parser.y) its grammar and state information in the parser.output file.

Why don't you use bison? I think this is more common nowadays in Linux environments.

Quote:

Originally Posted by dayalan_cse (Post 3992156)
How can i interpret the below information and understand which grammar state is conflicting with other grammar state? Please help me to understand.

Well, I'm not sure how familiar you are with grammars and the specifics of shift/reduce parsers, but this error tries to tell you that there is one (or several states) wherein the parser can't determine if it should reduce the symbols on the stack or if it should shift the current symbol.

Quote:

Originally Posted by dayalan_cse (Post 3992156)
State 159 conflicts: 1 shift/reduce
State 1863 conflicts: 1 shift/reduce
State 1865 conflicts: 1 shift/reduce
State 1960 conflicts: 1 shift/reduce

If I remember correctly from bison debugging output, you should check first of all which rules are in the states mentioned above and then check which rule specifically causes the problems.

What kind of grammar did you build? A programming language?

Andi

ArthurSittler 06-05-2010 02:01 AM

examine state table using -v switch
 
Use the -v switch will make yacc dump the entire state table. When you do this, redirect the output of yacc into a file so that you can examine the state table.

yacc and bison create state tables with a state for each possible step of progress through each production in the grammar. In those cases where there are multiple productions which can lead through the same sequence of nonterminals and tokens, those sequences will be merged. yacc will then look at the next token to decide which available alternative of different grammar productions it should chooses.

Each state is assigned an arbitrary but unique number. Each state is represented by a table of state numbers to which the parser will transition indexed by the token number of the next (i.e., the lookahead) token. Associated with each state is a sequence of nonterminals and tokens that yacc has shifted. The numbers in the error messages are the state numbers of the states where yacc is unable to choose whether it should shift the next token or reduce the shifted tokens to a nonterminal.

Remember that yacc is lalr(1). That is, it can only look at one single lookahead token from the scanner to decide whether to shift the token or reduce the sequence of shifted tokens to a nonterminal. The lookahead token could be the initial token in some sequence that could reduce to something that follows some nonterminal to which some sequence on the top of the sequence could reduce. Or it could be the next token of some other sequence into which some other sequence of nonterminals and tokens at the top of the parse stack could be reduced.

If, for some state, for some single lookahead token, there is both a possibility to shift that token and a possibility to reduce to a nonterminal, then there is a shift/reduce conflict for that state.

If, for some state, for some single lookahead token, there are two different nonterminals into which the parser could reduce some sequence of nonterminals and terminals from the top of the stack, then there is a reduce/reduce conflict.

Once you know what sequence of tokens and nonterminals yacc has seen to get into the state with the shift/reduce conflict, you can look through your grammar to find what productions share the sequence of tokens and nonterminals.

The classical reference for compiler writing using lex and yacc is Compilers Principles, Techniques, and Tools, by Aho, Sethi, and Ullman ISBN 0-201-10088-6. This is commonly known as "The Dragon Book". Though it was published back in 1986, it is still a standard reference for the subject. As you will find by searching the web, there are more modern tools including lalr(k) parser generators with arbitrary (usually very small) k.

dayalan_cse 06-14-2010 12:20 AM

Quote:

Originally Posted by ForzaItalia2006 (Post 3992612)
Hey,



Why don't you use bison? I think this is more common nowadays in Linux environments.



Well, I'm not sure how familiar you are with grammars and the specifics of shift/reduce parsers, but this error tries to tell you that there is one (or several states) wherein the parser can't determine if it should reduce the symbols on the stack or if it should shift the current symbol.



If I remember correctly from bison debugging output, you should check first of all which rules are in the states mentioned above and then check which rule specifically causes the problems.

What kind of grammar did you build? A programming language?

Andi


Hi Andi,

thanks for your information.

What is the difference between yacc and bison? which one is latest?

Thanks,
Deenadayalan

dayalan_cse 06-14-2010 12:23 AM

Quote:

Originally Posted by ArthurSittler (Post 3993092)
Use the -v switch will make yacc dump the entire state table. When you do this, redirect the output of yacc into a file so that you can examine the state table.

yacc and bison create state tables with a state for each possible step of progress through each production in the grammar. In those cases where there are multiple productions which can lead through the same sequence of nonterminals and tokens, those sequences will be merged. yacc will then look at the next token to decide which available alternative of different grammar productions it should chooses.

Each state is assigned an arbitrary but unique number. Each state is represented by a table of state numbers to which the parser will transition indexed by the token number of the next (i.e., the lookahead) token. Associated with each state is a sequence of nonterminals and tokens that yacc has shifted. The numbers in the error messages are the state numbers of the states where yacc is unable to choose whether it should shift the next token or reduce the shifted tokens to a nonterminal.

Remember that yacc is lalr(1). That is, it can only look at one single lookahead token from the scanner to decide whether to shift the token or reduce the sequence of shifted tokens to a nonterminal. The lookahead token could be the initial token in some sequence that could reduce to something that follows some nonterminal to which some sequence on the top of the sequence could reduce. Or it could be the next token of some other sequence into which some other sequence of nonterminals and tokens at the top of the parse stack could be reduced.

If, for some state, for some single lookahead token, there is both a possibility to shift that token and a possibility to reduce to a nonterminal, then there is a shift/reduce conflict for that state.

If, for some state, for some single lookahead token, there are two different nonterminals into which the parser could reduce some sequence of nonterminals and terminals from the top of the stack, then there is a reduce/reduce conflict.

Once you know what sequence of tokens and nonterminals yacc has seen to get into the state with the shift/reduce conflict, you can look through your grammar to find what productions share the sequence of tokens and nonterminals.

The classical reference for compiler writing using lex and yacc is Compilers Principles, Techniques, and Tools, by Aho, Sethi, and Ullman ISBN 0-201-10088-6. This is commonly known as "The Dragon Book". Though it was published back in 1986, it is still a standard reference for the subject. As you will find by searching the web, there are more modern tools including lalr(k) parser generators with arbitrary (usually very small) k.

Hi,

Thanks for your inputs and it helped me to understand.

If we have such a big grammar for a given language, how to trace and figure out the conflicting between two non-terminals or a token (shift) and non-terminal?

Any debugging tool available to debug this shift/shift & reduce/reduce problems except going through the <yyparse>.output whole grammar?

Thank you for your inputs in advance.

Thanks,
Deenadayalan

ArthurSittler 06-15-2010 02:05 AM

look at progress string for conflicting state
 
Bison is the GNU free replacement for yacc. It has some additional features and possibly improved performance. It also had a somewhat restrictive license in that, if you use bison, any program generated by using that tool must also be made as freely available as the tool. Byacc is another version which requires that any program generated using it must acknowledge the copyright holders. I do not find any of the license restrictions onerous.

The states dumped by the -v command each have associated parse progress indicator comments and state numbers. The parse progress indicator shows what productions you might be trying to reduce in that state. The error messages tell you which states to look at for the problem. You may be able to look at the states and recognize which productions are causing the conflicts.

The brute-force method to debugging yacc conflicts is also the gentlest approach: start with some small but reasonable subset of the grammar. For compilers, this could be expressions, for instance. Also build a test input file which includes only the subset of the language accepted by the parser. You can add actions to the parser as you accumulate working grammar, or add the actions later. Then as you add more and more productions, you will start encountering conflicts. Along with the conflict error messages, this incremental development of the grammar will show you which productions cause the conflicts.

I start the whole process with a scanner generated using flex, the replacement for lex. I build a file including every token that the scanner should recognize and test the scanner with a test driver main() function. I enclose the main() in preprocessor conditionals so I can turn it off after the scanner accepts the test language file.

A makefile which invokes lex, yacc, cc, etc. helps me with the build process. I consider it indispensable. I do not believe I could ever actually remember all the command line switches. Don't forget to link in the libraries associated with flex and bison. But you must have gotten past that if you are up to the point of conflicts.

MTK358 06-15-2010 08:04 AM

He was probably using Bison (it comes with most Linux distros). "yacc" is a symlink to "bison" in most Linux installations.


All times are GMT -5. The time now is 02:57 PM.