08-15-2016, 08:13 AM   #1
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Rep: Reputation: 51
Book "Compiler Construction using Flex and Bison" - problems, discussions, steps, ...


Hello. I started reading the book "Compiler Construction using Flex and Bison", freely available at http://research.microsoft.com/en-us/...l/compiler.pdf. Same file mirrored here: http://balacobaco.insomnia247.nl/dedec0/compiler.pdf .

The book has a few errors, but I was able to fix them up to chapter 3. Now, in chapter 4, I have reached a point where I cannot make some of the modifications the text calls for without getting a compilation error from bison.

Below is a working bison file (it compiles with no output). The lines that cause the problem when added or changed are described in the comment at the top:

Code:
%start program

/*

 We should create this union, although it does not have more than one declaration in it:

%union {   /* SEMANTIC RECORD * /
char *id;  /* For returning identifiers * /
}

  We should change IDENTIFIER token to get the name of declared variables:

%token IDENTIFIER

   becomes:

%token <id> IDENTIFIER /* Simple identifier * /

*/

%token <id> IDENTIFIER /* Simple identifier */
%token IDENTIFIER
%token LET INTEGER INT IN
%token SKIP IF THEN ELSE FI END WHILE DO READ WRITE
%token NUMBER
%token ASSGNOP
%left '-' '+'
%left '*' '/'
%left '<' '>' '=' '' /* were missing in the book */
%right '^ '

%{

#include <stdlib.h> /* For malloc in symbol table */
#include <string.h> /* For strcmp in symbol table */
#include <stdio.h> /* For error messages */
#include "st.h" /* The Symbol Table Module */
#define YYDEBUG 1 /* for debugging */

int install( char* sym_name)
{
    symrec *s;
    s = getsym(sym_name);
    if (s == 0)
	s = putsym (sym_name);
    else
    {
	errors++;
	printf("%s is already defined\n", sym_name);
	return 0;
    }
    return 1;
}

int context_check(char* sym_name)
{
    if ( getsym( sym_name ) == 0 )
    {
	printf("%s is an undeclared identifier\n", sym_name);
	return 0;
    }
    return 1;
}


%}

%%

 /* Grammar rules and actions */

program : LET declarations IN commands END ;

declarations : /* empty */
    | INTEGER id_seq IDENTIFIER '.' { install( $3 ); }
;

id_seq : /* empty */
    | id_seq IDENTIFIER ','	    { install( $2 ); }
;
commands : /* empty */
    | commands command ';'
;
command : SKIP
    | READ IDENTIFIER		    { context_check( $2 ); }
    | WRITE exp
    | IDENTIFIER ASSGNOP exp	    { context_check( $2 ); }
    | IF exp THEN commands ELSE commands FI
    | WHILE exp DO commands END
;
exp : NUMBER
			    /* book said $2 for this, wrong*/
   | IDENTIFIER			    { context_check( $1 ); }
   | exp '<' exp
   | exp '=' exp
   | exp '>' exp
   | exp '+' exp
   | exp '-' exp
   | exp '*' exp
   | exp '/' exp
   | exp '^ ' exp
   | '(' exp ')'
;

%%

 /* C subroutines */

 /* no output, implied parse tree */
int main( int argc, char *argv[] )
{
    extern FILE *yyin;
    ++argv; --argc;
    yyin = fopen( argv[0], "r" );
    yydebug = 1;
    errors = 0;
    yyparse ();
    return 0;
}
int yyerror (char *s) /* called by yyparse() on errors */
{
    printf ("%s\n", s);
    return 1;
}
The file as shown above works. After making the changes described, the error reported is:

Code:
$ bison -vd ch4.y 
ch4.y:82.54-55: $2 from `command' has no declared type
$
Chapter 4 starts in PDF page 19.

I need this .y file working before I proceed to the next section, where the corresponding scanner (a flex file) is changed. Could you help me understand what is wrong and fix it?

Last edited by dedec0; 08-21-2016 at 07:01 AM.
 
08-15-2016, 08:15 AM   #2
dedec0
I forgot to post it. The "st.h" file is:

Code:
typedef struct symrec
{
    char *name;		    /* symbol name*/
    struct symrec *next;    /* link field */
} symrec;

symrec *sym_table = (symrec *)0;
symrec* putsym(char *);
symrec* getsym(char *);

symrec* putsym( char *sym_name)
{
    symrec *ptr;
    ptr = (symrec *) malloc( sizeof(symrec) );
    ptr->name = (char *) malloc( strlen(sym_name) + 1 );
    strcpy( ptr->name, sym_name);
    ptr->next = (symrec*) sym_table;
    sym_table = ptr;
    return ptr;
}

symrec* getsym( char* sym_name)
{
    symrec *ptr;
    for (
	ptr = sym_table;
	ptr != (symrec *) 0;
	ptr = (symrec *)ptr->next
	)
	if( strcmp( ptr->name, sym_name) == 0 )
	    return ptr;
    return 0;
}

Last edited by dedec0; 08-15-2016 at 08:17 AM.
 
08-15-2016, 10:23 AM   #3
grail
LQ Guru

Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
Are you sure you have copied over all the changes as required? I noticed a reference to "exp : INT" which I do not see in your file. There could be others. I would suggest going back over all entries.
 
08-15-2016, 11:19 AM   #4
dedec0
Is the "exp: INT" you mention the one at the end of PDF page 21? I have changed it to INTEGER.

I had a few doubts and ran into a few errors in chapters 1-3. I had to change what is written in the book to get it to compile. I have a file for each chapter, with the code given there. Up to chapter 3 they work.

Does "INT" make sense as a token? Or is it a typo for INTEGER, as I assumed? And there is also the NUMBER token.

Note that, in the file above, I have added declarations for all of them (otherwise bison reports an undeclared token).

In the file below I have changed all INT and NUMBER tokens to INTEGER and removed the declarations for both; the error is the same. Without the union declaration, and with the plain IDENTIFIER declaration instead of the one with <id>, bison compiles the file silently (which I assume means it is good). With the union declaration and <id> on IDENTIFIER, as added in chapter 4, the error appears.

In the code below it is easy to make the changes I mentioned: just cut and paste a few lines out of, or into, the multiline comments above each part:

Code:
%start program

 /* SEMANTIC RECORD */
 /* char *id: For returning identifiers */
 /*
Place to easily pasting/cutting the union declaration
%union {
char *id;
}

 */

/* Simple identifier */
 /*
Place to exchange the IDENTIFIER token declarations
%token <id> IDENTIFIER
 */
%token IDENTIFIER

%token LET IN
/* are there both, INT and INTEGER?
%token INT
%token NUMBER
*/
%token INTEGER

 /* FI was missing */
%token SKIP IF THEN ELSE FI END WHILE DO READ WRITE
     /* ASSGNOP was missing */
%token ASSGNOP 
%left '-' '+'
%left '*' '/'
%left '<' '>' '=' '' /* not in the book, must be added */
%right '^ '

%{

#include <stdlib.h> /* For malloc in symbol table */
#include <string.h> /* For strcmp in symbol table */
#include <stdio.h> /* For error messages */
#include "st.h" /* The Symbol Table Module */
#define YYDEBUG 1 /* for debugging */

int install( char* sym_name)
{
    symrec *s;
    s = getsym(sym_name);
    if (s == 0)
	s = putsym (sym_name);
    else
    {
	errors++;
	printf("%s is already defined\n", sym_name);
	return 0;
    }
    return 1;
}

int context_check(char* sym_name)
{
    if ( getsym( sym_name ) == 0 )
    {
	printf("%s is an undeclared identifier\n", sym_name);
	return 0;
    }
    return 1;
}


%}

%%

 /* Grammar rules and actions */

program : LET declarations IN commands END ;

declarations : /* empty */
    | INTEGER id_seq IDENTIFIER '.' { install( $3 ); }
;

id_seq : /* empty */
    | id_seq IDENTIFIER ','	    { install( $2 ); }
;
commands : /* empty */
    | commands command ';'
;
command : SKIP
    | READ IDENTIFIER		    { context_check( $2 ); }
    | WRITE exp
    | IDENTIFIER ASSGNOP exp	    { context_check( $2 ); }
    | IF exp THEN commands ELSE commands FI
    | WHILE exp DO commands END
;
exp : INTEGER
				    /* the book has $2 here, which is wrong */
   | IDENTIFIER			    { context_check( $1 ); }
   | exp '<' exp
   | exp '=' exp
   | exp '>' exp
   | exp '+' exp
   | exp '-' exp
   | exp '*' exp
   | exp '/' exp
   | exp '^ ' exp
   | '(' exp ')'
;

%%

 /* C subroutines */

/* no output; the parse tree is left implicit */
int main( int argc, char *argv[] )
{
    extern FILE *yyin;
    ++argv; --argc;
    yyin = fopen( argv[0], "r" );
    yydebug = 1;
    errors = 0;
    yyparse ();
    return 0;
}
int yyerror (char *s) /* called by yyparse() on errors */
{
    printf ("%s\n", s);
    return 0;
}
Code:
$ bison -vd ch4.y 
ch4.y:91.54-55: $2 from `command' has no declared type

Last edited by dedec0; 08-15-2016 at 11:28 AM.
 
08-15-2016, 11:59 AM   #5
grail
INT appears in your tokens. It has been a while since I played with this stuff, but I am sure one of the others will be able to help you further.
 
08-15-2016, 12:22 PM   #6
dedec0
Please note: in my last post INT appears only inside a multiline comment (which is there to make it easy to switch between the working file and the failing one). There are no other occurrences of it.

I hope I have given enough detail about my problem for anyone to reproduce it easily and quickly. Thank you for your goodwill, grail. Do you have a book to recommend? This is the second one I have tried that has problems of this size.
 
08-15-2016, 12:26 PM   #7
smallpond
Senior Member

Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 4,140

Rep: Reputation: 1263
Not seeing any definition for ASSGNOP.

Also I don't think this is right:

Code:
%token <id> IDENTIFIER
It looks like it should be:

Code:
%type <id> IDENTIFIER
 
08-15-2016, 01:13 PM   #8
dedec0
Quote:
Originally Posted by smallpond View Post
Not seeing any definition for ASSGNOP.
It is missing in the book, as I noted. But I added it before starting this thread. It is on line 30 (and not inside a comment): %token ASSGNOP .

Quote:
Originally Posted by smallpond View Post
Also I don't think this is right:

Code:
%token <id> IDENTIFIER
It looks like it should be:

Code:
%type <id> IDENTIFIER
Should both type and token exist? It is not clear for me (I tried to read something in https://www.gnu.org/software/bison/m...emantic-Tokens too). My results:

1. Without the union, and with IDENTIFIER as a simple token: it compiles (a previous result).

2. With the union declared, "%token IDENTIFIER" removed, and the line "%type <id> IDENTIFIER" added:

Code:
$bison -vd ch4.y 
ch4.y:18.12-21: symbol IDENTIFIER used, but not defined as a token and has no rules
ch4.y:91.54-55: $2 from `command' has no declared type
3. Continuing from attempt 2, simply add one more line with "%token IDENTIFIER". The result is the same error as in my last long post:

Code:
$bison -vd ch4.y 
ch4.y:91.54-55: $2 from `command' has no declared type
Did I try everything?

For now I am keeping the "%type <id> IDENTIFIER" line, but the error continues.

Last edited by dedec0; 08-15-2016 at 01:20 PM.
 
08-15-2016, 01:24 PM   #9
smallpond
Code:
ch4.y:91.54-55: $2 from `command' has no declared type
The error says that the 2nd symbol of the rule on line 91, which belongs to "command", has no declared type.
 
08-15-2016, 01:56 PM   #10
dedec0
Line 91 is
Code:
command : SKIP
    | READ IDENTIFIER		    { context_check( $2 ); }
    | WRITE exp
    | IDENTIFIER ASSGNOP exp	    { context_check( $2 ); } /* line 91 */
    | IF exp THEN commands ELSE commands FI
    | WHILE exp DO commands END
;
Should it be $3 in line 91? The error repeats. I do not know what else to do with this information. Please give me a clearer hint.

Last edited by dedec0; 08-15-2016 at 02:02 PM.
 
08-15-2016, 02:02 PM   #11
dedec0
NO! It should be $1, right?? So context_check will check if the variable is declared or not. Right??
 
08-15-2016, 06:53 PM   #12
smallpond
How much clearer can I get? I already told you in comment 7 there is no definition for ASSGNOP and you didn't believe me.
 
08-15-2016, 08:31 PM   #13
astrogeek
Moderator

Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,263
Blog Entries: 24

Rep: Reputation: 4194
I will admit first, that I have not fully followed your example code nor read the PDF (that URL is blocked by my local firewall rules).

But I do see that you are confused about the use of %token and %type, so perhaps I can offer some helpful comment on those.

Quote:
Originally Posted by dedec0 View Post
Should both type and token exist? It is not clear for me (I tried to read something in https://www.gnu.org/software/bison/m...emantic-Tokens too).
%token and %type are two different things.

%token is used to declare terminals, and may include or require a <type> assignment when a %union has been declared, but not otherwise. So for terminal symbols (which IDENTIFIER seems to be), either of these would be correct depending on usage...

Code:
%token IDENTIFIER
  or
%token <id> IDENTIFIER
... but NOT...

Code:
%type <id> IDENTIFIER
%type is used to declare the types of non-terminals (commands, command, exp, etc... from your example code).

So something like this might be appropriate, again depending on usage...

Code:
%type <id> commands command exp ...
So in simplistic terms...

When %union is declared you will need to declare terminals (tokens) with a type as...

Code:
%token <id> IDENTIFIER
... and non-terminals with a type as...

Code:
%type <id> command commands exp ...
And your code must be consistent with regard to those types as values traverse the parse tree. That is, when a typed right-hand value is assigned to a left-hand (non-terminal) symbol, they must be of the same type or the compiler will complain (this is what the type declarations are used for).

Quote:
Originally Posted by dedec0 View Post
For now I am keeping the "%type <id> IDENTIFIER" line, but the error continues.
That is not correct IF IDENTIFIER is a terminal symbol, as appears to be the case.

It may be that the '$2 from command' error you are seeing is complaining about exp, not IDENTIFIER (but, also not clear to me).

Last edited by astrogeek; 08-15-2016 at 08:45 PM.
 
08-15-2016, 08:37 PM   #14
dedec0
Quote:
Originally Posted by smallpond View Post
How much clearer can I get? I already told you in comment 7 there is no definition for ASSGNOP and you didn't believe me.
It is not what I did. Don't mix understanding with believing. ASSGNOP was defined, there is a token declaration for it, as I said above. You expected me to understand that:

"since there is no definition for ASSGNOP, I should look at line 91 and see that I need to change the argument because it is incorrect, it is pointing to ASSGNOP instead of IDENTIFIER"

?

No way. Your words there are not clear at all, at least not to me, and I bet not to others either. The title of the thread contains the word "learning" because I know very little about Bison, if I can say I know anything about it at all.

Anyway, thank you. Our conversation helped me to solve another "easy" problem in this book.
 
08-15-2016, 08:57 PM   #15
dedec0
astrogeek, thank you for your explanations. I had not seen %type before, so I had no clue what it was. I just used it as one more thing to try, almost blindly (as my trial-and-error report shows).

As I understand it now, IDENTIFIER is the token that corresponds to the variable name, not its value or symbol. In C we could have an integer assignment:

boxOfFruits = 26;

For this line there would be 4 tokens: IDENTIFIER (with the string "boxOfFruits"); EQUAL_SIGN; INTEGER (with value 26); END_OF_LINE. So, is IDENTIFIER a terminal symbol or not? (I think it is not.) The language of the book is not C, but it has something similar to that expression.
 