LinuxQuestions.org
Welcome to the most active Linux Forum on the web.
01-20-2011, 03:11 PM   #1
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 723
A few questions about making an interpreter


The first is about implementing function calls. The way I currently have it, functions are called with a C++ std::vector of nodes as the parameters. How would I turn a comma-separated list of expressions into a C++ vector in the grammar?

Second, how do you implement left-associative operators in a parser that does not allow left recursion?

And third, what would be the best internal representation of integers? A C++ int seems simplest, but limited. Using GMP seems more versatile, but I'm afraid it might seriously slow down the interpreter compared to C++ ints.
 
01-20-2011, 04:43 PM   #2
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by MTK358
How would I turn a comma-separated list of expressions into a C++ vector in the grammar?

Second, how do you implement left-associative operators in a parser that does not allow left recursion?
If you are asking YACC questions, your thread title should say you are asking YACC questions, not questions about "making an interpreter".

I took the stupid formal grammar portion of a computer science course far too long ago to remember much beyond how unrealistic it all was. So maybe there are answers to your questions within the realm of grammar that don't assume the specific tool (YACC or some specific other). But even if so, I suspect it would be too abstract an answer to be useful.

When you just code a parser from scratch (no YACC, nor ANTLR nor whatever), none of those issues are ever problems. Having used YACC and ANTLR (and other tools) and coded from scratch, each multiple times, my own experience is that coding from scratch feels tedious and wasteful as you are doing it (lots of doing almost the same thing over and over) but none of it is ever hard and debugging the result (if that ever is necessary) is trivial.

By contrast, using a tool starts out quick and easy (after some delay for reading about the tool), but then one detail after another that ought to be trivial turns out to be hard. Then you need a lot of debugging, and the debugging is a nightmare.
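To illustrate why left associativity is not a problem in a hand-coded parser, here is a minimal sketch. It is not anyone's actual interpreter code: the `Parser` struct, `parseSum`, and `evalSum` are hypothetical names, and the "language" is just non-negative integers joined by `+` and `-`. The left-recursive rule `sum := sum op primary` is rewritten as `sum := primary (op primary)*`, and the loop folds each new operand into the running left-hand result, which gives left-to-right grouping.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical minimal parser for illustration only: it evaluates
// non-negative integers combined with '+' and '-' and does no error handling.
struct Parser {
    std::string src;
    std::size_t pos;

    explicit Parser(std::string text) : src(std::move(text)), pos(0) {}

    void skipSpaces() {
        while (pos < src.size() &&
               std::isspace(static_cast<unsigned char>(src[pos]))) {
            ++pos;
        }
    }

    // primary := INTEGER   (yields 0 if no digits are present)
    long parsePrimary() {
        skipSpaces();
        long value = 0;
        while (pos < src.size() &&
               std::isdigit(static_cast<unsigned char>(src[pos]))) {
            value = value * 10 + (src[pos] - '0');
            ++pos;
        }
        return value;
    }

    // sum := primary (('+' | '-') primary)*
    // The loop replaces left recursion: every operand is combined with
    // everything parsed so far, so "10 - 3 - 2" groups as (10 - 3) - 2.
    long parseSum() {
        long left = parsePrimary();
        for (;;) {
            skipSpaces();
            if (pos >= src.size()) break;
            const char op = src[pos];
            if (op != '+' && op != '-') break;
            ++pos;
            const long right = parsePrimary();
            left = (op == '+') ? left + right : left - right;
        }
        return left;
    }
};

// Convenience wrapper: parse and evaluate a whole string.
long evalSum(const std::string& text) {
    Parser p(text);
    return p.parseSum();
}
```

In a real interpreter the loop body would build a tree node from `left` and `right` instead of computing a value, but the grouping works the same way.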

The best feature of YACC is that it isn't too hard to use for a good language, but becomes a nightmare for a bad language. That is a very positive feature in academic use, where you typically invent the language and the parser together. If a language feature is hard to parse, it must be a stupid language feature, so you change the language spec rather than struggle with the parser.

Outside of academia, YACC is garbage. In the real world the language is typically spec'ed by an idiot and that spec is forced onto the programmer (too often me) responsible for the parser with no path to push back against bad language design.

I assume you are inventing your own language. So I still don't like YACC, but I won't say you are absolutely wrong to choose it.

Just, if you want to ask a YACC question, explicitly ask a YACC question (and wait for someone other than me to answer it).

Quote:
And third, what would be the best internal representation of integers? A C++ int seems simplest, but limited. Using GMP seems more versatile, but I'm afraid it might seriously slow down the interpreter compared to C++ ints.
Don't ask that!

How do we know the design goals and tradeoffs of your project? Take some responsibility for your design.

That said, there is a serious amount of overhead in all but the most perfect interpreter designs. The higher the overhead is, the less noticeable the real work will be. In a compiled language, unnecessarily using GMP when int or long long would have been enough would be outrageously inefficient. Percentage-wise, the same choice in an interpreter isn't as important.
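For what it is worth, the two options do not have to be exclusive. The following is only a sketch of a middle ground that nobody in this thread has proposed: keep values in a machine integer and detect overflow, so that an interpreter could switch to an arbitrary-precision type only when needed. `checkedAdd` is a hypothetical helper name, and the promotion step itself (for example to a GMP integer) is deliberately left out.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: adds two 64-bit integers without invoking
// signed-overflow undefined behavior.
// Returns true and stores a + b in `result` when the sum fits in int64_t.
// Returns false and leaves `result` untouched when it would overflow;
// that is the point where an interpreter could fall back to a big integer.
bool checkedAdd(std::int64_t a, std::int64_t b, std::int64_t& result) {
    const std::int64_t maxValue = std::numeric_limits<std::int64_t>::max();
    const std::int64_t minValue = std::numeric_limits<std::int64_t>::min();
    if ((b > 0 && a > maxValue - b) || (b < 0 && a < minValue - b)) {
        return false;  // the exact sum is outside the int64_t range
    }
    result = a + b;
    return true;
}
```

Whether the extra branch is worth it depends on the design goals; it is one way to keep the common case cheap without giving up large values.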

Last edited by johnsfine; 01-20-2011 at 05:05 PM.
 
3 members found this post helpful.
01-20-2011, 07:14 PM   #3
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
I was using leg, which generates a recursive descent parser.

And the basic structure of a function call would be like this:

Code:
Expr OPAREN (Expr (COMMA Expr)*)? CPAREN
The problem is: how do I convert the parenthesized list of Exprs into a C++ vector? Note that the first item is an expression, not an identifier, because this language has first-class functions.
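For comparison, this is roughly how the same rule looks in a hand-written recursive-descent parser, as a sketch only. It does not use leg, and every name in it (`ArgNode`, `parseExpr`, `parseArgList`) is a hypothetical stand-in: `ArgNode` replaces the real node class, and `parseExpr` accepts only a bare identifier or integer. The optional part `(Expr (COMMA Expr)*)?` becomes an early return for `()` followed by a loop that calls `push_back` once per argument.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a real AST node: stores one argument's text.
struct ArgNode {
    std::string text;
};

static void skipSpaces(const std::string& s, std::size_t& pos) {
    while (pos < s.size() &&
           std::isspace(static_cast<unsigned char>(s[pos]))) {
        ++pos;
    }
}

// Stand-in for Expr: a run of letters, digits, and underscores.
static ArgNode parseExpr(const std::string& s, std::size_t& pos) {
    skipSpaces(s, pos);
    const std::size_t start = pos;
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_')) {
        ++pos;
    }
    if (pos == start) throw std::runtime_error("expected expression");
    return ArgNode{s.substr(start, pos - start)};
}

// ArgList := OPAREN (Expr (COMMA Expr)*)? CPAREN
// Appends each parsed argument to a vector in source order and leaves
// `pos` just past the closing parenthesis.
std::vector<ArgNode> parseArgList(const std::string& s, std::size_t& pos) {
    std::vector<ArgNode> args;
    skipSpaces(s, pos);
    if (pos >= s.size() || s[pos] != '(') throw std::runtime_error("expected '('");
    ++pos;
    skipSpaces(s, pos);
    if (pos < s.size() && s[pos] == ')') {  // empty argument list
        ++pos;
        return args;
    }
    args.push_back(parseExpr(s, pos));      // first Expr
    for (;;) {                              // (COMMA Expr)*
        skipSpaces(s, pos);
        if (pos >= s.size() || s[pos] != ',') break;
        ++pos;
        args.push_back(parseExpr(s, pos));
    }
    if (pos >= s.size() || s[pos] != ')') throw std::runtime_error("expected ')'");
    ++pos;
    return args;
}
```

Because the loop appends in source order, nothing has to be inserted at the front of the vector, which a right-recursive list rule would require.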

Last edited by MTK358; 01-20-2011 at 07:26 PM.
 
01-21-2011, 03:45 PM   #4
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
I have a partially completed leg grammar now:

Code:
%{
#include <string>
#define YYSTYPE union {Node *node; std::string &str; std::vector<Node*> &vector}
%}

Program = - b:Block !. { b.node->eval(); } ;

Block = l:StmtList { $$ = new ScopeNode(l.node); } ;

Stmt = e:Expr EOL  { $$ = e.node; } ;

StmtList = s:Stmt l:StmtList { $$ = new StmtListNode(s.node, l.node); }
         | s:Stmt            { $$ = s; }
         ;

Expr = n:NAME                                    { $$ = new VariableNode(n.str); }
     | e:Expr DOT n:NAME                         { $$ = new MemberAccessNode(e.node, n.str); }
     | n:NAME ASSIGN e:Expr                      { $$ = new AssignmentNode(n.str, e.node); }
     | e:Expr OPAREN l:CommaSeparatedList CPAREN { $$ = new FuncCallNode(e.node, l.vector); }
     | x:INTEGER                                 { $$ = x; }
     | OPAREN x:Expr CPAREN                      { $$ = x; }
     ;

CommaSeparatedList = expr:Expr COMMA list:CommaSeparatedList { list.vector.insert(0, expr.node); $$ = list; }
                   | expr:Expr { $$ = std::vector<Node*>(expr.node); }
                   ;

INTEGER = <([0-9]+ | 0b[01]+ | 0o[0-7]+ | 0x[0-9a-fA-F]+)([Ee][0-9]+)?> - { $$ = new IntegerNode(yytext); } ;

NAME    = <[A-Za-z_][A-Za-z_0-9]*> - { $$ = std::string(yytext); } ;

IF        = 'if'    - ;
ELSEIF    = 'ei'    - ;
ELSE      = 'else'  - ;
ENDIF     = 'endif' - ;
WHILE     = 'while' - ;
ENDWHILE  = 'loop'  - ;
DO        = 'do'    - ;
FUNC      = 'func'  - ;
ASSIGN    = '='     - ;
OPAREN    = '('     - ;
CPAREN    = ')'     - ;
DOT       = '.'     - ;
COMMA     = ','     - ;

EOL = '\n' | ';'    - ;

- = [ \t]* ;
But it shows an error:

Code:
$ leg parser.leg -o parser.cpp
parser.leg:28: syntax error before text "%{"
I have no idea what it could be.

Last edited by MTK358; 01-21-2011 at 03:46 PM.
 
  

