ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
Get a virtual cloud desktop with the Linux distro that you want in less than five minutes with Shells! With over 10 pre-installed distros to choose from, the worry-free installation life is here! Whether you are a digital nomad or just looking for flexibility, Shells can put your Linux machine on the device that you want to use.
Exclusive for LQ members, get up to 45% off per month. Click here for more info.
I am attempting to write an interpreter (not a compiler) in Java. I am going to invent my own very basic language and the Java program will interpret a file written in that basic language. My "language" is going to be called SPL (Small Programming Language) and it is basically going to be useless (simple mathematical operations, simple variable assignment/deceleration). What I am trying to get out of this exercise is a good understanding of how one might parse code using Java in such a way that it can be executed. Here is what I am thinking right now...
My dad gave me the idea of using a recursive algorithm that searches through an expression for the operator of least precedence and then runs the algorithm again on either side of the operator until it gets to the base case. The base case would be an expression with no operator (either a number or a variable). In the case of a variable, a simple method would return its value.
EX: x * 5 + y *2
algorithm (x * 5 + y * 2) ---> algorithm (x *5) + algorithm (y * 2) ---> algorithm (x) * algorithm (5) + algorithm (y) * algorithm (2)
You can see at the end that the algorithm has it its base case and the expression will be calculated.
My idea is to parse expressions into a tree.
EX: x * 5 + y * 2
Code:
+
/ \
* *
/ \ / \
x 5 y 2
This is similar to my dads idea, and probably less elegant, but I prefer it because it is my own :-D.
So I thought I would just bounce these ideas off the community and see if anyone had ideas of their own. Please, keep discussion on a conceptual level! This is not some homework assignment, just an interesting problem :-)
EDIT: The tree notation is messed up, I hope it is still clear :-(
EDIT EDIT: Thanks GrapefruiTgirl for the tip on how to fix the notation!
Last edited by timmit; 10-02-2007 at 11:56 PM.
Reason: Tree notation
Well, compiling and parsing algorithms have been well-discussed, and I'm sure that there are Java implementations. Look at tools (not in Java) such as yacc, bison, and lex.
The general approach that is most often used is one that is called shift/reduce. The parser considers one symbol at a time. It can either shift ("push") the symbol onto a stack of symbols that it maintains, or it can reduce ... which means grabbing a set of symbols from the stack and replacing it with just one. Along the way, it calls routines in the back-end of your compiler.
A grammar tells the parser what the language looks like, and it serves as a sort of "road map" that guides the parser in its decision of what to do next: shift, reduce, or go into error-recovery.
The advantage of this approach, vs. recursive-descent, is that the complexity is pushed to the grammar, which is simply a data-structure. Your code remains simple. Whether the language is simple or the language is complex, the compiler itself remains simple.
---
Incidentally, both compilers and interpreters usually work this way. Even an interpreter will "semi-compile" the source-code into something that can be efficiently executed. The representation might be a "p-code" of some sort, or it might be a tree.
Enjoy!
Last edited by sundialsvcs; 10-03-2007 at 08:17 AM.
Thanks for your reply! Let me make sure I understand the shift/reduce idea that you bring up. So the interpreter has a stack of operators that it has previously encountered (let say the stack is simply two + operators). If the interpretor encounters an operator of higher precedence (let's say a multiplication symbol), it will shift the stack and insert then multiplication symbol? And then if it encounters an operator of equal precedence (let's say another addition operator) it will reduce all the + operators into one step on the stack?
Also a question about grammar. Using the methods I suggest, both the compiler and the grammar could possibly be complex. But using the shift/reduce method, only grammar might be complex?
Thanks for your reply! Let me make sure I understand the shift/reduce idea that you bring up. So the interpreter has a stack of operators that it has previously encountered (let say the stack is simply two + operators). If the interpretor encounters an operator of higher precedence (let's say a multiplication symbol), it will shift the stack and insert then multiplication symbol? And then if it encounters an operator of equal precedence (let's say another addition operator) it will reduce all the + operators into one step on the stack?
Also a question about grammar. Using the methods I suggest, both the compiler and the grammar could possibly be complex. But using the shift/reduce method, only grammar might be complex?
This is a very large subject. If you have no idea what a LALR parser or abstract syntax tree is, I suggest you buy a book on the subject. Arguably, one of the best on the subject: http://www.amazon.com/gp/product/0321486811/. It is a very heavy book (in terms of comprehension, not weight), so be prepared to spend a bit of time w/ it. As for the other post, talking about lex/yacc, there are equivalents in Java. I've used JFlex and JavaCup before, so I'll suggest them. Good luck!
This is a very large subject. If you have no idea what a LALR parser or abstract syntax tree is, I suggest you buy a book on the subject. Arguably, one of the best on the subject: http://www.amazon.com/gp/product/0321486811/. It is a very heavy book (in terms of comprehension, not weight), so be prepared to spend a bit of time w/ it. As for the other post, talking about lex/yacc, there are equivalents in Java. I've used JFlex and JavaCup before, so I'll suggest them. Good luck!
Thanks for the link! That looks like quite a book, unfortunately it has a price to go along with it :-(. Most of the things in that book are probably outside the scope of what I am trying to do right now though, I just want to interpret a very simple programming language I will design. Seems like really interesting reading though. I will have to see if my University Library has a copy.
I am not really sure what a LALR parser is, but I have an interesting look at abstract syntax trees using this.
Thanks for the link! That looks like quite a book, unfortunately it has a price to go along with it :-(. Most of the things in that book are probably outside the scope of what I am trying to do right now though, I just want to interpret a very simple programming language I will design. Seems like really interesting reading though. I will have to see if my University Library has a copy.
I am not really sure what a LALR parser is, but I have an interesting look at abstract syntax trees using this.
Well, you're interested, and that is, by far, the most important part! I take it you are a bit younger then? I know this sounds nerdy, but I used to ask for computer books for birthday/xmas presents when I was a kid, lol. Take a look at some tutorials on lex & yacc. It should help you out (at least to understand the fundamentals and create something useful). Read somet tutorials, if you have any problems, write back on here, I'm sure some more people will help you out as well!
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.