Visit Jeremy's Blog.
 Home Forums HCL Reviews Tutorials Articles Register Search Today's Posts Mark Forums Read
 LinuxQuestions.org Could someone help explain this code?
 Programming This forum is for all programming questions. The question does not have to be directly related to Linux and any language is fair game.

Notices

 03-31-2010, 03:35 PM #1 MTK358 LQ 5k Club   Registered: Sep 2009 Posts: 6,443 Blog Entries: 3 Rep: Could someone help explain this code? http://llvm.org/docs/tutorial/LangImpl2.html I tried implementing that before and went pretty far, but I got stuck at the op precedence part. Anyway, is that a recursive descent parser? And can someone help explain how the operator precedence works?
03-31-2010, 03:58 PM   #2
johnsfine
LQ Guru

Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep:
Quote:
 Originally Posted by MTK358 Anyway, is that a recursive descent parser?
If you compared it against any rigorous definition of "recursive descent parser", it doesn't match.

But the basic concepts are the same. So it is something similar to a "recursive descent parser".

Quote:
 can someone help explain how the operator precedence works?
How operator precedence "works" in the language made up for that tutorial?
Or how the ParseBinOpRHS function in that tutorial implements the rules of operator precedence?

That function is called whenever the parser has an expression that might be the left hand operand of some larger expression.

Assuming there is a binary operator as the next token, making it possible for the current expression to be its left hand operand:

The function first must decide whether that expression is instead the right hand operand of some larger expression being processed by the (recursive) caller. It does so by comparing the precedence passed in by that caller to the next token's precedence.

If the expression is the left operand of the next token, the function must then find the right operand of that token. To do so, it advances past the token and parses a primary expression. That primary expression might be the desired right operand or it might left operand of a subexpression and that whole subexpression would be the desired right operand. So the function uses itself recursively to construct its right operand.

Last edited by johnsfine; 03-31-2010 at 04:07 PM.

03-31-2010, 04:01 PM   #3
MTK358
LQ 5k Club

Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Original Poster
Rep:
Quote:
 Originally Posted by johnsfine How operator precedence "works" in the language made up for that tutorial? Or how the ParseBinOpRHS function in that tutorial implements the rules of operator precedence?
I am talking about how the ParseBinOpRHS function figures out the precedence.

03-31-2010, 04:22 PM   #4
johnsfine
LQ Guru

Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep:
Quote:
 Originally Posted by MTK358 I am talking about how the ParseBinOpRHS function figures out the precedence.
I was editing that part of my reply when you posted, so reread above if you haven't yet.

 03-31-2010, 08:04 PM #5 MTK358 LQ 5k Club   Registered: Sep 2009 Posts: 6,443 Blog Entries: 3 Original Poster Rep: I'm still having difficulty understanding it, but it slightly makes sense. I was wondering if it could be easier just to do it this way? And no matter what method, how do you handle left/right accociativity? Code: ```expr ::= expr2 "+" expr2 | expr2 "-" expr2 | expr2 expr2 ::= expr3 "*" expr3 | expr3 "/" expr3 | expr3 expr3 ::= NUMBER | VARIABLE | "-" expr3 | ( expr )```
04-01-2010, 05:44 AM   #6
Sergei Steshenko
Senior Member

Registered: May 2005
Posts: 4,481

Rep:
Quote:
 Originally Posted by MTK358 http://llvm.org/docs/tutorial/LangImpl2.html I tried implementing that before and went pretty far, but I got stuck at the op precedence part. Anyway, is that a recursive descent parser? And can someone help explain how the operator precedence works?
Oh, now C++ .

...

Here are some hopefully useful links regarding expression parsing:

http://www.smccd.net/accounts/hasson...icParsing.html
http://en.wikipedia.org/wiki/Operator-precedence_parser
.

I'm impatiently waiting for your reaction regarding the fact that in both articles the algorithm is explained in English, and code samples are given too. And no, I wasn't picking articles by hand.

...

http://en.wikipedia.org/wiki/Scannerless_parsing

...

At all, http://en.wikipedia.org/wiki/Categor...ing_algorithms .

1 members found this post helpful.
04-01-2010, 07:09 AM   #7
MTK358
LQ 5k Club

Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Original Poster
Rep:
Quote:
 Originally Posted by Sergei Steshenko Oh, now C++ .

If you insist that Perl will really be better, then maybe... , because I really am intrigued by the idea of storing AST nodes as hashes.

Quote:
 Originally Posted by Sergei Steshenko Here are some hopefully useful links regarding expression parsing: http://www.smccd.net/accounts/hasson...icParsing.html http://en.wikipedia.org/wiki/Operator-precedence_parser
Still I don't think I understand it really well.

Last edited by MTK358; 04-01-2010 at 07:12 AM.

04-01-2010, 07:18 AM   #8
Sergei Steshenko
Senior Member

Registered: May 2005
Posts: 4,481

Rep:
Quote:
 Originally Posted by MTK358 Because I had already partially wrote it before. If you insist that Perl will really be better, then maybe... Still I don't think I understand it really well.

I am not saying Perl is really better, but in Perl everything is "elastic" by default, and there is no type strictness which in this case is good - again, you do not have to plan ahead, say, fields in a data structure describing a tree node, you just add hash keys.

Regarding "I don't think I understand it really well" - here is a very simple question for you: how (step by step) as a human do you calculate the following expression:

Code:
`1 + 2 + 3 + 4 * 5 ^ 2`
?

'^' is power operation (i.e. 2 ^ 3 is 8) which has higher precedence than other arithmetic operations. I emphasize the point - step by step.

04-01-2010, 07:25 AM   #9
MTK358
LQ 5k Club

Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Original Poster
Rep:
Quote:
 Originally Posted by Sergei Steshenko I am not saying Perl is really better, but in Perl everything is "elastic" by default, and there is no type strictness which in this case is good - again, you do not have to plan ahead, say, fields in a data structure describing a tree node, you just add hash keys.
I guess it might be worth a try.

Quote:
 Originally Posted by Sergei Steshenko Regarding "I don't think I understand it really well" - here is a very simple question for you: how (step by step) as a human do you calculate the following expression: Code: `1 + 2 + 3 + 4 * 5 ^ 2` ? '^' is power operation (i.e. 2 ^ 3 is 8) which has higher precedence than other arithmetic operations. I emphasize the point - step by step.
I still don't really understand the kind of answer you expect from this kind of question, but here goes...

I look at all the operators and find the '^', that has the highest precedence. I solve it and replace "5 ^ 2" with "25".

Then I see the '*', which has higher precedence than the rest. I replace "4 * 25" with 100.

Then, because the rest is addition and it doesn't matter which way you add them, I just add all the remaining numbers up.

04-01-2010, 07:41 AM   #10
Sergei Steshenko
Senior Member

Registered: May 2005
Posts: 4,481

Rep:
Quote:
 Originally Posted by MTK358 I guess it might be worth a try. I still don't really understand the kind of answer you expect from this kind of question, but here goes... I look at all the operators and find the '^', that has the highest precedence. I solve it and replace "5 ^ 2" with "25". Then I see the '*', which has higher precedence than the rest. I replace "4 * 25" with 100. Then, because the rest is addition and it doesn't matter which way you add them, I just add all the remaining numbers up.
The item in bold is the evidence I've been expecting "forever". And this evidence really proves my suspicions about things you do not understand. And I'll try to help you to understand them right.

So now let's change the task by "scaling". Again, let's take the same expression:

Code:
`1 + 2 + 3 + 4 * 5 ^ 2`
and let's imagine that its "head" of "pluses", i.e.

Code:
`1 + 2 + 3 + 4`
is very very long, like thousand(s) of members. Other than that the expression is the same.

Let's also assume that each operand is written, say, as 30cm/1foot high/wide digit (approximately), and it's written on a wall. Similarly, operation signs ('+', '*', '^' in this case) are also written kinda in 30cm x 30cm cells, and your eyes distance from the wall is also about 30cm/1foot, i.e. at a time you can see just one operand or one operation sign. You also have a notebook (I mean paper one), a pencil and an eraser, so you can take notes and read them as needed.

You are put at the left side of the expression and because of the above conditions, as I said, you can only see what is now the leftmost '1'.

How would you calculate the expression under such circumstances ? You are not allowed (and do no need) to run to the other end in order to see the whole expression. Remember, the expression occupies at least 1000 feet (300m) on the wall, so, again, there is no chance you see its tail.

I reiterate, you can see either one operand or one operation sign at a time, and you can take notes.

And let's assume you are constructively lazy, i.e. you want to perform minimum amount of moves along the wall.

Again, step by step actions you'll perform as a human.

 04-01-2010, 09:37 AM #11 MTK358 LQ 5k Club   Registered: Sep 2009 Posts: 6,443 Blog Entries: 3 Original Poster Rep: I am not sure, I guess I might do it like this: I would go past the first number, the plus, and the next number. Then I check the next operator and it's another plus. Since it's left-associative, I see if it's precedence is > than the previous plus. It's not, so I calculate the first +-expression and move on... Then I get to the multiplication. It's left-associative, so I check if it's precedence > than of the plus. It is, so I skip to the multiplication. But I first see what the next operator is. It's a ^, and it's right-accociative, so I see if it's precedence ≥ the * operator. It is, so I see the next one. But there is no next one, so I do the ^, then fall back to the *, and to the +.
04-01-2010, 09:55 AM   #12
Sergei Steshenko
Senior Member

Registered: May 2005
Posts: 4,481

Rep:
Quote:
 Originally Posted by MTK358 I am not sure, I guess I might do it like this: I would go past the first number, the plus, and the next number. Then I check the next operator and it's another plus. Since it's left-associative, I see if it's precedence is > than the previous plus. It's not, so I calculate the first +-expression and move on... Then I get to the multiplication. It's left-associative, so I check if it's precedence > than of the plus. It is, so I skip to the multiplication. But I first see what the next operator is. It's a ^, and it's right-accociative, so I see if it's precedence ≥ the * operator. It is, so I see the next one. But there is no next one, so I do the ^, then fall back to the *, and to the +.
You are on the right track.

Now regarding the item in bold - I have a "bad news" for you - no matter how you formulate, but in fact it really means:

"if current operator (after which there is an operand) is followed by by an operator with higher precedence, then I do this and that". Likewise, similar statements with followed by can be formulated with equal and lower precedence.

Back to our sheep - assume now that your notebook has sheets with pretty big cells in it. The cell is big in a sense you can write some name in it - like 'operand', 'operation', etc., but the cell is still small in a sense it can contain just one item. So, for example, in a cell named 'operation' you can have '+' or '*' or '^', but no more than that. And in a cell named 'operand' you can have '1', '2', etc - just any one of the operands present in the expression.

In the beginning you may have as many cells as you like, but ultimately try to minimize number of needed cells.

So, now fully document your steps for the

Code:
`1 + 2 + 3 + 4 * 5 ^ 2`
expression (not that many steps) including all the decision making (your 'if ... else' statements above). Fully document means introduce some cell names and describe what you write into the cells, as well as what checks you make. Remember, you also have an eraser, i.e. you can, if appropriate, change value in a cell - not only write value into it

Having all this before your and my eyes will really help to draw further algorithmic conclusions and get closer to real code.

Of course, your final calculated result should be correct.

04-01-2010, 10:04 AM   #13
johnsfine
LQ Guru

Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep:
Quote:
 Originally Posted by MTK358 I was wondering if it could be easier just to do it this way?
I assume you mean the grammar rules you showed.

First, those grammar rules are wrong. They couldn't even handle A+B+C. Probably you meant:

Code:
```expr ::= expr "+" expr2
| expr "-" expr2
| expr2

expr2 ::= expr2 "*" expr3
| expr2 "/" expr3
| expr3

expr3 ::= NUMBER
| VARIABLE
| "-" expr3
| ( expr )```
Second, grammar rules do not imply a method to implement them.

Notice these corrected rules define an infinite recursion if you take a simplistic recursive approach to parsing. Try to parse an expr and the first thing to check is whether it starts with an expr. That is infinite recursion before you even look at anything.

Quote:
 And no matter what method, how do you handle left/right accociativity?
The corrected rules show the basis for left right associativity. Consider A+B+C:

A+B could be an expr or B+C could be an expr.
If A+B is an expr then that expr + C is another expr. But if B+C is an expr, then A plus that expr is not represented by this grammar.

Last edited by johnsfine; 04-01-2010 at 10:10 AM.

 04-01-2010, 10:16 AM #14 MTK358 LQ 5k Club   Registered: Sep 2009 Posts: 6,443 Blog Entries: 3 Original Poster Rep: I see the "1". Then the '+'. Then the '2'. I make this tree: (+, 1, 2) I check the next operator. It's left associative, and its precedence is not greater than that of the previous operator. I then get the next number and continue the tree like this: (+, (+, 1, 2), 3) Now the the next +. The next operator after it is left-associative and has a higher precedence. So I recursively start from there. The next operator (^) is right-associative and is greater than or equal to the previous one. I again go further recursively. There is no next operator. I return back to the multiplication with this tree: (^, 5, 2) The multiplication function finds there is no next operator now. It returns this tree: (*, 4, (^, 5, 2)) The first level now finds there are no more operators, and returns this: (+, (+, (+, 1, 2), 3), (*, 4, (^, 5, 2))) EDIT: I did not see the reply by johnsfine Last edited by MTK358; 04-01-2010 at 10:17 AM.
04-01-2010, 10:23 AM   #15
Sergei Steshenko
Senior Member

Registered: May 2005
Posts: 4,481

Rep:
Quote:
 Originally Posted by MTK358 I see the "1". Then the '+'. Then the '2'. I make this tree: (+, 1, 2) I check the next operator. It's left associative, and its precedence is not greater than that of the previous operator. I then get the next number and continue the tree like this: (+, (+, 1, 2), 3) Now the the next +. The next operator after it is left-associative and has a higher precedence. So I recursively start from there. The next operator (^) is right-associative and is greater than or equal to the previous one. I again go further recursively. There is no next operator. I return back to the multiplication with this tree: (^, 5, 2) The multiplication function finds there is no next operator now. It returns this tree: (*, 4, (^, 5, 2)) The first level now finds there are no more operators, and returns this: (+, (+, (+, 1, 2), 3), (*, 4, (^, 5, 2))) EDIT: I did not see the reply by johnsfine
Let's first implement tree-less calculation on the fly. The point is to better show/feel how much of already processed part of expression you really need for calculations/decision making.

Secondly, the tree is a "book" one, but not necessarily the most convenient one. For many practical tasks a tree like this:

Code:
`(+, 1, 2, 3, (*, 4, (^, 5, 2)))`
is much more convenient.

 Posting Rules You may not post new threads You may not post replies You may not post attachments You may not edit your posts BB code is On Smilies are On [IMG] code is Off HTML code is Off Forum Rules

 Similar Threads Thread Thread Starter Forum Replies Last Post trist007 Programming 2 05-12-2009 01:50 PM nillgump Linux From Scratch 1 10-11-2008 11:09 AM frankie_DJ Programming 10 11-25-2006 01:35 PM alaios Programming 4 11-28-2005 11:27 AM alaios Programming 1 06-03-2005 05:32 AM

LinuxQuestions.org

All times are GMT -5. The time now is 01:01 PM.

 Contact Us - Advertising Info - Rules - LQ Merchandise - Donations - Contributing Member - LQ Sitemap -