LinuxQuestions.org
Old 05-14-2012, 10:52 PM   #1
Rickert
LQ Newbie
 
Registered: Nov 2011
Posts: 18

Rep: Reputation: Disabled
How to go backward to a certain position in Gnu-Flex?


For example, my lexer recognizes a function call pattern:

Code:
/* e.g. hello(...), foo(...), bar(...) */
FUNCALL     [\-\_a-zA-Z0-9]*[a-zA-Z0-9]+[\-\_a-zA-Z0-9]*\(
Flex recognizes the pattern, but it then moves past the last character of the match (i.e. after storing foo( in yytext, the lexer points to the next character after it).

How can I reset the lexer's position back to the beginning of the function pattern? That is, after recognizing foo(, I want the lexer to point to the start of foo( again, so I can tokenize it piece by piece.

I need to do this because each matched pattern can return only one token. After matching foo(...), I can return foo or ( or ) with a return statement, but not all of them.

Last edited by Rickert; 05-16-2012 at 10:48 PM.
 
Old 05-16-2012, 07:21 PM   #2
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,784

Rep: Reputation: 2083
I'm pretty sure you can't go backwards. You shouldn't match a function call expression with a lexer: regular languages can't match arbitrarily nested things, so you would have trouble with things like func((a*(b+c)), x).

Time to look at a parser generator; since you've started with flex, perhaps bison.
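To make the nesting point concrete, here is a minimal C sketch (not flex code) of what it takes to find the `)` that closes a given `(`. The function name `find_matching_paren` is made up for illustration. The key detail is the `depth` counter: it can grow without bound, which is exactly the kind of memory a regular expression does not have.

```c
#include <stddef.h>

/* Hypothetical helper: given s[open] == '(', return the index of the
   matching ')', or -1 if the parentheses are unbalanced. */
static long find_matching_paren(const char *s, size_t open) {
    size_t depth = 0;                 /* current nesting level */
    if (s[open] != '(') return -1;    /* caller must point at a '(' */
    for (size_t i = open; s[i] != '\0'; i++) {
        if (s[i] == '(') {
            depth++;                  /* one level deeper */
        } else if (s[i] == ')' && --depth == 0) {
            return (long)i;           /* back at the starting level */
        }
    }
    return -1;                        /* ran out of input first */
}
```

For the string func((a*(b+c)), x), starting from the first ( at index 4, the nesting reaches three levels before the closing ) at the end of the string is found. A parser handles this kind of bookkeeping for you.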
 
1 members found this post helpful.
Old 05-16-2012, 09:30 PM   #3
Rickert
LQ Newbie
 
Registered: Nov 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by ntubski View Post
I'm pretty sure you can't go backwards. You shouldn't match a function call expression with a lexer: regular languages can't match arbitrarily nested things, so you would have trouble with things like func((a*(b+c)), x).

Time to look at a parser generator; since you've started with flex, perhaps bison.
Hi,

Thanks for the answer. However, I found a solution a few days ago and meant to post it here, but I forgot. To make the lexer back up so that no characters are skipped, use yyless(). If the lexer matches a string like init(, calling yyless(yyleng-1) (yyleng is the length of yytext) puts back one character from the right-hand end: yytext becomes init, and ( is returned to the input stream to be tokenized later. Similarly, after yyless(yyleng-2) only ini remains in yytext, and the characters t and ( are put back into the stream.
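The effect of yyless(n) can be modelled in plain C. This is only a sketch of the behaviour described above, not flex's real implementation: the `Scanner` struct and the `model_match` and `model_yyless` functions are hypothetical names, and the fixed 64-byte token buffer is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner state; the field names mirror flex's yytext and
   yyleng, but this is a toy model, not flex internals. */
typedef struct {
    const char *input;   /* whole input buffer (NUL-terminated) */
    size_t pos;          /* index of the next unread character */
    char yytext[64];     /* text of the current match */
    size_t yyleng;       /* length of yytext */
} Scanner;

/* Pretend a rule just matched `len` characters starting at s->pos. */
static void model_match(Scanner *s, size_t len) {
    size_t avail = strlen(s->input + s->pos);
    if (len > avail) len = avail;                          /* stay inside the input */
    if (len >= sizeof s->yytext) len = sizeof s->yytext - 1;
    memcpy(s->yytext, s->input + s->pos, len);
    s->yytext[len] = '\0';
    s->yyleng = len;
    s->pos += len;                                         /* scanner moves past the match */
}

/* Keep the first n characters of the match and hand the rest back to
   the input, which is what yyless(n) is documented to do. */
static void model_yyless(Scanner *s, size_t n) {
    if (n > s->yyleng) return;       /* nothing sensible to put back */
    s->pos -= s->yyleng - n;         /* rewind over the returned characters */
    s->yytext[n] = '\0';
    s->yyleng = n;
}
```

With the input init(x), matching the five characters init( and then calling model_yyless(&s, s.yyleng - 1) leaves yytext as init and the next unread character as (, so a separate rule can pick up the parenthesis.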

In my case, the rule only tries to match a function-name token and return it, so it needs a way to detect what a function name is. For example, in the call init(), init is the function name. The pattern matches a sequence of identifier characters followed by (, which hints at a function call; the action then takes the characters before ( as the function name and returns that as the token.

I will use bison later, but for now I have to build the lexer with flex. This is one of the assignments for the Compilers course on Coursera. Coursera offers lots of other useful courses as well.

EDIT: I fixed the regular expression in my original post so that it matches only init( in init(...) (where ... stands for the parameters, if any). The pattern recognizes function names like foo, _foo_, _foo_foo_, etc.

Code:
/* e.g. hello(...), foo(...), bar(...) */
FUNCALL     [\-\_a-zA-Z0-9]*[a-zA-Z0-9]+[\-\_a-zA-Z0-9]*\(

Last edited by Rickert; 05-16-2012 at 10:51 PM.
 
1 members found this post helpful.
Old 05-18-2012, 12:01 AM   #4
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,784

Rep: Reputation: 2083
Quote:
Originally Posted by Rickert View Post
I will use bison later, but for now I have to build the lexer with flex. This is one of the assignments for the Compilers course on Coursera.
A quick look at the reference manual for COOL (the language for that course) says
Quote:
The lexical units of COOL are integers, type identifiers, object identifiers, special notation, strings, key-words, and white space.
Function call is not a lexical element.
 
Old 05-18-2012, 02:23 AM   #5
Rickert
LQ Newbie
 
Registered: Nov 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by ntubski View Post
Function call is not a lexical element.
You are right. After struggling to recognize function calls with start conditions, which complicated the rest of the lexer (I currently have many start conditions, which makes them really hard to manage), I decided to recognize the function name another way. After all, a function name is just an object identifier, which contains only alphanumeric characters (plus - and _).
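A rough illustration of that approach, written as a small hand-rolled tokenizer in C rather than as flex rules: the identifier and each parenthesis come out as separate tokens, so nothing has to be pushed back. The names `TokKind`, `Token`, `is_id_char` and `next_token` are invented for this sketch, and the identifier character set (alphanumerics plus - and _) follows the description above rather than any language specification.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch. */
typedef enum { TOK_EOF, TOK_ID, TOK_LPAREN, TOK_RPAREN, TOK_OTHER } TokKind;

typedef struct {
    TokKind kind;
    const char *start;   /* first character of the token */
    size_t len;          /* number of characters in the token */
} Token;

/* Identifier characters as described in the post: alnum, '-' and '_'. */
static int is_id_char(int c) {
    return isalnum(c) || c == '-' || c == '_';
}

/* Return the next token starting at *p and advance *p past it. */
static Token next_token(const char **p) {
    while (isspace((unsigned char)**p)) (*p)++;     /* skip white space */
    Token t = { TOK_EOF, *p, 0 };
    unsigned char c = (unsigned char)**p;
    if (c == '\0') return t;                        /* end of input */
    if (is_id_char(c)) {
        while (is_id_char((unsigned char)**p)) (*p)++;
        t.kind = TOK_ID;                            /* e.g. init */
    } else {
        t.kind = c == '(' ? TOK_LPAREN
               : c == ')' ? TOK_RPAREN
               : TOK_OTHER;
        (*p)++;                                     /* single-character token */
    }
    t.len = (size_t)(*p - t.start);
    return t;
}
```

Calling next_token repeatedly on init( x ) yields an identifier, a left parenthesis, another identifier and a right parenthesis. Deciding that an identifier followed by ( is a function call is then left to the parser.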
 
  


All times are GMT -5. The time now is 12:45 AM.
