Quote:
Originally Posted by ntubski
I'm pretty sure you can't go backwards. You shouldn't match a function call expression with a lexer: regular languages can't match arbitrarily nested things, so you would have trouble with things like func((a*(b+c)), x).
Time to look at a parser, since you've started with flex, perhaps bison.
Hi,
Thanks for the answer. I actually found a solution a few days ago and meant to post it back here, but I forgot. To make the lexer back up, so that it doesn't swallow characters that belong to the next token, use yyless(). Calling yyless(n) keeps the first n characters of the current token in yytext and returns the rest to the input stream to be rescanned. If my lexer has matched a string like init( then calling yyless(yyleng-1) (yyleng is the length of yytext) puts back one character from the right-hand end. In this case yytext becomes init, and the ( is put back into the stream to be tokenized later. Similarly, after yyless(yyleng-2) only ini remains in yytext, and the characters t and ( are put back into the stream.
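To make the bookkeeping concrete, here is a small toy model in C of what yyless() does from the caller's point of view. It is only a sketch: toy_scanner, toy_match and toy_yyless are made-up names for illustration, and this is not how flex is implemented internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the scanner state (hypothetical, not flex internals). */
struct toy_scanner {
    const char *input;  /* the whole input string */
    size_t pos;         /* index of the next unread character */
    char text[64];      /* plays the role of yytext */
    size_t leng;        /* plays the role of yyleng */
};

/* Pretend a rule just matched the next n characters of the input. */
static void toy_match(struct toy_scanner *s, size_t n)
{
    size_t remaining = strlen(s->input + s->pos);
    if (n > remaining)
        n = remaining;
    if (n >= sizeof s->text)
        n = sizeof s->text - 1;
    memcpy(s->text, s->input + s->pos, n);
    s->text[n] = '\0';
    s->leng = n;
    s->pos += n;
}

/* Model of yyless(n): keep the first n characters of the token and
 * hand the remaining (leng - n) characters back to the input. */
static void toy_yyless(struct toy_scanner *s, size_t n)
{
    assert(n <= s->leng);
    s->pos -= s->leng - n;  /* rewind the read position */
    s->text[n] = '\0';      /* truncate the token text */
    s->leng = n;
}
```

With the input init(x), calling toy_match(&s, 5) leaves init( in s.text; toy_yyless(&s, s.leng - 1) then shortens it to init and rewinds s.pos so that the next character to be scanned is the ( again.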
In my case, the rule only has to match a function-name token and return it, so it needs some way to detect what a function name is. For example, in the call init(), init is the function name. My expression matches a run of name characters immediately followed by (, treats the ( as a hint that this is a function, takes the characters before the ( as the function name, and returns that as the token.
I will use bison later, but for now I have to write the lexer first, using flex. This is one of the assignments for the Compilers course on Coursera, which also offers lots of other useful courses.
EDIT: I fixed my original regular expression (in the original post) so that it matches only init( in init(...) (where ... stands for the parameters, if any). The pattern recognizes function names like foo, _foo_, _foo_foo_, etc.
Code:
/* e.g. hello(...), foo(...), bar(...) */
FUNCALL [\-\_a-zA-Z0-9]*[a-zA-Z0-9]+[\-\_a-zA-Z0-9]*\(
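As a quick way to see which strings this pattern accepts outside of flex, here is a rough sketch in C using the POSIX regex.h API. The helper name funcall_name_len is made up for illustration, and the pattern is my transcription of FUNCALL into POSIX extended syntax (anchored at the start, with the escapes inside the brackets dropped because POSIX treats a backslash there literally), so treat it as an approximation of what flex does rather than the same thing.

```c
#include <regex.h>
#include <stddef.h>

/* If s starts with something FUNCALL would match, return the length of
 * the name part (the match minus its trailing '('); otherwise return 0.
 * Hypothetical helper, for experimenting with the pattern only. */
static size_t funcall_name_len(const char *s)
{
    /* POSIX ERE version of the FUNCALL definition, anchored with ^. */
    static const char *pat =
        "^[-_a-zA-Z0-9]*[a-zA-Z0-9]+[-_a-zA-Z0-9]*\\(";
    regex_t re;
    regmatch_t m;
    size_t len = 0;

    if (regcomp(&re, pat, REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, s, 1, &m, 0) == 0)
        len = (size_t)m.rm_eo - 1;  /* same idea as yyless(yyleng-1) */
    regfree(&re);
    return len;
}
```

For example, funcall_name_len("init(x)") gives 4 and funcall_name_len("_foo_foo_(") gives 9, while "foo" (no parenthesis) and "___(" (no letter or digit) give 0. Note that because - and digits are allowed anywhere in the name, "a-b(" is also accepted as a 3-character name, which may or may not be what the assignment's language wants.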