[SOLVED] parsing with gnu-flex and bison fails for space and brace
There are a few problems, so I need to generalize both the lexer and the parser:
1) It cannot read a sentence, i.e. if the RHS is Author="Some Value", it only shows "Some. The space is not handled, and I don't know how to handle it.
2) If I enclose the RHS in {} rather than "", it gives a syntax error.
I am looking for help with these two situations.
Kindly help.
You can see the final quote in "Some2VALUE" being lexed as a separate token, which is probably not what you want. I suggest having separate lexing rules for quoted and unquoted tokens:
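For illustration only, here is a plain C sketch (not flex rules, and not the code from this reply) of what a quoted-token rule has to do: consume everything from the opening quote to the closing quote, spaces included, and hand back the text as one value. The name `scan_quoted` is hypothetical, and the sketch assumes no escaped quotes inside the string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the text between a pair of double quotes.
 * `s` must point at the opening quote. Returns a malloc'd string that
 * the caller must free, or NULL if `s` does not start with a quote or
 * the closing quote is missing. Escaped quotes are NOT handled. */
char *scan_quoted(const char *s)
{
    if (*s != '"')
        return NULL;

    const char *start = s + 1;
    const char *end = strchr(start, '"');   /* first closing quote */
    if (end == NULL)
        return NULL;

    size_t len = (size_t)(end - start);
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, start, len);                /* spaces are copied like any other char */
    out[len] = '\0';
    return out;
}
```

Called on the text `"Some Value"`, this returns `Some Value` as a single string, so the space no longer ends the token and the quotes do not become tokens of their own.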
Enclosing the RHS with {} will be a bit more tricky as you are already enclosing the Entry value in {}. If you don't need any further nesting of {}s, you could probably handle it in the lexer using Start Conditions, but it may be a better idea to handle it in the parser.
Quote:
it may be a better idea to handle it in the parser.
Ntubski,
Thanks for your reply. I already have a lexer and parser that handle it correctly when the strings are quoted. But in the more general case the strings may be braced, and even nested braces are common.
So the last line of your reply is my actual goal. As you can see from my parser and lexer, I am trying to handle it in the grammar, but I have not achieved much.
I need help with the grammar.
Ah, I looked at the BibTeX Format Description: because the stuff within braces has to include everything, including white space, you have to tell the lexer about it. Here is a parser that just prints out the Entries (I didn't bother freeing memory, so it's very leaky):
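As a separate illustration of the idea (this is a hand-written C sketch, not the flex/bison parser from this post), a braced value that keeps its white space and allows nested braces can be scanned by tracking the brace depth. The name `scan_braced` is hypothetical, and the sketch assumes braces are never escaped.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the text inside a balanced {...} group.
 * `s` must point at the opening brace. Nested braces are kept verbatim
 * in the result. Returns a malloc'd string that the caller must free,
 * or NULL if `s` does not start with '{' or the braces are unbalanced.
 * Escaped braces are NOT handled. */
char *scan_braced(const char *s)
{
    if (*s != '{')
        return NULL;

    int depth = 0;
    const char *p = s;
    for (; *p != '\0'; ++p) {
        if (*p == '{')
            ++depth;
        else if (*p == '}' && --depth == 0)
            break;                      /* found the matching close brace */
    }
    if (*p != '}')
        return NULL;                    /* ran off the end: unbalanced */

    size_t len = (size_t)(p - (s + 1)); /* text between the outer braces */
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, s + 1, len);
    out[len] = '\0';
    return out;
}
```

For the input `{A {nested} value}` this yields `A {nested} value`. The same depth counter is what a flex start condition would have to maintain if the braces are handled in the lexer.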
Hi Ntubski and all,
Sorry to reopen a solved thread, but it is probably the best thing for the sake of completeness.
I have adapted the code Ntubski provided and just put the results in a GTK tree view and a hash table.
Now it looks like this. The lexer:
This is a minimal example.
The problem is that, even ignoring all the GTK parts, the printf statement alone (line #86 of the parser) prints a garbage value for Key={Value}, but not for Key="Value" or Key="{Value}".
Also, the garbage appears only when I read the file a second time, not the first. (This mirrors what happens when I open the file through a GTK widget, which is why I added the second yyparse call.)
The output of printf looks like:
��b<Rudra } when the actual thing is {Rudra}
and emits warning:
Quote:
Pango-WARNING **: Invalid UTF-8 string passed to pango_layout_set_text()
I am confused about why the warning appears only when the string starts with {, not with ".
A little tutorial?
In the case of the " quoted string the concat() function was not used to construct the value so there is no problem. A " quoted string is lexed as a VALUE so the second alternative of Value is chosen. A {} quoted string is lexed as '{' VALUE '}' so the BraceVs alternative of Value is chosen. Both BraceVs and BraceV use concat() in the semantic expressions to create a value for $$.
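The thread does not show concat() itself, so the following is only a guess at a safe shape for such a helper, not the poster's code. Garbage like the output above usually means that the bytes being printed are either not NUL-terminated or live in a buffer that has since been reused; for instance, flex only guarantees the contents of yytext until the next token is matched, so a semantic value has to be copied if it is kept. A minimal sketch, with the hypothetical name `concat3`, that always returns a freshly allocated, terminated string:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join three strings into newly allocated memory.
 * The result never aliases the inputs, so it stays valid after the
 * lexer has moved on. The caller must free it. Returns NULL if the
 * allocation fails. */
char *concat3(const char *a, const char *b, const char *c)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    size_t lc = strlen(c);

    char *out = malloc(la + lb + lc + 1);   /* +1 for the terminating NUL */
    if (out == NULL)
        return NULL;

    memcpy(out, a, la);
    memcpy(out + la, b, lb);
    memcpy(out + la + lb, c, lc + 1);       /* lc + 1 also copies c's NUL */
    return out;
}
```

In a grammar action this would be used along the lines of `$$ = concat3("{", $2, "}");`, assuming `$2` is itself a copied string (for example made with strdup in the lexer) rather than a pointer into yytext.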