There is no facility in the C/C++ languages to do that, so you would have to write your own handlers for doing it.
You can hand-write lexers/parsers; many people do. It is a well-understood but complex task. Unless your grammar is extremely simple, with few symbols/tokens, it would likely take quite a bit of work.
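To give a feel for what "hand-written" means here, below is a minimal sketch of a lexer plus recursive-descent parser for integer arithmetic. It is in Python rather than C purely to keep it short. The grammar, the tuple-shaped tree and the names `tokenize`, `Parser` and `parse` are all made up for illustration; nothing here comes from the original question.

```python
# Grammar handled by this sketch (an assumption, chosen for illustration):
#   expr   ::= term   (("+" | "-") term)*
#   term   ::= factor (("*" | "/") factor)*
#   factor ::= NUMBER | "(" expr ")"

def tokenize(text):
    """Lexer: turn the input string into ('num', int) and ('op', char) tokens."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif "0" <= ch <= "9":
            j = i
            while j < len(text) and "0" <= text[j] <= "9":
                j += 1
            tokens.append(("num", int(text[i:j])))
            i = j
        elif ch in "+-*/()":
            tokens.append(("op", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("end", None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.take()[1]
            node = (op, node, self.term())  # left-associative
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.take()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.take()
        if kind == "num":
            return value
        if (kind, value) == ("op", "("):
            node = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")


def parse(text):
    """Parse a whole string into a nested-tuple syntax tree."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek()[0] != "end":
        raise SyntaxError("trailing input")
    return tree
```

With this, `parse("1 + 2 * 3")` gives `("+", 1, ("*", 2, 3))`. Even this toy needs a lexer, one function per rule and some error handling, which is why the work grows quickly once the grammar does.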
What you are describing, then, is a general-purpose parser: you give it a grammar (the BNF) and a string and ask it to parse the string into a sentence, an AST, according to that grammar.
That is not a new idea, but I am not aware of a successful implementation. It is a very complex proposition!
As it happens, I am working on a grammar/parser project at this time, and in my recent reading I saw some in-depth discussion of that, a general-purpose parser, somewhere... ah, here and here (both excellent books on the subject, by the way!).
I'll try to have a fresh look at that tomorrow and post any useful nuggets I find. But as I recall, it poses several very big, if not insurmountable, difficulties for all but the most trivial grammars.
The author of those two books is pretty much the authority on the subject I think.
He makes an earlier edition of one of them available for download here. If you are not already very familiar with the subject, this would be an excellent place to start reading. This book does not include any code, but you will understand the BNF and the basic problems of parsing to a syntax tree after reading it!
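For what it's worth, the "grammar plus string in, tree out" idea discussed above can be sketched in a few lines for trivial grammars. This is only a naive backtracking matcher, not a technique taken from the books mentioned, and it is in Python rather than C to keep it short. The grammar is assumed to be already loaded into a dict; the names `match_symbol`, `match_sequence`, `parse_with_grammar` and `GRAMMAR` are hypothetical.

```python
# Grammar as data: nonterminal -> list of alternatives, each a list of symbols.
# Any symbol that is not a key of the dict is treated as literal text.
GRAMMAR = {
    "sum": [["digit", "+", "sum"], ["digit"]],
    "digit": [[d] for d in "0123456789"],
}


def match_symbol(grammar, symbol, text, pos):
    """Yield (tree, end) for every way `symbol` can match text starting at pos."""
    if symbol not in grammar:
        # Terminal: must appear literally at this position.
        if text.startswith(symbol, pos):
            yield symbol, pos + len(symbol)
        return
    for alternative in grammar[symbol]:
        for children, end in match_sequence(grammar, alternative, text, pos):
            yield (symbol, children), end


def match_sequence(grammar, symbols, text, pos):
    """Yield (list_of_trees, end) for every way the symbol sequence can match."""
    if not symbols:
        yield [], pos
        return
    for first, middle in match_symbol(grammar, symbols[0], text, pos):
        for rest, end in match_sequence(grammar, symbols[1:], text, middle):
            yield [first] + rest, end


def parse_with_grammar(grammar, start, text):
    """Return the first tree that consumes the whole text, or None."""
    for tree, end in match_symbol(grammar, start, text, 0):
        if end == len(text):
            return tree
    return None
```

`parse_with_grammar(GRAMMAR, "sum", "1+2")` returns a nested `("sum", [...])` tree, and `"1+"` returns `None`. The difficulties show up fast, though: a left-recursive rule such as `sum ::= sum "+" digit` makes this naive version recurse until Python raises `RecursionError`, and backtracking can become very slow on ambiguous grammars.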
There certainly are parsers out there, such as Perl's Parse::RecDescent war-horse, that can "parse on the fly," but the ordinary procedure is to first compile the grammar, then compile the source-code generated by the grammar processor. Since in most applications the grammar is fixed, this produces a very efficient parsing engine.
And ... you really don't want to "do it yourself in 'C'" ... unless you are a graduate student.
Not really complex. Something that can be represented by a BNF, since I'd like to have the BNF as one of the input parameters.
Can BNF be represented as BNF?
It sounds like you're asking for a lexer and parser that would lex and parse a BNF, and then output another lexer and parser from that. I don't think the BNF has enough information for that.
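On the first half of that, reading a BNF into a data structure is the easy part. Below is a rough sketch, assuming a simplified BNF dialect with one rule per line, `<name>` for nonterminals and double-quoted literals. The function name `read_bnf` and the `("rule", ...)` / `("text", ...)` tagging are invented for illustration, and it is Python rather than C for brevity.

```python
import re

# One symbol is <nonterminal>, a "quoted literal", or the | separator.
SYMBOL_RE = re.compile(r'<([^<>]+)>|"([^"]*)"|(\|)')


def read_bnf(text):
    """Turn simplified BNF text into {nonterminal: [[symbol, ...], ...]}."""
    grammar = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        head, separator, body = line.partition("::=")
        if not separator:
            raise ValueError(f"missing '::=' in rule: {line!r}")
        name = head.strip().strip("<>")
        alternatives = [[]]
        for nonterminal, literal, bar in SYMBOL_RE.findall(body):
            if bar:
                alternatives.append([])  # start the next alternative
            elif nonterminal:
                alternatives[-1].append(("rule", nonterminal))
            else:
                alternatives[-1].append(("text", literal))
        grammar[name] = alternatives
    return grammar


SAMPLE = '''
<sum>   ::= <digit> "+" <sum> | <digit>
<digit> ::= "0" | "1" | "2"
'''
```

`read_bnf(SAMPLE)` produces a table with two entries, `sum` and `digit`. What the BNF text does not say is how whitespace is skipped, where tokens begin and end, or what shape the resulting tree should take, which may be what the "not enough information" remark is getting at.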
If you want tree representations and to do parsing work with them, try:
ANTLR: powerful, with a graphical interface to parsing, now supported/used by Apple, GPL free. It's a good bet: if you have something big in mind, you should consider using it.
bnf2xml: takes a simple BNF, scans the input text with it, and emits a text XML "tree" (not a highly developed app, just a small, typical Unix "text filter"). awk may be the better tool unless the BNF spec is too complicated for awk/grep; C is the alternative if the regex output tree is too large for those tools to handle reasonably well, or too cumbersome. (Regex stores multiple results haphazardly. Unless you know regex well, it might take many tries to work out how it stores multiple hits and where they end up in memory.)
There are MANY BNF parsers out there that handle EBNF syntax. Almost all of them require you to build them into your C application and use a custom interface to glean the results they find.
BNF grammar is simple, and was used in many books to describe C and C++ syntax,
but it's NOT necessarily the best language; see ANTLR on languages and the complications arising from the choice of which one to use.
Cduce is another interesting parser. This class of parser is geared toward HTML+XML web server use (its output is HTML, not XML, and I think it requires integration into C). There are many new such products.
Last edited by X-LFS-2010; 05-02-2016 at 01:17 PM.