For all you compiler hackers out there: this is a question about writing a specification for flex, the scanner generator. I am using flex 2.5.35.
My problem is to write a specification that will break up an NMEA-0183 AIS string into recognizable components. In case you don't know, AIS strings look like this:
That is, roughly:
!AIVDM COMMA NUMBER COMMA NUMBER COMMA OPTIONAL_NUMBER COMMA [A | B] COMMA JUNK_TO_BE_DISCUSSED COMMA ZERO SPLAT HEXDIGIT HEXDIGIT CR LF
The problem is in the JUNK part. This is ASCII-ized binary crud similar to Base64 encoded data. It can contain basically anything except delimiters like (,!$*) etc.
The problem I am having is that my recognizers for decimal numbers are hitting sequences in the beginning of that junk, or at the end sometimes, so the junk sequence "15M>16?P00G?j9nKAFcV1ww:20Su" might hit on a number 15, followed by junk.
So the basic problem is: how do you construct a specification that separates a really "promiscuous" field from more "restricted" data? The cause (it seems) is that the JUNK field can contain a lot of, well, junk that is easily mistaken for almost anything else.
Any ideas what to do about this? I can post a lex file and data input if anyone cares.
Your data is broken into fields that are distinguished by position, so using the commas as field separators removes the need to parse based on content. You can build in a counter that increments at every comma and identify the purpose of each field from the count. Reset the counter when you see a newline.
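A rough sketch of that counter idea in plain C, outside of flex, might look like the following. The names split_fields, struct span and the FIELD_* enum are made up for illustration, and the field order is simply the layout given in the question. Each comma bumps the field index, and the scan stops at CR, LF or the end of the string, so the JUNK field is picked out by position and never inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Field positions, following the layout described in the question.
   These names are illustrative only. */
enum field_index {
    FIELD_TAG = 0,   /* !AIVDM */
    FIELD_NUM1,      /* NUMBER */
    FIELD_NUM2,      /* NUMBER */
    FIELD_OPT_NUM,   /* OPTIONAL_NUMBER, may be empty */
    FIELD_CHANNEL,   /* A or B */
    FIELD_JUNK,      /* the ASCII-ized binary payload */
    FIELD_TRAILER,   /* ZERO SPLAT HEXDIGIT HEXDIGIT */
    FIELD_COUNT
};

/* A field is a pointer into the original line plus a length;
   nothing is copied and the input is not modified. */
struct span {
    const char *start;
    size_t len;
};

/* Split `line` at commas, stopping at CR, LF or NUL.
   Stores at most `max` spans in `out`, but returns the total number
   of fields seen, so the caller can detect too many or too few.
   Empty fields are kept as zero-length spans. */
size_t split_fields(const char *line, struct span *out, size_t max)
{
    size_t count = 0;            /* the comma counter */
    const char *start = line;    /* start of the current field */
    const char *p = line;

    assert(line != NULL && out != NULL);

    for (;; p++) {
        int at_end = (*p == '\0' || *p == '\r' || *p == '\n');

        if (*p == ',' || at_end) {
            if (count < max) {
                out[count].start = start;
                out[count].len = (size_t)(p - start);
            }
            count++;
            start = p + 1;
            if (at_end)
                break;
        }
    }
    return count;
}
```

A caller would check that the return value equals FIELD_COUNT, then validate each span according to its index (digits only for the NUMBER fields, a single A or B for the channel) and take out[FIELD_JUNK] as-is.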
Since your definition seems to indicate that commas are used exclusively as delimiters, perhaps an easier approach would be to break out the good old strtok() function.
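For what it's worth, a minimal strtok() sketch could look like this. The helper name tokenize_with_strtok is invented for the example. One caveat to keep in mind: strtok() treats a run of adjacent delimiters as a single delimiter, so when the OPTIONAL_NUMBER field is empty (two commas in a row) that field silently disappears and every later field moves up one position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split `buf` in place with strtok() and store up to `max` token
   pointers in `tokens`. `buf` must be writable, because strtok()
   overwrites each delimiter it consumes with a NUL byte.
   Returns the number of tokens stored. */
size_t tokenize_with_strtok(char *buf, char **tokens, size_t max)
{
    size_t n = 0;
    char *tok;

    assert(buf != NULL && tokens != NULL);

    /* CR and LF are in the delimiter set so that the line ending
       is stripped from the last token. */
    for (tok = strtok(buf, ",\r\n");
         tok != NULL && n < max;
         tok = strtok(NULL, ",\r\n")) {
        tokens[n++] = tok;
    }
    return n;
}
```

With all seven fields present this returns 7 tokens and the junk lands in tokens[5]. With an empty OPTIONAL_NUMBER it returns 6 and the junk lands in tokens[4], so the token count has to be checked before indexing by position. If that shifting is a problem, splitting on each individual comma by hand keeps the empty field in place.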
--- rod.