LinuxQuestions.org
Old 03-13-2009, 04:36 PM   #1
RileyTheWiley
Member
 
Registered: Dec 2007
Posts: 59

Rep: Reputation: 15
Question Forming a lex (or flex) regexp


For all you compiler hackers out there: this is a question about writing a specification for flex, the scanner generator. I am using flex 2.5.35.

My problem is to write a specification that will break up a NMEA-0183 AIS string into recognizable components. In case you don't know, AIS strings look like this:

!AIVDM,1,1,,B,15M5c<0000G?j?HK@;F005U<04KH,0*4E
!AIVDM,1,1,,B,15M>16?P00G?j9nKAFcV1ww:20Su,0*29
!AIVDM,1,1,,B,15N@wP0P00o?ruLK?UMMbOw>04KH,0*31
!AIVDM,1,1,,B,15Mj2u001vo?tV8K?<ub>8;@0D1<,0*17
!AIVDM,2,1,3,B,55P5TL01VIaAL@7WKO@mBplU@<PDhh000000001S;AJ::4A80?4i@E53,0*3E
!AIVDM,2,2,3,B,1@0000000000000,2*55

That is, roughly:
!AIVDM COMMA NUMBER COMMA NUMBER COMMA OPTIONAL_NUMBER COMMA [A | B] COMMA JUNK_TO_BE_DISCUSSED COMMA DIGIT SPLAT HEXDIGIT HEXDIGIT CR LF

The problem is in the JUNK part. This is ASCII-ized binary crud similar to Base64 encoded data. It can contain basically anything except delimiters like (,!$*) etc.

The problem I am having is that my recognizers for decimal numbers are matching digit sequences at the beginning of that junk, or sometimes at the end, so the junk sequence "15M>16?P00G?j9nKAFcV1ww:20Su" might be scanned as the number 15, followed by junk.

So the basic problem is how do you construct a specification that will filter a really "promiscuous" field out of more "restricted" data? What is causing this (it seems) is the fact that the JUNK field can contain a lot of, well, junk that is easily mistaken for almost anything else.

Any ideas what to do about this? I can post a lex file and data input if anyone cares.

Thanks

Eric
 
Old 03-13-2009, 10:35 PM   #2
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

Rep: Reputation: 903
Hard to say what you need to satisfy the bigger picture, but here's something to start with:
Code:
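%{
#include <stdio.h>
%}

%%

    /* A field (possibly empty) together with its trailing comma. */
[^,\r\n]*,    { printf( ">>>>  %s  <<<<\n", yytext ); return 1; }
    /* The last field of a record, which has no trailing comma. */
[^,\r\n]+     { printf( ">>>>  %s  <<<<\n", yytext ); return 1; }
    /* CR LF (or a bare LF) ends the record. */
\r?\n         { printf( "\nNew record....\n" ); }

%%

int yywrap(void){
	return 1;   /* no further input files */
}


int main( int argc, char * argv[] ){

   while( yylex() );   /* yylex() returns 0 at end of input */
   return 0;
}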
Your data is broken into fields that are distinguished by position, so using commas as field separators removes the need to parse based on content. You can build in a counter that increments on every comma-delimited field and identify the purpose of the field from the count. Reset the counter when you see a newline.

Since your definition seems to indicate that commas are used exclusively as delimiters, perhaps an easier approach would be to break out the good old strtok() function.
--- rod.

Last edited by theNbomr; 03-13-2009 at 10:40 PM.
 
  



Tags
bison, flex, lex, parser, yacc


All times are GMT -5. The time now is 03:17 PM.
