LinuxQuestions.org
01-05-2008, 09:53 AM   #1
MikeyCarter
Member
 
Registered: Feb 2003
Location: Orangeville
Distribution: Fedora
Posts: 492

Rep: Reputation: 31
Software for parsing dates and addresses


I've got an interesting challenge.

I get text ads which have a date and address somewhere in the body. The date could be in any format: 01/01/08, Sat 5th, Sat Jan 5th, Sat & Sun 5 & 6... you name it.
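
To make the variety of formats concrete, here is a minimal sketch of a ranked pattern list, tried from most specific to least specific. It is written in Python purely for illustration; the pattern names, the ranking order, and the function `first_date_match` are assumptions, not rules from any existing script.

```python
import re

# Loose building blocks (illustrative only): they also accept words such
# as "Sunny" or "Maybe", so a real script would need tighter rules.
MONTHS = r"(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*"
DAYS = r"(?:mon|tue|wed|thu|fri|sat|sun)[a-z]*"
ORD = r"(?:st|nd|rd|th)?"  # optional ordinal suffix, e.g. the "th" in "5th"

# Hypothetical ranking: the first pattern that matches wins.
RANKED_PATTERNS = [
    ("numeric", re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{2,4})\b")),
    ("day_month_dom", re.compile(
        rf"\b({DAYS})\.?\s+({MONTHS})\.?\s+(\d{{1,2}}){ORD}\b", re.I)),
    ("day_pair", re.compile(
        rf"\b({DAYS})\s*&\s*({DAYS})\s+(\d{{1,2}}){ORD}\s*&\s*(\d{{1,2}}){ORD}\b",
        re.I)),
    ("day_dom", re.compile(rf"\b({DAYS})\.?\s+(\d{{1,2}}){ORD}\b", re.I)),
]


def first_date_match(text):
    """Return (pattern_name, captured_groups) for the highest-ranked match, or None."""
    for name, pattern in RANKED_PATTERNS:
        match = pattern.search(text)
        if match:
            return name, match.groups()
    return None


for sample in ["01/01/08", "Sat 5th", "Sat Jan 5th", "Sat & Sun 5 & 6"]:
    print(sample, "->", first_date_match(sample))
```

Each sample from the post is picked up by a different pattern, which hints at why the list of rules grows so quickly.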


Currently I've been maintaining a PHP script which looks for address and date patterns. It's about 80% accurate but must be monitored closely. I'm thinking of redesigning it with some type of AI behind it.

Before I start coding, I just wanted to check here to see if anyone knows of some Linux software which does this (or part of it) already.
 
01-06-2008, 06:58 PM   #2
PatrickNew
Senior Member
 
Registered: Jan 2006
Location: Charleston, SC, USA
Distribution: Debian, Gentoo, Ubuntu, RHEL
Posts: 1,148
Blog Entries: 1

Rep: Reputation: 48
How does your current script work? Seems to me that regular expressions might be the tool you need. Just google up on them and you'll find all you need.
 
01-07-2008, 09:13 AM   #3
MikeyCarter
Member
 
Registered: Feb 2003
Location: Orangeville
Distribution: Fedora
Posts: 492

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by PatrickNew
How does your current script work? Seems to me that regular expressions might be the tool you need. Just google up on them and you'll find all you need.
It currently works with regular expressions, about 60 of them, all ranked. I even check against the current date (i.e. if the ad says "Sun", is that this Sunday or last Sunday?). Or take "Sun 6": there is a high probability that only one or two Sundays fall on the 6th in a given year (at least within a range of a few months).
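
The "Sun 6" check can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the PHP script itself: the function name `resolve_weekday_dom` and the default 90-day window are hypothetical choices. It scans the days around a reference date and keeps those that fall on the given weekday and day of month, nearest first.

```python
from datetime import date, timedelta

# Python's date.weekday() numbering: Monday is 0, Sunday is 6.
WEEKDAYS = {"mon": 0, "tue": 1, "wed": 2, "thu": 3, "fri": 4, "sat": 5, "sun": 6}


def resolve_weekday_dom(weekday, day_of_month, today, window_days=90):
    """Return dates within +/- window_days of today that fall on the given
    weekday and day of month, sorted nearest to today first."""
    target = WEEKDAYS[weekday.lower()[:3]]
    candidates = []
    for offset in range(-window_days, window_days + 1):
        d = today + timedelta(days=offset)
        if d.day == day_of_month and d.weekday() == target:
            candidates.append(d)
    return sorted(candidates, key=lambda d: abs((d - today).days))


# "Sun 6" seen on 7 January 2008: only two candidates within 90 days.
print(resolve_weekday_dom("Sun", 6, date(2008, 1, 7)))
```

With the post's own date as the reference, the candidates are 6 January 2008 and 6 April 2008, so the nearest one is an easy pick; a narrower window leaves only the first.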

The problem is I always get these ads where there is a slight deviation from the pattern (not to mention spelling mistakes).


(i.e. "Orangeville, 57 Broadway" vs "Orangeville., 57th Broadway". I have to include [.,]{1,2} and [t]?[h]?, and also filter for the case where someone has squished the number against a street name which starts with "th".)
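
One way to handle the squished case is a negative lookahead: accept an ordinal suffix only when no letter follows it. The sketch below is in Python and is an assumption-laden illustration, not the actual rule set: it swaps `[t]?[h]?` for an explicit `st|nd|rd|th` group, and the street name "Thornton" and the function `parse_address` are made up for the example.

```python
import re

ADDRESS = re.compile(
    r"(?P<town>[A-Za-z]+)"                  # town name, e.g. Orangeville
    r"\s*[.,]{1,2}\s*"                      # one or two stray '.' or ',' separators
    r"(?P<number>\d+)"                      # street number
    r"(?:(?:st|nd|rd|th)(?![A-Za-z]))?"     # ordinal suffix, only if no letter follows
    r"\s*"                                  # number may be squished against the name
    r"(?P<street>[A-Za-z]+)",               # first word of the street name
    re.IGNORECASE,
)


def parse_address(text):
    """Return a dict with town, number and street, or None if nothing matches."""
    match = ADDRESS.search(text)
    return match.groupdict() if match else None


for sample in ["Orangeville, 57 Broadway",
               "Orangeville., 57th Broadway",
               "Orangeville, 57Thornton"]:   # hypothetical squished street name
    print(sample, "->", parse_address(sample))
```

In the squished case the lookahead rejects "Th" as a suffix because "o" follows it, so the whole word stays with the street name.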


Hence my question: does anyone know of software which already does this job, so I'm not re-inventing the wheel?
 
01-07-2008, 02:15 PM   #4
jlinkels
LQ Guru
 
Registered: Oct 2003
Location: Bonaire, Leeuwarden
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,195

Rep: Reputation: 1043
--Removed--

Now I see what you mean. This is not user input; this is some kind of pattern finding.

jlinkels

Last edited by jlinkels; 01-07-2008 at 02:17 PM.
 
  

