-   Linux - Software (
-   -   Processing data from a 'foreign' database with mysql, or tools to pre-process data. (

linker3000 08-13-2007 08:42 AM

Processing data from a 'foreign' database with mysql, or tools to pre-process data.

I am working on a scheme to bring some data across from an old, but still in use, system so that we can process it for use with our (Linux-based) mailing list software (relax, it's part of our customer base - not yet more spam!).

The data is hosted on a server running an operating system called THEOS and I have no problems pulling over the files by FTP. The file structure is described as 'THEOS ISAM' and as far as I can see from a quick glance, comprises fixed length records with various field delimiters.

I need to either get the data into tab-delimited format or perhaps directly into a mysql database so it can be manipulated. In the past when I have done this in the MS world, I'd use a combination of BASIC and Excel, but since the target app is Linux-based, I'd like to go straight from THEOS -> Linux.

I can hack around in perl, PHP etc. to varying degrees, but wondered if there were any import pre-processing front ends for use with mysql that make it easy to define and test record structure definitions that can then be used as import filters? Any tools worth having a look at before I do some manual data structure dissection?

Hope that make sense!?


jlinkels 08-14-2007 08:36 PM

This kind of work I do with AWK. For current project I need to convert the output of a huge Paradox export to something which I can import in MySQL.

What the script does:

- Filter records matching certain criteria
- Check values for null values, either generate an error message or replace it by \N if null is allowed
- Concatenate strings
- Replace strings with numbers if they must become enumerated values for my database
- Print each record in a tab delimited format.

The Paradox files are sent to me in a regular fashion, I call awk from a bask script, and loads the result in the database.

It takes you about a day or so to understand awk and crank out the basic code.

If your export is a CSV file with quoted strings, do not try to process this with AWK. Simple cases are fine, but if you find a string like:

onevalue, "Sherlock asked: "What do you think, Watson?"", nextvalue
AWK chokes on that and many unsuccesful attempts have been made to implement this. Find a CSV preprocessor for that. If you can't find it, I have a good public domain application. It used to be available on the internet but it seems to have disappeared. If you need it, let me know.


All times are GMT -5. The time now is 03:47 AM.