LinuxQuestions.org


visitnag 08-24-2008 12:10 PM

indexing in awk....
 
I am a newbie in programming and have learnt awk through the internet. I came to know that COBOL programmers use index files to get quick results(?). What is the use of index files? (I also saw this indexing concept in FoxPro.) Can I use this method in awk? Kindly explain.

tronayne 08-25-2008 08:26 AM

The AWK programming language doesn't really have any convenient way to do this; it probably can be done, but I think it would be a struggle that may not be worth the effort. There is a discussion at http://www.unix.com/shell-programmin...ndex-file.html about AWK index files and an article introducing linked lists and index files at http://www.linuxjournal.com/article/1156 that may prove interesting.

An index file would contain information about the location (in another file, say) of indexed values. Essentially, the index would contain the address of one or more patterns in a data file that match the index criteria; that's a complicated way of saying "the stuff you want is at these addresses." In this case, "address" means an offset from the beginning of the data file; e.g., how many bytes in from the beginning of the file. Bear in mind that all files are just collections of bytes irrespective of what kind of data are stored (text, numeric, etc.). You can view a file in an editor, with a dump utility or by some other means -- you probably will be able to read the content of a text file but all others will look like gobbledygook. Different data types (character, integer, floating point) are stored in different sized fields from a single byte (for character) up to multiple bytes for floating point.
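To make the "address" idea concrete, here is a minimal sketch in Python (not AWK) of an offset index. It assumes a comma-separated text file whose first field is a unique key; the function names and the sample records are made up for illustration only.

```python
import os
import tempfile


def build_index(path):
    """Map the first comma-separated field of each line to its byte offset."""
    index = {}
    with open(path, "rb") as f:
        while True:
            offset = f.tell()          # address of the line about to be read
            line = f.readline()
            if not line:               # end of file
                break
            key = line.split(b",", 1)[0].decode("utf-8")
            index[key] = offset
    return index


def lookup(path, index, key):
    """Seek straight to the record for key instead of scanning the file."""
    with open(path, "rb") as f:
        f.seek(index[key])
        return f.readline().decode("utf-8").rstrip("\n")


def demo():
    # Hypothetical sample data, written to a temporary file.
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "wb") as f:
        f.write(b"1001,Alice,Chennai\n1002,Bob,Mumbai\n1003,Carol,Delhi\n")
    try:
        index = build_index(path)
        return index, lookup(path, index, "1002")
    finally:
        os.remove(path)


if __name__ == "__main__":
    index, record = demo()
    print(index)    # key -> byte offset from the start of the file
    print(record)   # the record found by seeking to that offset
```

A real index file would save the key-to-offset table on disk so it does not have to be rebuilt on every run, and would have to be kept up to date whenever the data file changes; that bookkeeping is where most of the effort goes.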

If I understand what it is you're trying to do, you may want to take a look at one of the relational data base management systems (DBMS); MySQL, PostgreSQL and SQLite come to mind. Your system may already have MySQL installed, possibly PostgreSQL, maybe even SQLite; if not, it may be worth your while to install them from your distribution.

SQLite, http://www.sqlite.org, may be worth a look. It pretty much runs on anything, isn't a big chore to install, and gives you a fairly easy entry into the DBMS world. You can play around with creating tables and indexes without administrative duties to complicate things and, as time goes on, you can migrate what you've learned to MySQL or PostgreSQL (which do require administrative effort); one step at a time might be the way to go.
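As a rough idea of what playing around with tables and indexes looks like, here is a small sketch using the sqlite3 module that ships with Python. The table name, column names, index name and rows are all invented for the example, and it uses an in-memory database so nothing is left on disk.

```python
import sqlite3


def demo():
    conn = sqlite3.connect(":memory:")   # throwaway database, no file created
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1001, "Alice", "Chennai"),
            (1002, "Bob", "Mumbai"),
            (1003, "Carol", "Delhi"),
            (1004, "Dev", "Mumbai"),
        ],
    )
    # The index lets SQLite find rows by city without reading every row.
    conn.execute("CREATE INDEX idx_customers_city ON customers (city)")

    rows = conn.execute(
        "SELECT id, name FROM customers WHERE city = ? ORDER BY id",
        ("Mumbai",),
    ).fetchall()

    # EXPLAIN QUERY PLAN reports how SQLite intends to run the query;
    # the last column of each row is a human-readable description.
    plan = [
        row[-1]
        for row in conn.execute(
            "EXPLAIN QUERY PLAN SELECT id, name FROM customers WHERE city = ?",
            ("Mumbai",),
        )
    ]
    conn.close()
    return rows, plan


if __name__ == "__main__":
    rows, plan = demo()
    print(rows)
    print(plan)
```

The point is that the DBMS builds and maintains the index for you; you only say which column to index and it decides when to use it.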

Once you struggle with "rolling your own" index method (been there, did that, don't want to do it again), you'll really appreciate what an RDBMS does for you. Give it some thought. Also, AWK is an interpreted language, not a compiled one, and can be slow when working with large data sets (a couple of hundred records, not bad; a couple of thousand, well...).

Hope this helps some.

matthewg42 08-25-2008 10:28 AM

There is quite a big leap from indexed files to a full relational database. If all you want to do is relate some keys to values in a large set stored in a file, you are probably looking for something like the classic Unix DBM file format or one of the related implementations like GDBM.

Using these from awk is something I have never encountered, but it is commonly done in Perl with the "tie" feature. See "perldoc -f tie" for more info.
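For comparison, the same keyed-lookup idea can be sketched with Python's standard dbm module, which opens whichever DBM-style backend is available on the system (GDBM, ndbm, or a fallback). The file name and the records below are made up; this is only an illustration of the key-to-value style, not a recommendation over the Perl approach.

```python
import dbm
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "customers")   # hypothetical database name

        # "c" opens the database for read/write, creating it if needed.
        with dbm.open(path, "c") as db:
            db[b"1001"] = b"Alice,Chennai"
            db[b"1002"] = b"Bob,Mumbai"

        # Reopen read-only and fetch one record directly by its key.
        with dbm.open(path, "r") as db:
            value = db[b"1002"].decode("utf-8")
            missing = b"9999" in db
        return value, missing


if __name__ == "__main__":
    value, missing = demo()
    print(value)
    print(missing)
```

There are no tables, columns or queries here, just keys and values stored on disk, which is much closer to a plain indexed file than a relational database is.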

