LinuxQuestions.org - Finding data in large no. of files

cooker97

07-09-2012 04:09 PM

Finding data in large no. of files

I need to find some data in a large number of files. The data is in the following format:

VALUE A   VALUE B   VALUE C   VALUE D
10        4         65        1
12        4.5       65.5      2
10.75     5.1       87        3
9.8       4         67        4

All the files have data in the same format (above). I need to write a script that copies those files that have ANY row satisfying the search criteria 10.5 < VALUE A < 11.5 && 4.5 < VALUE B < 5.5 && 80 < VALUE C < 90 to a subdirectory (which the script should also create), and that also displays VALUE D for the rows in each selected file that fulfill the criteria.
The files are in "bin/models", which has two subdirectories, "model 1" and "model 2", each of which contains 10 data files. The file names end in ".track". The new subdirectories are to be named "new_sub" and are to be created in both "model 1" and "model 2".

Thanks a TON in advance; I really need to know this one quick for a project!! Please help!

chrism01

07-09-2012 06:00 PM

Why don't you show us what you've done so far and we'll help.

cooker97

07-10-2012 12:41 AM

Actually nothing so far. I'm a complete noob, but need to write that script for a project.

jschiwal

07-10-2012 12:48 AM

Since the files are organized into fields, and contain floating point numbers, look at using awk.
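
To give a feel for why awk fits here, a minimal sketch: awk splits each line on whitespace into $1, $2, and so on, and compares numeric-looking fields as numbers, decimals included. The rows are the sample data from the first post, fed in on stdin; the variable name rows is only for illustration.

```shell
# Feed the four sample rows to awk and keep those whose first field exceeds 10.5.
rows=$(printf '%s\n' '10 4 65 1' '12 4.5 65.5 2' '10.75 5.1 87 3' '9.8 4 67 4' |
    awk '$1 > 10.5 { print $1, $4 }')   # print VALUE A and VALUE D of matching rows
echo "$rows"
```

The pattern before the braces selects rows; the action inside the braces says what to do with each selected row.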

cooker97

07-10-2012 01:51 AM

@jschiwal - yeah but how exactly? I couldn't glean much from the man pages. Could you please post a sample script?

jschiwal

07-10-2012 03:02 AM

Something like:

Code:

$1 > 10.5 && $1 < 11.5 && $2 < 5.5 && $2 > 4.5 && $3 < 90 && $3 > 80 { exit}

After exit, the END block is still executed, and the built-in variable FILENAME holds the name of the file awk was reading, so you can use it to print the name of a file with a match, or even to build a cp or mv command that puts the file into the subdirectory.
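
A rough sketch of the exit/END idea, not tested against your real files. END also runs at normal end of input, so the sketch sets a flag (called found here, an arbitrary name) to tell a match from no match. The temporary directory and sample.track are made up for illustration; the rows are the sample data from the first post.

```shell
# Hypothetical test file built from the sample data in the first post.
tmp=$(mktemp -d)
printf '%s\n' 'VALUE A VALUE B VALUE C VALUE D' \
    '10 4 65 1' '12 4.5 65.5 2' '10.75 5.1 87 3' '9.8 4 67 4' > "$tmp/sample.track"

match=$(awk '
    # First row inside all three ranges: set the flag and stop reading.
    $1 > 10.5 && $1 < 11.5 && $2 > 4.5 && $2 < 5.5 && $3 > 80 && $3 < 90 { found = 1; exit }
    # END runs after exit too; awk fills in FILENAME by itself.
    END { if (found) print FILENAME }
' "$tmp/sample.track")
echo "$match"
```

The header line does not match because its first field is text, not a number in the range.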

I didn't follow what you want to do with the 4th field. That seems to imply that you may also want a file produced containing all matching lines, in which case, don't use exit, but set a flag, and print out the 4th column. Check this flag in the END block to determine if there were any matches.
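
The flag variant could look roughly like this. It is a sketch under the same assumptions as before: hit is an arbitrary flag name, and sample.track is a made-up file holding the sample rows from the first post.

```shell
# Hypothetical test file built from the sample data in the first post.
tmp=$(mktemp -d)
printf '%s\n' 'VALUE A VALUE B VALUE C VALUE D' \
    '10 4 65 1' '12 4.5 65.5 2' '10.75 5.1 87 3' '9.8 4 67 4' > "$tmp/sample.track"

out=$(awk '
    # No exit here: every matching row prints its 4th column and sets the flag.
    $1 > 10.5 && $1 < 11.5 && $2 > 4.5 && $2 < 5.5 && $3 > 80 && $3 < 90 { hit = 1; print $4 }
    # The flag tells END whether any row matched.
    END { if (hit) print "match in " FILENAME; else print "no match in " FILENAME }
' "$tmp/sample.track")
echo "$out"
```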

Look at a user guide such as http://www.gnu.org/software/gawk/manual/gawk.html#top instead of the man page. Most distros also have a gawk doc package that supplies a book "Gawk: Effective AWK Programming".

It may be best to simply print out only matching filenames to stdout, and use this output in a bash for loop to move the files.
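
One way that could be sketched, with two changes that are my assumptions rather than requirements: it copies instead of moving (the first post asks for copies), and it reads the names with while read instead of a plain for loop, because "model 1" and "model 2" contain spaces. The directory tree is a throwaway copy built under a temporary directory, and a.track and b.track are made-up file names.

```shell
# Build a throwaway copy of the layout from the first post (hypothetical files).
root=$(mktemp -d)
mkdir -p "$root/bin/models/model 1" "$root/bin/models/model 2"
printf '%s\n' 'VALUE A VALUE B VALUE C VALUE D' '10 4 65 1' '10.75 5.1 87 3' \
    > "$root/bin/models/model 1/a.track"
printf '%s\n' 'VALUE A VALUE B VALUE C VALUE D' '12 4.5 65.5 2' '9.8 4 67 4' \
    > "$root/bin/models/model 2/b.track"

for dir in "$root/bin/models/model 1" "$root/bin/models/model 2"; do
    mkdir -p "$dir/new_sub"
    # Print each matching file name once; the seen[] guard suppresses repeats.
    awk '$1 > 10.5 && $1 < 11.5 && $2 > 4.5 && $2 < 5.5 && $3 > 80 && $3 < 90 &&
         !seen[FILENAME]++ { print FILENAME }' "$dir"/*.track |
    while IFS= read -r f; do
        cp -- "$f" "$dir/new_sub/"   # copy, leaving the original in place
    done
done
```

Here only a.track has a row inside all three ranges, so it is the only file copied.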

cooker97

07-10-2012 03:48 AM

Thanks a ton.

I think your idea of simply using the standard output in a for loop makes better sense. I followed the link, and I think it'll prove helpful.
By the way, what is the 'END block', and how do I put my filename in the FILENAME variable?

All times are GMT -5.