LinuxQuestions.org
Old 01-19-2009, 05:37 PM   #1
cgcamal
Member
 
Registered: Nov 2008
Location: Tegucigalpa
Posts: 78

Rep: Reputation: 16
Process all files in directory


Hi all,

Can somebody help me with this problem, please?

I need to extract one specific line from each file in a folder and put all the extracted lines into a single output file, in the following format:

line extracted, respective name of file, date of file.

I'm trying the part that extracts the desired line with

Code:
awk '/Category/ {getline; print}' infile.txt
But I don't know how to loop over all the files and send every extracted line
to one outfile.txt in the desired format.

Example:
If the line extracted from one specific file with

Code:
awk '/Category/ {getline; print }' infile.txt
is "District school", the desired result would be:

Code:
District school,  file date,  file name
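
To be concrete, for a single file I think the record could be built something like this (I'm guessing that ls -l is an acceptable way to get the file date):

Code:
line=`awk '/Category/ {getline; print}' infile.txt`
fdate=`ls -l infile.txt | awk '{print $6}'`
echo "$line,  $fdate,  infile.txt"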

Thanks in advance for any help
 
Old 01-19-2009, 06:39 PM   #2
raconteur
Member
 
Registered: Dec 2007
Location: Slightly left of center
Distribution: slackware
Posts: 276
Blog Entries: 2

Rep: Reputation: 44
We're usually a bit reluctant to do someone's homework for them without asking them to at least give it a try...
This does look like homework (though it certainly may not be), and it also looks like you gave it a try, so those prerequisites appear to have been met.
I'd approach this by using the find command to locate the files, and grep to get the line data.
You'll need to replace the find path, infile, outfile, and search_str for your needs.

I've done only the barest formatting of the output, and I've written this longer than it needs to be, for clarity.
There are several shortcuts you could take to get this down to only a couple of lines of code, if you wish.
Take a look at the parameters for find, for example, for some ideas in that regard.

Code:
#!/bin/bash

infile=schools.txt
outfile=/root/categories.txt
search_str="Category"

# delete the output file if it exists; the loop below appends to it
if [ -f "$outfile" ]; then
  rm -f "$outfile"
fi

for file in `find . -name "$infile" -print`; do
  # read whole lines; a for loop over the egrep output would split them into words
  egrep "$search_str" "$file" | while read -r line; do
    category=`echo "$line" | awk '{print $1}'`
    filedate=`ls -l "$file" | awk '{print $6}'`
    filetime=`ls -l "$file" | awk '{print $7}'`
    if [ -n "$category" ]; then
      echo "$category $filedate $filetime $file" >> "$outfile"
    fi
  done
done
 
Old 01-19-2009, 11:58 PM   #3
cgcamal
Member
 
Registered: Nov 2008
Location: Tegucigalpa
Posts: 78

Original Poster
Rep: Reputation: 16
Hi raconteur,

Well, looking at it again, my post does look like homework because of the example I used. Actually, I graduated from college more than 5 years ago; this is a task I want to find a faster procedure for, so I can save several hours at my job instead of doing it semi-manually, file by file. With your help I'm beginning to reach that goal.

Well, I'm not sure what's happening: I run the code and it finishes, but no output is generated.

The real "word" to find within the files is in fact a phrase, "Traffic NInbound NOutbound", and the folder is C:\Traffic Class. I replaced these in the script, but I don't know what I'm doing wrong.

In addition:

Why does the script have the line infile=schools.txt? Or was that just an example?
I mean, shouldn't the script find the search_str text in all the files in the folder, regardless of their names? Anyway, I've tested using infile=*.txt.

Please tell me where I'm going wrong.

Many thanks in advance for your help, best regards

Code:
#!/bin/bash

infile=*.txt
outfile=C:\Traffic Class\
search_str="Traffic NInbound NOutbound"

#delete the output file if it exists, this will append to it
if [ -f "$outfile" ]; then
  rm -f $outfile
fi

for file in `find . -name $infile -print`; do
  for line in `egrep $search_str $file`; do
    category=`echo $line | awk '{print $1}'`
    filedate=`ls -l $file | awk '{print $6}'`
    filetime=`ls -l $file | awk '{print $7}'`
    if [ -n "$Traffic NInbound NOutbound" ]; then
      echo "$Traffic NInbound NOutbound $filedate $filetime $file" >> $outfile
    fi
  done
done
 
Old 01-20-2009, 12:31 AM   #4
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
For wildcards in the find cmd, use single quotes, e.g.

Code:
find . -name '*.txt' -print
Note that if all the files are in the one dir, you can cd into it and just use ls instead of 'find', e.g.

Code:
cd "$targetdir"
for file in `ls`
do
  ...
done
Try reading this: http://www.tldp.org/LDP/abs/html/
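
For example, combining that quoted wildcard with a loop that also copes with spaces in file or directory names (just a sketch; the echo stands in for the real per-file commands):

Code:
find . -name '*.txt' -print | while read -r file; do
  echo "processing: $file"
done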
 
Old 01-20-2009, 09:43 AM   #5
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3941
Generally speaking, I tend to reach for Perl when doing tasks like this one. The structure of a script in that language is of course similar to the bash script presented here, but now you have a full-featured programming language at your disposal... and literally thousands of well-tested (CPAN) modules to go with it. I happen to find "bash scripts" very difficult to write, and cumbersome to maintain, perhaps because I feel rather like I've got one arm tied behind my back in terms of what I can easily "do."

The bottom line is, as the Perl folks like to say, "TMTOWTDI = There's More Than One Way To Do It." And, as you observe, the real-world benefits of even a "small and simple" script can be huge.
 
Old 01-23-2009, 12:20 PM   #6
raconteur
Member
 
Registered: Dec 2007
Location: Slightly left of center
Distribution: slackware
Posts: 276
Blog Entries: 2

Rep: Reputation: 44
I assumed from your original post that you were looking for files in multiple directories where the file name was the same.
That isn't the case, so Chris' suggestion for using find is the right way to go with this.
Let me know if you can't get it working to your liking and we'll get it fixed up.
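
Something like this might be closer to what you need (just a sketch: adjust the directory path to wherever the files really live, and I'm assuming you still want the line that follows the phrase, as in your original awk, plus the date/time fields from ls -l):

Code:
#!/bin/bash
# quoted find pattern plus a while-read loop, so that paths with spaces
# (like "Traffic Class") stay in one piece

targetdir="/path/to/Traffic Class"   # adjust this
outfile="categories.txt"
search_str="Traffic NInbound NOutbound"

> "$outfile"

find "$targetdir" -name '*.txt' -print | while read -r file; do
  # grab the line that follows the first match of the phrase
  line=`awk -v pat="$search_str" '$0 ~ pat {getline; print; exit}' "$file"`
  filedate=`ls -l "$file" | awk '{print $6, $7}'`
  if [ -n "$line" ]; then
    echo "$line, $filedate, $file" >> "$outfile"
  fi
done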
 