LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   Bash file filtering (B. Newbie) (https://www.linuxquestions.org/questions/programming-9/bash-file-filtering-b-newbie-510457/)

fopetesl 12-14-2006 12:48 PM

Bash file filtering (B. Newbie)
 
I have this code:
Code:

#!/bin/bash
for file in `ls`
 do
  if [ $file == *.dta ]
  then
#  cp -f $file /var/www/html/scandata.dta
  echo $file " is dta"
  fi
done

I have different types of files in a directory but only want to copy specific ones (there are quite a few).

I ran the script under the bash debugger, "~# bash -x ./pData", which shows why there are "too many arguments" on line 4: the code tries to match EVERY file on EVERY pass of the loop. (At the moment I just echo the name to the screen for testing instead of copying.)

How do I get Bash to filter only those files I want to copy?

(Yes, I read through a LOT of posts but nothing has quite the right answer - that I found.)

matthewg42 12-14-2006 01:31 PM

Several points:
  1. The string equality operator is = not ==
    Code:

    if [ "$file" = "myfile" ]; then
        echo "file is myfile"
    else
        echo "file is not myfile"
    fi

  2. You can't use glob patterns in a test for equality using the = operator - it is just for literal strings. You can, however, use them in a case statement (there's a combined sketch at the end of this list):
    Code:

    case "$file" in
    *.dta)
        echo "yes, $file matches our pattern"
        ;;
    *)
        echo "no, $file doesn't match our pattern"
        ;;
    esac

  3. Instead of `ls`, you can use just a file pattern:
    Code:

    for file in *.dta
    do
      ...
    done

  4. Lastly, you can specify multiple files to cp without having to use a loop at all, and use the -v switch to print verbose messages as you copy each one (assuming scandata.dta is a directory):
    Code:

    cp -v *.dta /var/www/html/scandata.dta
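
Putting points 2 and 3 together, here's a minimal sketch of the kind of loop I think you're after (assuming the same destination path as in your original script):
Code:

#!/bin/bash
for file in *.dta
do
    # if nothing matches, *.dta stays literal, so skip that case
    [ -e "$file" ] || continue
    echo "$file is dta"
    cp -f "$file" /var/www/html/scandata.dta
done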

fopetesl 12-15-2006 02:55 AM

Matthew, thanks.
Firstly, I need to process each *.dta file individually so your last option is informative but not applicable here.

Some of your syntax is confusing so I need to RTFM - I have only used C and Assembler so far.
e.g. you use the " to enclose a variable without prefix $
so I guess Bash picks it up whatever.

Your example #3 works just fine for me.;)

jschiwal 12-15-2006 03:00 AM

Since you only want to process *.dta files, using:
for file in *.dta; do
would work out better.

However, check what you want to do with the file, because you are overwriting the /var/www/html/scandata.dta for each file in the list.
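
If you do want to keep a separate copy of each one, something like this (just a sketch, reusing your destination directory) avoids the overwrite:
Code:

for file in *.dta; do
    cp -f "$file" "/var/www/html/$file"   # one destination file per source file
done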

matthewg42 12-15-2006 03:11 AM

Quote:

Originally Posted by fopetesl
Matthew, thanks.
Firstly, I need to process each *.dta file individually so your last option is informative but not applicable here.

Some of your syntax is confusing so I need to RTFM - I have only used C and Assembler so far.

The bash manual page contains everything you need to know, but the manual format isn't really ideal for such a large document. Once you get to know the rough format it's really a valuable reference though.

Shell is quite different from C and assembler. It's a lot cruder than C in many ways. As with any programming, it's just a matter of doing lots of different things and getting a feel for it.

The good thing about shell is that you can use the same commands in the terminal as you do in scripts (mostly). This makes for a very nice way to test out commands and little loops, etc.
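
For example, you can try a little loop straight at the prompt before putting it in a script:
Code:

$ for f in *.dta; do echo "found: $f"; done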

Quote:

Originally Posted by fopetesl
e.g. you use the " to enclose a variable without prefix $
so I guess Bash picks it up whatever.

Your example #3 works just fine for me.;)

The quoting rules in bash are a little bit weird, but not too bad once you're used to them. "double quotes" let the shell expand lots of things for you like $variable_values, $(sub-shell command executions) and so on. 'single quotes' don't let the shell do anything to the stuff inside the quotes - it's treated as a literal string.
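
A quick way to see the difference at the prompt (the variable name here is just made up):
Code:

name="world"
echo "hello $name"            # double quotes: prints: hello world
echo 'hello $name'            # single quotes: prints: hello $name
echo "today is $(date +%A)"   # command substitution expands inside double quotes too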

When you see something like *.dta, you should know that it's the shell expanding the pattern first, then using it (e.g. using the list in a loop, or passing the list to a command).

So when you do
Code:

ls *.dta
...the shell expands the list first and passes the list to ls. ls never sees the meta-character, *. I only mention it because for a lot of DOS veterans like myself, this was the other way round. It's a moment of epiphany for a lot of people to realise how it's working. Was for me at any rate :)
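
One way to convince yourself of this (the filenames below are made up):
Code:

$ echo ls *.dta     # echo shows the argument list that ls would actually receive
ls scan1.dta scan2.dta
$ ls '*.dta'        # quoting the pattern stops the expansion;
                    # ls now looks for a file literally named *.dta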

fopetesl 12-15-2006 03:57 AM

Now we're getting there:
Code:

#!/bin/bash
for file in *.dta
do
 cp -f "$file" /var/www/html/scandata.dta
 cd ..           # go up one dir to /var/www/html
 ./hbinterpret   # process and save computation
 cd -            # back to original directory
 echo "$file"    # the file we have just processed
done

Works great.
OK, I know that I'm overwriting 'scandata' every time but that's exactly what I need.
'hbinterpret' processes the dta info and records it. Then I can analyse later.
There are many dta files.

I find the fact that I can use command line code inside this shell script makes the job much easier. As you say Matthew, I am getting my head round it!:)

matthewg42 12-15-2006 08:19 AM

Depending on whether or not this hbinterpret program cares about the current working directory setting when it is invoked, you might be able to replace these lines
Code:

cd ..           # go up one dir to /var/www/html
./hbinterpret   # process and save computation
cd -            # back to original directory

with
Code:

../hbinterpret
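
If hbinterpret does care about the current directory, another option (just a sketch of the same idea, with the cd kept local) is to run it in a subshell:
Code:

( cd .. && ./hbinterpret )  # the cd only applies inside the parentheses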

Shell scripting is ugly and not really as full-featured a language as one might want, but it's dead useful.

For this sort of task it is really ideal. If you find yourself doing data processing in any significant way, you'd probably be better off using sed, perl, awk or something like that. Sed has quite a small instruction set, but is really powerful despite this. I discovered awk first, but then found perl did the same stuff faster, plus a whole lot more.

fopetesl 12-15-2006 08:43 AM

Matthew, thanks again.
hbinterpret accesses other files in its own directory, so it fails when it can't find them. When I run ../hbinterpret it junks out.

I'll certainly have a look at perl in the same way I'd like to look at Tcl. I don't know yet whether either or both can have their source protected, which I feel is necessary for the proprietary source. :scratch:

matthewg42 12-15-2006 09:12 AM

Tcl is quite a nice language. Really minimal syntax - you can learn to use it quite OK within a day. :)

What's really nice about TCL is expect. expect is well worth a look. It uses TCL as the basic language and adds some really cool stuff. You can automate your whole job with it :D

There are bindings for expect in other languages too, but I really like the simplicity of TCL/expect.

fopetesl 12-15-2006 09:56 AM

With a slight leap of faith I note you use Drupal on your site.
That looks good also, especially as Linux Format gives it the thumbs up:jawa:
See you there.

