LinuxQuestions.org


kmkocot 11-21-2014 12:24 AM

Need help comparing output of grep -c to a value using -gt and -lt in bash
 
Hi all,

I have a folder of .fa files. I want to get rid of files that have more than 50 or fewer than four occurrences of the greater-than symbol (">"). Here is the code I've come up with so far, but I don't think the output of grep -c is being read as a number. I feel like bc needs to come into play somehow, but I can't figure out how to make bash see this variable as a number. Any help would be greatly appreciated!

Code:

mkdir more_than_50_seqs
mkdir fewer_than_4_seqs
for FILENAME in *.fa
do
NUMBER_OF_OGs=`grep -c \> $FILENAME`
echo $NUMBER_OF_OG
if [$NUMBER_OF_OGs -gt 50]
then
        mv $FILENAME ./more_than_50_seqs/
elseif [$NUMBER_OF_OGs -lt 4]
then
        mv $FILENAME ./fewer_than_4_seqs/
else
        echo
fi
done

Thanks!
Kevin

SAbhi 11-21-2014 02:13 AM

Quote:

Code:

mkdir more_than_50_seqs
mkdir fewer_than_4_seqs   
for FILENAME in *.fa 
do
NUMBER_OF_OGs=`grep -c \> $FILENAME` 
echo $NUMBER_OF_OG
if [$NUMBER_OF_OGs -gt 50]
then
        mv $FILENAME ./more_than_50_seqs/
elseif [$NUMBER_OF_OGs -lt 4]
then
        mv $FILENAME ./fewer_than_4_seqs/
else
        echo
fi
done


# You create the directories without checking whether they already exist, which is not a wise idea.

Are you checking the count of duplicate .fa files, or counting all .fa files?
In both cases I don't see that grep -c, or the code you wrote, is of much use here; listing the files and piping the list to wc -l can give you the count of occurrences of the file.

Better to read the man page first.

Here is what you can do once you decide what you want to count:
get the count, compare it, and check whether the dir you want exists.
If it does not, make it, then move the file to a dir depending on the count.
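
The steps above could be sketched roughly as follows. This is only an illustration, not SAbhi's own code: the directory names come from the thread, while the scratch directory, sample.fa and the variable names are made up for the demonstration.

```shell
#!/bin/bash
# Sketch only: work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Option 1: test explicitly, then create.
if [ ! -d more_than_50_seqs ]; then
    mkdir more_than_50_seqs
fi

# Option 2: mkdir -p creates the directory if needed and returns
# success without complaint when it already exists.
mkdir -p fewer_than_4_seqs
mkdir -p fewer_than_4_seqs   # second call is harmless
second_call_status=$?

# Get the count, compare it, move the file (hypothetical sample file).
printf '>seq1\nACGT\n' > sample.fa
count=$(grep -c '>' sample.fa)
if [ "$count" -lt 4 ]; then
    mv sample.fa fewer_than_4_seqs/
fi
```

Either option avoids the error that a plain mkdir prints when the script is run a second time.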

grail 11-21-2014 02:22 AM

I see a few issues:

1. There is no space on either side of [ and ]. Remember that [ is a command: you would not write lsdir, you would write ls dir

2. elseif is incorrect in bash, you want to use elif

3. If you use (()) instead of [] you can perform arithmetic comparisons with familiar symbols
Code:

if (( NUMBER_OF_OGs > 50 ))

4. Personal choice, but when using non-alphanumeric characters I prefer to place quotes around strings to help protect them against shell expansion
Code:

NUMBER_OF_OGs=$(grep -c '>' "$FILENAME")

5. The backslash escape is not required inside the single quotes. With GNU grep, '\>' is even read as an end-of-word anchor rather than a literal '>', so it is better left out

6. $() is clearer than `` and can be nested easily

7. Based on your description, I am not 100% sure you are performing the correct test. You say you want files with more than 50 occurrences of '>', but the -c option for grep returns the number of lines that match your criteria, not the number of '>' that appear, i.e. a file with one line containing 51 '>' would return a count of only 1
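
To make points 1 to 3 and point 7 concrete, here is a small sketch. The sample file, its contents and the variable names are invented for the illustration, and it assumes a grep that supports -o (GNU and BSD grep both do, although it is not part of POSIX).

```shell
#!/bin/bash
sample=$(mktemp)
# One line containing three '>' characters, then one line containing one.
printf '>a >b >c\n>d\n' > "$sample"

matching_lines=$(grep -c '>' "$sample")        # lines with at least one '>'
occurrences=$(grep -o '>' "$sample" | wc -l)   # every individual '>'

# Points 1-3: spaces around [ and ], elif rather than elseif, and (( ))
# as the arithmetic alternative to [ ... -lt ... ].
if [ "$matching_lines" -gt 50 ]; then
    verdict="more than 50"
elif (( matching_lines < 4 )); then
    verdict="fewer than 4"
else
    verdict="between 4 and 50"
fi

echo "$matching_lines lines, $occurrences occurrences: $verdict"
rm -f "$sample"
```

If the file format guarantees at most one '>' per line, the two counts agree and grep -c is enough.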

kmkocot 11-21-2014 06:22 PM

Thanks guys!

Grail, thanks for noticing that I'm only counting lines with a greater-than symbol, but FYI the file format that I'm using will always have one or zero greater-than symbols per line. This code with your suggestions did the trick:

Code:

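# -p: do not complain if the directories already exist
mkdir -p more_than_50_seqs
mkdir -p fewer_than_4_seqs
for FILENAME in *.fa
do
# Count the lines containing '>' (at most one per line in this file format)
NUMBER_OF_OGs=$(grep -c '>' "$FILENAME")
echo "$NUMBER_OF_OGs"
if (( NUMBER_OF_OGs > 50 ))
then
        mv "$FILENAME" ./more_than_50_seqs/
elif (( NUMBER_OF_OGs < 4 ))
then
        mv "$FILENAME" ./fewer_than_4_seqs/
else
        # Quoted so that > and < could never be read as redirections
        echo "$FILENAME has between 4 and 50 sequences"
fi
done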
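
A side note on echoing a message that contains > or <: left unquoted, the shell treats them as redirections rather than text. The sketch below illustrates this in a scratch directory; the file name sample.fa and the messages are only examples.

```shell
#!/bin/bash
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Unquoted: the shell reads ">4" as "send the output to a file named 4",
# so nothing reaches the terminal and a file called 4 appears.
echo sample.fa has >4 sequences

# Quoted: the whole message is passed to echo as ordinary text.
message=$(echo "sample.fa has >4 but <50 sequences")
echo "$message"
```

An unquoted <50 would additionally make the shell try to read from a file named 50 and fail if that file does not exist.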

Thanks!
Kev

nbritton 11-21-2014 10:45 PM

Code:

#!/bin/bash

# Create the target directories only if they do not already exist.
if ! test -d "./more_than_50_seqs"; then
    mkdir ./more_than_50_seqs
fi

if ! test -d "./less_than_4_seqs"; then
    mkdir ./less_than_4_seqs
fi

for i in *.fa; do
    # Run grep once per file and reuse the count in both tests.
    count=$(grep -c ">" "$i")
    if [[ $count -gt 50 ]]; then
        mv "$i" ./more_than_50_seqs/
    elif [[ $count -lt 4 ]]; then
        mv "$i" ./less_than_4_seqs/
    fi
done


nbritton 11-21-2014 10:54 PM

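You shouldn't use upper case variable names. By convention they are used for environment and shell variables, so lower case names avoid accidental clashes.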

http://wiki.bash-hackers.org/scripti...variable_names
http://wiki.bash-hackers.org/scripting/style
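
As a hedged illustration of the kind of clash meant here (the directory path and variable names are made up, and the linked pages may well use a different example): PATH is the shell's command search path, so reusing that name for your own data breaks the lookup of external commands.

```shell
#!/bin/bash
# Lower-case name: no clash with anything the shell itself uses.
path="/nonexistent/data/dir"

# Upper-case name: overwriting PATH means external commands such as ls
# can no longer be found. Done inside a subshell here so the damage
# does not leak into the rest of the script.
outcome=$(
    PATH="/nonexistent/data/dir"
    if ls >/dev/null 2>&1; then
        echo "ls still found"
    else
        echo "ls not found"
    fi
)
echo "$outcome"
```

Names like FILENAME are not special to bash, so the scripts in this thread work; the lower-case habit simply removes the risk.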


All times are GMT -5.