LinuxQuestions.org

psynce_friction 11-18-2009 07:36 AM

shell script question
 
hello all

I have a tiny script which I am trying to get working. It takes a list of file names, e.g. XXXXXXXXXXXXXX_right, XXXXXXXXXXXXXX_left.

The script reads through the list, strips the _left and _right, identifies the unique names, and prints a list of the unique names with the _left, _right reattached.
To do this I have

for i in $(cat input.file|sed 's/_[a-z]*//' |sort |uniq -c |grep -v '2 ');
do grep $i input.file ;done

This should produce a list of approx. 14600 file names; however, what I actually get is a list of 223512606 file names.

Can anyone see where I have gone wrong?

Thanks in advance

indiajoe 11-18-2009 08:43 AM

Hi
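Hint: You ran the loop roughly 14600 times, and each pass printed most of the 14600 file names. The number of names you got is roughly the square of the number of files.
The problem is in how the output of the pipeline is split into words before the loop sees it.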
-Cheers
indiajoe
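A minimal sketch of the word splitting behind the hint, assuming a POSIX-style shell. The helper name count_words is made up for illustration, and the sample line imitates what uniq -c prints (padding, count, name). The loop sees two words per line, so one of the grep calls searches for the count "1", which matches every file name containing that digit:

```shell
# Count how many words a `for` loop sees in some command output.
# count_words is a hypothetical helper, not part of the thread.
count_words() {
    n=0
    # Unquoted $(...) is split on whitespace (default IFS), as in the original loop.
    for word in $(printf '%s\n' "$1"); do
        n=$((n + 1))
    done
    echo "$n"
}

# One line in the shape that uniq -c prints: padding, count, name.
count_words '      1 FE4758201C23O2'
```

With the default IFS this reports two words for the sample line, and one word once the count has been removed.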

Disillusionist 11-18-2009 08:49 AM

Have you thought about using a while loop?

Code:

sed 's/_[a-z]*//' input.file | sort | uniq -c | grep -v '2 ' | while read -r count name
do
 # uniq -c prefixes each name with its count; search for the name only
 grep -- "$name" input.file
done


indiajoe 11-18-2009 09:00 AM

Hi
Just check whether the following script will work.
Code:

IFSOLD=$IFS
IFS=$'\n'   # a real newline; unquoted \n would set IFS to the letter n
for i in $(cat input.file|sed 's/_[a-z]*//' |sort |uniq -c |grep -v '2 '| cut -b 9-); do grep "$i" input.file ;done
IFS=$IFSOLD

Replace 9- with the number of characters you need to remove so that only the file name is left, without the leading whitespace or count.
Could somebody please point out a neater way of doing the above step without all the piping? There should be a better way.
-Cheers
indiajoe
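A note on setting IFS to a newline, sketched under the assumption of a POSIX-style shell. Written unquoted as \n, the backslash only escapes the n, so IFS becomes the letter n and newline-separated output is no longer split into lines. If the names contain no lowercase n, the whole list can reach grep as one huge argument, which is one plausible way to hit "argument list too long". The helper split_count and the two sample names are illustrative only:

```shell
# Unquoted, \n is just an escaped letter n.
ifs_unquoted=\n
printf 'unquoted value: [%s]\n' "$ifs_unquoted"

# split_count is a hypothetical helper: $1 = IFS value, $2 = text to split.
# It prints how many words a for loop sees after splitting with that IFS.
split_count() {
    old_ifs=$IFS
    IFS=$1
    n=0
    for w in $(printf '%s\n' "$2"); do
        n=$((n + 1))
    done
    IFS=$old_ifs
    echo "$n"
}

names='FE4758201C23O2
FE4758201CV20O'

# IFS set to the letter n: nothing to split on, one giant word.
split_count n "$names"

# IFS set to a literal newline (portable spelling): one word per line.
split_count '
' "$names"
```

In bash, zsh and ksh the newline can also be written as $'\n'; the quoted literal newline used here works in any POSIX shell.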

psynce_friction 11-18-2009 09:18 AM

Hi indiajoe

I ran your script and it says
zsh: argument list too long: grep

To do some error checking I ran

head step2b.out|sed 's/_[a-z]*//' |sort |uniq -c |sed 's/\t1//' |grep -v '2 '

which produces the output

1 FE4758201C23O2
1 FE4758201CV20O
1 FE4758201D3YER
1 FE4758201DCUH9
1 FE4758201DN2UI
1 FE4758201EFEIT
1 FE4758201EKY4O
1 FE4758201ENBE3
1 FE4758201EQESX
1 FE4758201EUG4B

Is the space and the 1 before the file name causing a problem? I have tried to add another grep command to remove it but can't get it to work.

psynce_friction 11-18-2009 09:28 AM

Yep, that was it. Thanks all, much appreciated.

indiajoe 11-18-2009 09:48 AM

Hi
Sorry for the mistake.
I have corrected it. Please post back if you get a better method.
Thanks
indiajoe

ghostdog74 11-18-2009 10:12 AM

What does your input file look like? What should your output be?

Disillusionist 11-18-2009 12:32 PM

Use sed to strip the count:

Code:

sed 's/^ *[0-9]* //'
EDIT:-

Or use the -u option on sort, which leaves no count to strip.
Code:

IFSOLD=$IFS
IFS=$'\n'
for i in $(cat input.file|sed 's/_[a-z]*//' |sort -u); do grep "$i" input.file ;done
IFS=$IFSOLD


psynce_friction 11-20-2009 09:27 AM

As I mentioned, there was some space and a 1 followed by a space before the file name, i.e. -----1-filename, and the script was being run for each word in the line. I included sed 's/ 1 //' to get rid of this and it worked fine.

for i in $(cat file.in|sed 's/_[a-z]*//' |sort |uniq -c |sed 's/ 1 //' |grep -v '2 '); do grep $i file.in ;done

cheers

Brian
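As a hedged alternative to chaining sed and grep on the uniq -c output, awk can test the count field and print only the name in one step. The function name unpaired_names and the sample file contents are assumptions for illustration. The sed pattern is anchored with $ so that only a trailing suffix is stripped:

```shell
# unpaired_names is a hypothetical helper: $1 = file of names ending in _left/_right.
# Prints each base name that occurs exactly once.
unpaired_names() {
    sed 's/_[a-z]*$//' "$1" | sort | uniq -c | awk '$1 == 1 { print $2 }'
}

# Small made-up sample: A1 has both sides, B2 and C3 have only one.
sample=$(mktemp)
printf '%s\n' A1_left A1_right B2_left C3_right > "$sample"
unpaired_names "$sample"
rm -f "$sample"
```

Because awk splits on any run of whitespace, the padding in front of the count does not matter, and the output can be fed to a for loop without extra cleanup.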

Disillusionist 11-21-2009 05:26 AM

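The spaces, followed by the number and then a space, are there because you are using uniq -c.

If you removed the uniq -c and used sort -u, you would not then need to change the output. Note that sort -u lists every distinct name, paired or not; to keep only the names that occur once, use uniq -u instead.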
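One possible way to avoid both the count and the loop, sketched on the assumption that the goal is to list only the files whose partner is missing. uniq -u prints the lines that occur exactly once, and grep -F -f reads all of them as fixed-string patterns in a single pass. The function name unpaired_files and the sample names are made up for illustration:

```shell
# unpaired_files is a hypothetical helper: $1 = file of names ending in _left/_right.
# Prints the full names (suffix included) whose base name occurs only once.
unpaired_files() {
    list=$1
    patterns=$(mktemp) || return 1
    # Strip the trailing suffix, then keep base names that appear exactly once.
    sed 's/_[a-z]*$//' "$list" | sort | uniq -u > "$patterns"
    # One grep call with every base name as a fixed-string pattern.
    grep -F -f "$patterns" "$list"
    rm -f "$patterns"
}

# Small made-up sample: A1 has both sides, B2 and C3 have only one.
sample=$(mktemp)
printf '%s\n' A1_left A1_right B2_left C3_right > "$sample"
unpaired_files "$sample"
rm -f "$sample"
```

grep -F matches substrings, so this relies on no base name being contained in another, which holds when all names have the same length as in the sample output earlier in the thread.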


All times are GMT -5.