LinuxQuestions.org


mebaro 06-14-2006 04:12 PM

possible coreutils bug (tr)
 
I have a really strange occurrence. I'm trying to strip the leading characters from a list of filenames in a bash script.

example:

filenames are:

test-1
test-2
test-3
test-4

My script uses the following syntax:

#!/bin/bash
Y1=test-*

for Z in $Y1
do

X1=$( echo $Z | tr --delete 'test-' )

echo $Z
echo $X1

done

exit

The output I get is:

test-1
1
test-2
2
test-3

test-4
4


I have created test names up to 35, and in every case the 3 gets stripped from the entire string when using 'tr'.

Has anyone else encountered this? Is there a fix?

I'm running RedHat ES3 and ES4 and get the same behavior on both machines. I have coreutils 4.5.3-28 on one system and 4.5.3-25 on the other system. RH has version 4.5.3-28 as the latest version.

I also have an AIX box and I wrote a script with compatible syntax and it does not strip the 3 in AIX.

If anyone can suggest an alternate way of trimming the filename I would greatly appreciate it.

:)

MensaWater 06-14-2006 04:48 PM

Interesting. I created the 4 files you list, then ran your script and did NOT have the issue you saw. This was on my RH AS 3 server. Its coreutils rpm is: coreutils-4.5.3-26. You may want to check the version of coreutils you have.

The fact that you have a "-" made me think it might be misinterpreting the -3 as a flag, but on testing I see nothing that uses this in tr or test (test, by the way, is a built-in command).

Anyway, here are a couple of ways you can do it.

With awk: ls test-* |awk -F- '{print $2}'
Tells it to use the dash as the delimiter, then prints the second field, i.e. the text after the first dash (up to the next dash, if there is one). Good for variable-length prefixes (e.g. it would work for test*-* instead of just test-*).

With cut: ls test-* |cut -c6-
Tells it to print everything from the 6th position on. Good for fixed length names such as the one you have. test- = 5 positions so the 6th position would follow the pattern you want to exclude.
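To see where the two commands differ, here is a small sketch that feeds a few made-up names through each one. printf stands in for ls so no files need to exist; the names test-13 and test-3-old are hypothetical and only there to show the edge cases.

```shell
#!/bin/bash
# Hypothetical sample names; printf stands in for `ls test-*`.
names() { printf '%s\n' test-3 test-13 test-3-old; }

# awk: prints only the second dash-delimited field.
names | awk -F- '{print $2}'   # prints 3, 13, 3

# cut: prints everything from character 6 onward.
names | cut -c6-               # prints 3, 13, 3-old
```

The two agree as long as the part after the prefix contains no further dash; with test-3-old, awk keeps only the 3 while cut keeps 3-old.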

Note: You imply you're recreating these files. If you're not it may simply be you have a hidden character in the test-3 file name. Doing ls test-* finds the file so long as the hidden character is after the dash. Try typing "ls test-3" - if it doesn't find the file then you know it has a hidden character in its name. If it does find it then you know it doesn't.
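Another way to check for a hidden character, sketched here under the assumption that od is available (it is part of coreutils), is to dump each matching name byte by byte. The helper name show_bytes is made up for this example.

```shell
#!/bin/bash
# Print a name one byte at a time; od -c shows non-printing
# characters as escapes (e.g. \r, \t) or octal codes.
show_bytes() {
    printf '%s' "$1" | od -c
}

# Dump every name matched by the same pattern the script uses.
for f in test-*
do
    show_bytes "$f"
done
```

A clean name shows only the characters t e s t - 3; anything else in the dump (a trailing space, a \r, an octal code) is the hidden character.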

zhangmaike 06-14-2006 05:16 PM

My tr is version 5.2.1, and does not exhibit this problem.

I tried:
Code:

echo "test-3" | tr --delete 'test-'
and the result was 3.

A bit of warning: tr will delete all characters within the character set 'test-', not simply all occurrences of the string 'test-'. So the result of

Code:

echo "t-t-e-e-s-s-t-t-3-testtest" | tr --delete 'test-'
is also 3. (And, because the argument to tr is a character set rather than a string, the second t is redundant.)
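To make the set-versus-string difference concrete, here is a short sketch comparing tr with a sed substitution anchored at the start of the line. The name test-latest and the two function names are made up for illustration; sed is only one possible way to remove a literal prefix.

```shell
#!/bin/bash
# tr treats its argument as a set of characters and deletes
# every one of them, wherever it appears in the input.
delete_chars() { printf '%s\n' "$1" | tr -d 'test-'; }

# sed removes the literal string 'test-', and only at the start.
delete_prefix() { printf '%s\n' "$1" | sed 's/^test-//'; }

delete_chars  "test-latest"   # prints "la"
delete_prefix "test-latest"   # prints "latest"
```

With tr, the t, e and s in "latest" are deleted as well, which is harmless for purely numeric suffixes but not for names in general.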

Anyway, the fact that each of these filenames has already been matched against test-* by the for loop means that whatever program you use to remove the leading 5 characters does not need to retest for the presence of the prefix (its presence is already guaranteed by the pattern matching). Piping to:

Code:

cut -c6-
should be simple enough. Just don't forget the last dash after the 6.

If the prefix might change in the future, you could always have bash calculate the new cut location by checking the length of the prefix and adding 1:
Code:

#!/bin/bash

prefix="removeme-"
cut_location=$(( ${#prefix}+1 ))

for f in "$prefix"*
do
    echo "$f" | cut -c$cut_location-
done
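For comparison, bash can also strip a prefix by itself with parameter expansion, without piping to cut at all. This is only a sketch and was not proposed by anyone in the thread; it reuses the example prefix from the script above, and the function name strip_prefix is made up.

```shell
#!/bin/bash
# Assumption: same example prefix as the script above.
prefix="removeme-"

# ${name#pattern} removes the shortest match of pattern from the
# start of name; quoting "$prefix" makes it match literally.
strip_prefix() {
    local name=$1
    printf '%s\n' "${name#"$prefix"}"
}

for f in "$prefix"*
do
    strip_prefix "$f"
done
```

Because the expansion happens inside the shell, no external program is started for each filename, and nothing needs recalculating when the prefix changes.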


mebaro 06-15-2006 10:04 AM

Thank you all for the suggestions. I have coreutils-4.5.3-26 also, so I don't know why the 3 is getting stripped. I created test files all the way to -35 and every output with a 3 in it had the 3 stripped out.

I tried using awk with the field option, but I got the syntax wrong; thanks for correcting me on that. I also didn't find the cut command in my Linux scripting book.

Anyway, thanks again for the suggestions. I'll try both options.

...max


All times are GMT -5.