bash script to increment a filename

jpv50 · 06-29-2015, 09:08 AM

I have a file name in a bash variable. The file name ends in a (possibly 0 padded) decimal number, and I want to generate the file name with the next higher number. I've produced a bash script that seems to work, but it seems like a lot of code for such an apparently simple task. Does anyone know of a better/shorter way?

sample script:

LOC=NET-00.243503.078899

echo filename before .... $LOC

NUM=""
while true
do
case $LOC in
*[0-9] ) NUM="${LOC: -1}$NUM" ; LOC=${LOC%%[0-9]} ;;
* ) break;;
esac
done

DIGITS=${#NUM}
NUM=$((10#$NUM + 1))

LOC=`printf "%s%0${DIGITS}d" $LOC $NUM`

echo filename after ..... $LOC

sample output:

filename before .... NET-00.243503.078899
filename after ..... NET-00.243503.078900

millgates · 06-29-2015, 02:20 PM

Hi, how about

Code:

fname=NET-00.243503.078899

echo "before: $fname"

num=${fname##*[^0-9]}
printf -vnewname "s%0${#num}d" "${fname%$num}" "$((10#$num+1))"


echo " after: $newname"

jpv50 · 06-30-2015, 02:38 AM

Yes, that's much better! There is a small typo in your code, though: you left out a "%". It should look like this:

fname=NET-00.243503.078899

echo "before: $fname"

num=${fname##*[^0-9]}
printf -vnewname "%s%0${#num}d" "${fname%$num}" "$((10#$num+1))"

echo " after: $newname"

Anyway thanks very much, very helpful!
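
For anyone unfamiliar with the expansions used above, here is a commented sketch of the same approach wrapped in a function. The function name next_name and the way it treats a name with no trailing number are assumptions made for illustration; neither comes from the posts above.

```bash
#!/usr/bin/env bash

# next_name: hypothetical wrapper around the corrected snippet.
next_name() {
    local fname=$1 num newname

    # Remove the longest prefix that ends in a non-digit,
    # leaving only the trailing run of digits (possibly empty).
    num=${fname##*[^0-9]}

    # Assumption: a name with no trailing number is returned unchanged.
    if [[ -z $num ]]; then
        printf '%s\n' "$fname"
        return 1
    fi

    # ${fname%"$num"} strips the digits from the end of the name.
    # ${#num} is the digit count, reused as the zero-padded width.
    # 10# forces base 10 so a leading zero is not read as octal.
    printf -v newname "%s%0${#num}d" "${fname%"$num"}" "$((10#$num + 1))"
    printf '%s\n' "$newname"
}

next_name NET-00.243503.078899
next_name foo07899
```

When the number overflows its width (for example a99), the result simply gains a digit (a100), because the width given to printf is a minimum, not a maximum.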

pan64 · 06-30-2015, 02:54 AM

echo NET-00.243503.078899 | awk -F. ' BEGIN { OFS="."} $NF++'

danielbmartin · 06-30-2015, 06:36 AM

Quote:

Originally Posted by pan64

echo NET-00.243503.078899 | awk -F. ' BEGIN { OFS="."} $NF++'

This wonderfully concise solution has a blemish.

If before = NET-00.243503.078899 then after should = NET-00.243503.078900

Your awk produces this: NET-00.243503.78900
Note that a zero has been lost.

Daniel B. Martin

pan64 · 06-30-2015, 06:42 AM

Oh, yes, that needs fine-tuning. Use printf to format the output; you can either compute the width the same way as above or use a fixed width.
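
One possible reading of this suggestion is sketched below: take the width from the original last field and format the incremented value with sprintf. This is a guess at what was meant, not code from the post, and the function name next_dotted is made up. It still assumes the number is preceded by a dot.

```bash
# next_dotted: hypothetical helper; handles dot-separated names only.
next_dotted() {
    printf '%s\n' "$1" | awk -F. '
        BEGIN { OFS = "." }
        {
            # Build the format from the original width, e.g. "%06d".
            fmt = "%0" length($NF) "d"
            $NF = sprintf(fmt, $NF + 1)
            print
        }'
}

next_dotted NET-00.243503.078899
```

Assigning to $NF is safe here only because the input and output separators are both a dot, so rebuilding the record does not change anything else.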

millgates · 06-30-2015, 07:29 AM

Also, it will only work if the last non-digit in the filename is a dot. It will not work with filenames such as "foo07899". The example in the original post does have a '.' before the number, but the original post does not state it is the case for all the intended input.

jpv50 · 06-30-2015, 10:51 AM

The solution provided by @millgates answers my application. The solution provided by @pan64 is very concise, but it doesn't quite work right: it doesn't keep the zero-padding, and it depends on the last character before the decimal number being a dot. If you could fix both of those with a one-liner I'd be very impressed!

millgates · 06-30-2015, 02:53 PM

A slightly longer one-liner (can probably be made shorter):

Code:

echo NET-00.243503.078899 | awk -F'[^0-9]+' 'sub($NF"$",sprintf("%0*d",length($NF),$NF+1))'

jpv50 · 06-30-2015, 04:22 PM

OK, I'm very impressed!

I wonder if you could explain how that works, I don't really know awk. Thanks.

millgates · 06-30-2015, 06:55 PM

OK, let's start with a slightly longer awk:

Code:

awk -F'[^0-9]+' '{
  s = sprintf("%0*d", length($NF), $NF+1);
  sub($NF"$",s,$0);
  print $0
}'

1/ I launch awk with the -F switch, which sets the field separator to the regex [^0-9]+, so that the last field will be the number at the end of the string.

2/ The last argument is the actual awk program. An awk program is basically a chain of rules in the form

Code:

CONDITION { CODE }

For each record read, awk goes through all the rules and, if the CONDITION is true, executes the corresponding CODE. My program has only one rule (a rule with no CONDITION specified is always executed).

3/

Code:

s = sprintf("%0*d", length($NF), $NF+1);

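is pretty self-explanatory: it formats $NF+1 as a decimal number, zero-padded to the width of the original number. The * in %0*d takes the field width from the next argument, here length($NF). $NF is the last field of the current record.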

4/

Code:

sub($NF"$",s,$0);

is a regex substitution, where $NF"$", the number with an "end of string" anchor is the pattern, s is the replacement string and $0 (the entire record read) is the target string.

5/

Code:

print $0

prints the current record to stdout.
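
Steps 1/ and 3/ can be checked by printing the intermediate values. The following is only a sketch: the function names show_fields and padded_next are made up, and it assumes an awk that accepts a * width in sprintf (gawk does).

```bash
# show_fields: print each field awk sees when splitting on [^0-9]+
show_fields() {
    printf '%s\n' "$1" | awk -F'[^0-9]+' '{
        for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i
    }'
}

# padded_next: the sprintf call on its own. The "*" takes the width
# from the next argument, here the length of the last field.
padded_next() {
    printf '%s\n' "$1" | awk -F'[^0-9]+' '{ print sprintf("%0*d", length($NF), $NF + 1) }'
}

show_fields NET-00.243503.078899
padded_next NET-00.243503.078899
```

For the sample name the fields come out as an empty string, 00, 243503 and 078899. The first field is empty because the name starts with a separator ("NET-"). padded_next prints 078900.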

Now there are a few things to note:

i) I cannot assign directly to $NF, because that forces $0 to be reassembled using the current OFS (output field separator), which means losing all non-numeric characters.

ii) I can get rid of the extra assignment to s and combine the first two commands together. Also, since $0 is the default target string for sub, that argument can be omitted:

Code:

awk -F'[^0-9]+' '{
  sub($NF"$",sprintf("%0*d", length($NF), $NF+1));
  print $0
}'

iii) The CONDITION may be practically any expression that can be converted to a boolean value.

iv) The sub function returns the number of replacements done. In this case, because $NF is always a substring of $0, it is the last field, and every string has exactly one end, it will always return 1 (a true value).

v) Using iii) and iv) we can put the sub statement out of the braces and use it as a CONDITION:

Code:

awk -F'[^0-9]+' 'sub($NF"$",sprintf("%0*d", length($NF), $NF+1)) { print $0 }'

vi) Because { print $0; } is the default action, it can be omitted:

Code:

awk -F'[^0-9]+' 'sub($NF"$",sprintf("%0*d",length($NF),$NF+1))'

Actually, the '+' in the field separator is not required and may be left out as well. awk will then split the filename into more fields, but in the end, the result will be the same.
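
Note i) and the remark about the '+' can both be tried out directly. A small sketch, with made-up function names:

```bash
# assign_last: assign to $NF directly, as note i) warns against.
# awk rebuilds the record with the default OFS (a single space).
assign_last() {
    printf '%s\n' "$1" | awk -F'[^0-9]+' '{ $NF = $NF + 1; print }'
}

# next_without_plus: the final one-liner with the "+" dropped
# from the field separator.
next_without_plus() {
    printf '%s\n' "$1" | awk -F'[^0-9]' 'sub($NF"$",sprintf("%0*d",length($NF),$NF+1))'
}

assign_last NET-00.243503.078899         # separators replaced by spaces
next_without_plus NET-00.243503.078899   # same result as with the "+"
```

The first call prints " 00 243503 78900": the letters, the dash, the dots and the zero-padding are all gone, and the leading space comes from the empty first field. The second call prints NET-00.243503.078900, the same as the version with the '+', because only the last field matters and it is the same either way.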

danielbmartin · 06-30-2015, 08:42 PM

Quote:

Originally Posted by millgates

Code:

awk -F'[^0-9]+' 'sub($NF"$",sprintf("%0*d", length($NF), $NF+1)) { print $0 }'
                        ^^^
                        |||

Is that "$" necessary? I omitted it and got the same result.

Daniel B. Martin

millgates · 07-01-2015, 12:52 AM

Quote:

Originally Posted by danielbmartin

Is that "$" necessary? I omitted it and got the same result.

The "$" will make a difference whenever the $NF can be matched anywhere else in the string, e.g., when the filename does not end with a number:

Code:

# with
awk -F'[^0-9]+' 'sub($NF"$",sprintf("%0*d", length($NF), $NF+1)) { print $0 }' <<< foo
foo1

# without
awk -F'[^0-9]+' 'sub($NF,sprintf("%0*d", length($NF), $NF+1)) { print $0 }' <<< foo
1foo

or the filename contains two identical sequences of digits:

Code:

# with
awk -F'[^0-9]+' 'sub($NF"$",sprintf("%0*d", length($NF), $NF+1)) { print $0 }' <<< foo123bar123
foo123bar124

# without
awk -F'[^0-9]+' 'sub($NF,sprintf("%0*d", length($NF), $NF+1)) { print $0 }' <<< foo123bar123
foo124bar123

jpv50 · 07-01-2015, 06:25 PM

That's great. Thanks a lot!