LinuxQuestions.org
Old 02-03-2013, 03:05 PM   #1
jroggow
Member
 
Registered: Mar 2006
Distribution: Slackware
Posts: 33

Rep: Reputation: 15
Zero Padded Files


I've written a script (Bash 4.2.37) to convert a whole heap of audio files to a common format, and move them to a particular directory. I've done some limited testing and so far it works like a champ, but a couple of finer points have me scratching my head.

Specifically, this snippet:

Code:
#find greatest increment amongst files
files=(~/music/*)
greatest=$(basename ${files[-1]%.*})
((i=$greatest+3)) #Does arithmetic expression see -1 in array and evaluate mathematically?  Add three to increment by one. Weird.  Why is it subtracting twice?
#Is there an alternative?  Seems messy.  It's bugging me.
#Actually, this shouldn't work.  What is eating my leading zeros? #Aren't zero padded numbers considered octal by Bash?
echo "$greatest"
echo "$i"
My stream-of-consciousness comments include my questions.

When I first ran the script, $greatest had a value of 0008 and $i a value of 9. The second go, $greatest was 0012, $i 13.

Which is as it should be. Except not if Bash treats numbers with leading zeros as octal.

The $greatest+3 bit seems to work just fine, but it irks me. I chased my own tail for an age trying to make something like (($greatest++)) work. It didn't.
 
Old 02-03-2013, 03:12 PM   #2
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,154

Rep: Reputation: 333
Well, you could remove the leading zeros with the ${greatest##0} couldn't you? That might help.
 
Old 02-03-2013, 03:23 PM   #3
jroggow
Member
 
Registered: Mar 2006
Distribution: Slackware
Posts: 33

Original Poster
Rep: Reputation: 15
That's the weird thing. I didn't remove the leading zeros, but they're gone when I use $i in my script.

Maybe I should clear the directory and start at 0. It was only a circumstance that 0008 was the largest number in the directory. Did Bash automatically determine that I wasn't using octal because 0008 would be invalid?
 
Old 02-03-2013, 04:20 PM   #4
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,154

Rep: Reputation: 333
I just took a quick look at pinfo bash, and was reminded that you can explicitly state your number base by prefixing the number with base#, so, for example:
Code:
$ echo $((10#017)) " != " $((017)) 
17  !=  15
As to the leading zeros being gone, the arithmetic evaluation converts the number to an integer, and bash prints integers without leading zeros. If you want, say, a four digit number with leading zeros, try something like this:
Code:
$ for ((i=0;i<21;i=i+3)); do j="0000"${i};echo ${i} "=" ${j: -4};done
0 = 0000
3 = 0003
6 = 0006
9 = 0009
12 = 0012
15 = 0015
18 = 0018
Note that the space after the : in ${j: -4} is needed, since :- has another use.
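To make the difference between the two forms concrete, here is a small sketch. The variable names j, unset_var, last4, whole and fallback are made up for illustration only.

```shell
j="000012"
unset_var=""

last4=${j: -4}            # substring form: the last four characters of j
whole=${j:-4}             # default-value form: j is set, so its full value is used
fallback=${unset_var:-4}  # default-value form: variable is empty, so the default 4 is used

echo "$last4"     # 0012
echo "$whole"     # 000012
echo "$fallback"  # 4
```

Without the space, Bash reads :- as "use a default if unset or empty" rather than as a negative offset.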

<edit>
I forgot the printf command!
Code:
$ for ((i=0;i<=21;i=i+3)); do printf "%2d = %04d\n" ${i} ${i};done 
 0 = 0000
 3 = 0003
 6 = 0006
 9 = 0009
12 = 0012
15 = 0015
18 = 0018
21 = 0021
</edit>

Last edited by PTrenholme; 02-03-2013 at 04:44 PM.
 
Old 02-03-2013, 04:51 PM   #5
jroggow
Member
 
Registered: Mar 2006
Distribution: Slackware
Posts: 33

Original Poster
Rep: Reputation: 15
That makes sense. Have I stumbled across a convenient shortcut or will this bite me in the ass?

Counting on my toes, 0012 should convert to 10 rather than 13. Since my script is returning 13 (the value I expect) for $i, it's making a sort of literal conversion rather than a mathematical conversion.

If I don't have to explicitly strip zero padding from the filenames, I will opt for the lazy solution and leave it as it stands.
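One possible explanation for the 0012 case, sketched below on the assumption that Bash really is reading the value as octal: 0012 in octal is 10 in decimal, so adding 3 gives 13, which happens to look like "12 plus 1". The variable names as_octal, plus_three and base_ten are hypothetical.

```shell
greatest=0012

as_octal=$(( $greatest ))          # leading zero: read as octal, i.e. decimal 10
plus_three=$(( $greatest + 3 ))    # 10 + 3
base_ten=$(( 10#$greatest + 1 ))   # forced base 10: 12 + 1

echo "$as_octal"    # 10
echo "$plus_three"  # 13
echo "$base_ten"    # 13
```

If that is what is happening, the +3 only appears to work for this particular value, and a name containing an 8 or 9 would not evaluate at all.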
 
Old 02-03-2013, 05:38 PM   #6
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,541

Rep: Reputation: 878
This doesn't work for me:

Code:
~/tmp/music$ ls
0001.music  0002.music  0003.music  0004.music  0005.music  0006.music  0007.music  0008.music
~/tmp/music$ files=(~/tmp/music/*)
~/tmp/music$ echo ${files[-1]}
/home/npostavs/tmp/music/0008.music
~/tmp/music$ echo $(basename ${files[-1]})
0008.music
~/tmp/music$ echo $(basename ${files[-1]%.*})
0008
~/tmp/music$ greatest=$(basename ${files[-1]%.*})
~/tmp/music$ ((i=$greatest+3))
bash: ((: i=0008: value too great for base (error token is "0008")
Do you have some strange characters in your filenames? locale setting?
 
Old 02-03-2013, 06:16 PM   #7
jroggow
Member
 
Registered: Mar 2006
Distribution: Slackware
Posts: 33

Original Poster
Rep: Reputation: 15
No, ntubski. That's why I don't particularly understand it.

Here's the code that names my files:

Code:
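for file in "${files[@]}"
do
	name="$i.mp3"
	num=$(expr length "${name%.*}")
	pad="" #Reset so padding from a previous iteration is not reused.
	#Zero-padding sorts file numerically.
	if ((num < 4)) #Four seems reasonable.  I don't anticipate having more than 10,000 files.
	then
		#Emit (4 - num) zeros so the base name ends up four digits wide.
		pad=$(head -c $((4 - num)) /dev/zero | tr '\0' '0')
	fi
	mv "$file" ~/music/"$pad$name"
	i=$(( i+1 ))
done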
There are no special characters or anything that I can identify. ls in my music directory shows 0000.mp3 0001.mp3 &c. . .

That's why I think it shouldn't work. My understanding is that it should have broken the first time I ran the script when I had 0008 files (which I put there manually).
 
Old 02-03-2013, 06:57 PM   #8
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,541

Rep: Reputation: 878
Quote:
Originally Posted by jroggow
Here's the code that names my files:
The padding calculation doesn't look right: wouldn't you get 01.mp3, 02.mp3, ..., 0010.mp3, 0011.mp3, ..., 000100.mp3, 000101.mp3, ... from that? Also, you don't need $ on variables within arithmetic expressions: i=$((i+1)) or just ((i++)).
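A minimal sketch of the two increment forms mentioned above, using a plain value of 7 so that octal parsing does not get in the way. It probably also explains why (($greatest++)) failed: the shell expands $greatest to its value first, so ++ is applied to a literal number rather than to a variable.

```shell
i=7

((i++))       # bare variable name: i is incremented in place
echo "$i"     # 8

i=$((i+1))    # equivalent assignment form
echo "$i"     # 9

# By contrast, (( $i++ )) would be expanded to (( 9++ )) before evaluation,
# which is not a valid increment because 9 is not a variable.
```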


Back to the original question, how about you isolate the problem: make a new script in an empty directory:
Code:
#!/bin/bash

# run this in a new directory

touch 000{1..8}.mp3

files=(*.mp3)
greatest=$(basename ${files[-1]%.*})
((i=$greatest+3))
echo "i = $i, greatest = $greatest"
My output for this is:
Code:
./next-name.bash: line 9: ((: i=0008: value too great for base (error token is "0008")
i = , greatest = 0008
What's your output?

Last edited by ntubski; 02-03-2013 at 10:00 PM. Reason: padding is fine
 
Old 02-03-2013, 07:29 PM   #9
jroggow
Member
 
Registered: Mar 2006
Distribution: Slackware
Posts: 33

Original Poster
Rep: Reputation: 15
I get that same error running your script.
Code:
line 9: ((: i=0008: value too great for base (error token is "0008")
0008
 
Old 02-03-2013, 07:40 PM   #10
jroggow
Member
 
Registered: Mar 2006
Distribution: Slackware
Posts: 33

Original Poster
Rep: Reputation: 15
The padding should be fine. It counts the number of characters in the base filename and if less than four adds as many zeros as needed to make it four digits.
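The rule described here (count the characters in the base name and, if there are fewer than four, prepend enough zeros to reach four) can be sketched in pure Bash as follows. pad_name is a hypothetical helper written for this illustration, not part of the original script.

```shell
# pad_name: hypothetical helper that pads a base name to four characters.
pad_name() {
    local base=$1 width=4 zeros=0000 pad=""
    local num=${#base}              # character count of the base name
    if (( num < width )); then
        pad=${zeros:0:width-num}    # take exactly (width - num) zeros
    fi
    printf '%s%s\n' "$pad" "$base"
}

for n in 0 7 10 100 1000 12345; do
    pad_name "$n"
done
```

Names that are already four or more characters long pass through unchanged.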
 
Old 02-03-2013, 10:00 PM   #11
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,541

Rep: Reputation: 878
Quote:
Originally Posted by jroggow
I get that same error running your script.
Code:
line 9: ((: i=0008: value too great for base (error token is "0008")
0008
Okay, now the question is what's the difference between that script and the code in your original post? Do you get the same result if you run it in the ~/music directory?


And the padding code is fine, I just had a temporary comprehension failure.
 
Old 02-04-2013, 07:23 AM   #12
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 16

Rep: Reputation: 233
I think it's likely that your first code and its output were different from what you posted. Even if 0008 were not treated as octal, "i" would not have a value of 9.

Code:
greatest=0008
(( i = $greatest + 3 )) # could be 11 but not 9
 
Old 02-04-2013, 08:02 AM   #13
mina86
Member
 
Registered: Aug 2008
Distribution: Slackware
Posts: 412

Rep: Reputation: 172
Quote:
Originally Posted by PTrenholme
Well, you could remove the leading zeros with the ${greatest##0} couldn't you? That might help.
This will remove only the first leading zero since “0” matches only that. “##” is of no help here. You'd have to do:
Code:
while [ x"$greatest" != x0 ] && [ x"$greatest" != x"${greatest#0}" ]; do
    greatest=${greatest#0}
done
Or if you don't like loops:
Code:
greatest=${greatest#"${greatest%%[1-9]*}"}
greatest=${greatest:-0}
Quote:
Originally Posted by ntubski
Also, you don't need $ on variables within arithmetic expressions: i=$((i+1)) or just ((i++)).
Nonetheless, it's more portable to use a dollar sign.

Last edited by mina86; 02-04-2013 at 08:09 AM.
 
Old 02-04-2013, 08:34 AM   #14
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 16

Rep: Reputation: 233
Quote:
Originally Posted by mina86
Nonetheless, it's more portable to use a dollar sign.
Can I ask in what sense could it be more portable? What shell / version of Bash?

Also extended globbing would be a better solution for trimming leading zeros:
Code:
shopt -s extglob
... ${VAR##+(0)}
And I don't think portability with Bash < 3.0 would be necessary for that?
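A complete version of the extended-globbing approach might look like this. The fallback to 0 for an all-zero input is an addition for the sketch, and the name stripped is hypothetical.

```shell
shopt -s extglob            # enables the +(pattern) syntax

VAR=0008
stripped=${VAR##+(0)}       # remove the longest run of leading zeros
stripped=${stripped:-0}     # an all-zero input would otherwise become empty

echo "$stripped"            # 8
```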
 
Old 02-04-2013, 02:33 PM   #15
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
As mentioned in passing by konsolebox, when a number has a leading zero, the shell treats it as an octal value. Any arithmetic operation on such a number that includes an 8 or 9 will result in an error, and other values will probably give you incorrect results.

Edit: I just noticed that the OP actually said it first. Oh well.

The best way to avoid this is to strip the leading zeroes off before doing any math, and only re-pad them when you really need it.

http://mywiki.wooledge.org/ArithmeticExpression
http://mywiki.wooledge.org/BashFAQ/018


The leading base string is another option instead of stripping them off, but you'll still have to worry about re-padding the results afterwards.
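Putting the base prefix and the re-padding together, a sketch of computing the next file number from a zero-padded one could look like this. The names next and padded are hypothetical, and four digits is assumed as the width because that is what the thread uses.

```shell
greatest=0008

next=$(( 10#$greatest + 1 ))      # force base 10 so 0008 is not rejected as octal
printf -v padded '%04d' "$next"   # re-pad the result to four digits

echo "$padded"                    # 0009
```

printf -v stores the formatted result in a variable instead of printing it, which avoids a command substitution.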

Last edited by David the H.; 02-04-2013 at 02:36 PM.
 
  

