[SOLVED] sequential : how to find the missing numbers within a sequence of files that have sequential numbers attached to them?
Just to make it clear, you are looking for missing numbers between the minimum and maximum existing numbers in the bold italic area here:
Code:
FileName-xxx-xxxxxxx.ext
Everything else should be left alone, and if 001 is missing, you aren't supposed to add it? You only report missing numbers after the first one that exists, even if it is 013, and then stop at the last existing number, like 912?
Hmm. Me too. Let me see what I broke...
Oh. You get nothing if you don't supply a max value on the command line -- there should probably be a test for that:
Code:
./perl-find-missing-numbers.pl 270
(or whatever the current name of the script is...)
(your #! must be right, or you'd have got errors)
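A rough sketch of the kind of test mentioned above, written as a shell function. The name `check_max` and the usage text are made up for illustration; they are not part of the script in this thread.

```shell
#!/bin/sh
# check_max VALUE (hypothetical helper): succeed only if VALUE is a
# non-empty string of digits, otherwise print a usage hint to stderr.
check_max() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: find-missing maxvalue" >&2
            return 1
            ;;
    esac
    return 0
}
```

In a wrapper script you would probably call it as `check_max "$1" || exit 1` before doing any work.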
OIC (chuckle), not the literal word maxnumber -- haha, OK, hold on.
to get the list of files in the format FileName-xxx-xxxxxxx.ext -- then parsed out the xxx (the number between the hyphens) to populate the list of numbers that are present.
If the script doesn't find a bunch of files in that format, then yes, it will just print the range of numbers.
I made sure to change that path to the proper path to the directory on my system. That was a 'dummy' path I gave as an example.
I got everything written to get the numbers, strip off the leading zeros, and put them in order into a file, back in post #4.
Now all that is really needed is to take that file and process it to look for the missing numbers in the sequence 1 - 270, and print the missing numbers out in order,
like this:
4, 5, 6, 7, 8, 9, 10, 20 and not 4 - 10, 20
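For what it's worth, here is one possible way to do that step, assuming the file from post #4 holds one number per line. The function name `missing_from_file` is invented for this sketch. It marks every number it reads, then walks 1..MAX and prints the ones that were never marked, one per line.

```shell
#!/bin/sh
# missing_from_file FILE MAX (hypothetical helper)
# FILE: one number per line (leading zeros are tolerated).
# Prints every number in 1..MAX that does not appear in FILE.
missing_from_file() {
    awk -v max="$2" '
        { seen[$1 + 0] = 1 }          # "+ 0" forces a numeric value, so 007 becomes 7
        END {
            for (i = 1; i <= max; i++)
                if (!(i in seen)) print i
        }' "$1"
}
```

Something like `missing_from_file numbers.txt 270` would then list each missing number on its own line (`numbers.txt` is a placeholder for whatever the file is actually called).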
So the perl script doesn't do what you need? It should output the missing numbers, with no leading zeros, in sequence, using the file name format you posted. Do we need to add the commas?
Do you want to work on why it doesn't? I'm happy to do that.
No comma needed, just print \n really.
Your script prints this. Just showing a little of it; the bold ones are files that are actually NOT there, and they get printed.
Probably fastest:
Pipe the sorted file names to an awk script that checks whether the previous file's number was one less than the current one. If not, write the missing name(s).
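A minimal sketch of that awk idea, assuming the numbers have already been pulled out of the file names and sorted numerically, one per line. The function name `print_gaps` is only for illustration. Note that this reports gaps between the lowest and highest existing number, not against a fixed range like 1 - 270.

```shell
#!/bin/sh
# print_gaps (hypothetical helper): read sorted numbers on stdin and
# print every number skipped between one line and the next.
print_gaps() {
    awk '
        NR > 1 { for (i = prev + 1; i < $1 + 0; i++) print i }
               { prev = $1 + 0 }      # remember this number for the next line
    '
}
```

For example, `printf '13\n14\n17\n' | print_gaps` should print 15 and 16.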
OK
Oh, on to the awk. I'm unfamiliar with it, but the one I did find almost does what I want.
Right? Print out the missing numbers?
Again, the filenames must be in the format sometext-nnn-somemoretext. All I'm doing is parsing out the nnn part of the filename...the part between the hyphens.
If your real filename's numbers are not between two, and only two, hyphens, then the regexps I've written won't work and will need to be adjusted to work as you want.
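To make the "between the hyphens" assumption concrete, here is a small shell sketch of the same parsing idea using sed instead of the Perl regexps. The function name `number_between_hyphens` is made up. It prints nothing at all when the name does not have the number between two hyphens, which is the failure mode described above.

```shell
#!/bin/sh
# number_between_hyphens NAME (hypothetical helper)
# Prints the digits between the first and second hyphen with leading
# zeros stripped, or nothing if NAME does not match sometext-nnn-more.
number_between_hyphens() {
    printf '%s\n' "$1" |
        sed -n 's/^[^-]*-\([0-9][0-9]*\)-.*$/\1/p' |
        sed 's/^0*\([0-9]\)/\1/'
}
```

So `number_between_hyphens FileName-013-abcdefg.ext` gives 13, while a name like `FileName-001.mp4` gives no output because there is no second hyphen.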
Maybe you made a change I didn't get?
Yeah, that was a guess. They are really formatted like this:
FileName-001.mp4
I've been working with so many different files with numbers that I just took a wild guess. The 3 digits are what is needed; I made provisions to get just them, and it looked like you did the same, but I did not check your stripping code that closely.
Mayhaps. Here's what I just ran... you'll need to change the $working_dir value again.
Code:
#!/usr/bin/perl
## ^^ set to location of your perl
$working_dir="/run/media/userx/250GB/NumberedFiles";
##~ $working_dir="."; # testing
if ($ARGV[0]) {
    $max = $ARGV[0]; ## get max number from the command line
    $max++;
}
else {
    print "usage is $0 maxvalue\n";
    exit 1; ## nothing useful can be printed without a max value
}
## get list of file names in array. Names are in format of FileName-nnn-xxxxxxx.ext
@files=`find "$working_dir" -type f`;
## remove leading and trailing parts
foreach $file (@files) {
    $file =~ s/^.*?-//; #remove from beginning to first hyphen
    $file =~ s/-.*$//; #remove from second hyphen to end
    $existnums[$file]=$file; #save what's left in array (index is the numeric value)
}
## populate array with all the numbers: 1-input value
for ($i = 1; $i < $max; $i++) {
    $allnums[$i] = $i;
}
## remove existing numbers from full list
foreach $nbr (@existnums) {
    $allnums[$nbr] = 0;
}
## print out the remaining (i.e. missing) numbers
## note, no sorting required because the allnums array is populated in sequence
foreach $nbr (@allnums) {
    if ($nbr) { ## skip unused slot 0 and the entries zeroed out above
        ##~ print "$nbr "; ## or
        print "$nbr\n"; ## to do one per line
    }
}
print "\n"; ## when printing all on one line.
No worries. Change the second substitution in the loop from
Code:
$file =~ s/-.*$//; #remove from second hyphen to end
to
Code:
$file =~ s/\..*$//; #remove from '.' to end
And "it will cut" (sorry, been watching Forged in Fire).
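For comparison, here is a rough shell equivalent of the whole job for names like FileName-001.mp4. This is only a sketch under stated assumptions: the function name `missing_in_dir` is invented, the number is taken to be the digits between the last hyphen and the extension, and the range is 1..MAX as in the Perl script.

```shell
#!/bin/sh
# missing_in_dir DIR MAX (hypothetical helper)
# Looks at files under DIR named like FileName-001.mp4 and prints,
# one per line, every number in 1..MAX with no matching file.
missing_in_dir() {
    find "$1" -type f -name '*-[0-9]*.*' |
        sed -n 's/^.*-\([0-9][0-9]*\)\.[^./]*$/\1/p' |   # keep only the digits before the extension
        awk -v max="$2" '
            { seen[$1 + 0] = 1 }                          # "+ 0" drops leading zeros
            END {
                for (i = 1; i <= max; i++)
                    if (!(i in seen)) print i
            }'
}
```

With the directory from the script above, the call would look like `missing_in_dir /run/media/userx/250GB/NumberedFiles 270`.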