LinuxQuestions.org

BW-userx 07-07-2017 08:30 AM

sequential : how to find the missing numbers within a sequence of files that have sequential numbers attached to them?
 
Counting how many files one has is a simple matter, but when I have a few hundred and some are missing from the sequence, what is a quick way to find out which ones are missing and put them in a file listing only the numbers that are not present in the sequence?

Say we have these numbers, where the full sequence should be every number from 1 to 19.
We search through the files with the attached numbers and only see these numbers:

1, 2, 3, 5, 7, 8, 19.

The search should report that 4, 6 and 9-18 are missing, spelling out that last range number by number as well. The whole sequence of missing numbers should be printed out or written to a file.

The actual pattern is:
Code:

FileName-xxx-xxxxxxx.ext
where the leading zeros are placeholders. The sequence pattern for the numbers is then: 001, 002, 003, ... 010, 011, ... 120, etc.

The middle three x's are the ones to look at to find the missing numbers within the entire sequence.
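
As a sketch of what is being asked for, assuming the numbers have already been pulled out of the file names and arrive one per line on stdin. The function name `list_missing` is made up for this illustration and is not from any post in the thread:

```shell
#!/bin/bash
# list_missing MAX: read the numbers that are present on stdin (one per
# line) and print every number from 1 to MAX that was not seen.
# The function name is hypothetical, used only for this illustration.
list_missing() {
    local max=$1 n
    local -A seen=()            # associative array, needs bash 4 or newer
    while IFS= read -r n; do
        [[ $n =~ ^[0-9]+$ ]] || continue   # skip anything that is not a number
        seen[$((10#$n))]=1      # 10# forces base 10, so 008 is not read as octal
    done
    for ((n = 1; n <= max; n++)); do
        [[ -n ${seen[$n]:-} ]] || echo "$n"
    done
}

# The example from the post: 1, 2, 3, 5, 7, 8 and 19 are present.
printf '%s\n' 1 2 3 5 7 8 19 | list_missing 19
```

This prints 4, 6 and 9 through 18, one number per line; redirecting the output with `> somefile` puts the list in a file.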

TenTenths 07-07-2017 08:50 AM

Not tested but you could try something like:

Code:

testfor.sh

#!/bin/bash

PREFIX=$1
SUFFIX=$2

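# 001 to 019, zero-padded to match the file names (the sequence starts at 1)
for I in $(seq -f "%03g" 1 19) ; do

  CHECKFOR=${PREFIX}${I}${SUFFIX}

  # SUFFIX contains a wildcard and [[ -f ]] does not expand globs,
  # so let compgen -G test whether anything matches the pattern
  if ! compgen -G "${CHECKFOR}" > /dev/null ; then

    echo -ne "${I}, "

  fi

done

Then run it in the folder with (quote the suffix so the shell passes the * through untouched):

Code:

./testfor.sh /path/to/folder/FileName- '-*.ext'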

BW-userx 07-07-2017 08:57 AM

OK, I got a script to pull the numbers into an input file, with the leading zeros removed, but the output is out of sequence.
Code:

#!/bin/bash

working_dir=/run/media/userx/250GB/NumberedFiles

while IFS= read -r file
do

f=$file
path=${f%/*}
xfile=${f##*/}
title=${xfile%.*}
ext=${xfile##*.}

# keep only the field between the first and second hyphen
numbers=${title#*-}
numbers=${numbers%%-*}
#remove all leading zeros
numbers=$(echo "$numbers" | sed 's/^0*//')

echo "$numbers" >> Numbers

done< <(find "$working_dir" -type f )

Just a snippet of the results:
Quote:


57
58
59
60
61
62
63
64
65
66
67
68
69
70
18
36
53
71
89
108
125
143
177
201
219
237
255
72
73
74
75
76
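
The numbers come out in whatever order `find` happens to return the files, which is not sorted. A minimal sketch of the same extraction with a numeric sort on the end, assuming names of the form FileName-001-xxxxxxx.ext as described in the first post; the function name `numbers_in_dir` is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the number field of every file under DIR,
# one per line, in ascending order. Assumes names like
# FileName-001-xxxxxxx.ext, as described in the first post.
numbers_in_dir() {
    local dir=$1 file name field
    while IFS= read -r file; do
        name=${file##*/}        # drop the directory part
        field=${name#*-}        # drop everything up to the first hyphen
        field=${field%%-*}      # drop everything from the next hyphen on
        [[ $field =~ ^[0-9]+$ ]] || continue   # ignore files without a number field
        echo "$((10#$field))"   # 10# forces base 10 and drops leading zeros
    done < <(find "$dir" -type f) | sort -n
}
```

Something like `numbers_in_dir /run/media/userx/250GB/NumberedFiles > Numbers` would then write the sorted list in one go.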

BW-userx 07-07-2017 10:21 AM

OK, this is definitely not working yet.

The sequence is
1 - 270

Code:

#!/bin/bash

working_dir=/run/media/userx/250GB/numberedFiles

a=0

while IFS= read -r file
do

f=$file
path=${f%/*}
xfile=${f##*/}
title=${xfile%.*}
ext=${xfile##*.}

#get numbers off of files
numbers=${title#*-}
numbers=${numbers%%-*}

#remove all leading zeros
numbers=$(echo "$numbers" | sed 's/^0*//')

# make sure they are actual digits
if [[ $numbers =~ ^-?[0-9]+$ ]] ; then  NumberedArray[$a]=$numbers ; ((a++)) ; fi
done< <(find "$working_dir" -type f )

# put them in order of 1 - 270

sortedNums=( $( printf "%s\n" "${NumberedArray[@]}" | sort -n ) )

# print them into a file

for (( b= 0 ; $b < "${#sortedNums[@]}" ; b++ ))
{
 echo "${sortedNums[$b]}" >> Numbers
}

#not working like I need
#print out missing numbers
#awk '$1!=p+1{print p+1"-"$1-1}{p=$1}'  Numbers

awk '$1!=p+1{print p+1}{p=$1}'  Numbers

giving me the results of
Code:

userx%slackwhere ⚡ production ⚡> ./getNumberOffFiles
162
164
172
181
186
195
259

using this in the script
Code:

awk '$1!=p+1{print p+1"-"$1-1}{p=$1}'  Numbers
gives me this
Code:

162-162
169-170
172-175
181-181
186-186
195-198
259-257


I know there are more missing than that. 164 - 170 for starters.

Maybe I got that awk backwards or something; I snagged it off the net.

It is not incrementing to the next valid number and printing out the missing ones between the first missing one and the next valid number.
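
Both one-liners only act once per gap: the first prints the number where a gap starts, the second prints the gap as a range. An inverted range such as 259-257 is what the range version prints when the same number appears twice in a row. A minimal sketch of a variant that walks each gap and prints every number in it, with `sort -nu` in front to order the input and drop duplicates; the function name `expand_gaps` is made up for illustration:

```shell
#!/bin/bash
# expand_gaps: read numbers on stdin and print every number
# missing between 1 and the largest one seen, one per line.
# The function name is made up for this sketch.
expand_gaps() {
    sort -nu | awk '
        { for (i = p + 1; i < $1; i++) print i   # everything skipped since the previous number
          p = $1 }'
}

# The example from the first post.
printf '%s\n' 1 2 3 5 7 8 19 | expand_gaps
```

With the file from the script this would be run as `expand_gaps < Numbers`. Numbers missing after the largest one present are not reported, because the input gives no upper limit.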

BW-userx 07-07-2017 10:22 AM

Quote:

Originally Posted by TenTenths (Post 5731892)
Not tested but you could try something like:

that is interesting, a lot shorter than what I've got.
I might have to play with it to get just the numbers off the file?

I'll give it a go just to see what that does.

BW-userx 07-07-2017 11:43 AM

well I got it to a point - and maybe this is too much for all of
Code:



c=1
 
for (( b = 1 ; $b < "${#sortedNums[@]}" ; b++ ))
{
 if [[ "${sortedNums[$b]}" != "$((c+1))" ]] ; then
    echo "Missing:  $((b+1))"
 fi
c=${sortedNums[$b]}
}

#awk '$1!=p+1{print p+1"-"$1-1}{p=$1}'  Numbers

#awk '$1!=p+1{print p+1}{p=$1}'  Numbers

adding that to the entire script,
Code:

Missing:  162
Missing:  166 < present
Missing:  167
----
168
169
170
---
Missing:  172
----
173
174
175
---
Missing:  176 < present
Missing:  184 < present
Missing:  230 < present





awk

awk '$1!=p+1{print p+1"-"$1-1}{p=$1}' Numbers

Code:

162-162
167-170
172-175
181-181
186-186
195-198
245-245

I am just wanting the exact numbers missing printed in sequence
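
The bash loop in this post detects a gap but then prints `$((b+1))`, which is the array index plus one and not a missing value. A sketch of a loop that prints the values themselves, under the assumption that the array is already sorted ascending as in the script; the function name `print_missing` is made up for illustration:

```shell
#!/bin/bash
# print_missing N1 N2 ...: given the present numbers in ascending order,
# print every number skipped between 1 and the last one.
# Hypothetical name, used only for this sketch.
print_missing() {
    local prev=0 n m            # prev is the last number seen so far
    for n in "$@"; do
        # every value strictly between the previous number and this one
        for ((m = prev + 1; m < n; m++)); do
            echo "Missing:  $m"
        done
        prev=$n
    done
}

# Stand-in for the sortedNums array built by the full script.
sortedNums=(1 2 3 5 7 8 19)
print_missing "${sortedNums[@]}"
```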

Laserbeak 07-07-2017 11:55 AM

Do you want a Perl version?

BW-userx 07-07-2017 12:47 PM

Quote:

Originally Posted by Laserbeak (Post 5732011)
Do you want a Perl version?

Can it print out the missing numbers in sequence?
If the range is 1 - 100 and the missing ones are 4 - 10, 44 and 85, then it should print
4 5 6 7 8 9 10 44 85, NOT 4-10 and whatever else gets printed.

scasey 07-07-2017 02:39 PM

Quote:

Originally Posted by BW-userx (Post 5731899)
Ok I got a script to get the numbers to use in a input file - removed the leading zeros, but the output is out of sequence.

Code:

sort -n Numbers
as the last step should put the numbers in sequence.

Edit: Oh...you already got that...

I can almost see the Perl in my mind's eye. Be interested to see what Laserbeak comes up with.

BW-userx 07-07-2017 02:45 PM

Quote:

Originally Posted by scasey (Post 5732103)
Code:

sort -n Numbers
as the last step should put the numbers in sequence.

Edit: Oh...you already got that...

I can almost see the Perl in my mind's eye. Be interested to see what Laserbeak comes up with.

Yeah, I already got the sequencing in order, so,

yeah, let's just see if he is up for the challenge MUHahahahaha :D

scasey 07-07-2017 04:17 PM

I couldn't resist. Am interested in Laserbeak's comments and/or version.
Hopefully comments are self-explanatory.
Code:

#!/usr/bin/perl 
## ^^ set to location of your perl

$working_dir="/run/media/userx/250GB/NumberedFiles";
##~ $working_dir=".";  # testing
$max = $ARGV[0];  ## get max number from the command line
$max++;

## get list of file name in array
@files=`find "$working_dir" -type f`;

## remove leading and trailing parts
foreach $file (@files) {
        $file =~ s/^.*?-//;    #remove from beginning to first hyphen
        $file =~ s/-.*$//;        #remove from second hyphen to end
        $existnums[$file]=$file;  #save in array
}

## populate array with all the numbers: 1-input value
for ($i = 1; $i < $max; $i++) {
    $allnums[$i] = $i;
}

## remove existing numbers from full list
foreach $nbr (@existnums) {
    $allnums[$nbr] = 0
}

## print out the remaining (i.e. missing) numbers
## note, no sorting required because the allnums array is populated in sequence
foreach $nbr  (@allnums) {
        if ($nbr) {      ## skip slot 0 (never filled) and entries set to -0- above
                print "$nbr ";  ## or
##~                print "$nbr\n";  ## to do one per line
        }
}

print "\n";  ## when printing all on one line.

Run with /path/to/thisscript.pl maxvalue > savefile
or /path/to/thisscript.pl maxvalue to see on STDOUT

schneidz 07-07-2017 04:25 PM

what i would do is create a file containing the integers in the order you want.
create another file with the filenames.
and then diff the two files.

BW-userx 07-07-2017 04:49 PM

Quote:

Originally Posted by schneidz (Post 5732148)
what i would do is create a file containing the integers in the order you want.
create another file with the filenames.
and then diff the two files.

hummm funny
Code:

userx%slackwhere ⚡ production ⚡> diff Numbers FILES
1,258c1,259
< 1
< 2
< 3
< 4

just prints out numbers then the file names
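
diff compares the two files line by line, so a list of bare numbers against a list of file names differs on every line. A sketch of the same compare-two-lists idea with both sides reduced to bare numbers, swapping in `comm` for `diff`. It assumes the file of present numbers holds one number per line without leading zeros; the function name `missing_by_comm` is made up for illustration:

```shell
#!/bin/bash
# missing_by_comm MAX FILE: print the numbers 1..MAX that do not occur in FILE.
# Hypothetical name, used only for this sketch.
missing_by_comm() {
    local max=$1 present=$2
    # comm needs both inputs sorted the same (lexical) way; -23 keeps only
    # the lines unique to the first input. LC_ALL=C pins the sort order.
    LC_ALL=C comm -23 <(seq 1 "$max" | LC_ALL=C sort) \
                      <(LC_ALL=C sort -u "$present") | sort -n
}
```

For the thread's case that would be `missing_by_comm 270 Numbers`; the final `sort -n` puts the result back into numeric order.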

BW-userx 07-07-2017 04:51 PM

Quote:

Originally Posted by scasey (Post 5732146)
I couldn't resist. Am interested in Laserbeak's comments and/or version.
Hopefully comments are self-explanatory.

Run with /path/to/thisscript.pl maxvalue > savefile
or /path/to/thisscript.pl maxvalue to see on STDOUT

Nothing I get nothing -

Code:

userx%slackwhere ⚡ scripts ⚡> ./perl-find-missing-numbers maxvalue

userx%slackwhere ⚡ scripts ⚡>


userx%slackwhere ⚡ scripts ⚡> ./perl-find-missing-numbers

userx%slackwhere ⚡ scripts ⚡>

does it actually need a .pl tag/ext on the end of it- or is that just a naming convention to indicate a perl script?

changed it - same output

then checked to be sure I got perl ..
Code:

userx%slackwhere ⚡ scripts ⚡> perl -version
This is perl 5, version 22, subversion 2 (v5.22.2) built for x86_64-linux-thread-multi


scasey 07-07-2017 05:19 PM

you need the output of
Code:

which perl
to match what's on the #! line.

on my server:
Code:

$ which perl
/usr/bin/perl

Hmm. Me too. let me see what I broke...
Oh. You get nothing if you don't supply a max value on the command line -- there should probably be a test for that:
Code:

./perl-find-missing-numbers.pl 270
(or whatever the current name of the script is...)
(your #! must be right, or you'd have got errors)
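
The script in this thread is Perl; purely as an illustration of the kind of test meant here, a bash sketch that accepts the maximum only if it is a positive whole number. The function name `check_max` and the usage text are made up:

```shell
#!/bin/bash
# check_max VALUE: succeed only if VALUE is a positive whole number.
# Hypothetical helper, sketched in bash rather than Perl.
check_max() {
    [[ ${1:-} =~ ^[0-9]+$ ]] && (( 10#$1 > 0 ))
}

# The literal word "maxvalue" from the run above is rejected.
if check_max "maxvalue"; then
    echo "ok"
else
    echo "usage: script MAXVALUE (a positive number, e.g. 270)"
fi
```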


All times are GMT -5.