LinuxQuestions.org
Old 07-07-2017, 08:30 AM   #1
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Rep: Reputation: 2148
sequential : how to find the missing numbers within a sequence of files that have sequential numbers attached to them?


Counting how many files one has is a simple matter, but when I have a few hundred and some are missing from the sequence, what is a quick way to find out which ones are missing and put them in a file that lists only the numbers not present in the sequence?

Say the sequence is supposed to contain every number from 1 to 19.
We search through the files with the attached numbers and only see these numbers:

1, 2, 3, 5, 7, 8, 19.

The search should report that 4, 6, and 9-18 are missing, spelling out that last range in numbered order as well. The whole sequence of missing numbers should be printed out or written to a file.

The actual file name pattern is:
Code:
FileName-xxx-xxxxxxx.ext
where the leading zeros are placeholders. The sequence pattern for the numbers is then: 001, 002, 003, ... 010, 011, ... 120, etc.

The middle three x's are the ones to look at to find the missing numbers within the entire sequence.

Last edited by BW-userx; 07-07-2017 at 08:33 AM.
 
Old 07-07-2017, 08:50 AM   #2
TenTenths
Senior Member
 
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6 / 7
Posts: 3,162

Rep: Reputation: 1361
Not tested but you could try something like:

Code:
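testfor.sh

#!/bin/bash
# List the sequence numbers that have no matching file.

PREFIX=$1   # e.g. /path/to/folder/FileName-
SUFFIX=$2   # e.g. '-*.ext' (quoted so the shell passes it through unexpanded)

# the sequence in the question starts at 1, not 0
for I in $(seq -f "%03g" 1 19) ; do

  # [[ -f ]] does not expand wildcards, so let compgen -G test the glob
  if ! compgen -G "${PREFIX}${I}${SUFFIX}" > /dev/null ; then

    echo -ne "${I}, "

  fi

done
echo
Then run it in the folder with:

Code:
./testfor.sh /path/to/folder/FileName- '-*.ext'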
 
Old 07-07-2017, 08:57 AM   #3
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Rep: Reputation: 2148
OK, I have a script that pulls the numbers into an input file with the leading zeros removed, but the output is out of sequence.
Code:
#!/bin/bash

working_dir=/run/media/userx/250GB/NumberedFiles

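# IFS= and -r keep spaces and backslashes in file names intact
while IFS= read -r file
do

f=$file
path=${f%/*}
xfile=${f##*/}
title=${xfile%.*}
ext=${xfile##*.}

# keep only the field between the first and second hyphen
numbers=${title#*-}
numbers=${numbers%%-*}
#remove all leading zero's
numbers=$(echo "$numbers" | sed 's/^0*//')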

echo "$numbers" >> Numbers

done< <(find "$working_dir" -type f )
Just a snippet of the results:
Quote:

57
58
59
60
61
62
63
64
65
66
67
68
69
70
18
36
53
71
89
108
125
143
177
201
219
237
255
72
73
74
75
76
 
Old 07-07-2017, 10:21 AM   #4
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Rep: Reputation: 2148
OK, this is definitely not working yet.

The sequence is
1 - 270

Code:
#!/bin/bash

working_dir=/run/media/userx/250GB/numberedFiles

a=0

# IFS= and -r keep spaces and backslashes in file names intact
while IFS= read -r file
do

f=$file
path=${f%/*}
xfile=${f##*/}
title=${xfile%.*}
ext=${xfile##*.}

#get numbers off of files
numbers=${title#*-}
numbers=${numbers%%-*}

#remove all leading zero's
numbers=$(echo "$numbers" | sed 's/^0*//')

# make sure they are actual digits
if [[ $numbers =~ ^-?[0-9]+$ ]] ; then  NumberedArray[$a]=$numbers ; ((a++)) ; fi
done< <(find "$working_dir" -type f )

# put them in order of 1 - 270

sortedNums=( $( printf "%s\n" "${NumberedArray[@]}" | sort -n ) )

# print them into a file (">" starts the file fresh, so reruns do not pile up old entries)

printf "%s\n" "${sortedNums[@]}" > Numbers

#not working like I need
#print out missing numbers
#awk '$1!=p+1{print p+1"-"$1-1}{p=$1}'  Numbers

awk '$1!=p+1{print p+1}{p=$1}'  Numbers
giving me the results of
Code:
userx%slackwhere ⚡ production ⚡> ./getNumberOffFiles
162
164
172
181
186
195
259
using this in the script
Code:
awk '$1!=p+1{print p+1"-"$1-1}{p=$1}'  Numbers
gives me this
Code:
162-162
169-170
172-175
181-181
186-186
195-198
259-257

I know there are more missing than that: 164 - 170, for starters.

Maybe I got that awk backwards or something; I snagged it off the net.

It is not incrementing to the next valid number and printing out the missing ones between the first missing one and the next valid number.
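
For what it's worth, the one-liner above prints only `p+1`, the first number of each gap, so the rest of a gap never appears. Below is a minimal sketch of one way to spell out every missing number. The helper name `print_missing` and its first/last arguments are made up for this illustration, and it assumes one number per line, without leading zeros, on standard input.

```shell
#!/bin/bash
# Sketch only: print every number missing from a list, one per line.
# print_missing is a hypothetical helper; usage: print_missing FIRST LAST < Numbers
print_missing() {
  local first=$1 last=$2
  # sort numerically and drop duplicates so awk sees each number once, in order
  sort -n -u | awk -v first="$first" -v last="$last" '
    BEGIN { expected = first + 0; last = last + 0 }
    {
      # everything from the expected number up to the one just read is missing
      for (n = expected; n < $1 && n <= last; n++) print n
      if ($1 + 1 > expected) expected = $1 + 1
    }
    # numbers missing after the highest one that was found
    END { for (n = expected; n <= last; n++) print n }
  '
}

# Sample data from the first post (the sequence should be 1-19)
printf '%s\n' 1 2 3 5 7 8 19 | print_missing 1 19
```

With the real data this could be run as `print_missing 1 270 < Numbers > Missing`, where `Missing` is just an example output file name.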

Last edited by BW-userx; 07-07-2017 at 06:32 PM.
 
Old 07-07-2017, 10:22 AM   #5
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Rep: Reputation: 2148
Quote:
Originally Posted by TenTenths View Post
Not tested but you could try something like:

Code:
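testfor.sh

#!/bin/bash
# List the sequence numbers that have no matching file.

PREFIX=$1   # e.g. /path/to/folder/FileName-
SUFFIX=$2   # e.g. '-*.ext' (quoted so the shell passes it through unexpanded)

# the sequence in the question starts at 1, not 0
for I in $(seq -f "%03g" 1 19) ; do

  # [[ -f ]] does not expand wildcards, so let compgen -G test the glob
  if ! compgen -G "${PREFIX}${I}${SUFFIX}" > /dev/null ; then

    echo -ne "${I}, "

  fi

done
echo
Then run it in the folder with:

Code:
./testfor.sh /path/to/folder/FileName- '-*.ext'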
That is interesting, and a lot shorter than what I've got.
I might have to play with it to get just the numbers off the file names?

I'll give it a go just to see what it does.

Last edited by BW-userx; 07-07-2017 at 10:30 AM.
 
Old 07-07-2017, 11:43 AM   #6
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Rep: Reputation: 2148
well I got it to a point - and maybe this is too much for all of
Code:

c=1
  
for (( b = 1 ; $b < "${#sortedNums[@]}" ; b++ ))
{
 if [[ "${sortedNums[$b]}" != "$((c+1))" ]] ; then
     echo "Missing:  $((b+1))"
 fi
c=${sortedNums[$b]}
} 

#awk '$1!=p+1{print p+1"-"$1-1}{p=$1}'  Numbers

#awk '$1!=p+1{print p+1}{p=$1}'  Numbers
adding that to the entire script,
Code:
Missing:  162
Missing:  166 < present
Missing:  167
----
168
169
170
---
Missing:  172
----
173
174
175
---
Missing:  176 < present
Missing:  184 < present
Missing:  230 < present




awk

awk '$1!=p+1{print p+1"-"$1-1}{p=$1}' Numbers

Code:
162-162
167-170
172-175
181-181
186-186
195-198
245-245
I just want the exact missing numbers printed in sequence.
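
One thing that stands out in the loop above is that it echoes `$((b+1))`, which is the array index plus one rather than a missing value. A pure-bash sketch of a gap walk is below. It reuses the `sortedNums` array name from the script, but the values are sample data from the first post, and it assumes the array is already sorted and holds plain integers without leading zeros (bash arithmetic would read `008` as invalid octal).

```shell
#!/bin/bash
# Sketch only: walk a sorted array and collect every missing number.
sortedNums=(1 2 3 5 7 8 19)     # sample data, already in numeric order

missing=()
expected=1                      # first number the sequence should contain
for n in "${sortedNums[@]}"; do
  # every value from expected up to n-1 is absent
  while (( expected < n )); do
    missing+=("$expected")
    expected=$((expected + 1))
  done
  expected=$((n + 1))
done

# one missing number per line
printf '%s\n' "${missing[@]}"
```

Numbers missing after the highest file found are not reported here; a second `while` loop running from `expected` up to the known maximum (270 in this thread) would cover that case.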
 
Old 07-07-2017, 11:55 AM   #7
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

Rep: Reputation: 143
Do you want a Perl version?
 
Old 07-07-2017, 12:47 PM   #8
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Rep: Reputation: 2148
Quote:
Originally Posted by Laserbeak View Post
Do you want a Perl version?
Can it print out the missing numbers in sequence?
If the range is 1 - 100 and the missing numbers are 4 - 10, 44, and 85, then it should print
4 5 6 7 8 9 10 44 85, NOT 4-10 followed by whatever else gets printed.

Last edited by BW-userx; 07-07-2017 at 01:06 PM.
 
Old 07-07-2017, 02:39 PM   #9
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.8.2003
Posts: 5,384

Rep: Reputation: 2021
Quote:
Originally Posted by BW-userx View Post
Ok I got a script to get the numbers to use in a input file - removed the leading zeros, but the output is out of sequence.
Code:
sort -n Numbers
as the last step should put the numbers in sequence.

Edit: Oh...you already got that...

I can almost see the Perl in my mind's eye. I'd be interested to see what Laserbeak comes up with.

Last edited by scasey; 07-07-2017 at 02:41 PM.
 
Old 07-07-2017, 02:45 PM   #10
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Rep: Reputation: 2148
Quote:
Originally Posted by scasey View Post
Code:
sort -n Numbers
as the last step should put the numbers in sequence.

Edit: Oh...you already got that...

I can almost see the Perl in my mind's eye. I'd be interested to see what Laserbeak comes up with.
Yeah, I already got the sequencing in order.

Yeah, let's just see if he is up for the challenge MUHahahahaha
 
Old 07-07-2017, 04:17 PM   #11
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.8.2003
Posts: 5,384

Rep: Reputation: 2021
I couldn't resist. Am interested in Laserbeak's comments and/or version.
Hopefully comments are self-explanatory.
Code:
#!/usr/bin/perl  
## ^^ set to location of your perl

$working_dir="/run/media/userx/250GB/NumberedFiles";
##~ $working_dir=".";   # testing
$max = $ARGV[0];   ## get max number from the command line
$max++;

## get list of file name in array
@files=`find "$working_dir" -type f`;

## remove leading and trailing parts
foreach $file (@files) {
	$file =~ s/^.*?-//;    #remove from beginning to first hyphen
	$file =~ s/-.*$//;	#remove from second hyphen to end
	$existnums[$file]=$file;  #save in array
}

## populate array with all the numbers: 1-input value 
for ($i = 1; $i < $max; $i++) {
    $allnums[$i] = $i;
}

## remove existing numbers from full list
foreach $nbr (@existnums) {
    $allnums[$nbr] = 0
}

## print out the remaining (i.e. missing) numbers
## note, no sorting required because the allnums array is populated in sequence
foreach $nbr  (@allnums) {
	if ($allnums[$nbr] ne 0) {      ## only print the entries that are not -0-
		print "$allnums[$nbr] ";   ## or 
##~ 		print "$allnums[$nbr]\n";  ## to do one per line
	}
}

print "\n";   ## when printing all on one line.
Run with /path/to/thisscript.pl maxvalue > savefile
or /path/to/thisscript.pl maxvalue to see on STDOUT

Last edited by scasey; 07-07-2017 at 04:34 PM. Reason: Missing ^ in first regex -- added in red
 
Old 07-07-2017, 04:25 PM   #12
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-30
Posts: 5,290

Rep: Reputation: 916
what i would do is create a file containing the integers in the order you want.
create another file with the filenames.
and then diff the two files.
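
The diff idea can work as long as both sides are plain number lists rather than file names. A rough sketch follows, assuming a file with one number per line and no leading zeros (like the Numbers file earlier in the thread). `missing_by_diff` is a hypothetical name, and the script relies on bash process substitution.

```shell
#!/bin/bash
# Sketch only: diff the full sequence against the numbers actually found.
# missing_by_diff is a hypothetical helper; usage: missing_by_diff LAST NUMBERS_FILE
missing_by_diff() {
  local last=$1 numbers_file=$2
  # In normal diff output, lines starting with "< " exist only in the first
  # input (the full sequence), so those are the missing numbers.
  diff <(seq 1 "$last") <(sort -n -u "$numbers_file") | sed -n 's/^< //p'
}

# Example call (file names are illustrative):
# missing_by_diff 270 Numbers > Missing
```

Sorting the second input inside the function means the numbers file does not need to be in order beforehand.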
 
Old 07-07-2017, 04:49 PM   #13
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Rep: Reputation: 2148
Quote:
Originally Posted by schneidz View Post
what i would do is create a file containing the integers in the order you want.
create another file with the filenames.
and then diff the two files.
hummm funny
Code:
userx%slackwhere ⚡ production ⚡> diff Numbers FILES 
1,258c1,259
< 1
< 2
< 3
< 4
just prints out numbers then the file names
 
Old 07-07-2017, 04:51 PM   #14
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Rep: Reputation: 2148
Quote:
Originally Posted by scasey View Post
I couldn't resist. Am interested in Laserbeak's comments and/or version.
Hopefully comments are self-explanatory.

Run with /path/to/thisscript.pl maxvalue > savefile
or /path/to/thisscript.pl maxvalue to see on STDOUT
Nothing. I get nothing:

Code:
userx%slackwhere ⚡ scripts ⚡> ./perl-find-missing-numbers maxvalue

userx%slackwhere ⚡ scripts ⚡> 


userx%slackwhere ⚡ scripts ⚡> ./perl-find-missing-numbers 

userx%slackwhere ⚡ scripts ⚡>
Does it actually need a .pl extension on the end of the name, or is that just a naming convention to indicate a Perl script?

Changed it; same output.

Then I checked to be sure I have Perl:
Code:
userx%slackwhere ⚡ scripts ⚡> perl -version
This is perl 5, version 22, subversion 2 (v5.22.2) built for x86_64-linux-thread-multi

Last edited by BW-userx; 07-07-2017 at 04:53 PM.
 
Old 07-07-2017, 05:19 PM   #15
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.8.2003
Posts: 5,384

Rep: Reputation: 2021
You need the output of
Code:
which perl
to match what's on the #! line.

on my server:
Code:
$ which perl
/usr/bin/perl
Hmm. Me too. Let me see what I broke...
Oh. You get nothing if you don't supply a max value on the command line (there should probably be a test for that):
Code:
./perl-find-missing-numbers.pl 270
(or whatever the current name of the script is...)
(your #! must be right, or you'd have got errors)

Last edited by scasey; 07-07-2017 at 05:25 PM.
 
  


All times are GMT -5. The time now is 05:52 PM.
