Old 07-07-2017, 08:05 PM   #31
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster

Quote:
Originally Posted by scasey View Post
No worries. Change line 21 from
Code:
	$file =~ s/-.*$//;	#remove from second hyphen to end
to
Code:
	$file =~ s/\..*$//;	#remove from  '.' to end
And "it will cut" (sorry..been watching Forged In Fire )
Oh yeah, OK: two periods with an escape. I was just escaping one period.

Code:
foreach $file (@files) {
	print "first $file";
	$file =~ s/FileName-//; 
	#$file =~ s/^.*?-//;    #remove from beginning to first hyphen
	print "second $file";
	#$file =~ s/-.*$//;	#remove from second hyphen to end
	$file =~ s/.mp4//;
	$existnums[$file]=$file;  #save in array
	print " third $file";
}

Then I just decided to use the actual name, and get this:
Code:
first /run/media/userx/3TB-External/Files-Resampled/FileName-168.mp4
second /run/media/userx/3TB-External/Files-Resampled/168.mp4
 third /run/media/userx/3TB-External/Files-Resampled/168
first /run/media/userx/3TB-External/Files-Resampled/FileName-176.mp4
second /run/media/userx/3TB-External/Files-Resampled/176.mp4
 third /run/media/userx/3TB-External/Files-Resampled/176
first /run/media/userx/3TB-External/Files-Resampled/FileName-179.mp4
second /run/media/userx/3TB-External/Files-Resampled/179.mp4
 third /run/media/userx/3TB-External/Files-Resampled/179
still prints out every number

let me change that to what you got and see what happens

Note:
That is a strange way to populate the array, without an incrementing index? Or is it because it is not getting a number within the array element, because of how it is getting chopped down?

Code:
$existnums[$file]=$file;  #save in array
because $file should equal a number

your new way got me this.
Code:
first /run/media/userx/3TB-External/Files-Resampled/FileName-176.mp4
second /run/media/userx/3TB
 third /run/media/userx/3TB
first /run/media/userx/3TB-External/Files-Resampled/FileName-179.mp4
second /run/media/userx/3TB
 third /run/media/userx/3TB
1
2
3
4
5
6
7
8
9
10
11
12
13
code here
Code:
## remove leading and trailing parts
foreach $file (@files) {
	print "first $file";
	#$file =~ s/FileName-//; 
	#$file =~ s/^.*?-//;    #remove from beginning to first hyphen
	
        $file =~ s/-.*$//;	
	
	print "second $file";
	#$file =~ s/-.*$//;	#remove from second hyphen to end
	#$file =~ s/.mp4//;

	$file =~ s/\..*$//;	#remove from  '.' to end
	$existnums[$file]=$file;  #save in array

	print " third $file";
}

Last edited by BW-userx; 07-07-2017 at 08:10 PM.
 
Old 07-07-2017, 08:10 PM   #32
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

This seems to work:

Code:
#!/usr/bin/perl

use strict;
use warnings;

my $dir = '/path/to/files';
opendir(my $DIRECTORY, $dir) || die "Can't open $dir: $!\n";

my %numhash = ();

for (my $min = 0; $min <= 999; $min++) {
      my @files = readdir $DIRECTORY;
      for my $this_file (@files) {
           print $this_file, "\n";
           if ($this_file =~ /^(.*?)-(\d+?)-(.*?)\.ext$/) {
                $numhash{$2} = 1;
           }
      }
}
my @sortedarray = (sort {$a <=> $b} keys %numhash);

for (my $i = $sortedarray[0]; $i < $sortedarray[-1]; $i++) {
    if (exists $numhash{sprintf "%03d", $i}) {
      next;
    } else {
      print STDERR "$i is missing!\n";
    }
}

Last edited by Laserbeak; 07-07-2017 at 08:33 PM.
 
Old 07-07-2017, 08:23 PM   #33
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Code:
 
## remove leading and trailing parts
foreach $file (@files) {
	print "first $file";
	$file =~ s/FileName-//; 
	#$file =~ s/^.*?-//;    #remove from beginning to first hyphen
	#$file =~ s/-.*?-//;	#remove from second hyphen to end	
	 
	print "second $file";
	#$file =~ s/-.*$//;	#remove from second hyphen to end
	#$file =~ s/.mp4//;
	$file =~ s/\..*$//;	#remove from  '.' to end
	$existnums[$file]=$file;  #save in array
	print " third $file";
}
get this
Code:
first /run/media/userx/3TB-External/Files-Resampled/FileName-179.mp4
second /run/media/userx/3TB-External/Files-Resampled/179.mp4
 third /run/media/userx/3TB-External/Files-Resampled/179
but I do not understand why the path is still there?

In bash it gets chopped off, leaving me with just the number inside the variable.
 
Old 07-07-2017, 08:25 PM   #34
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Quote:
Originally Posted by Laserbeak View Post
This seems to work:

Code:
#!/usr/bin/perl

use strict;
use warnings;

my $dir = '/path/to/files';
opendir(my $DIRECTORY, $dir) || die "Can't open $dir: $!\n";

my %numhash = ();

for (my $min = 0; $min <= 999; $min++) {
      my @files = readdir $DIRECTORY;
      for my $this_file (@files) {

           if ($this_file =~ /^(.*?)-(\d+?)-(.*?)\.ext$/) {
                $numhash{$2} = 1;
           }
      }
}
my @sortedarray = (sort {$a <=> $b} keys %numhash);

for (my $i = $sortedarray[0]; $i < $sortedarray[-1]; $i++) {
    if (exists $numhash{sprintf "%03d", $i}) {
      next;
    } else {
      print STDERR "$i is missing!\n";
    }
}
nope
Code:
userx%slackwhere ⚡ scripts ⚡> ./perl-number-list
Use of uninitialized value in numeric lt (<) at ./perl-number-list line 23.
Use of uninitialized value $i in numeric lt (<) at ./perl-number-list line 23.
line 23
Code:
for (my $i = $sortedarray[0]; $i < $sortedarray[-1]; $i++) {
and yes I gave it the proper path

and changed it to proper ext = mp4
Code:
\.mp4$/) {

Last edited by BW-userx; 07-07-2017 at 08:28 PM.
 
Old 07-07-2017, 08:35 PM   #35
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Quote:
Originally Posted by Laserbeak View Post
This seems to work:
never mind fixed it kind of:
Code:
   #if ($this_file =~ /^(.*?)-(\d+?)-(.*?)\.mp4$/) {
             if ($this_file =~ /^(.*?)-(\d+?)\.mp4$/) {
                $numhash{$2} = 1;
           }
this is what it spit out.
Code:
userx%slackwhere ⚡ scripts ⚡> ./perl-number-list
7813 is missing!
7814 is missing!
7815 is missing!
7816 is missing!
7817 is missing!
7818 is missing!
7819 is missing!
7820 is missing!
7821 is missing!
7822 is missing!
7823 is missing!
7824 is missing!
7825 is missing!
7826 is missing!
7827 is missing!
7828 is missing!
7829 is missing!
7830 is missing!
7831 is missing!
changed it to
Code:
for (my $min = 0; $min <= 270; $min++) {
      my @files = readdir $DIRECTORY;
270, but that didn't stop it from going nuts with the numbers.

AGAIN - fixed it: I removed some files with malformed names.
Code:
userx%slackwhere ⚡ scripts ⚡> ./perl-number-list
162 is missing!
169 is missing!
170 is missing!
172 is missing!
173 is missing!
174 is missing!
175 is missing!
181 is missing!
186 is missing!
195 is missing!
196 is missing!
197 is missing!
198 is missing!
245 is missing!
I just have not visually checked it against the list.

It's getting really late here, so I will trust your work, mark this solved, and check the numbers in the morning.

Last edited by BW-userx; 07-07-2017 at 08:43 PM.
 
Old 07-07-2017, 08:36 PM   #36
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

I added a print statement there to see if it's picking up any files. Can you run it like that with your changes? (It would probably be easier for you to just add the print $this_file, "\n"; line to whatever you have saved.)

Edit:

OK, I guess you got it working.

Last edited by Laserbeak; 07-07-2017 at 08:38 PM.
 
Old 07-07-2017, 08:46 PM   #37
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Quote:
Originally Posted by Laserbeak View Post
I added a print statement there to see if it's picking up any files. Can you run it like that with your changes? (It would probably be easier for you to just add the print $this_file, "\n"; line to whatever you have saved.)

Edit:

OK, I guess you got it working.
Yes, thanks to both of you Perl jockeys, @scasey as well.
 
Old 07-07-2017, 08:55 PM   #38
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

Quote:
Originally Posted by BW-userx View Post
never mind fixed it kind of:
Code:
   #if ($this_file =~ /^(.*?)-(\d+?)-(.*?)\.mp4$/) {
             if ($this_file =~ /^(.*?)-(\d+?)\.mp4$/) {
                $numhash{$2} = 1;
           }
this is what it spit out.
Code:
userx%slackwhere ⚡ scripts ⚡> ./perl-number-list
7813 is missing!
7814 is missing!
7815 is missing!
7816 is missing!
7817 is missing!
7818 is missing!
7819 is missing!
7820 is missing!
7821 is missing!
7822 is missing!
7823 is missing!
7824 is missing!
7825 is missing!
7826 is missing!
7827 is missing!
7828 is missing!
7829 is missing!
7830 is missing!
7831 is missing!
changed it to
Code:
for (my $min = 0; $min <= 270; $min++) {
      my @files = readdir $DIRECTORY;
270, but that didn't stop it from going nuts with the numbers.

AGAIN - fixed it: I removed some files with malformed names.
Code:
userx%slackwhere ⚡ scripts ⚡> ./perl-number-list
162 is missing!
169 is missing!
170 is missing!
172 is missing!
173 is missing!
174 is missing!
175 is missing!
181 is missing!
186 is missing!
195 is missing!
196 is missing!
197 is missing!
198 is missing!
245 is missing!
I just have not visually checked it against the list.

It's getting really late here, so I will trust your work, mark this solved, and check the numbers in the morning.


Actually, I don't think you need that loop with $min at all; it is left over from a previous thought on how to do it. There's a slang word for useless code that builds up in a program, but I forget what it is at the moment.
 
Old 07-07-2017, 09:08 PM   #39
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

Here's my final version according to your first criteria:

Code:
#!/usr/bin/perl

use strict;
use warnings;

my $dir = '/path/to/files';
opendir(my $DIRECTORY, $dir) || die "Can't open $dir: $!\n";

my %numhash = ();

my @files = readdir $DIRECTORY;
for my $this_file (@files) {

    if ($this_file =~ /^(.*?)-(\d+?)-(.*?)\.ext$/) {
        $numhash{$2} = 1;
    }
}

my @sortedarray = (sort {$a <=> $b} keys %numhash);

for (my $i = $sortedarray[0]; $i < $sortedarray[-1]; $i++) {
    if (! exists $numhash{sprintf "%03d", $i}) {
      printf STDERR "%03d is missing!\n", $i;
    }
}
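
A variation on the script above, offered as a sketch rather than a drop-in replacement: it folds in the changes worked out earlier in the thread (a single hyphen before the number, an .mp4 extension) and returns early when nothing matches, which is the situation that produces the "Use of uninitialized value" warnings. The sub name missing_numbers and the sample file names are made up for illustration, and numbers are compared numerically here, so zero padding is an assumption this version does not rely on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given a list of file names, return the numbers
# missing between the lowest and highest number found.
sub missing_numbers {
    my @names = @_;
    my %seen;
    for my $name (@names) {
        # greedy .* so any hyphens earlier in the name or path are skipped
        if ($name =~ /^.*-(\d+)\.mp4$/) {
            $seen{ $1 + 0 } = 1;    # "007" and "7" become the same key
        }
    }
    my @sorted = sort { $a <=> $b } keys %seen;
    return () unless @sorted;       # nothing matched: avoid undef warnings
    return grep { !exists $seen{$_} } $sorted[0] .. $sorted[-1];
}

# Sample names invented for the example
my @sample = qw(FileName-168.mp4 FileName-170.mp4 FileName-171.mp4 notes.txt);
print "$_ is missing!\n" for missing_numbers(@sample);   # 169 is missing!
```

To use it on a real directory, pass it the list returned by readdir instead of @sample.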

Last edited by Laserbeak; 07-07-2017 at 09:45 PM. Reason: Oops, one final change... make sure missing numbers are front padded with zeros to three digits
 
Old 07-07-2017, 09:36 PM   #40
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
Ya gotta understand that ext is just a generic name/abbreviation for extension, ya just gotta, I tell ya!

Posted from phone

Last edited by BW-userx; 07-07-2017 at 09:37 PM.
 
Old 07-07-2017, 10:18 PM   #41
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

Quote:
Originally Posted by BW-userx View Post
Ya gotta understand that ext is just a generic name/abbreviation for extension, ya just gotta, I tell ya!

Posted from phone
Yeah, I just posted according to the original theoretical specs, since you seem to know better than I do how to customize it to your system and situation.

Last edited by Laserbeak; 07-07-2017 at 10:28 PM.
 
Old 07-07-2017, 10:21 PM   #42
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

Oh, I think I just remembered the name of useless code that builds up in a program.... cruft
 
Old 07-08-2017, 01:36 AM   #43
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.8.2003
Posts: 5,384

Quote:
Originally Posted by BW-userx View Post
but I do not understand why the path is still there?

In bash it gets chopped off, leaving me with just the number inside the variable.
Ahh. I was using a working_dir of . so there was no path on the result. Still:
Code:
$file =~ s/^.*?-//;    #remove from beginning to first hyphen
should remove everything up to and including the first hyphen in the string, which takes the path with it as long as the path has no hyphens of its own.
Oh. You changed that to
Code:
$file =~ s/FileName-//;
which wouldn't account for the path in front of the file name, so it's still there.
My regex says match from the beginning of the string (^) any character (.) any number of times (*) up to the first (?) hyphen (-) and replace with nothing.
(Now that there are not two hyphens in the file name, the ? isn't needed. With your path it actually has to go: 3TB-External and Files-Resampled contain hyphens, so only the greedy s/^.*-// runs through to the last hyphen.)
Your regex says replace 'FileName-' with nothing, so you aren't removing the path/to/the/file from the result of the find command.
Rerun with your print statements, but use my regex without the ? instead. You should see:
Code:
first /run/media/userx/3TB-External/Files-Resampled/FileName-179.mp4
second 179.mp4
 third 179
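
Because the path in this thread contains hyphens of its own, it is worth running the candidate substitutions side by side. The following is a small sketch; the sub name strip_with is made up for the example, and the path is the one from the output above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $path = '/run/media/userx/3TB-External/Files-Resampled/FileName-179.mp4';

# Hypothetical helper: apply one prefix-stripping regex, then drop the extension
sub strip_with {
    my ($regex, $file) = @_;
    $file =~ s/$regex//;      # remove the leading part
    $file =~ s/\..*$//;       # remove from '.' to end
    return $file;
}

print "literal:    ", strip_with(qr/FileName-/, $path), "\n";  # path stays
print "non-greedy: ", strip_with(qr/^.*?-/,     $path), "\n";  # stops at the first hyphen, inside the path
print "greedy:     ", strip_with(qr/^.*-/,      $path), "\n";  # stops at the last hyphen: 179
```

Only the greedy form leaves the bare number for this particular path.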
It works to use
Code:
$file =~ s/.mp4//;
instead of
Code:
s/\..*$//
, but only if the extension is always mp4
[and you should escape the '.'
Code:
s/\.mp4//
because you want to match a literal '.' - not 'any one character' - although it works because a '.' is an 'any one character' ]

...but I digress. Your regex works only if the extension is always .mp4; my regex will work no matter what the extension is, matching
a literal dot (\.), any character (.), any number of times (*), to the end of the string ($).

Quote:
Note:
That is a strange way to populate the array, without an incrementing index? Or is it because it is not getting a number within the array element, because of how it is getting chopped down?

Code:

$existnums[$file]=$file; #save in array

because $file should equal a number
$file does equal a number as I coded it. If files xxx-001.mp4, xxx-004.mp4, xxx-006.mp4 exist, the array would contain
Code:
$existnums[1]=1
$existnums[4]=4
$existnums[6]=6
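The leading zeros go away because Perl converts the string to a number when it is used as an array subscript or in arithmetic, much as Excel does when you type 001 into a cell. (Printing the string itself still shows 001; $file + 0 gives 1.) It's how Perl works.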
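
A quick way to check how Perl treats a zero-padded string as an array subscript. The variable names follow the snippet above; the sample value is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @existnums;
my $file = '004';                 # what is left of xxx-004.mp4 after the regexes

$existnums[$file] = $file;        # the subscript is converted to the number 4

print "highest index: $#existnums\n";      # 4
print "stored value:  $existnums[4]\n";    # still the string 004
print "as a number:   ", $file + 0, "\n";  # 4
```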

I'm a little lost, and it's late here now. I see you used:
Code:
$file =~ s/-.*$//;
for the first substitution. That would remove everything from the hyphen to the end of the string, which would include the number we're trying to capture.

Please run with these regex's:
Code:
foreach $file (@files) {
	$file =~ s/^.*-//;     #remove from beginning to last hyphen (greedy, so hyphens in the path go too)
	$file =~ s/\..*$//;	#remove from .  to end
	$existnums[$file]=$file;  #save what's left in array
}
(I've updated my last post of the full script [#29]...)

PS
I want to look at what Laserbeak contributed, but it really is late...

Mañana

Last edited by scasey; 07-08-2017 at 01:43 AM.
 
Old 07-08-2017, 09:18 AM   #44
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (current), FreeBSD, Win10, It varies
Posts: 9,952

Original Poster
@scasey

Yeah, I just happened to look into regex briefly a little while before getting back to this post, and since it was late here I just hacked away at it without having time to really think about what I was doing.

whereas bash is just:
Code:
var=${file##*-}
# removes the longest match of *- from the front of $file,
# i.e. the path and file name up to and including the last hyphen
# /path/to/File-Name-123.mp4 gets
# 123.mp4

var=${var%.*}
# then removes the shortest match of .* from the end,
# i.e. from the last dot onward, giving me
# 123

# so, this
var=${file##*-}
var=${var%.*}
is all I needed to get the numbers.
That is easier for me to read, whereas with regex I am still just learning what the symbols mean: . (dot) is anything, ^ is at the start, $ is at the end, and then the \ and / make for choppy-looking lines like this
Code:
$this_file =~ /^(.*?)-(\d+?)-(.*?)\.ext$/)
It looks kind of like hieroglyphs to me.

Even so, I figured that one out: the file name was not actually as I had stated in the first post.

giving me this instead,
Code:
$this_file =~ /^(.*?)-(\d+?)\.mp4$/)
Because .mp4 is explicitly stated, the tail-end part is not needed:
Code:
(.*?)
That leaves it searching for just one hyphen and keeping what's between it and the extension, (\d+?)

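
For comparison, the two bash expansions above can be mirrored almost one for one in Perl. This is only a sketch: the sub name number_part is made up, and the second substitution cuts at the last dot (like ${var%.*}) rather than the first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper mirroring the two bash expansions
sub number_part {
    my ($file) = @_;
    (my $var = $file) =~ s/^.*-//s;   # like ${file##*-}: drop through the last hyphen
    $var =~ s/\.[^.]*$//;             # like ${var%.*}: drop from the last dot
    return $var;
}

print number_part('/path/to/File-Name-123.mp4'), "\n";   # 123
```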
Last edited by BW-userx; 07-08-2017 at 09:23 AM.
 
Old 07-08-2017, 09:45 AM   #45
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

Quote:
Originally Posted by BW-userx View Post
That is easier for me to read, whereas with regex I am still just learning what the symbols mean: . (dot) is anything, ^ is at the start, $ is at the end, and then the \ and / make for choppy-looking lines like this
Code:
$this_file =~ /^(.*?)-(\d+?)-(.*?)\.ext$/)
It looks kind of like hieroglyphs to me.

Even so, I figured that one out: the file name was not actually as I had stated in the first post.

giving me this instead,
Code:
$this_file =~ /^(.*?)-(\d+?)\.mp4$/)
Because .mp4 is explicitly stated, the tail-end part is not needed:
Code:
(.*?)
That leaves it searching for just one hyphen and keeping what's between it and the extension, (\d+?)
Once you get used to them, regular expressions are like second nature; you hardly have to think about them. And although the slash is traditional, you can use almost any other delimiter, such as a comma, as long as you write the leading m. So $this_file =~ m,^(.*?)-(\d+?)\.mp4$, would work too.

The question mark is there because by nature, regular expressions are greedy. If you did actually have two hyphens in your filenames like in your original post, a search like /(.*)-/ would suck up everything to the last hyphen. The question mark makes it stop at the first hyphen.
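
A short sketch of the greedy versus non-greedy difference, using a made-up file name with two hyphens:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name = 'show-012-final.ext';   # made-up name with two hyphens

my ($greedy) = $name =~ /(.*)-/;   # .* takes as much as it can
my ($lazy)   = $name =~ /(.*?)-/;  # .*? takes as little as it can

print "greedy: $greedy\n";   # show-012
print "lazy:   $lazy\n";     # show
```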


\d is just the set of digits, the + sign means 1 or more while the * sign means 0 or more.

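And you know that ^ and $ mean the beginning and end of the string ($ also matches just before a trailing newline). Add the m modifier and they match at the beginning and end of every line inside the string instead. None of this depends on $/, the input record separator: that variable controls how the <> operator splits input into lines, and it defaults to "\n". Perl's "\n" is a logical newline: when a file is read in text mode on DOS/Windows, the "\r\n" on disk is translated to "\n", and on the Classic Mac (not Mac OS X, which is UNIX) "\n" meant "\r". You can also add g and s after the s/// statement: g does a global search and replace instead of one at a time, and s lets . match newlines, so the entire thing is treated as one string.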
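
A minimal sketch of what the m, g and s modifiers change, using a made-up two-line string:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "one-1\ntwo-2\n";   # made-up two-line string

my @plain = $text =~ /^(\w+)-/g;    # ^ matches only at the start of the string
my @multi = $text =~ /^(\w+)-/mg;   # with /m, ^ matches at the start of every line

(my $first = $text) =~ s/-\d//;     # without /g only the first match is replaced
(my $all   = $text) =~ s/-\d//g;    # with /g every match is replaced

my $dot_plain = $text =~ /one.*two/  ? 1 : 0;   # . stops at the newline
my $dot_s     = $text =~ /one.*two/s ? 1 : 0;   # with /s, . matches the newline too

print "plain: @plain\n";                 # one
print "multi: @multi\n";                 # one two
print "dot: $dot_plain, dot with /s: $dot_s\n";   # 0 and 1
```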

Last edited by Laserbeak; 07-08-2017 at 10:26 AM.
 