LinuxQuestions.org
Old 12-09-2008, 06:29 AM   #1
wtaicken
LQ Newbie
 
Registered: Dec 2008
Location: Dorset, UK
Distribution: Ubuntu 7.1
Posts: 25

Rep: Reputation: 15
Extracting part of a filename


Hi, I need to strip out a numerical character from a filename, the only rule being that it is the numerical character occurring after the word "test" in the filename. A typical example of a filename will be something like "BETA_test3_w10.dat"

I am currently doing:

CODE
if [ -z "$1" ]; then
    echo "Needs directory as argument."
    exit 1
elif [ -d "$1" ]; then
    dirname="$1"
fi

# cd to dir
cd "$1"

# list all files in dir
list="$(ls ARCH*.test)"
echo $list

# for all files in dir
for i in $list
do
    # look for number after word "test" in filename
    ?????????????????????
done
CODE

I need some code to replace the ?????????'s

cheers

W
 
Old 12-09-2008, 06:56 AM   #2
yowi
Member
 
Registered: Dec 2002
Location: Au
Distribution: Debian
Posts: 209

Rep: Reputation: 55
http://xkcd.com/208/

sorry, couldn't resist

But seriously, cut would do it, or awk with a regex, and there are probably a dozen other ways to skin this cat.
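For what it's worth, here is a rough sketch of the cut and awk routes. Both assume the name is underscore-separated and that "test" plus its number is always the second field, as in the sample filename. The function names are made up for illustration.

```shell
# Hypothetical helpers; both assume names shaped like PREFIX_test<N>_rest.dat.

# cut: take the 2nd "_"-separated field ("test3"), then drop the 4 letters of "test".
digit_with_cut() {
    printf '%s\n' "$1" | cut -d_ -f2 | cut -c5-
}

# awk: same field, but strip the leading "test" with a regex substitution.
digit_with_awk() {
    printf '%s\n' "$1" | awk -F_ '{ sub(/^test/, "", $2); print $2 }'
}

digit_with_cut BETA_test3_w10.dat
digit_with_awk BETA_test3_w10.dat
```

Both calls print 3 for the sample name. If "test" can turn up in a different field, a regex on the whole name is the safer route.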
 
Old 12-09-2008, 01:42 PM   #3
spurious
Member
 
Registered: Apr 2003
Location: Vancouver, BC
Distribution: Slackware, Ubuntu
Posts: 558

Rep: Reputation: 31
Quote:
Originally Posted by wtaicken
Hi, I need to strip out a numerical character from a filename, the only rule being that it is the numerical character occurring after the word "test" in the filename. A typical example of a filename will be something like "BETA_test3_w10.dat"
I'm not sure if this is what you had in mind, but you can use the Perl rename command from the bash prompt (Debian-based distros such as Ubuntu ship it as rename; other distros may ship a different rename with another syntax, so do 'man rename' to check):

e.g.

Code:
rename 's/test\d/test/' BETA_test3_w10.dat
This renames BETA_test3_w10.dat to BETA_test_w10.dat.
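Where the Perl rename is not installed, roughly the same substitution can be done with sed and mv. This is only a sketch: strip_test_digit is a made-up helper name, the demo runs in a scratch directory, and nothing here checks whether the new name is already taken.

```shell
# Sketch: drop the single digit after "test" using only sed and mv.
# strip_test_digit is a hypothetical helper name, not a standard command.
strip_test_digit() {
    printf '%s\n' "$1" | sed 's/test[0-9]/test/'
}

# Demo in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/BETA_test3_w10.dat"

for path in "$demo_dir"/*test[0-9]*; do
    [ -e "$path" ] || continue          # glob matched nothing
    new_name=$(strip_test_digit "$(basename "$path")")
    mv "$path" "$demo_dir/$new_name"    # no collision check in this sketch
done

ls "$demo_dir"
```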
 
Old 12-09-2008, 02:32 PM   #4
forrestt
Senior Member
 
Registered: Mar 2004
Location: Cary, NC, USA
Distribution: Fedora, Kubuntu, RedHat, CentOS, SuSe
Posts: 1,288

Rep: Reputation: 99
Try this:

Code:
ls | sed -n 's/.*test\([0-9]\).*/\1/p'
HTH

Forrest
 
Old 12-09-2008, 02:33 PM   #5
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 740
semantics!!!

When you said "strip out", I read it as "capture the character and store it". The rename solution above assumes that you meant "remove" (the latter is easier).

If you want to capture just the number after "test", does it matter how many digits?

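This gives you the single digit following "test" on each line of input (because .* is greedy, the last "test" that is followed by a digit wins). sed reads the contents of the file you name, so to test a filename itself, pipe the name in:

Code:
echo "$filename" | sed -n 's/.*test\([[:digit:]]\).*/\1/p'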
Have you looked at the SED references suggested in your other threads?
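To "capture and store" inside a script, the sed output can go into a variable with command substitution. A small sketch, assuming a single digit after "test"; first_digit_after_test is an illustrative name, not an existing command.

```shell
# Hypothetical helper: print the single digit that follows "test" in a name.
# Prints nothing when there is no match, thanks to -n and the p flag.
first_digit_after_test() {
    printf '%s\n' "$1" | sed -n 's/.*test\([[:digit:]]\).*/\1/p'
}

num=$(first_digit_after_test "BETA_test3_w10.dat")
echo "captured: $num"

# With several digits after "test", only the first one is captured:
first_digit_after_test "CACTUS3D_test67_w10.dat"
```

The second call prints 6, not 67, which is why the number of digits matters.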
 
Old 12-15-2008, 05:03 AM   #6
wtaicken
LQ Newbie
 
Registered: Dec 2008
Location: Dorset, UK
Distribution: Ubuntu 7.1
Posts: 25

Original Poster
Rep: Reputation: 15
OK, it looks like this problem is trickier than I thought. The number of digits after "test" can vary from 1 to 3. The only rule is that it occurs after the word "test" and before the second "_" in the filename.

So possible filenames are CACTUS3D_test67_w10.dat, CACTUS3D_test167_w10.dat, & CACTUS3D_test6_w10.dat

Can I get something that will handle all variations of the filename?
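One possible way to cover all three variations, sketched with sed: match "_test", then one to three digits, then the next "_", and print only the digits. The helper name number_after_test and the scratch directory are made up for the demo, and the glob assumes the files end in .dat as in the examples.

```shell
# Hypothetical helper: print the 1-3 digits between "_test" and the next "_".
number_after_test() {
    printf '%s\n' "$1" | sed -n 's/.*_test\([0-9]\{1,3\}\)_.*/\1/p'
}

# Demo in a scratch directory holding the three example names.
demo_dir=$(mktemp -d)
touch "$demo_dir/CACTUS3D_test67_w10.dat" \
      "$demo_dir/CACTUS3D_test167_w10.dat" \
      "$demo_dir/CACTUS3D_test6_w10.dat"

# Loop over a glob rather than the output of ls, so odd filenames stay intact.
for path in "$demo_dir"/*_test*.dat; do
    [ -e "$path" ] || continue            # glob matched nothing
    name=$(basename "$path")
    num=$(number_after_test "$name")
    echo "$name -> $num"
done
```

Names with no digits after "test", or with more than three, produce an empty result, so the loop can test for that before using the number.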
 
Old 01-02-2009, 04:38 AM   #7
dhanyaelizabeth
LQ Newbie
 
Registered: Jan 2009
Posts: 27

Rep: Reputation: 15
Regular expressions should do the trick. If you want, I can write a small Java application to do this for you. But on Linux bash, I'm not really sure how useful I can be on that front!

Last edited by dhanyaelizabeth; 01-10-2009 at 04:09 PM.
 
Old 01-02-2009, 09:01 AM   #8
Telemachos
Member
 
Registered: May 2007
Distribution: Debian
Posts: 754

Rep: Reputation: 60
Quote:
Originally Posted by wtaicken
OK, it looks like this problem is trickier than I thought. The number of digits after "test" can vary from 1 to 3. The only rule is that it occurs after the word "test" and before the second "_" in the filename.

So possible filenames are CACTUS3D_test67_w10.dat, CACTUS3D_test167_w10.dat, & CACTUS3D_test6_w10.dat

Can I get something that will handle all variations of the filename?
You have a slightly different problem now. If your files look like the three above, then removing the numbers after 'test' would make them all have the same name. What that means in practice is that as the program executes, the second renamed file would stomp on the first renamed file and then the third would stomp on the second. Example:
Code:
rename file1 to CACTUS3D_test_w10.dat
rename file2 to CACTUS3D_test_w10.dat # bye-bye to old file of that name
rename file3 to CACTUS3D_test_w10.dat # bye-bye to second file of that name
At the end you would have only one file. Here's what I mean:
Code:
telemachus ~/testA $ ls
CACTUS3D_test67_w10.dat   CACTUS3D_test6_w10.dat
CACTUS3D_test686_w10.dat  renamey
Those are three files like the ones you describe, plus a Perl script (renamey) to rename them without the number after "test".
Code:
#!/usr/bin/env perl
use strict;
use warnings;

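my @files = glob "*.dat";

foreach my $file (@files) {
    # Copy the name, then strip the digits between "test" and "_" from the copy.
    ( my $new_name = $file ) =~ s/test\d+_/test_/;
    # Dry run: report the new name without touching any file.
    print "$file becomes $new_name\n";
}
That code strips any digits after "test" from each name (it only prints the new names, nothing is renamed yet). Here's the result of running it: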
Code:
telemachus ~/testA $ perl renamey 
CACTUS3D_test67_w10.dat becomes CACTUS3D_test_w10.dat
CACTUS3D_test686_w10.dat becomes CACTUS3D_test_w10.dat
CACTUS3D_test6_w10.dat becomes CACTUS3D_test_w10.dat
There's only one renamed file left standing, so to speak.
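A shell counterpart to that dry run, as a sketch only: compute each target name and list the ones that more than one file would end up with. The function names are hypothetical, and it assumes filenames contain no newlines.

```shell
# Hypothetical helper: the name a file would get once the digits after "test" are dropped.
stripped_name() {
    printf '%s\n' "$1" | sed 's/test[0-9][0-9]*_/test_/'
}

# Print every target name that more than one source file maps to.
# Assumes names without newlines, since sort and uniq work line by line.
colliding_targets() {
    for name in "$@"; do
        stripped_name "$name"
    done | sort | uniq -d
}

colliding_targets CACTUS3D_test67_w10.dat CACTUS3D_test686_w10.dat CACTUS3D_test6_w10.dat
```

An empty result means the rename would be safe; any line printed is a name that several files would fight over.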

So you need to do something else in the renaming to make the renamed files different. Here's one way (though probably not the best way):
Code:
#!/usr/bin/env perl
use strict;
use warnings;
use File::Copy;

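my @files = glob "*.dat";

my $letter = 'a';
foreach my $file (@files) {
    # Replace the digits after "test" with the current letter.
    ( my $new_name = $file ) =~ s/test\d+_/test${letter}_/;
    next if $new_name eq $file;    # No digits after "test": nothing to copy.
    copy $file, $new_name
        or die "Can't copy $file to $new_name: $!";
    $letter++; # Increment letter, a -> b, b -> c, etc.
}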
Result
Code:
telemachus ~/testA $ ls
CACTUS3D_test67_w10.dat   CACTUS3D_testa_w10.dat  renamey
CACTUS3D_test686_w10.dat  CACTUS3D_testb_w10.dat
CACTUS3D_test6_w10.dat    CACTUS3D_testc_w10.dat
Three originals, three copies.

Now, did you really want to remove the numbers or just capture them?

Last edited by Telemachos; 01-02-2009 at 09:40 AM. Reason: Clarify problem, clean up code...
 
  


All times are GMT -5.
