
prpersonal 10-19-2009 07:55 AM

Extract a substring
Hi All,
I am a newbie and not sure if this is the right forum to ask.
I just have a string "builder_test_string1ABC001_test" and I would like to extract ABC001 from it. Please note that 001 is dynamic and can change to any numerical value. I want to do this in a shell script.

Thanks for your help in advance
Ramya Srinivaas

Lordandmaker 10-19-2009 07:59 AM

What is the pattern of the bits you want to keep or destroy? Do you always want three capital letters followed by three numbers? Are they always the only capital letters? Is it always the last six characters you want? Do you always want the substring beginning ABC and finishing at the next underscore? We'd need more sample data to offer any specific help, and we'd *really* like to see what you've tried so far, or even just how you intend to approach this.

You'll likely want to look at sed, awk, and regular expressions. There are a couple of bash scripting guides on tldp which are mostly invaluable if you've any bash script writing to do.

pixellany 10-19-2009 08:14 AM

At tldp, start with the Bash Guide for Beginners.

There is also "grep -o", which returns only the matched expression (each match on its own output line).

In constructing the regular expression, we also need to know if "any numerical value" means any number of digits, i.e., do you need to match "ABC34", "ABC289057", etc.
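
To make the digit-count question concrete, here is a rough sketch of how the two readings differ with "grep -o". The sample strings are made up for illustration, and -o is a GNU/BSD extension rather than POSIX, so treat this as something to try rather than a guaranteed recipe.

```shell
# Made-up sample lines with 2 and 6 digits after ABC.
samples='x_ABC34_y
x_ABC289057_y'

# One or more digits: both lines match in full.
any_digits=$(printf '%s\n' "$samples" | grep -oE 'ABC[0-9]+')

# Exactly three digits: ABC34 is skipped, ABC289057 is cut to ABC289.
three_digits=$(printf '%s\n' "$samples" | grep -oE 'ABC[0-9]{3}')

# Two matches on one input line come out as two output lines.
per_line=$(echo 'ABC1_ABC22' | grep -oE 'ABC[0-9]+')

printf '%s\n' "$any_digits" "$three_digits" "$per_line"
```

Which of the two patterns is right depends on whether the number can ever have more or fewer than three digits.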

prpersonal 10-19-2009 08:15 AM

Hi, thanks for your reply. Actually the mentioned string is the output of a clearcase command. I am writing an automatic script which creates a directory called "ABCxxx". All other data in the string is static and will not change. The ABC can also be in lowercase. I always want the substring starting from ABC and finishing at the next underscore.

String: builder_project1_projectabc023_build
And I always want to keep abc023

Ramya Srinivaas
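
If the rest of the string really is static, one possible approach needs no external tools at all: plain POSIX parameter expansion. This is only a sketch, and it assumes the prefix is literally "builder_project1_project" as in the sample above; the variable names are made up.

```shell
s=builder_project1_projectabc023_build

# Strip the (assumed) static prefix: leaves abc023_build
rest=${s#builder_project1_project}

# Strip everything from the first underscore onward: leaves abc023
tag=${rest%%_*}

echo "$tag"
```

This works for ABC in either case, since it never looks at the letters, but it breaks as soon as the prefix changes, so a regular expression is the safer bet if the clearcase output can vary.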

schneidz 10-19-2009 09:16 AM


echo builder_project1_projectabc023_build | grep -o abc[000-999]

rn_ 10-19-2009 10:06 AM


Originally Posted by schneidz (Post 3724895)
grep -o

learned something new. Thanks.

prpersonal 10-19-2009 11:29 AM

Thanks schneidz, but it didn't work with:
echo builder_project1_projectabc023_build | grep -o abc[000-999]

but worked with a small correction:
echo builder_project1_projectabc023_build | grep -o abc[0-9][0-9][0-9]

Anyway thanks for all your help
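
For anyone wondering why the first version fell short: a bracket expression matches exactly one character out of the set it lists, so [000-999] is just a long way of writing [0-9]. A small sketch of the difference (patterns quoted here so the shell cannot expand them as filename globs):

```shell
s=builder_project1_projectabc023_build

# [000-999] = the characters 0, 0, the range 0-9, 9, 9: one digit only.
one_digit=$(echo "$s" | grep -o 'abc[000-999]')

# Three bracket expressions in a row: three digits.
three_digits=$(echo "$s" | grep -o 'abc[0-9][0-9][0-9]')

echo "$one_digit"      # abc0
echo "$three_digits"   # abc023
```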

pixellany 10-19-2009 11:52 AM

That does not match your original requirements (capital ABC, everything up to the underscore): it matches only lowercase abc and will only ever return exactly 3 digits.

try this:

grep -o '[Aa][Bb][Cc][[:digit:]]*_' fil|sed 's/_//'
Edit: "fil" is the name of the file I was using for testing.
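
A possible way to use the same pipeline inside a script, reading the string from a variable instead of a file. This is a sketch only: extract_tag is a made-up helper name, and the pattern is tightened slightly to require at least one digit, since [[:digit:]]* on its own also matches "abc_" with no number at all.

```shell
# extract_tag: hypothetical helper; prints ABC/abc plus its digits,
# taken from the string passed as the first argument.
extract_tag() {
    printf '%s\n' "$1" |
        grep -o '[Aa][Bb][Cc][[:digit:]][[:digit:]]*_' |
        sed 's/_$//'    # drop the trailing underscore
}

tag=$(extract_tag builder_project1_projectabc023_build)
echo "$tag"
```

From there the script could check that "$tag" is non-empty and then run mkdir -p "$tag" to create the directory.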

schneidz 10-19-2009 02:44 PM

^ good corrections. i should've warned that mine was quick-and-dirty.

All times are GMT -5.