LinuxQuestions.org


verona 07-31-2012 02:46 PM

Regex problem in script
 
I have a file full of pathnames, e.g.:

//blah/blah/blah/123.txt
//blah/blah/blah/blah/456.txt
//blah/blah/789.txt


and I need to end up with :

//blah/blah/blah/123.txt 123
//blah/blah/blah/blah/456.txt 456
//blah/blah/789.txt 789

I need to do this with a command that I can put into a script, e.g. sed, but despite my fiddling I haven't managed to get it working.

Really appreciate any help.

whizje 07-31-2012 02:57 PM

You can find what you need here: http://www.gnu.org/software/sed/manu...s_0022-Command If you still have problems, post what you have tried, and people can point you in the right direction.

dmdeb 07-31-2012 03:19 PM

Quote:

Originally Posted by verona (Post 4742414)
and I need to end up with :

//blah/blah/blah/123.txt 123
//blah/blah/blah/blah/456.txt 456
//blah/blah/789.txt 789

Indeed, sed seems a good way of doing this.

In the pattern part of sed 's/pattern/substitution/':

- Find a regular expression that matches the filenames you're interested in (such as 456.txt).
- Use escaped parentheses to store the part of the matches you're interested in.

In the substitution part:

- Use \0 to print the entire match (a GNU sed extension; & is the portable equivalent) and \1 to print the stored part.

I'm not sure about your exact requirements, but maybe this is a place to start from: sed 's/\/\([0-9]*\)\.txt$/\0 \1/' input

Regards
dmdeb
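
The pattern/substitution steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: the helper name append_stem is made up for this example, the input is assumed to arrive on stdin, and the file names are assumed to consist of one or more digits followed by .txt. It uses & rather than \0 so that it does not depend on GNU sed.

```shell
# Hypothetical helper (name invented for this sketch): reads paths on stdin
# and appends the numeric part of the file name to each matching line.
append_stem() {
  # \/                 a literal slash before the file name
  # \([0-9][0-9]*\)    escaped parentheses store one or more digits as \1
  # \.txt$             the extension, anchored to the end of the line
  # & in the substitution is the entire match, \1 is the stored digits
  sed 's/\/\([0-9][0-9]*\)\.txt$/& \1/'
}

# Sample paths taken from the original question.
printf '%s\n' '//blah/blah/blah/123.txt' '//blah/blah/789.txt' | append_stem
```

Lines that do not end in digits plus .txt are passed through unchanged, which may or may not be what you want.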

ruario 08-01-2012 02:49 PM

Code:

sed -r 's,.*/([0-9]{3})\.txt,\0 \1,' inputfile

Snark1994 08-01-2012 02:56 PM

Note that ruario's code only matches a 3-digit filename with a .txt extension - if you have different requirements, you're gonna have to be more specific about what exactly you want.
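
If the requirement turns out to be "any file name ending in .txt" rather than exactly three digits, a variant along these lines might serve as a starting point. That requirement is a guess, not something the original poster stated, and the helper name append_any_stem is invented for the example. It uses -E for extended regular expressions, which both GNU and BSD sed accept (-r is the older GNU spelling).

```shell
# Hypothetical helper (name invented for this sketch): appends the file
# name without its .txt extension, whatever characters it contains.
append_any_stem() {
  # Commas are used as the s delimiter so slashes need no escaping.
  # ([^/]+)  stores everything after the last slash, up to .txt, as \1
  # \.txt$   the extension, anchored to the end of the line
  # &        the entire match, so the original path is kept intact
  sed -E 's,/([^/]+)\.txt$,& \1,'
}

printf '%s\n' '//blah/blah/blah/blah/456.txt' '//blah/7.txt' '//blah/notes.txt' | append_any_stem
```

Swapping [^/]+ for [0-9]+ would restrict it to numeric names of any length again.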


All times are GMT -5.