sed and regexp matching (GNU sed version 4.2.1)
I would like to extract a number from a string using sed and backreferencing.
Let's say:
Code:
i='something_1234.txt'
Unfortunately, sed just ignores the + modifier. I also tried \{1,\} instead, but that doesn't work either... |
Does it have to be back-referencing? I think a quicker option would be:
Code:
sed 's/[^0-9]*//g' |
Quote:
Code:
echo 'something_1q2r3s4.txt' | sed 's/[^0-9]*//g'
Code:
1234 |
Quote:
Reading his sed made me think his intended question was "Reading left-to-right, let me capture the first numeric string."
Daniel B. Martin |
Quote:
Code:
.*\([0-9]\+\).*
The first thing sed sees in your regex is the left .*. It tries to match as many characters as possible while still letting the rest of the regex match the rest of the line. The left .* therefore matches "something_123", because that leaves one digit for the [0-9]\+ expression and nothing for the right .* (which does not need any characters to match). Only then does sed continue with [0-9]\+, which at this point can only match the last digit, because the first three have already been "eaten" by the first .*. Therefore your sed command will output just the last digit:
Code:
$ echo something_1234.txt | sed 's/.*\([0-9]\+\).*/\1/'
4
To capture the whole number, stop the leading part from consuming digits:
Code:
sed 's/[^0-9]*\([0-9]\+\).*/\1/'
or, with extended regular expressions:
Code:
sed -r 's/[^0-9]*([0-9]+).*/\1/' |
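For anyone who wants to see the greediness effect side by side, here is a minimal sketch. It assumes GNU sed (both \+ in basic regexes and the -r switch are GNU features), and the variable names are only illustrative.

```shell
s='something_1234.txt'

# Greedy leading .* swallows everything except the last digit.
greedy=$(printf '%s\n' "$s" | sed 's/.*\([0-9]\+\).*/\1/')

# [^0-9]* cannot consume digits, so the group captures the whole run.
fixed=$(printf '%s\n' "$s" | sed 's/[^0-9]*\([0-9]\+\).*/\1/')

# Same idea with extended regular expressions (no backslashes needed).
fixed_ere=$(printf '%s\n' "$s" | sed -r 's/[^0-9]*([0-9]+).*/\1/')

printf '%s\n' "$greedy" "$fixed" "$fixed_ere"
```

With the sample string this should print 4, then 1234 twice.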
Quote:
That regexp suggested by sycamorex is perfectly fine. I tend to overdo my regexps because I don't use them very often. :) And thanks for the explanation about greediness, millgates. |
Quote:
Code:
tr -dc '0-9'
d and c are options for tr (translate). "d" says "discard". "c" says "complement". So tr -dc '0-9' says "discard all characters other than 0 through 9."
Now you might run this tr against a file and want to preserve the newline characters. In that case, use
Code:
tr -dc '\n0-9'
Daniel B. Martin |
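To make the difference between the two tr calls concrete, here is a small sketch. The two-line sample input is made up for illustration and is not from the thread.

```shell
# Without \n in the kept set, both lines collapse into one string.
joined=$(printf 'a_12.txt\nb_34.txt\n' | tr -dc '0-9')

# With \n in the kept set, each input line keeps its own result.
per_line=$(printf 'a_12.txt\nb_34.txt\n' | tr -dc '\n0-9')

printf '%s\n' "$joined"
printf '%s\n' "$per_line"
```

The first printf should show 1234 on one line, the second 12 and 34 on separate lines.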
Of course, the solutions by sycamorex and danielbmartin will only work correctly if there's a single set of digits in the string, as they simply delete anything that isn't a number. A string like "1234_something_1234.txt" would end up as "12341234".
But assuming that's OK, then you don't even need to use an external tool. As long as the string is already in a variable, just use simple parameter substitution.
Code:
i='something_1234.txt'
Code:
i='something_1234.txt'
See here for plenty more string manipulations. |
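As a rough sketch of the no-external-tool idea, the snippet below uses only POSIX prefix and suffix removal. It is not necessarily the substitution the post above had in mind, and it assumes the number always sits between the last underscore and the first following dot. In bash, something like ${i//[^0-9]/} would instead delete every non-digit, with the same caveat about strings that contain more than one number.

```shell
i='something_1234.txt'

# Remove the longest prefix ending in "_"  -> "1234.txt"
rest=${i##*_}

# Remove the longest suffix starting at the first "."  -> "1234"
num=${rest%%.*}

printf '%s\n' "$num"
```

Because it relies on the layout of the name rather than on what counts as a digit, this only suits file names that follow that pattern.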