LinuxQuestions.org

Alexr 01-19-2010 08:02 AM

Sed/awk/grep search for number string of variable length in text file
 
I need to search a text file for strings of digits of varying length that always appear between number=" and ", like:

number="1234567890"
number="22390"

I need to grab those numbers and write each one to its own line in a file.

I've already tried something with awk and that didn't seem to work.

Thanks in advance.

pixellany 01-19-2010 08:16 AM

If there is only one number per line:

Code:

sed -r 's/.*number=\"([0-9]+)\".*/\1/'  filename > newfilename
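
As written, the command above also passes through any line that has no number="..." on it, unchanged. A minimal sketch of one way to print only the extracted numbers, assuming a sed that accepts -E (both GNU and BSD sed do); the function name and the sample input are made up for illustration:

```shell
# Hypothetical helper: read text on stdin, print only the extracted numbers.
# -n suppresses sed's default output; the trailing "p" prints a line only
# when the substitution succeeded. -E is the portable spelling of GNU's -r.
extract_numbers() {
    sed -nE 's/.*number="([0-9]+)".*/\1/p'
}

# Made-up sample input: one number per line, surrounded by other markup.
printf '%s\n' \
    '<html><body>' \
    '<a id="x" number="1234567890" class="y">foo</a>' \
    '<p>no match here</p>' \
    '<b number="22390">bar</b>' | extract_numbers
```

Redirecting the output with > newfilename works the same way as in the original command.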

Alexr 01-19-2010 09:13 AM

Thanks pixellany, but I should mention that the file being read is a dynamic HTML page, so number="$numberstring" will unfortunately be buried among all sorts of other tags.

Cheers anyway

pixellany 01-19-2010 10:45 AM

I think my code will work. I said "one number per line"; that does not preclude other junk being on the line.
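
If the page ever puts more than one number="..." on the same line, a per-line sed substitution would only return one of them. A rough sketch of an alternative for that case, assuming a grep that supports -o (GNU and BSD grep do, but it is not in POSIX); the function name and the sample line are hypothetical:

```shell
# Hypothetical helper: list every number="..." value found on stdin.
# grep -o prints each match on its own line, so several matches on one
# input line become several output lines. tr -cd then deletes every
# character that is not a digit or a newline.
list_numbers() {
    grep -o 'number="[0-9][0-9]*"' | tr -cd '0-9\n'
}

# Made-up sample: two numbers on a single line of markup.
printf '%s\n' '<a number="1234567890">x</a><b number="22390">y</b>' | list_numbers
```

The single quotes around the pattern matter here: they keep the shell from removing the double quotes before grep sees them.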

Alexr 01-19-2010 11:12 AM

I found a way of doing almost exactly what I wanted. It's pretty ugly, but I'm new to Bash and to coding in general.

grep -o number="....................." filename | tr -d [:alpha:][:punct:] > newfilename

pixellany 01-19-2010 11:15 AM

No uglier than mine... ;)

Please use code tags: they stop things from being read as smilies, and they preserve formatting.

Tinkster 01-19-2010 11:24 AM

Ummm... that's ugly. Plus it will only work if what's in the
quotes is exactly 22 characters long. pixellany's solution is
much cleaner and WILL work.

devnull10 01-19-2010 11:34 AM

sed stands for Stream Editor for a reason....

schneidz 01-19-2010 12:13 PM

Here's my stab at it:
Code:

grep -o 'number="[0-9]*"' index.html | awk -F '"' '{print $2}' > numbers.txt

Alexr 01-19-2010 12:20 PM

Tinkster: It does work if the numberstring is less than 22, which it will be 99% of the time.

You have a point though, and I'm going to try pixellany's again, for cleanliness' sake.

pixellany: sorry about the smiley, I'll remember in future.

Schneidz: I'll give it a go.

Thanks all.

Tinkster 01-19-2010 01:34 PM

Quote:

Originally Posted by Alexr (Post 3832411)
Tinkster: It does work if the numberstring is less than 22, which it will be 99% of the time.

That's odd .... what your regex says is:

Code:

Show all consecutive strings that begin with
number=", followed by 22 of anything, followed
by ".

So if your number is shorter than 22 characters and it
still shows up, you must be very lucky to have
a quote in the 23rd position; otherwise it shouldn't
match the pattern.

And in fact, your code doesn't work on the
sample data you provided further up in the thread.
pixellany's does. I'm not sure whether you really want
the bare number or not, but that's what it returns.


Cheers,
Tink
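
One wrinkle that may explain the difference in results: in the grep command as posted, the double quotes are not protected by single quotes, so the shell removes them before grep ever sees the pattern. If that is what happened, the dots would be free to match the closing quote and whatever follows it on a long HTML line, while a short line such as the sample data would still fail to match. A small sketch to check what a command actually receives; show_args is a made-up helper, and the run of dots is shortened for readability:

```shell
# Hypothetical helper: print each argument exactly as the command gets it.
show_args() {
    printf '[%s]\n' "$@"
}

# Unquoted, as in the thread: the shell strips the double quotes.
show_args number="....."
# prints [number=.....]

# Single-quoted: the double quotes reach the command intact.
show_args 'number="....."'
# prints [number="....."]
```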


All times are GMT -5.