Old 08-03-2005, 10:49 AM   #1
little help for regexp

Hi folks

I'm a little confused by the regexp syntax.

To show the numbers from 3 to 7 I'm using:
How do I show the numbers between 3 and 70?
kregexpeditor says that [3-70] matches the digits 3 to 7 and 0 (or something like that).
Old 08-03-2005, 11:51 AM   #2
[] is for single characters

I'm no expert, but I'm pretty sure that it doesn't work that way. The [...] construct matches a single character that fits the criteria in the brackets. "70" is not a single character, is it? No, it's not. "[3-70]" is probably interpreted as "match any single character from 3 to 7, or 0". I don't have a Linux box in front of me right now to test that out, though.
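
For the original question (whole numbers from 3 to 70), one possible approach is to spell the range out as alternatives, since a bracket expression only ever matches one character. This is a minimal sketch, assuming each number sits on its own line and that grep supports extended regular expressions via -E; match_3_to_70 is just a hypothetical helper name.

```shell
# Print only the lines that consist of a whole number from 3 to 70.
# match_3_to_70 is a hypothetical helper name used for this sketch.
match_3_to_70() {
    # [3-9]       matches 3 to 9
    # [1-6][0-9]  matches 10 to 69
    # 70          matches 70 itself
    grep -E '^([3-9]|[1-6][0-9]|70)$'
}

# Sample input, one number per line; 2, 71 and 100 should be filtered out.
printf '%s\n' 2 3 7 30 69 70 71 100 | match_3_to_70
```

The ^ and $ anchors matter here: without them, 100 would match because it contains "10".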

Maybe if you explain exactly what you need to do, I could help a little bit more.

-- the dudeMAN DavE
Old 08-03-2005, 12:12 PM   #3
I think you're right

Well, I have one big wget log file and I want to see how many files are bigger than 5MB. The file sizes look like 3.3M, 5.7M, 11M, 11.3M, etc.

I'm getting the size with:
grep "Length" logfile | cut -f3 -d" " | grep M | wc -l

Any suggestions ?
Old 08-03-2005, 12:33 PM   #4
I know that awk might work well for you here. Unfortunately I can't test it out right now (as I'm on a public access internet connection running Windows 2000), but it'd be something like this:
# cat logfile | grep Length | awk '$1 > 5.0 {print $0}' | wc -l
Now, I've never tried that (and can't at the moment) and I don't know if it'd work with the 'M' appended on the end. But I do know that with awk the first part is the condition and the part in the curly braces {} is the action. It says "If the first column has a numeric value greater than 5.0, then print out the whole line". I'm assuming that field one is the column with the size value in it. The fields are separated by whitespace by default; you can change the field separator with awk's -F switch, placed before the program in single quotes.
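
To make the condition/action structure and the -F switch concrete, here is a small sketch on made-up data. The colon-separated name:size lines and the helper name big_names are assumptions for illustration only, not the wget log format.

```shell
# Print the name (field 1) of every entry whose size (field 2) exceeds 5.0.
# big_names is a hypothetical helper; input is assumed to be name:size lines.
big_names() {
    # -F: sets the field separator to a colon.
    # '$2 > 5.0' is the condition, '{ print $1 }' is the action.
    awk -F: '$2 > 5.0 { print $1 }'
}

printf '%s\n' 'a.iso:6.7' 'b.txt:0.4' 'c.tar:11' | big_names
```

Because the size fields here are plain numbers, awk compares them numerically.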

Sorry I can't think of much beyond that.

-- the duDemAN daVE
Old 08-03-2005, 01:01 PM   #5
You got my brain working on this! And it's too bad/soo sad that I don't have a Linux box in front of me! Arg!

Anyway, I found a page that explains how awk interprets comparisons (whether string or numeric):

So, from what I read, I believe that if awk is comparing "6.7M" with "5.0" it'll compare it numerically, just like we want. Here's the quote:
The numeric value of a string is the value of any prefix of the string that looks numeric; thus the value of 12.34x is 12.34, while the value of x12.34 is zero. The string value of an arithmetic expression is computed by formatting the string with the output format conversion OFMT.
If you've never used awk before, just know that $1 is the variable for the first field, $2 is for the second field, etc. And $0 is for the whole line. Because I don't know which field holds the number, I can't write the exact command for you. But you can figure that out.

Let me know what you figure out.

-- the dudemaN DAVEEe
Old 08-05-2005, 07:45 PM   #6
awk sounds good. I've made a little test.

$cat testfile

$cat testfile |awk '$1 > 5 {print $0}'

Unfortunately, it has some problems with numbers beginning with 1-4. 245M is bigger than 5M, after all. The "45M" part is probably treated as a string.
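
A likely explanation, going by the POSIX awk comparison rules: a field such as 245M does not look entirely numeric, so awk compares it with 5 as a string, and "2" sorts before "5". Adding 0 to the field forces the numeric-prefix conversion described in the quote above. A minimal sketch, with over_5 as a hypothetical helper name and the sizes taken from earlier in the thread:

```shell
# Print lines whose first field has a numeric prefix greater than 5.
# over_5 is a hypothetical helper name used for this sketch.
over_5() {
    # $1 + 0 converts "245M" to 245 before the comparison,
    # so the test is numeric rather than string-based.
    awk '$1 + 0 > 5 { print $0 }'
}

printf '%s\n' 3.3M 5.7M 11M 245M | over_5
```

With the plain '$1 > 5' test, only 5.7M would be expected to pass, which matches the trouble with numbers beginning with 1-4.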

Also, my sizes are in the format (14.6M). With the parentheses, awk treats the field as a string (I think). I can use cut -c 2,3,4,5,6 and the result will be 14.6M, but some sizes are in the format (10M), so those will come out as 10M)
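
Instead of cutting by character position, the parentheses could be removed inside awk itself, which works for any length. This is only a sketch, assuming the size is the first field on each line; strip_and_filter is a hypothetical helper name.

```shell
# Remove parentheses from the first field, then keep sizes above 5.
# strip_and_filter is a hypothetical helper name used for this sketch.
strip_and_filter() {
    awk '{
        size = $1
        gsub(/[()]/, "", size)         # "(14.6M)" becomes "14.6M"
        if (size + 0 > 5) print size   # + 0 forces a numeric comparison
    }'
}

printf '%s\n' '(14.6M)' '(10M)' '(3.3M)' | strip_and_filter
```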

Stuck again
Old 08-06-2005, 03:19 AM   #7
Change the delimiter. Since I don't know which logfile you're talking about, I can only give you the idea:
awk -FM '$1 > 5'

Printing the line is the default action, so it can be skipped.
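
Putting the pieces of this thread together, the whole count might look like the sketch below. It assumes log lines shaped like "Length: 3460300 (3.3M) [application/zip]", which is a guess at the wget format based on the earlier posts; the sample lines and the helper name count_over_5m are made up for illustration.

```shell
# Count "Length" lines whose size in megabytes is greater than 5.
# count_over_5m is a hypothetical helper; it reads the log on stdin.
count_over_5m() {
    grep 'Length' |
        cut -f3 -d' ' |        # third space-separated field, e.g. "(14.6M)"
        grep M |               # keep only sizes given in megabytes
        tr -d '()' |           # drop the parentheses
        awk -FM '$1 > 5' |     # with M as separator, $1 is a plain number
        wc -l |
        tr -d ' '              # some wc versions pad the count with spaces
}

# Made-up sample lines in the assumed log format.
printf '%s\n' \
    'Length: 3460300 (3.3M) [application/zip]' \
    'Length: 5976883 (5.7M) [application/zip]' \
    'Length: 11534336 (11M) [application/zip]' \
    'Length: 256901120 (245M) [application/zip]' |
    count_over_5m
```

With a real file this would be run as count_over_5m < logfile. Sizes reported in other units are skipped by the grep M step, just as in the original pipeline.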


