Elimination of lines having fewer than 3 words
Have:
Code:
how now
Code:
now is the time for
Code:
sed -r '/\w{3}.+/p' $InFile
Please advise.
Daniel B. Martin
This is off the top of my head; if it doesn't work for you, I'll be happy to develop it further:
Code:
awk -F'\\s*' 'NF > 4'
Actually, this works better:
Code:
perl -ne 'print $_ if $_ =~ /\s*(?:\w+\s+){2,}\w+/'
Code:
awk '$0 ~ /\s*(\w+\s+){2,}\w+/'
Here's what I had in mind:
Code:
#!/bin/bash
Code:
lyle@bowman:~/programming/sh$ ./lines < words.test
No awk/sed fanciness though.
Lyle.
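The body of the script above did not survive in this copy of the thread. As a stand-in, here is a minimal sketch of one shell-only way to do it, assuming a POSIX shell. This is not Lyle's original code, and the function name `keep_long_lines` is made up for illustration.

```shell
# Hypothetical reconstruction, NOT the script from the post above.
# Reads lines on stdin and prints those with at least 3 whitespace-separated words.
keep_long_lines() {
    while IFS= read -r line || [ -n "$line" ]; do
        set -f          # disable globbing so a '*' in the data is not expanded
        set -- $line    # unquoted on purpose: split the line on whitespace
        set +f
        if [ "$#" -ge 3 ]; then
            printf '%s\n' "$line"
        fi
    done
}

printf '%s\n' 'how now' 'now is the time for' | keep_long_lines
```

Saved as an executable file, the function body could be used the same way as in the post: `./lines < words.test`.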
This didn't work ...
Code:
awk -F'\\s*' 'NF > 4'
This does the job nicely ...
Code:
awk 'NF > 2'
Daniel B. Martin
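A short sketch of why `NF > 2` is enough: with the default field separator, awk splits each record on runs of blanks and ignores leading and trailing whitespace, so `NF` is the word count. The first two sample lines come from the original post; the other two and the variable names are made up for illustration.

```shell
# Sample input, one test case per line (the last two lines are illustrative).
sample=$(printf '%s\n' 'how now' 'now is the time for' 'a b c' '   padded   line   ')

# Default field splitting: NF is the number of whitespace-separated words,
# so NF > 2 keeps lines with 3 or more words.
kept=$(printf '%s\n' "$sample" | awk 'NF > 2')

printf '%s\n' "$kept"
```

The padded line is dropped even though it contains many blanks, because it still has only two fields.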
Hi.
Using egrep (or grep -E):
Code:
$ cat infile
Code:
grep '\(\w\+ \+\)\{3\}' infile
Quote (firstfire):
Code:
$ egrep '(\w+ +){3}' infile
This is my (mis)understanding.
Code:
{3} means 3 instances of (\w+ +)
Daniel B. Martin
Not quite:
Code:
\w means a word character class ... i.e. the same as [_[:alnum:]] (alphanumerics plus underscore)
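A quick way to see what `\w` covers, assuming GNU grep (`\w` and `-o` are extensions, not POSIX). The sample string and variable names are made up for illustration.

```shell
# \w treats the underscore as a word character; [[:alnum:]] does not.
with_w=$(printf 'foo_bar\n' | grep -Eo '\w+')
with_alnum=$(printf 'foo_bar\n' | grep -Eo '[[:alnum:]]+')

printf '%s\n' "$with_w"      # one token, underscore included
printf '%s\n' "$with_alnum"  # two tokens, split at the underscore
```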
Hi.
Well, as Firefox developers say, this is embarrassing. There should be '*' (a.k.a. Kleene star -- zero or more) instead of '+' (one or more):
Code:
egrep '(\w+ *){3,}'
The previous attempt (with ' +') worked on your sample data because there was no line with exactly 3 words. Had there been one, it would have needed at least one space after the last word for that RE to match:
Code:
$ echo 'a b c' | egrep '(\w+ +){3}'
EDIT: grail beats me again :)
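The trailing-space point can be checked directly. A minimal sketch, assuming GNU grep for `\w`; the helper name `matches` is made up for illustration.

```shell
# Hypothetical helper: print yes/no depending on whether line $1 matches ERE $2.
matches() {
    if printf '%s\n' "$1" | grep -Eq "$2"; then echo yes; else echo no; fi
}

no_trailing=$(matches 'a b c' '(\w+ +){3}')     # three words, no blank after the last
with_trailing=$(matches 'a b c ' '(\w+ +){3}')  # same words plus a trailing blank

echo "without trailing blank: $no_trailing"
echo "with trailing blank:    $with_trailing"
```

Each repetition of `(\w+ +)` needs at least one blank, and 'a b c' only has two, so the third repetition has nothing to match.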
Quote:
Daniel B. Martin
You might need to be a bit more specific, Daniel, about which first line of code you are referring to.
Hi, Daniel.
Again, I'm wrong:
Code:
$ echo 'how now'| sed -r 's/(\w+ *)(\w+ *)(\w+ *)/\1:\2:\3/'
It looks like the only way to do this using RE is to treat the last word separately:
Code:
$ egrep '(\w+ +){2}\w' infile
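To compare the two expressions side by side, here is a small sketch assuming GNU grep. The first two sample lines are from the original post; 'a b c' and the variable names are added for illustration.

```shell
lines=$(printf '%s\n' 'how now' 'now is the time for' 'a b c')

# ' *' may match nothing, so "how" alone satisfies three repetitions (h, o, w)
# and the two-word line slips through.
star=$(printf '%s\n' "$lines" | grep -E '(\w+ *){3,}')

# Two words, each followed by at least one blank, then the start of a third word.
final=$(printf '%s\n' "$lines" | grep -E '(\w+ +){2}\w')

printf 'star version keeps:\n%s\n' "$star"
printf 'final version keeps:\n%s\n' "$final"
```

Note that this counts runs of `\w` characters separated by spaces, so lines containing tabs or punctuation may be treated differently than by `awk 'NF > 2'`.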
Quote (firstfire):
It looks like the only way to do this using RE is to treat the last word separately:
Code:
$ egrep '(\w+ +){2}\w' infile
Quote:
I'd mark this thread as SOLVED but it already wears that badge of honor.
Daniel B. Martin