Choosing words based on letter count
Have: a file of English words, one word per line.
Sample input ...
Code:
quoth
Sample output ...
Code:
nevermore
Code:
|awk '{-F"";
Daniel B. Martin |
how about
Code:
grep -e '\(.\).*\1.*\1' |
Quote:
Daniel B. Martin |
ok: . is any character. I put it in parentheses, \(.\), so it can be referenced later. The \1 does just that: it references the string matched by the expression in \( \). In other words, \1 means "the same character as the one matched by \( \)". Between the \(.\) and the \1 references there's .*, which means that the occurrences of the matched character may be separated by zero or more other characters.
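To see the back-reference at work, here is a small sketch. The word list is made up for illustration (it is not the original poster's file), and the variable names are arbitrary.

```shell
# Hypothetical sample words, not taken from the thread's word file.
words='banana
letter
abracadabra
quoth'

# \(.\) captures one character; each \1 demands that same character again,
# with .* allowing any other characters in between.
triples=$(printf '%s\n' "$words" | grep -e '\(.\).*\1.*\1')

printf '%s\n' "$triples"
# banana
# abracadabra
```

"letter" is rejected because no single character occurs three times, even though two characters each occur twice.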
Another example of using references may be Code:
sed 's/\(.*\) \(.*\)/\2 \1/' |
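A quick way to try that sed command, using made-up input lines. Note that the first \(.*\) is greedy, so with more than two words the split happens at the last space.

```shell
# Two fields: \1 is "hello", \2 is "world".
swapped=$(printf '%s\n' 'hello world' | sed 's/\(.*\) \(.*\)/\2 \1/')
printf '%s\n' "$swapped"   # world hello

# Three fields: the greedy first group takes "one two", leaving "three" for \2.
greedy=$(printf '%s\n' 'one two three' | sed 's/\(.*\) \(.*\)/\2 \1/')
printf '%s\n' "$greedy"    # three one two
```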
Quote:
Daniel B. Martin |
Quote:
Code:
grep -e '\(.\).*\1.*\1.*\1.*\1.*\1'
Code:
grep -e '\(.\){.*\1}5' |
I was trying something like this
Code:
grep -e '\(.\)\(.*\1\)\{5\}' |
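A sketch of how the \{n\} interval behaves, assuming a grep that accepts a back-reference inside a repeated group (GNU grep does, as far as I know). The group \(.*\1\) is repeated n times after the first captured character, so the total number of occurrences required is n+1. By that count, \{5\} above asks for six occurrences, the same as the long form with five \1 references. The word list here is made up.

```shell
# Hypothetical sample words, not taken from the thread's word file.
words='banana
letter
abracadabra'

# 1 captured character + 2 repeats = at least 3 occurrences.
three=$(printf '%s\n' "$words" | grep -e '\(.\)\(.*\1\)\{2\}')

# 1 captured character + 4 repeats = at least 5 occurrences.
five=$(printf '%s\n' "$words" | grep -e '\(.\)\(.*\1\)\{4\}')

printf 'three or more:\n%s\n' "$three"
printf 'five or more:\n%s\n' "$five"
```

With this input the first pattern keeps "banana" and "abracadabra", and the second keeps only "abracadabra", which has five a's.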
Quote:
Daniel B. Martin |
Quote:
Example 1) Find words which contain at least 3 of the character in column 1, words such as "alabaster" or "abracadabra".
Code:
grep -e '\(^.\).*\1.*\1'
Next, I made the task more difficult.
Example 2) Find words which contain at least 3 of the character in column 2, such as "aardvark".
Code:
grep -e '.\(.\).*\1.*\1'
Please advise.
Daniel B. Martin |
Quote:
Code:
grep -e '.\(.\).*\1.*\1' -e '\(.\)\1.*\1' |
Quote:
Code:
# Find words which contain at least 3 of the character in column 2,
Code:
aardvark |
Sorry, I forgot the '^', which makes the pattern match only at the beginning of the line.
Code:
grep -e '^.\(.\).*\1.*\1' -e '^\(.\)\1.*\1' |
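To illustrate why the '^' matters, here is a comparison of the unanchored and anchored patterns on a made-up word list. The first -e handles the case where the column 2 character shows up twice more later in the word. The second -e handles words like "aardvark", where column 1 already holds the same character as column 2, so only one more occurrence is needed after them.

```shell
# Hypothetical sample words, not taken from the thread's word file.
words='aardvark
banana
possess
alabaster'

# Without ^ the "column 2" character can be found anywhere in the word.
unanchored=$(printf '%s\n' "$words" | grep -e '.\(.\).*\1.*\1' -e '\(.\)\1.*\1')

# With ^ the captured character really is the one in column 2.
anchored=$(printf '%s\n' "$words" | grep -e '^.\(.\).*\1.*\1' -e '^\(.\)\1.*\1')

printf 'unanchored:\n%s\n' "$unanchored"
printf 'anchored:\n%s\n' "$anchored"
```

"possess" slips through the unanchored version because of its s's, although its column 2 character "o" appears only once. The anchored version drops it and keeps "aardvark" and "banana".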
Quote:
Code:
grep -e '^.\(.\).*\1.*\1' -e '^\(.\)\1.*\1'
EDIT: I'm late again.. |
Quote:
Thanks to millgates and firstfire for timely and instructive responses. Daniel B. Martin |
Quote:
Now I want to produce the same list with each qualifying word preceded by the character which appeared three times. An example:
Code:
a aardvark
http://www.linuxhowtos.org/System/sedoneliner.htm teaches that sed may be used to emulate grep. Following that guidance I wrote this ...
Code:
# Find words which contain
Code:
# Find words which contain
I've tried various ways to extend these sed codes to perform the desired transformation, without success. Can you do it? Should I stay with grep?
Daniel B. Martin |
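One possible way to do this with sed, offered as a sketch rather than as the answer given in the thread: wrap the whole line in an outer group so it can be echoed back as \1, and let the inner group \2 capture the column 2 character. The two expressions mirror the two grep patterns above. The "-e t" in between skips the second substitution when the first has already succeeded. The word list is made up.

```shell
# Hypothetical sample words, not taken from the thread's word file.
words='aardvark
banana
possess
quoth'

# \1 = the whole word (outer group), \2 = the column 2 character (inner group).
# -n with the p flag prints only the lines where a substitution was made.
tagged=$(printf '%s\n' "$words" | sed -n \
    -e 's/^\(.\(.\).*\2.*\2.*\)$/\2 \1/p' \
    -e t \
    -e 's/^\(\(.\)\2.*\2.*\)$/\2 \1/p')

printf '%s\n' "$tagged"
# a aardvark
# a banana
```

grep alone can only select lines, so some tool that can rewrite them (sed here) is needed for the prefix.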