I don't want lines that contain only the noise and none of the hits I'm after.
Let me try to clarify.
The search tool is a "grep -E" equivalent, so I don't have the -v option, the ability to pipe, etc.
Code:
grep searches (X)Happy the named input FILEs
(or (X) Frown standard input if no files are
named, or if a (X)puppy single hyphen-minus
(-) is (X) Happy given as file name) for lines
containing a match to the (X)Happy given PATTERN.
By default, grep prints the matching lines.
(X)Happy
In addition, two variant programs egrep
and fgrep are available. (X) Happy egrep is the
same as (X)Chuck grep -E. fgrep is the same as
grep -F. Direct (X) Pencil invocation as either
egrep or (X)Happy fgrep is deprecated, but is
provided to allow (X) Happy historical applications
that rely on them (X) Denny to run unmodified.
noise = "(X)Happy" or "(X) Happy"
hit = "(X)[space][word]" or "(X)[word]", where [word] is anything other than Happy
The sample file above has 14 lines.
If I had lookarounds, I would get (by line number in the sample):
1 no match
2 match for "(X) Frown"
3 match for "(X)puppy"
4 no match
5 no match
6 no match
7 no match
8 no match
9 no match
10 match for "(X)Chuck"
11 match for "(X) Pencil"
12 no match
13 no match
14 match for "(X) Denny"
So without lookarounds, using "grep -E 'regex' file", the only way I see is to build an exclusion set, and its size will vary with the actual word to be excluded and with which other words show up in the text.
Code:
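So in this example I use something like this:
'\(X\)([ ][A-Za-z]{4}[^y]|[A-Za-z]{4}[^y])'
which I think I can reduce to:
'\(X\)[ ]?[A-Za-z]{4}[^y]'
and maybe even down to:
'\(X\) ?[A-Za-z]{4}[^y]'
(The class has to be [A-Za-z] because the words in the sample start with a capital letter. The trailing [^y] is the weak spot: it drops "Happy", but it also drops "(X)puppy" and "(X) Denny" on lines 3 and 14, so this only approximates the list above.)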
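For what it's worth, the exclusion set can also be written out in full for the single noise word "Happy", with one branch per prefix of the word. The snippet below is only a sketch for checking such a pattern against the sample text. It assumes an ordinary shell with grep, cut and xargs, which the search tool in question does not offer, so the pipe is there purely for verification. The names sample, pattern and hits are made up for this example.

```shell
# Sample text from the post, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
grep searches (X)Happy the named input FILEs
(or (X) Frown standard input if no files are
named, or if a (X)puppy single hyphen-minus
(-) is (X) Happy given as file name) for lines
containing a match to the (X)Happy given PATTERN.
By default, grep prints the matching lines.
(X)Happy
In addition, two variant programs egrep
and fgrep are available. (X) Happy egrep is the
same as (X)Chuck grep -E. fgrep is the same as
grep -F. Direct (X) Pencil invocation as either
egrep or (X)Happy fgrep is deprecated, but is
provided to allow (X) Happy historical applications
that rely on them (X) Denny to run unmodified.
EOF

# Exclusion set for the noise word "Happy", one branch per prefix:
#   [A-GI-Za-z]  the first letter is not H
#   H[^a]        H followed by something other than a
#   Ha[^p], Hap[^p], Happ[^y]  same idea, one letter further each time
pattern='\(X\) ?([A-GI-Za-z]|H[^a]|Ha[^p]|Hap[^p]|Happ[^y])'

# LC_ALL=C keeps the bracket ranges plain ASCII.
# -n prefixes each matching line with its number; cut keeps the number only.
hits=$(LC_ALL=C grep -nE "$pattern" "$sample" | cut -d: -f1 | xargs)
echo "$hits"
rm -f "$sample"
```

The pattern itself is a plain ERE, so it should be usable as-is wherever "grep -E" syntax is accepted. Two caveats of this sketch: a word that merely starts with "Happy" (say "Happyland") is treated as noise too, and a word such as "Hap" sitting at the very end of a line is missed because each branch needs one more character after the prefix. The number of branches grows with the length of the excluded word, which is the size problem described above.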
This problem would make a good exam question...