[SOLVED] delete lines that contain an EXACT word

DarkLight90 · 06-28-2012, 08:13 AM

Hi all,
this is my first post, please be patient :-)

this is my text file:

5326
71 e
68 di
39 €
26 del
25 per
25 in
22 il
22 ebay
20 eur
19 la
19 iphone
16 Â
15 nero
15 a
and so on...........

It contains the number of occurrences of each word in a text file, but I need to delete the lines containing Italian prepositions or conjunctions (like "per", "di" or "e"). I tried this:

Code:

sed -e 's/e//g' output.txt > output2.txt

but this command removes EVERY "e" CHARACTER from the file, instead of deleting only the lines where "e" is the whole word...

I also tried this:

Code:

sed -e 's/ e/n//g' output.txt > output2.txt

but nothing...

Any suggestions? Thanks

syg00 · 06-28-2012, 08:50 AM

You need to use word boundaries to ensure you get the construct you want.

You might find grep easier for that.
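
For reference, a minimal sketch of the word-boundary idea in sed, assuming the `count word` layout of output.txt shown above. Note that `s/e//g` only strips characters; the `d` command is what deletes a whole matching line. The boundary is spelled out here as "start of line or whitespace" before the word and "whitespace or end of line" after it; GNU sed also accepts the shorter `\<e\>` as an extension. The sample data and the `tmpdir` variable are made up for illustration.

```shell
# Build a small sample in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf '%s\n' '71 e' '68 di' '26 del' '22 ebay' '20 eur' > "$tmpdir/output.txt"

# Delete every line in which "e" appears as a whole whitespace-delimited word.
# With GNU sed the same idea can be written as: sed '/\<e\>/d'
result=$(sed -E '/(^|[[:space:]])e([[:space:]]|$)/d' "$tmpdir/output.txt")
printf '%s\n' "$result"

rm -r "$tmpdir"
```

Only the `71 e` line is dropped; `del`, `ebay` and `eur` survive because the "e" in them is not a word of its own.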

DarkLight90 · 06-28-2012, 09:01 AM

Thanks syg00, but I really don't know how to modify the command that builds the list I posted above:

Code:

cat webpage.txt | tr -d '[:punct:]' | tr ' ' '\n' | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn > output.txt

webpage.txt is the page whose words I need to count... What can I do to simplify my problem?

In the meantime I'll try to find a solution with egrep.

Thanks again.

montel · 06-28-2012, 09:20 AM

Not sure if you have already looked at this, since the last comment was to use grep, but I'll give some input.

The -w flag for grep will match standalone words, so if you have a line like this: (test.txt)

This is a line without that letter as a standalone word
This is a e line for testing

This command will only return the second line:

Code:

cat test.txt | grep -w "e"
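
One caveat, sketched below on made-up data: `-w` treats anything other than letters, digits and underscore as a word break, so "e" also matches in a line containing "e-mail". Because output.txt has the layout `count word`, comparing the second field exactly with awk is a possible stricter alternative. The sample lines and variable names are illustrative only.

```shell
# Hypothetical sample in the same "count word" layout as output.txt.
sample=$(printf '%s\n' '71 e' '3 e-mail' '22 ebay')

# grep -w: an "e" followed by "-" still counts as a standalone word.
grep_hits=$(printf '%s\n' "$sample" | grep -w 'e')

# awk: select only lines whose second field is exactly "e".
awk_hits=$(printf '%s\n' "$sample" | awk '$2 == "e"')

printf 'grep -w:\n%s\nawk:\n%s\n' "$grep_hits" "$awk_hits"
```

To delete instead of select, the awk test can be inverted to `$2 != "e"`.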

DarkLight90 · 06-28-2012, 09:33 AM

Yeah, you are right, montel. I was just thinking about this, but there is no "-exec" for the "sed" command, so I don't know how to build a pipe that deletes what I don't need (selected with grep)... "tr" doesn't offer this either...

Thanks anyway :-)

montel · 06-28-2012, 09:40 AM

I am not sure how to "delete" the line using grep, but this will output only the lines that do not contain that letter as its own word:

Code:

grep -vw "e" example.txt >> newExample.txt

I am not well versed in what you are using in your command, but could you do something like this?

Code:

cat webpage.txt | grep -vw "e" | tr -d '[:punct:]' | tr ' ' '\n' | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn > output.txt
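
One thing to watch in the pipeline above: at that position `grep -vw "e"` sees whole lines of webpage.txt, so it drops every word on any line that contains a standalone "e", not just the "e" itself. A possible variant, sketched here under the assumption that the goal is to exclude a list of words from the counts, filters after the text has been split into one word per line. The file name `stopwords.txt` and the sample text are hypothetical; `-x` matches whole lines, `-F` treats the patterns as fixed strings, and `-f` reads them from a file.

```shell
tmpdir=$(mktemp -d)

# Hypothetical input page and stopword list (one word per line).
printf '%s\n' 'iPhone nero e iPhone bianco, per te.' \
              'Spedizione in 24 ore: di corsa!' > "$tmpdir/webpage.txt"
printf '%s\n' 'e' 'di' 'per' 'in' > "$tmpdir/stopwords.txt"

# Split into lowercase words first, then drop lines that are exactly a stopword.
result=$(tr -d '[:punct:]' < "$tmpdir/webpage.txt" \
  | tr ' ' '\n' \
  | tr 'A-Z' 'a-z' \
  | grep -vxF -f "$tmpdir/stopwords.txt" \
  | sort | uniq -c | sort -rn)
printf '%s\n' "$result"

rm -r "$tmpdir"
```

Keeping the unwanted words in a separate file means new ones can be added without touching the pipeline.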

DarkLight90 · 06-28-2012, 09:50 AM

So great, man! :-D "-vw" was exactly what I needed from grep!
Thanks a lot!

[SOLVED]

montel · 06-28-2012, 09:52 AM

No problem, glad you figured it out