LinuxQuestions.org


sam_sung 10-24-2012 04:39 AM

find one word in a line
 
Can someone help me with a command to find all one-word lines in a file and copy those lines to a new file? For example:

The file is a file

There
are 90 boys in the hall.
[tab][space] some

of them are into sports. some are in
class-of-blue.

THE OUTPUT SHOULD BE


There
[tab][space] some
class-of-blue.


[tab][space] stand for an actual tab or space character, not the literal words.

chrism01 10-24-2012 04:50 AM

Sounds a bit like homework (https://www.linuxquestions.org/linux/rules.html); can you show us what you've tried so far?

Here are some good links:
http://rute.2038bug.com/index.html.gz
http://tldp.org/LDP/Bash-Beginners-G...tml/index.html
http://www.tldp.org/LDP/abs/html/

sam_sung 10-24-2012 04:55 AM

I have tried the grep command so far.

Quote:

grep '^[ \t]*[A-Za-z0-9_]\+\.\?' file > newfile

pixellany 10-24-2012 05:30 AM

Code:

sed -r '/[[:alnum:]]+[^[:alnum:]]+[[:alnum:]]+/d' oldfile > newfile
The character class "alnum" matches only letters and digits, so this works only if the words contain no other characters. A hyphenated word such as class-of-blue. counts as several words and its line is deleted.

Also try:
Code:

sed -r '/[^[:blank:]]+[[:blank:]]+[^[:blank:]]+/d' oldfile > newfile
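To see how the two sed commands above differ, here is a minimal sketch that runs both on the sample from the first post. It assumes GNU sed (for -r); the temporary directory and the names newfile.alnum and newfile.blank are made up for the illustration, and the last pass that strips empty lines is an extra step, not part of the original commands.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
cd "$(mktemp -d)"

# Recreate the sample from the first post; "\t " is the literal tab and space.
printf 'The file is a file\n\nThere\nare 90 boys in the hall.\n\t some\n\nof them are into sports. some are in\nclass-of-blue.\n' > oldfile

# First command: the hyphen is a non-alnum separator, so "class-of-blue." is deleted too.
sed -r '/[[:alnum:]]+[^[:alnum:]]+[[:alnum:]]+/d' oldfile > newfile.alnum

# Second command: only blanks separate words, so "class-of-blue." is kept.
sed -r '/[^[:blank:]]+[[:blank:]]+[^[:blank:]]+/d' oldfile > newfile.blank

# Neither command touches empty lines; strip them in a second pass if unwanted.
sed -r '/^[[:blank:]]*$/d' newfile.blank > newfile
cat newfile
```

On this sample the final newfile holds There, the indented some, and class-of-blue., which is the output asked for in the first post.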

sam_sung 10-24-2012 06:29 AM

That worked. Thanks, mate.

David the H. 10-25-2012 04:29 PM

Please use [code][/code] tags around your code and data, to preserve the original formatting and to improve readability. Do not use quote tags, bolding, colors, "start/end" lines, or other creative techniques.

They will also allow you to insert literal tabs and spaces into the text.

Here's a more compact grep for you (actually egrep, since it needs extended regex):

Code:

grep -E '^\s*(\w|[[:punct:]])+\s*$'

In GNU grep, \s is a built-in synonym for [[:space:]], and \w is a synonym for [[:alnum:]_].

(a|b) is a regex either-or grouping, and finally * means "0 or more" and + is "1 or more".

So this pattern matches a line that consists of any amount of whitespace, followed by a string of one or more word characters and/or punctuation characters, followed by more optional whitespace, and nothing else.

See the regular expressions section of info grep for more on how character classes and backslash characters are defined.
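As a rough check of the explanation above, this sketch runs the grep on the sample from the first post and then repeats it with the backslash shorthands spelled out as POSIX character classes. It assumes GNU grep for \s and \w; the scratch directory and the file names oldfile, newfile and newfile.posix are placeholders chosen for the example.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
cd "$(mktemp -d)"

# Recreate the sample from the first post; "\t " is the literal tab and space.
printf 'The file is a file\n\nThere\nare 90 boys in the hall.\n\t some\n\nof them are into sports. some are in\nclass-of-blue.\n' > oldfile

# Keep lines made of one run of word/punctuation characters,
# with optional whitespace on either side.
grep -E '^\s*(\w|[[:punct:]])+\s*$' oldfile > newfile

# Same pattern without the GNU shorthands: \s becomes [[:space:]] and
# (\w|[[:punct:]]) becomes a single bracket expression.
grep -E '^[[:space:]]*[[:alnum:]_[:punct:]]+[[:space:]]*$' oldfile > newfile.posix

cat newfile
```

Because + requires at least one character, empty lines are not matched, so no second pass is needed to drop them. Both commands should write the same three lines: There, the indented some, and class-of-blue.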

