LinuxQuestions.org
Old 01-06-2011, 12:14 AM   #1
rossk
LQ Newbie
 
Registered: Dec 2010
Posts: 12

Rep: Reputation: 0
Question printing a specific word out of a file.


Hello

so i have a file that has the following

file1.txt
ID1 age_11 dog_n3 parent_dog_n1
ID1 age_7 dog_n4 parent_dog_n3
ID1 dog_n5 age_4
ID1 dog_n6 age_4
ID1 age_7 dog_n7
ID1 age_11 dog_n1
ID1 dog_n2 age_3 parent_dog_n3

and i would like the output to be
dog_n3
dog_n4
dog_n5
dog_n6
dog_n7
dog_n1
dog_n2


As you can see, I would like the output file to contain just the dogs, not the other information. But because the information is mixed up, how can I extract only the dogs? (I can't do an awk '{print }' because the dogs are found in column 2 or 3 or sometimes even 4, and the sed command is confusing me!)

please help me!
PS programming is in bash.
 
Old 01-06-2011, 01:24 AM   #2
barriehie
Member
 
Registered: Nov 2010
Distribution: Debian Lenny
Posts: 136
Blog Entries: 1

Rep: Reputation: 23
Sounds like homework, so I'll just say that it can be done with grep, uniq and cut.
 
Old 01-06-2011, 02:14 AM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,530

Rep: Reputation: 2897
Or you could use awk and just loop over the fields for something starting with 'dog'
 
Old 01-06-2011, 05:05 PM   #4
rossk
LQ Newbie
 
Registered: Dec 2010
Posts: 12

Original Poster
Rep: Reputation: 0
hahah thanks for the reply guys

But just to clarify, it is not homework, haha. I'm just trying to teach myself some bash scripting for Linux (just for my own understanding of scripts).

In reality the file has nothing to do with dogs and cats. I just have some data that I want to extract certain names and numbers from, but that is too complicated to write up in this post, so I'm using dogs/cats as a simplified example.

So far I'm trying to understand how grep works. I'm aware that you have to use grep, so grep dog file1.txt will print all lines that have dog in them. But how can it be made to print only the specific word that I want? Using grep -w "specific word" file1.txt doesn't seem to do much! Please help.
 
Old 01-06-2011, 05:08 PM   #5
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
Code:
grep -o -w dog_.. file1.txt
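For anyone following along, here is a minimal sketch of what the two options contribute, assuming GNU grep. It recreates the sample file1.txt from post #1 in a scratch directory (the `workdir` variable and the two result variable names are only illustrative) and runs the pattern with and without -w.

```shell
#!/usr/bin/env bash
# Recreate the sample data from post #1 in a scratch directory (illustrative path).
workdir=$(mktemp -d)
cat > "$workdir/file1.txt" <<'EOF'
ID1 age_11 dog_n3 parent_dog_n1
ID1 age_7 dog_n4 parent_dog_n3
ID1 dog_n5 age_4
ID1 dog_n6 age_4
ID1 age_7 dog_n7
ID1 age_11 dog_n1
ID1 dog_n2 age_3 parent_dog_n3
EOF

# -o prints only the matched text instead of the whole line.
# Without -w, the "dog_n1" inside "parent_dog_n1" is matched as well.
only_matching=$(grep -o 'dog_..' "$workdir/file1.txt")

# -w additionally requires a whole-word match: the match must not be preceded
# or followed by a letter, digit, or underscore, so "parent_dog_n1" is skipped.
whole_words=$(grep -o -w 'dog_..' "$workdir/file1.txt")

printf 'with -o only:\n%s\n\n' "$only_matching"
printf 'with -o -w:\n%s\n' "$whole_words"

rm -rf "$workdir"
```

Note that `dog_..` matches exactly two characters after the underscore, so it would need adjusting for names of a different length.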
 
Old 01-06-2011, 06:23 PM   #6
barriehie
Member
 
Registered: Nov 2010
Distribution: Debian Lenny
Posts: 136
Blog Entries: 1

Rep: Reputation: 23
I did it like this: (note that the [1-9] only allows for a single digit following the _n)
Code:
04:20:38 /home/barrie/tmp $ > grep -on dog_n[1-9] ./file1.txt | uniq -w 2 | cut -c 3-
dog_n3
dog_n4
dog_n5
dog_n6
dog_n7
dog_n1
dog_n2
04:22:02 /home/barrie/tmp $ >
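The pipeline above is easier to follow stage by stage. The sketch below assumes GNU grep and GNU uniq (the -w option of uniq is a GNU extension) and recreates the sample file in a scratch directory; the `stage1` to `stage3` variable names are only illustrative. The pattern is quoted here so the shell cannot expand `[1-9]` as a filename glob.

```shell
#!/usr/bin/env bash
# Recreate the sample data from post #1 in a scratch directory (illustrative path).
workdir=$(mktemp -d)
cat > "$workdir/file1.txt" <<'EOF'
ID1 age_11 dog_n3 parent_dog_n1
ID1 age_7 dog_n4 parent_dog_n3
ID1 dog_n5 age_4
ID1 dog_n6 age_4
ID1 age_7 dog_n7
ID1 age_11 dog_n1
ID1 dog_n2 age_3 parent_dog_n3
EOF

# Stage 1: -o prints each match on its own line, -n prefixes it with the
# line number, e.g. "1:dog_n3". Lines that also contain parent_dog_nX
# produce a second match with the same prefix.
stage1=$(grep -on 'dog_n[1-9]' "$workdir/file1.txt")

# Stage 2: uniq -w 2 compares only the first two characters ("1:", "2:", ...),
# so only the first match from each input line survives.
stage2=$(printf '%s\n' "$stage1" | uniq -w 2)

# Stage 3: cut -c 3- removes the two-character "N:" prefix.
stage3=$(printf '%s\n' "$stage2" | cut -c 3-)

printf 'stage 1:\n%s\n\nstage 3:\n%s\n' "$stage1" "$stage3"

rm -rf "$workdir"
```

Because both uniq -w 2 and cut -c 3- assume a two-character prefix, this only works while line numbers are a single digit; from line 10 onward the prefix is three characters wide. It also relies on the wanted dog being the first match on each line.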
 
Old 01-06-2011, 07:00 PM   #7
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 946
If the "dog_" field is never at the start of the line,
Code:
sed -ne 's|^.*[\t\v\f ]\(dog_[^\t\v\f ]*\).*$|\1|p' file1.txt
otherwise
Code:
sed -ne 's|^\(dog_[^\t\v\f ]*\).*$|\1|p; s|^.*[\t\v\f ]\(dog_[^\t\v\f ]*\).*$|\1|p' file1.txt
Here's the breakdown of the first pattern:
Code:
s|                    This is a replacement command, with | as the separator.
^.*                   The line may start with anything (or nothing).
[\t\v\f ]             Then there must be a tab, a vertical tab, a form feed, or a space character.
\(dog_[^\t\v\f ]*\)   Then there must be "dog_", then any number of characters other than
                      tab, vertical tab, form feed, or space.
                      This matching bit is saved as "\1" for use in the replacement.
.*$                   There may be anything or nothing up to the end of the line.
|                     Replacement follows:
\1                    The marked bit.
|                     Options follow.
p                     If there was a match, print the line after the replacement.
Because the pattern matches an entire line, only the replacement is printed.
The second pattern is the same, except it first matches the line starting with "dog_", and then elsewhere.
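As a rough sketch of the two-pattern form in action, the script below runs it on the sample data plus one extra line that starts with "dog_" (the "dog_x age_2" line is made up for illustration). It swaps the `\t\v\f ` list for the POSIX class `[[:space:]]`, which is an assumption on my part rather than the command as posted; the class also covers newline and carriage return, and does not depend on sed interpreting backslash escapes inside brackets.

```shell
#!/usr/bin/env bash
# Sample data from post #1, plus a hypothetical line that begins with "dog_".
workdir=$(mktemp -d)
cat > "$workdir/file1.txt" <<'EOF'
ID1 age_11 dog_n3 parent_dog_n1
ID1 age_7 dog_n4 parent_dog_n3
ID1 dog_n5 age_4
ID1 dog_n6 age_4
ID1 age_7 dog_n7
ID1 age_11 dog_n1
ID1 dog_n2 age_3 parent_dog_n3
dog_x age_2
EOF

# First expression: the line itself starts with "dog_".
# Second expression: "dog_" appears after some whitespace later in the line.
# -n suppresses the default output, so only lines printed by "p" are shown.
dogs=$(sed -n \
  -e 's|^\(dog_[^[:space:]]*\).*$|\1|p' \
  -e 's|^.*[[:space:]]\(dog_[^[:space:]]*\).*$|\1|p' \
  "$workdir/file1.txt")

printf '%s\n' "$dogs"

rm -rf "$workdir"
```

"parent_dog_n1" is never picked up because its "dog_" follows an underscore, not whitespace.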

I sometimes use a text editor or even pen and paper to build up a model (in my "own" markup -- basically doodling) of the desired pattern, then just write it as a regular expression. I've found that this really saves time, and makes it pretty easy to construct even complex regular expressions for sed, grep and friends.

Or, adapting from barriehie's solution,
Code:
grep -ow 'dog_[^[:space:]]*' file1.txt
does the same thing in most locales.
Nominal Animal

Last edited by Nominal Animal; 03-21-2011 at 01:44 AM.
 
Old 01-06-2011, 07:00 PM   #8
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 910
Quote:
Originally Posted by barriehie View Post
I did it like this: (note that the [1-9] only allows for a single digit following the _n)
Code:
04:20:38 /home/barrie/tmp $ > grep -on dog_n[1-9] ./file1.txt | uniq -w 2 | cut -c 3-
dog_n3
dog_n4
dog_n5
dog_n6
dog_n7
dog_n1
dog_n2
04:22:02 /home/barrie/tmp $ >
Heh ... how's that for obfuscation w/ commandline tools? =o)
 
Old 01-06-2011, 08:24 PM   #9
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,530

Rep: Reputation: 2897
So in awk it would be:
Code:
awk '{for(i=1;i<=NF;i++)if($i ~ /^dog/)print $i}' file
Edit: or even simpler
Code:
awk '/^dog/' RS="[ \n]" file
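The RS trick works by making every word its own record. Treating RS as a regular expression is an extension found in gawk and mawk; POSIX leaves a multi-character RS unspecified, so it may not behave the same in every awk. A minimal sketch of the same idea using only tr and grep, assuming the fields are separated by spaces as in the sample (variable names are illustrative):

```shell
#!/usr/bin/env bash
# Recreate the sample data from post #1 in a scratch directory (illustrative path).
workdir=$(mktemp -d)
cat > "$workdir/file1.txt" <<'EOF'
ID1 age_11 dog_n3 parent_dog_n1
ID1 age_7 dog_n4 parent_dog_n3
ID1 dog_n5 age_4
ID1 dog_n6 age_4
ID1 age_7 dog_n7
ID1 age_11 dog_n1
ID1 dog_n2 age_3 parent_dog_n3
EOF

# Step 1: turn every space into a newline so each word sits on its own line.
# -s squeezes runs of newlines, so repeated spaces do not create empty lines.
words=$(tr -s ' ' '\n' < "$workdir/file1.txt")

# Step 2: keep only the words that start with "dog".
# "parent_dog_n1" is dropped because it does not begin with "dog".
dogs=$(printf '%s\n' "$words" | grep '^dog')

printf '%s\n' "$dogs"

rm -rf "$workdir"
```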

Last edited by grail; 01-06-2011 at 08:28 PM.
 
Old 01-23-2011, 06:11 PM   #10
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 233
IMNRHO, grep -o is the only reasonable base to build this command on. It is direct & elegant.

The way the problem is stated, w/ a space being the field delimiter & "dog" the common portion, this is all that is necessary:
Code:
grep -ow 'dog[^ ]*'  file1.txt

Although there was patently no sort in the example, & it was ambiguous as to whether entries are unique, sort -u could be added:
Code:
grep -ow 'dog[^ ]*'  file1.txt  | sort -u

Sometimes I might complicate it like this to expose the logic better:
Code:
cat file1.txt  | grep -ow 'dog[^ ]*'  | sort -u

Notes:
1. I built the file1.txt & tested all 3 of the above; including w/ an add'l line of only "dog_x".

2. Using -w to allow for the "dog_" field to be 1st on the line is superior to this alternative:
Code:
grep -o '[ ^]dog[^ ]*'  file1.txt
Even though I love the symmetry, there's an unwanted space in the output most of the time.
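A small sketch comparing the two forms, assuming GNU grep (the variable names are illustrative). One detail worth knowing: inside a bracket expression, a ^ that is not the first character is a literal caret, so `[ ^]` means "a space or a caret" and does not match at the start of a line.

```shell
#!/usr/bin/env bash
# A dog in the middle of a line: the leading space is part of the match.
with_space=$(printf 'ID1 age_11 dog_n3 parent_dog_n1\n' | grep -o '[ ^]dog[^ ]*')

# A line that starts with "dog_": nothing precedes it, so '[ ^]' cannot match.
# "|| true" keeps the script going when grep finds nothing and exits with 1.
at_start=$(printf 'dog_x\n' | grep -o '[ ^]dog[^ ]*' || true)

# The -w form handles both positions without the extra space.
with_w=$(printf 'ID1 age_11 dog_n3 parent_dog_n1\ndog_x\n' | grep -ow 'dog[^ ]*')

# Brackets make leading whitespace and empty results visible.
printf 'bracket form, mid-line:   [%s]\n' "$with_space"
printf 'bracket form, line start: [%s]\n' "$at_start"
printf 'with -w:\n%s\n' "$with_w"
```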
 
  

