LinuxQuestions.org
Old 11-12-2010, 08:43 AM   #1
taskmaster
LQ Newbie
 
Registered: Jul 2006
Location: Vanceboro, NC
Distribution: CentOS
Posts: 10

Rep: Reputation: 0
Simple awk question search for string1 and extract string2


Hi All,

Haven't used Awk in a while, this is probably simple?

have a file with the following line:

<xref image="00001234.tif|V3|1999:11:19:22:13|49487|0"> image: </xref>

want to use awk to search for: <xref image="
??? /<xref image=\"/ ???

then set a variable called imagename to the next 12 characters
??? help ???

******************

Also looking for a way to display a character based text file's hidden characters.

Thanks in advance for your assistance.
 
Old 11-12-2010, 08:59 AM   #2
ee437
LQ Newbie
 
Registered: Oct 2010
Location: Wisconsin, USA
Distribution: Slackware
Posts: 12

Rep: Reputation: 4
If the filename is always followed by a "|" character, this might work:

Code:
awk -F \" '$1=="<xref image=" {print $2}' file.in.name | \
awk -F "|" '{print $1}'
(assuming you want to use awk). The -F \" declares the delimiter as a double quote. The first line takes every line in file.in.name that begins with

<xref image=

and prints what sits between the double quotes; the second line (note the \ to continue the line) uses awk with | as the delimiter and prints everything up to the first |.

In the first line I escaped the -F delimiter with a backslash instead of quoting it, but I did quote it in the second. The awk script itself sits between two single quotes.

I didn't try the whole script; I hope that helps.
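
The same field-splitting idea can probably be done in a single awk call by giving -F a bracket expression that matches either a double quote or a pipe. This is only a sketch: the sample line is the one from the first post, and the shell variable names line and imagename are just for illustration.

```shell
# Sample line from the original question, held in a variable for the demo.
line='<xref image="00001234.tif|V3|1999:11:19:22:13|49487|0"> image: </xref>'

# Split each record on either a double quote or a pipe.
# $1 is then '<xref image=' and $2 is the file name.
imagename=$(printf '%s\n' "$line" |
    awk -F'[|"]' '$1 == "<xref image=" { print $2 }')

echo "$imagename"    # prints 00001234.tif
```

Like the two-step version, this assumes the line starts with <xref image= and that the file name is always followed by a |.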
 
Old 11-12-2010, 09:04 AM   #3
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Code:
VARIABLE=$(awk '/^<xref image=".*<\/xref>/{print substr($0,14,12)}' input_file)
Try that...
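
The 14 in substr() works because <xref image=" is 13 characters long and sits at the very start of the line. If the tag might not start in column 1, a variation is to let awk find it with index() and take the 12 characters that follow. A rough sketch, using the sample line from the first post (the awk variable names tag and pos are made up for the example):

```shell
line='<xref image="00001234.tif|V3|1999:11:19:22:13|49487|0"> image: </xref>'

imagename=$(printf '%s\n' "$line" | awk '
    BEGIN { tag = "<xref image=\"" }          # the marker to search for
    {
        pos = index($0, tag)                  # 0 when the marker is absent
        if (pos > 0)
            print substr($0, pos + length(tag), 12)
    }')

echo "$imagename"    # prints 00001234.tif
```

This still assumes the image name is always exactly 12 characters, as stated in the question.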

As for the second issue, what "hidden characters" are you referring to, and in what sort of "text based" file? Is it a text file, or some other sort of file? How do you know they are there, if they are hidden?
 
Old 11-12-2010, 10:03 AM   #4
taskmaster
LQ Newbie
 
Registered: Jul 2006
Location: Vanceboro, NC
Distribution: CentOS
Posts: 10

Original Poster
Rep: Reputation: 0
Thanks both for your responses, I might try the second approach initially as a computer program generated these files and the structure is set in stone, so to speak, at least for the part of the line that I included here.

Regarding the second question: if I vi the files I see funny stuff, if I display the file in XTerm I see other funny stuff, etc.

Additionally the cursor position counter in vim goes nutty when going across what appears to be empty space: if line reads "THORPE KISHIE C " the cursor position on the C is 1,21 and the cursor positioned on the space after C is 1,22 and the cursor on the next space goes to 1,24-23 then 1,26-24 then 1,28-25 then 1,30-26 etc 1,+2 - +1

Additionally, these files, although named .txt, were never intended for someone to look at them raw. They were only intended for use by a program's backend, with a GUI for the victim, I mean user.

Thanks Again Guys
 
Old 11-12-2010, 10:25 AM   #5
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
For the funny characters, maybe try:
Code:
cat -v file
The -v option "shows non-printing characters" using CTRL (^) and META (M-) notation. Pipe it into `less` to put it on screen in an easily scrollable fashion.

If that doesn't work, let us know; someone will have another idea.
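
To get a feel for the notation, here is a small made-up sample (not data from the real file): a line containing a 0xA0 byte and a carriage return, run through cat -v. The variable name visible is just for the demo.

```shell
# Made-up input: "NAME", one byte 0xA0 (octal 240), a carriage return, newline.
visible=$(printf 'NAME\240\r\n' | cat -v)

# The 0xA0 byte shows up as "M- " (M- followed by a space),
# and the carriage return shows up as ^M.
printf '%s\n' "$visible"    # prints NAMEM- ^M
```

Note that plain -v leaves tabs and line ends alone; cat has separate options for those.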
 
Old 11-12-2010, 11:51 AM   #6
taskmaster
LQ Newbie
 
Registered: Jul 2006
Location: Vanceboro, NC
Distribution: CentOS
Posts: 10

Original Poster
Rep: Reputation: 0
Well it is some ugly stuff in that file, but again I can work around it. I might close this thread and be back for assistance on a new thread. Want a sneak preview? OK!

The part of the line I am interested in grabbing looks like this:
..."STUBER ROBERT J "...

with all those spaces being as follows when you cat -v
..."STUBER ROBERT J- M- M- M- M- M- M- M etc - M- M- "...

I want awk to fill a name variable with "Stuber Robert J" and stop at the J, based on the next character being something other than A-Z. It's almost like I need a hex editor to see what that first character is, so I know where to stop filling the variable. Is it really a dash, or is it just showing as a dash with the cat -v command on screen?

Take care all.
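
On the dash question: in cat -v output, "M-" is only notation for a byte with the high bit set, so the dash is most likely not in the file at all. A hex editor isn't strictly needed to check; od can dump the raw bytes. The sketch below uses a hypothetical sample (a J followed by two UTF-8 non-breaking spaces and a closing quote), since the real bytes can only be known by dumping the real file.

```shell
# Hypothetical sample bytes: J, then 0xC2 0xA0 twice, then a double quote.
# -An drops the offset column, -tx1 prints one hex value per byte.
bytes=$(printf 'J\302\240\302\240"' | od -An -tx1)

# Unquoted on purpose so the shell squeezes od's padding to single spaces.
echo $bytes    # prints 4a c2 a0 c2 a0 22
```

On the real file, something like grep for the name piped into od -An -tx1 would show which byte follows the J.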
 
Old 11-12-2010, 11:58 AM   #7
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Using regular expressions, you can match against non-printing characters too. So for example, if you had the word "John" followed by a bunch of hidden garbage, you could match for that.
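
As a rough sketch of that idea in awk: cut the field at the first byte that is neither A-Z nor a plain space, then trim trailing spaces. The sample below is hypothetical (a name padded with 0xC2 0xA0 pairs); the padding bytes in the real file may well be different, and the variable names field and name are made up for the example.

```shell
# Hypothetical field: a name padded with three 0xC2 0xA0 byte pairs.
field=$(printf 'STUBER ROBERT J\302\240\302\240\302\240')

# LC_ALL=C makes awk treat the input as plain bytes.
name=$(printf '%s\n' "$field" | LC_ALL=C awk '{
    sub(/[^A-Z ].*/, "")   # cut from the first byte that is not A-Z or space
    sub(/ +$/, "")         # drop any plain trailing spaces
    print
}')

echo "$name"    # prints STUBER ROBERT J
```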

For the time being (and if it's all the same to you), I suggest continuing the discussion in this thread, since I am guessing there may yet be more to this situation. Put another way: don't start a new thread if it is to continue this one; instead, leave this one un-SOLVED until you're sure there's not more to this issue. It'll help people find your thread(s) more easily, and in fewer locations, when they search for things on this subject.

If it's a new issue, then by all means, make a new thread. Actually, sed comes to mind for your "Robert Stuber" issue, but we'll know more when you tell us more about this.

Also, a suggestion: when posting chunks of data files, text files, especially when whitespace is relevant in the formatted listing of the data, please put the data in code tags. You can see their usage here:
http://www.phpbb.com/community/faq.php?mode=bbcode#f2r1

Keep us posted!
 
  

