Old 12-11-2012, 12:01 PM   #1
atjurhs
Member
 
Registered: Aug 2012
Posts: 133

Rep: Reputation: Disabled
rows to column headers


Hi guys,

I have a very large file of column data that doesn't have any headers. It is space-separated (sometimes by several spaces), and it's too big to open in a text editor. I'll call it C.dat. So I do something like
Code:
 head -300 C.dat > littleC.dat
and I can see some of the file contents.

I have another file with row-formatted text information, about 700 rows. I'll call it R.txt. The R.txt file has the "field" information that goes with the columns in C.dat. The R.txt file looks like this:

Code:
--*******************************************************

FIELD_NAME := header_info_a;
 FIELD_DESCRITION := short_name;
 FIELD_SHOW_TRUNCATION := FALSE;
 FIELD_WIDTH := 14;
 FIELD_COLUMN := 12;
 FIELD_JUSTIFICATION := RIGHT;

--*******************************************************

FIELD_NAME := header_info_b;
 --FIELD_UNITS := "---";
 FIELD_SHOW_TRUNCATION := FALSE;
 FIELD_WIDTH := 11;
 FIELD_COLUMN := 37;
 FIELD_JUSTIFICATION := RIGHT;

--*******************************************************

FIELD_NAME := header_info_c;
 --FIELD_UNITS := "---";
 FIELD_SHOW_TRUNCATION := FALSE;
 FIELD_WIDTH := 9;
 FIELD_ROW := 1;
 FIELD_EXP := 0;
 FIELD_COLUMN := 62;
 FIELD_JUSTIFICATION := RIGHT;

--*******************************************************

FIELD_NAME := header_info_d;
 FIELD_WIDTH := 5;
 FIELD_ROW := 1;
 FIELD_COLUMN := 317;
 FIELD_JUSTIFICATION := LEFT;

--*******************************************************

etc.
So the information I need from R.txt is FIELD_NAME. What I'd like to do is strip out each FIELD_NAME string and write it to another output file along with an index number. I need the index number for another tool that filters by column index.

Each "block" of field info is in the correct order down the R.txt file even though the FIELD_COLUMN numbers are not, so I don't think I really care about the FIELD_COLUMN numbers. I can just go down the R.txt file using only the FIELD_NAME strings and place them sequentially, one per line, in the new output file.

here's what I'd like the file to look like:

Code:
1 header_info_a
2 header_info_b
3 header_info_c

etc.
here's what I've done so far:

Code:
#!/bin/bash
grep -F "FIELD_NAME :=" R.txt > temp.txt
sed '/\FIELD_NAME :=/s/FIELD_NAME :=/index_counter/g' temp > outputfile.txt
So this kinda somewhat works, but not really. I don't know how to create the index_counter, and the results of sed give more than just the FIELD_NAME strings, which I don't understand because my grep only has the FIELD_NAME string.

thanks soooo much for any help,

Tabitha
 
Old 12-11-2012, 12:21 PM   #2
atjurhs
Member
 
Registered: Aug 2012
Posts: 133

Original Poster
Rep: Reputation: Disabled
Wait wait wait, I fixed part of it! I'm so excited. Now I use

Code:
grep -F "FIELD_NAME :=" R.txt | sed '/\FIELD_NAME :=/s/FIELD_NAME :=/index_counter/g' > outputfile.txt
and this gives an outputfile.txt with:
Code:
index_counter header_info_a;
index_counter header_info_b;
index_counter header_info_c;
index_counter header_info_d;

etc.
so now all I need help with is creating the index_counter.....

thanks sooooo much,

Tabitha
 
Old 12-11-2012, 12:30 PM   #3
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,032

Rep: Reputation: 275
Try this ...
Code:
grep "FIELD_NAME :=" "$InFile" \
|cut -d'=' -f2-               \
|nl
Daniel B. Martin
 
Old 12-11-2012, 12:31 PM   #4
shivaa
Senior Member
 
Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,778
Blog Entries: 4

Rep: Reputation: 282
You can use an awk one-liner:
Code:
awk 'BEGIN{FS=" "}; /FIELD_NAME/ {gsub(/;/,"",$3); print $3}' ./R.txt| nl > /path/to/output_file
Output:
Code:
     1  header_info_a
     2  header_info_b
     3  header_info_c
     4  header_info_d
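
For readers new to awk, here is a hedged, commented sketch of what a one-liner of this kind does. The sample data is made up for illustration (only the FIELD_NAME layout comes from the first post), and the BEGIN{FS=" "} part is left out on the assumption that splitting on whitespace is awk's default anyway.

```shell
#!/bin/bash
# Build a tiny stand-in for R.txt (hypothetical sample data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
FIELD_NAME := header_info_a;
 FIELD_WIDTH := 14;
FIELD_NAME := header_info_b;
 --FIELD_UNITS := "---";
EOF

# /FIELD_NAME/     act only on lines that contain FIELD_NAME
# $3               third whitespace-separated field, e.g. "header_info_a;"
# gsub(/;/,"",$3)  delete the semicolon from that field
# nl               number the lines that awk printed
result=$(awk '/FIELD_NAME/ { gsub(/;/, "", $3); print $3 }' "$sample" | nl)
printf '%s\n' "$result"
rm -f "$sample"
```

Because nl only sees the lines that awk printed, the numbers count FIELD_NAME entries rather than lines of the original file.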

Last edited by shivaa; 12-11-2012 at 12:33 PM.
 
Old 12-11-2012, 12:40 PM   #5
atjurhs
Member
 
Registered: Aug 2012
Posts: 133

Original Poster
Rep: Reputation: Disabled
Uh oh, it looks like I have one other problem: my header strings have a prefix, so they really look like

Code:
prefix.1.header_info_a
prefix.1.header_info_b
prefix.2.header_info_c
prefix.1.header_info_d
and I still need them to look like

Code:
1  header_info_a
2  header_info_b
3  header_info_c
4  header_info_d
sorry guys!
and thanks for helping me!!!

Tabby
 
Old 12-11-2012, 12:51 PM   #6
atjurhs
Member
 
Registered: Aug 2012
Posts: 133

Original Poster
Rep: Reputation: Disabled
Hi Daniel, yours doesn't exactly work: it gives a line number for every line, not an index number for just the FIELD_NAME lines.

Hi Shivaa, yours works perfectly (except for the prefix problem), although your command is a bit beyond my scripting abilities.

could I do something like

Code:
awk 'BEGIN{FS=" "}; /FIELD_NAME/ {gsub(/;/,"",$3); print $3} | sed '/\"prefix."/s/\"prefix."//g' ' ./R.txt| nl > /path/to/output_file
and I guess I'm going to have to learn about gsub and what the BEGIN does?
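
An idea along these lines can probably work if the sed command sits outside the awk quotes and is joined to it with a pipe. The following is a minimal sketch, assuming every name starts with the literal text prefix. followed by digits and a dot, as in the examples earlier in the thread. The sample data is hypothetical.

```shell
#!/bin/bash
# Hypothetical stand-in for R.txt with prefixed names.
sample=$(mktemp)
cat > "$sample" <<'EOF'
FIELD_NAME := prefix.1.header_info_a;
 FIELD_WIDTH := 14;
FIELD_NAME := prefix.2.header_info_c;
EOF

# awk extracts the name, sed strips "prefix.<digits>.", nl adds the index.
result=$(awk '/FIELD_NAME/ { gsub(/;/, "", $3); print $3 }' "$sample" \
  | sed 's/^prefix\.[0-9]*\.//' \
  | nl)
printf '%s\n' "$result"
rm -f "$sample"
```

Each stage does one small job, which makes the pipeline easy to test one piece at a time.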

I still would like to know the general way to create an index_counter and feed it into other awk/sed/bash scripts?
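
One common way to get an index counter is to let awk keep it in a variable. This is a sketch under assumptions, not the only approach: the variable names n and name are arbitrary, the prefix pattern is guessed from the examples in this thread, and the sample data is made up.

```shell
#!/bin/bash
# Hypothetical stand-in for R.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
FIELD_NAME := prefix.1.header_info_a;
 FIELD_COLUMN := 12;
FIELD_NAME := prefix.1.header_info_b;
 FIELD_COLUMN := 37;
EOF

# n is unset (treated as 0) at the start and is incremented once per
# matching line, so it counts FIELD_NAME entries, not file lines.
result=$(awk '$1 == "FIELD_NAME" {
    name = $3
    sub(/;$/, "", name)                 # drop the trailing semicolon
    sub(/^prefix\.[0-9]+\./, "", name)  # drop the assumed prefix
    print ++n, name
}' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Since the counter is printed as the first column, any later tool in a pipeline can read it as an ordinary field.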

Last edited by atjurhs; 12-11-2012 at 12:57 PM.
 
Old 12-11-2012, 12:56 PM   #7
shivaa
Senior Member
 
Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,778
Blog Entries: 4

Rep: Reputation: 282
Alright, don't worry. Just use process substitution to achieve this, as follows:
Code:
awk 'BEGIN{FS="."} {print $3}' <(awk 'BEGIN{FS=" "}; /FIELD_NAME/ {gsub(/;/,"",$3); print $3}' R.txt) | nl > /path/to/output_file
Output:
Code:
     1  header_info_a
     2  header_info_b
     3  header_info_c
     4  header_info_d
Sure, I will give you an awk lesson later

Last edited by shivaa; 12-11-2012 at 01:01 PM. Reason: Line added
 
Old 12-11-2012, 01:08 PM   #8
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,032

Rep: Reputation: 275
Quote:
Originally Posted by atjurhs
Hi Daniel, yours doesn't exactly work: it gives a line number for every line, not an index number for just the FIELD_NAME lines.
Perhaps I misread your problem statement.

Try this ...
Code:
nl "$InFile"          \
|grep "FIELD_NAME :=" \
|cut -c1-7,21-
Daniel B. Martin

Last edited by danielbmartin; 12-11-2012 at 01:08 PM. Reason: Cosmetic improvement
 
Old 12-11-2012, 01:15 PM   #9
atjurhs
Member
 
Registered: Aug 2012
Posts: 133

Original Poster
Rep: Reputation: Disabled
Shivaa, thanks so much, that's very cool!

Is that called "nesting"? I had no idea awk could do that.

Why did you use $3? I'm not sure what that means. Is it different than $1?

Thanks again,

Tabby
 
Old 12-11-2012, 01:42 PM   #10
shivaa
Senior Member
 
Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,778
Blog Entries: 4

Rep: Reputation: 282
Quote:
Shivaa, thanks so much, that's very cool!
Thanks! You can mark the question as SOLVED (under the Thread Tools option at the top of the page).
Quote:
...is that called "nesting" I had no idea awk could do that?
It's called process substitution, which means inserting the output of one command into another.
Quote:
...why did you use $3, I'm not sure what that means, is it different than $1
In awk, $ followed by a number refers to a field of the current line: $1 is the first field and $3 is the third. A short answer to this question may leave you confused, though, so it is better to first go through the awk lessons here.
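
As a small illustration of $1 versus $3, here is a sketch that runs awk on a single line in the R.txt style. The shell variable names (line, first, third, count, dotted) are made up for this example.

```shell
#!/bin/bash
line='FIELD_NAME := header_info_a;'

# By default awk splits each line into fields on runs of whitespace.
first=$(printf '%s\n' "$line" | awk '{ print $1 }')   # the first field
third=$(printf '%s\n' "$line" | awk '{ print $3 }')   # the third field
count=$(printf '%s\n' "$line" | awk '{ print NF }')   # number of fields

# With -F. the separator is a literal dot instead of whitespace.
dotted=$(printf '%s\n' 'prefix.1.header_info_a' | awk -F. '{ print $3 }')

printf '%s\n' "$first" "$third" "$count" "$dotted"
```

So $3 was used because the name is the third whitespace-separated item on a FIELD_NAME line, after FIELD_NAME and := themselves.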

Keep smiling!

Last edited by shivaa; 12-11-2012 at 01:45 PM.
 
Old 12-11-2012, 02:13 PM   #11
atjurhs
Member
 
Registered: Aug 2012
Posts: 133

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by shivaa
It's called process substitution, which means inserting the output of one command into another.
Almost solved, one more question:

What's the difference between "process substitution, which means inserting the output of one command into another" and the | command?

I guess I'm asking for the "I will give you an awk lesson later "

Last edited by atjurhs; 12-11-2012 at 02:28 PM.
 
Old 12-11-2012, 10:42 PM   #12
shivaa
Senior Member
 
Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,778
Blog Entries: 4

Rep: Reputation: 282
For your own knowledge you can read up on process substitution, although your original question was not really about it.
Well, as per the documentation:
Quote:
Process substitution feeds the output of a process (or processes) into the stdin of another process.
So I feel you should go through the following guides once for a clear understanding:
1. Process Substitution
2. Advanced Bash-Scripting Guide
These guides are a treasure of knowledge. You'll learn many more techniques as well.
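
To make the comparison with | concrete, here is a hedged sketch. As far as I understand it, a pipe connects one command's output to the standard input of the next, while bash replaces <(command) with a file name that the outer command can open and read. The file and variable names below are made up for the example.

```shell
#!/bin/bash
# Requires bash: <( ) is not available in plain POSIX sh.
names=$(mktemp)
printf '%s\n' header_info_a header_info_b > "$names"

# Pipe: nl reads the names from its standard input.
piped=$(cat "$names" | nl)

# Process substitution: nl is handed a file name (something like
# /dev/fd/63) and reads the same data from it.
substituted=$(nl <(cat "$names"))

# Unlike a single pipe, several substitutions can feed one command.
pasted=$(paste -d' ' <(seq 1 2) <(cat "$names"))

printf '%s\n' "$piped" "$substituted" "$pasted"
rm -f "$names"
```

For a single input the two forms give the same result, so in this thread a plain pipe would have worked just as well. Process substitution earns its keep when a command wants two or more inputs as file arguments.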
 
  


All times are GMT -5. The time now is 08:20 PM.
