LinuxQuestions.org
Old 10-11-2010, 01:23 AM   #1
flamingo_l
Member
 
Registered: Jul 2010
Posts: 41

Rep: Reputation: 0
Comparing and Formatting the text file


Hi,

I need a script that can format the text file below, which contains comment blocks.


Code:
 
file1.txt
--------
 
START
Name: some value
Date:
Function Name: .....
...................
Changes:.............
.....................
END
START
Date:
Name: some value
Function Name: .....
...................
Changes:.............
.....................
END
.................
...................



The output should be:


Code:
Name|Date|Function Name|Changes
The script should match each sub-heading to its column and print the values in the format shown above.

I am working on this; can anybody please help?

Last edited by flamingo_l; 10-11-2010 at 01:24 AM.
 
Old 10-11-2010, 02:29 AM   #2
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,512

Rep: Reputation: 1895
Happy to help ... what have you got so far? Are you using a particular app/script/language?
 
Old 10-11-2010, 04:10 AM   #3
flamingo_l
Member
 
Registered: Jul 2010
Posts: 41

Original Poster
Rep: Reputation: 0
Yes, a Linux script.

I have the following awk script, but it only works for 3 columns.

I need it to handle the 4 columns in the sample file above.

Quote:
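BEGIN {
    FS = ":"; OFS = "|"
    num = split("Name,Date,Changes", cols, ",")
    print cols[1], cols[2], cols[3]
}
{
    s = $2
    sub(/^ */, "", s)
    if ($1 == "END") {
        print res[1], res[2], res[3]
        split("", res)    # clear the record before the next START
    } else {
        # a line without ":" continues the Changes text
        # (was "NR=1", which assigns to NR and is always true)
        if (res[3] != "" && NF == 1)
            res[3] = res[3] " " $1
        for (i = 1; i <= num; i++) {
            if (cols[i] == $1)
                res[i] = s
        }
    }
}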
Sample file:
Quote:
START
Name: some value
Date:
Changes:Change A
more of change A
END

Now I am trying to apply it to four fields, and it is not working.

I need this urgently. Please help me.
 
Old 10-11-2010, 04:26 AM   #4
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,512

Rep: Reputation: 1895
I am not sure I understand. Do you want the words Date, Name, etc., or the values they refer to? Or are the words you have written placeholders for real data, i.e. is Date an actual date?

If your sample is:
Code:
START
Name: some_name
Date: some_date
Function Name: some_function
Changes:Change A
more of change A
END
What would be the output you require, based on this as input?
 
Old 10-11-2010, 04:47 AM   #5
flamingo_l
Member
 
Registered: Jul 2010
Posts: 41

Original Poster
Rep: Reputation: 0
Hi grail,

The code I posted above works for your sample file.

If the function name spans more than one line, the code does not work properly.

I need code for this sample file:
Quote:
START
Name: some_name
Date: some_date
Function Name: some_function_name(jjjjjjjjj,
fjddddd, gggg, ggg)
Changes:Change A
more of change A
END
START
Date: some_date
Name: some_name
Function Name: some_function_nameB(jjjjjjjjj,
fjddddd, gggg, ggg)
Changes:Change B
more of change B
END
Also, the order of the sub-headings (Name, Date, Function Name, Changes) may vary.

Last edited by flamingo_l; 10-11-2010 at 05:02 AM.
 
Old 10-11-2010, 05:53 AM   #6
flamingo_l
Member
 
Registered: Jul 2010
Posts: 41

Original Poster
Rep: Reputation: 0
To make the values of Function Name and Changes easier to extract, I can, if needed, mark the start and end of those sub-heading values with delimiters, as follows.


Quote:
START
Name: some_name
Date: some_date
Function Name: <some_function_name(jjjjjjjjj,
fjddddd, gggg, ggg)>
Changes:<Change A
more of change A>
END
START
Date: some_date
Name: some_name
Function Name: <some_function_nameB(jjjjjjjjj,
fjddddd, gggg, ggg)>
Changes:<Change B
more of change B>
END

Since I am new to awk programming, I do not know how to walk through the contents of a given field.
 
Old 10-11-2010, 07:29 AM   #7
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,512

Rep: Reputation: 1895
See what ya think:
Code:
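#!/usr/bin/awk -f

BEGIN {
    # A multi-character RS is treated as a regex by gawk;
    # POSIX leaves that behaviour unspecified.
    RS = "(START|END)\n"
    FS = ":"
    OFS = "|"
}

NF > 0 {
    counter++
    for (indx = 1; indx <= NF; indx++) {
        if ($indx ~ "\n") {
            # The field holds the previous heading's value, with the
            # next heading as its last line.
            n = split($indx, pieces, "\n")
            if (n == 2)
                arr[counter, val] = pieces[1]
            else
                for (z = 1; z < n; z++)
                    if (z > 1)
                        arr[counter, val] = arr[counter, val] "\n" pieces[z]
                    else
                        arr[counter, val] = pieces[z]

            val = pieces[n]    # heading for the next value
        } else
            val = $indx

        if (!(val in array_vals) && val != "")
            array_vals[val]++
    }
}

END {
    for (y = 1; y <= counter; y++)
        for (u in array_vals)
            print u, arr[y, u]
}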
May need some refining but seems to work for given examples
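
A rough sketch for anyone tracing the script above (not part of the original post): it prints every field of every record exactly as the RS and FS settings carve them up, with embedded newlines shown as `<NL>`. The function name `show_fields` is made up for illustration, and it assumes an awk that accepts a regular expression as RS (gawk does; POSIX does not require it).

```shell
#!/bin/sh
# show_fields: hypothetical helper, for illustration only.
# Assumes the installed awk treats a multi-character RS as a regex (e.g. gawk).
show_fields() {
    awk '
    BEGIN { RS = "(START|END)\n"; FS = ":" }
    NF > 0 {
        rec++
        for (i = 1; i <= NF; i++) {
            f = $i
            gsub(/\n/, "<NL>", f)    # make embedded newlines visible
            printf "record %d, field %d: [%s]\n", rec, i, f
        }
    }' "$@"
}
```

Running `show_fields file.txt` on the sample from post #5 should show field 2 of the first record as `[ some_name<NL>Date]`: each field carries the previous heading's value followed by the next heading, which is why the script splits every field on the newline and keeps the last piece as the next heading.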
 
Old 10-12-2010, 01:42 AM   #8
flamingo_l
Member
 
Registered: Jul 2010
Posts: 41

Original Poster
Rep: Reputation: 0
Hi grail,

The code is not working.

I saved the code you gave in a file named awk-script and removed the first line, since it was throwing an error.
The sample file is saved as file.txt.

I ran it as follows:

Quote:
awk -f awk-script file.txt

But the output is:

Quote:
Function Name| <some_function_name(jjjjjjjjj,
fjddddd, gggg, ggg)>
Date| some_date
Changes|<Change A
more of change A>
Name| some_name
Function Name| <some_function_nameB(jjjjjjjjj,
fjddddd, gggg, ggg)>
Date| some_date
Changes|<Change B
more of change B>
END
Name| some_name
But the expected output is:


Quote:
Name|Date|Function Name|Changes
some_name|some_date|some_function_name(jjjjjjjjj,fjddddd, gggg, ggg)|Change A more of change A
some_name|some_date|some_function_nameB(jjjjjjjjj,fjddddd, gggg, ggg)|Change B more of change B
 
Old 10-12-2010, 01:46 AM   #9
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,512

Rep: Reputation: 1895
So you will need to do the formatting part yourself, but as you can see, the data in both is mostly equivalent. The only part you need to look at for formatting is the block enclosed by:

END{}
 
Old 10-12-2010, 07:44 AM   #10
flamingo_l
Member
 
Registered: Jul 2010
Posts: 41

Original Poster
Rep: Reputation: 0
Hi grail,

I need to understand the logic of the script so that I can format the output accordingly.
Could you please explain it to me?
 
Old 10-12-2010, 08:51 AM   #11
flamingo_l
Member
 
Registered: Jul 2010
Posts: 41

Original Poster
Rep: Reputation: 0
Hi grail,

I have an idea: instead of writing an awk script for the formatting, can we merge the continuation lines in Changes and Function Name, so that the awk script I posted above can do the formatting?

Suppose I have a file like this:


Code:
Name: some_name
Date: some_date
Function Name: <some_function_name(jjjjjjjjj,
fjddddd, gggg, ggg)>
Changes:<Change A
more of change A>
Name: some_name
Date: some_date
Function Name: some_function_nameB(jjjjjjjjj,
fjddddd, gggg, ggg)
Changes:Change B
more of change B
I need a script that can merge the lines under each sub-heading.
The expected output is:


Code:
Name: some_name
Date: some_date
Function Name: some_function_name(jjjjjjjjj,fjddddd, gggg, ggg)
Changes:Change A more of change A
Name: some_name
Date: some_date
Function Name: some_function_nameB(jjjjjjjjj,fjddddd, gggg, ggg)
Changes:Change B more of change B
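
One way to sketch the merge described above (a hedged example, not something posted in the thread): treat any line that begins with a known sub-heading as the start of a new output line, and append every other line to the current one. The function name `merge_lines` and the fixed heading list are assumptions taken from the samples here. Continuation lines are joined with a single space, so the function-name value comes out as `jjjjjjjjj, fjddddd` rather than `jjjjjjjjj,fjddddd`, and the optional `<...>` delimiters are dropped when both are present.

```shell
#!/bin/sh
# merge_lines: hypothetical helper; the heading list is an assumption
# based on the sample files in this thread.
merge_lines() {
    awk '
    # Print the buffered logical line, dropping optional <...> delimiters.
    function emit() {
        if (buf == "") return
        if (match(buf, /^[^:]*: *</) && buf ~ />$/)
            buf = substr(buf, 1, RLENGTH - 1) \
                  substr(buf, RLENGTH + 1, length(buf) - RLENGTH - 1)
        print buf
        buf = ""
    }
    # A known sub-heading starts a new logical line.
    /^(Name|Date|Function Name|Changes):/ { emit(); buf = $0; next }
    # START/END markers, if present, pass through unchanged.
    /^(START|END)$/ { emit(); print; next }
    # Anything else continues the current sub-heading.
    { buf = (buf == "" ? $0 : buf " " $0) }
    END { emit() }
    ' "$@"
}
```

Usage would be `merge_lines file.txt > merged.txt`. A continuation line that itself starts with something like `Date:` would be mistaken for a new sub-heading, so this only suits input shaped like the samples.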
 
Old 10-12-2010, 10:17 AM   #12
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,512

Rep: Reputation: 1895
Using your input from post #5 and the small changes made to the script below, see the output:
Code:
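#!/usr/bin/awk -f

BEGIN {
    # A multi-character RS is treated as a regex by gawk;
    # POSIX leaves that behaviour unspecified.
    RS = "(START|END)\n"
    FS = ":"
    OFS = "|"
}

NF > 0 {
    counter++
    for (indx = 1; indx <= NF; indx++) {
        if ($indx ~ "\n") {
            # The field holds the previous heading's value, with the
            # next heading as its last line.
            n = split($indx, pieces, "\n")
            if (n == 2)
                arr[counter, val] = pieces[1]
            else
                for (z = 1; z < n; z++)
                    if (z > 1)
                        arr[counter, val] = arr[counter, val] " " pieces[z]
                    else
                        arr[counter, val] = pieces[z]

            val = pieces[n]    # heading for the next value
        } else
            val = $indx

        if (!(val in array_vals) && val != "")
            array_vals[val]++
    }
}

END {
    print "Name|Date|Function Name|Changes"
    for (y = 1; y <= counter; y++)
        print arr[y, "Name"], arr[y, "Date"], arr[y, "Function Name"], arr[y, "Changes"]
}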
Run script as:
Code:
./script.awk input_file
Output based on input above:
Code:
Name|Date|Function Name|Changes
 some_name| some_date| some_function_name(jjjjjjjjj, fjddddd, gggg, ggg)|Change A more of change A
 some_name| some_date| some_function_nameB(jjjjjjjjj, fjddddd, gggg, ggg)|Change B more of change B
I feel you should do some of the hard yards yourself and look up your awk reference material to work out the how and why.

Post back on anything that you get stuck on.
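
For comparison, here is a minimal line-by-line sketch of the same formatting (an alternative, not the script above). It does not rely on a regex RS, so it should also run on a plain POSIX awk, and it trims the leading space that the output above still carries. The name `format_records` and the hard-coded heading list are assumptions for illustration.

```shell
#!/bin/sh
# format_records: hypothetical helper; headings are hard-coded from the
# samples in this thread.
format_records() {
    awk '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

    BEGIN { OFS = "|"; print "Name", "Date", "Function Name", "Changes" }

    NF == 0 { next }                               # skip blank lines
    $0 == "START" { split("", rec); key = ""; next }   # new, empty record
    $0 == "END" {
        print rec["Name"], rec["Date"], rec["Function Name"], rec["Changes"]
        next
    }
    # A known sub-heading: remember it and store the rest of the line.
    match($0, /^(Name|Date|Function Name|Changes):/) {
        key = substr($0, 1, RLENGTH - 1)
        rec[key] = trim(substr($0, RLENGTH + 1))
        next
    }
    # Any other line continues the value of the last sub-heading seen.
    key != "" { rec[key] = (rec[key] == "" ? "" : rec[key] " ") trim($0) }
    ' "$@"
}
```

Run it as `format_records input_file`. Because values are stored by heading name, the order of the sub-headings inside a START/END block does not matter.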
 
1 members found this post helpful.
Old 10-13-2010, 02:25 AM   #13
flamingo_l
Member
 
Registered: Jul 2010
Posts: 41

Original Poster
Rep: Reputation: 0
Thanks, grail. It is working perfectly.

I debugged each line and now understand the logic.

I don't think the following code is required. As I understand it, it collects each sub-heading (val) as a key in the array array_vals. With it removed, the code still works fine.

Please correct me if I am wrong.

Quote:
if(!(val in array_vals) && val != "")
array_vals[val]++
 
Old 10-13-2010, 03:16 AM   #14
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,512

Rep: Reputation: 1895
Yeah that was for older stuff and can be removed.
 
  

