LinuxQuestions.org
Old 04-04-2009, 08:39 PM   #1
llattan
LQ Newbie
 
Registered: Apr 2009
Location: Rosario
Distribution: Redhat Enterprise
Posts: 12

Rep: Reputation: 0
sort lines inside blocks in a file


I need to sort the lines inside each block of a file, without sorting lines across blocks and without reordering the blocks.
Blocks are separated by two consecutive newline characters, i.e. an empty line.

sample file:


xyz
def
opq

ghi
abc
rstuu
442

fde
932




desired output:


def
opq
xyz

442
abc
ghi
rstuu

932
fde




Which language or command would you use (sort, msort, sed, awk, perl, python, a bash script, etc.)?
Which one is best for this?
Could anybody help me write a script to do this?

Leandro (Argentina)
 
Old 04-05-2009, 05:28 AM   #2
suhas!
Member
 
Registered: Mar 2007
Posts: 100

Rep: Reputation: 17
Hey, I don't know the best way to do it, but here is my ugly solution:

[root@localhost ~]# cat /tmp/test
xyz
def
opq

ghi
abc
rstuu
442

fde
932

[root@localhost ~]# cat /tmp/test | awk 'BEGIN { id=1111; }{if ( NF == 1 ){print id$NF;}else{id++;print id;id++;}}' | sort | sed 's/^....//'

def
opq
xyz

442
abc
ghi
rstuu

932
fde

Last edited by suhas!; 04-05-2009 at 05:32 AM.
 
Old 04-05-2009, 09:21 AM   #3
llattan
LQ Newbie
 
Registered: Apr 2009
Location: Rosario
Distribution: Redhat Enterprise
Posts: 12

Original Poster
Rep: Reputation: 0
sort lines inside blocks

Thanks for your solution.

But it only works if no line contains a space character. When a line does contain one, your solution deletes that line.

I posted this file only as a sample. The lines can contain any character (they can begin with a space, contain a space in any position, etc.).

These are the rules:
line: text delimited by a single newline character
block: a collection of lines, delimited by two or more consecutive newline characters

I hope you can help me.
Thanks in advance.
Leandro.
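One way to adapt the decorate/sort/undecorate pipeline from post #2 to these rules is sketched below. It is only a sketch: the function name sort_blocks_dsu is made up for illustration, the fixed 8-digit key is an assumption that limits the input to roughly 50 million empty lines, and LC_ALL=C is assumed so that sort compares plain bytes. It tests for a truly empty line instead of counting fields, and prints $0 instead of $NF, so lines with spaces survive.

```shell
# sort_blocks_dsu: hypothetical helper name, not from the thread.
# Prefix every line with a fixed-width block number, sort, strip the prefix.
sort_blocks_dsu() {
    awk '
        BEGIN { id = 0 }
        # Empty line: give the separator its own key so it stays between blocks.
        /^$/ { id++; printf "%08d\n", id; id++; next }
        # Any other line (spaces included): key followed by the untouched line.
        { printf "%08d%s\n", id, $0 }
    ' "$@" | LC_ALL=C sort | cut -c9-
}

# Usage: lines with embedded or leading spaces are kept and sorted per block.
printf 'x y\n b\nabc\n\nq\n1 2\n' | sort_blocks_dsu
```

The fixed-width key is what lets a plain sort keep the blocks in their original order: every line of a block shares the same 8-character prefix, so only the rest of the line decides the order inside the block.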



Quote:
Originally Posted by suhas! View Post
Hey.... I dont know the better way to do it.. but here is my ugly solution....

[root@localhost ~]# cat /tmp/test
xyz
def
opq

ghi
abc
rstuu
442

fde
932

[root@localhost ~]# cat /tmp/test | awk 'BEGIN { id=1111; }{if ( NF == 1 ){print id$NF;}else{id++;print id;id++;}}' | sort | sed 's/^....//'

def
opq
xyz

442
abc
ghi
rstuu

932
fde
 
Old 04-05-2009, 11:46 AM   #4
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
Here is an awk script that stores the lines of a block as the indexes of an array. Every time it encounters an empty line, it sorts the array indexes and prints them. The END rule ensures the last block is printed even if the file does not end with an empty line. Note that asorti is a gawk extension, so the script needs gawk:
Code:
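# Non-empty line: remember it as an array index.
!/^$/ {
  array[$0] = 1
}

# Empty line: the current block is complete, so print it sorted.
/^$/ {
  n = asorti(array, sorted)    # sort the indexes of array into sorted[1..n]
  for (i = 1; i <= n; i++) {
    print sorted[i]
    delete array[sorted[i]]    # empty the array for the next block
    delete sorted[i]
  }
  print ""                     # keep the empty line that separates blocks
}

# End of input: print the last block, which may lack a trailing empty line.
END {
  n = asorti(array, sorted)
  for (i = 1; i <= n; i++) {
    print sorted[i]
    delete array[sorted[i]]
    delete sorted[i]
  }
}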
Just keep in mind that array indexes in awk can be any string, so I store the content of each line as an index of the array, not as the value of an array element. Hope this helps!
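To illustrate the point about string indexes, here is a minimal sketch that uses whole lines as array indexes to count how often each line occurs. The function name count_lines and the sample input are made up for illustration. The order of a `for (line in seen)` loop is unspecified in awk, so the output is piped through sort.

```shell
# count_lines: hypothetical helper; prints "<count> <line>" per distinct line.
count_lines() {
    awk '
        { seen[$0]++ }    # the whole line, spaces and all, is the array index
        END { for (line in seen) print seen[line], line }
    ' "$@" | LC_ALL=C sort -k2
}

# Usage: prints "2 abc" and then "1 ghi".
printf 'abc\nghi\nabc\n' | count_lines
```

Because identical lines map to the same index, an array used this way holds each distinct line only once.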
 
Old 04-05-2009, 11:56 AM   #5
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
One more note: if a block contains duplicate lines, the code above prints only one of them. To make sure all of them are printed, count their occurrences:
Code:
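# Non-empty line: count how many times it occurs in the current block.
!/^$/ {
  array[$0] += 1
}

# Empty line: print each distinct line as many times as it occurred.
/^$/ {
  n = asorti(array, sorted)
  for (i = 1; i <= n; i++) {
    for (j = 1; j <= array[sorted[i]]; j++)
      print sorted[i]
    delete array[sorted[i]]
    delete sorted[i]
  }
  print ""
}

# End of input: print the last block the same way.
END {
  n = asorti(array, sorted)
  for (i = 1; i <= n; i++)
    for (j = 1; j <= array[sorted[i]]; j++)
      print sorted[i]
}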
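For systems without gawk, a different technique can give the same result: instead of counting occurrences in an array, hand each block to the external sort command, which keeps duplicate lines by itself. This is a sketch under assumptions, not a drop-in replacement: the function name sort_blocks is made up, LC_ALL=C is assumed for plain byte ordering, and it relies on fflush(), which common awk implementations provide.

```shell
# sort_blocks: hypothetical helper name, not from the thread.
sort_blocks() {
    LC_ALL=C awk '
        # Empty line: finish the running sort so its output appears now,
        # then print the separator and flush it before the next sort starts.
        /^$/ { close("sort"); print ""; fflush(); next }
        # Any other line goes to a sort process that lives for one block.
        { print | "sort" }
        # End of input: flush the last block.
        END { close("sort") }
    ' "$@"
}

# Usage: the duplicated "def" line is printed twice.
printf 'xyz\ndef\ndef\n\nghi\n442\n' | sort_blocks
```

Starting one sort process per block is slower than sorting inside awk, but the ordering rules can then be changed with ordinary sort options.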
 
Old 04-05-2009, 12:46 PM   #6
llattan
LQ Newbie
 
Registered: Apr 2009
Location: Rosario
Distribution: Redhat Enterprise
Posts: 12

Original Poster
Rep: Reputation: 0
Thank you!

I have just one question now.
Does this script still work if there is no empty line at the beginning of the file?

How do I execute your script?

awk -F your_script file_to_process

Regards.
Leandro.



 
Old 04-05-2009, 01:14 PM   #7
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
Quote:
Originally Posted by llattan View Post
Does this script still work if there is no empty line at the beginning of the file?
Yes. If there is an empty line at the beginning, the script just prints it, keeping the same structure in the output (all empty lines are printed as they are).

Quote:
Originally Posted by llattan View Post
How do I execute your script?

awk -F your_script file_to_process
Nope. You have to use the -f (lowercase) option. The -F (uppercase) option has a different meaning: it sets the field separator, FS, used by awk.
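The difference between the two options can be seen in a small sketch. The file names and the one-line awk program here are stand-ins invented for the example, not the script from this thread.

```shell
# Hypothetical files in a scratch directory.
tmpdir=$(mktemp -d)
printf '{ print $2 }\n' > "$tmpdir/second_field.awk"
printf 'a:b:c\n' > "$tmpdir/data.txt"

# -f reads the awk program from a file; -F: sets the field separator to ":".
result=$(awk -F: -f "$tmpdir/second_field.awk" "$tmpdir/data.txt")
echo "$result"    # prints b

# By contrast, "awk -F your_script file_to_process" would use the string
# "your_script" as the field separator and try to run the text
# "file_to_process" as the awk program.
rm -rf "$tmpdir"
```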
 
Old 04-05-2009, 01:15 PM   #8
llattan
LQ Newbie
 
Registered: Apr 2009
Location: Rosario
Distribution: Redhat Enterprise
Posts: 12

Original Poster
Rep: Reputation: 0
Thank you very much!
It works!
I ran: awk -f script file

Thank you again!

Best Regards.
Leandro.


 
Old 04-05-2009, 01:16 PM   #9
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
You're welcome!
Best regards
 
Old 04-05-2009, 01:19 PM   #10
llattan
LQ Newbie
 
Registered: Apr 2009
Location: Rosario
Distribution: Redhat Enterprise
Posts: 12

Original Poster
Rep: Reputation: 0
Thank you very much!
It works!
I ran: awk -f script file

Thank you again!

Best Regards.
Leandro.
 
  


All times are GMT -5.
