LinuxQuestions.org


llattan 04-04-2009 08:39 PM

sort lines inside blocks in a file
 
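I need to sort the lines inside each block of a file, without moving lines between blocks and without reordering the blocks themselves.
"Block": a group of lines separated from the next group by two newline characters together (i.e. an empty line)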

sample file:


xyz
def
opq

ghi
abc
rstuu
442

fde
932




desired output:


def
opq
xyz

442
abc
ghi
rstuu

932
fde




Which language or command would you use? (sort, msort, sed, awk, perl, python, bash script, etc.)
Which is best for this?
Could anybody help me write a script to do this?

Leandro (Argentina)

suhas! 04-05-2009 05:28 AM

Hey, I don't know a better way to do it, but here is my ugly solution:

[root@localhost ~]# cat /tmp/test
xyz
def
opq

ghi
abc
rstuu
442

fde
932

[root@localhost ~]# cat /tmp/test | awk 'BEGIN { id=1111; }{if ( NF == 1 ){print id$NF;}else{id++;print id;id++;}}' | sort | sed 's/^....//'

def
opq
xyz

442
abc
ghi
rstuu

932
fde

llattan 04-05-2009 09:21 AM

sort lines inside blocks
 
Thanks for your solution.

But it only works if no line contains a space character. If a line does contain one, your solution deletes that line.

I posted this file only as a sample. The lines can contain any characters (a line can begin with a space, contain a space in any position, etc.).

This is the rule:
line: text delimited by a single newline character
block: a collection of lines delimited by two or more consecutive newline characters

I hope you can help me.
Thanks in advance.
Leandro.
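
Since python is one of the languages named in the first post, the rule above can be sketched in Python as follows. This is only an illustrative sketch, not something posted in the thread: the function name `sort_blocks` and the sample string are assumptions, and lines are compared by code point (roughly what `sort` does under `LC_ALL=C`). Only completely empty lines count as block separators, so lines that begin with or contain spaces are ordinary lines.

```python
def sort_blocks(text):
    """Sort the lines inside each block; empty lines separate blocks and stay in place."""
    out = []
    block = []
    for line in text.split("\n"):
        if line == "":
            # An empty line closes the current block: emit it sorted.
            out.extend(sorted(block))
            block = []
            out.append("")
        else:
            # Any non-empty line (spaces included) belongs to the block.
            block.append(line)
    # Emit the last block when the input does not end with an empty line.
    out.extend(sorted(block))
    return "\n".join(out)


# Hypothetical sample mirroring the file from the first post.
sample = "xyz\ndef\nopq\n\nghi\nabc\nrstuu\n442\n\nfde\n932\n"
print(sort_blocks(sample), end="")
```

Because each block is kept as a list, duplicate lines inside a block are preserved.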





colucix 04-05-2009 11:46 AM

Here is an awk script that stores the lines of a block as the indexes of an array. Every time it encounters an empty line, it sorts the indexes of the array and prints them out. The END block ensures the last block is printed even if there is no empty line at the end of the file:
Code:

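# Note: asorti() is a gawk extension, so this needs gawk.

# Non-empty line: store it as an index of the array.
!/^$/{
  array[$0] = 1
}

# Empty line: print the block's lines in sorted order, then reset.
/^$/ { n = asorti(array,sorted)
      for ( i = 1; i <= n; i++ ) {
        print sorted[i]
        delete array[sorted[i]]
        delete sorted[i]
      }
      print ""
}

# End of input: print the last block if no empty line followed it.
END { n = asorti(array,sorted)
      for ( i = 1; i <= n; i++ ) {
        print sorted[i]
        delete array[sorted[i]]
        delete sorted[i]
      }
}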

Keep in mind that array indexes in awk can be any string, so I store the content of each line as an index of the array, not as the value of an array element. Hope this helps! :)

colucix 04-05-2009 11:56 AM

One more note: if a block contains duplicate lines, the code above prints only one of them. To make sure all of them are printed, count their occurrences:
Code:

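# Note: asorti() is a gawk extension, so this needs gawk.

# Non-empty line: count how many times it occurs in the block.
!/^$/{
  array[$0] += 1
}

# Empty line: print each line as many times as it occurred, then reset.
/^$/ { n = asorti(array,sorted)
      for ( i = 1; i <= n; i++ ) {
        for (j = 1; j <= array[sorted[i]]; j++)
          print sorted[i]
        delete array[sorted[i]]
        delete sorted[i]
      }
      print ""
}

# End of input: print the last block if no empty line followed it.
END { n = asorti(array,sorted)
      for ( i = 1; i <= n; i++ )
        for (j = 1; j <= array[sorted[i]]; j++)
          print sorted[i]
}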
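
For readers who want to see the difference between the two awk variants outside awk, here is a small Python sketch of the same idea. It is an illustration only: the function names `sort_block_unique` and `sort_block_counted` and the sample block are hypothetical, and sorting is by code point rather than by whatever ordering a given awk uses.

```python
from collections import Counter


def sort_block_unique(lines):
    # Like array[$0] = 1: each distinct line is kept once, so duplicates collapse.
    return sorted(set(lines))


def sort_block_counted(lines):
    # Like array[$0] += 1: remember how often each line occurs in the block.
    counts = Counter(lines)
    result = []
    for line in sorted(counts):
        # Repeat the line as many times as it appeared.
        result.extend([line] * counts[line])
    return result


# Hypothetical block containing a duplicated line.
block = ["ghi", "abc", "ghi", "442"]
print(sort_block_unique(block))
print(sort_block_counted(block))
```

The first function drops the second "ghi", while the second keeps both copies, which is the behaviour the counting version of the awk script is after.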


llattan 04-05-2009 12:46 PM

Thank you!

I have just one question now.
Does this script still work if there is no empty line at the beginning of the file?

How do I run your script?

awk -F your_script file_to_process

Regards.
Leandro.





colucix 04-05-2009 01:14 PM

Quote:

Originally Posted by llattan (Post 3499245)
Does this script still work if there is no empty line at the beginning of the file?

Yes. If there is an empty line at the beginning, it just prints that empty line, keeping the same structure in the output (all empty lines are printed as is).

Quote:

Originally Posted by llattan (Post 3499245)
How do I run your script?

awk -F your_script file_to_process

Nope. You have to use the -f (lowercase) option, which reads the awk program from a file. The -F (uppercase) option has a different meaning: it sets the field separator (FS) used by awk.

llattan 04-05-2009 01:15 PM

Thank you very much!
It works!
I ran: awk -f script file

Thank you again!

Best Regards.
Leandro.




colucix 04-05-2009 01:16 PM

You're welcome!
Bests :)



All times are GMT -5.