LinuxQuestions.org
Old 09-25-2012, 12:19 AM   #1
sonia102d
LQ Newbie
 
Registered: Sep 2012
Posts: 18

Rep: Reputation: Disabled
sorting sequences in ascending order


Hi,
I have a file containing a number of sequences in this format:
>string1
data
>string100
data
>string10
.....
>string5
...
>string67
......

The dots represent data.
I want the sequences arranged in ascending order, like this:
>string1
data
>string5
data
>string10
.....
>string67

I tried the command sort -n filename, but it didn't work.
Could someone help me?
Thanks
 
Old 09-25-2012, 02:04 AM   #2
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Using awk you could store each data set with arrays. array["stringN"] = "data".

Code:
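# Sketch only: these statements belong inside a BEGIN { } block.
keys_count = 0
while ((getline) > 0) {    # stop at end of input or on a read error
    if ($0 ~ /^>string/) {
        key = $0
        keys[keys_count++] = key
        data[key] = $0
    } else if (key) {
        data[key] = data[key] "\n" $0
    }
}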
Sort the keys array, then use the sorted keys to print each data set.

Code:
sort(keys) # still have to make a function for that

for (i = 0; i < keys_count; ++i) {
    key = keys[i]
    print data[key]
}
 
Old 10-04-2012, 10:59 AM   #3
sonia102d
LQ Newbie
 
Registered: Sep 2012
Posts: 18

Original Poster
Rep: Reputation: Disabled
Hi
Thanks for the reply.
But I am new to programming and don't understand the code much.
Isn't there a sed or awk one-liner that could solve my issue?

Again, my file looks like this:
>string0
qwerrtttt
>string 10
lksksksksks
>string 2
lllllsosowowe

I want it to look like this:
>string0
qwerrtttt
>string 2
lllllsosowowe
>string 10
lksksksksks

Thanks
Sonia
 
Old 10-04-2012, 03:56 PM   #4
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1959
Please use [code][/code] tags around your code and data, to preserve the original formatting and to improve readability. Do not use quote tags, bolding, colors, "start/end" lines, or other creative techniques.


This is a rather tricky problem, since a) the data are on separate lines from the sorting strings, and b) the sorting strings are all combo alphanumerics. There's no simple way to sort something like that.

I did manage to get your sample text to work with this bit of awk though:

Code:
#requires gnu awk v4+

awk 'BEGIN{ PROCINFO["sorted_in"]="@ind_num_asc" } \
/^>string/ { gsub(/[^0-9]/,"",$0) ; n=$0 } \
! /^>string/ { arr[n]=$0 } \
END{ for (i in arr) { print ">string"i"\n"arr[i] } }' infile.txt
Only recent versions of gawk have index sorting built in. And even then I had to do some trickery to strip off the non-numeric part of the prefix so it would sort properly, and add it back on again at the final printing. Oh, and it only stores one line of data following the string. And don't try it on files too big for system RAM to hold.

Doubtlessly it could be improved from here with a bit more work.
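
One possible way to lift both limits, sketched here with plain POSIX awk, sort and cut instead of gawk's index sorting: tag every line with the number taken from its header plus its original line number, let sort order the tagged lines, then cut the tags off again. The function name sort_records_tagged is made up for this example, and it assumes that every header starts with ">" and that the digits in the header are the number to sort on.

```shell
#!/bin/sh
# Hypothetical helper: print the records of file "$1" ordered by the
# number found in each ">..." header line, keeping all data lines.
sort_records_tagged() {
    tab=$(printf '\t')
    awk '
        # Header line: keep only its digits as the current sort key.
        /^>/ { key = $0; gsub(/[^0-9]/, "", key); key += 0 }
        # Prefix each line with <key> TAB <line number> TAB.
        { printf "%d\t%d\t%s\n", key, NR, $0 }
    ' "$1" |
        sort -t "$tab" -k1,1n -k2,2n |   # by key, then by original position
        cut -f3-                         # drop the two tag fields
}
```

Called as sort_records_tagged infile.txt, it writes the reordered records to standard output. Since the sorting is left to sort itself (GNU sort falls back on temporary files for large inputs), awk never has to hold the whole file in memory.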
 
Old 10-05-2012, 01:22 AM   #5
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Using a sorting algorithm (bubble sort) from this site: http://rosettacode.org/wiki/Sorting_...ubble_sort#AWK

Code:
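#!/usr/bin/gawk -f
# Adjust the path above if gawk is installed elsewhere; on Linux a
# "#!/usr/bin/env gawk -f" line passes "gawk -f" to env as a single argument.

# Numeric part of a header line such as ">string 10".
function num(s) {
	gsub(/[^0-9]/, "", s)
	return s + 0
}

# Bubble sort of array[0..count-1] by numeric part; i, t, haschanged are locals.
function sort(array, count,    i, t, haschanged) {
	do {
		haschanged = 0
		for (i = 0; i < count - 1; i++) {
			if (num(array[i]) > num(array[i + 1])) {
				t = array[i]
				array[i] = array[i + 1]
				array[i + 1] = t
				haschanged = 1
			}
		}
	} while (haschanged)
}

BEGIN {
	keys_count = 0
}

# A header line starts a new record; any other line is appended to the current one.
{
	if ($0 ~ /^>string/) {
		key = $0
		keys[keys_count++] = key
		data[key] = $0
	} else if (key) {
		data[key] = data[key] "\n" $0
	}
}

END {
	sort(keys, keys_count)

	for (i = 0; i < keys_count; ++i) {
		key = keys[i]
		print data[key]
	}
}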
Note: There's also Quicksort and the builtin asort.
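
For the Quicksort option, a rough sketch in portable awk wrapped in a shell function might look like the following. The names sort_records_qsort, qsort, swap, nums and rec are invented for this example (they are not from Rosetta Code or from gawk), and it assumes that each header carries a distinct number. The numbers are stored as numeric values, so the < comparison is numeric rather than alphabetical.

```shell
#!/bin/sh
# Hypothetical helper: print the records of file "$1" ordered by the
# number in each ">..." header, using a hand-written quicksort in awk.
sort_records_qsort() {
    awk '
        function swap(a, i, j,    t) { t = a[i]; a[i] = a[j]; a[j] = t }

        # Quicksort of a[left..right] in ascending numeric order.
        function qsort(a, left, right,    i, last) {
            if (left >= right) return
            swap(a, left, int((left + right) / 2))   # middle element as pivot
            last = left
            for (i = left + 1; i <= right; i++)
                if (a[i] < a[left]) swap(a, ++last, i)
            swap(a, left, last)
            qsort(a, left, last - 1)
            qsort(a, last + 1, right)
        }

        /^>/ {                              # header: remember its number
            n = $0; gsub(/[^0-9]/, "", n); n += 0
            nums[++count] = n; rec[n] = $0
            next
        }
        count { rec[n] = rec[n] "\n" $0 }   # data line: append to the current record

        END {
            qsort(nums, 1, count)
            for (i = 1; i <= count; i++) print rec[nums[i]]
        }
    ' "$1"
}
```

Run it as sort_records_qsort infile.txt. Like the bubble sort version it keeps every record in memory, but it copes with any number of data lines per header.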

Last edited by konsolebox; 10-05-2012 at 01:34 AM. Reason: algorithm seemed to have a bug with starting index
 
Old 10-05-2012, 02:28 AM   #6
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,576
Blog Entries: 31

Rep: Reputation: 1195
If there is a character that does not appear in the strings or the data (set as sep= in the script) ...
Code:
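#!/bin/bash

input=input.txt
sep=X    # must not occur anywhere in the headers or the data
if grep --quiet --fixed-strings -- "$sep" "$input"; then
    echo "Separator $sep can't be used. It appears in $input" >&2
    exit 1
fi

# Join each header with the single data line after it, sort numerically on
# the digits that follow ">string" (character 8 of field 1), then split again.
i=0
while IFS= read -r line
do
  ((i++))
  if [[ i -eq 1 ]]; then
      header=$line
  else
      i=0
      echo "$header$sep$line"
  fi
done < "$input" | sort --field-separator="$sep" --key=1.8,1n | sed "s/$sep/\n/"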
 
Old 10-05-2012, 11:22 AM   #7
sonia102d
LQ Newbie
 
Registered: Sep 2012
Posts: 18

Original Poster
Rep: Reputation: Disabled
Hi,

Thanks for all the input!
But my sequences are pretty long.
There are about 1000 lines of data for each string.

I will try, though!

Thanks all
Sonia
 
  


All times are GMT -5.
