LinuxQuestions.org


a_bahreini 03-25-2014 10:48 PM

sorting columns individually
 
Hi,
I have a file full of fields and rows like these (this shows two rows and two columns, but my file has more rows and columns):
Code:

0.269330|0.035118|0.526763|0.792274        0.33555|19.471911|51.844968|1631  ...
3.981490|5.062725|17.190744|111    0.000000|0.030234|0.000000|1631  ...
...

I'd like to sort EACH column individually based on the value after the third "|". This has to be done separately for each column. Can anybody help? BTW, the file is tab delimited.
Thanks

grail 03-25-2014 11:03 PM

You may need to explain a little further as it does not appear clear what you require (at least to me).

If you have 2 rows by 2 columns, what do the '...' represent?

If you are using a single field as the reference, you then say you want to sort each column based on this. If there is only one reference point, won't all the columns end up in the same position after sorting?

a_bahreini 03-25-2014 11:07 PM

Sorry for the confusion. "..." just means that I have more of those fields and rows; please ignore those. My file has 41 fields and 60 rows. I only included two rows and two columns in my last post. I don't want to sort based on a particular field; I'd like to sort each column by itself, based on the number after the third vertical bar in that column. I just want 41 sorted columns in my output. Hope this helps.

chrism01 03-25-2014 11:35 PM

In the field after the 3rd '|' you have a field that has "number< some spaces >anothernumber"
That looks really odd....

How about posting some real data, e.g. 5 cols, 5 rows, and use CODE tags https://www.linuxquestions.org/quest...do=bbcode#code and possibly re-phrase the question.
Thx.

a_bahreini 03-25-2014 11:40 PM

Hi Chrism01,
That's not "some space" but a "tab". The file is tab delimited. Each field has a bunch of numbers separated by "|".

grail 03-26-2014 12:52 AM

OK ... using your current example (although I think chrism01 might be right that a couple more lines may help), please show what the output would look like after the sort has been completed.

John VV 03-26-2014 01:12 AM

For small things like this, a spreadsheet is likely the best tool.

Excel or the OpenOffice equivalent, "Calc", is very good at things like this.

chrism01 03-26-2014 03:41 AM

This
Quote:

sort each column by itself

makes sense.

This
Quote:

based on the number after third vertical bar on that column

not so much ...
Do you mean 'on that row' ?
Also, which number, if each field has multiple numbers; 1st, 2nd, 3rd ...?


As per grail, we need example output after sorting.

a_bahreini 03-26-2014 08:39 AM

Ok, let's make this simple. Here are two columns of my file:
Code:

2|3|40|50        21|32|60|70
12|40|30|60        34|21|50|80
43|33|20|21        54|23|70|56

Here I sort each column based on the third number (the one after the second "|"):
Code:

43|33|20|21        34|21|50|80
12|40|30|60        21|32|60|70
2|3|40|50        54|23|70|56

I can't use Excel since I have too many numbers in each field separated with "|", and I will end up with tons of columns if I replace "|" with "\t".
Thanks in advance

schneidz 03-26-2014 08:49 AM

what have you tried and where are you stuck ?
i would cut each grouping into its own file
then sort by the 3rd key delimited by '|'
then paste the result back to 1 file separated by tab.
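
The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: the input is the two-column sample posted earlier in the thread, every row has the same number of tab-separated columns, and the names workdir, input, ncols and sorted are made up for the example.

```shell
# Hypothetical sample input: the two-column example from earlier in the thread.
workdir=$(mktemp -d)
input="$workdir/input.tsv"
printf '2|3|40|50\t21|32|60|70\n12|40|30|60\t34|21|50|80\n43|33|20|21\t54|23|70|56\n' > "$input"

# Count the tab-separated columns on the first row
# (assumes every row has the same number of columns).
ncols=$(head -n 1 "$input" | awk -F'\t' '{ print NF }')

# 1. cut each column into its own file,
# 2. sort it numerically on the 3rd '|'-separated number.
# Zero-padded file names keep the glob below in column order, even with 41 columns.
i=1
while [ "$i" -le "$ncols" ]; do
    cut -f "$i" "$input" | sort -t '|' -k 3,3n > "$(printf '%s/col%03d' "$workdir" "$i")"
    i=$((i + 1))
done

# 3. paste the sorted columns back together, tab separated.
sorted=$(paste "$workdir"/col*)
printf '%s\n' "$sorted"
rm -rf "$workdir"
```

On the sample this reproduces the sorted layout the original poster showed. cut and paste both default to a tab delimiter, so only sort needs to be told about "|".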

a_bahreini 03-26-2014 08:56 AM

Thanks schneidz for the solution but can I do this in a single command line? How can I "cut" all the columns into the individual fields?

schneidz 03-26-2014 08:58 AM

man cut

Madhu Desai 03-26-2014 09:33 AM

You can make use of multiple delimiters in awk. Something like this...

Code:

$ cat file
4.269330|0.035118|0.526763|0.792274        0.33555|19.471911|51.844968|1631
3.981490|5.062725|17.190744|121                4.12300|0.030234|0.000000|1631
6.269330|0.035118|0.526763|0.392274        1.67525|19.471911|51.844968|1631
2.269330|0.035118|0.526763|0.992274        88.55|19.471911|51.844968|1631
8.981490|5.062725|17.190744|511                0.000000|0.030234|0.000000|1631
0.981490|5.062725|17.190744|007                3.1|0.030234|0.000000|1631


$ awk -F'[|\t]+' '{print $4 | "sort -g"}' file
0.392274
0.792274
0.992274
007
121
511


grail 03-26-2014 09:43 AM

Based on post #3, I would use your favourite tool (awk, perl, ruby, ...) and create 41 arrays / hashes with 60 elements using the third pipe separated number as the index.
Then either sort or regurgitate the numbers in the order required.

This would require that none of the pipe separated third fields are repeated ... can you guarantee this?
If you cannot, how is the sort supposed to know where to look after this field?
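
One way to sidestep the uniqueness question is to index the arrays by row and column instead of by the third number, and keep the sort key in a parallel array. The following is a sketch of that variant, not the exact scheme described above. It assumes plain POSIX awk, rectangular tab-delimited input, and the two-column sample from earlier in the thread; the names cell, key, input and result are made up for the example.

```shell
# Hypothetical sample input: the two-column example from earlier in the thread.
input=$(mktemp)
printf '2|3|40|50\t21|32|60|70\n12|40|30|60\t34|21|50|80\n43|33|20|21\t54|23|70|56\n' > "$input"

result=$(awk -F'\t' '
{
    # Store every cell by (row, column) and remember its sort key:
    # the third "|"-separated number inside the cell.
    for (c = 1; c <= NF; c++) {
        cell[NR, c] = $c
        split($c, part, "|")
        key[NR, c] = part[3] + 0
    }
    if (NF > ncols) ncols = NF
}
END {
    # Insertion-sort each column independently. The sort is stable, so
    # cells with equal keys keep their original relative order.
    for (c = 1; c <= ncols; c++) {
        for (i = 2; i <= NR; i++) {
            k = key[i, c]; v = cell[i, c]
            for (j = i - 1; j >= 1 && key[j, c] > k; j--) {
                key[j + 1, c] = key[j, c]
                cell[j + 1, c] = cell[j, c]
            }
            key[j + 1, c] = k
            cell[j + 1, c] = v
        }
    }
    # Print the re-assembled rows, tab separated.
    for (r = 1; r <= NR; r++) {
        line = cell[r, 1]
        for (c = 2; c <= ncols; c++) line = line "\t" cell[r, c]
        print line
    }
}' "$input")
printf '%s\n' "$result"
rm -f "$input"
```

Because the key is only compared and never used as an index, repeated third numbers are harmless here; they simply stay in input order. For 60 rows by 41 columns the quadratic insertion sort is not a concern.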

TB0ne 03-26-2014 10:03 AM

Quote:

Originally Posted by a_bahreini (Post 5141546)
Thanks schneidz for the solution but can I do this in a single command line? How can I "cut" all the columns into the individual fields?

You can POST WHAT YOU HAVE DONE AND TRIED SO FAR, as you were asked to. So far, you've posted a question but have shown no effort of your own, not even reading the man pages on the commands you were given, which would TELL YOU how to do this.


All times are GMT -5.