LinuxQuestions.org
Old 11-03-2011, 10:39 AM   #1
bldcerealkiller
LQ Newbie
 
Registered: Aug 2011
Posts: 16

Converting columns to lines using AWK


Hi everybody,

I need to convert columns into rows in my file using awk.

The file looks like:

6 5 7 8
6 5 7 8
6 5 7 8

The output should be like this:
6 6 6
5 5 5
7 7 7
8 8 8

or this

6 6 6 5 5 5 7 7 7 8 8 8

Thanks in advance for your reply and sorry if this is a repost.

Cheers
 
Old 11-03-2011, 12:01 PM   #2
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

For each record, loop over the fields, appending each field to an array indexed by the field number. Remember the maximum number of fields, so you'll know how many to print later on. Do not print anything yet. Then, in an end rule, print each value of the array separately:
Code:
awk '{ for (i = 1; i <= NF; i++) f[i] = f[i] " " $i ;
       if (NF > n) n = NF }
 END { for (i = 1; i <= n; i++) sub(/^  */, "", f[i]) ;
       for (i = 1; i <= n; i++) print f[i] }
    ' infile >outfile
The first loop in the end rule removes the superfluous leading spaces; it is simpler than not adding the leading space. I added extra semicolons, so you can put the entire scriptlet on one line if you wish.

Oh, and this seems to work for left-aligned triangular matrices too.
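
The second format in the question (everything on one line) is not covered above. A minimal sketch, assuming the same per-column accumulation and a hypothetical wrapper name transpose_flat, could join the column strings in the end rule instead of printing them separately:

```shell
# Hypothetical helper: transpose whitespace-separated input and emit
# all values on a single line, column by column.
transpose_flat() {
    awk '{ for (i = 1; i <= NF; i++) f[i] = f[i] " " $i
           if (NF > n) n = NF }
     END { out = ""
           for (i = 1; i <= n; i++) out = out f[i]
           sub(/^ +/, "", out)      # drop the single leading space
           print out }' "$@"
}

printf '6 5 7 8\n6 5 7 8\n6 5 7 8\n' | transpose_flat
# prints: 6 6 6 5 5 5 7 7 7 8 8 8
```

Because every column string already starts with a space, concatenating them needs no extra separator; only the very first space has to be removed.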

Last edited by Nominal Animal; 11-03-2011 at 12:05 PM.
 
Old 11-03-2011, 03:00 PM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

The gawk user guide has an example script that does exactly this, here:

http://www.gnu.org/software/gawk/man...mensional.html


PS: Please use [code][/code] tags around your code (including example text), to preserve formatting and to improve readability.

Last edited by David the H.; 11-03-2011 at 03:03 PM.
 
Old 11-04-2011, 06:38 AM   #4
bldcerealkiller
LQ Newbie
 
Registered: Aug 2011
Posts: 16

Original Poster
Quote:
Originally Posted by Nominal Animal
For each record, loop over the fields, appending each field to an array indexed by the field number. Remember the maximum number of fields, so you'll know how many to print later on. Do not print anything yet. Then, in an end rule, print each value of the array separately:
Code:
awk '{ for (i = 1; i <= NF; i++) f[i] = f[i] " " $i ;
       if (NF > n) n = NF }
 END { for (i = 1; i <= n; i++) sub(/^  */, "", f[i]) ;
       for (i = 1; i <= n; i++) print f[i] }
    ' infile >outfile
The first loop in the end rule removes the superfluous leading spaces; it is simpler than not adding the leading space. I added extra semicolons, so you can put the entire scriptlet on one line if you wish.

Oh, and this seems to work for left-aligned triangular matrices too.

Thank you very much for the answer, my friend, but I seem to be having a problem with the last column. This is the output using your script:

6 6 6
5 5 5
7 7 7
8

Any suggestion?

P.S. The example in the gawk user guide orders the rows in the opposite direction, but I guess I can find a way to modify that.
 
Old 11-04-2011, 08:11 AM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Quote:
Originally Posted by bldcerealkiller
P.S. The example in the gawk user guide orders the rows in the opposite direction, but I guess I can find a way to modify that.
When I run that script on the example text you posted, I get exactly the output you asked for; a clockwise quarter-turn. What do you get?

If you need it to do something different than what you originally requested, then you need to clarify that.

Edit: Ah, maybe I see it now. You aren't just rotating the array, you need each top-to-bottom column to become a left-to-right row, is that it? It would've been clearer if you'd used different numbers for each row.

Modifying the second loop in the END section to count up instead of down appears to do that.
Change this...
Code:
for (y = max_nr; y >= 1; --y)
...to this:
Code:
for (y = 1; y <= max_nr; y++)
When I make the above change, this...
Code:
6 7 8 9
5 6 7 8
4 5 6 7
...becomes this:
Code:
6 5 4
7 6 5
8 7 6
9 8 7
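
For context, a complete script with the upward-counting loop might look like the sketch below. It is only modelled on the two-dimensional array approach of the linked example; the names vector, max_nf and max_nr outside the quoted loop, and the wrapper name transpose_2d, are assumptions rather than text taken from this thread.

```shell
# Sketch: store every field in a two-dimensional array, then print
# each column as a row. Names other than y and max_nr are assumed.
transpose_2d() {
    awk '{ if (NF > max_nf) max_nf = NF
           max_nr = NR
           for (x = 1; x <= NF; x++) vector[x, NR] = $x }
     END { for (x = 1; x <= max_nf; x++) {
               line = ""
               # Count y upward so input row 1 ends up leftmost.
               for (y = 1; y <= max_nr; y++)
                   line = line (y > 1 ? " " : "") vector[x, y]
               print line
           } }' "$@"
}

printf '6 7 8 9\n5 6 7 8\n4 5 6 7\n' | transpose_2d
```

Building each output line in a variable before printing avoids a trailing space at the end of every row.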

Last edited by David the H.; 11-04-2011 at 08:26 AM. Reason: as posted
 
Old 11-04-2011, 08:21 AM   #6
bldcerealkiller
LQ Newbie
 
Registered: Aug 2011
Posts: 16

Original Poster
Quote:
Originally Posted by David the H.
When I run that script on the example text you posted, I get exactly the output you asked for; a clockwise quarter-turn. What do you get?

If you need it to do something different than what you originally requested, then you need to clarify that.
Yes, you're right: with the same numbers in every row, that script works.
The problem is that if the input file has this format
1 2 3 4
5 6 7 8
9 10 11 12
the result with that script is
9 5 1
10 6 2
11 7 3
12 8 4
while I'd like to have
1 5 9
2 6 10
3 7 11
4 8 12

In conclusion, I needed a script to convert columns to rows, not to make a clockwise 90-degree turn.
Anyway, thanks for your support.

Last edited by bldcerealkiller; 11-04-2011 at 08:23 AM.
 
Old 11-04-2011, 09:41 AM   #7
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,475

Seems to work perfectly. May I ask if the file containing your data was created on Windows? If so, try running dos2unix over it first and then see what your results are.
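
One plausible but unconfirmed explanation for a lone value in the last row: with DOS line endings, every last field carries a trailing carriage return, and the terminal then redraws over the start of the line. A rough sketch for checking and fixing this with awk alone, using the hypothetical helper names count_cr and strip_cr:

```shell
# Hypothetical helpers for spotting and removing DOS line endings.

# Print how many input lines end in a carriage return.
count_cr() {
    awk '/\r$/ { c++ } END { print c + 0 }' "$@"
}

# Copy input to output with any trailing carriage return removed,
# roughly what dos2unix does for plain text.
strip_cr() {
    awk '{ sub(/\r$/, ""); print }' "$@"
}

printf '6 5 7 8\r\n6 5 7 8\r\n' | count_cr             # prints 2
printf '6 5 7 8\r\n6 5 7 8\r\n' | strip_cr | count_cr  # prints 0
```

If count_cr reports anything other than 0 for the data file, piping it through strip_cr (or dos2unix) before transposing is worth a try.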
 
Old 11-04-2011, 09:47 AM   #8
bldcerealkiller
LQ Newbie
 
Registered: Aug 2011
Posts: 16

Original Poster
Quote:
Originally Posted by grail
Seems to work perfectly. May I ask if the file containing your data was created on Windows? If so, try running dos2unix over it first and then see what your results are.
Are you referring to the first script? Anyway, I'm using a file previously created with awk on Unix.
 
Old 11-04-2011, 11:06 AM   #9
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,475

I am referring to the file with numbers in it ... was it created in Windows? As I said, the code provides the exact output you are requesting when I run it.
 
Old 11-04-2011, 11:39 AM   #10
bldcerealkiller
LQ Newbie
 
Registered: Aug 2011
Posts: 16

Original Poster
I've started again from the beginning and now it's working!
Thanks everybody for your support!
 
Old 11-04-2011, 02:02 PM   #11
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Good. I could not reproduce any of your problems with my script. For me, it yields the correct output every time. I even tried different awk variants, and files missing a final newline.

If you happen to have data files created in non-Linux/UNIX systems, you might wish to use
Code:
env LANG=C LC_ALL=C awk '
BEGIN { RS="[\t\n\v\f\r ]*[\n\r][\t\n\v\f\r ]*" ; FS="[\t\v\f ]+" ; SP=" " ; NL="\n" }
      { for (i = 1; i <= NF; i++) f[i] = f[i] SP $i ;
        if (NF > n) n = NF }
  END { for (i = 1; i <= n; i++) sub(/^  */, "", f[i]) ;
        for (i = 1; i <= n; i++) printf("%s%s", f[i], NL) }
      ' infile >outfile
The env command runs the awk script using the C (or POSIX) locale. Most Linux distributions use a UTF-8 locale by default, and at least GNU awk stops processing if it sees a non-UTF-8 sequence in the input. Explicitly setting the locale avoids the issue entirely. Explicitly using env means the above form will work regardless of the shell you are using.

For the input, the BEGIN rule sets a new record separator (RS) and a new field separator (FS). The record separator is any run of ASCII whitespace that contains at least one newline character (linefeed or carriage return), so all newline conventions are handled. The field separator is any run of ASCII whitespace that does not include newline characters.

In the output, the SP (space, above) defines the separator between columns, and NL (newline, above) defines the separator between rows. These are also defined in the BEGIN rule.

Note that the script does not require the values to be numbers. It reads and writes each input token (word) as-is, without trying to parse them at all. Other than the env command setting the locale explicitly, and the BEGIN rule, the script is still the same as before.
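
One caveat when changing SP: the sub() call only strips leading spaces, so any other separator would leave a leading separator on each row. A minimal sketch that adds the separator only between values is shown here. It assumes a hypothetical wrapper transpose_sep and a helper array named seen, and it strips carriage returns explicitly instead of relying on a regular-expression RS, since POSIX leaves a multi-character RS unspecified.

```shell
# Hypothetical variant: transpose with a configurable output separator.
# Usage: transpose_sep SEPARATOR [file ...]
transpose_sep() {
    sep=$1
    shift
    LC_ALL=C awk -v SP="$sep" '
        { sub(/\r$/, "")            # tolerate DOS line endings
          for (i = 1; i <= NF; i++)
              # seen[i] is 0 for the first value of a column, so no
              # separator is added in front of it.
              f[i] = (seen[i]++ ? f[i] SP : "") $i
          if (NF > n) n = NF }
    END { for (i = 1; i <= n; i++) print f[i] }' "$@"
}

printf '1 2 3 4\r\n5 6 7 8\r\n9 10 11 12\r\n' | transpose_sep ','
```

Passing a single space as SEPARATOR gives the same layout as the scripts above, without needing the cleanup loop in the end rule.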
 
  

