wilelucho 03-27-2013 02:29 PM

transposing rows to columns
Hi all,

I have the following problem:

I generated a file with 253 columns and 11323 rows. I want to generate a separate file from each row, and each of those files must be a single column. I tried the following script to transpose both the main matrix and the single-row files that I also generated:

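#! /bin/sh
# Transpose a matrix: assumes all lines have same number
# of fields

exec awk '
# First line: start one output row per input column.
NR == 1 {
        n = NF
        for (i = 1; i <= NF; i++)
                row[i] = $i
        next
}
# Remaining lines: append each field to its output row.
{
        if (NF > n)
                n = NF
        for (i = 1; i <= NF; i++)
                row[i] = row[i] " " $i
}
# After the last line: print the transposed rows.
END {
        for (i = 1; i <= n; i++)
                print row[i]
}' ${1+"$@"}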

However, it gives me 11323 files (which is correct) with 251 rows each instead of the expected 253. I'm pretty new to awk. I used this script in the past and it worked fine, so I'm puzzled.

Working on Ubuntu 12.04 LTS

colucix 03-27-2013 02:56 PM


Originally Posted by wilelucho (Post 4920086)
I'm pretty new to awk. I used this script in the past and it worked fine, so I'm puzzled.

This makes me think there's something weird in the input file. Formally the code looks correct (except that it doesn't actually create the files, does it?). Any chance the input file was created on a Windows system and/or contains some hidden control character that triggers this strange behaviour?
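
One way to test that suspicion is to count the lines that contain a carriage return and to tally the number of fields per line. This is only a sketch: `sample.txt` and its contents are made up for illustration and stand in for the real input file.

```shell
# Hypothetical sample: three rows, the second one short and ending in CRLF.
printf 'a b c\nd e\r\nf g h\n' > sample.txt

# Count lines containing a carriage return (0 means no Windows line endings).
cr_lines=$(awk '/\r/ { c++ } END { print c + 0 }' sample.txt)
echo "lines with CR: $cr_lines"

# Tally how many lines have each field count.
# A well-formed matrix gives a single line of output here.
field_tally=$(awk '{ count[NF]++ } END { for (n in count) print n, count[n] }' sample.txt | sort -n)
echo "$field_tally"
```

If the tally shows more than one field count, the rows are not all the same width, and the "same number of fields" assumption of the transpose script no longer holds.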

danielbmartin 03-27-2013 03:16 PM

With this InFile ...

apple red honda panama
banana yellow toyota mexico
cherry blue subaru canada
fig brown bmw italy
lemon green volvo ireland
mango pink nissan china
peach tan chevrolet germany
grape black ford sweden

... this code ...

#  Daniel B. Martin  Mar13
#  To execute this program, launch a terminal session and enter:
#  bash /home/daniel/Desktop/LQfiles/dbm712.bin
# This program inspired by: 
#    transposing-rows-to-columns-4175455825/                               

# File identification
  Path=$(readlink -f $0 | cut -d'.' -f1)

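# InFile and OutFile must be set before this point; their assignments are not shown here.
# Convert each blank or tab to a newline, then write row k to the file named "$OutFile" followed by k.
awk '{gsub("[ \t]","\n"); print > ("'"$OutFile"'" ++k)}' "$InFile"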

echo; echo "Normal end of job."; echo; exit

... produces output files such as these ...
out.txt1 ...

apple
red
honda
panama

out.txt2 ...

banana
yellow
toyota
mexico

et cetera ...

This code copes with tab characters which may be in the input file.
All files (InFile, code, and the multiple OutFiles) will be in the same folder.
On my machine all files have similar names, including dbm712.
You would specify complete file identifiers appropriate to your application.
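
A side note on the tab handling: the pattern `[ \t]` replaces every single separator character, so a run of several blanks would leave empty lines in the output. A possible variation, assuming that runs of blanks and tabs should count as one separator, is to match `[ \t]+` instead. The file name `tabs.txt` below is hypothetical.

```shell
# Hypothetical sample: fields separated by a tab and by a run of three spaces.
printf 'apple\tred   honda\n' > tabs.txt

# One newline per separator character: the run of spaces leaves empty lines.
per_char=$(awk '{ gsub("[ \t]", "\n"); print }' tabs.txt)

# One newline per run of separators: no empty lines.
per_run=$(awk '{ gsub("[ \t]+", "\n"); print }' tabs.txt)

printf 'per character:\n%s\n\nper run:\n%s\n' "$per_char" "$per_run"
```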

Daniel B. Martin

David the H. 03-28-2013 05:25 AM

No need to get all matrix-y here. All you really want to do is convert the space delimiters on each line into newlines, and print each line to a separate file.


awk '{ for (i=1;i<=NF;i++){ print $i > ("file" NR ".txt") } }' infile.txt

There are probably easier ways to do it too, but this is the best I could do at short notice. Using gsub to convert the spaces to newlines would be another option.

And just to round it out, here's a quick bash loop too:

while read -ra line; do
    printf '%s\n' "${line[@]}" >"file$((n++)).txt"
done <infile.txt
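
One caveat with the awk version: awk keeps every output file open until it exits, and with 11323 rows some awk implementations may run out of file descriptors. A minimal sketch of a variation that closes each file as soon as its row is written follows. The sample input is made up, and the variable name `out` is just for illustration.

```shell
# Hypothetical sample input: three rows of four fields each.
printf 'a b c d\ne f g h\ni j k l\n' > infile.txt

# Write each row to its own file, one field per line, closing the file afterwards.
awk '{
    out = "file" NR ".txt"      # build the name first, so nothing ambiguous follows ">"
    for (i = 1; i <= NF; i++)
        print $i > out
    close(out)                  # release the descriptor before moving to the next row
}' infile.txt
```

Since each row maps to exactly one file, nothing is lost by closing it straight away.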

grail 03-28-2013 09:16 AM

Even less need actually:

awk '{$1=$1;print > "file" NR ".txt"}' OFS="\n" file

David the H. 03-29-2013 06:18 AM

I knew it! Once again, grail comes through with the shortest solution.

Let me guess. Trying to print $0 alone fails, because it stores the original line intact, field separators and all. But if you modify the line in any way, then the OFS setting will apply to it. So just set one field to itself and suddenly it starts printing with newlines between them.
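
That guess is easy to check on a single line. The sketch below feeds one sample row to awk twice with the same OFS, once printing $0 untouched and once after the self-assignment. The shell variable names are only for illustration.

```shell
line='apple red honda panama'

# No field is modified, so $0 is printed verbatim and OFS is ignored.
untouched=$(printf '%s\n' "$line" | awk -v OFS='\n' '{ print }')

# Assigning a field (even to itself) makes awk rebuild $0 with OFS between fields.
rebuilt=$(printf '%s\n' "$line" | awk -v OFS='\n' '{ $1 = $1; print }')

printf 'untouched:\n%s\n\nrebuilt:\n%s\n' "$untouched" "$rebuilt"
```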

Now why didn't I think of that? :doh:

grail 03-29-2013 08:10 AM

It's ok David ... you get the kudos for all the other stuff ... I just get the awk ones now and then ;)