LinuxQuestions.org
02-23-2012, 03:33 PM   #1
JJB83
LQ Newbie
 
Registered: Feb 2012
Posts: 1

Rep: Reputation: Disabled
need to make a shell script, transpose a matrix, and average across rows


I'm new-ish to Linux, thanks for any help and sorry if this has been asked somewhere else.

1) I need to transpose the rows and columns of a matrix. I've been trying to use the code suggested here:
http://www.unix.com/shell-programmin...ig-data-3.html

but I don't know how to follow the instruction to "save the code to a file, make it executable, and invoke it with one argument, the name of the data file."
I tried saving the code to a text file called transpose, giving it a .sh extension. I also ran 'chmod ugo+x tranpose.sh' and verified with 'ls -l' that it is executable (it also showed up in green text at that point). Nevertheless, when I tried './transpose.sh' it told me 'command not found'.

I tried the same syntax on another executable .sh file and it worked... then I pasted this code into that file and it stopped working and told me 'command not found' again. What's going on?

2) Is there a better way to transpose than the code I'm using? Like the OP in that thread, I have a huge file: ~8500 rows and several hundred thousand columns.

3) After transposing (or before), I need to calculate, for each row, both the average of the raw values and the average of the squares of the raw values. Can Linux do this kind of simple math?

Thanks very much for your help.
 
02-23-2012, 03:58 PM   #2
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,258

Rep: Reputation: 1947
Shell scripting is mostly used for simple file manipulation, string manipulation, etc. I think you would be much better served writing a quick program in C, Fortran, Perl, Python, etc. to do the transposition of an 8500x###### matrix along with the row averaging. You could probably pull it off with shell scripting, but it's going to be much slower and more cumbersome than it has to be.
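
As a rough sketch of the averaging part (question 3), awk can do this arithmetic directly. The function name row_stats is made up for illustration, and the sketch assumes every field is numeric, so label columns or NA entries would need extra handling:

```shell
#!/bin/sh
# Sketch: per-row mean and mean of squares for a whitespace-separated matrix.
# Assumption: every field is numeric (no ID column, no "NA" entries).
# row_stats is a hypothetical helper name, not something from the thread.
row_stats() {
    awk '{
        sum = 0; sumsq = 0
        for (i = 1; i <= NF; i++) {
            sum   += $i          # running total of the raw values
            sumsq += $i * $i     # running total of the squared values
        }
        if (NF > 0)              # skip blank lines to avoid dividing by zero
            printf "%g %g\n", sum / NF, sumsq / NF
    }' "$@"
}

# Reads the named files, or standard input when no file is given.
printf '1 3\n2 2\n' | row_stats
```

Each output line holds the mean and the mean of squares for one input row.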

Last edited by suicidaleggroll; 02-23-2012 at 04:01 PM.
 
02-23-2012, 10:29 PM   #3
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,576
Blog Entries: 31

Rep: Reputation: 1195
The code:
Code:
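#!/bin/sh
# Transpose a whitespace-separated matrix using one temporary file per column.
# Usage: ./transpose.sh datafile   (the result is written to "outfile")

infile="$1"

awk '
    BEGIN {
        # Read the first row here to learn the column count and create the files.
        getline
        l=length(NF)
        for (i=1; i<=NF; i++) {
            # Zero-padded names (01.col, 02.col, ...) keep the glob order numeric.
            f[i]=sprintf("%"l"s", i) ".col"
            gsub(" ", "0", f[i])
            printf("%s", $i) > f[i]
            close(f[i])
        }
    }

    {
        # Append field i of every later row to column file i.
        for (i=1; i<=NF; i++) {
            printf("%s%s", FS, $i) >> f[i]
            close(f[i])
        }
    }

    END {
        # Terminate each column file, which is now one row of the transpose.
        for (i in f) {
            printf("\n") >> f[i]
            close(f[i])
        }
    }' "$infile"

# Concatenate the column files in name order.
for f in *.col; do
    echo "$f"
done | xargs cat > outfile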
The input file:
Code:
IID    PAT    MAT    SEX    PHENOTYPE    rs15286_1    rs319_1    rs80300_1    rs40777_1    rs8597_1    rs5136_1    rs60595_1    rs64968_1    rs4405_1    rs1554_1
TD-MIKV    0 0 2 1 1 0 0 1 0 1 0 1 1 0
TD-HA4Q 0 0 2 1 1 0 0 0 0 0 0 0 0 0
TD-H9ZG 0 0 2 2 0 0 0 1 0 0 0 0 0 0
TD-HAQX 0 0 2 1 0 0 0 2 0 0 0 0 0 0
TD-HA5E 0 0 2 2 0 1 1 1 0 0 0 1 1 0
TD-MGFV 0 0 2 2 1 0 0 0 0 NA 0 0 0 1
TD-HB4V 0 0 2 1 0 0 1 0 1 NA 0 1 1 0
TD-MIPE 0 0 2 2 0 0 0 1 0 0 0 0 0 0
TD-MINR 0 0 2 2 0 0 0 0 0 2 0 1 1 0
Runs OK for me.

The only commands are /bin/sh, awk, xargs and cat, which are very common, so the error message is puzzling if that is the exact code in your script.
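
One possible cause of "command not found" from a script whose commands all exist is stray carriage returns picked up when the code is pasted from a web page or a Windows editor. That is an assumption; nothing in the thread confirms it. A minimal sketch for checking and cleaning a script, with has_cr and strip_cr as hypothetical helper names:

```shell
#!/bin/sh
# Sketch: detect and remove carriage returns (CR) in a pasted script.
# Assumption: CR characters are the culprit; the thread does not confirm this.

# Succeeds if the file named by $1 contains at least one CR byte.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Rewrites the file named by $1 without CR bytes.
# Writing back with cat keeps the original file's permissions (the +x bit).
strip_cr() {
    tr -d '\r' < "$1" > "$1.tmp" && cat "$1.tmp" > "$1" && rm -f "$1.tmp"
}

# Example use:
#   has_cr transpose.sh && strip_cr transpose.sh
```

If has_cr reports nothing, the line endings are not the problem and the file name itself is the next thing to check.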
 
02-23-2012, 11:35 PM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,252

Rep: Reputation: 2685
Do I guess correctly that the file being transposed is too big to run in memory?

On the trivial example I would simply use:
Code:
#!/usr/bin/awk -f

{
    for ( i=1; i <=NF;i++ )
        row[i] = row[i]((row[i])?" ":"")$i
}

END{
    for ( x = 1; x <= length(row); x++ )
        print row[x]
}
And redirect the output to a file.
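
A usage sketch of the same in-memory idea, wrapped in a shell function so it can be tried on a tiny sample. The name transpose_small is hypothetical, the input is assumed to be rectangular, and the column count is tracked by hand because length() on an array is not available in every awk:

```shell
#!/bin/sh
# Sketch: in-memory transpose of a small whitespace-separated matrix.
# Assumption: every row has the same number of fields.
# transpose_small is a hypothetical helper name.
transpose_small() {
    awk '{
        if (NF > n) n = NF                 # remember the column count
        for (i = 1; i <= NF; i++)          # append field i to output row i
            row[i] = (NR == 1 ? "" : row[i] " ") $i
    }
    END {
        for (i = 1; i <= n; i++)
            print row[i]
    }' "$@"
}

# Try it on a 2x3 sample; redirect with "> somefile" to save the result.
printf 'a b c\n1 2 3\n' | transpose_small
```

This holds the whole matrix in memory, so it only suits files that fit comfortably in RAM.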
 
02-24-2012, 12:42 AM   #5
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
I agree that shell is maybe not the best language for this.
Regarding the original question: you do know that you have a typo there, 'tranpose' vs 'transpose'?
Maybe that was your problem.
Do an ls and copy/paste the resulting file name to the command line.
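
The "save the code to a file, make it executable, and invoke it with one argument" steps can be sketched as below. hello.sh and data.txt are placeholder names, and the script body is a stand-in for the real transpose code:

```shell
#!/bin/sh
# Sketch of "save, make executable, invoke with one argument".
# hello.sh and data.txt are placeholder names, not files from the thread.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 1. Save the code to a file; the #! line must be the very first line.
cat > hello.sh <<'EOF'
#!/bin/sh
echo "argument: $1"
EOF

# 2. Make it executable, then list it to see the exact file name.
chmod u+x hello.sh
ls -l hello.sh

# 3. Invoke it by path, passing the data file name as the single argument.
result=$(./hello.sh data.txt)
echo "$result"
```

Copying the name from the ls output, rather than retyping it, guards against running chmod on one spelling and invoking another.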
 
02-24-2012, 01:11 AM   #6
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,576
Blog Entries: 31

Rep: Reputation: 1195
Quote:
Originally Posted by grail
Do I guess correctly that the file being transposed is too big to run in memory?
Yes -- the discussion linked in the OP is about very large matrices.
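
Since a row of the transposed matrix is a column of the original, the per-row averages wanted after transposing can also be accumulated in a single pass over the untransposed file, keeping only two running totals per column in memory. A minimal sketch, assuming a rectangular, purely numeric file with no blank lines (col_stats is a hypothetical name):

```shell
#!/bin/sh
# Sketch: per-column mean and mean of squares in one streaming pass.
# Assumptions: rectangular input, numeric fields only, no blank lines.
# col_stats is a hypothetical helper name.
col_stats() {
    awk '{
        if (NF > n) n = NF               # remember the column count
        for (i = 1; i <= NF; i++) {
            sum[i]   += $i               # running total for column i
            sumsq[i] += $i * $i          # running total of squares for column i
        }
    }
    END {
        # NR is the number of rows read, i.e. the number of values per column.
        for (i = 1; i <= n; i++)
            printf "%g %g\n", sum[i] / NR, sumsq[i] / NR
    }' "$@"
}

printf '1 2\n3 4\n' | col_stats
```

Output line i corresponds to column i of the input, which is row i of the transposed matrix.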
 
02-24-2012, 03:39 AM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,252

Rep: Reputation: 2685
Thanks for that catkin ... how about:
Code:
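#!/usr/bin/awk -f

# Append field i of every input row to the temporary file "part<i>".
{
    for ( i=1; i <= NF; i++ ){
        file = "part" i
        printf("%s%s",(NR == 1)?"":FS,$i) >> file
        close(file)
    }
}

# Each part file now holds one row of the transpose: copy it to outfile, then delete it.
END{
    while( ++x <= NF ){
        file = "part" x
        getline line < file
        print line > "outfile"
        close(file)
        system("rm " file)
    }

    close("outfile")
}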
 
  


All times are GMT -5.
