LinuxQuestions.org
09-07-2009, 03:14 PM   #1
baidym
LQ Newbie
 
Registered: Oct 2007
Posts: 16

Rep: Reputation: 0
manipulating variable columns in shell or in perl


Hello all,

I am having an ongoing battle with manipulating rows of numbers. I need to multiply all columns except 1 and 2 by 100, and output all columns except 2 as whole numbers, as below.

If I have a file with
FNUM,SET,C1N,C2N,C3N,C1S,C2S,C3S,
4535, 109, 5.0709, 5.1546, 5.2002, 304.4215, 315.4393, 299.0198,
4536, 109, 5.1311, 5.2059, 5.2861, 282.5050, 295.5363, 288.6789,
4537, 109, 4.7416, 4.9326, 5.1422, 305.3368, 316.0573, 297.5717,

and I want an output in the form
FNUM C1N C2N C3N C1S C2S C3S
4535 507 515 520 30442 31543 29901
4536 513 520 528 28250 29553 28867
4537 474 493 514 30533 31605 29757

I can use
Quote:
in csh:
awk '{if(NR!=1) printf("%-6d%-6d%-6d%-6d\t%-6d%-6d%-6d\n",$1,($3*100),($4*100),($5*100),($6*100),($7*100),($8*100))}' $rfile > ${set}_tmp

or in perl:
system "awk \'{if(NR!=1) printf(\"%-6d%-6d%-6d%-6d\t%-6d%-6d%-6d\\n\",\$1,(\$3*100),(\$4*100),(\$5*100),(\$6*100),(\$7*100),(\$8*100))}\' $file > ${seq}_rtmp";
The problem is that every time I have a file with a different number of C values, I have to change the script.
I want to be able to use the same script regardless of the number of columns, without having to change it every time. So if I had a file with 4 "C" values:

FNUM,SET,C1N,C2N,C1S,C2S,
4535, 109, 5.0709, 5.1546, 304.4215, 315.4393,
4536, 109, 5.1311, 5.2059, 282.5050, 295.5363,
4537, 109, 4.7416, 4.9326, 305.3368, 316.0573,

I could use the same script and get:
FNUM C1N C2N C1S C2S
4535 507 515 30442 31543
4536 513 520 28250 29553
4537 474 493 30533 31605

Can anyone show me how to do this in shell or in perl?
Many thanks,
M.
 
09-07-2009, 03:38 PM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
In awk, the built-in variable NF holds the number of fields in the current line, whatever the length of the line. You can therefore loop over any number of "C fields" with one and the same awk script:
Code:
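BEGIN { FS = "," }   # fields are comma-separated

# Header line: print the names as strings, skipping column 2 (SET)
NR == 1 {
  printf "%-6s", $1
  for ( i = 3; i <= NF-1; i++ )   # NF-1: the trailing comma leaves an empty last field
     printf "%-6s", $i
  print ""
}

# Data lines: column 1 unchanged, the others multiplied by 100 and truncated by %d
NR > 1 {
  printf "%-6d", $1
  for ( i = 3; i <= NF-1; i++ )
     printf "%-6d", $i * 100
  print ""
}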
The code above matches your requirements exactly, assuming that you want to preserve the header in the output file and that every line ends with a comma (which is why the loops stop at NF-1).
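Because the loops stop at NF-1, the script relies on every line ending with a comma. If some files might lack that trailing comma, one possible variation is to skip empty fields instead of counting them. The following is only a sketch under that assumption: scale_columns is a made-up helper name, and it prints single-space separated columns (as in the desired output of post #1) rather than the fixed-width %-6s layout.

```shell
#!/bin/sh
# scale_columns is a hypothetical helper name; pass the CSV file as $1.
scale_columns() {
    awk -F, '
    {
        out = ""
        for (i = 1; i <= NF; i++) {
            if (i == 2) continue                  # drop the SET column
            field = $i
            gsub(/^[ \t]+|[ \t]+$/, "", field)    # trim surrounding blanks
            if (field == "") continue             # e.g. the empty field after a trailing comma
            if (NR > 1 && i > 2)
                field = int(field * 100)          # scale by 100 and truncate
            out = out (out == "" ? "" : " ") field
        }
        print out
    }' "$1"
}
```

It could be called as scale_columns input.csv > output.txt, where both file names are placeholders.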
 
09-07-2009, 04:56 PM   #3
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
And, less elegantly than colucix's awk, here's a bash solution
Code:
#!/bin/bash
shopt -s extglob   # enables the *( ) pattern used below to strip leading spaces
line_no=0
while IFS= read -r line
do
    line_no=$(( line_no + 1 ))
    output=''
    if [[ "$line_no" -eq 1 ]]; then
        # Parse the headings into an array; IFS is changed for this read only
        IFS=',' read -r -a headings <<< "$line"
        col_count="${#headings[@]}"
        for (( i = 0; i < col_count; i++ ))
        do
            if [[ "$i" -ne 1 ]]; then  # Skip second column
                output="$output ${headings[$i]}"
            fi
        done
        echo "${output# }" > output.txt
    else
        # Parse the numbers into an array
        IFS=',' read -r -a numbers <<< "$line"
        for (( i = 0; i < col_count; i++ ))
        do
            case $i in
                0 )
                    # First column: number is unchanged
                    number="${numbers[$i]##*( )}"
                    output="$output $number"
                    ;;
                1 )
                    # Second column: skip
                    ;;
                * )
                    # Other columns: multiply number by 100 and truncate to integer
                    number="${numbers[$i]##*( )}"
                    if [[ "$number" != '' ]]; then  # Skip any empty columns
                        number="$( echo "$number * 100 / 1" | /usr/bin/bc )"
                    fi
                    output="$output $number"
                    ;;
            esac
        done
        echo "${output##*( )}" >> output.txt
    fi
done < input.txt
EDIT: An earlier version of this script split each line with
Code:
IFS=',' numbers=( $line )
which is dangerous; it leaves IFS set to "," for the rest of the script. See this post for an explanation. The version above uses read -a instead, which applies IFS=',' to that one command only.

Last edited by catkin; 09-23-2009 at 12:14 PM.
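The IFS point can be checked with a few lines of plain sh. This is only an illustrative sketch (show_ifs_scope and the variable names are made up): when IFS=',' is followed by another assignment and no command, the new IFS persists, whereas in front of a command such as read it applies to that command only. The same rule is what makes an array assignment prefixed with IFS=',' leak in bash.

```shell
#!/bin/sh
# show_ifs_scope is a hypothetical name used only for this demonstration.
show_ifs_scope() {
    line='4535, 109, 5.0709'

    # Case 1: IFS=',' followed by another assignment and no command.
    # Both are ordinary assignments, so IFS stays changed afterwards.
    unset IFS
    IFS=',' copy=$line
    leaked=${IFS-default}

    # Case 2: IFS=',' in front of a command (read).
    # The assignment is temporary and lasts for that command only.
    unset IFS
    IFS=',' read -r f1 f2 f3 <<EOF
$line
EOF
    scoped=${IFS-default}

    echo "after plain assignment: IFS=$leaked"
    echo "after read: IFS=$scoped first=$f1"
}

show_ifs_scope
```

Here "default" simply means that IFS is still unset, which the shell treats as the usual space, tab and newline.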
 
09-08-2009, 09:30 AM   #4
baidym
LQ Newbie
 
Registered: Oct 2007
Posts: 16

Original Poster
Rep: Reputation: 0
awk code in a shell script

Thanks for the replies.

How can I use the awk code in a C shell script? If I use:
Quote:
#!/bin/csh

awk 'BEGIN { FS = "," }

NR == 1 {
printf "%-6s", $1
for ( i = 3; i <= NF-1; i++)
printf "%-6s", $i
print ""
}

NR > 1 {
printf "%-6d", $1
for ( i = 3; i <= NF-1; i++)
printf "%-6d", $i * 100
print ""
}'
it comes back with "Unmatched '.". Will I need to specify the input file within that string?

Thanks,
M
 
09-08-2009, 09:43 AM   #5
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
The C shell does not interpret an unmatched quote as "continue to the next line until the closing quote" the way bash does. You have to explicitly put the continuation character (a backslash) at the end of each line:
Code:
#!/bin/csh
awk 'BEGIN { FS = "," } \
\
NR == 1 { \
  printf "%-6s", $1 \
  for ( i = 3; i <= NF-1; i++ ) \
     printf "%-6s", $i \
  print "" \
} \
\
NR > 1 { \
  printf "%-6d", $1 \
  for ( i = 3; i <= NF-1; i++ ) \
     printf "%-6d", $i * 100 \
  print "" \
}' file
The input file is specified in the same way as for one-line awk commands: put it as an argument at the end of the last line (see "file" above).
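An alternative that avoids the continuation characters altogether is to keep the awk program in its own file and run it with awk -f, since that command line contains no multi-line quoting. The sketch below assumes a file name of columns.awk and a wrapper called run_columns; both names are arbitrary and not part of the thread.

```shell
#!/bin/sh
# Write the awk program to a file once (columns.awk is an arbitrary name).
cat > columns.awk <<'EOF'
BEGIN { FS = "," }
NR == 1 {
  printf "%-6s", $1
  for (i = 3; i <= NF-1; i++) printf "%-6s", $i
  print ""
}
NR > 1 {
  printf "%-6d", $1
  for (i = 3; i <= NF-1; i++) printf "%-6d", $i * 100
  print ""
}
EOF

# run_columns is a hypothetical wrapper; $1 is the input file.
run_columns() {
    awk -f columns.awk "$1"
}
```

From a csh script, the call would then be a single line such as awk -f columns.awk $rfile > ${set}_tmp, reusing the variable names from post #1.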
 
09-08-2009, 06:02 PM   #6
baidym
LQ Newbie
 
Registered: Oct 2007
Posts: 16

Original Poster
Rep: Reputation: 0
solved

Many thanks colucix!!
 
  

