Old 11-15-2011, 11:46 AM   #1
Physicsphdsophia
Awk Problem in Deleting Fields from Lines


Hi Everyone,

I am a newbie to this forum.
I am stuck with some awk programming.
Basically, I have a huge file of data (1000*16384).

My objective is this: for each odd line (line 1, 3, ..., counting from 1), I want to delete the entries that are smaller than, say, 0.1. Furthermore, I also want to delete the entries in the even lines (line 2, 4, ...) that occupy the index positions of the entries deleted from the previous line (so if field i=5 was deleted in line 1 because it is smaller than 0.1, then I want field i=5 in line 2 to be deleted as well, regardless of its value). Here is an attempt:

awk '{ for (j=1; j<=NR; j=j+2) { for (i=1; i<=NF; i++) { { if ($i<0.1) { sub($i,"") ; c = i ; NR == k} }; NR==k+1 ; sub($c,"") }} } } } print }' dataFile1.output > dataFile2.output

thanks in advance
 
Old 11-15-2011, 12:01 PM   #2
colucix
Could you please post an example of the input and the corresponding output you want to obtain? That would make the problem much clearer. In any case, I see your code doesn't take advantage of how awk works. The loop:
Code:
for (j=1; j<=NR; j=j+2) ...
means that, for each line of input, awk cycles over the odd numbers from 1 up to the number of the line read so far (in steps of 2); it does not cycle over the lines themselves. Awk reads one line at a time and executes every rule (a pattern followed by an action enclosed in braces) whose pattern matches that line. To distinguish between odd and even lines, you might do something like:
Code:
NR % 2 == 0 {
  #
  # This is an even line
  #
}
NR % 2 == 1 {
  #
  # This is an odd line
  #
}
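To see the two patterns at work, here is a minimal sketch that labels each input line by parity. The function name label_parity is only an illustration, not something from the thread; the awk rules are the same two shown above, with a print added to each.

```shell
# Hypothetical helper (name chosen for illustration): awk applies every
# matching rule to each input line, so no explicit loop over lines is needed.
label_parity() {
  awk '
    NR % 2 == 1 { print "odd", $0 }    # lines 1, 3, 5, ...
    NR % 2 == 0 { print "even", $0 }   # lines 2, 4, 6, ...
  '
}

printf 'a\nb\nc\n' | label_parity
```

Each line is matched by exactly one of the two rules, which is why the explicit loop over j is unnecessary.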
 
Old 11-15-2011, 12:09 PM   #3
Physicsphdsophia
Thanks colucix for your quick reply.

Yes, here is the sort of input/output I mean.

Input:

0.20 0.30 0.05 0.22 0.12 0.07 0.08 0.14...
20.8 20.6 20.4 20.2 20.0 19.8 19.6 19.4
0.16 0.25 0.31 0.02 0.19 0.04 0.28 0.12
20.8 20.6 20.4 20.2 20.0 19.8 19.6 19.4
...

Output:
0.20 0.30 0.22 0.12 0.14...
20.8 20.6 20.2 20.0 19.4
0.16 0.25 0.31 0.19 0.28 0.12
20.8 20.6 20.4 20.0 19.6 19.4
...
 
Old 11-15-2011, 12:18 PM   #4
colucix
Good. I would try something like this. Odd lines: print the fields that are not smaller than 0.1 and store (remember) the indexes of the printed fields. Even lines: print only the fields whose indexes were stored. Translated into awk:
Code:
NR % 2 == 1 {
  #
  # This is an odd line: print the fields that are not smaller than 0.1
  #
  for ( i = 1; i <= NF; i++ )
    if ( $i >= 0.1 ) {
      printf "%s ", $i
      #
      #  We want to preserve the i-th field in the next (even) line, so
      #  we store i as an index of the array _
      #
      _[i]++
    }
  printf "\n"
}
NR % 2 == 0 {
  #
  # This is an even line: print only the fields whose index was stored
  #
  for ( i = 1; i <= NF; i++ )
    if ( i in _ )
      printf "%s ", $i
  printf "\n"
  #
  #  Forget about previously stored indexes now!
  #
  delete _
}
The delete statement is essential: it makes awk forget the stored indexes after every pair of lines, so the array _ is built afresh when the next odd line is read. Without it, an index kept for one pair would still be kept for all later pairs. Note that each output line ends with a trailing space, because every field is printed with "%s ". Hope this helps.
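For anyone who wants to try this on the sample data from post #3, here is a sketch of the same idea wrapped in a shell function. The names filter_pairs, min, keep and line are my own choices, not from the thread, and the sample lines are used without the trailing "..." so every field is a plain number. It differs from the program above in two small ways: it joins the kept fields with single spaces, so no trailing space is printed, and it empties the array with split("", keep) at the start of each odd line, which also works in awk implementations that lack whole-array delete.

```shell
# Hypothetical wrapper (names are illustrative). The optional first
# argument is the threshold; it defaults to 0.1 as in this thread.
filter_pairs() {
  awk -v min="${1:-0.1}" '
    NR % 2 == 1 {
      split("", keep)                  # forget indexes from the previous pair
      line = ""
      for (i = 1; i <= NF; i++)
        if ($i + 0 >= min + 0) {       # "+ 0" forces a numeric comparison
          keep[i] = 1                  # remember this field position
          line = line (line == "" ? "" : " ") $i
        }
      print line
    }
    NR % 2 == 0 {
      line = ""
      for (i = 1; i <= NF; i++)
        if (i in keep)                 # keep only the remembered positions
          line = line (line == "" ? "" : " ") $i
      print line
    }
  '
}

filter_pairs <<'EOF'
0.20 0.30 0.05 0.22 0.12 0.07 0.08 0.14
20.8 20.6 20.4 20.2 20.0 19.8 19.6 19.4
0.16 0.25 0.31 0.02 0.19 0.04 0.28 0.12
20.8 20.6 20.4 20.2 20.0 19.8 19.6 19.4
EOF
```

On a real file this would be called as filter_pairs < dataFile1.output > dataFile2.output, with a different threshold passed as the first argument if needed.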
 
Old 11-17-2011, 09:26 AM   #5
Physicsphdsophia
Thanks a lot colucix, it worked!
 
  

