LinuxQuestions.org
Old 02-07-2012, 03:07 PM   #1
dwarf
LQ Newbie
 
Registered: Feb 2012
Location: Frankfurt
Posts: 5

Rep: Reputation: Disabled
Removing lines containing "0"s


Hi experts,
I have a text file consisting of five columns (tab separated). The first column is a timestamp; the other columns have values between +4000.0 and -4000.0. I would like to remove all lines that, apart from the timestamp, contain only "0" values.
Do you have any suggestion how to do this easily (awk or sed)?

Thanks in advance!
 
Old 02-07-2012, 03:11 PM   #2
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,811
Blog Entries: 1

Rep: Reputation: 1191
Hi and welcome to LQ.

What have you tried so far? Can you post the code so that we can help you?

edit: Additionally, can you post a sample of your data (including some lines which need to be removed)?

Last edited by sycamorex; 02-07-2012 at 03:14 PM.
 
Old 02-07-2012, 03:31 PM   #3
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
A suggestion:
Code:
awk '$2+$3+$4+$5' file
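This uses awk truth values (http://www.gnu.org/software/gawk/man...l#Truth-Values): a pattern that evaluates to a non-zero number is true, and a true pattern with no action prints the line. To edit the file in place, write to a temporary file and move it over the original, because redirecting to FILENAME while awk is still reading the file can truncate the input:
Code:
awk '$2+$3+$4+$5' file > file.tmp && mv file.tmp file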
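
As a small illustration of the truth-value rule mentioned above, here is a minimal sketch on made-up input (not the poster's data). The function name truth_demo is only used for this example.

```shell
# Three made-up records: a label, then a number.
# A bare expression used as a pattern counts as "true" when it is
# non-zero, and a true pattern with no action prints the record.
truth_demo() {
    printf 'a\t0\nb\t5\nc\t-3\n' | awk '$2'
}

# Prints the "b" and "c" records; the "a" record (value 0) is dropped.
truth_demo
```

Note that negative values count as true as well, which is why no separate test for the negative range should be needed.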
 
Old 02-08-2012, 04:11 AM   #4
dwarf
LQ Newbie
 
Registered: Feb 2012
Location: Frankfurt
Posts: 5

Original Poster
Rep: Reputation: Disabled
OK,
here's what I've tried so far:

Quote:
grep "[0-9][0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]" input_file | awk 'BEGIN {FS="\t"}{ if(($2 >0) || ($3 >0) || ($4 >0) || ($5 >0)) print}' > input_filtered.txt
It returns the correct values, but only in the positive range. When I try to filter for negative values (e.g. $2 < 0), it returns everything, including the "0" values.
There must be a way to exclude the zeros.

Regards
 
Old 02-08-2012, 05:31 AM   #5
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
It is difficult to tell without seeing a sample of the input data. Please post it enclosed in CODE tags (not QUOTE) to preserve formatting.
 
Old 02-08-2012, 02:38 PM   #6
dwarf
LQ Newbie
 
Registered: Feb 2012
Location: Frankfurt
Posts: 5

Original Poster
Rep: Reputation: Disabled
Please find attached a sample file to understand what I am talking about.
Attached Files
File Type: txt input_file.txt (1.1 KB, 15 views)
 
Old 02-08-2012, 02:51 PM   #7
Harlin
Member
 
Registered: Dec 2004
Location: Atlanta, GA U.S.
Distribution: I play with them all :-)
Posts: 316

Rep: Reputation: 30
You're only wanting to keep the timestamps and that's it?
 
Old 02-08-2012, 03:19 PM   #8
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
The solution suggested in post #3 should work. Otherwise, please describe what you want to achieve in more detail. Example:
Code:
$ cat input_file.txt 
2012.012.00:01.000      0       100     0       0
2012.012.00:01.001      0       0       2331    -400
2012.012.01:01.002      0       88      0       -423
2012.012.01:01.003      0       85      0       0
2012.012.01:01.004      0       0       0       -437
2012.012.02:01.005      0       83      2299    0
2012.012.03:01.006      0       0       0       0
2012.012.03:01.007      0       0       0       0
2012.012.03:01.008      0       0       0       -223
$ awk '$2+$3+$4+$5' input_file.txt
2012.012.00:01.000      0       100     0       0
2012.012.00:01.001      0       0       2331    -400
2012.012.01:01.002      0       88      0       -423
2012.012.01:01.003      0       85      0       0
2012.012.01:01.004      0       0       0       -437
2012.012.02:01.005      0       83      2299    0
2012.012.03:01.008      0       0       0       -223
The two lines containing only zeroes (the timestamps ending in .006 and .007) are removed from the output. Also note that you can omit the FS specification, since TAB is one of the default separators in awk. From the GNU awk user's guide:
Quote:
By default, fields are separated by whitespace, like words in a line. Whitespace in awk means any string of one or more spaces, TABs, or newlines;
...
In POSIX awk, newlines are not considered whitespace for separating fields.
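
To make the difference concrete, here is a minimal sketch on a made-up line that mixes a TAB with runs of spaces. The variable name line is only used for this example.

```shell
# Made-up record: TAB after the timestamp, then spaces, then a TAB.
line=$(printf '2012.012.00:01.000\t0   100  0\t0')

# Default FS: any run of spaces/TABs separates fields -> prints 5
printf '%s\n' "$line" | awk '{ print NF }'

# FS="\t": only a TAB separates fields -> prints 3
printf '%s\n' "$line" | awk 'BEGIN { FS = "\t" } { print NF }'
```

With the default separator the line splits into five fields regardless of how the whitespace is mixed; with FS="\t" the space-separated values stay glued together in one field.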
 
Old 02-08-2012, 03:30 PM   #9
barnac1e
Member
 
Registered: Jan 2012
Location: Moorhead, Minnesota, USA (birthplace of Slackware, ironically)
Distribution: openSuSE 13.1 - KDE
Posts: 234
Blog Entries: 1

Rep: Reputation: 9
In awk, try

match(str, regex)
match(str, regex [, array])   (gawk only)
 
Old 02-08-2012, 03:31 PM   #10
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

Rep: Reputation: 242
I don't know whether it can happen in this data, but the sum of fields 2 to 5 could be zero even though not all of them are zero. I mean:
Code:
2012.012.01:01.003      0       85      0       -85
In this case, maybe it's better to do something like:
Code:
awk '$2$3$4$5 != "0000"' input_file.txt

# or maybe safer
awk '$2 || $3 || $4 || $5' input_file.txt
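
The three tests can be compared side by side on made-up rows. This is a minimal sketch: the row labels r1 to r4 and the function name sample are invented for the example, and the last command spells out the per-field test with explicit comparisons.

```shell
# Made-up rows: values that cancel out, all-zero decimals,
# all-zero integers, and one ordinary non-zero row.
sample() {
    printf 'r1\t0\t85\t0\t-85\n'
    printf 'r2\t0.0\t0.0\t0.0\t0.0\n'
    printf 'r3\t0\t0\t0\t0\n'
    printf 'r4\t0\t100\t0\t0\n'
}

# Sum test: keeps only r4; r1 is lost because 85 + -85 == 0.
sample | awk '$2+$3+$4+$5' | cut -f1

# String test: keeps r1, r2 and r4; r2 survives because
# "0.00.00.00.0" is not the string "0000".
sample | awk '$2$3$4$5 != "0000"' | cut -f1

# Per-field numeric test: keeps r1 and r4.
sample | awk '$2 != 0 || $3 != 0 || $4 != 0 || $5 != 0' | cut -f1
```

If the real file writes zeroes as 0.0, the string comparison would presumably let those lines through, so the per-field test looks like the safer choice.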

Last edited by Cedrik; 02-08-2012 at 03:45 PM.
 
1 members found this post helpful.
Old 02-09-2012, 04:58 AM   #11
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
Thanks, Cedrik. You're absolutely right! I should have thought about this possibility.
 
Old 02-09-2012, 01:41 PM   #12
dwarf
LQ Newbie
 
Registered: Feb 2012
Location: Frankfurt
Posts: 5

Original Poster
Rep: Reputation: Disabled
Thanks colucix and Cedrik for the answers.
I've tested:
Code:
awk '$2+$3+$4+$5' input_file.txt
awk '$2 || $3 || $4 || $5' input_file.txt
awk '$2$3$4$5 != "0000"' input_file.txt
on openSUSE 12.1.

All of them return what I was looking for. I will check tomorrow on a Solaris 8 machine.
The third command makes sense to me, but I don't understand why the other two give this result,
or why my first attempt
Code:
awk 'BEGIN {FS="\t"}{ if(($2 !=0) || ($3 !=0) || ($4 !=0) || ($5 !=0)) print}'> output_file.txt
doesn't work.
Regards,
 
Old 02-09-2012, 04:35 PM   #13
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

Rep: Reputation: 242
Your first attempt didn't work because:

This expression is fine:
Code:
awk 'BEGIN {FS="\t"}{ if(($2 !=0) || ($3 !=0) || ($4 !=0) || ($5 !=0)) print}'
But your grep regular expression (which was not needed, by the way) does not match:
Code:
[0-9][0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]
It matches:

<4 digits number>
<any char>
<3 digits number>
<any char>
<2 digits number>
<any char>
<2 digits number>
<any char>
<2 digits number>
<any char>
<3 digits number>

This one matches:
Code:
^[0-9]\{4\}\.[0-9]\{3\}\.[0-9][0-9]:[0-9][0-9]\.[0-9]\{3\}

<4 digits number>
(at the start of line: ^)
<a '.' char>
<3 digits number>
<a '.' char>
<2 digits number>
< a ':' char>
<2 digits number>
<a '.' char>
<3 digits number>
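
The effect of escaping the dots can be checked with a minimal sketch on two made-up strings. The function name stamps is invented for the example, and the first pattern is an unescaped variant of the corrected expression above, not the original grep command.

```shell
# Two made-up strings: a well-formed timestamp, and one with
# letters where the separators should be.
stamps() {
    printf '2012.012.00:01.000\n'
    printf '2012x012y00z01w000\n'
}

# Unescaped "." matches any character, so both lines match -> prints 2
stamps | grep -c '^[0-9]\{4\}.[0-9]\{3\}.[0-9][0-9].[0-9][0-9].[0-9]\{3\}'

# Escaped "\." and a literal ":" match only the real timestamp -> prints 1
stamps | grep -c '^[0-9]\{4\}\.[0-9]\{3\}\.[0-9][0-9]:[0-9][0-9]\.[0-9]\{3\}'
```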

Last edited by Cedrik; 02-09-2012 at 04:46 PM.
 
Old 02-10-2012, 03:49 AM   #14
dwarf
LQ Newbie
 
Registered: Feb 2012
Location: Frankfurt
Posts: 5

Original Poster
Rep: Reputation: Disabled
I've tried the commands that work on openSUSE on a Solaris 8 machine, and none of them work there. They return an error message:
awk: syntax error near line 1
awk: bailing out near line 1
I've also tried
Code:
awk 'BEGIN {FS="\t"}{ if(($2 !=0) || ($3 !=0) || ($4 !=0) || ($5 !=0)) print}' input_file
which returns the original input_file without any changes.
Any idea what is different on Solaris?
 
Old 02-10-2012, 06:06 AM   #15
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

Rep: Reputation: 242
I think on Solaris 8 you should use /usr/bin/nawk instead of /usr/bin/awk

or if you have perl installed
Code:
perl -ane 'print if grep {$_} @F[1..4]' input_file.txt
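
If the same script has to run on both machines, one option is to pick the interpreter at run time. This is only a sketch: it assumes nawk lives at /usr/bin/nawk as mentioned above, AWK is a variable name invented for the example, and the input rows are made up.

```shell
# Use /usr/bin/nawk when it exists (assumed location on Solaris),
# otherwise fall back to whatever "awk" is on the PATH.
if [ -x /usr/bin/nawk ]; then
    AWK=/usr/bin/nawk
else
    AWK=awk
fi

# Made-up input: row "a" is all zeroes, row "b" has values that cancel out.
# Only row "b" should be printed.
printf 'a\t0\t0\t0\t0\nb\t0\t85\t0\t-85\n' |
    "$AWK" '$2 != 0 || $3 != 0 || $4 != 0 || $5 != 0'
```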

Last edited by Cedrik; 02-10-2012 at 06:10 AM.
 
1 members found this post helpful.
  


All times are GMT -5.
