LinuxQuestions.org
Old 07-06-2011, 06:44 PM   #1
lethalfang
LQ Newbie
 
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11

Rep: Reputation: Disabled
Remove lone lines from a text file


Hey, does anyone have ideas on how to remove lone lines from a text file?

If I have a file that is like this:
-----------------------------------
line 1
line 2
line 3

line 4

line 5
line 6

line 7

line 8
line 9
line 10
-----------------------------------

What command(s) will remove the lone lines of this file, i.e., line 4 and line 7?

Thanks in advance.

Last edited by lethalfang; 07-09-2011 at 08:41 PM. Reason: solved
 
Old 07-06-2011, 07:41 PM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
Slightly modified from the sed FAQ here:
Code:
sed ': more;$!N;s/\n/&/2;t enough;$!b more;: enough;/^\n.*\n$/d;P;D' file
This also deletes the blank lines around the single line. If you want to preserve them:
Code:
sed ': more;$!N;s/\n/&/2;t enough;$!b more;: enough;s/^\n.*\n$/\n/;P;D' file
If you want to keep only one blank line:
Code:
sed ': more;$!N;s/\n/&/2;t enough;$!b more;: enough;s/^\n.*\n$//;P;D' file
Hope this helps.
 
Old 07-06-2011, 07:43 PM   #3
rojak
LQ Newbie
 
Registered: May 2006
Location: Singapore
Distribution: Ubuntu 9.10
Posts: 16

Rep: Reputation: 0
Try: $ sed -i.bk '/^$/ d' myfile
 
Old 07-06-2011, 07:57 PM   #4
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,811
Blog Entries: 1

Rep: Reputation: 1191
Quote:
Originally Posted by rojak View Post
Try: $ sed -i.bk '/^$/ d' myfile
That doesn't delete lines containing just spaces or tabs.
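To also catch lines that hold only spaces or tabs, the pattern can match optional whitespace instead of a strictly empty line. A minimal sketch, assuming a sed that supports POSIX character classes; the function name delete_blank_lines is made up for illustration, and unlike the -i.bk version it writes to standard output rather than editing the file in place:

```shell
# delete_blank_lines [FILE...]: hypothetical helper, reads stdin when no file is given.
delete_blank_lines() {
    # [[:space:]]* matches zero or more spaces or tabs, so this deletes
    # both truly empty lines and whitespace-only lines.
    sed '/^[[:space:]]*$/d' "$@"
}
```

For example, `delete_blank_lines myfile > myfile.new` leaves the original untouched. Note that this still removes blank lines, not the lone text lines asked about in post #1.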
 
Old 07-06-2011, 08:17 PM   #5
lethalfang
LQ Newbie
 
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by colucix View Post
Slightly modified from the sed FAQ here:
Code:
sed ': more;$!N;s/\n/&/2;t enough;$!b more;: enough;/^\n.*\n$/d;P;D' file
This also deletes the blank lines around the single line. If you want to preserve them:
Code:
sed ': more;$!N;s/\n/&/2;t enough;$!b more;: enough;s/^\n.*\n$/\n/;P;D' file
If you want to keep only one blank line:
Code:
sed ': more;$!N;s/\n/&/2;t enough;$!b more;: enough;s/^\n.*\n$//;P;D' file
Hope this helps.
Thanks. This kinda works, but when there are multiple lone lines, it seems to only delete one at a time. For example:
------
line 1
line 2

line 3

line 4

line 5
line 6
------

The script gets rid of line 3, but not line 4.
Is there any way to get rid of all lone lines at once?

Thanks.
 
Old 07-07-2011, 01:43 PM   #6
sandwormusmc
Member
 
Registered: Nov 2006
Distribution: Fedora 15 x86_64
Posts: 76

Rep: Reputation: 24
Quote:
Originally Posted by lethalfang View Post
Thanks. This kinda works, but when there are multiple lone lines, it seems to only delete one at a time. For example:
------
line 1
line 2

line 3

line 4

line 5
line 6
------

The script gets rid of line 3, but not line 4.
Is there any way to get rid of all lone lines at once?

Thanks.
I actually found a ridiculously easy way to do this a while back that made me say "duh" at the way I'd been doing it (complex sed commands and whatnot).

Try:

Code:
 # grep . myfile
Then you can redirect that to a temp file and remove the old one if necessary ...
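The redirect-and-replace step could look roughly like this. It is only a sketch: the function name strip_empty_lines and the ".tmp" suffix are assumptions for illustration, and it presumes nothing else is using a file with that temporary name.

```shell
# strip_empty_lines FILE: hypothetical helper that rewrites FILE without empty lines.
strip_empty_lines() {
    # "grep ." keeps every line that contains at least one character.
    grep . "$1" > "$1.tmp"
    # grep exits 0 when it kept lines, 1 when every line was empty, >1 on error.
    if [ $? -le 1 ]; then
        mv "$1.tmp" "$1"
    else
        rm -f "$1.tmp"
        return 2
    fi
}
```

Checking the exit status this way matters because a plain `grep . myfile > tmp && mv tmp myfile` would skip the mv when the file contains only empty lines, since grep reports "no match" with status 1.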
 
Old 07-07-2011, 01:47 PM   #7
lethalfang
LQ Newbie
 
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by sandwormusmc View Post
I actually found a ridiculously easy way to do this a while back that made me say "duh" at the way I'd been doing it (complex sed commands and whatnot).

Try:

Code:
 # grep . myfile
Then you can redirect that to a temp file and remove the old one if necessary ...
Doesn't this just get rid of empty lines?
I'm wondering if I can get rid of the lines that have an empty line both above and below them.
 
Old 07-07-2011, 04:24 PM   #8
sandwormusmc
Member
 
Registered: Nov 2006
Distribution: Fedora 15 x86_64
Posts: 76

Rep: Reputation: 24
Guess I'm confused on what you mean by "lone lines". I assumed you meant empty lines, but are you saying you want to remove specific lines? As in "remove arbitrary line X and Y" from a set of input?
 
Old 07-07-2011, 04:58 PM   #9
lethalfang
LQ Newbie
 
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by sandwormusmc View Post
Guess I'm confused on what you mean by "lone lines". I assumed you meant empty lines, but are you saying you want to remove specific lines? As in "remove arbitrary line X and Y" from a set of input?
Yep. Basically, if a line has an empty line both above and beneath, I want that line removed.

I actually wrote a tedious and rudimentary script to do that. It kinda works, but it's totally inefficient. I can write some rudimentary bash scripts, but I'm not all that good at it.

Code:
#!/bin/bash

file=$1

# Get the total number of lines in the file
num_lines=$(cat $file | wc -l)

line_j=1

empty_var=""

while [ $line_j -le $num_lines ]
do

   # line_i is the line before line_j, and line_k is the line after line_j.
   line_i=$(( $line_j - 1 ))
   line_k=$(( $line_j + 1 ))   

   # see if those lines are empty
   val_i=$(cat $file | awk 'NR=='$line_i'' | awk '{print $1}' )
   val_k=$(cat $file | awk 'NR=='$line_k'' | awk '{print $1}' )


      if [ $val_i = $empty_var -a $val_k = $empty_var ]

         then true

      else
   
         cat $file | awk 'NR=='$line_j'' >> Duplicate_$file

      fi



   line_j=$(( $line_j + 1 ))


done
Also, the terminal keeps printing
"./identify_duplicates.sh: line 22: [: too many arguments"
for this line of code:
"if [ $val_i = $empty_var -a $val_k = $empty_var ]"

Last edited by lethalfang; 07-07-2011 at 05:00 PM.
 
Old 07-07-2011, 08:11 PM   #10
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
You can add
Code:
set -xv
as the 2nd line to see what the script is actually doing.
 
Old 07-07-2011, 10:41 PM   #11
lethalfang
LQ Newbie
 
Registered: Jun 2011
Location: San Francisco, CA
Posts: 11

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by chrism01 View Post
You can add
Code:
set -xv
as the 2nd line to see what the script is actually doing.
Ahh, that's a good tip.
The error messages went away when I changed
Code:
if [ $val_i = $empty_var -a $val_k = $empty_var ]
into
Code:
if [ "$val_i" = "$empty_var" -a "$val_k" = "$empty_var" ]
The issue seemed to be that the unquoted variables disappear from the command when they are empty. When $val_i has a non-empty value, say "STUFF", the code was reading '[' STUFF = ']', i.e., something being compared to nothing. It's a malformed comparison, but the equality test failed anyway, so the previous code still did its job.
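The effect of the quotes can be seen in a small stand-alone sketch. The variable values here are made up for illustration, and the two tests are joined with && instead of -a, which is a swapped-in style choice rather than what the script above uses:

```shell
val_i="STUFF"   # pretend the line above contains text
val_k=""        # pretend the line below is empty
empty_var=""

# Unquoted, an empty variable expands to nothing at all, so [ is handed
# an expression with missing operands. Quoted, every expansion stays a
# single argument, even when it is the empty string.
if [ "$val_i" = "$empty_var" ] && [ "$val_k" = "$empty_var" ]; then
    result="both neighbours empty"
else
    result="keep the line"
fi
echo "$result"
```

With these values the first comparison is false, so the script reports that the line should be kept.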

Now does anyone have a more efficient one-liner for that stuff? :-)
 
Old 07-07-2011, 10:50 PM   #12
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 495

Rep: Reputation: 212
Quote:
Originally Posted by lethalfang View Post
Now does anyone have a more efficient one-liner for that stuff? :-)
I'm not sure if this is more efficient or not, but it's a one-liner that does the same:

Code:
# || skips the append only when both neighbours are empty, matching the if/else in post #9
[ -z "$val_i" -a -z "$val_k" ] || awk -v n="$line_j" 'NR==n' "$file" >> "Duplicate_$file"
 
Old 07-07-2011, 11:47 PM   #13
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Arch
Posts: 3,013

Rep: Reputation: 1225
Quote:
Originally Posted by lethalfang View Post
Now does anyone have a more efficient one-liner for that stuff? :-)
Getting close to the edge of what you can reasonably call a "one-liner", but yes:
Code:
awk '{l3=l2;l2=l1;l1=$0}NR>=2&&!(l3==""&&l2!=""&&l1==""){print l2}END{print}' $file >> Duplicate_$file
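For readers who find the condensed form hard to follow, the same sliding-window idea can be spelled out over several lines. This is a sketch under stated assumptions, not a drop-in replacement: the names remove_lone_lines, before and prev are made up, "empty" means a line with no characters at all, and the first and last lines of the file are always kept (the one-liner above instead treats the start of the file as if an empty line preceded it). It writes to standard output instead of appending to Duplicate_$file.

```shell
# remove_lone_lines [FILE...]: hypothetical helper, reads stdin when no file is given.
remove_lone_lines() {
    awk '
        # From the second line on, decide the fate of the previous line.
        # It is dropped only when it has text and both the line before it
        # and the current line are empty.
        NR > 1 {
            if (!(NR > 2 && before == "" && prev != "" && $0 == ""))
                print prev
        }
        # Slide the window forward by one line.
        { before = prev; prev = $0 }
        # The last line has no line after it, so it is always printed.
        END { if (NR > 0) print prev }
    ' "$@"
}
```

Because each decision looks only at the original neighbouring lines, consecutive lone lines such as "line 3" and "line 4" in post #5 are both removed in a single pass, and the empty lines around them are left in place.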
 
Old 07-08-2011, 12:02 AM   #14
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,253

Rep: Reputation: 2686
Well, it's not pretty and can probably be condensed, but this seems to work:
Code:
awk 'x && NF{ y=1 }y{ print x }{if(NF)x = $0;else{ if(y)print; x = y = 0}}END{if(y)print x}' file
 
Old 07-08-2011, 12:47 AM   #15
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
Re post #11: double brackets [[ ]] work better: http://tldp.org/LDP/abs/html/testcon...ml#DBLBRACKETS
 
  


All times are GMT -5. The time now is 10:31 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration