LinuxQuestions.org
Old 03-14-2011, 10:56 PM   #1
Stevy12
LQ Newbie
 
Registered: Mar 2011
Posts: 3

Rep: Reputation: 1
Remove lines with sed


Hi

I have a large file and need to remove every line that contains one or more symbols.

For example: . , ! " # $ % & / ( ) = ? ' + * { } ] [ - _ : ; , > < (maybe more)

Thanks in advance!
 
Old 03-14-2011, 11:39 PM   #2
GlennsPref
Senior Member
 
Registered: Apr 2004
Location: Brisbane, Australia
Distribution: Mageia Studio-13.37 Kubuntu.
Posts: 3,325
Blog Entries: 33

Rep: Reputation: 199
Quote:
Hi, Welcome to LQ!

LQ has a fantastic search function that may save you time waiting for an answer to a popular question.

With over 4 million posts to search it's possible the answer has been given.

Some tutes here from IBM.

The first one may be the one.
What you want to do is remove all chars except [A-Z], [a-z] and [0-9].
The regexp for "any other char" is a single negated bracket expression: [^A-Za-z0-9]

http://www.ibm.com/developerworks/li...ry/l-sed2.html

http://www.ibm.com/developerworks/vi...for+a+new+user

http://www.regular-expressions.info

http://sed.sourceforge.net/sed1line.txt

Regards Glenn
 
Old 03-15-2011, 12:10 AM   #3
Stevy12
LQ Newbie
 
Registered: Mar 2011
Posts: 3

Original Poster
Rep: Reputation: 1
Hi GlennsPref

I don't want to remove chars; I want to remove the whole lines that contain one or more symbols.
I have already read about sed on some Spanish sites. By the way, I can't understand English tutorials well, so it's a bit hard to learn from these sites.

Anyway, I will take a look.
 
Old 03-15-2011, 12:34 AM   #4
GlennsPref
Senior Member
 
Registered: Apr 2004
Location: Brisbane, Australia
Distribution: Mageia Studio-13.37 Kubuntu.
Posts: 3,325
Blog Entries: 33

Rep: Reputation: 199
Using a regexp that matches only lines made up entirely of letters and digits, ^[A-Za-z0-9]*$

Code:
# print only lines which match regular expression (emulates "grep")
 sed -n '/regexp/p'           # method 1
 sed '/regexp/!d'             # method 2
and redirect the output to a file (inputfile below stands for your own file).

Code:
sed -n '/^[A-Za-z0-9]*$/p' inputfile > ~/filename
Lines containing symbols (or spaces) do not appear in the output.

Or the other way 'round.
Code:
# print only lines which do NOT match regexp (emulates "grep -v")
 sed -n '/regexp/!p'          # method 1, corresponds to above
 sed '/regexp/d'              # method 2, simpler syntax
This is the theory of it anyway. (awk, grep, sed and vi)
Hope that helps, you'll have to experiment.
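
As a rough check of the one-liners above, the sketch below runs both "keep" methods and both "drop" methods on a small made-up input and compares the results. The sample lines and the regexp [0-9] are assumptions chosen only for the demonstration.

```shell
#!/bin/sh
# Hypothetical sample input, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'abc123' 'hello!' 'World' 'a_b' > "$sample"

# Keep only lines matching the regexp (here: lines containing a digit).
keep_p=$(sed -n '/[0-9]/p' "$sample")   # method 1
keep_d=$(sed '/[0-9]/!d' "$sample")     # method 2

# Drop lines matching the regexp (emulates "grep -v").
drop_p=$(sed -n '/[0-9]/!p' "$sample")  # method 1
drop_d=$(sed '/[0-9]/d' "$sample")      # method 2

# Each pair should produce identical output.
[ "$keep_p" = "$keep_d" ] && echo "keep: both methods agree"
[ "$drop_p" = "$drop_d" ] && echo "drop: both methods agree"

rm -f "$sample"
```

The single quotes matter in an interactive bash session: they stop the shell from treating the ! as history expansion.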

Cheers Glenn
 
Old 03-15-2011, 01:42 AM   #5
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,589

Rep: Reputation: 1942
The 'd' command is indeed the easier option, and the easier regex uses a character class:
Code:
sed '/[^[:alnum:]]/d' file
Include the '-i' option to edit the file in place, or redirect the output if a separate file is required.
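
A minimal sketch of what this command does, using a made-up file (the directory, file names and contents are assumptions, not from the original post). It uses the redirect form; a space counts as a non-alphanumeric character here, while an empty line has nothing to match and is kept.

```shell
#!/bin/sh
# Hypothetical working directory and input file for the demonstration.
workdir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta!' 'gamma42' 'one two' 'x_y' '' > "$workdir/words.txt"

# Delete every line containing at least one non-alphanumeric character,
# writing the result to a separate file.
# (With GNU sed, adding -i would rewrite words.txt itself instead.)
sed '/[^[:alnum:]]/d' "$workdir/words.txt" > "$workdir/clean.txt"

kept_lines=$(wc -l < "$workdir/clean.txt" | tr -d ' ')
first_kept=$(sed -n '1p' "$workdir/clean.txt")
second_kept=$(sed -n '2p' "$workdir/clean.txt")

echo "lines kept: $kept_lines"
cat "$workdir/clean.txt"

rm -rf "$workdir"
```

On this input the lines 'beta!', 'one two' and 'x_y' are removed, and the trailing empty line survives.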
 
Old 03-15-2011, 04:34 AM   #6
GlennsPref
Senior Member
 
Registered: Apr 2004
Location: Brisbane, Australia
Distribution: Mageia Studio-13.37 Kubuntu.
Posts: 3,325
Blog Entries: 33

Rep: Reputation: 199
Hmm, I got this, still looking for something more succinct.

Code:
bash-4.1$ sed -e '/[^A-Z][^a-z][^0-9]/d' -e '/:/d; / *#/d; /^ *$/d' /home/glenn/build/scripting/filename1
checking sed to remove lines with symbols 0123456789
 sed G
bash-4.1$
input file, /home/glenn/build/scripting/filename1

Code:
http://sed.sourceforge.net/sed1line.txt
-------------------------------------------------------------------------
USEFUL ONE-LINE SCRIPTS FOR SED (Unix stream editor)        Dec. 29, 2005
Compiled by Eric Pement - pemente[at]northpark[dot]edu        version 5.5

Latest version of this file (in English) is usually at:
   http://sed.sourceforge.net/sed1line.txt
   http://www.pement.org/sed/sed1line.txt

This file will also available in other languages:
  Chinese     - http://sed.sourceforge.net/sed1line_zh-CN.html
  Czech       - http://sed.sourceforge.net/sed1line_cz.html
  Dutch       - http://sed.sourceforge.net/sed1line_nl.html
  French      - http://sed.sourceforge.net/sed1line_fr.html
  German      - http://sed.sourceforge.net/sed1line_de.html
  Italian     - (pending)
  Portuguese  - http://sed.sourceforge.net/sed1line_pt-BR.html
  Spanish     - (pending)

#ref. http://www.ibm.com/developerworks/linux/library/l-sed2.html
#sed script that will remove HTML tags from a file
sed -e 's/<[^>]*>//g' myfile.html
checking sed to remove lines with symbols 0123456789
# Rem blank lines and # comments

# Use following sed magic to remove both comments and empty lines at the same expense:

sed '/ *#/d; /^ *$/d'

#SED processes whatever you give it, and displays it on "STDOUT"---by default, your terminal window. It does not change filenames---that is done with the "mv" command.

#why "ls -d" ?

#I think you need something like this:
for filename in *; do newname= $(sed 's/+//g' $filename); mv $filename $newname; done

To drill down in the directory tree, use "$(ls -R) instead of "*"

sed -e '/[^.][^,][^!][^"][^#][^$][^%][^&][^/][^(][^)][^=][^?][^][^][^'][^][^+][^*][^][^{][^}][^]][^[][^-][^_][^:][^]][:blank:][:alnum:]/d' /home/glenn/filename1
sed s -e '/[^\.][^\,][^\!][^\"][^\#][^\$][^\%][^\&][^\/][^\(][^\)][^\=][^\?][^\][^\][^\'][^\][^\+][^\*][^\][^\{][^\}][^\]][^\[][^\-][^\_][^\:][^\]][:blank:][:alnum:]/d' /home/glenn/filename1
sed -e '/[[:blank:]][[:alnum:]]/d' /home/glenn/filename1
cat /home/glenn/filename1 | sed -d '/#\.\*\[\]\\\/\$\^\-\_\?/d'
cat /home/glenn/filename1 | sed -e '/#\*\[\]\\/d'
cat /home/glenn/filename1 | sed -e '/#\.\*\[\]\\\/\$\^\-\_\?/d'



FILE SPACING:

 # double space a file
 sed G

 # double space a file which already has blank lines in it. Output file
 # should contain no more than one blank line between lines of text.
 sed '/^$/d;G'

 # triple space a file
 sed 'G;G'

 # undo double-spacing (assumes even-numbered lines are always blank)
 sed 'n;d'

 # insert a blank line above every line which matches "regex"
 sed '/regex/{x;p;x;}'

 # insert a blank line below every line which matches "regex"
 sed '/regex/G'

 # insert a blank line above and below every line which matches "regex"
 sed '/regex/{x;p;x;G;}'

NUMBERING:

 # number each line of a file (simple left alignment). Using a tab (see
 # note on '\t' at end of file) instead of space will preserve margins.
 sed = filename | sed 'N;s/\n/\t/'

 # number each line of a file (number on left, right-aligned)
 sed = filename | sed 'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /'

 # number each line of file, but only print numbers if line is not blank
 sed '/./=' filename | sed '/./N; s/\n/ /'

 # count lines (emulates "wc -l")
 sed -n '$='

TEXT CONVERSION AND SUBSTITUTION:

 # IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
 sed 's/.$//'               # assumes that all lines end with CR/LF
 sed 's/^M$//'              # in bash/tcsh, press Ctrl-V then Ctrl-M
 sed 's/\x0D$//'            # works on ssed, gsed 3.02.80 or higher

 # IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
 sed "s/$/`echo -e \\\r`/"            # command line under ksh
 sed 's/$'"/`echo \\\r`/"             # command line under bash
 sed "s/$/`echo \\\r`/"               # command line under zsh
 sed 's/$/\r/'                        # gsed 3.02.80 or higher

 # IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format.
 sed "s/$//"                          # method 1
 sed -n p                             # method 2

 # IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
 # Can only be done with UnxUtils sed, version 4.0.7 or higher. The
 # UnxUtils version can be identified by the custom "--text" switch
 # which appears when you use the "--help" switch. Otherwise, changing
 # DOS newlines to Unix newlines cannot be done with sed in a DOS
 # environment. Use "tr" instead.
 sed "s/\r//" infile >outfile         # UnxUtils sed v4.0.7 or higher
 tr -d \r <infile >outfile            # GNU tr version 1.22 or higher

 # delete leading whitespace (spaces, tabs) from front of each line
 # aligns all text flush left
 sed 's/^[ \t]*//'                    # see note on '\t' at end of file

 # delete trailing whitespace (spaces, tabs) from end of each line
 sed 's/[ \t]*$//'                    # see note on '\t' at end of file

 # delete BOTH leading and trailing whitespace from each line
 sed 's/^[ \t]*//;s/[ \t]*$//'

 # insert 5 blank spaces at beginning of each line (make page offset)
 sed 's/^/     /'

 # align all text flush right on a 79-column width
 sed -e :a -e 's/^.\{1,78\}$/ &/;ta'  # set at 78 plus 1 space

 # center all text in the middle of 79-column width. In method 1,
 # spaces at the beginning of the line are significant, and trailing
 # spaces are appended at the end of the line. In method 2, spaces at
 # the beginning of the line are discarded in centering the line, and
 # no trailing spaces appear at the end of lines.
 sed  -e :a -e 's/^.\{1,77\}$/ & /;ta'                     # method 1
 sed  -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/'  # method 2

 # substitute (find and replace) "foo" with "bar" on each line
 sed 's/foo/bar/'             # replaces only 1st instance in a line
 sed 's/foo/bar/4'            # replaces only 4th instance in a line
 sed 's/foo/bar/g'            # replaces ALL instances in a line
 sed 's/\(.*\)foo\(.*foo\)/\1bar\2/' # replace the next-to-last case
 sed 's/\(.*\)foo/\1bar/'            # replace only the last case

 # substitute "foo" with "bar" ONLY for lines which contain "baz"
 sed '/baz/s/foo/bar/g'

 # substitute "foo" with "bar" EXCEPT for lines which contain "baz"
 sed '/baz/!s/foo/bar/g'

 # change "scarlet" or "ruby" or "puce" to "red"
 sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g'   # most seds
 gsed 's/scarlet\|ruby\|puce/red/g'                # GNU sed only

 # reverse order of lines (emulates "tac")
 # bug/feature in HHsed v1.5 causes blank lines to be deleted
 sed '1!G;h;$!d'               # method 1
 sed -n '1!G;h;$p'             # method 2

 # reverse each character on the line (emulates "rev")
 sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'

 # join pairs of lines side-by-side (like "paste")
 sed '$!N;s/\n/ /'

 # if a line ends with a backslash, append the next line to it
 sed -e :a -e '/\\$/N; s/\\\n//; ta'
Will that do?

@Grail.
On my system that command erased everything except blank lines.

I'm still learning these tools. Hope this wasn't homework!!!

Regards Glenn
 
Old 03-15-2011, 04:48 AM   #7
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
Quote:
Originally Posted by GlennsPref
On my system that command erased everything except blank lines.
Maybe we have to keep lines that have spaces (or tabs) between the words:
Code:
sed '/[^[:alnum:][:space:]]/d' filename
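
To illustrate the difference, here is a small sketch that runs the stricter pattern and the one with [:space:] added on the same made-up input (the sample lines are assumptions for the demo).

```shell
#!/bin/sh
# Hypothetical sample input, created only for this demonstration.
sample=$(mktemp)
printf 'plain words here\n' > "$sample"
printf 'tab\tseparated\n' >> "$sample"
printf 'price: $5\n' >> "$sample"
printf 'NoSpaces99\n' >> "$sample"

# Stricter pattern: spaces and tabs count as "symbols" too.
strict=$(sed '/[^[:alnum:]]/d' "$sample")

# Relaxed pattern: whitespace is allowed, only real symbols delete a line.
relaxed=$(sed '/[^[:alnum:][:space:]]/d' "$sample")

echo "strict:";  printf '%s\n' "$strict"
echo "relaxed:"; printf '%s\n' "$relaxed"

rm -f "$sample"
```

With the stricter pattern only 'NoSpaces99' is left; with the relaxed one only the 'price: $5' line is dropped.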
 
Old 03-15-2011, 05:39 AM   #8
GlennsPref
Senior Member
 
Registered: Apr 2004
Location: Brisbane, Australia
Distribution: Mageia Studio-13.37 Kubuntu.
Posts: 3,325
Blog Entries: 33

Rep: Reputation: 199
@colucix, That one works well.

Quote:
sed '/[^[:alnum:][:space:]]/d' filename
But it left the blank lines behind too.

I'm having trouble appending another regexp to remove leading spaces.

Cheers Glenn
 
Old 03-15-2011, 06:29 AM   #9
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654
You can also use the -v option in grep to exclude lines matching the patterns, leaving the rest of the lines.
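
A short sketch of the grep approach, reusing the bracket expression from post #7. The sample file and its contents are made up for the demonstration; on this input grep -v and sed's d command should give the same result.

```shell
#!/bin/sh
# Hypothetical sample input, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'keep me' 'drop: me' 'Also123' '#comment' > "$sample"

# grep -v prints only the lines that do NOT match the pattern.
with_grep=$(grep -v '[^[:alnum:][:space:]]' "$sample")

# The sed equivalent, for comparison.
with_sed=$(sed '/[^[:alnum:][:space:]]/d' "$sample")

printf '%s\n' "$with_grep"
[ "$with_grep" = "$with_sed" ] && echo "grep -v and sed agree"

rm -f "$sample"
```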
 
Old 03-15-2011, 06:32 AM   #10
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
You can put together multiple expressions using multiple -e options. For example:
Code:
sed -e '/[^[:alnum:][:space:]]/d' -e 's/^[[:space:]]*//' -e '/^$/d' filename
The second one removes leading spaces, if any. The third one removes empty lines. Since a line may contain only spaces, it is better to keep the expressions in this order: first the spaces are removed, then the resulting empty line is deleted.
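
The effect of the ordering can be seen in a small sketch. The sample input is made up for the demo, and the "wrong order" variant is only there for comparison: it swaps the last two expressions, so a spaces-only line is stripped after the empty-line check and ends up as a blank line in the output.

```shell
#!/bin/sh
# Hypothetical sample input, created only for this demonstration.
sample=$(mktemp)
printf '   indented text\n' > "$sample"
printf 'has, punctuation\n' >> "$sample"
printf '     \n' >> "$sample"          # a line of spaces only
printf '\n' >> "$sample"               # a truly empty line
printf 'last line\n' >> "$sample"

# Order from the post: delete symbol lines, strip leading spaces, delete empties.
right_order=$(sed -e '/[^[:alnum:][:space:]]/d' -e 's/^[[:space:]]*//' -e '/^$/d' "$sample")

# Swapped order: the spaces-only line passes the /^$/ test and is emptied afterwards.
wrong_order=$(sed -e '/[^[:alnum:][:space:]]/d' -e '/^$/d' -e 's/^[[:space:]]*//' "$sample")

echo "right order:"; printf '%s\n' "$right_order"
echo "wrong order:"; printf '%s\n' "$wrong_order"

rm -f "$sample"
```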
 
Old 03-15-2011, 07:54 AM   #11
GlennsPref
Senior Member
 
Registered: Apr 2004
Location: Brisbane, Australia
Distribution: Mageia Studio-13.37 Kubuntu.
Posts: 3,325
Blog Entries: 33

Rep: Reputation: 199
Beautiful! colucix.

I hope the OP likes it.

Glenn
 
Old 03-15-2011, 08:13 AM   #12
Animal X
LQ Newbie
 
Registered: Jun 2009
Location: SW VA
Distribution: Ubuntu, Mepis, Fedora, Clonezilla, GPartEd
Posts: 11
Blog Entries: 1

Rep: Reputation: 1
Does it have to be sed? grep -v with the regex for your symbols will give you what you want.

Edit: ah, just saw that someone already mentioned it.

Last edited by Animal X; 03-15-2011 at 08:15 AM.
 
Old 03-16-2011, 12:09 PM   #13
Stevy12
LQ Newbie
 
Registered: Mar 2011
Posts: 3

Original Poster
Rep: Reputation: 1
Thanks everyone, this helped me so much.
 
Old 03-16-2011, 07:10 PM   #14
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,589

Rep: Reputation: 1942
Glad you got a solution. Please mark as SOLVED once you are satisfied.
 
  



Tags
regular expressions, sed


All times are GMT -5.