LinuxQuestions.org
Old 10-13-2016, 12:59 AM   #1
Johng
Member
 
Registered: Feb 2002
Location: NZ
Distribution: Mint Suse
Posts: 364

Rep: Reputation: 30
Remove lines from a file using sed


I'm trying to remove lines from an xml file using sed.

The lines are in the form: <time>2013-12-11T00:57:51.000Z</time> where the text in the middle (between <time> and </time>) varies.
 
Old 10-13-2016, 01:07 AM   #2
Timothy Miller
Moderator
 
Registered: Feb 2003
Location: Arizona, USA
Distribution: Debian, EndeavourOS
Posts: 3,765
Blog Entries: 9

Rep: Reputation: 1378
Do all the lines you want to remove start with <time>? Are there any lines starting with <time> that you DON'T want to remove? If not, then it's pretty easy:
Code:
sed -i '/^<time>/d' file
If you wanted to see the changes before actually writing them, then simply omit the -i
Code:
sed '/^<time>/d' file
and it would print the file to the screen for you to verify that's what you wanted. Or you could output it to a second file to make sure it's what you wanted
Code:
sed '/^<time>/d' file > file2
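A further option, offered as a hedged aside: both GNU and BSD sed accept a suffix attached directly to -i, which saves a copy of the original before the file is edited in place. A minimal sketch, assuming a throwaway directory and a made-up file name (sample.gpx):

```shell
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' '<time>2013-12-11T00:57:51.000Z</time>' 'text' '<nottime>' > "$workdir/sample.gpx"

# Edit in place; the untouched original is kept as sample.gpx.bak.
sed -i.bak '/^<time>/d' "$workdir/sample.gpx"

# Capture both results so they can be inspected.
edited=$(cat "$workdir/sample.gpx")
backup_lines=$(wc -l < "$workdir/sample.gpx.bak" | tr -d ' ')
printf 'edited file:\n%s\nlines kept in backup: %s\n' "$edited" "$backup_lines"
```

If the result is not what you wanted, the .bak copy can simply be moved back over the edited file.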

Last edited by Timothy Miller; 10-13-2016 at 01:08 AM.
 
Old 10-13-2016, 01:07 AM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 20,180

Rep: Reputation: 3755
In that case, show us what you've tried.
 
Old 10-13-2016, 06:09 PM   #4
Johng
Member
 
Registered: Feb 2002
Location: NZ
Distribution: Mint Suse
Posts: 364

Original Poster
Rep: Reputation: 30
Thanks Timothy for your reply. I copied and pasted your first suggestion (with the file named), but there was no change to the file. So I did the second command. It created a new file, but it was 0B (i.e. empty).

syg00: This is one of my attempts:
Code:
sed -e 's/<time>.*<\</time>//' Karapoti.gpx
sed  '/<time>.*<\</time>/,$d' Karapoti.gpx
 
Old 10-13-2016, 06:20 PM   #5
Johng
Member
 
Registered: Feb 2002
Location: NZ
Distribution: Mint Suse
Posts: 364

Original Poster
Rep: Reputation: 30
And then I tried:
Code:
sed -e '/<time>/ { d; }' Karapoti.gpx > Karapoti.txt
and this worked for me. Thank you.
 
Old 10-13-2016, 06:21 PM   #6
Timothy Miller
Moderator
 
Registered: Feb 2003
Location: Arizona, USA
Distribution: Debian, EndeavourOS
Posts: 3,765
Blog Entries: 9

Rep: Reputation: 1378
Not sure why my suggestion didn't work for you:

Code:
[tmiller@ab-app-blade ~]$ cat test.gpx
<time>2013-12-11T00:57:51.000Z</time>
<time>2013</time>
text
<nottime>
<time>2013-12-11T00:57:51.000Z</time>
other text
<start>not time</start>
<time>2013-12-11T00:57:51.000Z</time>
stop
Code:
[tmiller@ab-app-blade ~]$ sed '/^<time>/d' test.gpx
text
<nottime>
other text
<start>not time</start>
stop
 
Old 10-13-2016, 06:27 PM   #7
goumba
Senior Member
 
Registered: Dec 2009
Location: New Jersey, USA
Distribution: Current: Debian and OpenSUSE. Past: Arch, RedHat (pre-RHEL). FreeBSD & OpenBSD novice, Hackintosh
Posts: 1,193
Blog Entries: 7

Rep: Reputation: 336
Quote:
Originally Posted by Johng View Post
Thanks Timothy for your reply. I copied/paste your first suggestion (with named file), but there was no change to the file. So I did the second command. It created a new file, but it was 0B (ie empty).

syg00: This is one of my attempts:
Code:
sed -e 's/<time>.*<\</time>//' Karapoti.gpx
sed  '/<time>.*<\</time>/,$d' Karapoti.gpx
Is there any whitespace before the <time> string?

If so, you'll want to modify Timothy Miller's regexes:

Code:
sed '/^[[:space:]]*<time>/d' file
This will match any whitespace (spaces, tabs, etc.) before <time>.
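To see the difference the leading-whitespace pattern makes, here is a small sketch. The sample data is made up and only assumes that the <time> line is indented, as is common in GPX files:

```shell
# Three lines of made-up input; the <time> line is indented by two spaces.
sample=$(printf '%s\n' '<trkpt lat="0" lon="0">' '  <time>2013-12-11T00:57:51.000Z</time>' '</trkpt>')

# Anchored at the very start of the line: the indented line does not match,
# so nothing is deleted.
strict=$(printf '%s\n' "$sample" | sed '/^<time>/d')

# Allowing optional leading whitespace: the indented line is deleted.
relaxed=$(printf '%s\n' "$sample" | sed '/^[[:space:]]*<time>/d')

printf 'strict:\n%s\n\nrelaxed:\n%s\n' "$strict" "$relaxed"
```

The "strict" output is identical to the input, which would look like "no change to the file" when run with -i.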
 
Old 10-13-2016, 06:51 PM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 20,180

Rep: Reputation: 3755
Quote:
Originally Posted by Johng View Post
This is one of my attempts:
Code:
sed -e 's/<time>.*<\</time>//' Karapoti.gpx
sed  '/<time>.*<\</time>/,$d' Karapoti.gpx
That fails because you are escaping the (second) "<" when what needed escaping was the forward slash "/", so "\/" (and the extra "<" has to be removed).
Personally I prefer a specific regex, since it reduces the risk of catching (in this case deleting) unintended data. So I like your attempt here better than your "successful" one.
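One way to sidestep the escaping problem altogether is to use a delimiter other than "/". In sed, an address can use any delimiter as long as the first one is introduced with a backslash. The sketch below uses made-up data, including an XML comment that mentions <time>, to show why the more specific pattern is safer than a bare /<time>/:

```shell
# Made-up input: a comment that mentions <time>, a real <time> element, and another element.
sample=$(printf '%s\n' '<!-- <time> elements are removed below -->' ' <time>2013-12-11T00:57:51.000Z</time>' '<name>Karapoti</name>')

# Loose pattern: deletes every line containing "<time>", including the comment.
loose=$(printf '%s\n' "$sample" | sed '/<time>/d')

# Specific pattern with "|" as the delimiter, so "/" needs no escaping.
# Only lines containing both the opening and the closing tag are deleted.
specific=$(printf '%s\n' "$sample" | sed '\|<time>.*</time>|d')

printf 'loose:\n%s\n\nspecific:\n%s\n' "$loose" "$specific"
```

The loose version keeps only the <name> line, while the specific version keeps the comment as well.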
 
Old 10-15-2016, 03:48 AM   #9
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,882

Rep: Reputation: 835
Code:
sed 's/<time>.*<\/time>//'
substitutes the matching part of the lines with an empty string.
Code:
sed '/<time>.*<\/time>/d'
deletes the matching lines, just like
Code:
grep -v '<time>.*</time>'
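The practical difference between the three commands shows up in the number of lines left over. A short sketch with made-up input:

```shell
# Made-up input with one indented <time> line.
sample=$(printf '%s\n' '<trkpt>' '  <time>2013-12-11T00:57:51.000Z</time>' '</trkpt>')

# s/// removes the matched text but keeps the line itself
# (here a line holding only the two leading spaces remains).
substituted=$(printf '%s\n' "$sample" | sed 's/<time>.*<\/time>//')

# d drops the whole line; grep -v gives the same result.
deleted=$(printf '%s\n' "$sample" | sed '/<time>.*<\/time>/d')
filtered=$(printf '%s\n' "$sample" | grep -v '<time>.*</time>')

# Compare line counts: the s/// version still has three lines, the other two have two.
sub_count=$(printf '%s\n' "$substituted" | wc -l | tr -d ' ')
del_count=$(printf '%s\n' "$deleted" | wc -l | tr -d ' ')
printf 'substituted: %s lines, deleted: %s lines\n' "$sub_count" "$del_count"
```

So if the goal is to remove the lines entirely, d (or grep -v) is the better fit; s leaves blank or whitespace-only lines behind.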
 
Old 10-15-2016, 04:44 AM   #10
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 6,083
Blog Entries: 3

Rep: Reputation: 3180
"sed" is kind of difficult for XML parsing. There's a perl module, XML::TreeBuilder, that does that already and is easy to use. It does require well-formed XML though.

Code:
#!/usr/bin/perl

use warnings;
use strict;
use XML::TreeBuilder;

my $file = shift || '/dev/stdin';

my $root = XML::TreeBuilder->new;

$root->parse_file( $file )
        or die( "Could not parse '$file' : $! \n");

while ( my $zap = $root->look_down( _tag => q(time) ) ) {
        $zap->delete;
}

print $root->as_XML(undef, "  ");

exit ( 0 );
That uses the delete method to delete a part of the tree.

Last edited by Turbocapitalist; 10-15-2016 at 04:45 AM.
 
Old 10-15-2016, 05:03 AM   #11
Johng
Member
 
Registered: Feb 2002
Location: NZ
Distribution: Mint Suse
Posts: 364

Original Poster
Rep: Reputation: 30
Thank you all for your help, I'm better informed now. And, yes there was a space before <time> which I did not mention.
 
  

