LinuxQuestions.org
Old 07-10-2006, 08:09 AM   #1
hfawzy
Member
 
Registered: Aug 2002
Location: Egypt
Distribution: Debian Sarge, Slackware 10.0
Posts: 163

Rep: Reputation: 30
remove some text from a file


I want to replace all instances of
Quote:
\textit{Some text here}
simply with
Quote:
Some text here
in a file.
In other words, I want to remove all the "\textit{" and the corresponding "}" in that file.
How can I do this?
Thank you.
 
Old 07-10-2006, 08:33 AM   #2
spirit receiver
Member
 
Registered: May 2006
Location: Frankfurt, Germany
Distribution: SUSE 10.2
Posts: 424

Rep: Reputation: 33
If each instance is in a single line and there are no nested curly brackets, the following should work:
Code:
sed 's/\\textit{\([^}]*\)}/\1/g'
Otherwise, I'd suggest using Perl, or removing only the "\textit" part and leaving the braces in place; they shouldn't do any harm in TeX.
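
A minimal sketch of the "leave the braces in place" suggestion, assuming a POSIX shell and sed. The sample line and the variable name nested are made up for illustration. It shows how the [^}]* one-liner leaves an unbalanced brace when braces are nested, while dropping only the \textit keeps every pair intact:

```shell
# Hypothetical sample line with nested braces (not from the original file).
nested='A \textit{nested {inner} text} end'

# The one-liner stops at the first "}", so the inner "{" loses its partner.
printf '%s\n' "$nested" | sed 's/\\textit{\([^}]*\)}/\1/g'
# prints: A nested {inner text} end

# Removing only "\textit" leaves all brace pairs balanced.
printf '%s\n' "$nested" | sed 's/\\textit{/{/g'
# prints: A {nested {inner} text} end
```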
 
Old 07-10-2006, 08:36 AM   #3
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 680
You could use sed.
sed 's/\\textit{\([^}]*\)}/\1/g' oldfile >newfile
This will remove the \textit from your file if it isn't split up between two or more lines.

>cat sample
This is a test. \textit{This is a sample line of Latex}. This is more on the line.
This is a second line.
This is the third line.
This \textit{italic text is
divided on two lines.}
This is a line without italic text.


sed -e 's/\\textit{\([^}]*\)}/\1/g' -e '/\\textit{\([^}]*\)$/{ N;' -e '/}/s/\\textit{\([^}]*\)}/\1/}' sample
This is a test. This is a sample line of Latex. This is more on the line.
This is a second line.
This is the third line.
This italic text is
divided on two lines.
This is a line without italic text.

The above example handles \textit{ .* } split across two lines. For three or more lines you will need branching, which is easier to write as a sed program file than as '-e' options on a one-liner:
sed -f sedprogram.sed sample >newsample

The above example will remove the \textit{ part even if there isn't a closing } so it isn't perfect. Also, for sed programs it is important to check if the pattern to be replaced works when it is on the last line.
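
The thread never shows what sedprogram.sed contains, so the following is only a sketch of what such a branching program might look like, not the poster's actual file. The names sed_program and strip_textit are hypothetical. It keeps appending lines while the pattern space ends inside an unclosed \textit{, and guards N with $! so that nothing is lost when the unclosed text sits on the last line:

```shell
# Hypothetical contents for sedprogram.sed (an assumption, not the original file).
sed_program='
:a
/\\textit{[^}]*$/{
  $!{
    N
    ba
  }
}
s/\\textit{\([^}]*\)}/\1/g
'

# Run the program on stdin or on the files given as arguments.
strip_textit() { sed "$sed_program" "$@"; }

# Italic text spread over three lines.
printf '%s\n' 'This \textit{italic text is' 'divided over' 'three lines.} Done.' | strip_textit
```

Saving the program text to sedprogram.sed and running sed -f sedprogram.sed sample >newsample should behave the same way. A \textit{ that is never closed is left untouched.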

Last edited by jschiwal; 07-10-2006 at 09:08 AM.
 
Old 07-10-2006, 11:04 AM   #4
hfawzy
Member
 
Registered: Aug 2002
Location: Egypt
Distribution: Debian Sarge, Slackware 10.0
Posts: 163

Original Poster
Rep: Reputation: 30
Thanks for the replies. The sed command worked great.
To get used to sed and know how to use it next time, I would like to understand the command you gave:
Quote:
sed 's/\\textit{\([^}]*\)}/\1/g' oldfile >newfile
As I'm not familiar with regular expressions, I would like to understand this part: \([^}]*\)
Is anyone willing to explain?
Thank you.
 
Old 07-10-2006, 01:04 PM   #5
spirit receiver
Member
 
Registered: May 2006
Location: Frankfurt, Germany
Distribution: SUSE 10.2
Posts: 424

Rep: Reputation: 33
I'll begin with the "center" of that regular expression. The brackets [] specify a set of characters, any one of which may match at that position. But ^ as the first character in [] negates the set, i.e. we match any character except those listed in [].
Thus [^}] matches any single character except a closing curly brace. We don't want just one such character but arbitrarily many (possibly none), which is why [^}] is followed by *.
So far, \\textit{[^}]*} will match \textit followed by some text in curly braces (the doubled backslash stands for one literal backslash). We need to store the content of the braces for later reference, which is why it is enclosed in \( .. \). Then we can use \1 in the replacement to insert the text matched by the first \( .. \) pair.
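
To see the pieces at work, here is a small sketch that builds the expression up step by step, assuming a POSIX shell and sed. The sample text in line is made up for illustration:

```shell
# Hypothetical sample line with two \textit{...} groups.
line='A \textit{first} and \textit{second} phrase.'

# Without \( \): the whole construct matches and is replaced by a marker.
printf '%s\n' "$line" | sed 's/\\textit{[^}]*}/X/g'
# prints: A X and X phrase.

# With \( \) and \1: only the text between the braces is kept.
printf '%s\n' "$line" | sed 's/\\textit{\([^}]*\)}/\1/g'
# prints: A first and second phrase.

# \1 can sit anywhere in the replacement, e.g. between other characters.
printf '%s\n' "$line" | sed 's/\\textit{\([^}]*\)}/<\1>/g'
# prints: A <first> and <second> phrase.
```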
 
Old 07-10-2006, 02:40 PM   #6
hfawzy
Member
 
Registered: Aug 2002
Location: Egypt
Distribution: Debian Sarge, Slackware 10.0
Posts: 163

Original Poster
Rep: Reputation: 30
Thank you for taking the time to explain, spirit receiver.
I really appreciate that.
 
Old 07-10-2006, 02:47 PM   #7
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 680
FYI, you might want to read the sed manual. Also, there is a "man regex" page which might help a little with understanding regular expressions, though a Google search would probably return something more readable.

Also, if you edit in vim, you can do the same thing with the command:
:s/\\textit{\([^}]*\)}/\1/

This performs the replacement on the current line. For the entire document, use :%s/\\textit{\([^}]*\)}/\1/g
Using [^}]* rather than .* keeps the match from running past the first closing brace. As with the sed one-liner, this won't work when the italic text is split over two or more lines.
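
One caveat about .* in such a pattern: it is greedy, so on a line with more than one closing brace it runs to the last one. The sketch below compares the two forms using sed rather than vim, since the same matching rule applies there. The sample text in two is made up:

```shell
# Hypothetical line with two \textit{...} groups.
two='A \textit{first} and \textit{second} phrase.'

# Greedy .* runs to the LAST "}", so only the outermost pair is stripped.
printf '%s\n' "$two" | sed 's/\\textit{\(.*\)}/\1/g'
# prints: A first} and \textit{second phrase.

# [^}]* stops at the first "}", so each group is handled on its own.
printf '%s\n' "$two" | sed 's/\\textit{\([^}]*\)}/\1/g'
# prints: A first and second phrase.
```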

Last edited by jschiwal; 07-10-2006 at 02:48 PM.
 
Old 07-10-2006, 09:22 PM   #8
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 9,117
Blog Entries: 4

Rep: Reputation: 3211
Also check out 'awk'.
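
For comparison, a rough awk sketch of the same single-line replacement. The function name strip_textit_awk is hypothetical, and this is just one possible approach: POSIX awk has no \1 in gsub, so the sketch uses match() with RSTART and RLENGTH to cut out the 8-character "\textit{" prefix and the closing "}" by hand:

```shell
# Hypothetical helper: strip \textit{...} wrappers that sit on a single line.
strip_textit_awk() {
  awk '{
    out = ""
    # Repeatedly find the next \textit{...} and keep only its contents.
    while (match($0, /\\textit\{[^}]*\}/)) {
      out = out substr($0, 1, RSTART - 1) substr($0, RSTART + 8, RLENGTH - 9)
      $0 = substr($0, RSTART + RLENGTH)
    }
    print out $0
  }' "$@"
}

printf '%s\n' 'A \textit{first} and \textit{second} phrase.' | strip_textit_awk
# prints: A first and second phrase.
```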
 
  

