LinuxQuestions.org
Old 08-30-2007, 10:20 AM   #1
jkh107
LQ Newbie
 
Registered: Jul 2006
Posts: 18

Rep: Reputation: 0
Batch search-and-replace?


I have several text documents that have citations inserted using the Windows program Reference Manager. I would like to replace these citations with BibTeX citations so that the documents can be processed in LaTeX. Because there are around a hundred different citations, I am looking for an easy way to find all the citations in one format and replace them with citations in the other format. It would be quite easy to prepare a list of equivalent citations in a spreadsheet. Is there a good way of doing this? I'm not quite sure what program I am looking for.

Thanks
 
Old 08-30-2007, 10:22 AM   #2
nan0meter
Member
 
Registered: Aug 2007
Location: The Netherlands
Distribution: Fedora 7 x86_64
Posts: 119

Rep: Reputation: 15
Have you tried an editor such as gedit, Geany, SciTE, Scribes, or Kate with search and replace yet? If that doesn't work, you can try regexxer, which does search and replace with regular expressions.
 
Old 08-30-2007, 11:38 AM   #3
jkh107
LQ Newbie
 
Registered: Jul 2006
Posts: 18

Original Poster
Rep: Reputation: 0
Thanks for the reply.

Essentially, what I have at the moment is two lists: a list of Reference Manager citations and a list of corresponding BibTeX citations (or one list made up of pairs of citations). I could type each individual pair into any search-and-replace facility, but it would take hours because there are around a hundred pairs, and it would only replace them in one document. I'm looking for a way of replacing the citations that does not involve typing them all into search-and-replace boxes and then repeating the same laborious process for every single document. Do any of the programs you mentioned allow you to supply a long list of terms along with the terms you want to replace them with?

Thanks again.
 
Old 08-30-2007, 01:55 PM   #4
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,797

Rep: Reputation: 724
Suppose you want to replace all instances of "dog" with "horse". Put all the files in one directory, and do this in that directory:
Code:
for i in *; do sed -i 's/dog/horse/g' "$i"; done
This will process every file in the directory.
To be more selective, you could, for example, process only the .doc files:
Code:
for i in *.doc; do sed -i 's/dog/horse/g' "$i"; done
To put the modified data in new files:
Code:
for i in *; do sed 's/dog/horse/g' "$i" > "new$i"; done
(This takes data from "name", modifies it, and writes to "newname".)
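A slightly more cautious variant of the in-place loop keeps a backup of each file. This is only a sketch: the function name replace_with_backup and the assumption that the documents are plain .txt files in one directory are made up for illustration. The -i.bak form (suffix attached to -i) is accepted by GNU sed.

```shell
replace_with_backup() {
    # $1 = directory holding the text files (hypothetical layout)
    for f in "$1"/*.txt; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        sed -i.bak 's/dog/horse/g' "$f"  # the original is kept as "$f.bak"
    done
}
```

Usage might look like: replace_with_backup ~/papers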

Last edited by pixellany; 08-30-2007 at 01:59 PM.
 
Old 08-31-2007, 07:21 AM   #5
jkh107
LQ Newbie
 
Registered: Jul 2006
Posts: 18

Original Poster
Rep: Reputation: 0
I would still have to do that about a hundred times:
Code:
for i in *; do sed -i 's/dog/horse/g' $i; done
for i in *; do sed -i 's/cat/gerbil/g' $i; done
for i in *; do sed -i 's/mouse/iguana/g' $i; done
...
and so on to change a hundred animals into a hundred different ones.
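The three loops above can be collapsed into one sed call, since sed accepts several -e expressions and applies them in order to each line. A minimal sketch (the function name swap_animals is made up for illustration):

```shell
swap_animals() {
    # Reads text on stdin and writes the substituted text to stdout.
    # Expressions run in the order given, so a later one also sees
    # the output of an earlier one.
    sed -e 's/dog/horse/g' \
        -e 's/cat/gerbil/g' \
        -e 's/mouse/iguana/g'
}
```

This still means writing out a hundred expressions, but each file is only read once.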
 
Old 08-31-2007, 07:42 AM   #6
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 37
Please post the format of both citations with examples. If it is indeed _one format_ it should be possible with sed regexp to change the one format into another.
 
Old 08-31-2007, 11:12 AM   #7
jkh107
LQ Newbie
 
Registered: Jul 2006
Posts: 18

Original Poster
Rep: Reputation: 0
The first format looks like
Code:
Smith, 2007 1 /id
The second format looks like
Code:
Smith2007
Although they are not always such good matches, such as:
Code:
Högberg, 2001 27 /id
Jones, 1997 12 /id
van der Heijden, 1998 32 /id
are
Code:
Hogberg2001
Jones1997b
Heijden1998
which is why I am not sure sed is quite the right tool.
 
Old 08-31-2007, 11:25 AM   #8
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 37
Hmm, should not be so hard. The only thing I wonder is where the b comes from in: Jones1997b

I think the sed regexp might look like:
Code:
sed 's|\([^ ,]*\), \([0-9]\{4\}\) [0-9][0-9]* /id|\1\2|g'
I can't test it at the moment ...

Although if you want to run it on the whole text it would need to only match citations.

Last edited by muha; 08-31-2007 at 11:32 AM.
 
Old 08-31-2007, 12:08 PM   #9
jkh107
LQ Newbie
 
Registered: Jul 2006
Posts: 18

Original Poster
Rep: Reputation: 0
The 'b' is because there are already another two Jones references from 1997 (Jones1997 and Jones1997a) in that database. There is no real way of matching them up to the ones in the other database using regular expressions, because there is no real connection between them. I can export a complete list of citations from each database and pair them up in a spreadsheet (CSV file, whatever) quite easily, so if I could just feed that data into a program that would replace all the text in the first column with the corresponding text in the second column (without any use of regular expressions), it would give exactly the right result.
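A literal, regex-free replacement driven by a list of pairs could be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe for Reference Manager exports: the pairs are assumed to be saved as a tab-separated file (tabs rather than commas, because the citations themselves contain commas), and the names replace_literal and pairs.tsv are made up. awk's index() does a plain substring search, so nothing in the citations needs escaping.

```shell
replace_literal() {
    # $1 = tab-separated pairs file: old<TAB>new on each line
    # The text to convert is read from stdin; the result goes to stdout.
    awk -F '\t' '
        # First input (the pairs file): remember each pair in order
        NR == FNR { if ($1 != "") { from[++n] = $1; to[n] = $2 }; next }
        # Second input (stdin): apply every pair to the current line
        {
            line = $0
            for (i = 1; i <= n; i++) {
                out = ""
                # index() is a plain substring search, not a regex match
                while ((p = index(line, from[i])) > 0) {
                    out = out substr(line, 1, p - 1) to[i]
                    line = substr(line, p + length(from[i]))
                }
                line = out line
            }
            print line
        }
    ' "$1" -
}
```

Usage might look like: replace_literal pairs.tsv < chapter1.txt > chapter1.new.txt
Pairs are applied in file order, so text produced by one pair can be picked up by a later one.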
 
Old 08-31-2007, 12:22 PM   #10
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 37
Quote:
Originally Posted by jkh107 View Post
so if I could just feed that data into a program that would replace all the text in the first column with the corresponding text in the second column
You can do that in your spreadsheet program too.
I don't know if you're interested, but sed can count as well. It's a little bit more advanced, but you can do this sort of stuff with sed. For people interested in more: http://www.linuxquestions.org/bookmarks/tags/sed
 
Old 09-02-2007, 04:11 AM   #11
jkh107
LQ Newbie
 
Registered: Jul 2006
Posts: 18

Original Poster
Rep: Reputation: 0
How would I use a spreadsheet program for this?
 
Old 09-02-2007, 06:18 AM   #12
AwesomeMachine
Senior Member
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian Testing; OpenSuSE 12.1; Fedora 17
Posts: 1,541

Rep: Reputation: 148
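I think 'tr' (see man tr) might help with part of this. Note that it translates or deletes individual characters from one set to another; it does not replace whole strings or match regular expressions, so it would only suit character-level cleanup.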
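For reference, a small sketch of what tr does: it maps characters one for one and has no notion of words. The function name swap_chars is made up for illustration.

```shell
swap_chars() {
    # Every d becomes h and every g becomes r, one character at a time.
    # "dog" therefore turns into "hor", not into another word.
    tr 'dg' 'hr'
}
```

So tr cannot turn a whole citation such as "Smith, 2007 1 /id" into "Smith2007".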
 
Old 09-02-2007, 07:26 AM   #13
johnhamiltion
Member
 
Registered: Aug 2007
Posts: 92

Rep: Reputation: 15
awk (or gawk) is by far the best stream editor.

It is available for both UNIX and windows.

Takes a bit of learning though.
 
Old 09-02-2007, 10:03 AM   #14
jkh107
LQ Newbie
 
Registered: Jul 2006
Posts: 18

Original Poster
Rep: Reputation: 0
Looks like sed might do the job if I use a text file containing a list of all the substitutions. Thanks for all the suggestions.
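One way the substitution file could be generated is sketched below. The assumptions are loud ones: the pairs are saved as a tab-separated file (old citation, tab, new citation), and the names make_sed_script, pairs.tsv and subs.sed are made up for the example. Special characters are escaped so that sed treats each citation as literal text.

```shell
make_sed_script() {
    # $1 = tab-separated pairs file: old<TAB>new on each line
    # Prints one "s/old/new/g" command per pair on stdout.
    while IFS="$(printf '\t')" read -r old new || [ -n "$old" ]; do
        [ -n "$old" ] || continue
        # Escape regex metacharacters and the / delimiter in the search text
        old_esc=$(printf '%s\n' "$old" | sed 's|[][\\/.*^$]|\\&|g')
        # Escape backslash, / and & in the replacement text
        new_esc=$(printf '%s\n' "$new" | sed 's|[\\/&]|\\&|g')
        printf 's/%s/%s/g\n' "$old_esc" "$new_esc"
    done < "$1"
}
```

Usage might look like:
make_sed_script pairs.tsv > subs.sed
sed -f subs.sed chapter1.txt > chapter1.new.txt
The same subs.sed can then be reused for every document.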
 
  




All times are GMT -5. The time now is 10:52 PM.
