LinuxQuestions.org
Old 01-03-2007, 09:15 AM   #1
trashbird1240
Member
 
Registered: Sep 2006
Location: Durham, NC
Distribution: Slackware, Ubuntu (yes, both)
Posts: 463

Rep: Reputation: 31
Recursive Text Editing: Where to begin?


Howdy forum,

I have a series of Stata scripts (.do-files) that were saved in Windows and I just converted to Linux; Stata interprets the scripts just fine, however when I look at them in a text editor, I get "^M" at the end of every line. This hampers the use of ESS' syntax highlighting. So, I want to accomplish a couple of things, mainly get these files into readable format.

The files are arranged in subdirectories of two separate directories, themselves subdirectories of /home/joel.

1. Go through each file and change "c:/" to "~" and in general change Windows pathnames to UNIX pathnames.
2. fromdos <old-do-file> new-do-file
3. pack up the old-do-files into a tree structure just like the current one and tar-then-bzip2 it so that I only have the new ones in my directories.

The last step seems like the easiest part. I thought of using sed to accomplish goal #1, and fromdos (slackware) is pretty easy to use and always does the job correctly.

HOWEVER, because of the directory structure, I need to do this recursively. Where do I start writing a script that will recursively search for the files I need to edit and change, and archive?

Also, I've been learning and using bash quite a lot -- would it be better to use some other shell (e.g., zsh) to interpret this script?

Thanks,
Joel
 
Old 01-03-2007, 09:49 AM   #2
Nick_Battle
Member
 
Registered: Dec 2006
Location: Bracknell, UK
Distribution: SUSE 13.1
Posts: 159

Rep: Reputation: 32
> Where do I start writing a script that will recursively search
> for the files I need to edit and change, and archive?

Does find(1) not give you what you want?
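For the layout described above, a minimal sketch of the search could look like the following. The directory and file names are made up for illustration, and everything runs in a throwaway directory, so it should be safe to try as-is.

```shell
#!/bin/sh
# Build a small throwaway tree that mimics the layout (all names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/data/projA/section1" "$demo/ado/personal"
touch "$demo/data/projA/clean.do" \
      "$demo/data/projA/section1/model.do" \
      "$demo/ado/personal/helper.ado" \
      "$demo/data/projA/notes.txt"

# Recursively list regular files ending in .do or .ado under both trees.
cd "$demo" || exit 1
found=$(find ./data ./ado/personal -type f \( -name '*.do' -o -name '*.ado' \) | sort)
printf '%s\n' "$found"
```

The -name tests are a simpler stand-in for a regex; notes.txt is skipped because it matches neither pattern.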

Cheers,
-nick
 
Old 01-03-2007, 10:00 AM   #3
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,458

Rep: Reputation: 1941
Quote:
Originally Posted by trashbird1240
Howdy forum,

Stata interprets the scripts just fine, however when I look at them in a text editor, I get "^M" at the end of every line. This hampers the use of ESS' syntax highlighting.
One way to remove the ^M and other control characters is the col command. Note that col may also convert runs of spaces to tabs unless you pass -x, e.g.

Code:
col -bx < input_file > output_file
 
Old 01-03-2007, 10:30 AM   #4
billymayday
Guru
 
Registered: Mar 2006
Location: Sydney, Australia
Distribution: Fedora, CentOS, OpenSuse, Slack, Gentoo, Debian, Arch, PCBSD
Posts: 6,678

Rep: Reputation: 122
Sounds to me like the files are simply in DOS text format (ie with CR/LF at the end of each line). You say you converted them to Linux, but did you run dos2unix on each file? This will remove the extra control characters automatically.
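To check whether a file really has CR/LF endings and see what stripping them does, something like this sketch may help. The file name and contents are invented, and tr is used here only as a generic stand-in for dos2unix/fromdos (it deletes every carriage return, not just those at line ends).

```shell
#!/bin/sh
# Work in a throwaway directory; the file names are hypothetical.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Create a two-line file with DOS (CR/LF) line endings.
printf 'use "c:/data/x.dta"\r\nsummarize\r\n' > old.do

# Count the lines that contain a carriage return.
cr=$(printf '\r')
before=$(grep -c "$cr" old.do || true)

# Delete all carriage returns and write the result to a new file.
tr -d '\r' < old.do > new.do
after=$(grep -c "$cr" new.do || true)

echo "lines with CR before: $before, after: $after"
```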

How many subdirectories are you talking about?
 
Old 01-03-2007, 11:50 AM   #5
trashbird1240
Member
 
Registered: Sep 2006
Location: Durham, NC
Distribution: Slackware, Ubuntu (yes, both)
Posts: 463

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by Nick_Battle
> Where do I start writing a script that will recursively search
> for the files I need to edit and change, and archive?

Does find(1) not give you what you want?

I suppose it does: I just tried it and it prints full pathnames on STDOUT. So, I'll just pipe them from find into whatever command I use to do the editing.
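A rough sketch of that kind of pipe, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The sample tree is invented; grep -l just stands in for "whatever command does the editing" and lists the files that still contain a Windows path.

```shell
#!/bin/sh
# Throwaway tree with one path that contains a space (names are hypothetical).
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p "data/my project"
printf 'use "c:/data/x.dta"\n' > "data/my project/a.do"
printf 'summarize\n' > "data/b.do"

# -print0 / -0 separate names with NUL bytes, so a path containing spaces
# reaches grep as a single argument. grep -l prints only matching file names.
hits=$(find ./data -type f -name '*.do' -print0 | xargs -0 grep -l 'c:/')
printf '%s\n' "$hits"
```

Piping plain newline-separated names works too, but breaks as soon as a directory name contains whitespace.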

Joel
 
Old 01-03-2007, 11:52 AM   #6
trashbird1240
Member
 
Registered: Sep 2006
Location: Durham, NC
Distribution: Slackware, Ubuntu (yes, both)
Posts: 463

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by billymayday
Sounds to me like the files are simply in DOS text format (ie with CR/LF at the end of each line). You say you converted them to Linux, but did you run dos2unix on each file? This will remove the extra control characters automatically.
I said I will convert them using fromdos. fromdos is the Slackware version of dos2unix.

Quote:
Originally Posted by billymayday
How many subdirectories are you talking about?
I'm talking about subdirectories of subdirectories, dude. Recursion to the max.

Seriously, I organize my projects by researchers I work for, then the projects that they are doing, then sometimes by section of the analysis I'm doing.

Thanks for the help -- I think I'm getting a good place to start.

Joel
 
Old 01-03-2007, 01:55 PM   #7
billymayday
Guru
 
Registered: Mar 2006
Location: Sydney, Australia
Distribution: Fedora, CentOS, OpenSuse, Slack, Gentoo, Debian, Arch, PCBSD
Posts: 6,678

Rep: Reputation: 122
Joel, I'm with Nick - look at find pretty carefully - especially the exec options
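As a small illustration of the two -exec forms, here is a sketch on an invented tree; echo stands in for the real editing command so nothing is modified.

```shell
#!/bin/sh
# Throwaway tree with two do-files (names are hypothetical).
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p data/sub
printf 'a\r\n' > data/one.do
printf 'b\r\n' > data/sub/two.do

# Form 1: "{} \;" runs the command once for every file found.
per_file=$(find ./data -type f -name '*.do' -exec echo run: {} \; | wc -l | tr -d ' ')

# Form 2: "{} +" appends as many file names as fit onto one command line.
batched=$(find ./data -type f -name '*.do' -exec echo run: {} + | wc -l | tr -d ' ')

echo "per-file runs: $per_file, batched runs: $batched"
```

The + form is usually faster for commands that accept several file arguments; the \; form is the one to use when the command handles only one file at a time.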
 
Old 01-19-2007, 03:41 PM   #8
trashbird1240
Member
 
Registered: Sep 2006
Location: Durham, NC
Distribution: Slackware, Ubuntu (yes, both)
Posts: 463

Original Poster
Rep: Reputation: 31
Okay, now that I've been working on this for a while, it's time to update you:

Here's what I've tried:
Code:
find ./data/ ./ado/personal/ -depth -iregex ^.+\\.[a]*do$ -exec sed '/c:/s/\\/\//g;/c:/s/c:/\~/g;s/^.*^M$//' {} \;
The sed command does exactly what it should, and the find command finds the files just perfectly. My question is how do I redirect the output so that it actually edits the files (saves the STDOUT to the file that I am editing). Right now it will print out the modified output, and testing the commands individually shows that they are doing what I want, but the actual files are untouched.

For example, if I enter

Code:
find ./data/ ./ado/personal/ -depth -iregex ^.+\\.[a]*do$ -exec sed '/c:/s/\\/\//g;/c:/s/c:/\~/g;s/^.*^M$//' {} > {} \;
The command executes and I get no errors, but the files remain the same, even though I thought I could direct the output back to the file with {} > {}. If I do the above with grep

Code:
find ./data/ ./ado/personal/ -depth -iregex ^.+\\.[a]*do$ -exec sed '/c:/s/\\/\//g;/c:/s/c:/\~/g;s/^.*^M$//' {} \;|grep "~"
Then I see that my edits are happening. However, the files are still the same.

How do I actually edit the files? (I've tried /w in sed but that does something different from what I want)

Thanks,
Joel
 
Old 01-19-2007, 10:20 PM   #9
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,397

Rep: Reputation: 814
Quote:
Originally Posted by man sed
-i[SUFFIX], --in-place[=SUFFIX]

edit files in place (makes backup if extension supplied)
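Or output to a different file. A bare {} > {}.new won't do it, though: the shell performs the redirection before find even starts, so everything lands in a single file literally named {}.new (which is also why {} > {} left your files untouched). Wrap the command in a shell so the redirection happens once per file, e.g. -exec sh -c 'sed SCRIPT "$1" > "$1.new"' sh {} \; with SCRIPT standing for your sed expressions.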
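Putting the -i option together with a find command like the one earlier in the thread, a sketch might look like this. It assumes a sed that accepts -i with an attached suffix (GNU and BSD sed both do). The sample tree and file contents are invented, and the carriage return is stripped with a plain "CR at end of line" substitution, which is a replacement chosen here for illustration rather than the original third expression.

```shell
#!/bin/sh
# Throwaway tree with one sample do-file: Windows path, CR/LF endings (made up).
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p data/sub
printf 'use "c:\\data\\x.dta"\r\nsummarize\r\n' > data/sub/old.do

cr=$(printf '\r')   # a literal carriage return, so no sed-specific escape is needed

# -i.bak edits each file in place and keeps the original as FILE.bak.
#   1st expression: on lines containing c:, turn backslashes into slashes
#   2nd expression: replace c: with ~
#   3rd expression: drop a carriage return at the end of the line
find ./data -type f \( -name '*.do' -o -name '*.ado' \) \
  -exec sed -i.bak \
    -e '/c:/s|\\|/|g' \
    -e 's|c:|~|g' \
    -e "s/${cr}\$//" {} +

cat data/sub/old.do
```

The .bak copies also give a set of untouched originals that can be archived afterwards and then deleted.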
 
Old 01-22-2007, 10:00 AM   #10
trashbird1240
Member
 
Registered: Sep 2006
Location: Durham, NC
Distribution: Slackware, Ubuntu (yes, both)
Posts: 463

Original Poster
Rep: Reputation: 31
thanks -- I will try it today.
Joel
 
Old 01-22-2007, 11:51 AM   #11
trashbird1240
Member
 
Registered: Sep 2006
Location: Durham, NC
Distribution: Slackware, Ubuntu (yes, both)
Posts: 463

Original Poster
Rep: Reputation: 31
it worked

Joel
 
Old 01-22-2007, 12:17 PM   #12
ygloo
Member
 
Registered: Aug 2006
Distribution: slack
Posts: 323

Rep: Reputation: 30
...................

Last edited by ygloo; 01-22-2007 at 12:19 PM.
 
  


Tags
bash, find, grep, regex, regexp, sed, shell script, zsh


All times are GMT -5.