LinuxQuestions.org
Old 11-13-2011, 01:32 AM   #1
geodave0110
LQ Newbie
 
Registered: Dec 2010
Distribution: Ubuntu
Posts: 6

Rep: Reputation: 0
Recursively naming files in a directory after processing them in a "for" loop


Hello All,

I ran across a sed command that removes line breaks in a text file, and I'd like to recursively run this command for a set of files in a directory. But I'd like to retain the name of each text file and add an incrementing number at its end.

Here's what I've done so far:

Code:
#!/bin/bash
for file in *.txt
do
sed -n -e ":a" -e "$ s/\n//gp;N;b a" "$file"
done
When I run this script I basically get each file's content displayed in my terminal the way I want it. But I'm having a hard time figuring out a way to recursively write the new contents to separate files named incrementally.

I'd be happy to provide clearer information if the above doesn't make sense. Any help or suggestions are much appreciated!

PS. Can someone please explain to me what the sed command's steps are doing? I'm new to programming in Linux and would like to learn beyond merely plugging and chugging lines without really knowing their details.
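On the PS: the sed script can be read one command at a time. The sketch below is an annotated version of the same invocation, assuming GNU sed; the temporary directory and `sample.txt` are made up for the demonstration, and single quotes are used here only so the shell leaves `$` and `\n` alone.

```shell
#!/bin/bash
# Hypothetical sample file, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/sample.txt"

# -n          do not print the pattern space automatically
# :a          define a label named "a"
# $ s/\n//gp  on the last input line only: delete every embedded newline,
#             then print the pattern space if a substitution was made
# N           append the next input line to the pattern space (joined by \n)
# b a         branch back to label "a", so lines keep accumulating
joined=$(sed -n -e ':a' -e '$ s/\n//gp;N;b a' "$tmpdir/sample.txt")

echo "$joined"
rm -r "$tmpdir"
```

In other words, the loop keeps gluing lines together with N until the last line has been read, and only then strips the newlines and prints the result once.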
 
Old 11-13-2011, 02:06 AM   #2
frieza
Senior Member
 
Registered: Feb 2002
Location: harvard, il
Distribution: Ubuntu 11.4,DD-WRT micro plus ssh,lfs-6.6,Fedora 15,Fedora 16
Posts: 3,233

Rep: Reputation: 406
try
Code:
let count=0 #set an increment variable
for file in `ls *.txt` #act on output of 'ls *.txt*'
do
filename=`echo $file | cut -d. -f1` #cut off the .txt extension
newfile=$filename$x.txt #add increment variable to new file name
cp -v $file $newfile #copy file to new file name
sed -i -n -e ":a" -e "$ s/\n//gp;N;b a" "$newfile" #perform sed on new file
let count=count+1 #increment increment variable
done
hope this helps
 
1 members found this post helpful.
Old 11-13-2011, 04:00 AM   #3
geodave0110
LQ Newbie
 
Registered: Dec 2010
Distribution: Ubuntu
Posts: 6

Original Poster
Rep: Reputation: 0
Works like a charm

Thank you frieza! Your additions worked perfectly. Thanks especially for commenting your script so I know what's happening at each step!
 
Old 11-13-2011, 05:11 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Well, I don't normally like to pick on someone's code, but when you're already using a superior method it seems necessary:
Code:
for file in *.txt # No word splitting done

for file in `ls *.txt` # Suffers from word splitting and parsing of ls
For more information on both issues listed see Ls Parsing and Word Splitting
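A small sketch of the difference, assuming default IFS and a scratch directory created with mktemp; the two file names are hypothetical and chosen so that one contains a space:

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch "my notes.txt" "plain.txt"

# The glob hands each file name to the loop as one word.
glob_count=0
for file in *.txt; do
    glob_count=$((glob_count + 1))
done

# The output of ls is split on whitespace, so "my notes.txt" becomes two words.
ls_count=0
for file in `ls *.txt`; do
    ls_count=$((ls_count + 1))
done

echo "glob: $glob_count, ls: $ls_count"

cd - >/dev/null
rm -r "$tmpdir"
```

With two files on disk the glob loop runs twice, while the ls loop runs three times and never sees a name that actually exists for the file with the space.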

I am also curious how you claim:
Quote:
Your additions worked perfectly.
When the following line is wrong:
Code:
newfile=$filename$x.txt #add increment variable to new file name
The increment is not added as 'x' is not the counter but rather the variable 'count'

Also, no need to go through so many steps or outside commands to perform the following:
Code:
filename=`echo $file | cut -d. -f1` #cut off the .txt extension
newfile=$filename$x.txt #add increment variable to new file name

# simply use
newfile=${file%.*}$((count++)).txt
Lastly, if any of your file names have whitespace or special characters, you will need to quote the variables when they are used in the cp command.
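Putting these points together, the loop might look something like the sketch below. It assumes GNU sed (for -i) and runs in a scratch directory with two hypothetical files, one of which has a space in its name:

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'a\nb\n' > "first file.txt"
printf 'c\nd\n' > "second.txt"

count=0
for file in *.txt; do
    # Strip the extension, append the counter, then add the extension back.
    newfile="${file%.*}$((count++)).txt"
    # Quoting keeps names with whitespace in one piece.
    cp -- "$file" "$newfile"
    sed -i -n -e ':a' -e '$ s/\n//gp;N;b a' "$newfile"
done

first_out=$(cat "first file0.txt")
second_out=$(cat "second1.txt")
echo "$first_out $second_out"

cd - >/dev/null
rm -r "$tmpdir"
```

The glob is expanded once before the loop starts, so the newly created copies are not picked up and processed again during the same run.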
 
Old 11-13-2011, 07:42 AM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Recent versions of bash (4.0+) have a new feature allowing globbing through subdirectories.
You can also use a subshell to avoid resetting everything each time.

Code:
#!/bin/bash

# enable ** globbing, and also nullglob, so that it doesn't error on empty directories.
shopt -s globstar nullglob

# set the starting number to count from
c=1

# Now loop through the list of subdirectories produced by **/.  Include the topdir too.
for dir in . **/ ; do

	# run each loop in a subshell ( everything inside (...) )
	# this avoids having to reset everything for each directory
	# when it exits, you're back at the starting directory with all variables at their initial values

	(
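		cd "${dir}" || exit
		echo "Now processing [$dir]"

		#loop through each regular file in the directory
		for file in * ; do
			[[ -f $file ]] || continue
			# write the joined text to a new numbered file; sed -i would treat both names as input files
			sed -n -e ":a" -e "$ s/\n//gp;N;b a" "$file" > "${file%.*}-$(( c++ )).txt"
		done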
		done
	)

	echo
done

exit 0
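One consequence of the subshell is worth spelling out: a counter incremented inside ( ... ) is also back at its initial value afterwards, so numbering restarts for every directory. A minimal sketch of that behaviour, using the placeholder labels "one" and "two" instead of real directories:

```shell
#!/bin/bash
c=1
for dir in one two; do
    (
        # This change is made in a child process and is lost when it exits.
        echo "in $dir: c starts at $c"
        c=$((c + 1))
    )
done
echo "after the loop: c=$c"
```

If one running count across all directories is wanted, the increment has to happen in the parent shell, for example by building the full path there and skipping the cd.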
 
Old 11-13-2011, 03:04 PM   #6
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
I would personally use find; the Bash globbing rules (especially wrt. files that start with a dot) may not do exactly what one might expect. Find is more clear-cut.
Code:
#!/bin/bash

# In a UTF-8 locale, invalid byte sequences may halt execution.
# Avoid that by explicitly using a C/POSIX locale.
export LC_ALL=C LANG=C

# First command line parameter must exist.
if [ $# -lt 1 ] || [ ! -e "$1" ]; then
    echo "Usage: $0 directories-or-files..." >&2
    exit 1
fi

find "$@" -type f -print0 | while IFS= read -rd "" FILE ; do
    # Verify file still exists.
    [ -f "$FILE" ] || continue

    # Check if the file name ends with a version number. Skip if so.
    INDEX="${FILE##*.}"
    [ -n "$INDEX" -a "$FILE" != "$INDEX" -a -z "${INDEX//[0-9]/}" ] && continue

    # Find the first unused index, starting with 1.
    INDEX=1
    NEWFILE="$FILE.$INDEX"
    while [ -e "$NEWFILE" ]; do
        INDEX=$((INDEX+1))
        NEWFILE="$FILE.$INDEX"
    done

    # Copy $FILE to $NEWFILE, removing all newlines.
    tr -d '\r\n' <"$FILE" >"$NEWFILE" || exit $?

    # If possible, retain mode and owner.
    chown --reference="$FILE" "$NEWFILE" &>/dev/null
    chmod --reference="$FILE" "$NEWFILE" &>/dev/null

    # Let the user know.
    echo "$FILE: ${NEWFILE##*/}"
done
This one will skip files that have a numeric suffix.
For other files, it will find the first numeric suffix (starting at .1) that does not exist yet.
It will then use tr to strip newlines from the original file, saving the result as the new file.
It will attempt to retain the owner and mode, but quietly.
Finally, it will output the path to the old file, and the new file name, if successful.

If you run it without command-line parameters, or the first parameter is not an existing file or directory, it will output simple help and abort.

For testing, comment out the tr, chown, and chmod lines by adding a # at the beginning of those three lines. That way the script will just say what file names it would use, but not actually create any new files (nor modify existing ones).
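The two core steps, picking the first unused numeric suffix and stripping newlines with tr, can be tried in isolation. This is only a sketch: `first_unused_name` is a hypothetical helper that does not appear in the script above, and the files live in a scratch directory.

```shell
#!/bin/bash
# Hypothetical helper: print "$1.N" for the smallest N >= 1 that does not exist yet.
first_unused_name() {
    local file=$1 index=1
    while [ -e "$file.$index" ]; do
        index=$((index + 1))
    done
    printf '%s\n' "$file.$index"
}

tmpdir=$(mktemp -d)
printf 'line one\r\nline two\n' > "$tmpdir/notes.txt"
touch "$tmpdir/notes.txt.1"    # pretend suffix .1 is already taken

newfile=$(first_unused_name "$tmpdir/notes.txt")
# Remove carriage returns and newlines while copying to the new name.
tr -d '\r\n' < "$tmpdir/notes.txt" > "$newfile"
content=$(cat "$newfile")

echo "${newfile##*/}: $content"
rm -r "$tmpdir"
```

Because .1 already exists, the copy lands in notes.txt.2, and the original file is left untouched.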
 
Old 11-14-2011, 12:06 PM   #7
frieza
Senior Member
 
Registered: Feb 2002
Location: harvard, il
Distribution: Ubuntu 11.4,DD-WRT micro plus ssh,lfs-6.6,Fedora 15,Fedora 16
Posts: 3,233

Rep: Reputation: 406
Quote:
Originally Posted by grail

I am also curious how you claim:

When the following line is wrong:
Code:
newfile=$filename$x.txt #add increment variable to new file name
The increment is not added as 'x' is not the counter but rather the variable 'count'

Also, no need to go through so many steps or outside commands to perform the following:
D'oh, my bad, you're absolutely right on that count. As for the rest, well, that's the best I could pull off with my knowledge of bash scripting. I'll be the first to admit it was a crude solution, minus the typo.
 
  


All times are GMT -5. The time now is 01:04 PM.
