03-09-2005, 09:27 AM
|
#1
|
Member
Registered: Jan 2004
Location: VT, USA
Distribution: Gentoo, Ubuntu - t3h 1337 & the easy, respectively
Posts: 125
Rep:
|
bash and filenames with special characters
I am trying to write a script that organizes and sorts photos from my digital camera into folders by date - totally destroying the current random folder structure.
The problem is that for any folder that has a space or special character in the name, it chokes up and more or less skips over the files.
I think this is related to the way 'for' will use a space as a separator. I've tried doing some funky stuff with 'while', but it still screws up.
Can someone tell me how to change this so that each line from the `find` command is interpreted as one whole string and not broken down further by subsequent commands? Perhaps there is a way to escape the strings (I tried sed 's/\ /\\ /g' to no avail)? I've done my homework on this, but I'm not finding a solution.
Help, Please, and Thank you!
Also: Am I reinventing the wheel to rename my pics? Or if not then is bash a rational choice for this task? Or would I be better suited to take some quick lessons on a more advanced language?
Code:
#!/bin/bash
# Copyright 2004 Alvin A ONeal Jr
# GPL'ed
# USAGE: picdir.sh /path/to/pics/ /new/dir/
# This script will move all jpgs from one folder
# to another whilst sorting and renaming them by timestamp
# checksums determine whether dups are actually dups
# *** dups will be overwritten ***
# needs BASH FIND GREP CUT FILE JHEAD
# FLAWS: Doesn't like relative paths
# Need to escape SPACES and other SPECIAL CHARACTERS!
COUNT=0
# need to translate relative paths to absolutes before this will work.
# also consider quoted path values... messy messy messy
PATH_OLD=${1}
# PATH_OLD="$HOME"
echo "line 21: Pictures will be gathered from '$PATH_OLD'"
#if [ -e "${2}" ]; then
# PATH_NEW="${2}"
#else
PATH_NOW="$HOME/Pictures/Life/"
#fi
echo "line 28: Pictures will be placed in '${PATH_NOW}'" ###
# http://www.issociate.de/board/post/1...t_in_find.html
# find $PATH_OLD -type f -print | while read -r PICTURE
for PICTURE in `find ${PATH_OLD} -type f` # MEMORY HOG!!
do
PATH_NEW=${PATH_NOW}
HAS_EXIF=`file ${PICTURE} | grep JPEG | grep EXIF`
if [ -n "${HAS_EXIF}" ]; then
echo "line 38: '${PICTURE}' has EXIF data" ###
TIMESTAMP=`jhead $PICTURE 2>&1 | grep 'Date/Time' | cut -d':' -f2-6`
# need something to skip a file if it causes jhead error...
# Deciding path
DATE=`echo ${TIMESTAMP} | cut -d' ' -f1`
YEAR=`echo ${DATE} | cut -d':' -f1`
MONTH=`echo ${DATE} | cut -d':' -f2`
DAY=`echo ${DATE} | cut -d':' -f3`
PATH_NEW="${PATH_NEW}${YEAR}/${MONTH}/${DAY}/"
mkdir -p ${PATH_NEW}
# Deciding filename
TIME=`echo ${TIMESTAMP} | cut -d' ' -f2`
HOUR=`echo ${TIME} | cut -d':' -f1`
MINUTE=`echo ${TIME} | cut -d':' -f2`
SECOND=`echo ${TIME} | cut -d':' -f3`
PATH_NEW="${PATH_NEW}${HOUR}${MINUTE}${SECOND}.jpg"
# Complete path
echo "line 56: ${PATH_NEW}" ###
if [ -f "${PATH_NEW}" ]; then
echo "line 58: Name exists, checking... maybe dup?" ###
if [ ! "${PATH_NEW}" = "${PICTURE}" ]; then
# relative paths make this not work
echo "line 61: It isn't itself" ###
SUM_ORIG=`/usr/bin/md5sum ${PICTURE} | cut -d' ' -f1`
SUM_NEW=`/usr/bin/md5sum ${PATH_NEW} | cut -d' ' -f1`
if [ ! "${SUM_ORIG}" = "${SUM_NEW}" ]; then
# These pictures are not the same
i=0
PATH_NEWER="d${i}-${PATH_NEW}"
until [ ! -f ${PATH_NEWER} ]; do
(( i++ ))
PATH_NEWER="d${i}-${PATH_NEW}"
done
echo "${PATH_NEW} exists, appending 'd${i}-' to name."
PATH_NEW="${PATH_NEWER}"
echo "It's now ${PATH_NEW}?"
fi
# 3) dups deleted, non-dups renamed
fi
# 2) It wasn't the same file (might be duplicates)
# 1) A file of that name existed (might be itself).
# 0) All that settled, should be safe to move the file
fi
echo "Gonna move that pic..." ###
mv -i ${PICTURE} ${PATH_NEW}
fi
# doesn't have EXIF, not bothering...
(( COUNT++ ))
echo $COUNT
done
echo "Mucked around with ${COUNT} files successfully!"
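The symptom described above can be reproduced in a few lines. This is only a sketch: `count_args` is a hypothetical helper and the path is a made-up example, neither comes from the script. It shows that an unquoted expansion is split on spaces before the command ever sees it, while a quoted one arrives as a single argument.

```shell
#!/bin/bash
# Hypothetical helper: reports how many arguments it received.
count_args() {
    echo "$#"
}

PICTURE="/tmp/My Photos/img 001.jpg"   # example path, not from the thread

count_args ${PICTURE}      # unquoted: split on the two spaces, prints 3
count_args "${PICTURE}"    # quoted: passed as one argument, prints 1
```

The same splitting happens to the output of the backticked `find` in the `for` line, which is why quoting alone inside the loop body is not enough there.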
Last edited by CoolAJ86; 03-09-2005 at 09:36 AM.
|
|
|
03-09-2005, 10:25 AM
|
#2
|
Member
Registered: Sep 2002
Location: Tulsa, OK
Distribution: Slack, baby!
Posts: 349
Rep:
|
(=
I smile because I find this type of issue really annoying, and for a long time had no clue how to fix it, either. However, there is some really good news for you. (=
The secret is the bash variable IFS, or the Internal Field Separator. This variable determines where bash splits words. Its default value matches a space, a tab, or a newline.
You can easily change this variable before the for loop in question to only match tabs and newlines/carriage returns, and this will cause the filenames with spaces to remain intact.
One word of caution is this:
If you're doing other field splitting, make sure to revert this value to its original state, because you may encounter split problems due to the fact that it no longer contains a space.
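One way to sidestep the manual revert, sketched here as an assumption rather than as part of the original advice, is to change IFS inside a function with `local`, so the caller's value is never touched. The function name `list_files` is hypothetical. Note that unquoted command substitution still undergoes filename expansion, so names containing `*` or `?` can still misbehave.

```shell
#!/bin/bash
# list_files is a hypothetical helper name. "local IFS" confines the change
# to this function, so the caller's IFS stays as it was.
list_files() {
    local IFS=$'\t\n'      # split on tab and newline only, not on space
    local f
    for f in $(find "$1" -type f); do
        echo "[$f]"
    done
}
```

Calling `list_files /tmp/IFS/` would print each file name in brackets, and any loop that runs afterwards splits on spaces as before.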
Consider that the dir /tmp/IFS contains the following files:
Code:
-rw-r--r-- 1 root root 0 2005-03-09 10:30 has\ onespace.txt
-rw-r--r-- 1 root root 0 2005-03-09 10:30 nospaces.txt
-rw-r--r-- 1 root root 0 2005-03-09 10:30 other\ file.rgf
-rw-r--r-- 1 root root 0 2005-03-09 10:30 otherfile.rgf
The following code shows how to use 'find' to display them correctly in a for loop, and then what happens once IFS is reverted to its original state:
Code:
#!/bin/bash
# store original value, and set to catch tab(9), newline(A), and CR(D)
#
IFScopy=$IFS
IFS=$'\x09'$'\x0A'$'\x0D'
echo "IFS FIX"
for i in `find /tmp/IFS/ -type f`; do
    echo "$i"
done
# revert
#
IFS=$IFScopy
echo "IFS REVERT"
# now other for loops will work as before
#
for i in `find /tmp/IFS/ -type f`; do
    echo $i
done
And the output:
Code:
/usr/sbin> q.sh
IFS FIX
/tmp/IFS/other file.rgf
/tmp/IFS/nospaces.txt
/tmp/IFS/otherfile.rgf
/tmp/IFS/has onespace.txt
IFS REVERT
/tmp/IFS/other
file.rgf
/tmp/IFS/nospaces.txt
/tmp/IFS/otherfile.rgf
/tmp/IFS/has
onespace.txt
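An alternative that leaves the global IFS alone is to have `find` separate names with a NUL byte and read them one at a time. This is a minimal sketch, assuming a `find` that supports `-print0` (GNU and BSD find do) and bash's `read -d ''`; the function name `list_files_nul` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints every regular file under "$1" in brackets.
# find -print0 ends each name with a NUL byte, and read -d '' splits on
# that byte, so spaces, tabs, and even newlines in names survive intact.
list_files_nul() {
    local f
    # "IFS=" applies to this read only and keeps leading/trailing blanks.
    while IFS= read -r -d '' f; do
        printf '[%s]\n' "$f"
    done < <(find "$1" -type f -print0)
}
```

For the directory above, the call would be `list_files_nul /tmp/IFS/`. Because the loop reads from process substitution and not from a pipe, variables set inside it remain visible after the loop ends.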
Happy bashing!
|
|
|
03-09-2005, 02:50 PM
|
#3
|
Member
Registered: Jan 2004
Location: VT, USA
Distribution: Gentoo, Ubuntu - t3h 1337 & the easy, respectively
Posts: 125
Original Poster
Rep:
|
Thanks so much! This will most certainly be handy in the future!
However, while waiting for a reply, I realized that my approach completely disregarded find's built-in -exec option. So I rewrote the code to use that, did a little recursing, and tweaked a bit. Works flawlessly, AFAIK.
Code:
#!/bin/bash
# /usr/local/bin/picdir.sh
# Picdir v0.9 rc1
# Copyleft 2005 Alvin A ONeal Jr - This software is OpenSource
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
# Guaranteed to be my personal best!
# ...not guaranteed to work...
# What will this do?
# Search for all JPEG type files within a directory containing EXIF data
# Use the timestamp in the EXIF data to relocate and rename the file
# Using checksums, determine and remove duplicate files
# Uniquely name files which have the same timestamp but different data
# needs BASH FIND GREP CUT FILE JHEAD MD5SUM MV
if [ ! -n "${1}" ]; then
    echo "USAGE: ${0} /my/pictures/unsorted/ [/my/pictures/sorted/]"
    echo "The trailing '/' is kinda important... btw..."
    exit
fi
if [ ! -n "${2}" ]; then
    MOVETO="${HOME}/Pictures/"
else
    MOVETO="${2}"
fi
# Call self recursively
if [ ! "${3}" = "RECUR" ]; then
    echo "Finding and organizing all pictures with EXIF timestamps..."
    find "${1}" -type f -exec "${0}" {} "${2}" "RECUR" \;
    echo "Done!"
else
    FILE="${1}"
    HAS_EXIF=$(file "${FILE}" | grep 'JPEG' | grep 'EXIF') # Is it better this way?
    if [ -n "${HAS_EXIF}" ]; then
        # 2>&1 sends jhead's errors into the pipe, where grep drops them
        TIMESTAMP=`jhead "${FILE}" 2>&1 | grep 'Date/Time' | cut -d':' -f2-6` # Or this way?
        if [ ! -n "${TIMESTAMP}" ]; then
            # '${FILE}' has EXIF but no timestamp!
            exit
        fi
        # Deciding path
        DATE=`echo ${TIMESTAMP} | cut -d' ' -f1`
        YEAR=`echo ${DATE} | cut -d':' -f1`
        MONTH=`echo ${DATE} | cut -d':' -f2`
        DAY=`echo ${DATE} | cut -d':' -f3`
        MOVETO="${MOVETO}${YEAR}/${MONTH}/${DAY}/"
        mkdir -p "${MOVETO}"
        # Deciding filename
        TIME=`echo ${TIMESTAMP} | cut -d' ' -f2`
        HOUR=`echo ${TIME} | cut -d':' -f1`
        MINUTE=`echo ${TIME} | cut -d':' -f2`
        SECOND=`echo ${TIME} | cut -d':' -f3`
        NEWFILE="${HOUR}${MINUTE}${SECOND}.jpg"
        # Complete path
        ABSPATH="${MOVETO}${NEWFILE}"
        if [ -f "${ABSPATH}" ]; then
            # 1) File exists with that name
            if [ ! "${ABSPATH}" = "${FILE}" ]; then
                # 2) The file isn't itself ... yeah ... that makes sense
                SUM0=`/usr/bin/md5sum "${FILE}" | cut -d' ' -f1`
                SUM1=`/usr/bin/md5sum "${ABSPATH}" | cut -d' ' -f1`
                if [ ! "${SUM0}" = "${SUM1}" ]; then
                    # 3) It isn't a duplicate of the same picture
                    i=0
                    ABSPATH_1="${MOVETO}d${i}-${HOUR}${MINUTE}${SECOND}.jpg"
                    until [ ! -f "${ABSPATH_1}" ]; do
                        (( i++ ))
                        ABSPATH_1="${MOVETO}d${i}-${HOUR}${MINUTE}${SECOND}.jpg"
                    done
                    # Giving it a unique name 'd#-FILE'
                    echo "Appending 'd${i}-' to ${ABSPATH}: file exists."
                    ABSPATH="${ABSPATH_1}"
                    # 3) non-duplicate of same name was renamed
                else
                    echo "Checksum match: Duplicate file overwritten"
                fi
            fi
            # 2) if the file is itself, pass-along, mv will handle it
        fi
        # 1) All that settled, should be safe to move the file
        if [ ! "${FILE}" = "${ABSPATH}" ]; then
            echo "Moving ${FILE} to ${ABSPATH}"
            mv "${FILE}" "${ABSPATH}"
        fi
    fi
    # 0) If that even was a picture, it certainly didn't have EXIF data.
fi
# I tested this baby on my precious photos and it worked for me.
# Sorted about 700 photos from about 34,500 files total in a few minutes. :-D
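The chain of `echo | cut` calls that splits the timestamp could probably be collapsed into a single `read`, which ties back to the IFS tip earlier in the thread. A minimal sketch, assuming the timestamp looks like "2005:03:09 10:30:00" once the cut pipeline has run (an assumption about jhead's "Date/Time" line); `split_timestamp` is a hypothetical helper name, not part of picdir.sh.

```shell
#!/bin/bash
# split_timestamp is a hypothetical helper. It expects a string shaped like
# "2005:03:09 10:30:00" (assumed format) and prints the relative target path.
split_timestamp() {
    local year month day hour minute second
    # IFS is set for this one read only: split on ':' and on spaces.
    IFS=': ' read -r year month day hour minute second <<< "$1"
    echo "${year}/${month}/${day}/${hour}${minute}${second}.jpg"
}

split_timestamp " 2005:03:09 10:30:00"   # prints 2005/03/09/103000.jpg
```

Prefixing the assignment to `read` limits the IFS change to that single command, so nothing has to be saved or restored afterwards.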
Last edited by CoolAJ86; 03-09-2005 at 04:26 PM.
|
|
|
All times are GMT -5. The time now is 08:30 AM.