LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Old 07-17-2009, 07:42 AM   #1
arashi256
Member
 
Registered: Jan 2008
Location: Brighton, UK
Distribution: Ubuntu 12.04 / CentOS 6.5
Posts: 394

Rep: Reputation: 61
sed whitespace substitution problem.


Gah - does sed do something special with whitespace in regexes?

For example, a file (file.txt) contains a list of paths with spaces in the directory names such as:-

Code:
/home/directory/my documents/my file.txt
and I want to escape the spaces so that the output ends up like: -

Code:
/home/directory/my\ documents/my\ file.txt
so that Linux won't shit itself. The following regex...

Code:
for filename in $(cat ~/file.txt | sed -e 's/ /\\/g')
do
        echo "$filename"
done
...outputs this: -

Code:
/home/directory/my\documents/my\file.txt
So I figured if I just added a space after the '\' in the sed substitution, I should get what I want like so: -

Code:
for filename in $(cat ~/file.txt | sed -e 's/ /\\ /g')
do
        echo "$filename"
done
Instead the output I get is: -

Code:
/home/directory/my\
documents/my\
file.txt
What the hell? Since when did a space become a newline character?
 
Old 07-17-2009, 08:04 AM   #2
berbae
Member
 
Registered: Jul 2005
Location: France
Distribution: Arch Linux
Posts: 540

Rep: Reputation: Disabled
$(cat ~/file.txt | sed -e 's/ /\\ /g')
This creates a list of blank-separated items, because the shell splits the unquoted output on whitespace.
The for loop takes each item and executes the echo command,
so you get three lines, because the echo command has been called three times.

This may produce what you expect:

sed -ibak -e 's/ /\\ /g' ~/file.txt
cat ~/file.txt
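
The splitting described above can be seen in a minimal sketch. The temporary file and the counter names are made up for illustration. The same escaped line is consumed once by a for loop over an unquoted command substitution and once by a while read loop over the unescaped file:

```shell
#!/bin/sh
# Hypothetical sample file holding one path that contains spaces.
list=$(mktemp)
printf '%s\n' '/home/directory/my documents/my file.txt' > "$list"

# Unquoted $(...) is split on spaces, tabs and newlines,
# so the for loop receives three separate words.
for_count=0
for word in $(sed -e 's/ /\\ /g' "$list"); do
    for_count=$((for_count + 1))
    printf 'for:  %s\n' "$word"
done

# read consumes one whole line per iteration, spaces included.
read_count=0
while IFS= read -r line; do
    read_count=$((read_count + 1))
    printf 'read: %s\n' "$line"
done < "$list"

printf 'for loop saw %s items, read loop saw %s\n' "$for_count" "$read_count"
rm -f "$list"
```

The backslashes that sed inserts are ordinary characters by the time the shell splits the output, so they do not protect the spaces.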
 
Old 07-17-2009, 08:12 AM   #3
arashi256
Member
 
Registered: Jan 2008
Location: Brighton, UK
Distribution: Ubuntu 12.04 / CentOS 6.5
Posts: 394

Original Poster
Rep: Reputation: 61
Nope, sorry - same problem.
 
Old 07-17-2009, 08:17 AM   #4
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
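A while read loop fed by process substitution (a bash feature) should manage blank spaces correctly:
Code:
# IFS= and -r keep leading blanks and backslashes intact
while IFS= read -r filename
do
  echo "$filename"
done < <(cat file.txt)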
 
Old 07-17-2009, 08:19 AM   #5
berbae
Member
 
Registered: Jul 2005
Location: France
Distribution: Arch Linux
Posts: 540

Rep: Reputation: Disabled
I tested the commands I gave you on my machine and 'cat file.txt' gives:
/home/directory/my\ documents/my\ file.txt

So what do you mean by "same problem"?
 
Old 07-17-2009, 08:29 AM   #6
rn_
Member
 
Registered: Jun 2009
Location: Orlando, FL, USA
Distribution: Suse, Redhat
Posts: 127
Blog Entries: 1

Rep: Reputation: 25
The problem, as mentioned by berbae, is with the space in the for loop. Splitting is controlled by the IFS shell variable, which defines the field separator characters, and space is one of them.

Can you post the output from berbae's script? Just one small change to it, though: remember to put a space after the \\.

However, if I need to do something in a loop with data which has spaces, I usually go with a while read loop, like so:

Code:
# -r stops read from swallowing the backslashes that sed just added
sed -e 's/ /\\ /g' ~/file.txt | while IFS= read -r line
do
    echo "$line"
    # other ops here
done
HTH.
-RN.
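
As a sketch of the IFS point above: if a for loop is kept, IFS can be narrowed to a newline so that spaces no longer separate items. The sample paths and variable names here are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical list with two paths, both containing spaces.
list=$(mktemp)
printf '%s\n' '/home/directory/my documents/my file.txt' '/tmp/other file.txt' > "$list"

old_ifs=$IFS        # remember the current separators (assumes IFS is set)
IFS='
'                   # split on newlines only
set -f              # turn off globbing so * or ? in a path is not expanded

count=0
for filename in $(cat "$list"); do
    count=$((count + 1))
    printf '%s\n' "$filename"
done

set +f
IFS=$old_ifs        # restore the original separators
rm -f "$list"
```

This keeps each line in one piece, but changing IFS affects every later expansion until it is restored, which is one reason a while read loop is often the simpler choice.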
 
Old 07-17-2009, 08:35 AM   #7
arashi256
Member
 
Registered: Jan 2008
Location: Brighton, UK
Distribution: Ubuntu 12.04 / CentOS 6.5
Posts: 394

Original Poster
Rep: Reputation: 61
Quote:
Originally Posted by berbae View Post
I tested the commands I gave to you on my machine and the 'cat file.txt' gives :
/home/directory/my\ documents/my\ file.txt

So what do you mean same problem ?
Here is my script with the bit that creates the FILE-LIST.tmp file commented out and the stuff that does things with the output also commented out.

The FILE-LIST.tmp contains the line (could be more, but one will do for now...)

Code:
/home/sambashare/JonWork/Test Directory/Some File.txt
Code:
#!/bin/sh
DAY=`date +%F`
BACKUP_TARGET=/media/LaCie/BACKUP
#find /home/sambashare/ -type f -mtime -1 -print > ~/FILE-LIST.tmp
LINECOUNT=`wc -l ~/FILE-LIST.tmp | sed -e 's/[A-Za-z/./-]*//g'`
if [ "$LINECOUNT" -gt "0" ]; then
        #mkdir ${BACKUP_TARGET}/backup-${DAY}
        sed -ibak -e 's/ /\\ /g' ~/FILE-LIST.tmp
        for filename in $(cat ~/FILE-LIST.tmp)
        do
                DIRECTORY=${filename%/*}
                echo "$filename"
                #mkdir -p ${BACKUP_TARGET}/backup-${DAY}${DIRECTORY}
                #cp -a "$filename" ${BACKUP_TARGET}/backup-${DAY}${DIRECTORY}/
        done
        #mv ~/FILE-LIST.tmp ${BACKUP_TARGET}/backup-$backup${DAY}/FILE-LIST.tmp
fi
When I run this, the script echoes: -

Code:
/home/sambashare/JonWork/Test\
Directory/Some\
File.txt
Interestingly, when I look at the FILE-LIST.tmp file again after running the script, the file looks correct, so I assume the sed part is doing its job, but I'd expect to see: -

Code:
/home/sambashare/JonWork/Test\ Directory/Some\ File.txt
from the script echo rather than the newline stuff I am seeing.

Last edited by arashi256; 07-17-2009 at 08:42 AM.
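
A possible reworking of the loop in this script, as a sketch only. Backslash escaping matters when a path is typed as shell source text. A path held in a variable only needs double quotes, so the sed step can be dropped. The two temporary directories are hypothetical stand-ins for /home/sambashare and ${BACKUP_TARGET}, and the date part of the backup directory name is left out to keep it short.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real source and backup locations.
src=$(mktemp -d)
backup_target=$(mktemp -d)
mkdir -p "$src/Test Directory"
echo 'hello' > "$src/Test Directory/Some File.txt"

# One path per line, spaces left unescaped.
find "$src" -type f -print > "$backup_target/FILE-LIST.tmp"

copied=0
while IFS= read -r filename; do
    directory=${filename%/*}
    # Double quotes keep each path as a single argument.
    mkdir -p "$backup_target/backup$directory"
    cp -a "$filename" "$backup_target/backup$directory/"
    copied=$((copied + 1))
done < "$backup_target/FILE-LIST.tmp"

# Check that the copy landed where expected, then clean up.
copy_path="$backup_target/backup$src/Test Directory/Some File.txt"
if [ -f "$copy_path" ]; then status=ok; else status=missing; fi
printf 'copied %s file(s), status: %s\n' "$copied" "$status"
rm -rf "$src" "$backup_target"
```

This still assumes that no file name contains a newline, since the list is read one line at a time.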
 
Old 07-17-2009, 08:37 AM   #8
vonbiber
Member
 
Registered: Apr 2009
Distribution: slackware 14.1 64-bit, slackware 14.2 64-bit, SystemRescueCD
Posts: 444

Rep: Reputation: 98
Quote:
Originally Posted by arashi256 View Post

Code:
/home/directory/my documents/my file.txt
and I want to escape the spaces so that the output ends up like: -

Code:
/home/directory/my\ documents/my\ file.txt
so that Linux won't shit itself. The following regex...

Code:
for filename in $(cat ~/file.txt | sed -e 's/ /\\/g')
do
        echo "$filename"
done
...outputs this: -

Code:
/home/directory/my\documents/my\file.txt
Instead the output I get is: -

Code:
/home/directory/my\
documents/my\
file.txt
What the hell? Since when did a space become a newline character?
The problem is not sed. It's your for loop.
The for loop splits its list on whitespace (space, tab, newline), so it feeds the
variable filename with the string just before each space, and sed
has already replaced ' ' with '\ '

One solution that comes to my mind right now would be to
read the file line by line (using head and tail and looping
until we reach the total number of lines)

e.g., say you store the number of lines of file.txt in the
variable N
then

i=0
while [ $i -lt $N ]
do
i=$((i+1))
head -n$i file.txt | tail -n1 | sed 's/ /\\&/g'
done
 
Old 07-17-2009, 08:44 AM   #9
arashi256
Member
 
Registered: Jan 2008
Location: Brighton, UK
Distribution: Ubuntu 12.04 / CentOS 6.5
Posts: 394

Original Poster
Rep: Reputation: 61
Quote:
Originally Posted by vonbiber View Post
The problem is not sed. It's your for loop
The separator for the for loop is ' ' (space), so it feeds the
variable filename with the string just before the space and sed
has already replaced ' ' with '\ '

One solution that comes to my mind right now would be to
read the file line by line (using head and tail and looping
until we reach the total number of lines)

e.g., say you store the number of lines of file.txt in the
variable N
then

i=0
while [ $i -lt $N ]
do
i=$((i+1))
head -n$i file.txt | tail -n1 | sed 's/ /\\&/g'
done
Sorry, I have no idea what you're doing there.
 
Old 07-17-2009, 09:06 AM   #10
vonbiber
Member
 
Registered: Apr 2009
Distribution: slackware 14.1 64-bit, slackware 14.2 64-bit, SystemRescueCD
Posts: 444

Rep: Reputation: 98
Quote:
Originally Posted by arashi256 View Post
Sorry, I have no idea what you're doing there.
1. first we get the total number of lines of file.txt and store the
result in the variable N
N=$(cat file.txt | wc -l)

2. Next we read line 1, line 2, ..., until we reach line N.
We use a loop with a counter i that we initialize to 0 (zero)
i=0
i=$((i+1)) : this increments the counter i by 1
head -n$i file.txt : this reads the first $i lines of file.txt
tail -n1 : this keeps only the last line of head's output,
so that we're exactly at line number $i
sed 's/ /\\&/g' : this replaces all occurrences of ' ' (space)
with '\ ' (escaped space) in line $i

To recap:

N=$(cat file.txt | wc -l)
i=0
while [ $i -lt $N ]
do
i=$((i+1))
head -n$i file.txt | tail -n1 | sed 's/ /\\&/g'
done
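
A self-contained sketch of the recap, run against a made-up two-line file so it can be pasted as is. It also runs sed once over the whole file for comparison, since sed already works line by line. The variable names looped and single are illustrative only.

```shell
#!/bin/sh
# Hypothetical sample list with two paths containing spaces.
list=$(mktemp)
printf '%s\n' '/home/directory/my documents/my file.txt' '/tmp/a b.txt' > "$list"

# Line-by-line version: pick line i with head and tail, then escape it.
N=$(( $(wc -l < "$list") ))   # arithmetic expansion strips any padding from wc
i=0
looped=$(
    while [ "$i" -lt "$N" ]; do
        i=$((i + 1))
        head -n "$i" "$list" | tail -n 1 | sed 's/ /\\&/g'
    done
)

# Single pass: sed applies the substitution to every line by itself.
single=$(sed 's/ /\\&/g' "$list")

printf '%s\n' "$looped"
[ "$looped" = "$single" ] && echo "both approaches agree"
rm -f "$list"
```

In the sed replacement, `\\` stands for a literal backslash and `&` for the text that matched, which here is the space.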
 
Old 07-17-2009, 09:28 AM   #11
arashi256
Member
 
Registered: Jan 2008
Location: Brighton, UK
Distribution: Ubuntu 12.04 / CentOS 6.5
Posts: 394

Original Poster
Rep: Reputation: 61
Quote:
Originally Posted by vonbiber View Post
1. first we get the total number of lines of file.txt and store the
result in the variable N
N=$(cat file.txt | wc -l)

2. Next we read line 1, line 2, ..., until we reach line N.
We use a loop with a counter i that we initialize to 0 (zero)
i=0
i=$((i+1)) : this increments the counter i by 1
head -n$i file.txt : this reads the first $i lines of file.txt
tail -n1 : this keeps only the last line of head's output,
so that we're exactly at line number $i
sed 's/ /\\&/g' : this replaces all occurrences of ' ' (space)
with '\ ' (escaped space) in line $i

To recap:

N=$(cat file.txt | wc -l)
i=0
while [ $i -lt $N ]
do
i=$((i+1))
head -n$i file.txt | tail -n1 | sed 's/ /\\&/g'
done
I'm getting command not found doing that. Not sure where. Thanks anyway.
 
Old 07-17-2009, 09:39 AM   #12
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
Have you tried process substitution as I suggested above?
 
Old 07-17-2009, 09:43 AM   #13
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Code:
awk '{
    gsub(/ /,"\\ ")
    print $0
}
' file
 
Old 07-17-2009, 09:46 AM   #14
arashi256
Member
 
Registered: Jan 2008
Location: Brighton, UK
Distribution: Ubuntu 12.04 / CentOS 6.5
Posts: 394

Original Poster
Rep: Reputation: 61
Quote:
Originally Posted by colucix View Post
Have you tried process substitution as I suggested above?
Yep.

Code:
./inc.sh: line 10: syntax error near unexpected token `<'
./inc.sh: line 10: `    done < <(cat FILE-LIST.tmp)'
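
A likely explanation, assuming inc.sh starts with #!/bin/sh like the script posted earlier: process substitution, `<( ... )`, is not part of POSIX sh. It is an extension found in bash, ksh and zsh, and bash turns it off when running as sh. Either change the first line to #!/bin/bash or redirect the file straight into the loop, as in this sketch (the temporary file stands in for FILE-LIST.tmp):

```shell
#!/bin/sh
# Hypothetical stand-in for FILE-LIST.tmp.
list=$(mktemp)
printf '%s\n' '/home/sambashare/JonWork/Test Directory/Some File.txt' > "$list"

lines=0
while IFS= read -r filename; do
    lines=$((lines + 1))
    last=$filename
    printf '%s\n' "$filename"
done < "$list"      # plain redirection works in any POSIX shell

rm -f "$list"
```

Because the loop is fed by a redirection rather than a pipe, it runs in the current shell and variables set inside it are still available afterwards.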
 
Old 07-17-2009, 09:48 AM   #15
arashi256
Member
 
Registered: Jan 2008
Location: Brighton, UK
Distribution: Ubuntu 12.04 / CentOS 6.5
Posts: 394

Original Poster
Rep: Reputation: 61
Quote:
Originally Posted by ghostdog74 View Post
Code:
awk '{
    gsub(/ /,"\\ ")
    print $0
}
' file
Newbies forum, I believe. Thanks for the reply, but that means exactly zilch to me. An explanation might have been nice?
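
For reference, an annotated sketch of what the awk command does. It reads its input from printf instead of a file, and the sample path and the variable name escaped are assumptions for illustration.

```shell
#!/bin/sh
# awk runs the block between { and } once for every line of input.
escaped=$(
    printf '%s\n' '/home/directory/my documents/my file.txt' |
    awk '{
        gsub(/ /, "\\ ")   # in the current line, replace every space with backslash + space
        print $0           # $0 is the whole (now modified) line
    }'
)
printf '%s\n' "$escaped"
```

gsub means "global substitute": the first argument is the pattern to look for, the second is the replacement, and with no third argument it works on the whole line.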
 
  

