LinuxQuestions.org


arashi256 07-17-2009 07:42 AM

sed whitespace substitution problem.
 
Gah - does sed do something special with whitespace in regexes?

For example, a file (file.txt) contains a list of paths with spaces in the directory names such as:-

Code:

/home/directory/my documents/my file.txt
and I want to escape the spaces so that the output ends up like: -

Code:

/home/directory/my\ documents/my\ file.txt
so that Linux won't shit itself. The following regex...

Code:

for filename in $(cat ~/file.txt | sed -e 's/ /\\/g')
do
        echo "$filename"
done

...outputs this: -

Code:

/home/directory/my\documents/my\file.txt
So I figured if I just added a space after the '\' in the sed substitution, I should get what I want like so: -

Code:

for filename in $(cat ~/file.txt | sed -e 's/ /\\ /g')
do
        echo "$filename"
done

Instead the output I get is: -

Code:

/home/directory/my\
documents/my\
file.txt

What the hell? Since when did a space become a newline character?

berbae 07-17-2009 08:04 AM

$(cat ~/file.txt | sed -e 's/ /\\ /g')
This creates a list of blank-separated items.
The for loop takes each item in turn and runs the echo command on it,
so you get three lines, because echo has been called three times.

This may produce what you expect:

sed -ibak -e 's/ /\\ /g' ~/file.txt
cat ~/file.txt
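
The word splitting described above can be reproduced on its own. This is only a minimal sketch: the scratch file comes from mktemp and the function name count_items is made up for the illustration.

```shell
#!/bin/sh
# count_items is a hypothetical helper: it reports how many items a
# for loop sees when it iterates over an unquoted $(cat FILE).
count_items() {
    n=0
    for item in $(cat "$1"); do
        n=$((n + 1))
    done
    echo "$n"
}

list=$(mktemp)
printf '%s\n' '/home/directory/my\ documents/my\ file.txt' > "$list"

# The file holds one line, but the loop runs once per blank-separated word.
count_items "$list"
rm -f "$list"
```

The backslashes make no difference here: they are ordinary characters inside the command substitution, so the shell still splits on the spaces that follow them.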

arashi256 07-17-2009 08:12 AM

Nope, sorry - same problem.

colucix 07-17-2009 08:17 AM

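A while read loop fed through process substitution should handle the spaces correctly (process substitution is a bash feature, so the script has to run under bash rather than plain sh):
Code:

while IFS= read -r filename
do
  echo "$filename"
done < <(cat file.txt)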


berbae 07-17-2009 08:19 AM

I tested the commands I gave to you on my machine and the 'cat file.txt' gives :
/home/directory/my\ documents/my\ file.txt

So what do you mean same problem ?

rn_ 07-17-2009 08:29 AM

The problem, as berbae mentioned, is the space in the for loop. Word splitting there is controlled by the IFS shell variable, which defines the field separator characters (space, tab and newline by default).

Can you post the output from berbae's script? Just one small change to it, though: remember to put a space after the \\.

However, if I need to do something in a loop with data which has spaces, I usually go with a while read loop, like so:

Code:

sed -e 's/ /\\ /g' ~/file.txt | while IFS= read -r line
do
    # -r keeps the backslashes that sed added; the quotes keep the line whole
    echo "$line"
    # other ops here
done

HTH.
-RN.
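
One detail matters when read is used on escaped text like this: without -r, the shell treats a backslash as an escape character and removes it, so the '\ ' written by sed comes back out as a plain space. A small sketch, assuming a POSIX shell (both function names are invented for the demonstration):

```shell
#!/bin/sh
# Both helpers read the first line of standard input and print it back.
read_plain() { read line;         printf '%s\n' "$line"; }
read_raw()   { IFS= read -r line; printf '%s\n' "$line"; }

escaped='/home/directory/my\ documents/my\ file.txt'

printf '%s\n' "$escaped" | read_plain   # read consumes the backslashes
printf '%s\n' "$escaped" | read_raw     # the line comes through unchanged
```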

arashi256 07-17-2009 08:35 AM

Quote:

Originally Posted by berbae (Post 3610684)
I tested the commands I gave to you on my machine and the 'cat file.txt' gives :
/home/directory/my\ documents/my\ file.txt

So what do you mean same problem ?

Here is my script with the bit that creates the FILE-LIST.tmp file commented out and the stuff that does things with the output also commented out.

The FILE-LIST.tmp contains the line (could be more, but one will do for now...)

Code:

/home/sambashare/JonWork/Test Directory/Some File.txt
Code:

#!/bin/sh
DAY=`date +%F`
BACKUP_TARGET=/media/LaCie/BACKUP
#find /home/sambashare/ -type f -mtime -1 -print > ~/FILE-LIST.tmp
LINECOUNT=`wc -l ~/FILE-LIST.tmp | sed -e 's/[A-Za-z/./-]*//g'`
if [ "$LINECOUNT" -gt "0" ]; then
        #mkdir ${BACKUP_TARGET}/backup-${DAY}
        sed -ibak -e 's/ /\\ /g' ~/FILE-LIST.tmp
        for filename in $(cat ~/FILE-LIST.tmp)
        do
                DIRECTORY=${filename%/*}
                echo "$filename"
                #mkdir -p ${BACKUP_TARGET}/backup-${DAY}${DIRECTORY}
                #cp -a "$filename" ${BACKUP_TARGET}/backup-${DAY}${DIRECTORY}/
        done
        #mv ~/FILE-LIST.tmp ${BACKUP_TARGET}/backup-$backup${DAY}/FILE-LIST.tmp
fi

When I run this, the script echoes: -

Code:

/home/sambashare/JonWork/Test\
Directory/Some\
File.txt

Interestingly, when I look at the FILE-LIST.tmp file again after running the script, the file looks correct, so I assume the sed part is doing its job, but I'd expect to see: -

Code:

/home/sambashare/JonWork/Test\ Directory/Some\ File.txt
from the script echo rather than the newline stuff I am seeing.
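
For comparison, here is a sketch of the same loop that reads the list one line at a time and leaves the paths unescaped. Quoting "$filename" is enough to keep each path as a single argument, so the sed step is not needed. It assumes plain POSIX sh plus the same cp -a the script already uses; the function name backup_listed and its two parameters are made up here and are not part of the original script.

```shell
#!/bin/sh
# backup_listed LIST TARGET (hypothetical helper)
# Copies every path named in LIST, one per line, into TARGET,
# recreating each file's directory underneath it.
backup_listed() {
    list=$1
    target=$2
    while IFS= read -r filename; do
        directory=${filename%/*}
        echo "$filename"
        mkdir -p "${target}${directory}"
        cp -a "$filename" "${target}${directory}/"
    done < "$list"
}

# Example call, using the names from the script above:
# backup_listed ~/FILE-LIST.tmp "${BACKUP_TARGET}/backup-${DAY}"
```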

vonbiber 07-17-2009 08:37 AM

Quote:

Originally Posted by arashi256 (Post 3610646)

Code:

/home/directory/my documents/my file.txt
and I'm want to escape the spaces so that the output ends up like: -

Code:

/home/directory/my\ documents/my\ file.txt
so that Linux won't shit itself. The following regex...

Code:

for filename in $(cat ~/file.txt | sed -e 's/ /\\/g')
do
        echo "$filename"
done

...outputs this: -

Code:

/home/directory/my\documents/my\file.txt
Instead the output I get is: -

Code:

/home/directory/my\
documents/my\
file.txt

What the hell? Since when did a space become a newline character?

The problem is not sed. It's your for loop.
The for loop splits its list on whitespace, so on each pass it sets the
variable filename to the text up to the next space, and by then sed
has already replaced ' ' with '\ '

One solution that comes to my mind right now would be to
read the file line by line (using head and tail and looping
until we reach the total number of lines)

e.g., say you store the number of lines of file.txt in the
variable N
then

i=0
while [ $i -lt $N ]
do
    i=$((i+1))
    head -n$i file.txt | tail -n1 | sed 's/ /\\&/g'
done

arashi256 07-17-2009 08:44 AM

Quote:

Originally Posted by vonbiber (Post 3610709)
The problem is not sed. It's your for loop
The separator for the for loop is ' ' (space), so it feeds the
variable filename with the string just before the space and sed
has already replaced ' ' with '\ '

One solution that comes to my mind right now would be to
read the file line by line (using head and tail and looping
until we reach the total number of lines)

e.g., say you store the number of lines of file.txt in the
variable N
then

i=0
while [ $i -lt $N ]
do
i=$((i+1))
head -n$i file.txt | tail -n1 | sed 's/ /\\&/g'
done

Sorry, I have no idea what you're doing there.

vonbiber 07-17-2009 09:06 AM

Quote:

Originally Posted by arashi256 (Post 3610717)
Sorry, I have no idea what you're doing there.

1. First we get the total number of lines in file.txt and store the
result in the variable N
N=$(cat file.txt | wc -l)

2. Next we read line 1, line 2, ..., until we reach line N.
We use a loop with a counter i that we initialize to 0 (zero)
i=0
i=$((i+1)) : this increments the counter i by 1
head -n$i file.txt : this prints the first $i lines of file.txt
tail -n1 : this keeps only the last line of head's output,
so that we're exactly at line number $i
sed 's/ /\\&/g' : this replaces all occurrences of ' ' (space)
with '\ ' (escaped space) in line $i

To recap:

N=$(cat file.txt | wc -l)
i=0
while [ $i -lt $N ]
do
i=$((i+1))
head -n$i file.txt | tail -n1 | sed 's/ /\\&/g'
done
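
Since sed already works through its input one line at a time, the counter loop and a single sed call should print the same thing. A quick way to check, assuming a scratch file from mktemp (the helper names are illustrative only):

```shell
#!/bin/sh
# escape_by_counter repeats the head/tail loop from the recap.
escape_by_counter() {
    # $(( ... )) strips the padding some wc implementations print.
    N=$(( $(wc -l < "$1") ))
    i=0
    while [ "$i" -lt "$N" ]; do
        i=$((i + 1))
        head -n "$i" "$1" | tail -n 1 | sed 's/ /\\&/g'
    done
}

# escape_in_one_pass lets sed walk the file itself.
escape_in_one_pass() {
    sed 's/ /\\&/g' "$1"
}

sample=$(mktemp)
printf '%s\n' '/home/directory/my documents/my file.txt' \
              '/home/directory/other dir/notes.txt' > "$sample"

escape_by_counter "$sample"
escape_in_one_pass "$sample"    # prints the same two lines again
rm -f "$sample"
```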

arashi256 07-17-2009 09:28 AM

Quote:

Originally Posted by vonbiber (Post 3610733)
1. first we get the total number of lines of file.txt and store the
result in the variable N
N=$(cat file.txt | wc -l)

2. Next we read line 1, line 2, ..., until I reach line N
we use a loop with a counter i that we initialize to 0 (zero)
i=0
i=$((i+1)) : this increment the i counter by 1
head -n$i file.txt : this reads the '$i' first lines in file.txt
tail -n1 : this reads the last line of the result of head
so that we're exactly at line number $i
sed 's/ /\\&/g' : replace all occurrences of ' ' (space)
with '\ ' (escaped space) in the line $i

To recap:

N=$(cat file.txt | wc -l)
i=0
while [ $i -lt $N ]
do
i=$((i+1))
head -n$i file.txt | tail -n1 | sed 's/ /\\&/g'
done

I'm getting command not found doing that. Not sure where. Thanks anyway.

colucix 07-17-2009 09:39 AM

Have you tried process substitution as I suggested above?

ghostdog74 07-17-2009 09:43 AM

Code:

awk '{
    gsub(/ /,"\\ ")
    print $0
}
' file


arashi256 07-17-2009 09:46 AM

Quote:

Originally Posted by colucix (Post 3610758)
Have you tried process substitution as I suggested above?

Yep.

Code:

./inc.sh: line 10: syntax error near unexpected token `<'
./inc.sh: line 10: `    done < <(cat FILE-LIST.tmp)'
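
That message looks like bash running in sh mode, most likely because the script starts with #!/bin/sh; as far as I know, bash versions of this era switch process substitution off in that mode. Assuming that is the cause, either change the first line to #!/bin/bash or feed the loop with an ordinary redirection, which works in any POSIX sh. A sketch of the second option (print_lines is a made-up name):

```shell
#!/bin/sh
# print_lines FILE (hypothetical helper): the same loop, but the file is
# attached with an ordinary redirection instead of < <(cat FILE).
print_lines() {
    while IFS= read -r filename; do
        echo "$filename"
    done < "$1"
}

# Example call:
# print_lines FILE-LIST.tmp
```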


arashi256 07-17-2009 09:48 AM

Quote:

Originally Posted by ghostdog74 (Post 3610765)
Code:

awk '{
    gsub(/ /,"\\ ")
    print $0
}
' file


Newbies forum, I believe. Thanks for the reply, but that means exactly zilch to me. An explanation might have been nice? :)
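
For anyone else landing here, the awk program above can be read line by line. The sketch below is the same program with comments added. The function wrapper escape_spaces is hypothetical and exists only so the program can be called on any file.

```shell
#!/bin/sh
# escape_spaces FILE (hypothetical wrapper around the awk program)
escape_spaces() {
    awk '{
        # awk runs this block once per input line; $0 is the whole line.
        # gsub(regex, replacement) replaces every match within $0.
        # Inside an awk string "\\ " is a backslash followed by a space.
        gsub(/ /, "\\ ")
        print $0
    }' "$1"
}

# Example call:
# escape_spaces file.txt
```

Like the sed version, this only prints the escaped text; it does not change how a later for loop splits it.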

