LinuxQuestions.org - Sed question

wondergirl

02-21-2008 02:54 AM

Sed question

Hmm... I have been struggling with this for a while...

I have 2 files : file A and file B. File A contains a list of servers in a certain format, and file B contains servernames that need to be removed from File A.

File A :
======

servera yaddayaddablabla
serverb yahdhdhydhhd
serverc dhhdkkdkkdkd
serverd ddkdkkdd

........ you get the idea

File B :
=======

serverc
serverd

Question: how do I use sed to delete lines from file A using the contents of file B? Any server listed in file B that has an entry in file A should have that entry deleted.

I have this code, but it doesn't seem to do the work for some reason; I ended up with the same file! I'd appreciate it if you can help.

--------------------

#!/bin/sh
for i in `cat fileB`;
do
sed '/^${i}/d' fileA>newfile
mv newfile fileA
echo ${i}
done

jlliagre

02-21-2008 03:14 AM

Instead of

Code:

sed '/^${i}/d' fileA>newfile

try

Code:

sed '/^'${i}'/d' fileA>newfile
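
The difference between the two commands comes down to shell quoting. Here is a minimal sketch, assuming a POSIX-style shell. It uses printf in place of sed so that the argument sed would receive is visible. The variable names single, spliced and double are only for illustration:

```shell
#!/bin/sh
# Compare what the shell passes along for three quoting styles.
# printf stands in for sed so the resulting argument can be inspected.
i=serverc

single=$(printf '%s' '/^${i}/d')      # single quotes: ${i} is NOT expanded
spliced=$(printf '%s' '/^'${i}'/d')   # quotes closed around ${i}: expanded
double=$(printf '%s' "/^${i}/d")      # double quotes: expanded

echo "single:  $single"
echo "spliced: $spliced"
echo "double:  $double"
```

With single quotes the text ${i} reaches sed unexpanded, so sed would look for lines beginning with those literal characters instead of the server name. The spliced and double-quoted forms both produce /^serverc/d here.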

wondergirl

02-21-2008 04:20 AM

Quote:

Originally Posted by jlliagre (Post 3064717)

Instead of

Code:

sed '/^${i}/d' fileA>newfile

try

Code:

sed '/^'${i}'/d' fileA>newfile

I tried your suggestion, but it looks like the file is still not edited :( :(

ghostdog74

02-21-2008 06:33 AM

Code:

# join -v 1 file file1
servera yaddayaddablabla
serverb yahdhdhydhhd

pixellany

02-21-2008 07:04 AM

To remove lines beginning with the word "opt" from file "list"

i="opt"

sed /^$i/d list

OR

sed "/^$i/d" list

In this example, quoting is not required, since there is no ambiguous meaning. If quotes are used, they must be double-quotes to allow bash to expand $i.

What was the purpose of ${i}? The curly brackets don't seem necessary.

wondergirl

02-21-2008 07:29 AM

Quote:

Originally Posted by pixellany (Post 3064908)

I'm just used to putting the curly brackets :-P I read somewhere that it's good practice..

I did what you mentioned... it worked if I declared i="opt" as you did, but not if I put all the patterns I want to match in one file, as in my initial example above....

pixellany

02-21-2008 07:45 AM

It seems your basic loop is going to be very inefficient. For every value of "i" it runs sed over the whole file to delete the matching lines and then writes a new file. If the files are large, it will be slow. Perhaps the solution using "join" is better.
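
One way to avoid rewriting fileA once per server name is to generate a sed script from fileB and apply it in a single pass. This is a sketch, not something posted in the thread. It assumes the names in fileB contain no regular-expression metacharacters or slashes, and that each name in fileA is followed by a space. It runs in a hypothetical scratch directory so no real files are touched:

```shell
#!/bin/sh
# Sketch: delete every fileA entry named in fileB with one sed pass.
dir=${TMPDIR:-/tmp}/sedpass.$$        # hypothetical scratch directory
mkdir -p "$dir" && cd "$dir" || exit 1

# Sample data taken from the first post.
printf '%s\n' 'servera yaddayaddablabla' 'serverb yahdhdhydhhd' \
              'serverc dhhdkkdkkdkd' 'serverd ddkdkkdd' > fileA
printf '%s\n' serverc serverd > fileB

# Skip empty lines, then turn each name into a delete command
# such as "/^serverc /d" (name anchored at line start, then a space).
sed -e '/^$/d' -e 's|.*|/^& /d|' fileB > drop.sed

# Apply all the delete commands in a single pass over fileA.
sed -f drop.sed fileA > newfile
cat newfile
```

Requiring the space after the name also keeps an entry like serverc from deleting a line for a longer name such as serverc2, which the bare /^serverc/ pattern would match.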

Note:

instead of:
sed /keyword/d file > newfile
mv newfile file

how about:
sed -i /keyword/d file

wondergirl

02-21-2008 08:11 AM

Quote:

Originally Posted by pixellany (Post 3064961)

Hmmm... it seems like it doesn't recognize the -i option..? (I'm on a Solaris 5.8 machine.)

I have to read a bit about the join command because I'm not sure what that one will do...

Thanks for your comments!

slakmagik

02-21-2008 08:19 AM

random thoughts

Quote:

Originally Posted by wondergirl (Post 3064936)

I'm just used to putting the curly brackets :-P I read somewhere that its good practice..

It's really not, IMO - just extra clutter. Though I suppose you could make an argument for consistency. I only use the full ${var} form when (a) it's required, such as for arrays and parameter expansion, or (b) it's practically required, such as with ambiguous strings - ${foo}bar, because $foobar would refer to a variable 'foobar' that doesn't exist.
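
A small sketch of the ambiguous case, assuming a POSIX-style shell with no variable named foobar set (the names with_braces and without are only for illustration):

```shell
#!/bin/sh
# Show when the braces in ${var} change the result.
unset foobar              # make sure no variable called foobar exists
foo=abc

with_braces="${foo}bar"   # expands foo, then appends the text "bar"
without="$foobar"         # looks up a variable named foobar, which is unset

echo "with braces:    '$with_braces'"
echo "without braces: '$without'"
```

The first line prints 'abcbar' and the second prints an empty string, which is why the braces are needed whenever a variable is followed directly by characters that could be part of a name.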

Quote:

Originally Posted by pixellany (Post 3064961)

Note:

instead of:
sed /keyword/d file > newfile
mv newfile file

how about:
sed -i /keyword/d file

pixellany - I recommend it but always note that it's not standard/portable.

Quote:

Originally Posted by ghostdog74 (Post 3064879)

Code:

# join -v 1 file file1
servera yaddayaddablabla
serverb yahdhdhydhhd

ghostdog74's probably got the best/simplest approach here, as long as the files are sorted.

slakmagik

02-21-2008 08:22 AM

Took me a long time to marshal all those quotes and I missed your reply - you already noted #2 ('-i' is a GNU extension to sed) and are looking into #3 (join). Sorry. I didn't *mean* to be redundant.

wondergirl

02-21-2008 08:38 AM

Quote:

Originally Posted by digiot (Post 3065006)

Took me a long time to marshal all those quotes and I missed your reply - you already noted #2 ('-i' is a GNU extension to sed) and are looking into #3 (join). Sorry. I didn't *mean* to be redundant.

That is OK :)

What do you mean I can do join as long as it's sorted? I'm trying to read about join but would appreciate it if you could explain a bit more.

wondergirl

02-21-2008 09:07 AM

Hmmmm this is funny, when I ran this, to check whether it can get the values from fileB and process it :

#!/bin/sh
for i in `cat fileB`;
do
sed '/^'$i'/d' fileA>newfile
echo $i
exit
done

I ended up with a newfile that was missing the line that matched the first $i pattern! So why doesn't it work when it keeps going to the end of the loop???? I don't understand why it works with only the first $i. Grrrrrrrrr...

cicorino

02-21-2008 10:13 AM

well if you insert an 'exit' before 'done'...
[edit: and you rewrite newfile at each iteration]
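
The bracketed point can be sketched as follows, assuming a POSIX-style shell and a hypothetical scratch directory. Without the mv step, every iteration reads the untouched fileA and overwrites newfile, so only the last pattern's deletion survives:

```shell
#!/bin/sh
# Sketch: what happens when newfile is never moved back over fileA.
dir=${TMPDIR:-/tmp}/overwrite.$$      # hypothetical scratch directory
mkdir -p "$dir" && cd "$dir" || exit 1

printf '%s\n' 'servera yaddayaddablabla' 'serverb yahdhdhydhhd' \
              'serverc dhhdkkdkkdkd' 'serverd ddkdkkdd' > fileA
printf '%s\n' serverc serverd > fileB

for i in $(cat fileB)
do
  # Each pass starts again from the original fileA and replaces newfile.
  sed "/^$i/d" fileA > newfile
done

# The serverc line is still present; only the serverd line is gone.
cat newfile
```

With the exit in place the loop stops after the first name instead, which matches the result described above: newfile is missing only the first match.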

jlliagre

02-21-2008 10:15 AM

Are you sure you are really running this very script ?

Code:

#!/bin/sh
for i in `cat fileB`
do
  sed '/^'${i}'/d' fileA>newfile
  mv newfile fileA
  echo ${i}
done

There is no reason for it not to remove from fileA all the lines starting with strings from fileB.

Here is something that demonstrates it works:

Code:

#!/bin/sh
cat >fileA <<%
server1 foo
server2 bar
server3 foo
server4 bar
%
cat >fileB <<%
server1
server4
%
echo ---
echo before fileA=
cat fileA
for i in `cat fileB`
do
  sed '/^'${i}'/d' fileA>newfile
  mv newfile fileA
done
echo ---
echo after fileA=
cat fileA

output:

Code:

---
before fileA=
server1 foo
server2 bar
server3 foo
server4 bar
---
after fileA=
server2 bar
server3 foo

wondergirl

02-21-2008 05:12 PM

Quote:

Originally Posted by jlliagre (Post 3065132)

Are you sure you are really running this very script ?

Yes, I am sure! I feel like I'm going crazy. It should work! The thing is, if I cut the file down and left only 3 patterns as a test, it worked..

Is there some sort of limitation in sed on the number of iterations or something?????? I'm beating my head against the wall over this!

osor

02-21-2008 06:22 PM

Quote:

Originally Posted by wondergirl (Post 3065527)

Is there some sort of limitation in sed on the number of iterations or something?????? I'm beating my head against the wall over this!

There may be some sort of limitation in your shell’s implementation of the for loop (how big is fileB?). Try this instead:

Code:

#!/bin/sh
while read i; do
  sed '/^'${i}'/d' fileA>newfile
  mv newfile fileA
  echo ${i}
done < fileB
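
The thread does not establish what actually went wrong with the for loop. One well-known difference between the two loop forms is that for splits the file's contents into whitespace-separated words, while read takes one line at a time. Here is a minimal sketch of that difference, assuming a POSIX-style shell and a hypothetical test file whose lines contain spaces:

```shell
#!/bin/sh
# Sketch: count iterations of each loop form over the same two-line file.
list=${TMPDIR:-/tmp}/looptest.$$      # hypothetical test file
printf '%s\n' 'server one' 'server two' > "$list"

for_count=0
for i in $(cat "$list"); do           # iterates over words
  for_count=$((for_count + 1))
done

while_count=0
while IFS= read -r i; do              # iterates over whole lines
  while_count=$((while_count + 1))
done < "$list"

rm -f "$list"
echo "for: $for_count  while: $while_count"
```

This prints "for: 4  while: 2". For a fileB holding one bare server name per line, both forms see the same items, so this difference alone would not explain the behaviour reported above.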

wondergirl

02-21-2008 07:48 PM

Quote:

Originally Posted by osor (Post 3065588)

There may be some sort of limitation in your shell’s implementation of the for loop (how big is fileB?). Try this instead:

Code:

#!/bin/sh
while read; do
  sed '/^'${i}'/d' fileA>newfile
  mv newfile fileA
  echo ${i}
done < fileB

YAY!!! This works like a charm

Thanks so much for all the comments and help from all of you on this. The join command is interesting, so I'm glad that was mentioned.

I didn't know there was a limitation on the for loop in the shell... but apparently there is, because the while loop works.

osor

02-21-2008 08:18 PM

Quote:

Originally Posted by wondergirl (Post 3065643)

YAY!!! This works like a charm

Except of course that I made a typo ;). The first line should be:

Code:

while read i; do

wondergirl

02-21-2008 09:08 PM

Quote:

Originally Posted by osor (Post 3065666)

Except of course that I made a typo ;). The first line should be:

Code:

while read i; do

That is OK..I caught that and fixed it myself but the while made it work.

slakmagik

02-21-2008 11:22 PM

Quote:

Originally Posted by wondergirl (Post 3065018)

That is OK :)

Thanks. :)

Quote:

Originally Posted by wondergirl (Post 3065018)

What do you mean I can do join as long as it's sorted? I'm trying to read about join but would appreciate it if you could explain a bit more.

:cat file1
server1
server2
server3
server4
server5

:cat file2 # out of order
server4
server2

:join -v1 file1 file2
server1
server2
server3
server5

Guessing at the algorithm that requires the files to be sorted: file1's server4 and file2's server4 join, but we've already passed server2 in file1, so it's 'wrongly' (for our purposes) skipped when we get to it in file2.

:join -v1 <(sort file1) <(sort file2)
server1
server3
server5

With the files sorted, we exclude both the joined server4 and server2. (That second example uses process substitution and I don't know if that's supported on Solaris, but it's just an example.)
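
For a shell without process substitution, the same result can be had with temporary sorted copies. Here is a sketch assuming a POSIX-style shell and a hypothetical scratch directory. LC_ALL=C is set so that sort and join agree on the ordering:

```shell
#!/bin/sh
# Sketch: join -v 1 on sorted temporary copies instead of <(sort ...).
LC_ALL=C; export LC_ALL               # same collation for sort and join
dir=${TMPDIR:-/tmp}/jointest.$$       # hypothetical scratch directory
mkdir -p "$dir" && cd "$dir" || exit 1

printf '%s\n' server1 server2 server3 server4 server5 > file1
printf '%s\n' server4 server2 > file2   # deliberately out of order

sort file1 > file1.sorted
sort file2 > file2.sorted

# -v 1 prints only the lines of the first file that have no match
# in the second file.
join -v 1 file1.sorted file2.sorted > result
cat result
```

This prints server1, server3 and server5, matching the process-substitution example.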

Hope that helps.

jlliagre

02-22-2008 02:06 AM

Quote:

Originally Posted by digiot (Post 3065778)

That second example uses process substitution and I don't know if that's supported on Solaris, but it's just an example.

It does. It's more a shell feature than an OS one. Solaris has ksh, zsh and bash which all support process substitution.

slakmagik

02-22-2008 02:42 AM

Well, I was thinking it was system-dependent as well as a shell feature. Refreshing my memory with the bash manual, it states, "Process substitution is supported on systems that support named pipes (FIFOs) or the /dev/fd method of naming open files", so it is system-dependent. But of course Solaris supports named pipes and possibly /dev/fd, too, so that was excessive compatibility paranoia. :)

All times are GMT -5.