Deleting lines above and below a word

jacky29 · 03-25-2011, 05:56 AM

Hi,
I just want to delete a host entry from httpd.conf.

For example, I have entries such as:

<VirtualHost 192.168.1.157:80>
DocumentRoot /home/karthik
ServerName www.karthik.com
</VirtualHost>
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/karthik1
ServerName www.karthik1.com
</VirtualHost>
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/lalz
ServerName lal.com
</VirtualHost>

I want to delete the entry containing the domain name lal.com,
i.e. the output should be:
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/karthik
ServerName www.karthik.com
</VirtualHost>
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/karthik1
ServerName www.karthik1.com
</VirtualHost>

The entry for lal.com should be deleted.
Can anyone please help me out?

repo · 03-25-2011, 06:03 AM

You could use

Code:

sed -i '/lal.com/d' httpd.conf

Make a backup of the file before proceeding.

Kind regards

jacky29 · 03-25-2011, 06:07 AM

Thanks for the reply, but the command doesn't work.

sed '/lal/d' httpd.conf

only deletes that particular line.
Please help me out!

druuna · 03-25-2011, 07:12 AM

Hi,

Using sed:

Create a file with the following content:

Code:

:t
/<VirtualHost/,/VirtualHost>/ { # For each line in this range
  /VirtualHost>/!{    # If not at the /VirtualHost>/ marker
    $!{               # nor the last line of the file,
      N;              # add the Next line to the pattern space
      bt
    }                 # and branch (loop back) to the :t label.
  }                   # This line matches the /VirtualHost>/ marker.
  /lal.com/d;         # If /lal.com/ matches, delete the block.
}                     # Otherwise, the block will be printed.

Save the file (this example uses the name sedcmds).

Run sed as follows:

Code:

sed -f sedcmds infile

Example run:

Code:

$ cat infile
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/karthik
ServerName www.karthik.com
</VirtualHost>
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/karthik1
ServerName www.karthik1.com
</VirtualHost>
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/lalz
ServerName lal.com
</VirtualHost>

$ sed -f sedcmds infile
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/karthik
ServerName www.karthik.com
</VirtualHost>
<VirtualHost 192.168.1.157:80>
DocumentRoot /home/karthik1
ServerName www.karthik1.com
</VirtualHost>

Hope this helps.

EDIT:
Or as a one-liner:

Code:

sed ':t /<VirtualHost/,/VirtualHost>/ { /VirtualHost>/!{ $!{ N; bt } }; /lal.com/d; }' infile

Nominal Animal · 03-25-2011, 07:19 AM

Rather than having all Apache virtual hosts configured in a single file, I recommend using

Code:

    Include *.vhost

and creating a separate file for each virtual host (in the base Apache configuration file directory, named domainname.vhost). That way you could just delete the file.
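
A rough sketch of how an existing combined file could be split up that way. The function name split_vhosts, its arguments and the file naming are assumptions made for illustration, not part of Apache. It expects one ServerName per block, ignores everything outside <VirtualHost> blocks, and lets a later block with the same ServerName overwrite an earlier one.

```shell
# split_vhosts CONF OUTDIR  (hypothetical helper, not part of Apache)
# Writes each <VirtualHost> block in CONF to OUTDIR/<servername>.vhost.
split_vhosts() {
    conf=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v outdir="$outdir" '
        # Opening tag: start buffering a new block.
        tolower($1) == "<virtualhost" { block = ""; name = ""; inblock = 1 }

        # Inside a block: keep every line and remember the ServerName.
        inblock {
            block = block $0 "\n"
            if (tolower($1) == "servername") name = tolower($2)
        }

        # Closing tag: write the buffered block to its own file.
        inblock && tolower($1) == "</virtualhost>" {
            if (name != "") {
                file = outdir "/" name ".vhost"
                printf "%s", block > file
                close(file)
            }
            inblock = 0
        }
    ' "$conf"
}
```

For example, `split_vhosts httpd.conf .` on the sample data should leave www.karthik.com.vhost, www.karthik1.com.vhost and lal.com.vhost in the current directory. The original blocks still have to be removed from httpd.conf by hand.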

However, your current problem can be solved easily with awk. The script below is not very elegant and not thoroughly tested, but it worked flawlessly with the few VirtualHost configurations I tried. It modifies the specified configuration file, but saves the previous version with the suffix .saved.

Code:

#!/bin/bash

# Usage?
if [ $# -lt 2 ] || [ "$1" == "-h" ] || [ "$1" == "--help" ]; then
    echo "" >&2
    echo "Usage: $0 [ -h | --help ]" >&2
    echo "       $0 configfile servername ..." >&2
    echo "" >&2
    echo "This will remove virtual servers containing servername(s)" >&2
    echo "from the specified configfile." >&2
    echo "" >&2
    exit 0
fi

FILE="$1"
if [ ! -f "$FILE" ]; then
    echo "$FILE: No such file." >&2
    exit 1
fi
shift 1

# Create an automatically removed temporary work directory $WORK
WORK="$(mktemp -d)" || exit $?
trap 'rm -rf "$WORK"' EXIT

# awk script to omit VirtualHosts with matching ServerNames
if ! awk -v "namelist=$*" '
    BEGIN {
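        RS="[\n\r]"
        # Keep the default FS: it splits on runs of blanks and ignores
        # leading whitespace, so indented directives still parse.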

        # Parse namelist into name array using whitespace separators.
        namelist = tolower(namelist)
        gsub(/[\t\n\v\f\r ]+/, " ", namelist)
        sub(/^ /, "", namelist)
        sub(/ $/, "", namelist)
        split(namelist, name, " ")

        # Initial mode.
        line = 0
        skip = 0
    }

    (tolower($1) == "<virtualhost") {
        split("", vhost)
        skip = 0
        line = 1
        vhost[line] = $0
        next
    }

    {
        if (line)
            vhost[++line] = $0
        else
            print
    }

    (tolower($1) == "servername") {
        value = tolower($2)
        for (n in name)
            if (index(value, name[n]) > 0)
                skip = 1;
    }

    (tolower($1) == "</virtualhost>") {
        if (!skip)
            for (n = 1; n <= line; n++)
                print vhost[n]

        skip = 0
        line = 0
        next
    }
' "$FILE" > "$WORK/output" ; then
    echo "$FILE: Error processing file, aborted." >&2
    exit 1
fi

# Successful. Replace original file with the work file.
# But first save a backup copy.
mv -f "$FILE" "$FILE.saved" || exit $?
mv -f "$WORK/output" "$FILE" || exit $?

The shell script prints the usage information if necessary and creates (and automatically deletes) a safe temporary working directory. On success it saves the original file and replaces it; otherwise it prints an error message. The real work is done in the awk script.

The awk script is a case-insensitive, line-based state machine. In the default state, it just outputs each input line as is.

When the awk script encounters a line with the first token matching <VirtualHost, it changes state, and starts saving each input line into (a new) array vhost instead of outputting them.

If the awk script encounters an input line with the first token matching ServerName, it checks if any of the specified server names are contained in the second token. If yes, this array is marked to be skipped.

When the script encounters a line with the first token matching </VirtualHost>, it will reset to default mode. If the array was not marked skipped, all lines in the array are output at this point.

There are a few minor tricks in there to make sure the script works with both gawk (GNU awk, my favourite) and mawk, but they're not very important unless you're very interested in awk scripting.

Hope you find this useful.

crts · 03-25-2011, 07:24 AM

Hi,

This worked with your sample data:

Code:

sed ':a N; \@</VirtualHost>@ {/lal.com/ d;b}; ba' file

kurumi · 03-25-2011, 07:30 AM

Code:

$ awk 'BEGIN{ ORS=RS="</VirtualHost>"} /lal\.com/{next} RT{print $0} '  file

or Ruby (1.9+):

Code:

$ ruby -0777 -ne '$_.split(/<\/VirtualHost>\n/).each{|x| puts x+"</VirtualHost>" if not x[/lal\.com/]  }' file

carltm · 03-25-2011, 07:32 AM

Hi crts,

I've done a lot with sed, but I don't recognize what this command does.
Would you please explain the syntax?

crts · 03-25-2011, 07:52 AM

Quote:

Originally Posted by carltm

Hi crts,

I've done a lot with sed, but I don't recognize what this command does.
Would you please explain the syntax?

Hi,

I assume you do not recognize the non-standard delimiter in '\@</VirtualHost>@', right? This does the same as

Code:

'/<\/VirtualHost>/'

The backslash before the @ simply indicates that @ shall be used as delimiter instead of the standard '/'.
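
A small check that both spellings select the same line. The two-line sample is made up for the demonstration; in sed, \cREGEXc lets any character c act as the address delimiter.

```shell
# Two-line sample, made up for this demonstration.
sample='ServerName lal.com
</VirtualHost>'

# Default delimiter: the slash inside the regex has to be escaped.
with_slash=$(printf '%s\n' "$sample" | sed -n '/<\/VirtualHost>/p')

# Custom delimiter: \@ opens the address, the next @ closes it.
with_at=$(printf '%s\n' "$sample" | sed -n '\@</VirtualHost>@p')

[ "$with_slash" = "$with_at" ] && echo "same match: $with_at"
```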

druuna · 03-25-2011, 07:57 AM

@crts: Nice short solution!!

BTW, the </ part isn't needed for the sample data, so this would also work:

Code:

sed ':a N; /VirtualHost>/ {/lal.com/ d;b}; ba' file

carltm · 03-25-2011, 08:03 AM

Yes, that's the first time I've seen that delimiter.

Also I'm guessing that the ":a N;" and "; ba" set up a loop in
which the "{...}" command is run. Does it simply match from the
first occurrence of "\@</VirtualHost>@" to the next occurrence?

Or does it effectively create sections, and if "/lal.com/" is
matched it deletes the section?

crts · 03-25-2011, 08:15 AM

Hi druuna,

You are right. I just had a look at your solution and noticed that you omit the '</' part. This spares us the obfuscating escape character.
Examining your solution a bit further, you can transform it to

Code:

sed ':t /VirtualHost>/! {N;bt}; /lal.com/ d' infile

which uses a negated condition to loop over the file. It is also shorter than mine by one instruction.
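
The commands in this thread hard-code lal.com, and the unescaped dot in /lal.com/ matches any character, so a block containing "lalxcom" would be deleted as well. A minimal sketch of a wrapper that takes the name as an argument and escapes its dots. The function name remove_vhost is made up, and like the commands above it matches the name anywhere in the block (so www.notlal.com would match too), not only on the ServerName line. The extra $! guard keeps trailing lines after the last block from being lost on seds that do not print the pattern space when N runs out of input.

```shell
# remove_vhost NAME FILE  (hypothetical helper)
# Prints FILE without any <VirtualHost> block whose text contains NAME.
remove_vhost() {
    # Escape the dots so that they only match a literal dot.
    pattern=$(printf '%s\n' "$1" | sed 's/\./\\./g')

    # Same loop as above, one command per -e so the label ends cleanly:
    # keep appending lines until the closing tag is in the pattern space,
    # then delete the collected block if it contains the name.
    sed -e ':t' \
        -e '/VirtualHost>/!{' -e '$!{' -e 'N' -e 'bt' -e '}' -e '}' \
        -e "/$pattern/d" "$2"
}
```

Usage would be `remove_vhost lal.com httpd.conf > httpd.conf.new`, followed by a look at the result before replacing the original.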

crts · 03-25-2011, 08:22 AM

Quote:

Originally Posted by carltm

Yes, that's the first time I've seen that delimiter.

Also I'm guessing that the ":a N;" and "; ba" set up a loop in
which the "{...}" command is run. Does it simply match from the
first occurrence of "\@</VirtualHost>@" to the next occurrence?

Or does it effectively create sections, and if "/lal.com/" is
matched it deletes the section?

It does the latter. It reads lines into the pattern buffer one after another. Once '\@</VirtualHost>@' matches, it checks for /lal.com/. If lal.com is found, 'd' deletes the pattern buffer and starts the next cycle right away, so nothing is printed for that block. Otherwise 'b' jumps to the end of the sed script, which triggers sed's default action of printing the pattern buffer, and the script starts again with the next block.
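
The same command written out as a commented script file may make the flow easier to follow. This is only a restatement of the one-liner, assuming GNU sed; the file name vhost.sed is arbitrary.

```shell
# Write the one-liner out as a commented sed script (file name is arbitrary).
cat > vhost.sed <<'EOF'
:a
# Append the next input line to the pattern space.
N
# Once the closing tag has been read, decide what to do with the block.
\@</VirtualHost>@ {
    # Block mentions lal.com: discard it and start a new cycle.
    /lal.com/ d
    # Otherwise jump to the end of the script, where sed prints the block.
    b
}
# Closing tag not seen yet: keep collecting lines.
ba
EOF
```

It is then run with `sed -f vhost.sed file`.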

grail · 03-25-2011, 08:53 AM

More awk but kinda the same:

Code:

awk 'BEGIN{ORS=RS="</VirtualHost>\n"}!/lal\.com/ && RT' file

crts · 03-25-2011, 09:31 AM

Quote:

Originally Posted by grail

More awk but kinda the same:

Code:

awk 'BEGIN{ORS=RS="</VirtualHost>\n"}!/lal\.com/ && RT' file

Shouldn't this be 'RS' instead of 'RT'?
Also, if the file has an empty line at the end I get a duplicate output of '</VirtualHost>' in the last line.
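
For reference, RT is a gawk extension that holds the text which actually matched RS for the current record, so the RT-based one-liners depend on gawk. A sketch of a line-based alternative that needs neither RT nor a multi-character RS and passes a trailing empty line through unchanged. The function name drop_vhost is made up for illustration, and lines between two blocks are treated as part of the block that follows them.

```shell
# drop_vhost FILE  (hypothetical helper; any POSIX awk should do)
# Prints FILE without the <VirtualHost> block that mentions lal.com.
drop_vhost() {
    awk '
        # Collect lines until a closing tag completes the block.
        { block = block $0 "\n" }

        /<\/VirtualHost>/ {
            # Print the finished block unless it mentions lal.com.
            if (block !~ /lal\.com/) printf "%s", block
            block = ""
        }

        # Whatever follows the last closing tag is passed through as is.
        END { printf "%s", block }
    ' "$1"
}
```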